J Gen Intern Med. 1998 August; 13(8): 534–540.
PMCID: PMC1497006

Analyzing the Time and Value of Housestaff Inpatient Work

Timothy R Dresselhaus, MD, MPH,1,2 Jeff Luck, PhD, MBA,2,3 Brian C Wright,1 Roger G Spragg, MD,1 Martin L Lee, PhD,2 and Samuel A Bozzette, MD, PhD1,2,3



Objective

To determine time allocation and the perceived value to education and patient care of the weekday activities of internal medicine housestaff on inpatient rotations and to compare the work activities of interns and residents.

Design

An observational study. We classified activities along five dimensions (association, location, activity, time, and value), developed a computer-assisted self-interview survey, and demonstrated its face and content validity, internal consistency, and interrater reliability. Subjects were assigned survey computers for 5 consecutive weekdays over a 24-week period, into which they entered data when prompted several times a day.

Setting

The medical service of a university-affiliated Veterans Administration Medical Center.

Participants

Sixty housestaff (36 interns, 24 residents) rotating on the inpatient wards.

Measurements and Main Results

We analyzed activities according to content (direct patient care, indirect patient care, education), association, and location. Likert-scale ratings of perceived value to education and patient care were also obtained. Housestaff provided complete responses to 3,812 (95%) of 3,992 prompts within a median of 11 seconds; 93% of responses were logically consistent across the measured dimensions. Housestaff spent more time in indirect patient care (56%) than in direct patient care (14%) or educational activities (45%). Formal educational activities had the highest educational value (66 on 0–100 scale), and direct care had the highest value to patient care (81). Over 30% of time was spent in administrative activities, which had low educational value (40). Compared with residents, interns allocated significantly less time to educational activities (38% vs 57%) and more time to lower-value activities such as documentation (19% vs 12%).

Conclusions

Improved data collection methods demonstrate that housestaff in our program, particularly interns, spend much of their workday in activities that are low in educational and patient care value. Selective elimination or delegation of such activities would preserve higher-value experiences during reductions in overall inpatient training time. Planners can use automated random sampling to guide the rational redesign of housestaff work.

Keywords: work sampling, time study, housestaff, computer

Political and economic forces are changing the inpatient work of housestaff. The Libby Zion case first drew public attention to residents' hours, resulting in legislation to curtail work hours.1 Accrediting and professional organizations have also called for work limits to balance the requirements of patient service, education, and residents' personal lives.2, 3 As academic medical centers seek to contain costs, training is shifting from the more expensive hospital setting to outpatient clinics, leaving less time for inpatient training.4, 5 Initiatives to reduce the perceived physician surplus and correct the specialist-generalist imbalance by regulating postgraduate training positions threaten to limit further resources available for hospital-based training activities.6, 7

In response, the value of remaining inpatient training experiences must be maximized to meet educational objectives and support patient care. This can be accomplished if reductions in ward time focus on increasing housestaff efficiency by delegating or eliminating activities of low marginal value. To identify such activities, reliable measurements are needed of housestaff time allocation and the value of work activities.

Previous studies of housestaff time allocation have used one of three methods: time diaries,8 time-and-motion analysis,9–16 or work sampling.17, 18 Retrospective time diaries may inaccurately describe time allocation owing to biased recall.8 Time-and-motion analysis using trained observers is an expensive strategy for measuring behaviors and is biased by the intrusion of observers.17 Work sampling has been successfully applied to housestaff using pagers and written self-report, proving a relatively inexpensive and potentially less biased approach.17, 18

By and large, these time studies have described time allocation but not assessed the value of activities. Those studies addressing value have relied on the external judgment of observers or the retrospective input of experts.13, 15, 16 Handheld computers now make it possible to perform real-time measurements of multiple dimensions of housestaff work, including value, by combining work sampling and computer-assisted self-interview. Compared with paper-and-pencil data entry, these instruments provide superior accuracy, simplified data collection, interactive logic, and reliability checks.19

Using automated random sampling, our objective was to measure time allocation and the perceived value of the inpatient work activities of housestaff and to compare the activities of interns and residents. We performed an observational study in which housestaff were provided handheld computers and randomly prompted during weekdays to complete a work survey. In addition, a separate reliability study to assess interobserver agreement was performed by pairing housestaff subjects with medical student observers.


Methods

Instrument Development

We used expert judgment to classify physician work by five dimensions: who (association), where (location), when (time), what (activity type), and why (appropriateness and value of task). A work survey was developed based on this conceptual framework. Who refers to association with others and was defined by a list of 11 possible contacts. Where refers to the location of an activity and was defined by 12 discrete sites. When refers to the time at which an activity occurs. To characterize activities (what), we developed a hierarchical, branching questionnaire in which questions progressed from lesser to greater specificity.

Why incorporates three dimensions of perceived value: value of the activity in terms of housestaff education, value to patient care, and appropriateness of the activity for housestaff performance. Value to education and patient care was measured using Likert-scale questions. To determine appropriateness, the instrument asked housestaff to define alternative personnel when deemed more appropriate to perform the activity.

We programmed the survey into the Psion Series 3A handheld computer. In addition to administering the survey and recording responses, the computer was programmed to record the time of the prompt as well as the time and duration of the response. The computer prompted housestaff to complete a “sign-on” dialogue at the beginning and a “sign-off” dialogue at the end of each day to specify the start and end time of work.

Study Design

Reliability Study.

To assess interobserver reliability, medical housestaff rotating on the inpatient wards of the San Diego Veterans Affairs (VA) Medical Center, an affiliate of the University of California, San Diego, were randomly paired with fourth-year medical students. Housestaff subjects and student observers were given identical computers, though value questions were deleted from the observers' instruments. The subjects' computers were programmed to prompt randomly during consecutive 48-minute intervals, cueing both members of the subject-observer pair to complete the work survey. Observations were made weekdays between 8:00 am and 5:00 pm. The subject-observer pairs were together for 3 consecutive days, with one exception in which an intern was replaced owing to work schedule. Observers were instructed not to talk to housestaff. Housestaff received $25 and observers $75 for each day of participation. Housestaff and students gave informed consent prior to participation.

Main Study.

Interns (postgraduate year-1 [PGY-1]) and residents (PGY-2, PGY-3) rotating on the wards of the San Diego VA Medical Center were enrolled as subjects. The study was conducted over a 24-week period at the midpoint of the academic year, when housestaff were familiar with their roles. Subjects were drawn from one of four ward teams. Two teams were staffed by one resident and two interns. The other two teams were assigned one resident and four interns; these four interns functioned as two pairs of interns, with each member of a pair alternating between the inpatient and outpatient setting every 4 days.

We randomly selected subjects with representation of interns and residents proportionate to their numbers on the wards (2:1). Participants were assigned survey computers for 5 consecutive weekdays. Housestaff typically participated for 2 of the 4 weeks of the inpatient rotation. For paired interns, the computer was carried by the intern assigned to the ward. Computers were programmed to prompt randomly within consecutive intervals of 75 minutes (for residents) or 90 minutes (for interns), sampling frequencies determined to be acceptable to housestaff based on pilot data. We sampled interns less intensively because their schedules were perceived to be busier.
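The prompting scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say how the random moment within each interval was chosen, so a uniform draw is assumed here, and the function name `prompt_schedule` and the 6:00 am to 8:00 pm window applied to every day are hypothetical simplifications.

```python
import random


def prompt_schedule(start_min, end_min, interval_min, rng):
    """Return one random prompt time (minutes after midnight) per consecutive interval.

    Assumption: the prompt is drawn uniformly within each interval; the
    original survey software may have used a different rule.
    """
    times = []
    t = start_min
    while t < end_min:
        upper = min(t + interval_min, end_min)  # last interval may be shorter
        times.append(rng.uniform(t, upper))
        t += interval_min
    return times


rng = random.Random(0)  # fixed seed only so the example is repeatable
resident_prompts = prompt_schedule(6 * 60, 20 * 60, 75, rng)  # 75-minute intervals
intern_prompts = prompt_schedule(6 * 60, 20 * 60, 90, rng)    # 90-minute intervals
```

Because exactly one prompt falls in each interval, the prompt times are unpredictable to the subject while the number of prompts per day stays bounded, which is the property that made the sampling frequency acceptable to housestaff.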

Observations were made during weekdays between 6:00 am and 8:00 pm. Observations between 1:00 pm and 5:00 pm on assigned clinic days were excluded. Housestaff gave informed consent and received $10 for each day of participation.

Data Analysis

Reliability Study.

To determine the promptness of responses, we measured the elapsed time between the prompt and the responses of housestaff subjects and student observers (response times). To determine the response burden, we measured the time required to complete the work survey (completion times). The distributions of subjects' and observers' response times and completion times were compared using the Wilcoxon rank-sum test.

To assess interobserver agreement, we generated κ statistics for the dimensions of association, location, and activity type.20 For the dimension of time, we determined the difference between the house officer's and observer's response time for each paired response. A Student's t test was performed to test the hypothesis that the mean difference in response time was zero.
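As an illustration of the agreement statistic, the sketch below computes an unweighted Cohen's κ for paired categorical responses. It assumes the unweighted form (the text cites a general reference for κ without specifying the variant), and the location labels are invented for the example.

```python
from collections import Counter


def cohen_kappa(subject, observer):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(subject)
    if n == 0 or n != len(observer):
        raise ValueError("need two non-empty lists of equal length")
    # Observed proportion of exact agreement
    observed = sum(a == b for a, b in zip(subject, observer)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    s_counts = Counter(subject)
    o_counts = Counter(observer)
    expected = sum(s_counts[c] * o_counts[c] for c in s_counts) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical paired location responses (not study data)
subject = ["ward", "ward", "office", "conference", "ward", "office"]
observer = ["ward", "ward", "office", "ward", "ward", "office"]
kappa = cohen_kappa(subject, observer)  # 5/6 observed, 4/9 expected by chance
```

In this toy example exact agreement is 83% but κ is 0.70, which shows why κ is reported alongside the raw proportion of agreement.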

Main Study.

We determined the median length of the work day by performing a Kaplan-Meier time-to-event analysis of each house officer day. The beginning of the day was the earlier of two times: the start time as entered in the sign-on dialogue or the first completed response. The end of the day was the earlier of two times: the time entered in the sign-off dialogue or 8 pm. If the sign-off dialogue was incomplete, the day was censored at the time of the last completed response. Days shortened by clinic were excluded from the determination of the length of the workday.
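The censoring rule above can be made concrete with a small Kaplan-Meier sketch. This is a minimal hand-written estimator, not the software used in the study; the function name `km_median` and the day lengths are hypothetical. A day whose sign-off dialogue was incomplete is marked as censored (`False`).

```python
def km_median(durations, observed):
    """Kaplan-Meier median: first time at which estimated survival drops to 0.5 or below.

    durations: length of each house officer day (hours)
    observed:  True if the end of the day was recorded, False if censored
    Returns None when survival never reaches 0.5.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group all days that share the same duration
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            if survival <= 0.5:
                return t
        at_risk -= removed
    return None


# Hypothetical day lengths in hours; the third day is censored
hours = [9.5, 10.0, 10.5, 11.0, 11.5, 12.0]
signed_off = [True, True, False, True, True, True]
median_day = km_median(hours, signed_off)
```

Treating incomplete days as censored rather than dropping them avoids biasing the median toward shorter days.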

The unit of analysis was the house officer day; however, the unit of time sampled was a week. We therefore examined these data for a cluster effect using a hierarchical (nested) analysis of variance.

To analyze work activities, we defined five primary categories reflecting the fundamental content of housestaff work: (1) direct patient care (activity involving the patient); (2) indirect patient care (activity supporting patient care but without patient contact); (3) education (activity designed to be educational or involving a work-related interaction with a supervisor or colleague); (4) personal (activity unrelated to work); (5) transit (between activities). These categories are mutually exclusive with the exception of education, which overlaps with patient care (direct or indirect). For example, attending rounds could simultaneously support both patient care and education.

We next calculated the proportion of time spent in activities. For each day (the unit of analysis), we first determined the number of responses in a given category and then calculated a proportion by dividing by the number of prompts on that day. These proportions were then pooled across all housestaff days as a weighted average, where the weight for a given day was the number of prompts for that day; that is,

$$p = \frac{\sum_{i=1}^{n} w_i p_i}{\sum_{i=1}^{n} w_i}$$

where $w_i$ = number of prompts for day $i$, $p_i$ = proportion of responses for a given category on day $i$, and $n$ = number of house officer days. A 95% confidence interval for $p$ was based on the usual large sample normal approximation:

$$p \pm 1.96\sqrt{\frac{p(1-p)}{\sum_{i=1}^{n} w_i}}$$

Proportions were also determined in the same fashion separately for interns and residents, and statistically compared using the large sample z test for two proportions.21 Similar proportions were generated for activity location and association.
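The pooling and comparison steps can be sketched as follows. The sketch assumes that the total number of prompts serves as the sample size in the normal approximation and that the two-proportion z test uses a pooled standard error; neither detail is spelled out in the text. All function names and the example counts are hypothetical.

```python
import math


def pooled_proportion(days):
    """days: list of (prompts, responses_in_category) per house officer day.

    Returns the prompt-weighted average proportion and the total number of prompts.
    """
    total_prompts = sum(w for w, _ in days)
    weighted_sum = sum(w * (r / w) for w, r in days)  # sum of w_i * p_i
    return weighted_sum / total_prompts, total_prompts


def ci95(p, n):
    """Large-sample normal 95% confidence interval (assumes n = total prompts)."""
    half_width = 1.96 * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for two proportions using the pooled standard error."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Hypothetical days: (prompts that day, responses in the category of interest)
days = [(7, 4), (6, 3), (8, 5)]
p, n_prompts = pooled_proportion(days)
low, high = ci95(p, n_prompts)
```

Weighting each day by its number of prompts makes the pooled value equal to total category responses divided by total prompts, so days with more prompts contribute proportionally more.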

For activity content, we calculated the mean score for the perceived educational value and value to patient care, converting Likert-scale ratings (9-point scale) to a 0 to 100 scale. Differences in ratings of value between interns and residents were statistically compared using the Mann-Whitney U test. Overall differences in ratings of value between categories (direct patient care, indirect patient care, education) were statistically compared using the Kruskal-Wallis test. Values of p < .05 were regarded as significant.

The reproducibility of perceived value scores was determined by identifying and pairing activities that recurred within the same day for a given subject and calculating the proportion of paired scores that were within one Likert-scale unit of the other.
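A brief sketch of the score conversion and the reproducibility measure follows. The linear mapping (1 maps to 0, 9 maps to 100) is an assumption, since the text states only that 9-point ratings were converted to a 0 to 100 scale; the function names and the paired ratings are hypothetical.

```python
def likert_to_100(rating, points=9):
    """Linearly rescale a 1..points Likert rating to 0..100 (assumed mapping)."""
    if not 1 <= rating <= points:
        raise ValueError("rating outside the scale")
    return (rating - 1) * 100 / (points - 1)


def proportion_within_one(pairs):
    """Share of paired ratings that differ by at most one Likert unit."""
    close = sum(abs(a - b) <= 1 for a, b in pairs)
    return close / len(pairs)


# Hypothetical paired ratings of the same activity recurring within a day
pairs = [(7, 8), (5, 5), (3, 6), (9, 8)]
agreement = proportion_within_one(pairs)  # three of four pairs are within one unit
```

Under this mapping each Likert unit corresponds to 12.5 points on the 0 to 100 scale, so "within one unit" means within 12.5 points.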

As another measure of reliability, we evaluated the consistency of all complete responses by prospectively determining the appropriate range of locations, associations, and times for each activity. These criteria were used to develop a computer program that applied a logic check to each response.
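A logic check of this kind might look like the sketch below. The rule table is entirely hypothetical: the actual permitted locations, associations, and times for each activity are not given in the text, and the names `RULES` and `passes_logic_check` are illustrative only.

```python
# Hypothetical rule table: permitted locations, contacts, and hours per activity.
RULES = {
    "attending rounds": {
        "locations": {"ward", "conference room"},
        "contacts": {"attending physician", "housestaff"},
        "hours": (8, 12),
    },
    "documentation": {
        "locations": {"ward", "office"},
        "contacts": {"alone", "housestaff"},
        "hours": (6, 20),
    },
}


def passes_logic_check(response, rules=RULES):
    """Return True when location, contact, and hour all fit the activity's rule."""
    rule = rules.get(response["activity"])
    if rule is None:
        return False  # unknown activity: flag for review
    first_hour, last_hour = rule["hours"]
    return (
        response["location"] in rule["locations"]
        and response["contact"] in rule["contacts"]
        and first_hour <= response["hour"] < last_hour
    )


# Hypothetical responses (not study data)
responses = [
    {"activity": "attending rounds", "location": "conference room",
     "contact": "attending physician", "hour": 10},
    {"activity": "attending rounds", "location": "hallway",
     "contact": "alone", "hour": 19},
]
pass_rate = sum(passes_logic_check(r) for r in responses) / len(responses)
```

Defining the rules before looking at the responses, as the text describes, keeps the check independent of the data it evaluates.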


Results

Reliability Study

Thirteen housestaff (4 residents, 9 interns) were paired with 12 medical student observers for 36 total days of observation. During this period, housestaff were prompted 341 times. Of these prompts, 98% resulted in complete responses by housestaff and 97% in complete responses by observers. The time to respond to prompts was similar for housestaff (median 12 seconds; 95% within 57 seconds) and observers (median 13 seconds; 95% within 68 seconds). Housestaff took longer to complete each response (median 54 seconds; 95% within 209 seconds) than observers (median 46 seconds; 95% within 110 seconds; p = .0001), possibly because the work survey presented to housestaff was longer.

A total of 326 paired observations were obtained. Interobserver agreement was greatest for the dimension of location, with 91% exact agreement and a κ of 0.82, indicating much greater than chance agreement. The proportion of responses in exact agreement was lower but similar for association (69.9%) and activity type (69.6%). The κ values for association (0.64) and activity type (0.49) suggest good and moderate agreement, respectively.20 However, the κ statistic for activity type likely underestimates agreement owing to the large number of possible response categories (nearly 2,000) and its neglect of the many paired responses to this complex survey instrument with substantial but not exact agreement.

The mean difference between subjects' and observers' response times was 0.49 seconds. This did not differ significantly from 0 (t test: p = .78). The median absolute difference in response times was 5 seconds (95% within 31 seconds), confirming close subject-observer agreement in the timing of responses.

Main Study

In this study, we provided 60 housestaff (36 interns, 24 residents) with survey computers, sampling their activities over 589 weekdays (Table 1). On average, interns participated 10.8 days and residents 8.3 days. The median length of nonclinic workdays was 11.0 hours for interns and 9.9 hours for residents. Residents were prompted more times each day (mean 7.2) than interns (mean 6.6) because of the shorter time interval between prompts for residents.

Table 1
Main Study: Responses of Housestaff Subjects

Complete responses were obtained from 3,812 (95%) of 3,992 prompts. The median time to respond (11 seconds; 95% within 99 seconds) and the median duration of time to complete the survey (53 seconds; 95% within 193 seconds) were similar to those in the reliability study.

The 3,812 complete responses were evaluated for their consistency by applying a logic check. Of these, 3,530 (93%) passed the check, indicating a coherent relation between the activity being performed and the identified location, time, and contacts.

Ratings of value for paired activities recurring on the same day were highly reproducible. For educational value, the proportion of paired responses in agreement was 81% (direct patient care), 66% (indirect patient care), and 67% (educational activities). For value to patient care, the agreement of paired responses was 77% (direct patient care), 62% (indirect patient care), and 70% (educational activities).

Overall, housestaff spent most of their day in indirect patient care (56.3%) (Table 2). About 45% of the time was spent in educational activities and 15% in direct patient care. Interns spent more time performing documentation than residents (19.0% vs 11.7%; p < .05). Residents allocated proportionately more time than interns to educational activities overall (57.3% vs 38.4%; p < .001), to discussion of patient issues (30.8% vs 19.0%; p < .001), to informal communication with physicians (13.3% vs 7.3%; p < .05), and to attending rounds (11.4% vs 6.5%; p < .05).

Table 2
Time Allocation and Value by Activity Content for Inpatient Work*

Housestaff assigned a higher mean score to the perceived educational value of educational activities (66) than to direct patient care (59) or indirect patient care (48) (Table 2; p < .0001). The subcategories with the lowest educational scores were indirect patient care activities such as documentation (39), discharge planning (33), and initiating consultations (34). Compared with interns, residents gave higher ratings of educational value to indirect patient care overall (53 vs 45; p < .001) and to ordering tests or obtaining test results (51 vs 38; p < .05), while interns gave higher ratings to attending rounds (75 vs 68; p < .05) and informal educational activities (80 vs 66; p < .05).

Conversely, housestaff gave higher patient value ratings to direct patient care (81) relative to indirect patient care (71) or educational activities (68) (Table 2; p < .0001). When compared with interns' scores of patient value, residents gave higher scores to direct patient care overall (85 vs 79; p < .05), other patient examinations (87 vs 76; p < .05), and work rounds (81 vs 72; p < .05).

When asked who was the most appropriate person to perform the activity, housestaff identified themselves in 87% of instances. About 9% of activities were deemed appropriate for other physicians to perform. Housestaff identified nonphysicians as more appropriate to the task in less than 5% of instances. Less than 1% of reported activities were felt to be inappropriate for anyone to perform.

In general, housestaff were with physicians and students almost half the time (48.2%) (Table 3). Residents spent proportionately more time than interns with other housestaff (48.3% vs 31.8%; p < .001) and other physicians such as chief residents and fellows (13.4% vs 6.8%; p < .05). Interns spent more time alone than did residents (42.2% vs 31.1%; p < .05). Overall, only 12% of time was spent with patients or their families.

Table 3
Time Allocation by Association

The majority of time was spent in one of four locations: the ward (50.9%), conference rooms (14.3%), offices (13.9%), or hallways (7.4%). Interns spent substantially more time on the wards than did residents (58.6% vs 37.2%; p < .001).

Although a week was the unit of time sampled, the unit of analysis was the house officer day. In examining these data for a cluster effect, we found no significantly greater variation in the data from week to week than in the daily data.


Discussion

We have applied automated random sampling to perform real-time measurements of housestaff time allocation and the value of work activities. These data support the feasibility and reliability of this approach, and also provide important insights into the daytime activities of housestaff, which may inform efforts to redesign the inpatient work of medical interns and residents.

This method of work sampling was not excessively burdensome to housestaff and provided virtually complete and mostly verifiable information on work performed during the time intervals sampled. Short completion times and high complete response rates indicate that the response burden was modest and that responses to a complex but interactive computer instrument can be efficiently obtained.19 The short elapsed times between prompts and responses confirm a close relation between the prompt and the time of actual data entry, a relation that has been uncertain in previous work-sampling studies.17, 18 The reliability of automated sampling is further indicated by the high degree of interobserver agreement between housestaff subjects and medical student observers and also the consistency of responses against an external logic check.

Although the main results may not generalize to all training programs, we believe they reflect patterns typical of the inpatient experiences of housestaff in many internal medicine residencies. These data reveal, for instance, that housestaff spend the greatest part of their day in indirect patient care, despite the fact that these activities receive the lowest ratings for educational value and lower ratings for patient care value than direct patient care. Nearly one third of interns' time is devoted to documentation and to ordering tests or obtaining test results, activities given low scores for educational value.

Though time interacting with patients received higher value ratings than indirect patient care, housestaff spend less than 15% of their time in these activities. Of time allocated to direct patient care, most is dedicated to the initial history and physical examination. It may be inferred that, over the course of a patient's hospitalization, time spent with the patient diminishes substantially.

We observe that interns allocate less time than residents to activities that simultaneously support both patient care and education. If the proportions for all activities are summed, the total for interns is 109% and for residents 129%. The difference of 20% is explained by the greater overlap between patient care (direct or indirect) and education in the work activities of residents. There thus appears to be a richer content to residents' activities compared with interns' activities.

This analysis corroborates the findings of previous time studies, which have also shown that housestaff spend little time with patients and substantial time in administrative and educational activities.13, 18 However, the explanatory power of these studies is more limited owing to their measurement of no more than two dimensions of housestaff work. These studies also dichotomize activities as education or patient care, neglecting the dual content of many activities, which enhances value to both education and patient care.

What are the implications of these results for the reform of residency training as inpatient time is increasingly constrained? If value is to be preserved while continuing to meet educational objectives and support patient care, low-value activities must be reduced or delegated and high-value activities maintained. For instance, simplifying required documentation or delegating some of these tasks to other kinds of workers might reduce time dedicated to low-value administrative tasks. Valuable time with patients might be increased by shifting indirect activities (e.g., work rounds, attending rounds) to the bedside, thereby simultaneously enhancing both patient care and education. Programs might integrate housestaff tasks with those of other disciplines to strengthen collaboration, reduce solitude, and increase efficiency.

We believe that some housestaff work could be performed by nonphysicians. For instance, nearly 40% of interns' time is allocated to documentation, testing and procedures, discharge planning, initiating consultations, and miscellaneous administrative tasks. If half of these activities were delegated to others, interns' inpatient time could be substantially reduced at no cost to the higher-value activities. This is consistent with the findings of Knickman et al., who concluded that at least 20% of housestaff activities could be done by nonphysicians.13

In our study, only 13% of activities were seen as delegable by housestaff, and most of these to other housestaff! This may be due to housestaff's difficulty reconceptualizing their work, particularly in real time. In order to decide which activities may be delegated to nonphysicians or eliminated altogether, more systematic methods must be employed, such as total quality management or work reengineering. For either of these, time studies provide the raw data for rethinking current models of training and envisioning alternative approaches.

This study has provided our training program with the kind of detailed, multidimensional information needed to respond to the increasing constraints on inpatient time. It has increased our understanding of the work experiences of housestaff and provided insight into the value of those experiences. Further research is needed to evaluate housestaff work across types of training programs and differing systems of care, which, in turn, would inform local and national initiatives to enhance the value of training experiences. For such evaluations, automated random sampling may be the method of choice for obtaining the reliable information required to guide the redesign of house officer work.



Support for this work was provided by the Western Region of the Veterans Administration (Ambulatory Care and Education Initiative, 94-04), the Veterans Administration Center for the Study of Provider Behavior, and the RAND Graduate School. Dr. Bozzette is a Senior Research Associate of the HSR&D Service, Department of Veterans Affairs.

The authors acknowledge the housestaff (Department of Medicine, University of California, San Diego) and medical students (University of California, San Diego) who contributed by their participation; Sandra Berry (Rand Survey Group) and Dr. Craig Scott (Department of Medical Education, University of Washington), who made many helpful suggestions; as well as Ming-Ming Wang, MS, who provided programming support.


1. Asch DA, Parker RM. The Libby Zion case: one step forward, or two steps backward? N Engl J Med. 1988;318:771–5. [PubMed]
2. American Medical Association Directory of Graduate Medical Education Programs. Special Requirements for Residency Training Programs in Internal Medicine. Chicago, Ill: ACGME; 1994.
3. American College of Physicians. Working conditions and supervision for residents in internal medicine: recommendations. Ann Intern Med. 1989;110:657–63. [PubMed]
4. Iglehart JK. Rapid changes for academic medical centers (first of two parts). N Engl J Med. 1994;331:1391–5. [PubMed]
5. Iglehart JK. Rapid changes for academic medical centers (second of two parts). N Engl J Med. 1995;332:407–11. [PubMed]
6. Shine KI. Freeze the number of Medicare-subsidized graduate medical education positions. JAMA. 1995;273:1057–8. [PubMed]
7. Epstein AM. US teaching hospitals in the evolving health care system. JAMA. 1995;273:1203–7. [PubMed]
8. Oddone E, Guarisco S, Simel D. Comparison of housestaffs' estimates of their workday activities with results of a random work-sampling study. Acad Med. 1993;68:859–61. [PubMed]
9. Payson HE, Gaenslen EC, Stargardter FL. Time study of an internship on a university medical service. N Engl J Med. 1961;264:439–43. [PubMed]
10. Gillanders W, Heiman M. Time study comparisons of 3 intern programs. J Med Educ. 1971;46:142–9. [PubMed]
11. Lurie NL, Rank B, Parenti C, Woolley T, Snoke W. How do house officers spend their nights? A time study of internal medicine house staff on call. N Engl J Med. 1989;320:1673–7. [PubMed]
12. Nerenz D, Rosman H, Newcomb C, et al. The on-call experience of interns in internal medicine. Arch Intern Med. 1990;150:2294–7. [PubMed]
13. Knickman JR, Lipkin M, Finkler SA, et al. The potential for using non-physicians to compensate for the reduced availability of residents. Acad Med. 1992;67:429–38. [PubMed]
14. Parenti C, Lurie N. Are things different in the light of day? A time study of internal medicine housestaff days. Am J Med. 1993;94:654–8. [PubMed]
15. Raimondi AJ. Analysis (time study) of service and education in a neurosurgery residency program. Neurosurgery. 1978;2:213–6. [PubMed]
16. Gledhill T, McDermott E, Clark CG. The educational value of being a house surgeon. Med Educ. 1985;19:305–7. [PubMed]
17. Brock DM, Scott CS, Pendergrass T, MacDonald SC. Sampling clinicians' activities using electronic pagers. Eval Health Prof. 1990;13:315–42. [PubMed]
18. Guarisco S, Oddone E, Simel D. Time analysis of a general medicine service: results from a random work sampling study. J Gen Intern Med. 1994;9:272–7. [PubMed]
19. Forster D, Behrens RH, Campbell H, Byass P. Evaluation of a computerized field data collection system for health surveys. Bull World Health Organ. 1991;69:107–11. [PubMed]
20. Altman DG. Practical Statistics for Medical Research. London, UK: Chapman and Hall; 1991. pp. 404–8.
21. Dixon WJ, Massey FJ. Introduction to Statistical Analysis. 4th ed. New York, NY: McGraw-Hill Book Co.; 1983.

Articles from Journal of General Internal Medicine are provided here courtesy of Society of General Internal Medicine