Am J Epidemiol. 2010 August 1; 172(3): 327–333.
Published online 2010 June 23. doi:  10.1093/aje/kwq111
PMCID: PMC2917054

Linking the Iowa Women's Health Study Cohort to Medicare Data: Linkage Results and Application to Hip Fracture


This study linked the Iowa Women's Health Study cohort to Medicare administrative data and assessed the value of using Medicare and survey-based sources to study hip fracture incidence. The authors used Social Security number to combine the Iowa Women's Health Study cohort with Medicare enrollment and claims data for 1986–2004. Hip fractures were identified from Medicare and from survey-based sources (follow-up mail surveys). Estimates of hip fracture incidence after age 65 years and postfracture mortality were compared. The authors were able to match to Medicare 99.2% of the 40,978 Iowa Women's Health Study participants who survived to age 65 years. Although both Medicare and survey-based hip fracture incidence showed the expected positive association with age and negative association with body mass index, hip fracture incidence was considerably underestimated by self-report (2.61 per 1,000 person-years of observation vs. 4.20 per 1,000 person-years of observation from Medicare-based estimates). Similarly, 1-year postfracture mortality was significantly underestimated by survey-based measures (1% vs. 14% for Medicare-based estimates). Medicare data are an outstanding source of supplemental health care information for older cohorts that have identifiers such as Social Security numbers. These data are useful for studying clinically unambiguous conditions with high morbidity and mortality. They enable less-biased collection of health data.

Keywords: cohort studies, data collection, hip fractures, Medicare

Since the mid-1970s, the National Institutes of Health has sponsored many large, population-based cohort studies that examine the relation between baseline risk factors and development of acute and chronic diseases. With growing knowledge of risk-factor/disease associations, interest in expanding the scientific uses of the cohorts is increasing. At the same time, the challenges and costs of direct follow-up continue to rise over time. Identifying efficient follow-up mechanisms for established cohorts will expand their scientific utility.

The Iowa Women's Health Study (IWHS) is a prospective cohort of 41,836 postmenopausal women. At baseline in 1986, the women were aged 55–69 years; by 1996, all surviving women had become eligible for Medicare (age 65 years). The baseline mailed questionnaire assessed diet, lifestyle, anthropometrics, family and personal medical history, and reproductive history (1), and 5 mailed follow-up questionnaires over 18 years built on these baseline data. Linking the IWHS cohort with Medicare enrollment and claims data offers an important opportunity to study many more health outcomes without increasing the survey response burden. Medicare provides electronic records of payments for a wide range of inpatient (since 1986) and outpatient (since 1991) health care, with the notable exceptions of prescription drugs (a benefit available since 2006) and long-term care. The data are an established population-based source of detailed health information on the elderly (2–5). The combination of the Medicare data and the population-based Surveillance, Epidemiology, and End Results (SEER) Program cancer registries' data is the basis for many seminal papers on cancer outcomes (6–9).

Linking cohorts to administrative data is particularly valuable for studies of endpoints associated with high levels of morbidity or mortality because the completeness of the administrative data reduces loss to follow-up. Likewise, as cohorts age, challenges associated with response burden, cognitive decline, and institutionalization limit the completeness of survey-based follow-up measures. In such cases, external measures, such as health care claims, provide opportunities for validation or direct measurement of these endpoints. Finally, without alternate sources to measure endpoints, it may not be possible to evaluate new risk factor–outcome associations because of the increased recall bias that would result from survey-based assessments of long-prior events.

In this paper, we evaluate the success of linking the IWHS cohort with Medicare data and demonstrate the potential of these linked data. We use hip fractures to illustrate the value of the linkage and the issues that may pose challenges to using linked data. Hip fractures are relatively common among older persons, at 9.2 hospitalizations per 1,000 women in 2006 (10). In 1994, Baron et al. (11) reported that hip fracture claims had a high degree of internal consistency between hospital and physician reports. Hip fractures are also associated with high (more than 20%) 1-year mortality in the elderly (12), making survey-based assessments of hip fracture incidence challenging.


Data sources

The IWHS comprises a prospective cohort of women aged 55–69 years in 1986 randomly selected from the Iowa driver's license list. Of the 99,826 potentially eligible women, 41,836 responded to a mailed questionnaire (1). There have been 5 follow-up mail surveys since baseline: 1987, 1989, 1992, 1997, and 2004.

We obtained Medicare enrollment data for the IWHS cohort for the period 1986–2004. Medicare hospitalization data (MedPAR file) were obtained for the same period. Hospital outpatient, carrier (physicians and other suppliers), home health, hospice, and durable medical equipment claims are available for services between 1991 and 2004.


The IWHS cohort was matched to Medicare records by using the same algorithm that links Surveillance, Epidemiology, and End Results Program and Medicare data (13). Each study participant's Social Security number, name, sex, and date of birth were submitted for linkage. Matches were determined based on combinations of these identifiers. Data for only 11 of those who were matched by this algorithm were discarded after further comparison of matching variables. Most of these mismatches involved children who were receiving benefits through the participant's Social Security number, so that the respondent was mistakenly linked to the child's Medicare records.

We identified those women whose Medicare claims were most likely to represent their complete health care experience. We selected women who had both Medicare part A (hospitalization) and Medicare part B (physician and other services) coverage and were not enrolled in a Medicare managed care plan (i.e., fee-for-service enrollees).
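As an illustration only, the selection rule above can be expressed as a filter on enrollment periods. The sketch below assumes a hypothetical month-by-month enrollment record (`EnrollmentMonth` and its fields are invented for this example and are not the layout of the actual Medicare enrollment file); it counts person-time only for months with both part A and part B coverage and no managed care enrollment.

```python
from dataclasses import dataclass


@dataclass
class EnrollmentMonth:
    """One month of enrollment status for a participant (hypothetical layout)."""
    part_a: bool        # hospitalization coverage
    part_b: bool        # physician and other services coverage
    managed_care: bool  # enrolled in a Medicare managed care plan


def has_complete_claims(month: EnrollmentMonth) -> bool:
    """True when claims for this month should reflect all covered care."""
    return month.part_a and month.part_b and not month.managed_care


def observable_person_years(months: list[EnrollmentMonth]) -> float:
    """Person-years contributed by months with complete fee-for-service coverage."""
    return sum(has_complete_claims(m) for m in months) / 12.0


# Made-up history: 18 complete months, 6 months of part A only, 12 months of managed care.
history = (
    [EnrollmentMonth(True, True, False)] * 18
    + [EnrollmentMonth(True, False, False)] * 6
    + [EnrollmentMonth(True, True, True)] * 12
)
print(observable_person_years(history))  # 1.5
```

Only the 18 complete months count, so this made-up participant contributes 1.5 of her 3 enrolled years to Medicare-based analyses.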

Assessment of hip fractures from self-report and Medicare data

Self-reported hip fractures were assessed on each of the 5 IWHS follow-up surveys. Respondents were asked to indicate whether a hip fracture had occurred since (the date of the last survey) but were not asked to report the exact date of the hip fracture. The baseline mail survey did not collect information specific to hip fractures but rather asked whether the subject had a history of any fracture after age 35 years. Hip fractures were identified from Medicare data by using International Classification of Diseases, Ninth Revision, code 820.XX from an inpatient hospital (MedPAR) claim (14, 15). Although some women had multiple incident hip fractures, only the first hip fracture identified from each source was included in the analysis.
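The claims-based definition can be sketched as follows. This is a simplified illustration, not the study's actual program: the claim layout (`participant_id`, `admission_date`, `diagnoses`) is hypothetical, and the sketch assumes that a code in the 820.XX family in any diagnosis position qualifies, which the text does not specify. It keeps only the earliest qualifying inpatient admission per woman.

```python
from datetime import date


def is_hip_fracture_code(icd9_code: str) -> bool:
    """True for ICD-9 diagnosis codes in the 820.XX family, with or without the dot."""
    return icd9_code.strip().replace(".", "").startswith("820")


def first_hip_fracture(claims: list[dict]) -> dict[str, date]:
    """Map each participant ID to the admission date of her earliest hip fracture claim."""
    first: dict[str, date] = {}
    for claim in claims:
        if any(is_hip_fracture_code(code) for code in claim["diagnoses"]):
            pid, admitted = claim["participant_id"], claim["admission_date"]
            if pid not in first or admitted < first[pid]:
                first[pid] = admitted
    return first


# Made-up inpatient claims; field names are assumptions for this example.
claims = [
    {"participant_id": "A", "admission_date": date(1995, 3, 2), "diagnoses": ["820.21", "401.9"]},
    {"participant_id": "A", "admission_date": date(1993, 7, 9), "diagnoses": ["82009"]},
    {"participant_id": "B", "admission_date": date(1998, 1, 5), "diagnoses": ["428.0"]},
]
print(first_hip_fracture(claims))  # {'A': datetime.date(1993, 7, 9)}
```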

Statistical analysis

To assess the additional information provided by the linkage, we calculated the percentage of claims-based hip fractures also identified on a follow-up mail survey for women aged 65 years or older. Date of hospital admission was used as the date of hip fracture for those fractures identified from Medicare sources. Date of fracture for the self-report-only group was imputed as the halfway point between the date indicated on the survey (e.g., “Since July 1989, have you suffered a hip fracture …”) and the date on which the survey was mailed (16). Date of death was obtained from the linkage to Iowa vital records and the National Death Index for both IWHS and Centers for Medicare & Medicaid Services survival analyses.
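The midpoint imputation for self-report-only fractures amounts to simple date arithmetic. A minimal sketch follows; the function name and the example mailing date are hypothetical, and only the July 1989 reference date comes from the survey wording quoted above.

```python
from datetime import date


def impute_fracture_date(reference_date: date, mailing_date: date) -> date:
    """Halfway point between the survey's "since" date and the survey mailing date.

    Adding a timedelta to a date ignores any fractional day, so an odd-length
    interval resolves to the earlier of the two middle days.
    """
    return reference_date + (mailing_date - reference_date) / 2


# "Since July 1989 ..." asked on a survey mailed (hypothetically) on 1 July 1992.
print(impute_fracture_date(date(1989, 7, 1), date(1992, 7, 1)))
```

Fractures identified from Medicare need no such imputation because the hospital admission date is used directly.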

Age-specific hip fracture incidence was calculated by using person-years of observation after age 65 years as the denominator and incident fractures as the numerator. Survey-based estimates censored subjects at their last survey response. Medicare-based estimates included only person-years of observation for periods with both Medicare part A and part B coverage and no managed care enrollment. We evaluated 1-year survival (days) after fracture using the Kaplan-Meier method.
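To make the incidence calculation concrete, the sketch below computes fractures per 1,000 person-years from per-woman follow-up records. The record layout is an assumption made for this example: each tuple holds the years observed after age 65 (ending at first fracture or at censoring) and whether a fracture ended follow-up. The values are made up.

```python
def incidence_per_1000(person_time: list[tuple[float, bool]]) -> float:
    """Incident fractures per 1,000 person-years of observation.

    Each tuple is (years observed after age 65, fracture ended follow-up).
    Censored women contribute person-years but no event.
    """
    events = sum(1 for _, fractured in person_time if fractured)
    years = sum(observed for observed, _ in person_time)
    return 1000.0 * events / years


# Made-up follow-up: two censored women and one with a fracture after 4 years.
follow_up = [(10.0, False), (4.0, True), (6.0, False)]
print(incidence_per_1000(follow_up))  # 50.0
```

The survey-based and Medicare-based estimates differ only in how the inputs are built: the former censors at the last survey response, and the latter counts only person-years with complete fee-for-service coverage.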


Overall linkage results

Figure 1 illustrates the timing of the IWHS baseline and mailed surveys with the gradual aging of the IWHS cohort into the Medicare program at age 65 years, along with the varying availability of Medicare data. As shown, while some IWHS participants were eligible for Medicare at baseline (1986), the entire cohort was not enrolled in the program until 1996.

Figure 1.
Availability of Iowa Women's Health Study survey and Medicare data over time. Medicare hospitalization data became available in 1986, and Medicare outpatient data became available in 1991.

Ninety-eight percent of the IWHS cohort survived to age 65 years (typical Medicare eligibility). We were able to successfully match 99.2%, or all but 93 of these women. Of the 858 who did not survive to age 65 years, 216 were linked to the Centers for Medicare & Medicaid Services data (25%). This enrollment prior to age 65 years included a mix of beneficiaries who qualified because of disability or end-stage renal disease (71%) and people who died close enough to their 65th birthday that their Medicare enrollment had been automatically initiated (29%).

All but 87 of the IWHS participants who survived to age 65 years and were linked to Medicare data had been, for at least some period of time, enrolled in the Medicare fee-for-service program. Table 1 shows the number of IWHS participants who survived to age 65 years, enrolled in Medicare, and did or did not use health care, by calendar year.

Table 1.
Availability of Medicare Data for IWHS Participants by Year, 1986–2004

Variability in maintaining both part A and part B coverage was a source of loss of Medicare-based information. Eight percent of the cohort had only part A coverage for some period of time after age 65 years, indicating the likelihood of a primary payer other than Medicare. Incomplete Medicare coverage affected only 3% of the total potential person-years of observation for the cohort (complete coverage was available for 546,083 of 560,131 possible person-years of observation).

Ascertainment of hip fractures

Overall, 1,195 women reported a hip fracture on a follow-up mail survey, and 2,246 women had a Medicare hospitalization for hip fracture. We found 781 women who had hip fractures according to both data sources (Table 2). Only 35% of women with Medicare-identified hip fractures were also identified as having a hip fracture by using the survey data. On the other hand, 65% of women with a survey-reported hip fracture also had a Medicare-identified hip fracture. Failure to be enrolled in Medicare at the time of fracture was the explanation for approximately one-third of the 414 women who reported a hip fracture and for whom we could not find a corresponding hip fracture recorded in the hospitalization data. There was no clear explanation for the discrepancy for the remaining women.
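The agreement percentages follow directly from the three counts given in the text. The short sketch below reproduces the arithmetic; the function and key names are invented for this example.

```python
def overlap_summary(n_survey: int, n_medicare: int, n_both: int) -> dict[str, float]:
    """Summarize agreement between two sources that each flag a set of women."""
    return {
        "pct_medicare_also_survey": 100 * n_both / n_medicare,
        "pct_survey_also_medicare": 100 * n_both / n_survey,
        "survey_only": n_survey - n_both,
        "medicare_only": n_medicare - n_both,
    }


# Counts reported in the text: 1,195 by survey, 2,246 by Medicare, 781 by both.
summary = overlap_summary(n_survey=1195, n_medicare=2246, n_both=781)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

This yields roughly 35% and 65% for the two directions of agreement and 414 women with a survey-reported fracture only, matching the figures in the text.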

Table 2.
Identification of Hip Fractures From Medicare and Self-report, IWHS, 1986–2004

The majority of the 2,246 women whose fractures were identified by using Medicare claims (74%) survived to the next follow-up survey (Table 3). Of these survivors, 907 women, representing 40% of all Medicare-identified fractures and 55% of survivors, responded to the first possible follow-up survey after hip fracture. The majority of these respondents (83%) reported having suffered a hip fracture. The percentage of hip fractures identified by Medicare that were also identified by self-report decreased dramatically across IWHS surveys. Loss to follow-up because of mortality increased over the study period. Only 68% of the Medicare-identified hip fracture patients between 1997 and 2004 were alive at follow-up 5 compared with 95% of those with hip fractures between 1986 and 1989 who survived to follow-up 2. This trend of increasing loss to mortality was influenced by both the aging of the IWHS cohort and the increasing time between follow-up surveys. In addition, the rate of survey response among hip fracture survivors decreased over time. Ninety-two percent of survivors who had a hip fracture between 1986 and 1989 responded to the next follow-up survey (1 or 2), but only 46% of survivors who had a hip fracture between 1997 and 2004 responded to follow-up 5.

Table 3.
Reporting of Medicare-identified Hip Fractures on IWHS Follow-up Surveys

Overall estimates of hip fracture incidence were lower with the survey-based measure than with the Medicare-based measure (2.61 vs. 4.20 hip fractures per 1,000 person-years, respectively). The difference between survey and Medicare-based estimates of hip fracture incidence was minimal at younger ages but dramatically increased at older ages. In addition, the differences between survey and Medicare-based measures were maintained for all body mass index groups (Table 4). Both approaches showed the expected positive association of hip fracture with age and negative association with body mass index. However, the gradient in rate ratios for the survey-based measure of hip fracture was less prominent than for those based on Medicare data.

Table 4.
Comparison of Hip Fracture Rates and Rate Ratios for IWHS Participants Aged 65 Years or Older, 1986–2004

Comparing the characteristics of the IWHS participants whose Medicare-identified hip fractures were not found in the survey data with those whose hip fractures were also reported on a follow-up survey provides important insight into the role of nonresponse bias in this population. Those whose hip fractures were also reported on a follow-up survey were younger at hip fracture than those with Medicare-only identified hip fractures (median age, 75.1 years vs. 77.6 years). The 2 groups did not differ on several other baseline risk factors for hip fracture: body mass index or waist-to-hip ratio (Table 5).

Table 5.
Comparison of Fracture Risk Factors and Mortality Between All Hospitalizations for Hip Fracture and the Subset Reported on a Survey, IWHS, 1986–2004

There were, however, substantial differences in estimated age-specific post-hip-fracture mortality. One-year mortality among those with Medicare-identified hip fractures was 14% compared with just 1% among the subset also identified on a survey. Mortality in the overall Medicare group increased with age, while mortality for women whose fractures were also identified by survey response was stable across age categories. It was not possible to calculate true 1-year mortality for those women whose self-reported fractures were not also found in Medicare data.
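The 1-year mortality figures rest on Kaplan-Meier estimates of postfracture survival. For readers unfamiliar with the estimator, a minimal pure-Python sketch is given here; it is not the software used in the study, and the input data are made up. Each woman contributes the number of days from fracture to death or censoring, with follow-up truncated at 365 days.

```python
def kaplan_meier(durations: list[int], died: list[bool]) -> list[tuple[int, float]]:
    """Kaplan-Meier survival curve as (day, survival probability) pairs.

    durations: days from fracture to death or censoring.
    died: True if follow-up ended in death, False if censored.
    A pair is emitted only for days on which at least one death occurred.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for day in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, died) if d == day and e)
        leaving = sum(1 for d in durations if d == day)
        if deaths:
            # Multiply by the conditional probability of surviving this day.
            survival *= 1 - deaths / at_risk
            curve.append((day, survival))
        # Deaths and censored observations both leave the risk set.
        at_risk -= leaving
    return curve


# Made-up data: one death at day 30, one woman censored at day 200,
# and two alive at the end of the 365-day window.
days = [30, 200, 365, 365]
death = [True, False, False, False]
print(kaplan_meier(days, death))  # [(30, 0.75)]
```

One-year mortality is then 1 minus the last survival probability at or before day 365, which is 25% in this made-up example.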


In cohorts such as the IWHS with high-quality identifiers, linkage with Medicare administrative data is feasible and technically straightforward. The linkage is particularly valuable for studying endpoints associated with high levels of morbidity or mortality since they will be more subject to loss to follow-up bias than outcomes with low levels of morbidity or mortality. Medicare also offers opportunities to more validly assess the population impact of high morbidity and mortality conditions, to measure the mortality associated with acute and chronic conditions, and to link that mortality to pre-event risk factors. With Medicare data, exact event dates are available, which offers increased precision for assessing both short- and long-term survival, costs, and associated health care use.

Furthering the value of Medicare linkage are the detrimental effects of aging on the quality and availability of survey-based data. With age, challenges associated with mortality, response burden, cognitive decline, and institutionalization limit the value of survey-based measures. In such cases, external measures such as those from health care claims provide opportunities for direct measurement of disease incidence as well as external validation of responses. Conditions such as hip fracture that are unambiguous in their diagnosis and that require health care are ideally suited for analysis with Medicare claims data. The claims-based literature has grown considerably in recent years, and algorithms are available for a wide range of acute and chronic conditions such as stroke, small bowel obstruction, diabetes, congestive heart failure, and arthritis (17).

Likewise, claims can be used to identify and quantify the receipt of health care, including influenza vaccination, elective joint repair, mammography use, and rehabilitation. Beyond using claims to assess risk factor associations, the data can also be used to estimate economic costs of disease, treatment patterns, and outcomes following diagnosis. All extend the value of baseline and follow-up data for these cohorts.

Some of the apparent mismatches between Medicare-reported and survey-reported hip fractures represent self-report errors. An earlier IWHS study of hip fractures (18) found that data on 12% of hip fractures reported on the survey could not be found in medical records. These mismatches were later confirmed by the respondent to have been erroneously reported on the survey. The Women's Health Initiative was able to confirm 78.2% of self-reported hip fractures with medical records (19). These estimates are consistent with our findings that a measurable percentage of self-reported hip fractures may represent survey error.

In this example, we illustrated the value of Medicare claims for studying hip fractures in older women. We demonstrated that an analysis based on self-report would underestimate both the incidence of and mortality after hip fracture. The mortality effects, as we illustrated, are considerable. People rarely die directly from hip fractures; rather, fractures can start a cascade of disability that leads to complications such as pneumonia, sepsis, or the deconditioning resulting from the fall and associated immobility.

A key advantage to linkage with Medicare data is that they are available from a single source in standard formats, which results in considerable efficiency compared with other electronic or secondary data, particularly when studying geographically dispersed populations. Linkage is facilitated by the program's ability to match records on the basis of each beneficiary's own Social Security number. This advantage allows individuals to be identified even if they claim their Medicare benefits under another person's work history (most often a spouse). The ability to link medical claims longitudinally and nationally is particularly useful for tracking health events for IWHS participants who move out of state or become institutionalized.

There are attributes of the IWHS cohort that are unique and may contribute to the high level of success of this linkage process. Most importantly, the cohort had extremely complete and high-quality identifiers. We were able to link 99.2% of the IWHS women who survived to age 65 years. Second, there was relatively little Medicare managed care in Iowa. In the IWHS cohort, only 1% of the total possible person-years of observation after age 65 years was lost to managed care enrollment. The rest of the 3% loss was due to incomplete part A and/or part B enrollment. Managed care exceeds 30% in other areas of the United States (e.g., California) (20); linkages to populations in high-managed-care areas would observe significant loss. It is important to note that managed care enrollment varies over both geography and time. Thus, the loss due to managed care may change over time and differently across regions of the United States. Although managed care enrollees represent a source of missing utilization data, these enrollees will link to Medicare enrollment records, so the amount of missing data can be quantified.

Our high linkage rate is consistent with the experience of the Surveillance, Epidemiology, and End Results Program/Medicare project. Medicare linkages do not reflect the experiences of the small percentage of people older than age 65 years who are not covered by Medicare, largely persons who never participated in the Social Security or Railroad Retirement Board programs. Others who may be underrepresented in Medicare populations include civil servants who, in the 1960s and 1970s, were allowed to opt out of Medicare, and a small group of elderly who have Medicare part A coverage (i.e., hospitalization) but not Medicare part B coverage (i.e., other services). Reasons for waiving part B coverage include having other (primary) insurance such as from an employer and relying on another system such as the Veterans Administration or Indian Health Service for primary care. In our study, we found that approximately 7% of IWHS participants spent some period of time without part B coverage. Overall, however, the impact on person-years of observation was minimal: we found complete part A plus part B fee-for-service enrollment for 97% of the potential person-years of observation for our cohort.

With expanded use of Medicare data for cohort follow-up, methodological advances will likely follow. One challenge is how to appropriately study a cohort with a large percentage of left truncation. That is, there are gaps between date of a survey and date of Medicare enrollment when it is unknown whether the outcome of interest was experienced. For 65% of the cohort, there was a gap between baseline survey and Medicare enrollment at age 65 years. Current Medicare data users often adopt approaches such as using the first year of Medicare enrollment to confirm that elderly are disease free or to identify comorbidities. With linked cohorts, the surveys themselves offer this opportunity, but the challenge of left truncation remains.

In conclusion, the linkage experience of the IWHS cohort illustrates the value of using Medicare administrative data to extend the utility of established cohorts for identifying additional endpoints, particularly those associated with considerable morbidity and mortality. Even when the condition of interest may not be associated with survival or response, administrative data may offer a means to reduce response burden and enable surveys to focus on topics such as quality of life and risk factor changes not measurable by using external data sources.


Author affiliations: Division of Health Policy and Management, School of Public Health, University of Minnesota, Minneapolis, Minnesota (Beth Virnig, Sara B. Durham); Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, Minnesota (Aaron R. Folsom); and Department of Health Sciences Research, Division of Epidemiology, Mayo Clinic, Rochester, Minnesota (James Cerhan).

This study was funded by the National Cancer Institute (grant 5R01-CA039742).

Conflict of interest: none declared.



Abbreviation: IWHS, Iowa Women's Health Study


1. Folsom AR, Kushi LH, Anderson KE, et al. Associations of general and abdominal obesity with multiple health outcomes in older women: the Iowa Women's Health Study. Arch Intern Med. 2000;160(14):2117–2128.
2. Wennberg DE, Lucas FL, Siewers AE, et al. Outcomes of percutaneous coronary interventions performed at centers without and with onsite coronary artery bypass graft surgery. JAMA. 2004;292(16):1961–1968.
3. Begg CB, Riedel ER, Bach PB, et al. Variations in morbidity after radical prostatectomy. N Engl J Med. 2002;346(15):1138–1144.
4. Wilkinson GS, Kuo YF, Freeman JL, et al. Intravenous bisphosphonate therapy and inflammatory conditions or surgery of the jaw: a population-based analysis. J Natl Cancer Inst. 2007;99(13):1016–1024.
5. Welch HG, Fisher ES. Diagnostic testing following screening mammography in the elderly. J Natl Cancer Inst. 1998;90(18):1389–1392.
6. Woodward WA, Giordano SH, Duan Z, et al. Supraclavicular radiation for breast cancer does not increase the 10-year risk of stroke. Cancer. 2006;106(12):2556–2562.
7. Warren JL, Yabroff KR, Meekins A, et al. Evaluation of trends in the cost of initial cancer treatment. J Natl Cancer Inst. 2008;100(12):888–897.
8. Snyder CF, Earle CC, Herbert RJ, et al. Trends in follow-up and preventive care for colorectal cancer survivors. J Gen Intern Med. 2008;23(3):254–259.
9. Morris AM, Wei Y, Birkmeyer NJ, et al. Racial disparities in late survival after rectal cancer surgery. J Am Coll Surg. 2006;203(6):787–794.
10. Stevens JA, Rudd RA. Declining hip fracture rates in the United States [published online ahead of print May 19, 2010]. Age Ageing. (doi:10.1093/ageing/afq044)
11. Baron JA, Lu-Yao G, Barrett J, et al. Internal validation of Medicare claims data. Epidemiology. 1994;5(5):541–544.
12. Leibson CL, Tosteson AN, Gabriel SE, et al. Mortality, disability, and nursing home use for persons with and without hip fracture: a population-based study. J Am Geriatr Soc. 2002;50(10):1644–1650.
13. Potosky AL, Riley GF, Lubitz JD, et al. Potential for cancer related health services research using a linked Medicare-tumor registry database. Med Care. 1993;31(8):732–748.
14. Nguyen-Oghalai TU, Kuo YF, Zhang DD, et al. Discharge setting for patients with hip fracture: trends from 2001 to 2005. J Am Geriatr Soc. 2008;56(6):1063–1068.
15. Forte ML, Virnig BA, Kane RL, et al. Geographic variation in device use for intertrochanteric hip fractures. J Bone Joint Surg Am. 2008;90(4):691–699.
16. Nicodemus KK, Folsom AR, Anderson KE. Menstrual history and risk of hip fractures in postmenopausal women. The Iowa Women's Health Study. Am J Epidemiol. 2001;153(3):251–255.
17. Warrenton, VA: Buccaneer Computer Systems & Service, Inc; CMS Chronic Condition Data Warehouse Condition Categories. (Accessed June 4, 2010)
18. Munger RG, Cerhan JR, Chiu BC. Prospective study of dietary protein intake and risk of hip fracture in postmenopausal women. Am J Clin Nutr. 1999;69(1):147–152.
19. Chen Z, Kooperberg C, Pettinger MB, et al. Validity of self-report for fractures among a multiethnic cohort of postmenopausal women: results from the Women's Health Initiative Observational Study and Clinical Trials. Menopause. 2004;11(3):264–274.
20. Menlo Park, CA: The Henry J. Kaiser Family Foundation; 2003. Medicare+choice. (Fact sheet #2052-06). (Accessed October 30, 2008)

Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press