J Gerontol B Psychol Sci Soc Sci. 2009 November; 64B(Suppl 1): i86–i93.
PMCID: PMC2763517

Medication Data Collection and Coding in a Home-Based Survey of Older Adults



Objectives. To describe the collection, coding, and validity of medication data from the National Social Life, Health and Aging Project (NSHAP), a survey of a national probability sample of adults aged 57–85 years.

Methods. Medication data were collected during an in-home interview by direct observation using a computer-based log and included prescription medications, over-the-counter medications, and nutritional supplements. The Multum® drug database was used for coding drug names and for mapping those names to therapeutic categories. Drugs not included in Multum® were assigned to medication classes by extending Multum’s typology. Internal and external validity of the medication data are examined and analytic use of the medication data is discussed.

Results. Only 0.9% of respondents refused to participate in the medication log. Ninety-nine percent of all entries were identified and mapped to a medication class. Use of medication classes correlated highly with the presence of corresponding health conditions and related biological measures. The prevalence of use of common therapeutic classes of medications in NSHAP is comparable to that found in other national studies.

Discussion. Nearly all NSHAP respondents cooperated with the medication use data collection protocol. Medication data obtained by the in-home, direct observation medication log method were found to be internally and externally valid.

Keywords: Elderly, Medication, Medication log, Methods, Population-based, Validation

The 2004 United States Health Report determined that 80% of the geriatric population (adults aged 65 years and older) surveyed in 1999–2000 took at least one prescription drug in the prior month (National Center for Health Statistics, 2004). About half of the individuals in the same age group reported taking three or more prescription drugs, representing an increase of nearly 50% over the prior 5–10 years. Considering that older adults in the United States consume a disproportionately large and increasing share of medications, there is a growing need to understand and account for the physiological role of medications, their physical and mental (psychological, cognitive) side effects, and medication use behavior in analyses pertinent to older adult health.

Accurate assessment of medication use in studies of older adult health is essential and requires methods for collection and coding of medication data, incorporation of medication data into health-related analyses, and interpretation of medication-related findings. Incorporation of medication data into biosocial models of older adult health is important for informing health care practices, improving medication-related health outcomes, and understanding biological pathways through which social factors and conditions affect health in later life. Incorporation of biological measures as predictors of health or social outcomes in gerontological research requires consideration of the effects of medications on both the biomeasures and outcomes of interest.

Methods for the collection and coding of medication data in population-based survey research differ across studies and are described with variable detail in the social and epidemiological research literature (Pahor et al., 1994). Most studies derive medication data from chart review, administrative/claims data, or self-report of prescribed medicines (thus, nonprescription medications are not included). For medication use data, an in-home inventory of medications obtained by direct observation has been shown to be a more reliable measurement tool than self-report recall methods (Landry et al., 1988; Psaty et al., 1992). Coding of medication data depends on the study objective and also varies across studies. Because a variety of proprietary databases can be purchased and used to code medication data, including First Databank® and Multum®, classification of medication data may not be consistent across studies.

This paper describes the collection, coding, and validity of the medication use data from the National Social Life, Health and Aging Project (NSHAP), a survey of a national probability sample of 3,005 community-residing older adults aged 57–85 years. The most comparable prior studies are the Medical Expenditure Panel Survey (MEPS), the Slone Survey, and the National Health and Nutrition Examination Survey (NHANES). Important differences between the data collection protocols used in these studies and those used in NSHAP include: (a) MEPS collects information on prescription acquisition over a 12-month period (Moeller et al., 2001), which may overestimate the prevalence of current medication use because people may purchase or acquire more medications than they actually use; (b) the Slone Survey has a substantially smaller sample size of older adults and uses telephone-based self-report (Kaufman, Kelly, Rosenberg, Anderson, & Mitchell, 2002; Slone Survey, 2005); and (c) NHANES measures any use, as opposed to regular use, of prescription medications in the past month (Centers for Disease Control and Prevention [CDC], 2008b). Another important data set, the Medicare Current Beneficiary Survey, collects information on prescribed medications and publishes only aggregate information on the number of filled prescriptions (Gaskin, Briesacher, Limcango, & Bringantti, 2006). For these reasons, estimates of medication use based on NSHAP may vary from those obtained in previous studies.



Methods

Medication data were collected during an in-home interview by direct observation using a computer-based medication log or inventory. NSHAP field staff specifically asked “to record all medications that you [respondent] take on a regular schedule, like every day or every week. This will include prescription and nonprescription medications, over-the-counter medicines, vitamins, and herbal and alternative medicines.” The interviewer, who had no specialized knowledge of medications, directly recorded names of medications from the medication packaging (e.g., bottle, tube, blister pack) into the computer. Interviewers were allowed to record up to 20 medications. It is possible that, in some cases, the interviewer recorded a medication based only on the respondent’s report (e.g., if the bottle or package was unavailable). The computer-based survey instrument did not record whether an entry was logged from the respondent’s report rather than by direct observation, but the fact that this did not arise as a major issue during interviewer debriefing suggests that it was rare.

Coding of Medications

Initial identification of medication names was accomplished using the National Drug Code (NDC) directory, Micromedex®, and the National Library of Medicine Medline Plus® Drug and Supplements database. Reported medication names were translated to generic drug names when possible and then coded according to the set of drug names in the Multum drug classification database. We selected Multum from several available drug databases because it is also used by NHANES, MEPS, and the Health and Retirement Study to code their medication data. The Multum database is accessible online with authorization from Multum; the Multum Lexicon Plus version was used here. Of the 5,801 unique verbatim responses, 5,229 (90%) were matched to a Multum drug name (Figure 1). All entries were first coded by a clinical pharmacist (D.M.Q.) and then double-checked by both D.M.Q. and a research assistant working under D.M.Q.’s supervision (A.M.). To preserve confidentiality, medications indicated for the treatment of HIV/AIDS are excluded from the public-use data set.

Figure 1.
Flowchart summarizing drug name coding.

In all, 416 (7%) unique responses were identified as drugs that were not included in the Multum Lexicon Plus database. These names were added to our local copy of the database and classified by D.M.Q. according to the existing Multum therapeutic categories (described subsequently); we refer to them here as “extended Multum drug names.” Because Multum’s coverage of alternative medicines and nutritional products is not complete, most extended Multum drug names fall into these categories. Some responses lacked the detail needed to identify the exact drug name but contained enough information to be matched to a Multum therapeutic category; these were also assigned an extended Multum drug name, for example, “unspecified potassium.”

For extended Multum drug names, the components were identified using multiple resources, including the National Library of Medicine Medline Plus Drugs and Supplements database, the Natural Standard® database, and Web sites for product names and labels. Despite the inconsistent and nonstandard terminology used to classify dietary supplements, we grouped these products as either alternative medicines or nutritional products. This approach was chosen for consistency with Multum classifications, NHANES, the United States Slone Survey on Medication Use, and the National Health Interview Survey. Alternative medicines included two subcategories: (a) herbals/botanicals (e.g., alfalfa) and (b) nutraceuticals. Nutraceuticals are nonherbal, nonvitamin, and nonmineral supplements (e.g., glucosamine). Nutritional products included: (a) vitamin and/or mineral combinations, (b) single vitamins, (c) minerals/electrolytes, and (d) iron products. Weight loss supplements, fiber products, and uncategorized products were coded as “other supplements.”

Of the remaining 156 unique entries (2.7%) that could not be coded according to Multum, 71 clearly indicated a therapeutic objective and were coded according to this; for example, the response “medication for diabetes” is coded as “diabetic medication.” Only 85 unique responses were unrecognizable.
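The tiered coding procedure described above can be summarized in a short sketch. This is an illustration only, not the NSHAP coding software: the lookup tables and the function `code_entry` are hypothetical stand-ins (the Multum Lexicon Plus database is proprietary, and the actual coding relied on review by a clinical pharmacist). The counts of unique verbatim responses are those reported in the text.

```python
# Hypothetical lookup tables standing in for the real (proprietary) databases.
MULTUM_NAMES = {"lisinopril", "hydrochlorothiazide", "metformin"}
EXTENDED_NAMES = {"glucosamine", "unspecified potassium"}
THERAPEUTIC_OBJECTIVES = {"medication for diabetes": "diabetic medication"}


def code_entry(verbatim):
    """Return (tier, coded_name) for one verbatim medication log entry."""
    name = verbatim.strip().lower()
    if name in MULTUM_NAMES:                 # tier 1: Multum drug name
        return ("multum", name)
    if name in EXTENDED_NAMES:               # tier 2: extended Multum drug name
        return ("extended_multum", name)
    if name in THERAPEUTIC_OBJECTIVES:       # tier 3: therapeutic objective only
        return ("therapeutic_objective", THERAPEUTIC_OBJECTIVES[name])
    return ("unrecognized", None)            # tier 4: could not be coded


# Counts of unique verbatim responses reported in the text.
unique_counts = {
    "multum": 5229,
    "extended_multum": 416,
    "therapeutic_objective": 71,
    "unrecognized": 85,
}
total = sum(unique_counts.values())
shares = {tier: round(100 * n / total, 1) for tier, n in unique_counts.items()}
```

The four tiers sum to the 5,801 unique verbatim responses, and the 71 and 85 entries in the last two tiers together make up the 156 responses that could not be coded according to Multum.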

Many analysts will want to work with the medication data in terms of therapeutic categories; for example, they may wish to identify all respondents who are currently taking a particular type of cardiovascular agent. Multum’s therapeutic categories are based on the American Hospital Formulary Service therapeutic categories and are organized into a three-level hierarchy permitting classification at the therapeutic level, pharmacological level, and drug category level. This is illustrated in Figure 2, which shows the way in which the therapeutic (Level 1) classification “cardiovascular agents” is subdivided into the pharmacological (Level 2) categories “antihypertensive combinations,” “diuretics,” and “angiotensin converting enzyme (ACE) inhibitors.” All diuretics are then further subdivided into specific drug (Level 3) categories, such as “thiazide diuretics.”

Figure 2.
Example of Multum drug name classification.

The Multum database maps each Multum drug name to a specific category in the hierarchy, and our extended Multum names have also been mapped to specific categories as described earlier. To facilitate the use of the data set, variables have been created for each category to indicate whether the respondent reported taking an item in that category or in one of the categories below it in the hierarchy. For example, a respondent taking lisinopril (mapped by Multum to the category “angiotensin converting enzyme [ACE] inhibitors”) is coded as taking both ACE inhibitors and cardiovascular agents, whereas a respondent taking hydrochlorothiazide (mapped by Multum to the category “thiazide diuretics”) is coded as taking thiazide diuretics, diuretics, and cardiovascular agents.
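The roll-up of a drug's category to every category above it in the hierarchy can be sketched as follows. The dictionaries contain only the small fragment of the hierarchy mentioned in the text, and the names `PARENT`, `DRUG_CATEGORY`, `categories_for`, and `indicators` are hypothetical; they do not reflect the layout of the Multum database or the variable names in the NSHAP data set.

```python
# Hypothetical fragment of the hierarchy: child category -> parent category.
PARENT = {
    "thiazide diuretics": "diuretics",
    "diuretics": "cardiovascular agents",
    "ACE inhibitors": "cardiovascular agents",
    "antihypertensive combinations": "cardiovascular agents",
}

# Hypothetical map from drug name to its most specific category.
DRUG_CATEGORY = {
    "lisinopril": "ACE inhibitors",
    "hydrochlorothiazide": "thiazide diuretics",
}

ALL_CATEGORIES = sorted(set(PARENT) | set(PARENT.values()))


def categories_for(drug):
    """Return the drug's own category followed by all of its ancestors."""
    found = []
    category = DRUG_CATEGORY.get(drug)
    while category is not None:
        found.append(category)
        category = PARENT.get(category)   # None once the top level is reached
    return found


def indicators(drugs, all_categories=ALL_CATEGORIES):
    """Build 0/1 indicators: 1 if any listed drug falls in or below a category."""
    taken = {c for d in drugs for c in categories_for(d)}
    return {c: int(c in taken) for c in all_categories}
```

Under these assumptions, `categories_for("hydrochlorothiazide")` yields thiazide diuretics, diuretics, and cardiovascular agents, matching the example in the text.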

As with individual drugs, Multum also maps combination drugs to a specific category. In these cases, the NSHAP database records not only the location of the combination but also the location of each component. Thus, a respondent taking hydrochlorothiazide–lisinopril would be coded not only as taking both antihypertensive combinations and cardiovascular agents but also under the categories for single-agent hydrochlorothiazide and single-agent lisinopril (as described earlier).
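A minimal sketch of this treatment of combination drugs is given below. It assumes a hypothetical `COMPONENTS` table listing the single agents in each combination product; the hierarchy fragment and all names are illustrative and are not taken from the Multum or NSHAP files.

```python
# Hypothetical fragment of the hierarchy: child category -> parent category.
PARENT = {
    "thiazide diuretics": "diuretics",
    "diuretics": "cardiovascular agents",
    "ACE inhibitors": "cardiovascular agents",
    "antihypertensive combinations": "cardiovascular agents",
}

# Hypothetical map from drug name to its most specific category.
DRUG_CATEGORY = {
    "lisinopril": "ACE inhibitors",
    "hydrochlorothiazide": "thiazide diuretics",
    "hydrochlorothiazide-lisinopril": "antihypertensive combinations",
}

# Hypothetical table of the single agents contained in each combination.
COMPONENTS = {
    "hydrochlorothiazide-lisinopril": ["hydrochlorothiazide", "lisinopril"],
}


def with_ancestors(category):
    """Return the category together with every category above it."""
    found = set()
    while category is not None:
        found.add(category)
        category = PARENT.get(category)
    return found


def categories_for(drug):
    """Categories for the drug itself plus those of each component, if any."""
    categories = with_ancestors(DRUG_CATEGORY[drug])
    for component in COMPONENTS.get(drug, []):
        categories |= categories_for(component)
    return categories
```

For the combination product, the result contains the combination's own location (antihypertensive combinations, cardiovascular agents) as well as the locations of both components (thiazide diuretics, diuretics, ACE inhibitors).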

Because the Multum database links individual drug names to one or more NDCs, and because each NDC code is classified as either prescription or nonprescription, we were able to identify those medications in the NSHAP data set that are available by prescription only (hereafter referred to as “prescription medications”). Drugs that are not linked to an NDC code by Multum or for which this linkage is incomplete were individually reviewed by D.M.Q. and classified accordingly.

Assessing Validity of Medication Use Data

Internal validity.—

Internal validity of the directly observed medication log as an instrument for collecting medication use data was assessed using three strategies: (a) extent of agreement between current use of a disease-specific class of medications (e.g., antidiabetic agents) and self-report of ever being diagnosed with the disease (e.g., diabetes), (b) concordance between gender and use of gender-specific medications (e.g., estrogens and erectile dysfunction agents), and (c) differences in two biological correlates of medication use—resting pulse (measured twice and then averaged) as a correlate of beta-adrenergic–antagonist or –agonist use and glycosylated hemoglobin (HbA1c) as a correlate of glucose-lowering drugs—between users and nonusers. The use of beta-blockers (drugs that block receptors in the heart and vasculature, with a consequent decrease in heart rate) is expected to cause a decrease in pulse rate, whereas the use of beta-agonists (drugs that promote the activity of beta-receptors in the lung, with a consequent increase in heart rate) is expected to cause an increase. HbA1c levels (a measure of glucose metabolism over the previous 60–90 days) are expected to be higher in diabetes medication users.

External validity.—

To evaluate the external validity of the medication data, comparisons using available data from other national studies were performed. To increase comparability, only those NSHAP respondents aged 65 years and older were included. Stagnitti (2004) and Daniel and Malone (2007) report on outpatient prescription drugs purchased by adults older than 65 years in the MEPS. The Slone Survey (2005) reports on medication use in children and adults in the United States, but only data for those aged 65 years and older are used for this comparison. An analysis of NHANES’ prescribed medicines data was conducted for respondents older than 65 years (CDC, 2008a).

Example of Analytic Application

To illustrate one of the ways in which the medication data may be used in a substantive analysis and the impact it can have on the results, we fit linear regression models (Weisberg, 1985) to both systolic blood pressure (SBP) and diastolic blood pressure (DBP), each measured twice and then averaged. Measurements were taken from the left arm using a Lifesource digital blood pressure monitor (Model UA-767PVL). Covariates included age, gender, race/ethnicity, and education (as a proxy for socioeconomic status)—each of which is known to be related to blood pressure. Because the use of antihypertensives not only affects blood pressure but may also mediate and/or moderate the relationship between blood pressure and the covariates, we fit separate models among antihypertensive users and nonusers. Antihypertensive users were identified as those using one or more medications from the following Multum therapeutic categories: angiotensin converting enzyme (ACE) inhibitors, beta-adrenergic blockers, diuretics, and calcium channel blockers. Drugs in each of these categories have a known lowering effect on blood pressure.

Statistical Considerations

All population estimates (proportions, means, and regression coefficients) are weighted using the weights distributed with the data set, which adjust for differential probabilities of selection and differential nonresponse. Standard errors were computed using the linearization method (Binder, 1983), taking into account the stratification and clustering of the sample design. Approximate 95% confidence intervals (CIs) for all estimates were obtained by inverting the corresponding Wald test. Hypothesis tests in Table 4 (represented by starring the corresponding coefficients) were two sided.
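For readers unfamiliar with design-based estimation, the sketch below shows one way a weighted mean, its linearized standard error, and a Wald-type interval could be computed for a stratified, clustered sample. It is a simplified illustration under stated assumptions, not the code used for the NSHAP analyses: the record layout (stratum, cluster, weight, value) and the function name are hypothetical, clusters are treated as sampled with replacement within strata, and a normal critical value of 1.96 is used where survey software would typically use a design-based reference distribution.

```python
import math
from collections import defaultdict


def weighted_mean_linearized(records, z_crit=1.96):
    """Weighted mean, linearized SE, and Wald interval.

    records: iterable of (stratum, cluster, weight, y) tuples.
    """
    records = list(records)
    w_sum = sum(w for _, _, w, _ in records)
    mean = sum(w * y for _, _, w, y in records) / w_sum

    # Linearized value of each record for the ratio estimator,
    # totalled within each sampled cluster.
    cluster_totals = defaultdict(float)
    for stratum, cluster, w, y in records:
        cluster_totals[(stratum, cluster)] += w * (y - mean) / w_sum

    by_stratum = defaultdict(list)
    for (stratum, _), total in cluster_totals.items():
        by_stratum[stratum].append(total)

    # Between-cluster variance within each stratum, summed over strata.
    variance = 0.0
    for totals in by_stratum.values():
        n = len(totals)
        if n < 2:
            raise ValueError("each stratum needs at least two clusters")
        centre = sum(totals) / n
        variance += n / (n - 1) * sum((t - centre) ** 2 for t in totals)

    se = math.sqrt(variance)
    return mean, se, (mean - z_crit * se, mean + z_crit * se)


# Toy data: two strata, two single-respondent clusters each, equal weights.
toy = [("A", 1, 1.0, 1.0), ("A", 2, 1.0, 3.0),
       ("B", 1, 1.0, 5.0), ("B", 2, 1.0, 7.0)]
toy_mean, toy_se, toy_ci = weighted_mean_linearized(toy)
```

With equal weights and one respondent per cluster, the result reduces to the textbook stratified-sample variance of a mean, which offers a simple check on the sketch.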

Table 4.
Estimated Coefficients and SEs (in parentheses) From Regression Models Predicting SBP and DBP Among Antihypertensive Medication Users and Nonusers



Results

Only 0.9% of respondents refused to participate in the medication log protocol. Among those who did participate, 8.8% reported taking no medications regularly. Only 0.7% reported taking 20 or more medications. The mean number of medications reported was 5.2.


The 2,717 respondents reporting at least one medication generated a total of 15,389 log entries. Of these, 14,522 (94.4%) were matched to a Multum drug name and 702 (4.6%) were matched to an extended Multum drug name. This left only 165 (1.1%) unmatched; of these, 78 were coded according to therapeutic objective and 87 were unrecognizable.

Internal Validity

Table 1 shows the proportion of respondents who reported having been diagnosed with diabetes, thyroid problems, hypertension, or an enlarged prostate, estimated separately for those regularly taking a corresponding medication and those who were not. Ninety-five percent of those taking an antidiabetic agent reported having diabetes, as compared with only 5% of those not taking one. In contrast, among those men taking alpha blockers or 5-alpha reductase inhibitors, only 65% reported having an enlarged prostate, whereas 22% of men not taking these medications reported the condition. All the 158 respondents who reported taking estrogen were women, and all the 29 respondents who reported taking erectile dysfunction agents were men.

Table 1.
Prevalence of Self-Reported Health Conditions Among Users of Disease-Specific Drug Classes

The average pulse among those taking beta-blockers was 8.1 beats/min (95% CI = 7.0–9.3) lower than among those not taking these medications (Table 2). In contrast, the average pulse was 4.4 beats/min (95% CI = 1.8–6.9) higher among those taking beta-agonists than among those not taking them. Glycosylated hemoglobin levels were 1.4 units (% of total hemoglobin; 95% CI = 1.2–1.6) higher among those taking antidiabetic agents than among those not taking them.

Table 2.
Validation of Medication Use Measurement Using Biological Indicators: Differentials for Users of Beta-Agonists, Beta-Blockers, and Antidiabetic Agents

External Validity

Table 3 compares NSHAP with MEPS, Slone Survey, and NHANES with respect to the prevalence of medication use overall, the use of medications from several specific therapeutic categories, and the use of several specific medications. In general, the estimates for NSHAP were fairly similar to those from the other studies, with the observed differences being in the expected directions. Specifically, the estimated prevalence of medication use (both any medication and prescription medications) was slightly higher for NSHAP than for Slone (which relied on self-report over the phone), whereas the prevalence and mean number of prescription medications were somewhat lower for NSHAP than for MEPS (which recorded all drugs purchased, rather than actually used, during the past year).

Table 3.
A Comparison Between NSHAP Medication Data and Other Current National Surveys for Ages ≥65 Years

Analytic Application

Table 4 shows the results of regressing SBP and DBP on gender, age, race/ethnicity, and education separately for those taking antihypertensive medications and those not taking them. The effects of the covariates on SBP are substantially greater among those not taking antihypertensives; in fact, among antihypertensive users, the effects of gender, race/ethnicity, and education are not even statistically significant at the .05 level. Among nonusers, SBP is on average lower among women, Hispanics, and those with higher education, whereas it is higher among Blacks. The same is true for DBP with regard to race/ethnicity and education; however, the results for gender and age are slightly different. In particular, the effects of gender and age are greater among antihypertensive users, and they are reversed; DBP is greater among women users and decreases with age.


Discussion

Nearly all NSHAP respondents cooperated with the medication log protocol. The resulting data were then successfully coded using the proprietary Multum database, thus facilitating comparisons with studies such as NHANES and MEPS (both of which also used Multum to code their data). Nearly all the recorded entries could be matched either to a Multum drug name or to an extended Multum drug name and have, therefore, been classified according to Multum’s therapeutic categories. Because this process involved identifying a wide range of prescription and nonprescription drugs using both generic and brand names, and because it also involved extending the Multum database to accommodate additional alternative medicines and nutritional products, it required the expertise of both pharmaceutical and medical experts (D.M.Q. and S.T.L.). The resulting data set, however, can be used by analysts without such expertise as long as they know which therapeutic categories are appropriate for their analyses (as discussed subsequently).

As has been found previously (Landry et al., 1988; Psaty et al., 1992), the medication use data collected using this procedure were found to correspond well with gender (for gender-specific medications), with the presence of specific health conditions, and with differences in biological measures affected by medication use. Although the agreement between reporting a specific condition and the observed use of medication indicated for that condition was not perfect, this is to be expected for several reasons: (a) a specific drug class may have more than one therapeutic application (e.g., antihypertensives may be used for purposes other than treating blood pressure, such as congestive heart failure or migraine prophylaxis), (b) an individual may have been diagnosed with a condition that did not currently require treatment with medication, (c) an individual may have been diagnosed with a health condition and prescribed medical treatment but been either unable or unwilling to comply with the treatment (untreated disease), or (d) an individual may have been taking a medication for a condition that was not reported to the interviewer.

With respect to external validity, the NSHAP data yield estimates of medication use among community-residing older adults that are similar to those from the best prior medication use data for this population (CDC, 2008b; Gaskin et al., 2006; Kaufman et al., 2002; Moeller et al., 2001). Subtle differences between these data sets are likely due to variations in data collection protocols, differences in the period of time for which current medication use is defined (e.g., “daily or weekly,” “in the week prior to the interview,” or “in the past month”), and the fact that some studies do not collect data on nonprescription medications. For example, the prevalence of central nervous system agent use (a class of agents that includes analgesic drugs) was 52% in NSHAP versus 46% in MEPS. Although one might have expected a higher prevalence estimate from MEPS due to its longer (12-month) time window, the fact that it focused solely on prescription medications meant that it did not include the widely used nonprescribed analgesic medications (e.g., acetaminophen or ibuprofen). For additional details on the prevalence and patterns of medication use estimated from the NSHAP data set, see Qato and colleagues (2008).

The NSHAP medication use data are suitable both for studies in which medication use is the primary outcome and for those in which the use of one or more specific medications (or medication classes) is a covariate. The simple example shown here—stratifying an analysis of blood pressure by the use of antihypertensives—demonstrates the importance of controlling for medication use in the analysis of health outcomes. In particular, high blood pressure among those not currently taking antihypertensives reflects either disease that is undiagnosed, disease that is diagnosed but for which no treatment has been prescribed, or an inability or unwillingness to use the prescribed medication. In contrast, high blood pressure among those taking antihypertensives may reflect an inadequate treatment or a lack of adherence to the prescribed treatment. As illustrated in our example, the social factors affecting these two processes may be different. Although information on medication dosage, compliance, and cost would permit addressing these issues in greater detail, NSHAP’s omnibus nature prevented collecting this additional level of detail. Linkage of the NSHAP data set to Medicare prescription drug data or drug cost data sets may be feasible in the future but will require additional permission from NSHAP participants.

The public-use data set is designed to facilitate a wide range of analyses by including a series of indicator (0/1) variables, each of which indicates that a respondent reported taking at least one medication in a given Multum drug category. These variables can then be easily combined, as necessary. Doing this requires knowledge of the Multum classification hierarchy (a complete description of which is distributed with the NSHAP data set) and requires medical and/or pharmaceutical knowledge. For example, when controlling for the use of antihypertensives, one cannot rely solely on the therapeutic classification “cardiovascular agents” because subclasses such as “antianginal agents,” “antiarrhythmic agents,” and “inotropic agents” are not specific for the treatment of hypertension; instead, one must combine the various subclasses specific for hypertension as we did here. It is also important to note that Multum’s therapeutic classes identify the overall therapeutic effect of the drug and not necessarily the effect of all active ingredients. Information on therapeutic or pharmacological categories of specific drugs may be obtained online using the Drugdex database in Micromedex and the National Library of Medicine.
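Combining the category indicators into a single analytic flag might look like the sketch below. The variable names are hypothetical placeholders and are not the names used in the public-use data set; the four categories combined are the ones used to define antihypertensive users in the blood pressure example.

```python
# Hypothetical indicator names; the public-use data set defines its own.
ANTIHYPERTENSIVE_VARS = [
    "ace_inhibitors",
    "beta_blockers",
    "diuretics",
    "calcium_channel_blockers",
]


def uses_antihypertensive(row):
    """Return 1 if any hypertension-specific category indicator equals 1."""
    return int(any(row.get(var, 0) == 1 for var in ANTIHYPERTENSIVE_VARS))


# Toy respondents. Respondent 2 takes a cardiovascular agent that is not
# specific to hypertension, so the top-level indicator alone would mislead.
respondents = [
    {"id": 1, "ace_inhibitors": 1, "cardiovascular_agents": 1},
    {"id": 2, "antianginal_agents": 1, "cardiovascular_agents": 1},
    {"id": 3},
]

flags = [uses_antihypertensive(r) for r in respondents]

# Stratify the sample before fitting separate models, as in the example.
users = [r["id"] for r, f in zip(respondents, flags) if f == 1]
nonusers = [r["id"] for r, f in zip(respondents, flags) if f == 0]
```

Relying on the broad “cardiovascular agents” indicator would have classified respondent 2 as an antihypertensive user, which is the misclassification the paragraph above warns against.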


Funding

The National Social Life, Health and Aging Project is supported by the National Institute on Aging, Office of Women’s Health Research, Office of AIDS Research, and the Office of Behavioral and Social Science Research (5R01AG021487). The National Institutes of Health, National Institute on Aging, University of Chicago–NORC Center on Demography and Economics of Aging Core on Biomarkers in Population-Based Health and Aging Research (5 P30 AG 012857) and the University of Chicago Program in Pharmaceutical Policy (PI: David Meltzer) also supported the efforts of S.T.L. and D.M.Q. on the manuscript. The University of Chicago Program in Pharmaceutical Policy is supported by the Merck Foundation.


Acknowledgments

We gratefully acknowledge the research assistance of Alison Feder and Jessica Schwartz.


References

  • Binder D. On the variances of asymptotically normal estimators from complex surveys. International Statistical Review. 1983;51:279–292.
  • Centers for Disease Control and Prevention. National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data (2003–2004). Hyattsville, MD: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention; 2008a. Retrieved September 4, 2007, from
  • Centers for Disease Control and Prevention. National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Questionnaire. Hyattsville, MD: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention; 2008b. Retrieved September 4, 2007, from
  • Daniel GW, Malone DC. Characteristics of older adults who meet the annual prescription drug expenditure threshold for Medicare medication therapy management programs. Journal of Managed Care Pharmacy. 2007;13:142–154. [PubMed]
  • Gaskin DJ, Briesacher BA, Limcango R, Bringantti B. Exploring racial and ethnic disparities in prescription drug spending and use among Medicare beneficiaries. American Journal of Geriatric Pharmacotherapy. 2006;4:96–111. [PubMed]
  • Kaufman DW, Kelly JP, Rosenberg L, Anderson TE, Mitchell AA. Recent patterns of medication use in the ambulatory adult population of the United States: The Slone survey. Journal of the American Medical Association. 2002;287:337–344. [PubMed]
  • Landry J, Smyer M, Tubman J, Lago D, Roberts J, Simonson W. Validation of two methods of data collection of self-reported medicine use among the elderly. The Gerontologist. 1988;28:672–676. [PubMed]
  • Moeller JF, Stagnitti MN, Horan E, Ward P, Kieffer N, Hock E. Methodology report #12: Outpatient prescription drugs: Data collection and editing in the 1996 MEPS (HC-010A). Rockville, MD: Agency for Healthcare Research and Quality; 2001.
  • National Center for Health Statistics. Health, United States, 2004, with chartbook on trends in the health of Americans. Hyattsville, MD: Centers for Disease Control and Prevention, National Center for Health Statistics; 2004.
  • Pahor M, Chrischilles E, Guralnik J, Brown S, Wallace R, Carbonin P. Drug data coding and analysis in epidemiologic studies. European Journal of Epidemiology. 1994;10:405–411. [PubMed]
  • Psaty B, Lee M, Savage P, Rutan G, German P, Lyles M. Assessing the use of medications in the elderly: Methods and initial experience in the cardiovascular health study. Journal of Clinical Epidemiology. 1992;45:683–692. [PubMed]
  • Qato D, Alexander G, Conti R, Johnson M, Schumm P, Lindau S. Use of prescription and over-the-counter medications and dietary supplements among older adults in the United States. Journal of the American Medical Association. 2008;300:2867–2878. [PMC free article] [PubMed]
  • Slone Survey. Patterns of medication use in the United States: A report from the Slone Survey. 2005. Retrieved March 2009, from
  • Stagnitti M. The top five therapeutic classes of outpatient prescription drugs ranked by total expense for the Medicare population age 65 and older in the U.S. civilian noninstitutionalized population. Rockville, MD: Agency for Healthcare Research and Quality; 2004.
  • Weisberg S. Applied linear regression. 2nd ed. New York: Wiley; 1985.

Articles from The Journals of Gerontology Series B: Psychological Sciences and Social Sciences are provided here courtesy of Oxford University Press