Although many epidemiologists use the National Death Index (NDI) as the “gold standard” for ascertainment of US mortality, high search costs per year and per subject for large cohorts warrant consideration of less costly alternatives. In this study, for 1995–2001 deaths, the authors compared matches of a random sample of 11,968 National Institutes of Health (NIH)-AARP Diet and Health Study subjects to the Social Security Administration's Death Master File (DMF) and commercial list updates (CLU) with matches of those subjects to the NDI. They examined how varying the lower limits of estimated DMF match probabilities (m scores of 0.60, 0.20, and 0.05) altered the benefits and costs of mortality ascertainment. Observed DMF/CLU ascertainment of NDI-identified decedents increased from 89.8% to 95.1% as m decreased from 0.60 (stringent) to 0.20 (less stringent) and increased further to 96.4% as m decreased to 0.05 (least stringent). At these same cutpoints, the false-match probability increased from 0.4% of the sample to 0.6% and then 2.3%. Limiting NDI cause-of-death searches to subjects found in DMF searches using less stringent match criteria, further supplemented by CLU vital status updates, improves vital status assessment while increasing substantially the cost-effectiveness of ascertaining mortality in large prospective cohort studies.
In 2005, Buchanich et al. (1) reported a potentially serious problem of mortality underascertainment in studies employing a widely used research resource, the Social Security Administration's (SSA) Death Master File (DMF). Mortality ascertainment falls short by 1.7%–4.9% in DMF “exact matches” (those sharing unique identifying information with a subject record). Furthermore, according to Buchanich et al. (1), an SSA insider estimated that 12%–13% of DMF mortality records are omitted from data provided by half of the US states. In addition, Sesso et al. (2) estimated that the DMF contains 94.7% of the deaths of males ascertained through the National Death Index (NDI) but only 31.1% of the deaths of females.
Thus, Buchanich et al. (1) and Sesso et al. (2) have cast doubt on a strategy of using less expensive DMF searches both to identify deceased subjects and to select subjects for NDI Plus cause-of-death searches. (An NDI Plus search, for an additional fee, returns matches with causes of death, in addition to the date of death returned by a standard NDI search.) In the prospective National Institutes of Health (NIH)-AARP Diet and Health Study (3), relying on the NDI exclusively to ascertain approximately 2% annual mortality in a cohort of more than half a million subjects would be cost-prohibitive. Assuming a constant 2% mortality rate per year, NDI Plus search costs compound over years 1–5 of follow-up as follows (the sum includes the baseline search at year 0): $\sum_{\text{year}=0}^{5} \left[(1 - 0.02)^{\text{year}} \times 500{,}000\right] \times \$0.21 = \$599{,}328$.
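The compounding arithmetic can be checked with a short Python sketch. It assumes that the sum runs over year 0 (baseline) through year 5, which is the range that reproduces the reported $599,328; the function and parameter names are illustrative and not part of the study.

```python
def cumulative_ndi_plus_cost(cohort_size, annual_mortality, fee_per_subject, last_year):
    """Sum the yearly cost of submitting every surviving subject to NDI Plus."""
    total = 0.0
    for year in range(last_year + 1):  # year 0 (baseline) through last_year
        survivors = (1 - annual_mortality) ** year * cohort_size
        total += survivors * fee_per_subject
    return total


# Values quoted in the text: 500,000 subjects, 2% annual mortality, $0.21 per subject per year.
cost = cumulative_ndi_plus_cost(500_000, 0.02, 0.21, 5)
print(f"${cost:,.0f}")
```

Changing `annual_mortality` or `last_year` shows how quickly a cohort-wide search cost accumulates under other assumptions.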
A key question, therefore, is whether the additional cost of cohort-wide matching to NDI Plus at $0.21 per subject per year—as compared with first restricting submissions to SSA DMF/commercial list update (CLU) matches and paying the National Center for Health Statistics $5 to search the NDI for the cause of death of each supposedly known decedent—would be justified by a potential gain in mortality ascertainment. A related question is how stringency of matching criteria affects the DMF-NDI comparison. With regard to this second question, Hill and Rosenwaike (4) introduced the method of linkage as an important factor in mortality ascertainment through automated searches of death registries. They argued that the use of stringent criteria for matching on names probably accounts for underascertainment of deaths among females, whose names do not remain reliable identifiers between the college years and death. Such unrealistically stringent matching criteria sacrifice linkage sensitivity for specificity and increase the risk of sex-related bias. In general, the estimated sensitivity of linkage falls in the 75%–95% range, with poorer results being obtained for foreign-born, very young, or pre-1975 decedents (5–8).
In the current study, we compared DMF/CLU matches with NDI Plus results for deaths occurring through 2001 to examine how varying the lower limit of estimated DMF match probability (m scores of 0.60, 0.20, and 0.05) alters the benefits and costs of a subsequent NDI Plus search if the search is limited to likely (though not necessarily exact) DMF matches supplemented by CLU matches. We also examined the benefits and costs of alternative NDI Plus search strategies.
The NIH-AARP Diet and Health Study cohort (3) and mortality ascertainment methods (9) used in this analysis have been described previously. An overview of data sources and methods used in this substudy appears in Figure 1.
For a pilot study (10), we selected a random sample of 12,000 subjects from the NIH-AARP cohort. NDI restrictions on the release of mortality data for 32 subjects reduced the effective sample size to 11,968. In 1995, the NIH-AARP study subjects resided in 6 states (California, Florida, Louisiana, New Jersey, North Carolina, and Pennsylvania) and 2 metropolitan areas (Atlanta, Georgia, and Detroit, Michigan), although by 2003 many of them had dispersed nationwide. In May 2003, we submitted identifying information on all 11,968 subjects to the National Center for Health Statistics for an NDI Plus mortality search (including information on cause of death) for deaths occurring through 2001. For subjects with alternative identifiers, we submitted multiple sets of identifiers, as permitted under NDI guidelines, bringing the total number of submissions to 16,153 records. Because there is no Social Security number (SSN) available for approximately 15% of the NIH-AARP study cohort and there is only a partial SSN available for another 10% or so, we designed the pilot study to test for failures in passive ascertainment of health outcomes through matching of the study cohort to the NDI and other secondary data sources.
The National Center for Health Statistics maintains the NDI, a central, computerized index of death-record information on file in US state vital statistics offices. The NDI compares subject-identifying items such as SSN, name(s), date of birth, sex, race, marital status, state of birth, state of residence, and father's surname with NDI records, and for potential matches it returns information on date and state of death, death certificate number, and indicators of the quality of the match: NDI score, class of the match (classes 1–5), and an exact match flag. The NDI computes match scores as sums of weights: $W_i = \log_2(1/p_i)$, where $p_i$ approximates the chance of linked cohort/registry identifying items agreeing in a false match. Weights for agreeing items have positive signs; otherwise signs are negative. Responsibility for reviewing possible matches and separating correct matches from false matches defaults to users (11).
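As a rough illustration of this weighting scheme, the following Python sketch sums signed weights over a few identifying items. The chance-agreement probabilities are hypothetical values chosen for illustration, and the sketch assumes that a disagreeing item subtracts the same magnitude that an agreeing item would add; it is not the NDI's actual scoring code.

```python
import math


def item_weight(p_false_agreement):
    """W_i = log2(1 / p_i), where p_i is the chance that the item agrees in a false match."""
    return math.log2(1.0 / p_false_agreement)


def match_score(agreements, p_values):
    """Add the weight of each agreeing item and subtract the weight of each disagreeing item."""
    score = 0.0
    for item, agrees in agreements.items():
        weight = item_weight(p_values[item])
        score += weight if agrees else -weight
    return score


# Hypothetical chance-agreement probabilities (assumptions, not NDI values).
p_values = {"sex": 0.5, "birth_month": 1 / 12, "birth_day": 1 / 32}
agreements = {"sex": True, "birth_month": True, "birth_day": False}
score = match_score(agreements, p_values)
print(round(score, 2))
```

Items that rarely agree by chance (small p) carry large weights, so agreement on them raises the score sharply and disagreement lowers it just as sharply.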
Almost all US citizens in the workforce and many noncitizen residents participate in the US Social Security program. DMF records contain SSN; first, middle, and last names; date of birth; date of death; and zip codes (postal codes) for place of issuance of the SSN card, place to which the last SSA payment was sent (often blank), and place to which the death benefit was sent for each SSA record of a death. The SSA updates the DMF using death benefit applications and other sources of vital status data. Westat (Rockville, Maryland), a research firm specializing in study design and analysis, maintains a copy of the DMF (currently containing more than 80 million records) on a Linux SAS server, and it receives and applies quarterly updates, including additions, corrections, and deletions, supplied by a database vendor (CSRA, Inc., Charlottesville, Virginia).
To update subjects' contact information, we use one of many CLU services—MaxCOA by Anchor Computing (Farmingdale, New York)—which, in addition to address updates, provides special vital status codes for individuals specifically and more generally for a household member at a given address. Definitions of these codes appear in the Appendix Table. CLU codes specific to individuals rather than households (A1, A2, B1, C1, C2, C3, E1) identify subjects as possible mortality cases. To improve NDI ascertainment rates, we submit a subject that matches any 1 of the individual codes for NDI cause-of-death ascertainment. Nonetheless, we require supporting evidence of death for mortality ascertainment.
In practice, in the NIH-AARP Study, we augment DMF and CLU matches with mortality reporting in returned mail from surveys and in correspondence with subjects' caretakers. In the current analysis, we considered only DMF and CLU results as data sources for screening of NDI Plus submissions.
No common and reliable SSN or other “key” exists in each record of the cohort, NDI, and DMF data sources. Linkage of these data requires matching on less reliable “natural keys” such as names and dates of birth.
The National Center for Health Statistics' NDI matching service takes care of the problem of linking large files on many different key values by assigning to each cohort-NDI pair of records a score that represents the estimated probability of a match. This service reduces analysis of linkage results to classifying potential matches by NDI score (plus other indicators of match quality) and selecting appropriate classes of matches. Using prior experience with NDI scores and classes as a guide, we separated NDI potential matches into 3 major classes by score (or by maximum score in cases of multiple sets of identifiers): ≤0, 1–31, and >31. We used automated and manual reviews of nonmatches and matches to resolve instances of matching of a subject to more than 1 death certificate or matching of more than 1 subject to a death certificate, and we checked cohort-NDI pairings with borderline scores.
Searching more than 80 million DMF records for matches with a cohort of more than 500,000 persons requires special strategies and techniques. Specifically, Westat's proprietary program, WesMatch, creates multiple indexes from different combinations of imperfect and incomplete key “patterns” in the smaller data set, and then looks up key values in the indexes as it scans the larger data set. The patterns resemble shorthand text messages or vanity license plates (“ru@bf” or “UT#105”). Table 1 describes data values used to build these keys. Requiring conjunctive equivalence of date of birth, first letter of the first name, and Soundex code of the last name makes the indexes less prone to large proportions of false matches. Linking on identifiers with fewer distinct values, such as first initial of the first name or Soundex code of the last name, reduces the risk of failing to find matches due to incidental discrepancies in identifiers (say, Larson vs. Larsen). Century-old Soundex methods encode pronunciations of names into standard strings of characters and digits; for example, Larson and Larsen both encode to L625, and Beaudreau and Boodrow both encode to B36. The Appendix contains additional detail on the linkage process.
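The index-and-scan idea can be sketched in Python as follows. This is not WesMatch, which is proprietary; it is a minimal illustration that assumes a simplified, unpadded Soundex (chosen to reproduce the codes quoted above) and a single key built from date of birth, first initial, and Soundex of the last name. The record layout and the names `sfb_key` and `find_candidates` are hypothetical.

```python
SOUNDEX_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        SOUNDEX_CODES[letter] = digit


def soundex(name):
    """Simplified Soundex: keep the first letter, code later consonants,
    skip uncoded letters, and collapse adjacent letters with the same code.
    The result is neither padded nor truncated."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    previous = SOUNDEX_CODES.get(letters[0], "")
    for letter in letters[1:]:
        code = SOUNDEX_CODES.get(letter, "")
        if code and code != previous:
            encoded += code
        previous = code
    return encoded


def sfb_key(record):
    """Conjunctive key: date of birth AND first initial AND Soundex of last name."""
    return (record["dob"], record["first"][0].upper(), soundex(record["last"]))


def find_candidates(small, large):
    """Index the smaller data set, then look up each record of the larger one."""
    index = {}
    for rec in small:
        index.setdefault(sfb_key(rec), []).append(rec)
    hits = []
    for rec in large:
        for cohort_rec in index.get(sfb_key(rec), []):
            hits.append((cohort_rec["id"], rec["id"]))
    return hits


cohort = [{"id": "c1", "first": "Ruth", "last": "Larson", "dob": "1930-05-17"}]
registry = [
    {"id": "d1", "first": "R.", "last": "Larsen", "dob": "1930-05-17"},
    {"id": "d2", "first": "Rita", "last": "Larson", "dob": "1931-05-17"},
]
hits = find_candidates(cohort, registry)
print(soundex("Larson"), soundex("Larsen"), soundex("Beaudreau"), soundex("Boodrow"))
print(hits)
```

Building the index from the smaller file keeps memory use proportional to the cohort, while the larger file is read only once.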
We used a generalized linear model, fitted with the SAS procedure PROC GENMOD (SAS Institute Inc., Cary, North Carolina), to assign standardized match probability estimates (m scores) to pairings of records, making it easier to rank and separate matches and nonmatches. Predictors of m consist of results from similarity comparisons of identifying fields. A logit link function, g(μ)=log[μ/(1 – μ)], and a binomial (proportion) distribution with an associated variance function, V(μ)=μ(1 – μ), model the steep transition from an m close to 0 for very dissimilar cohort-DMF record pairs to an m close to 1 for pairs with strong and multiple similarities. Because of high levels of collinearity among similarity scores for identifiers in matching cohort and DMF records, we limited predictors to similarity scores for SSN, combined first and last names, date of birth, and zip code. Because females in the cohort would be more likely to change their last names and less likely to have an SSN, we also included the subject's sex (−1=female, 0=missing, 1=male) as a predictor. The m score cutpoints of 0.60 (stringent), 0.20 (less stringent), and 0.05 (least stringent) represent different tolerances for ambiguity in matching. A cutpoint of m=0.05, for instance, suggests that 1 subject may have 20 different cohort-DMF pairings with m scores of 0.05, but the number of such pairs decreases to 5 at m=0.20 and less than 2 on average at m=0.60. This variant of “probabilistic record linkage” builds on the tradition of Fellegi and Sunter's seminal work (12) and subsequent extensions by Jaro (13). Winkler (14) recently evaluated the state of the technology.
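To make the logit link concrete, the following Python sketch converts a linear combination of similarity scores into an m score with the inverse logit. The coefficients and intercept are hypothetical placeholders, not the fitted PROC GENMOD estimates, and the predictor names are illustrative. The final loop reproduces the 1/m reading of the cutpoints given above.

```python
import math


def m_score(similarities, coefficients, intercept):
    """Inverse logit of the linear predictor: m = 1 / (1 + exp(-eta))."""
    eta = intercept + sum(coefficients[name] * value
                          for name, value in similarities.items())
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients (assumptions for illustration only).
coefficients = {"ssn": 6.0, "name": 4.0, "dob": 3.0, "zip": 1.0, "sex": 0.2}
intercept = -8.0

# Sex is coded as in the text: -1 = female, 0 = missing, 1 = male.
similar_pair = {"ssn": 1.0, "name": 1.0, "dob": 1.0, "zip": 1.0, "sex": 1}
dissimilar_pair = {"ssn": 0.0, "name": 0.0, "dob": 0.0, "zip": 0.0, "sex": -1}

print(round(m_score(similar_pair, coefficients, intercept), 3))
print(round(m_score(dissimilar_pair, coefficients, intercept), 5))

# Approximate number of equally scored pairings per subject implied by each cutpoint.
for cutpoint in (0.60, 0.20, 0.05):
    print(cutpoint, round(1 / cutpoint, 1))
```

The steep middle section of the logistic curve is what separates pairs with several strong similarities from pairs with few or none.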
Overall, we found no evidence of matches with NDI scores less than 1. Table 2 breaks down the 5,309 potential subject matches with the NDI into classes by NDI match score, NDI class, and the criteria “exact match” and “has SSN.” The review process identified 694 (13.1%) of the 5,309 potential matches as correct matches.
Table 3 shows the impact of match probability cutpoints with progressively lower limits (m scores of 0.60, 0.20, and 0.05) on DMF ascertainment and, through the proposed role of DMF/CLU matching in selecting subjects for NDI searches, on NDI ascertainment. The prevalence of NDI ascertainment failure decreased from 10.2% (71/694) to 4.9% (34/694) as m decreased from a stringent criterion of 0.60 to a less stringent criterion of 0.20, and to 3.6% (25/694) as m decreased to the least stringent criterion of 0.05. Submissions of DMF/CLU matches not found in the NDI increased from 47 (0.4% of the pilot sample) to 80 (0.6%) as m decreased from stringent (0.60) to less stringent (0.20).
As Table 4 shows, for the stringent and less stringent cutpoints, sensitivity—as measured in this context by the ratio of NDI matches also matching either the DMF or CLU to matches to the NDI—increased from 89.8% (623/694) to 95.1% (660/694), while specificity—the ratio of cohort records not matching the DMF, CLU, or NDI to cohort records not matching the NDI—decreased from 99.6% (11,227/11,274) to 99.3% (11,194/11,274). In an analysis of the ratio of either DMF or CLU and NDI matches to any combination of DMF and CLU matches, the positive predictive value decreased from 93.0% (623/670) to 89.2% (660/740) as stringency of matching decreased from 0.60 to 0.20, increasing the burden of excess NDI submissions from 47 to 80 (70%). A further decrease in the cutpoint score to 0.05 increased sensitivity slightly, but it also increased the burden of extra submissions to the NDI to 294 out of 694 (42.4% of NDI matches). Overall, DMF ascertainment responded favorably to less stringent matching criteria. Figure 2 depicts the observed trade-off between the benefit of increased sensitivity of linkage (less NDI ascertainment loss) and greater proportions of false matches (specificity and positive predictive value of linkage) relative to correct matches as match probability m scores decrease from 0.70 to 0.01.
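The sensitivity, specificity, and positive predictive value quoted above follow directly from the four cell counts at each cutpoint. A small Python sketch, using only counts stated in the text and an illustrative function name, reproduces them:

```python
def linkage_metrics(true_pos, false_neg, false_pos, true_neg):
    """Return (sensitivity, specificity, positive predictive value)."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    ppv = true_pos / (true_pos + false_pos)
    return sensitivity, specificity, ppv


# Counts from the text: NDI matches found / missed by DMF or CLU,
# DMF/CLU matches not in the NDI, and records matching none of the sources.
stringent = linkage_metrics(623, 71, 47, 11_227)       # m >= 0.60
less_stringent = linkage_metrics(660, 34, 80, 11_194)  # m >= 0.20

for label, metrics in (("0.60", stringent), ("0.20", less_stringent)):
    print(label, [round(100 * value, 1) for value in metrics])
```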
NDI Plus search cost savings depend critically on the annual expected mortality rate (AEMR) and the annualized false discovery rate (AFDR) due to false matches at a cutpoint (for example, a match probability lower limit of 0.20). The AFDR, essentially the false-match proportion of surviving subjects divided by the number of years being searched, measures the burden of extra NDI submissions of false matches. For a given AEMR and AFDR, the ratio of the NDI costs of a search limited to (AEMR + AFDR) × y years × N subjects × $5 per supposed decedent to the NDI costs of searching a full cohort with unknown vital status at $0.21 per year (y) × the number of subjects (N) equals
$$\frac{(\mathrm{AEMR} + \mathrm{AFDR}) \times y \times N \times \$5}{\$0.21 \times y \times N}$$
Canceling out y × N and calculating the ratio of NDI prices ($5/$0.21 ≈ 24) simplifies the expression to ~24 × (AEMR + AFDR).
A sum of AEMR and AFDR around 4% would then put the ratio of costs at an indifference level of 1 (that is, a full NDI search would cost about the same as a search of decedents identified through DMF/CLU matching). The observed NIH-AARP cohort AEMR of less than 1.5% during the 7-year 1995–2001 interval and an AFDR of (80)/(11,194+80)/7 (for cutpoint=0.20 in Table 3) amounts to less than half of the break-even point of 4%, indicating a potential search cost savings of more than 50% for NDI Plus searches limited to DMF/CLU matches (relative to NDI searches of the full cohort). The benefit from an NDI search of the full cohort would be an approximately 5% gain in the ascertainment rate. As Figure 2 shows, the benefit from decreasing the m score cutpoint from 0.60 to 0.20 would be to increase ascertainment from 90% to 95%, but the benefit of decreasing the cutpoint further to 0.05 would be only a 1% increase. For a decrease of the m cutpoint from 0.60 to 0.20, the cost of NDI submissions would increase 10% ([740 − 670]/670=10%), as shown in Table 4, while costs for a further decrease of the m cutpoint from 0.20 to 0.05 would increase 30% ([963 − 740]/740=30%). Strictly in terms of NDI Plus search costs, the cost of searching the sample of 11,968 persons over a period of 7 years at $0.21 per subject would be $17,593 as compared with the cost of searching for 740 (441+201+18+80=740) DMF/CLU matches (as shown under the less stringent criterion of 0.20 in Table 3) at $5 each, or a total of $3,700. Overall, cause-of-death ascertainment for 34 of 694 mortality cases (4.9%) costs an additional $13,893 ($17,593 − $3,700=$13,893; $409 per case). For the sample, the AEMR of 0.8% ([694/7]/11,968=0.8%) and the AFDR of 0.04% ([34/7]/[11,194+80]=0.04%) amount to approximately 20% of the 4% indifference level sum of AEMR and AFDR, and thus the cost of an NDI Plus search for DMF/CLU matches amounts to approximately 20% of the cost of an NDI Plus search of the entire sample.
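The cost figures in this paragraph can be reproduced with a few lines of Python. The sketch below uses only the fees and counts stated in the text; the function names are illustrative.

```python
FULL_SEARCH_FEE = 0.21  # dollars per subject per year searched
DECEDENT_FEE = 5.00     # dollars per supposed decedent submitted


def cost_ratio(aemr, afdr):
    """Cost of a DMF/CLU-screened search relative to a full-cohort search.
    The number of years and the cohort size cancel out of the ratio."""
    return (aemr + afdr) * DECEDENT_FEE / FULL_SEARCH_FEE


def full_search_cost(subjects, years):
    return subjects * years * FULL_SEARCH_FEE


def screened_search_cost(submissions):
    return submissions * DECEDENT_FEE


full = full_search_cost(11_968, 7)    # whole pilot sample, 1995-2001
screened = screened_search_cost(740)  # DMF/CLU matches at m >= 0.20
extra = full - screened

print(f"break-even AEMR + AFDR: {FULL_SEARCH_FEE / DECEDENT_FEE:.1%}")
print(f"full: ${full:,.0f}  screened: ${screened:,.0f}  difference: ${extra:,.0f}")
print(f"cost per additional case: ${extra / 34:,.0f}")
print(f"screened/full: {screened / full:.0%}")
```

The break-even sum of 4.2% is the "around 4%" indifference level cited above, and the screened-to-full ratio of about 21% matches the approximately 20% figure.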
DMF/CLU searches identify a very large proportion of subjects who would be found by a later NDI search and may reduce the time and cost of an NDI search of the full cohort. As an alternative to methods suggested in this study, Buchanich et al. (1) recommend using an SSA vital status search to identify surviving subjects and eliminate them from NDI searches (see http://www.ssa.gov/policy/about/epidemiology.html). For this, the SSA employs a State Verification Enumeration System matching algorithm that requires stringent (though not perfect) matching of SSNs, names, and dates of birth. For studies that did not have a sufficient SSN for a substantial number of subjects, this strategy would prove less effective, because any subject without an SSN would have an “unknown” status in an SSA vital status search and could not be eliminated from submissions for an NDI search.
Regarding the concern that underascertainment of approximately 12% or more in DMF searches results from the unwillingness of half of the US states to allow the SSA to record some reported deaths in the DMF, we found no evidence through 2001 of larger-than-expected numbers of NDI matches missing from any 1 state. Nonetheless, the small number of pilot study NDI matches not found in the DMF did not afford us sufficient power to assess the statistical significance of our observations. A larger study of DMF prescreening after 2001, now ongoing, will soon provide more definitive results. In the meantime, SSA staff report that historically only 5% of deaths reported to the SSA have come from a single state's vital status data (15).
For the NIH-AARP Diet and Health Study, we observed that a strategy of DMF prescreening of NDI submissions, combined with CLU or other alternative sources of mortality data, yields 95% accurate and cost-effective results, at least for NDI ascertainment of deaths occurring through 2001. Combined DMF/CLU and NDI mortality ascertainment may exceed 95%, since expected NDI mortality ascertainment approaches 95%, according to Horm (16), and high-quality DMF/CLU matches may add mortality cases not present in the NDI.
All searches of databases for epidemiologic outcomes entail the potential for misclassification. For studies such as NIH-AARP that have large sample sizes and low annual expected death rates (well below 4%), DMF/CLU prescreening of NDI submissions reduces costs substantially and results in minimal loss of mortality ascertainment.
Author affiliations: Westat, Rockville, Maryland (Sigurd W. Hermansen); Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland (Michael F. Leitzmann, Arthur Schatzkin); and Institute of Epidemiology and Preventive Medicine, University of Regensburg, Regensburg, Germany (Michael F. Leitzmann).
This work was supported by the Intramural Research Program of the National Institutes of Health (Division of Cancer Epidemiology and Genetics, National Cancer Institute).
The authors thank Dr. Moussa Sarr, Dr. Stanley E. Legum, Ann Truelove, Himanshi Singh, and Kerry Grace Morrissey of Westat for advice and helpful comments on earlier versions of this paper.
Conflict of interest: none declared.
A subject record might have, among others, these identifying field values:
The index SFB (Table 1) comprises the SAS numeric value for the date of birth and the ASCII codes for the first letter of the first name, “R,” and “S53,” the yield of the SAS SOUNDEX() function with the last name as its argument. (For details on the Soundex phonetic coding algorithm, see http://www.nist.gov/dads/HTML/soundex.html.) If the program finds a key match (“hit”) on SFB or on any 1 of the other indexes for a record in the larger data set B, it writes the name of the index, its value, and data from that record to data set resultsB.
The full process for indexing and searching A (cohort records) and B (Death Master File) follows these steps:
where the shorthand notation | =<identifier> means “given equivalent values of <identifier>” and extends to logical expressions.
Adding elements of identifiers to make an index more complex decreases the chances that it will hit on a false match. The elements of indexes combine conjunctively, much as a series of conditions connected by “AND's.” In contrast, adding an index to indexes a1 … a7 and b1 … b7 decreases the chances that index look-ups will fail to find a match. The indexes combine disjunctively, much as a series of conditions connected by “OR's.”
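The AND/OR behavior described here can be illustrated with a short Python sketch. The field names, example values, and helper functions are hypothetical, and the sketch compares two records directly rather than scanning indexed files.

```python
def make_index(*fields):
    """An index key is the conjunction (AND) of several identifier elements."""
    def key(record):
        return tuple(record[field] for field in fields)
    return key


def any_index_hits(rec_a, rec_b, indexes):
    """Indexes combine disjunctively (OR): one agreeing key is enough for a hit."""
    return any(key(rec_a) == key(rec_b) for key in indexes)


by_birth_and_name = make_index("dob", "first_initial", "last_soundex")
by_ssn_fragment = make_index("ssn_last4", "first_initial")

cohort_rec = {"dob": "1930-05-17", "first_initial": "R",
              "last_soundex": "S53", "ssn_last4": "1234"}
# Same person with a date-of-birth discrepancy in the death record.
death_rec = {"dob": "1930-05-18", "first_initial": "R",
             "last_soundex": "S53", "ssn_last4": "1234"}

one_index = any_index_hits(cohort_rec, death_rec, [by_birth_and_name])
two_indexes = any_index_hits(cohort_rec, death_rec,
                             [by_birth_and_name, by_ssn_fragment])
print(one_index, two_indexes)
```

With a single index the date discrepancy causes a miss; adding the second index recovers the pair, at the price of admitting more candidate pairs for later scoring.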
Many alternatives to the proprietary WesMatch program exist, including the free program Link Plus (http://www.cdc.gov/cancer/npcr/tools/registryplus/lp.htm) for more typical-sized cohorts and BigMatch (17) for very large cohorts. Note that in the record-linkage literature, “blocking” plays much the same role as “indexing” in WesMatch.
Appendix Table. Definitions of commercial list update (CLU) vital status codes

|Code|Definition|
|---|---|
|A1|Exact match by Social Security number, first name, last name, and zip code|
|A2|Weighted name (close to exact name) match by Social Security number and zip code|
|B1|Exact match by first name, last name, and address|
|C1|Address exact match/name near-exact match|
|C2|Name exact match/address near-exact match|
|C3|Near-exact match on both name and address|
|D1|Exact last name/exact address (household-level match)|
|D2|Exact last name/address near-exact match (household-level match)|
|D3|Exact address/weighted last name match (household-level match)|
|D4|Weighted last name/near-exact address match (household-level match)|
|E1|Name and zip code exact match (no address used); high-probability potential match|
|E2|Name and zip code exact match (no address used); medium-probability potential match|
|E3|Name and zip code exact match (no address used); low-probability potential match|
|I|Insufficient data to match (key address elements [street name, city, state, or zip code] missing)|
|N|No match (excludes “I” records)|