This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Population-based allele frequencies and genotype prevalence are important for measuring the contribution of genetic variation to human disease susceptibility, progression, and outcomes. Population-based prevalence estimates also provide the basis for epidemiologic studies of gene–disease associations, for estimating population attributable risk, and for informing health policy and clinical and public health practice. However, such prevalence estimates for genotypes important to public health remain undetermined for the major racial and ethnic groups in the US population. DNA was collected from 7,159 participants aged 12 years or older in Phase 2 (1991–1994) of the Third National Health and Nutrition Examination Survey (NHANES III). Certain age and minority groups were oversampled in this weighted, population-based US survey. Estimates of allele frequency and genotype prevalence for 90 variants in 50 genes chosen for their potential public health significance were calculated by age, sex, and race/ethnicity among non-Hispanic whites, non-Hispanic blacks, and Mexican Americans. These nationally representative data on allele frequency and genotype prevalence provide a valuable resource for future epidemiologic studies in public health in the United States.
Completion of the human genome sequence (1–3) and recent advances in the analysis of genome-wide associations for several common diseases (4–20) are generating tremendous opportunities for epidemiologic studies to evaluate the role of genetic variants in the etiology of common human diseases. Identification of allelic variants has accelerated as a result of the cataloging and mapping of single nucleotide polymorphisms (SNPs) throughout the genome by the International HapMap Project (21–23) and characterization of the scope of structural variation, including copy number variants, in the genome (24–27). Application of these advances to improve public health requires assessing the frequency of these variants in distinct populations, identifying diseases influenced by these variants, determining the magnitude of the associated risks, and elucidating gene–gene and gene–environment interactions. Although the number of published investigations in these areas of human genome epidemiology has increased rapidly, with publication of more than 6,000 reports yearly (28), methodological issues have made it difficult to integrate the evidence and, thus, to easily translate the findings into public health improvements (29–31).
Early studies of genotype prevalence used samples that were convenient to obtain, and minimal information was provided on the selection of participants (31). In addition, most estimates were calculated from data on small study populations, which limited the accuracy of estimates of allele frequency and genotype prevalence. Furthermore, frequencies for most genetic polymorphisms have been measured only in select US racial and ethnic groups and have not been presented by age group or by sex. Although select polymorphism frequencies have been reported in large populations (32, 33), these studies were community based or drew their participants from the controls of larger case-control studies. In contrast, data on genetic variants can be obtained from large, well-designed, epidemiologically well-characterized, and population-based US surveys such as the Third National Health and Nutrition Examination Survey (NHANES III) (34, 35). These data are a unique resource for epidemiologic research to assess genetic variation in the population, gene–disease associations, gene–gene and gene–environment interactions, and population-attributable risk for genetic variants.
In particular, NHANES III offers the opportunity to assess genetic variation among major racial and ethnic groups in the United States, for whom multiple health disparities exist (36–40). Health disparities result from the complex interactions of social, environmental, behavioral, and genetic influences in a diverse population (36, 41, 42). Public health strategies to address health disparities are more likely to be effective when they are based on sound integration of such risk information at the population level. NHANES III is a paradigm for complex analysis of unbiased, population-based data on social, environmental, behavioral, and biologic characteristics—including genetic variation—in relation to health status.
NHANES III is a complex, multistage sample survey conducted by the National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention (CDC) (35, 43), during 1988–1994. This cross-sectional study was designed to provide national statistics on the health and nutritional status of the civilian, noninstitutionalized population in the United States aged 2 months or older. Certain populations, including young children, older adults, non-Hispanic blacks, and Mexican Americans, were oversampled (35). As with standard NHANES analyses, race/ethnic groups were defined on the basis of the combination of the reported race (black, white, other) and reported ethnicity (not Hispanic, Mexican American, other Hispanic) of survey participants (35). Detailed household interviews were conducted to obtain information on sociodemographic variables, medical history, health-related behaviors, and use of medications. As part of the survey, physical examinations and laboratory and radiologic measurements were performed in special mobile examination centers (35).
During Phase 2 of NHANES III (1991–1994), 10,052 participants aged 12 years or older were examined in the mobile examination centers. As part of the examination consent, participants agreed that their blood could be kept for long-term storage and future research, although genetic research was not mentioned specifically. In August 2001, the CDC/NCHS Ethics Review Board approved a revised plan for use of these specimens according to guidelines in the August 1999 National Bioethics Advisory Commission report on the use of stored biologic materials for research. This revised plan allows linkage of the genetic laboratory results to NHANES data through the NCHS Research Data Center to ensure that confidentiality of the study participants’ identities is maintained (44). Attempts were made to establish Epstein-Barr virus-transformed cell lines (35, 44) from white blood cells obtained from 8,200 of the Phase 2 participants. However, the final NHANES III DNA bank contains specimens from 7,159 participants; the remainder were lost because an immortalized cell line could not be transformed and grown successfully (n=1,004), because of concerns regarding laboratory practice and quality assurance (n=21), or because the individuals were not genotyped (n=16). The bank is maintained jointly by NCHS and the National Center for Environmental Health at CDC. Demographic characteristics of participants in the DNA bank are included in Table 1. Sixty-two percent of participants were from households with multiple family members (average, 1.59 members per household; range, 1–11). This prevalence study was approved by the NCHS Ethics Review Board.
Members from a multidisciplinary working group reviewed available phenotype data from NHANES III, performed systematic literature reviews, and identified candidate genes and physiologic pathways thought to be associated with diseases of public health significance at the time of project initiation. The selection of polymorphisms for this study was also based upon input from the SNP500Cancer resource (45), which had already developed genotyping assays for numerous SNPs in the selected genes based on their potential importance to physiologic processes, epidemiologic studies, and health outcomes.
The selected variants are in genes that encode proteins in 6 major cellular and physiologic pathways: 1) nutrient metabolism (e.g., homocysteine, lipids, glucose, and alcohol); 2) immune and inflammatory responses; 3) xenobiotic metabolism (e.g., of drugs, carcinogens, or environmental contaminants); 4) DNA repair; 5) hemostasis and the renin-angiotensin-aldosterone system; and 6) oxidative stress. The variants are in pathways affecting the development of multiple diseases, including cardiovascular disease, diabetes, cancer, and infectious diseases, as well as modulation of the effects of environmental and occupational exposures.
DNA analysis for the project was performed at two facilities because neither lab had methodology developed to analyze all of the genetic variants: 1) the Core Genotyping Facility, National Cancer Institute (NCI), National Institutes of Health, Bethesda, Maryland (http://cgf.nci.nih.gov), and 2) the Division of Laboratory Sciences, National Center for Environmental Health, CDC, Atlanta, Georgia. Each lab analyzed all DNA specimens for each subset of genotyping assays performed.
Most polymorphisms were assayed by either the TaqMan assay (5′ nuclease assay; Applied Biosystems, Foster City, California) or the MGB Eclipse assay (3′ hybridization-triggered fluorescence reaction; Nanogen (formerly Epoch Biosciences), Bothell, Washington). Two polymorphisms were genotyped by pyrosequencing, and one by capillary fragment analysis. Water controls and DNA samples with known genotypes, purchased from Coriell Cell Repositories (Camden, New Jersey), were included on each 384-well plate. Detailed genotyping methods, including primer and probe sequences, are described in Web Appendix 1 and Web Table 1, respectively. (This information is described in Web-only material that includes 8 Web appendixes, 1 Web table, and 1 Web figure; each is preceded by “Web” in the text. All are posted on the Journal’s website (http://aje.oxfordjournals.org/).)
The NHANES III genotyping data were monitored by a quality assurance and quality control committee composed of experts in laboratory science at CDC and NCI. The group monitored results of NHANES III quality control genotyping to ensure that the data met quality control guidelines established by NCHS.
Initial quality assurance assessments determined that at least 7,128 specimens, depending on the laboratory, were suitable for genotyping analysis on the basis of sample quality. All polymorphisms with genotyping call rates below 95% completion did not meet quality control criteria and were removed from further analyses. NHANES provides 480 quality control specimens for all studies that use the NHANES III DNA bank samples. These include blind replicates of approximately 6% of the 7,159 samples, to determine the accuracy and reproducibility of the assays. Assays that passed the blind-replicate analyses (>98% concordance according to NCHS guidelines) were tested for deviation from Hardy-Weinberg proportions calculated separately for each race/ethnic group in a standard unweighted analysis (46). The threshold for a genetic variant to pass Hardy-Weinberg analysis was P ≥ 0.01 (2 sided) for at least 2 of the 3 main race/ethnic groups (i.e., non-Hispanic white, non-Hispanic black, and Mexican American), with use of a chi-square goodness-of-fit test. The race/ethnicity category “other” was not used in determining the deviation from Hardy-Weinberg proportions because of the genetic heterogeneity of this group. Data from 192 samples were removed from certain assays because of a sample handling issue discovered in one of the laboratories. Genetic variants that met all quality control guidelines were used for further analyses. The range of successful genotype identifications for these variants was 97.5%–99.9% (median, 99.2%). Results from the tests of deviations from Hardy-Weinberg proportions for these variants are listed in Web Appendix 2.
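The Hardy-Weinberg screen described above can be sketched as follows. This is a minimal illustration, not the SAS/Genetics procedure used in the study: it assumes unweighted genotype counts per race/ethnic group, a 1-degree-of-freedom chi-square goodness-of-fit test, and the stated rule of P ≥ 0.01 in at least 2 of the 3 main groups. The function names and the example counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg proportions.

    Takes unweighted genotype counts and returns (statistic, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # observed frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        # Monomorphic variant: there is nothing to test.
        return 0.0, 1.0
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


def passes_hwe_rule(counts_by_group, alpha=0.01, min_groups=2):
    """Apply the 'P >= alpha in at least min_groups groups' rule."""
    passing = sum(
        1 for counts in counts_by_group.values()
        if hwe_chi_square(*counts)[1] >= alpha
    )
    return passing >= min_groups


# Hypothetical counts (AA, AB, BB) for one variant, for illustration only.
example_counts = {
    "non-Hispanic white": (640, 320, 40),
    "non-Hispanic black": (810, 180, 10),
    "Mexican American": (500, 0, 500),  # extreme heterozygote deficit
}
variant_passes = passes_hwe_rule(example_counts)
```

Keeping the per-group test separate from the pass rule makes it easy to vary the threshold or the number of groups required without touching the statistic itself.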
Overall, 90 variants in 50 genes were available for estimation of allele frequency and genotype prevalence. Nearly all (n=87) of the variants genotyped are SNPs, and 3 are insertion/deletions. Various diseases or conditions for which these genes have a confirmed or purported association are shown in Web Appendix 3. This list is not comprehensive, but it demonstrates that the genes studied are involved in major pathways that have a role in the etiology of several diseases or conditions with public health significance.
Because NHANES III is a multistage, complex sample survey, all statistical analyses must account for sample weights and the survey design to produce unbiased national estimates and appropriate standard errors. The variance in clustered data caused by households with multiple related study participants was accounted for by use of the appropriate sample weights and the survey design in SUDAAN software (SUDAAN Statistical Software Center, Research Triangle Park, North Carolina). Point estimates and variances were calculated by using sample weights recalculated (47) for the Genetic Component of NHANES III. These weights were derived from the appropriate NHANES III, Phase 2, mobile examination center (MEC) sample weights to adjust for participant refusal to consent to future research and for the inability to generate cell lines and obtain DNA, as mentioned above. NHANES genetic weights are specifically estimated for the genetic component of the 7,159 DNA bank participants, and none of the other weights provided by NHANES is appropriate. More detailed information about statistical weights in NHANES III is available online (http://www.cdc.gov/nchs/about/major/nhanes/nh3data_genetic.htm).
Analyses were conducted by using SAS-callable SUDAAN, version 9.01, and SAS, version 9.1 (SAS Institute, Inc., Cary, North Carolina). Deviations from Hardy-Weinberg proportions were tested with a chi-square goodness-of-fit approach by using SAS/Genetics (SAS Institute, Inc.). Allele frequency and genotype prevalence were calculated in SUDAAN and weighted by using the NHANES III Genetic Component sample weights for each gene variant for all major race/ethnic groups (i.e., non-Hispanic white, non-Hispanic black, Mexican American, and other) (data for “other” are not shown), age groups, and sexes. Point estimates and 95% confidence intervals were calculated and weighted for each race/ethnic group in SUDAAN to obtain the nationally representative estimates for the US population. The Taylor series linearization approach (48, 49), which derives a linear approximation of variance estimates to develop corrected standard errors and confidence intervals, was implemented to estimate variances. Tests of the difference in allele frequencies among race/ethnic groups (“other” was excluded), age groups, and sexes were performed by using polytomous logistic regression. Tests of the differences in genotype prevalence among these groups were evaluated using the Wald chi-square method. Statistical significance was considered as P<0.05. The differences in allele frequency and genotype prevalence by age and by sex were examined after adjustment for race/ethnicity by using the Cochran-Mantel-Haenszel test at a significance level of 0.05.
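The weighted estimation described above can be sketched in a few lines. This is a minimal illustration and not SUDAAN's implementation: it assumes each participant record carries a stratum label, a primary sampling unit (PSU) label, a genetic sample weight, and a minor-allele count of 0, 1, or 2. Those field names and the toy values are hypothetical, and the variance uses the common with-replacement Taylor linearization of a ratio estimator across PSUs within strata.

```python
from collections import defaultdict


def weighted_allele_frequency(records):
    """Weighted minor-allele frequency with a linearized standard error.

    records: iterable of (stratum, psu, weight, minor_allele_count),
    where minor_allele_count is 0, 1, or 2 for a diploid genotype.
    Returns (frequency, standard_error).
    """
    records = list(records)
    total_minor = sum(w * y for _, _, w, y in records)
    total_alleles = sum(2.0 * w for _, _, w, _ in records)
    freq = total_minor / total_alleles

    # Linearized score of the ratio estimator, summed to PSU totals.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in records:
        psu_totals[stratum][psu] += w * (y - 2.0 * freq) / total_alleles

    # Between-PSU variance within each stratum, summed over strata.
    variance = 0.0
    for psus in psu_totals.values():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("each stratum needs at least 2 PSUs")
        mean_h = sum(psus.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - mean_h) ** 2 for z in psus.values())
    return freq, variance ** 0.5


# Hypothetical toy data: (stratum, psu, weight, minor_allele_count).
toy_records = [
    ("s1", "a", 100.0, 0),
    ("s1", "a", 100.0, 1),
    ("s1", "b", 200.0, 2),
    ("s2", "a", 50.0, 1),
    ("s2", "b", 50.0, 0),
]
toy_freq, toy_se = weighted_allele_frequency(toy_records)
```

Because related household members fall within the same PSU, their correlation is absorbed into the between-PSU variance rather than being treated as independent information.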
Demographic characteristics of the 7,159 participants in the NHANES III DNA bank are described in Table 1. After adjustment for the NHANES III sample design and for nonresponse in the genetic component, there were slightly more women than men in the US population between 1991 and 1994. The weighted frequency for each of the 3 main race/ethnic groups was 73.5% non-Hispanic white, 11.7% non-Hispanic black, and 5.7% Mexican American. The highest weighted frequency of persons aged 60 or more years and the lowest weighted frequency of persons aged 12–19 years were observed in non-Hispanic whites.
Weighted allele frequency point estimates for the 90 genetic variants for each of the 3 major race/ethnic subgroups in the US population are shown in Table 2. Complete frequency estimates with confidence intervals are shown in Web Figure 1 and Web Appendix 4. Allele frequencies were significantly different across race/ethnic groups for 88 (97.8%) of the variants studied (P<0.05, two tailed); the 2 exceptions were rs4986893 (CYP2C19, no homozygotes for minor allele) and rs1801274 (FCGR2A). Summary allele frequencies among the 3 major race/ethnic groups are shown in Table 3. Of the 90 candidate gene variants, 80 (88.9%) among non-Hispanic whites, 79 (87.8%) among non-Hispanic blacks, and 80 (88.9%) among Mexican Americans had allele frequencies of 0.02 or greater.
Minor allele frequencies differed from those in non-Hispanic whites by more than 20% (absolute value) for 24.4% of polymorphisms (22 of 90) among non-Hispanic blacks and for 7.8% (7 of 90) among Mexican Americans (data not shown). Comparisons between NHANES III allele frequency estimates and other publicly available data sources are shown in Figure 1, with variants in MTHFR and VDR as examples. As observed, the NHANES III study includes much larger sample sizes, resulting in frequency estimates with narrow confidence intervals.
Significant differences in genotype prevalence across race/ethnic groups were seen for all variants except three: rs4986893 (CYP2C19), rs1801274 (FCGR2A), and rs2066470 (MTHFR) (Web Appendix 4). Deviations from Hardy-Weinberg proportions were seen for 4 of the 90 polymorphisms (4.4%) among non-Hispanic whites, 1 (1.1%) among non-Hispanic blacks, and 5 (5.6%) among Mexican Americans (P<0.01) (Web Appendix 2).
Weighted allele and genotype frequencies did not differ significantly by age group in the US population for the majority of the polymorphisms. However, 16 variants (17.8%) differed significantly in allele frequency, and 21 (23.3%) differed significantly in genotype prevalence by age (data not shown). After adjustment for race/ethnicity, these numbers decreased dramatically to 5 (5.6%) polymorphisms for allele frequency (Web Appendix 5) and to 14 (15.6%) variants for genotype prevalence (Web Appendix 6). However, we found that some of the race/ethnicity-adjusted tests may not be reliable because of zero cell counts.
There were no significant differences in allele frequencies or genotype prevalence by sex, except for rs1800451 (MBL2) and rs1800482 (NOS2A) (data not shown). After adjustment for race/ethnicity, the allele frequencies of 3 variants differed significantly by sex: rs2243248 (IL4), rs1800482 (NOS2A), and rs361525 (TNF) (Web Appendix 7). After adjustment for race/ethnicity, the genotype prevalence of 2 variants differed significantly between men and women: rs2031920 (CYP2E1) and rs2243248 (IL4) (Web Appendix 8). (All results presented in this study are available online from the website of the National Office of Public Health Genomics at CDC (http://www.cdc.gov/genomics/).)
Our study evaluated the allele frequency and genotype prevalence of polymorphisms that have known or proposed associations with common diseases in a large, minority-enriched, and nationally representative sample of the US population. This is the first relatively large-scale, population-based effort in the United States to obtain such data by race/ethnic group. These data and future planned analyses will serve as an important reference for investigations into US population structure, for examinations of gene–disease associations in other investigations of the NHANES data set, for calculation of attributable risk, and for use as a reference by researchers in the design of further studies to discover associations of alleles and genotypes with common diseases.
Estimates of allele frequency and genotype prevalence are available from a number of existing gene variant databases, including the International HapMap Project (21–23) (http://www.hapmap.org) and the SNP500Cancer Database (45) (http://snp500cancer.nci.nih.gov). However, comparisons between NHANES III and such databases are limited because of significant differences in inclusion criteria, study populations, and classification of racial and ethnic groups between NHANES III and the other studies. Especially important is that these public databases function as genomic discovery tools. Consequently, their study populations were drawn largely from a small number of non-population-based samples. These small numbers of participants preclude accurate estimation of allele frequency and genotype prevalence, especially for rare variants or those that vary significantly by race and ethnicity. We compared two variants in MTHFR and VDR with other data resources and found substantial differences in allele frequencies (Figure 1). SNP500Cancer reports the C allele frequency of rs731236 (VDR) as 48.3% (95% confidence interval (CI): 35.3, 61.3) in non-Hispanic whites and as 35.4% (95% CI: 21.4, 49.4) in the African-American population. However, NHANES III estimates are 38.1% (95% CI: 36.0, 40.3) and 28.2% (95% CI: 26.8, 29.6), respectively. In conclusion, the NHANES III estimates of allele frequency and genotype prevalence in the US population are more representative and stable than are those calculated from previously available data.
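The contrast in precision between the two sources can be made concrete by backing out the number of chromosomes implied by each confidence interval. The sketch below is a rough illustration only: it assumes a symmetric Wald interval for a binomial proportion under simple random sampling of independent chromosomes, which is not how the NHANES III intervals were computed, so for the survey the result is best read as an effective sample size. The function name is hypothetical; the input values are the rs731236 (VDR) estimates quoted above.

```python
def implied_chromosomes(freq_pct, lower_pct, upper_pct, z=1.96):
    """Approximate number of independent chromosomes implied by a Wald CI.

    Solves half_width = z * sqrt(p * (1 - p) / n) for n, with the
    frequency and interval bounds given in percent.
    """
    p = freq_pct / 100.0
    half_width = (upper_pct - lower_pct) / 200.0
    if half_width <= 0:
        raise ValueError("upper bound must exceed lower bound")
    return p * (1.0 - p) * (z / half_width) ** 2


# C allele of rs731236 (VDR) in non-Hispanic whites, as quoted in the text.
snp500_n = implied_chromosomes(48.3, 35.3, 61.3)
nhanes_n = implied_chromosomes(38.1, 36.0, 40.3)
ratio = nhanes_n / snp500_n
```

Under these assumptions the SNP500Cancer interval corresponds to a few dozen chromosomes and the NHANES III interval to an effective sample in the thousands, which is the sense in which the survey estimates are more stable.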
In this study, allele frequency (in 88 of 90 genetic variants) and genotype prevalence (in 87 of 90 variants) differed significantly by race/ethnic group. Non-Hispanic blacks had considerable differences in minor allele frequency compared with non-Hispanic whites, with almost one-quarter of variants differing by at least 20% (absolute difference). In contrast, less than 10% of variants differed by at least 20% in allele frequency between Mexican Americans and non-Hispanic whites. Differences in allele and genotype frequency could partially contribute to differences in disease occurrence between population subgroups. As an example, the Pro12Ala variant of PPARG (rs1801282) has been studied extensively in relation to type 2 diabetes, with the Pro allele (C) being associated with increased disease prevalence (50). This finding has been duplicated in some genome-wide association studies (13–15), although not in all populations (51, 52). The higher CC genotype prevalence in non-Hispanic blacks (95.0%) compared with non-Hispanic whites (75.8%) may be a strong contributing factor to the increased risk of type 2 diabetes among non-Hispanic blacks, as this PPARG variant has been estimated to have a large population attributable risk of ~25% (50). Because differences in the occurrence of common human diseases among populations reflect variation in genetic factors, environmental factors, and their interaction, population-based genotype data, when coupled with other disease risk factors, will give us better insight into the causes of population differences in the occurrence of various diseases.
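To show how genotype prevalence feeds into an attributable-risk calculation, the sketch below applies Levin's formula for the population attributable fraction, AF = P(RR − 1) / (1 + P(RR − 1)), to the two CC genotype prevalences quoted above. The relative risk used here is a hypothetical placeholder chosen for illustration; it is not taken from this study or from reference 50, and the genetic model behind the ~25% figure cited in the text may differ.

```python
def attributable_fraction(prevalence, relative_risk):
    """Levin's population attributable fraction.

    prevalence: proportion of the population carrying the risk genotype.
    relative_risk: risk in carriers divided by risk in non-carriers.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


HYPOTHETICAL_RR = 1.25  # illustrative assumption, not an estimate from the study

# CC genotype prevalence of rs1801282 (PPARG), as reported in the text.
af_black = attributable_fraction(0.950, HYPOTHETICAL_RR)
af_white = attributable_fraction(0.758, HYPOTHETICAL_RR)
```

For a fixed relative risk above 1, the attributable fraction rises with genotype prevalence, so a modest per-genotype effect can still account for a sizable share of cases when the risk genotype is very common.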
On the other hand, allele frequency and genotype prevalence did not differ significantly between men and women for most of the genetic variants studied (≥97.8%). Similar findings on allele frequency or genotype prevalence by sex have also been reported in some large studies (32, 33). Although we report statistically significant differences by age for approximately one-fifth of the genetic variants studied, most of these differences were no longer present after adjustment for race/ethnicity. This finding is likely attributable to the differences in age distribution between the race/ethnic groups (Table 1). Some of the significant differences in allele frequencies by age may indicate survival advantage, and other studies have found variants in or near MTHFR (53, 54), PON1 (55, 56), TLR4 (55, 57), and TNF (58) associated with aging or longevity. However, few genes have been reproducibly shown to do so (59, 60), and our results could be due to insufficient sample sizes or due to statistical chance in analyses.
There has been a concern that multiple individuals from a household were included (average household, 1.59 individuals; range, 1–11) in NHANES III for the estimation of allele and genotype frequencies. However, the estimates were calculated by using methods specifically designed to analyze data from surveys with complex designs. These methods adopt NHANES III sample weights and adjust the variance of the estimate among the correlated observations. NHANES III is a population-based survey that reflects the actual and overall genetic structure of the general US population, which contains many related individuals within or between subpopulations. Thus, inclusion of related individuals in the NHANES III survey should enhance the generalizability of estimates derived from these data.
Some potential limitations of this study are notable. First, NHANES III categorizes race and ethnicity according to self-reported affiliation, as do most epidemiologic studies. There is considerable literature on the accuracy of this social measure as a proxy for genetic ancestry (61–65). Despite the possible misclassification or oversimplification of genetic ancestry, these data may help to elucidate the uncertain contribution of genetic variation (65, 66) to the complex interactions among social, environmental, and behavioral influences in a diverse population that contribute to racial and ethnic health disparities. Another concern is that homozygotes were not detected for some rare polymorphisms in this study, and thus the statistical tests for these genetic variants may not be reliable. In addition, future studies of gene–disease associations and gene–environment interactions with rare variants may be limited by insufficient sample sizes when analyses are performed separately for each race/ethnic group and control for large numbers of variables.
In the near future, we plan to use race and ethnicity, as well as geographic information, to conduct a focused examination of the genetic substructure of the US population and subpopulations. This issue is generating increased interest, because latent population substructure has been discovered in populations previously thought to be relatively homogeneous (67, 68). Such analyses are, therefore, especially important given the heterogeneity of the US population and the high levels of admixture within African-American and Mexican-American populations (69–72). Multiple studies demonstrate that population substructure must be taken into account in the design and interpretation of genetic association studies (67, 68, 70, 73–75). Further research on population characteristics and genetic diversity will be invaluable in conducting genetic epidemiologic studies in the United States.
Determination of the prevalence of genetic polymorphisms associated with common diseases of public health importance in the US population and in subgroups of the population is a critical first step in evaluating the genetic epidemiology of complex diseases. These prevalence estimates can be used in predicting sample size requirements for future epidemiologic studies to evaluate genetic determinants of susceptibility to chronic and infectious diseases, the severity of disease, and interactions with other risk factors. Because data on genotype frequency are particularly sparse for non-Hispanic blacks and Mexican Americans, our estimates are useful in sample size calculations for studying the genomic contribution to the health of these populations. Investigations currently underway examine the associations of the reported genetic variants with select nutritional, biochemical, and clinical characteristics in the NHANES III data set that serve as markers or risk factors for numerous health outcomes. These outcomes include asthma and chronic obstructive pulmonary disease, diabetes, cardiovascular disease, viral infections, and osteoporosis.
With the recent successes of genome-wide association studies, the resource of the NHANES III DNA bank offers significant opportunities to move beyond investigations of candidate genes, as was done here. Many recent genome-wide association studies have uncovered replicable genetic associations with diseases such as breast (4–6), prostate (7, 9), and colorectal (8, 10) cancers; heart disease (11, 12); diabetes (13–17); and obesity (18–20). However, many of these large-scale, case-control studies did not use representative samples of the underlying populations from which the cases were derived. NHANES is the only nationally representative, population-based sample survey that systematically collects physical, physiologic, imaging, laboratory, and interview data on a large number of individuals in the United States. Use of a whole-genome approach to assess the prevalence of genetic polymorphisms, including copy number variants, in the NHANES III DNA bank will be an important next step toward identifying genetic variants that can help to predict disease susceptibility and progression. This approach will also provide the basis for estimating the numbers of people in the United States who may benefit from genome-based tools, such as risk factor reduction; disease screening efforts; or diagnostic tests, drugs, or other preventive or therapeutic interventions. Current and future NHANES III prevalence estimates will be deposited into a publicly accessible database for research.
Thus, this first effort in NHANES begins to lay a strong scientific foundation for studying the impact of genetic variation on common diseases in the United States and in the future evaluation of biomarkers and diagnostic tests. Information derived from NHANES will provide an important reference and will enhance the translation of genomic information into clinical and public health practice.
Author affiliations: National Office of Public Health Genomics, Centers for Disease Control and Prevention, Atlanta, Georgia (Man-huei Chang, Nicole F. Dowling, Ramal Moonesinghe, Renée M. Ned, Ajay Yesupriya, Muin J. Khoury); National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention, Centers for Disease Control and Prevention, Atlanta, Georgia (Mary Lou Lindegren, Mary R. Reichler); National Institute for Occupational Safety and Health, Centers for Disease Control and Prevention, Cincinnati, Ohio (Mary A. Butler); National Cancer Institute, National Institutes of Health, Rockville, Maryland (Stephen J. Chanock); National Center for Environmental Health, Centers for Disease Control and Prevention, Atlanta, Georgia (Margaret Gallagher); National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, Georgia (Cynthia A. Moore); National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, Maryland (Christopher L. Sanders); and Core Genotyping Facility, Division of Cancer Epidemiology and Genetics, Advanced Technology Program, SAIC Frederick, Inc., NCI-Frederick, Maryland (Robert Welch).
The authors would like to dedicate this article to the memory of Bob Welch, a treasured collaborator and friend.
This work was supported by the Centers for Disease Control and Prevention, Atlanta, Georgia.
The authors would like to thank the following individuals from the CDC/NCI NHANES III Genomics Working Group: National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, Georgia (Dr. Craig Hooper, Dr. Quanhe Yang); National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, Georgia (Dr. Karon Abe, Dr. Heidi M. Blanck, Dr. Ingrid J. Hall, Dr. Guiseppina Imperatore, Dr. Ann Malarcher, Dr. Glen Satten); National Center for Environmental Health, Centers for Disease Control and Prevention, Atlanta, Georgia (Dr. Amanda Brown, Dr. Omar Henderson, Dr. Deborah Koontz); National Center for Environmental Health, Agency for Toxic Substances and Disease Registry, Atlanta, Georgia (Olivia Harris); National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention, Centers for Disease Control and Prevention, Atlanta, Georgia (Dr. Michael Aidoo, Dr. Robert Chen, Dr. Janet McNicholl); National Center for Zoonotic, Vector-Borne, and Enteric Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia (Dr. Venkatachalam Udhayakumar); National Institute for Occupational Safety and Health, Centers for Disease Control and Prevention, Morgantown, West Virginia (Dr. Ainsley Weston); National Office of Public Health Genomics, Centers for Disease Control and Prevention, Atlanta, Georgia (Dr. Marta Gwinn, Tiebin Liu, Dr. Wei Yu); Office of the Director, Coordinating Center for Health Promotion, Centers for Disease Control and Prevention, Atlanta, Georgia (Dr. Karen Steinberg); National Cancer Institute, National Institutes of Health, Rockville, Maryland (Dr. Neil Caporaso, Amy Hutchinson); Office of the Medical Director, March of Dimes Foundation, New York, New York (Bruce Lin); Department of Medicine, University of Washington School of Medicine, Seattle, Washington (Dr. Jai Lingappa); Department of Epidemiology and Community Medicine, University of Ottawa, Ontario, Canada (Dr. Julian Little); Division of Medical Genetics, Department of Pediatrics, University of Utah School of Medicine, Salt Lake City, Utah (Dr. Lorenzo Botto); Department of Infectious Diseases, College of Veterinary Medicine, University of Georgia, Athens, Georgia (Dr. Tom Hodge); Program in Health Decision Science, Harvard University, Cambridge, Massachusetts (Davene Wright).
The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government.
Conflict of interest: none declared.