We implemented a population-based, case–cohort design, using ASD surveillance data from ADDM and publicly available birth certificate files from the National Center for Health Statistics (NCHS) (CDC 2009a) and the North Carolina State Center for Health Statistics (Howard W. Odum Institute for Research in Social Science at the University of North Carolina at Chapel Hill 2009). Our study population was defined as all children born in 1992, 1994, 1996, and 1998 who resided at birth within regions subsequently under ADDM surveillance during their 8th year of life. We excluded some counties with populations < 100,000 in which birth county is suppressed in birth certificate files and excluded some ADDM regions in which surveillance was incomplete within county boundaries. Finally, we restricted the sample to ADDM sites that were able to successfully obtain the needed birth certificate variables. Regions contributing at least 1 year of data to this study were 5 northern counties in Alabama, all of Arkansas, Miami-Dade County in Florida, 5 counties in metropolitan Atlanta in Georgia, Baltimore County and 5 surrounding counties in Maryland, 6 counties in metropolitan St. Louis in Missouri and Illinois, Union County just south of Newark in New Jersey, 10 counties surrounding Greensboro and Durham in North Carolina, Philadelphia County in Pennsylvania, 5 counties in southeastern Wisconsin including Milwaukee, and all of West Virginia.
Surveillance ascertainment and case subgroups.
Cases were all children with ADDM surveillance-ascertained ASDs born within the source population as defined above. The ADDM network has performed active, population-based surveillance for ASDs in select regions of the United States biennially since 2000. The surveillance methodology does not directly evaluate children, but relies on developmental records through the child’s 8th year of life at key agencies, including medical agencies, early intervention services, and public schools (CDC 2007; Van Naarden Braun et al. 2007). An ADDM clinician ascertained whether characteristics and behaviors in a child’s developmental record met the standardized ADDM case definition for ASDs, based on the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR) (American Psychiatric Association 2000). Information on these children was obtained in compliance with all applicable regulations for the protection of human health and educational data, including approval by institutional review boards in each ADDM region.
The ADDM network recorded variables that further characterized the phenotype for children meeting the ADDM case definition of ASD. Variables that were abstracted directly from the developmental record included a) community diagnosis: whether a community provider had ever diagnosed the child with autistic disorder (AD) and/or ASD not otherwise specified (ASD-NOS); and, if so, b) timing of diagnosis: whether the child’s age in months at the earliest documented diagnosis was early (dichotomized using the median value, < 50 months) or late (≥ 50 months); and c) co-occurring intellectual disability (ID), defined as IQ ≤ 70 on tests such as the Battelle–cognitive domain (Newborg 2004), Differential Ability Scales (Elliott 2007), Stanford-Binet–4th ed. (Thorndike et al. 1986), Wechsler Intelligence Scale for Children-III (Wechsler 1991), and the Wechsler Preschool and Primary Scale of Intelligence (Wechsler 1989). A fourth case subgrouping variable was newly derived by ADDM clinicians based on a review of the entire composite record; clinicians classified cases as AD, requiring documented symptoms corresponding to DSM-IV-TR criteria for AD, or ASD-NOS, requiring fewer or less severe symptoms and including Asperger’s disorder and pervasive developmental disorder.
Children could be identified as cases only if they resided within the surveillance regions when they were 8 years of age. We limited the case group to those children also born within the surveillance areas so that our case group arose from the underlying population. We determined county of birth from birth certificate data obtained by each ADDM surveillance site.
Maternal smoking in pregnancy and covariates. Information on maternal smoking during pregnancy was obtained from birth certificate data. Smoking status is recorded using a yes/no check box; the collection method varies by state but usually involves abstraction of the medical record. Demographic factors, including maternal education, age, marital status, and race/ethnicity, were also obtained from birth certificates. We used a variable of county population size available in the NCHS birth certificate data as a proxy for the urbanicity of each county.
Prevalence of ASDs and case subgroups by child and family characteristics, with characteristics of the source population
Primary statistical analysis.
We estimated prevalence ratios (PRs) of ASD by level of maternal smoking (yes/no) using logistic regression. We did not identify or remove cases from the denominator data; consequently, cases were included in the denominator data set representing all children born in the eligible geographic regions and birth years. Thus, odds ratios from these models are mathematically equivalent to PRs. We were unable to confirm the ASD status for individuals who moved out of the surveillance region between birth and 8 years of age, leading to a slight underestimation of prevalence. In multivariable models, we included factors that could confound the association between maternal smoking and ASD, and excluded factors that may act as causal intermediates because they are influenced by maternal smoking (e.g., low birth weight) (Cole and Hernan 2002; Greenland and Brumback 2002). Selected potential confounders included maternal education (modeled using restricted quadratic splines) (Durkin et al. 2010), race and ethnicity (categorized as non-Hispanic white, non-Hispanic black, Hispanic, or other) (Mandell et al. 2009), marital status (yes/no), and maternal age (restricted quadratic splines) (Durkin et al. 2008). Next, we evaluated whether county population size (in five categories as in NCHS data), birth year (as categories), and surveillance site (as categories) confounded our estimates.
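The equivalence between the odds ratio and the PR under this design can be illustrated with crude counts. The following is a minimal sketch, not the analysis code used in the study: the counts are hypothetical and the function names are illustrative. When the comparison group is the entire birth cohort (cases included), the cross-product ratio reduces algebraically to the ratio of prevalences, whereas a comparison group of noncases only gives a slightly different value.

```python
def cross_product_ratio(exp_cases, unexp_cases, exp_ref, unexp_ref):
    """Odds ratio comparing cases with a reference group, by exposure."""
    return (exp_cases * unexp_ref) / (unexp_cases * exp_ref)


def prevalence_ratio(exp_cases, exp_births, unexp_cases, unexp_births):
    """Ratio of ASD prevalence in exposed versus unexposed births."""
    return (exp_cases / exp_births) / (unexp_cases / unexp_births)


# Hypothetical counts (not from the study): ASD cases and all births,
# by maternal smoking status.
cases_smokers, births_smokers = 60, 10_000
cases_nonsmokers, births_nonsmokers = 450, 90_000

pr = prevalence_ratio(cases_smokers, births_smokers,
                      cases_nonsmokers, births_nonsmokers)

# Case-cohort comparison: reference group is the full birth cohort,
# with cases left in the denominator.
or_case_cohort = cross_product_ratio(cases_smokers, cases_nonsmokers,
                                     births_smokers, births_nonsmokers)

# Conventional comparison: reference group is noncases only.
or_noncases = cross_product_ratio(cases_smokers, cases_nonsmokers,
                                  births_smokers - cases_smokers,
                                  births_nonsmokers - cases_nonsmokers)

print(pr, or_case_cohort, or_noncases)
```

Because ASD is rare, the two odds ratios are close, but only the case–cohort version matches the PR exactly.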
We evaluated modification of the association between maternal smoking and ASD by a) child sex, because it has been found to modify other associations between environmental chemicals and neurodevelopment (Bellinger et al. 1990; Braun et al. 2011; Ris et al. 2004); b) maternal race/ethnicity and c) education; and variables that may capture differences in ADDM surveillance activities or general temporal or spatial trends: d) birth year and e) county population size. Modifiers were evaluated on the multiplicative scale by inspecting PRs stratified by the potential modifier and by performing likelihood ratio tests. The likelihood ratio tests compared a fully adjusted model to a model that additionally included cross-product terms between a potential modifier and maternal smoking. Factors for which the likelihood ratio test p-value was < 0.10 were considered to modify the association between maternal smoking and ASDs.
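The likelihood ratio test described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the log-likelihood values are hypothetical, and `chi2_sf` and `likelihood_ratio_test` are illustrative helper names. The test statistic is twice the difference in maximized log-likelihoods between the model with cross-product terms and the fully adjusted model without them, referred to a chi-square distribution with degrees of freedom equal to the number of added cross-product terms.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2 seed the recurrence.
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(half)), 0.5
    else:
        q, a = math.exp(-half), 1.0
    # Regularized upper incomplete gamma recurrence:
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (statistic, p-value) for nested models differing by df terms."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df)


# Hypothetical maximized log-likelihoods: adjusted model without and with
# a single smoking-by-sex cross-product term (df = 1).
stat, p_value = likelihood_ratio_test(-5123.40, -5121.20, df=1)
is_modifier = p_value < 0.10  # threshold stated in the text
print(stat, p_value, is_modifier)
```

A categorical modifier with more than two levels adds one cross-product term per non-reference level, so `df` would increase accordingly.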
We repeated our multivariable models for several case subgroups in exploratory analyses, assuming that different subgroups may exhibit differential susceptibility to tobacco smoke. Subgroups that corresponded to higher- and lower-functioning ASDs, such as subgroups based on co-occurring intellectual disabilities (ID), have been suggested as ASD endophenotypes that correspond to genetic liability (Szatmari et al. 2007). We used the following available variables from ADDM to define case subgroups: whether a prior community diagnosis was AD or ASD-NOS, the timing of first diagnosis (assuming that earlier diagnosis in part served as a marker of more numerous or more severe symptoms), ADDM-determined subgroup (AD or ASD-NOS), and the presence of co-occurring ID. Because of differences in the data available between sites and years, some analyses of ASD subgroups were limited to a data subset that contained the needed variables.
Distribution of maternal smoking during pregnancy for ASDs and case subgroups
Autism has consistently been found to be more prevalent in groups of higher social class in the United States, leading to concerns that autism may be under-ascertained in children of lower social class. Such gradients are found even in ADDM data, despite its active surveillance methodology that can recognize a case without a prior documented diagnosis (Durkin et al. 2010). To evaluate the impact of under-ascertainment on our results, we performed a sensitivity analysis correcting for such outcome misclassification. Because of the strong association between maternal education and smoking in pregnancy (CDC 2010), ascertainment that varies by maternal education has the potential to affect results, even without an assumption of differential ASD ascertainment within smoking strata. We used standard formulas to correct for outcome misclassification and varied the specificity assumptions as allowable without creating negative cell counts (Rothman et al. 2008). We adjusted the number of cases using different estimates of sensitivity in each stratum of maternal education, assuming the highest outcome sensitivity in the stratum with a college degree and comparatively lower sensitivity for all other educational strata based on the ASD prevalence observed in our data (see Supplemental Material, Table S1). This process assumed that only the surveillance ascertainment of ASD, but not the true prevalence of the condition, varied by maternal education. We constrained our stratum-specific sensitivity values so that the overall outcome sensitivity corresponded to that found in an ADDM evaluation study: 0.60 (Avchen et al. 2010). Outcome sensitivity values for ASD by strata of maternal education were as follows: college degree: 0.80; some college: 0.72; high school degree: 0.53; and less than high school: 0.35 (see Supplemental Material, Tables S1 and S2). We also performed outcome misclassification corrections separately for ADDM-determined AD and ASD-NOS (see Supplemental Material, Tables S1 and S3). After adjusting numbers of cases and controls using an Excel spreadsheet, we calculated confidence intervals (CIs) using PROC FREQ with the CMH option in SAS (SAS Institute Inc., Cary, NC) applied to the resultant simulated data (Robins et al. 1986).
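The correction step can be sketched with the standard back-calculation for outcome misclassification. If a* cases are observed among N children, and ascertainment has sensitivity Se and specificity Sp, the expected true number of cases is A = (a* − (1 − Sp)·N) / (Se + Sp − 1). The sketch below is illustrative only: the sensitivity values are those reported above, but the cell counts are hypothetical and `corrected_cases` is an illustrative name, not the spreadsheet calculation used in the study.

```python
# Stratum-specific outcome sensitivities reported in the text.
SENSITIVITY = {
    "college degree": 0.80,
    "some college": 0.72,
    "high school degree": 0.53,
    "less than high school": 0.35,
}


def corrected_cases(observed_cases, n_total, sensitivity, specificity=1.0):
    """Expected true case count given assumed sensitivity and specificity."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    corrected = (observed_cases - (1.0 - specificity) * n_total) / denom
    # Reject assumptions that would produce an impossible (e.g., negative) cell.
    if not 0.0 <= corrected <= n_total:
        raise ValueError("assumed values imply an impossible cell count")
    return corrected


# Hypothetical counts (not from the study): observed cases and births
# among smokers, by stratum of maternal education.
hypothetical_smokers = {
    "college degree": (8, 1_000),
    "some college": (18, 3_000),
    "high school degree": (53, 10_000),
    "less than high school": (28, 8_000),
}

corrected = {
    stratum: corrected_cases(cases, births, SENSITIVITY[stratum])
    for stratum, (cases, births) in hypothetical_smokers.items()
}
print(corrected)
```

With perfect specificity the correction reduces to dividing observed cases by the stratum sensitivity. Lowering the assumed specificity subtracts expected false positives first, which is why specificity can be varied only as far as cell counts remain non-negative.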
To evaluate a potential selection bias from including infants who died in the first year of life, we performed an analysis removing infant deaths in regions and years for which information on infant death was available, using the Birth Cohort Linked Birth–Infant Death Data Files from NCHS (CDC 2009a). These files were available for birth years 1996 and 1998 in counties with populations > 250,000.
We performed a subanalysis to evaluate the impact of residential mobility on our results, because the source population included children who had moved out of the study area and could not be identified as ASD cases at age 8. This subanalysis was limited to children from North Carolina born in 1994 and 1996. We traced the residential histories of a random sample of this birth cohort to determine residency within the surveillance catchment area at age 8. Tracing was conducted by searching on maternal and paternal names from the birth certificate using commercial databases of multiple residences over time provided by LexisNexis. We then compared PRs of smoking and ASD using a) a denominator of all included North Carolina children versus b) a denominator of children remaining within the North Carolina surveillance area.