The propensity score method, which is useful for addressing covariate imbalance in observational studies, has been underused in research on asthma epidemiology. Objective: To examine the impact of neighborhood environment on asthma incidence by applying the propensity score method.
The study was designed as a retrospective cohort study. Study subjects were all children born in Rochester, Minn, between 1976 and 1979. Asthma status was previously determined by applying predetermined criteria. We applied the propensity score method to match children who lived in census tracts facing or not facing intersections with major highways or railroads. The propensity score of children living in a census tract facing intersections was formulated from a logistic regression model with 16 variables that may not be balanced between comparison groups. The Cox proportional hazard models were used in the matched samples to estimate hazard ratios of neighborhood environment and some other variables of interest and their corresponding 95% CIs.
After matching with propensity scores, we found that children who lived in census tracts facing intersections with major highways or railroads had a higher risk of asthma (hazard ratios, 1.385–1.669 depending on the matching methods) compared with the matched counterparts who lived in census tracts not facing intersections with major highways or railroads.
Neighborhood environment may be an important risk factor in understanding the development of pediatric asthma. The propensity score method is a useful tool in addressing covariate imbalance and exploring for causal effect in studying asthma epidemiology.
The influence of neighborhood environment on health outcomes is becoming an important area of research given the significant regional variation in health outcomes, including asthma, and the different levels of collective resources (eg, access to health care services) among neighborhoods.1–6 When investigating the role in asthma outcomes of an exposure or treatment that cannot be assigned randomly, such as neighborhood environment, covariate imbalance between the 2 comparison groups is a significant obstacle for observational studies. In this respect, the propensity score approach can be considered an effective way to reduce covariate imbalance. The concept of the propensity score was introduced by Rosenbaum and Rubin7 in the early 1980s. A propensity score is the conditional probability of assignment of a subject to a particular intervention (eg, treatment vs comparison) given a set of observed covariates. The main rationale for applying the propensity score to assess the influence of neighborhood environment on health outcomes (eg, asthma incidence) is to reduce covariate imbalance between comparison groups in observational studies. This covariate imbalance arises because assignment to a specific setting is not randomized, in contrast with designed experiments (ie, randomized controlled trials). Thus, proper application of the propensity score to an observational study creates a quasi-randomized experiment by balancing covariates between settings.8 This approach is useful for observational studies that have no control over the assignment of a subject to exposure or treatment status. Although the propensity score approach has been used in the epidemiologic literature concerning a variety of health outcomes,9–11 few studies have applied it to research concerning asthma epidemiology.
Currently, the propensity score methods have been underused in studying asthma epidemiology, especially asthma incidence in observational studies.
In this report, we assessed the influence of neighborhood environment on asthma incidence in the 1976 to 1979 birth cohort in Rochester, Minn, by using a propensity score approach. A key step in this approach is the choice of matching scheme (eg, incomplete or exact vs nearest available matching). Currently, there is little guidance on which matching methods are desirable or suitable when applying the propensity score approach. To our knowledge, this is the first study to assess the effect of neighborhood environment on asthma incidence by using the propensity score method.
The study was approved by the institutional review boards at the Mayo Clinic and Olmsted Medical Center.
Study subjects were the population-based birth cohort, which has been previously described.13,14 Briefly, all children born in Rochester between January 1, 1976, and December 31, 1979, were identified by using computerized birth certificate information obtained from the Minnesota Department of Health, Division of Vital Statistics. Information on incident asthma cases for the children in the birth cohort was obtained through merging data from a previously assembled database designed to examine the incidence of asthma among all Rochester residents between 1964 and 1983 with the data from the previously mentioned birth cohort.15
The criteria for identifying asthma cases have been previously described and are delineated in Table I.6,15 Subjects of the study were children who were residents of Rochester who experienced onset of asthma between January 1, 1976, and December 31, 1983.
The exposure status of interest in this study was whether a subject lived in a census tract facing intersections with major highways or railroads. Data on a child’s census tract were based on the mother’s census tract residency at delivery.
There were 16 census tracts in Rochester, Minn. Census tracts are small, relatively permanent statistical subdivisions of a county, usually having between 2500 and 8000 persons.16 Census tracts tend to share similar population characteristics and often reflect neighborhoods.2 Thus, the census tract level has been suggested for area-level analysis in preference to other census-based area-level measures.17 We categorized census tracts into 2 groups according to whether a census tract faced either an intersection with 2 major highways or an intersection with a major highway and a railroad. This variable served as a proxy for exposure to higher traffic volumes (ie, a census tract facing intersections with highways or a railroad was considered to have a higher level of exposure to traffic volumes). Previous studies have used traffic intersections as markers for increased traffic-related pollutants.18–20 The Environmental Protection Agency uses intersection level as 1 of its major criteria for screening for potential carbon monoxide hotspots.19
The propensity score is a conditional probability that a subject would live in a census tract facing intersections given all observed covariates. It can be mathematically expressed as

$$e(x) = \Pr(z = 1 \mid x)$$
where e(x) is the propensity score, z is an exposure or treatment status (ie, z_i = 1 for the principal treatment or exposure vs z_i = 0 for the comparative one), and x is a vector of covariates. Given e(x), the assignment to a particular setting z is independent of the observed covariates x. The propensity score is a function of the observed covariates (ie, a scalar summary of the vector of observed covariates that ranges between 0 and 1). In this study, the main exposure status of interest was the binary variable z indicating whether a subject lived in a census tract facing either an intersection with 2 major highways or an intersection with a major highway and a railroad (z = 1) or not (z = 0). The conditional distribution of covariates, x | e(x), is the same for the group of subjects assigned to the principal exposure or treatment (ie, those who lived in census tracts facing intersections, z = 1) and the comparison group (ie, those who lived in census tracts not facing intersections, z = 0).7,8 Assignment to the exposure, z, can be considered a random process if independence of the assignment to exposure z and the health outcome can be assured given the propensity score e(x). Thus, subjects with a given propensity score will have similar covariate (x) distributions in the group with the principal exposure and the group with the comparative one. In other words, the main rationale for the propensity score is that it makes the comparison groups (ie, the groups defined by whether the census tract faced intersections with major highways or railroads in this study) more comparable, by reducing covariate imbalance through matching on the propensity score, when assessing the association of exposure with outcomes (ie, asthma status in this study). Making the comparison groups more comparable is the main reason for randomization in a clinical trial and is difficult, if not impossible, to accomplish in observational studies.
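The balancing property described in this paragraph can be written compactly. The following is the standard statement from the propensity score literature, in the notation used here; it is a restatement rather than an additional result:

$$x \perp z \mid e(x), \qquad \Pr\{x \mid z = 1, e(x)\} = \Pr\{x \mid z = 0, e(x)\}$$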
The propensity score for each subject was obtained by fitting a logistic regression model with the exposure variable (ie, whether the subject’s census tract faced intersections with major highways or railroads) as the outcome and the following covariates: sex, number of prenatal visits, birth weight, ethnic group, number of siblings at birth, age of father at birth, age of mother at birth, marital status, mother’s education level, complication not related to pregnancy, induction, complication related to labor, known hospitalization ever after index date of asthma, family history of atopic disease, and smoking history. We used all these observed variables regardless of statistical significance. After the propensity score was constructed, we matched subjects on the propensity score by using 2 approaches: (1) nearest available matching on the propensity score, and (2) matching within a caliper of 0.2 SD of the logit of the propensity score. This caliper of 0.2 SD of the logit of the propensity score was suggested by Austin21 after an extensive simulation study. In this article, we compared the 2 matching methods with respect to covariate imbalance and conducted a sensitivity analysis of the effect estimates, expressed as hazard ratios (HRs).
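The 2 matching approaches can be illustrated with a minimal sketch. It is not the code used in this study: the function name `greedy_match`, the subject ids, and the propensity scores are hypothetical, the scores are taken as already estimated, and both modes match on the logit of the score for simplicity. The caliper width is assumed to be 0.2 times the SD of the logit over all subjects combined.

```python
import math
import statistics


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def greedy_match(exposed, unexposed, caliper_sd=None):
    """Greedy 1:1 matching without replacement on the logit of the score.

    exposed, unexposed: dicts mapping a subject id to a propensity score.
    caliper_sd: None keeps every nearest available match; a number such as
    0.2 drops pairs farther apart than caliper_sd * SD of the logit.
    """
    exposed_logit = {i: logit(p) for i, p in exposed.items()}
    available = {j: logit(p) for j, p in unexposed.items()}

    width = None
    if caliper_sd is not None:
        # Assumption: the SD is taken over the logits of all subjects combined.
        all_logits = list(exposed_logit.values()) + list(available.values())
        width = caliper_sd * statistics.stdev(all_logits)

    pairs = []
    for i, target in exposed_logit.items():
        if not available:
            break
        # Nearest still-unmatched unexposed subject on the logit scale.
        j = min(available, key=lambda k: abs(available[k] - target))
        if width is None or abs(available[j] - target) <= width:
            pairs.append((i, j))
            del available[j]  # each unexposed subject is used at most once
    return pairs


# Hypothetical propensity scores for three exposed and three unexposed children.
exposed = {"a": 0.30, "b": 0.60, "c": 0.95}
unexposed = {"u": 0.32, "v": 0.55, "w": 0.10}

complete_pairs = greedy_match(exposed, unexposed)
caliper_pairs = greedy_match(exposed, unexposed, caliper_sd=0.2)
```

In this toy example, complete matching pairs subject "c" with the distant subject "w", whereas the caliper leaves "c" unmatched. This is the trade-off between inexact complete matching and incomplete matching discussed in this article.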
After matching on the propensity score between children who lived in census tracts facing intersections with major highways or railroads and those who lived in census tracts not facing such intersections, we assessed covariate imbalance before and after matching for each method. The estimated propensity scores were compared between the 2 groups by using histogram plots, which are presented in the Online Repository (Figs E1 and E2). We report the standardized difference and bias reduction for each observed covariate by each matching method. The standardized difference S is calculated as follows:

$$S = \frac{\bar{x}_{\text{exposed}} - \bar{x}_{\text{unexposed}}}{\sqrt{\left(s_{\text{exposed}}^{2} + s_{\text{unexposed}}^{2}\right)/2}}$$

where $\bar{x}_{\text{exposed}}$ and $\bar{x}_{\text{unexposed}}$ are the means of a given covariate for the exposed (subjects who lived in census tracts facing intersections with major highways or railroads) and unexposed (those who lived in census tracts not facing such intersections), respectively. The sample SDs of the covariate for the exposed and unexposed are $s_{\text{exposed}}$ and $s_{\text{unexposed}}$, respectively. The bias reduction (BR) is calculated as

$$\mathrm{BR} = 1 - \frac{\left|S_{\text{matched}}\right|}{\left|S_{\text{unmatched}}\right|}$$

where the absolute value of the standardized difference in means for the matched sample is divided by the absolute value of the standardized difference in means for the unmatched sample.
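A minimal sketch of these 2 balance diagnostics follows. It assumes a common form of the standardized difference (the difference in means divided by the square root of the average of the 2 sample variances) and defines bias reduction as 1 minus the ratio of absolute standardized differences after and before matching. The function names and the example values are illustrative, not data from this study.

```python
import math
import statistics


def standardized_difference(exposed, unexposed):
    """Difference in means scaled by the pooled sample SD of the two groups."""
    mean_diff = statistics.mean(exposed) - statistics.mean(unexposed)
    pooled_sd = math.sqrt(
        (statistics.variance(exposed) + statistics.variance(unexposed)) / 2
    )
    return mean_diff / pooled_sd


def bias_reduction(s_matched, s_unmatched):
    """Share of the pre-matching imbalance removed by matching."""
    return 1 - abs(s_matched) / abs(s_unmatched)


# Hypothetical covariate values (eg, number of siblings) in each group.
s_example = standardized_difference([1, 2, 3], [2, 3, 4])

# Hypothetical standardized differences before and after matching.
br_example = bias_reduction(s_matched=0.05, s_unmatched=0.20)
```

A bias reduction close to 1 means that matching removed most of the imbalance in that covariate; a negative value means that the imbalance increased after matching.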
As discussed, the propensity score was constructed by using a logistic regression model with 16 variables that predicts the conditional probability of living in a census tract facing intersections with major highways or railroads. Then the 2 matching schemes described previously were applied to obtain the matched data sets. Because of the 1:1 matching, each matched data set contains equal numbers of children who lived in a census tract facing an intersection and children who lived in a census tract not facing an intersection. Multivariate Cox proportional hazards models of neighborhood environment were applied to the propensity-matched data sets with the following covariates: sex, body weight, mother’s education level, mother’s age at child’s birth, and mean family income per census tract. The matched sets were used as strata in the marginal approach of the multivariate Cox proportional hazards models. For matching within a caliper, we used 0.2 SD as the caliper on the basis of the literature.21 We compared the multivariate analysis results, including neighborhood environment and the relevant covariates, between the 2 matching schemes and with the multilevel survival analysis conducted previously.
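To show how matched pairs enter a stratified Cox model, the sketch below handles the simplest special case only: a single binary exposure, no other covariates, and each 1:1 matched pair treated as its own stratum. In that case the partial likelihood is maximized by the ratio of pairs in which the exposed child has the first event to pairs in which the unexposed child does. This is a deliberate simplification, not the multivariate marginal model used in this study, and the function name and data are hypothetical.

```python
import math


def paired_hazard_ratio(pairs):
    """Hazard ratio for a binary exposure from 1:1 matched pairs.

    pairs: list of ((time_exposed, event_exposed),
                    (time_unexposed, event_unexposed)),
    where event is 1 for asthma onset and 0 for censoring.
    """
    exposed_first = 0
    unexposed_first = 0
    for (t1, d1), (t0, d0) in pairs:
        # A pair is informative only if an event occurs while both members
        # are still at risk; tied times are ignored in this sketch.
        if d1 and t1 < t0:
            exposed_first += 1
        elif d0 and t0 < t1:
            unexposed_first += 1
    if exposed_first == 0 or unexposed_first == 0:
        raise ValueError("need informative pairs in both directions")
    hr = exposed_first / unexposed_first
    # Large-sample standard error of log(HR) for this closed-form estimate.
    se_log_hr = math.sqrt(1 / exposed_first + 1 / unexposed_first)
    ci = (hr * math.exp(-1.96 * se_log_hr), hr * math.exp(1.96 * se_log_hr))
    return hr, ci


# Hypothetical follow-up times in years for five matched pairs.
example_pairs = [
    ((2, 1), (5, 0)),
    ((3, 1), (4, 1)),
    ((6, 0), (1, 1)),
    ((5, 1), (3, 0)),  # unexposed child censored first: not informative
    ((2, 1), (7, 0)),
]
hr, ci = paired_hazard_ratio(example_pairs)
```

Adjusting for additional covariates, as in the models reported here, requires fitting the model numerically with survival analysis software rather than using this closed form.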
From 1976 to 1979, a total of 3970 children were born to mothers who were residents of the city of Rochester at the time of their delivery. Eleven children died at birth, yielding 3959 children in the birth cohort for follow-up. Of these 3959 children (mean follow-up of 4.9 person-years), 1023 (25.8%) moved out of the city during the study period, and 33 (0.8%) died, all without a previous diagnosis of definite or probable asthma. However, we excluded from our analysis 26 children who did not have census tract information. Thus, 3933 children were included in our study. A total of 215 children were diagnosed with asthma between 1976 and 1983. Of the study cohort, 51.4% were boys and 97% were white.
We assessed the overlap of propensity scores between children who lived in census tracts facing intersections with major highways or railroads and those who lived in census tracts not facing such intersections. The results are summarized in Figs E1 and E2, which depict similar propensity score distributions in the matched cohorts, suggesting successful matching. Then we assessed covariate imbalance before and after matching for each matching method (ie, nearest available matching vs matching within a caliper). The results before matching are summarized in this article’s Table E1 in the Online Repository at www.jacionline.org, and the results after matching are summarized in Tables E2 and E3 in the Online Repository. Propensity score matching substantially reduced the standardized differences, suggesting that both matching methods reduced covariate imbalance between the comparison groups.
Seven census tracts (44%) faced intersections with major highways or railroads. Of the 3933 study subjects, 1947 subjects lived in census tracts that faced intersections, and among these subjects, 124 developed asthma (6.4%), whereas 1986 subjects lived in census tracts that did not face intersections, and among these subjects, 90 subjects developed asthma (4.5%; P = .011).
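The crude comparison above can be reproduced from the reported counts. The article does not name the test behind P = .011, so the sketch below assumes a Pearson chi-square test without continuity correction on the 2 × 2 table; the function and variable names are illustrative.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Chi-square statistic and P value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts reported in this article.
exposed_cases, exposed_total = 124, 1947
unexposed_cases, unexposed_total = 90, 1986

risk_exposed = exposed_cases / exposed_total
risk_unexposed = unexposed_cases / unexposed_total
risk_ratio = risk_exposed / risk_unexposed

chi2, p_value = pearson_chi_square_2x2(
    exposed_cases,
    exposed_total - exposed_cases,
    unexposed_cases,
    unexposed_total - unexposed_cases,
)
```

Under this assumption the P value rounds to the reported .011, and the crude risk ratio is about 1.4, in line with the hazard ratios estimated after matching.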
After nearest available matching on the estimated propensity score, children who lived in census tracts facing intersections with major highways or railroads had a higher risk of developing asthma than those who lived in census tracts not facing such intersections in a univariate analysis (HR, 1.4; 95% CI, 1.05–1.87; P = .022). Matching within a caliper of 0.2 SD of the logit of the propensity score, which excluded 635 subjects, did not show a statistically significant association in the univariate analysis (HR, 1.14; 95% CI, 0.91–1.43; P = .26). We examined why incomplete matching produced different results by assessing the differences between the excluded subjects and the rest. Of the 635 subjects excluded because of incomplete matching, 298 children lived in census tracts facing intersections with major highways or railroads, whereas 337 children lived in census tracts not facing such intersections. This exclusion made certain variables that did not differ in the overall cohort, such as sex (P = .0048), ethnicity (P < .001), multiple birth status (P = .0254), age of mother at birth (P < .001), complication not related to pregnancy (P = .025), and induction of labor (P = .003), significantly different between the comparison groups. Thus, exclusion of these subjects as a result of incomplete matching within a caliper (0.2 SD) appeared to distort the study cohort considerably.
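The reported exclusion counts can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in this article (the variable names are illustrative) and confirms that the caliper exclusions leave equal numbers of subjects in each group, as expected for 1:1 matching.

```python
# Group sizes before matching and exclusions under caliper matching,
# as reported in this article.
exposed_total, unexposed_total = 1947, 1986
excluded_exposed, excluded_unexposed = 298, 337

excluded_total = excluded_exposed + excluded_unexposed
matched_exposed = exposed_total - excluded_exposed
matched_unexposed = unexposed_total - excluded_unexposed
excluded_share = excluded_total / (exposed_total + unexposed_total)
```

Both groups retain 1649 subjects after caliper matching, and the 635 excluded subjects make up about 16% of the 3933 children analyzed.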
The effect estimates according to the different matching methods in multivariate models are summarized in Table II. The different matching methods resulted in similar findings in multivariate models. The results suggest that children who lived in census tracts facing intersections with major highways or railroads had a higher risk of developing asthma compared with those who lived in census tracts not facing such intersections. In addition, the sex of the study subjects, maternal educational level, and maternal age at the birth of the subject significantly affected asthma incidence. We also performed an unmatched analysis by using a Cox model (ie, a conventional regression method). The effect size and statistical significance of facing intersections with major highways or railroads were similar to those based on the propensity score approaches.
Based on the complete matching using nearest propensity score, children who lived in census tracts facing the intersections with major highways or railroads had about 40% to 70% increased risks of developing childhood asthma compared with those who lived in census tracts not facing such intersections. The effect size can be compared with those previously obtained from multilevel survival analysis methods (HR, 1.6; P = .018, ie, 60% increased risk of asthma),6 and the hazard estimates of facing intersections from propensity score method and multilevel method appear to be similar. Our previous work based on a multilevel analysis focused on assessing the role of neighborhood environment in asthma incidence, whereas in this current work, we highlight the usefulness of the propensity score approach in asthma research.
Although the effect sizes appear to be modest, we believe they are still significant given the modest reported effects of environmental tobacco smoke on asthma incidence (odds ratio [OR], 1.13–1.31 depending on age of child),22,23 asthma prevalence (OR, 1.21–1.5 depending on smoking status of both parents), the prevalence of wheeze (OR, 1.24–1.47), and lower respiratory illness (OR, 1.5–1.72) in young children.24 Although it is debatable whether the estimates after propensity score matching need to be adjusted for relevant covariates, we adjusted to obtain conservative estimates. The univariate results based on nearest propensity score matching between children who lived in census tracts facing and not facing the intersections with major highways or railroads were not significantly different from those of a multivariate model, but the effect size of the estimates depended on the type of matching.
Ecologic studies have shown that the amount of fine particulates (eg, sulfate-containing aerosols) is significantly higher at highway intersection sites (29%) than at comparison sites such as suburban areas (7%).18 Studies at the individual level have reported that among primary school children living within 150 m of a main road, the risk of wheezing increases by an OR of 1.08 per 30-m increment,25 and this finding has been corroborated by others.26,27 Assuming that whether a census tract faces intersections with highways or railroads reflects the proximity of children’s residences to traffic, children who live in a census tract closer to intersections are likely to have increased exposure to traffic-related pollutants, such as nitrogen dioxide, which may trigger asthma symptoms.24 Therefore, given the consistent results between 2 different methods and the support from the literature, neighborhood environment needs to be considered and included in designing a study of asthma epidemiology.28 Further discussion of the literature pertaining to the association between neighborhood environment and asthma incidence appears in our previous article.6
However, in applying propensity score methods to assess the role of an exposure in outcomes in an observational study, it is important to apply an appropriate matching method.21 In our study, when we applied incomplete matching within a caliper (0.2 SD), whether a child lived in a census tract facing the intersections did not significantly influence the risk of asthma in the univariate analysis, whereas it did in the multivariate analysis. This is because the excluded subjects who lived in census tracts facing the intersections disproportionately represented children with asthma. Also, the excluded subjects (n = 635, 16% of the cohort) were significantly different from the original cohort with regard to sex (P = .0048), ethnicity (P < .001), multiple birth status (P = .0254), age of mother at birth (P < .001), complication not related to pregnancy (P = .025), and induction of labor (P = .003). Certain variables such as sex, birth weight, and ethnicity are important covariates or confounders. Thus, in our study, exclusion of a subsample of subjects distorted the study sample in a way that favored the null hypothesis. Rosenbaum and Rubin29 suggested that incomplete matching introduces more bias than inexact complete matching using a nearest available matching method. They also suggested that residual covariate imbalance resulting from inexact matching can be adjusted for in a regression model.29 In line with this recommendation, we suggest applying both matching methods (nearest matching vs exact matching within a caliper) for analysis. If subjects excluded as a result of matching within a caliper are substantially different from the original cohort, the nearest matching method, which introduces a nondifferential misclassification bias with regard to asthma status, needs to be considered, and adjustment for residual covariate imbalance by using a regression method can be considered.
Another aspect to be considered in choosing a matching method is the degree to which each method reduces the standardized differences. In our study, complete matching using the nearest propensity score achieved a degree of reduction in standardized differences similar to that of exact matching within a caliper. Thus, in our study, we believe complete matching with the nearest propensity score is more suitable than exact matching within a caliper. The conventional approach (ie, an unmatched Cox model) produced results similar to those of the propensity score method, which may suggest no significant gain from the propensity score method. However, the propensity score method took into account all covariates used for formulating the propensity score for matching, whereas the conventional unmatched Cox model did not include 12 potential covariates, which might have affected the results significantly if included.
Strengths of our study include its population-based longitudinal design and reliable definition of incident asthma. There are also inherent limitations in our study because of its retrospective design. Multidimensional aspects of socioeconomic status (SES) could not be addressed because no other indicators of SES were available in the current data set. There is a potential misclassification bias as a result of migration across census tracts because the results were based on census tract information at the time of delivery, not at the onset of asthma. However, it is likely to be a nondifferential misclassification bias without regard to exposure (census tract facing or not facing intersections) or outcome (asthma status). Some potential covariates were not included in the model, such as atopy status, air quality index, exposure to tobacco smoke or infection, and breast-feeding. Because these factors have been reported to be associated with individual SES and we included maternal educational level as a measure of SES in calculating the propensity score, their omission is unlikely to change the results significantly, but future studies need to take these factors into account.30 Last, given the predominantly white population, our results may not be generalizable to other ethnic groups, but at the cost of generalizability, our study population allows us to separate the effect of ethnicity from that of SES or SES-related factors.31–33
In conclusion, neighborhood environment is an important construct in studying asthma epidemiology. In studying the influence of neighborhood environment on childhood asthma, the propensity score approach is a useful method. We suggest applying both nearest matching with propensity score and exact matching within a caliper for matching. We also suggest choosing a matching method depending on comparison between the excluded subjects and the original cohort as well as standardized differences.
Supported by the Scholarly Clinician Award from the Mayo Foundation and made possible by the Rochester Epidemiology Project (R01-AR30582) from the National Institute of Arthritis and Musculoskeletal and Skin Diseases.
Disclosure of potential conflict of interest: The authors have declared that they have no conflict of interest.
Clinical implications: In understanding asthma, clinicians need to consider the neighborhood environment in addition to the home environment of individuals. Guidelines for the assessment of neighborhood environment in practice and research may be needed.