This study uses data from the Utah Population Database (UPDB). The UPDB is one of the world’s richest sources of linked population-based information on demographic, genetic, epidemiological, and public health outcomes. It includes information on over 7 million individuals spanning two centuries. Measures of height and weight, from which we calculate BMI, overweight, and obesity, as well as spatial location, are obtained from contemporary driver license data that have been included in the UPDB under an agreement with the Utah Department of Public Safety. As part of the University of Utah’s Institutional Review Board approval process, the UPDB staff retains the driver license address information and provides researchers with driver license BMI information linked to census block groups via Universal Transverse Mercator (UTM) coordinates. Height and weight information is converted to BMI (weight in kg/(height in m)²) and then recoded to categorical measures of overweight (25 ≤ BMI < 30) and obesity (BMI ≥ 30) in relation to healthy weight (18.5 ≤ BMI < 25). We exclude individuals who are underweight (BMI < 18.5) from our analysis because these individuals may have health conditions that limit their physical activity. We apply the adult guidelines for overweight to the youth in our sample because previous research has established that youth aged 14 and over generally follow adult weight classifications (Dietz 1999; Dietz and Bellizzi 1999).
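The BMI conversion and recoding described above can be sketched as follows. This is a minimal illustration, not the UPDB processing code: the function names are hypothetical, and it assumes height and weight have already been converted to metres and kilograms.

```python
from typing import Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def weight_category(bmi_value: float) -> Optional[str]:
    """Map a BMI to the categories used in the text.

    Returns None for underweight cases (BMI < 18.5), which the study excludes.
    """
    if bmi_value < 18.5:
        return None
    if bmi_value < 25:
        return "healthy"
    if bmi_value < 30:
        return "overweight"
    return "obese"
```

The cut-points are half-open intervals, so a BMI of exactly 25 is classified as overweight and a BMI of exactly 30 as obese, matching the inequalities in the text.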
The UPDB has the advantage of extensive coverage but two potential limitations: reliance on self-reported weight and a time lag between the measurement of the physical environment and the weight measures. The weight data likely share the limitations of self-reported weight in other studies. Specifically, individuals often underestimate their weight (Nawaz, Chan et al. 2001; Gorber, Tremblay et al. 2007). Nevertheless, self-reported weights, such as those in the CDC Behavioral Risk Factor Surveillance System (BRFSS), have proved valuable for monitoring obesity trends in the United States (Mokdad, Ford et al. 2003; Centers for Disease Control and Prevention 2007). Given self-reported weight underestimation, the time lag between census and driver license data, and the fact that individuals typically gain weight over time, the estimates in this study are likely underestimates of current weight. We have no evidence, however, that reporting errors for weight are associated with geography. Moreover, the effects of weight misreporting on our estimates are mitigated when self-reported values are used to derive BMI categories (i.e., there are fewer individuals who are misclassified because of the reliance on self-reported weight).
For this study, we select individuals in the UPDB who were aged 17 to 20 or 27 to 30 in 2000, who had valid driver licenses, and who lived in Salt Lake County. These age and geographic restrictions result in samples of 23,334 males and 21,021 females who were between the ages of 17 and 20, and 30,142 males and 26,078 females between the ages of 27 and 30. The age category 27 to 30 is used as the comparison because the majority of these individuals have established their own residences (White 1994), completed schooling, and exercised choice in their residential location. We focus on residents of Salt Lake County because of its considerable variation in neighborhood diversity, density, and design as measured for 564 census block groups in the county (Smith, Brown et al. 2008; Zick, Smith et al. 2009).
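The selection rule in this paragraph amounts to a simple filter. The sketch below is illustrative only: the function name, argument names, and the county label are hypothetical, not UPDB field names.

```python
from typing import Optional


def age_cohort(age_in_2000: int, has_valid_license: bool, county: str) -> Optional[str]:
    """Return the study cohort for a person, or None if they are not selected."""
    # Assumed label for the county of residence; the real coding is not given in the text.
    if not has_valid_license or county != "Salt Lake":
        return None
    if 17 <= age_in_2000 <= 20:
        return "17-20"
    if 27 <= age_in_2000 <= 30:
        return "27-30"
    return None  # ages outside both windows (e.g. 21 to 26) are not sampled
```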
The Federal Highway Administration (2006) estimates that, nationally, 57% of 17-year-olds had a driver license in 2000. By age 20, the percentage was 77%. By ages 27–30, the percentage of individuals with driver licenses was over 90%. Unfortunately, we do not have percentages for Salt Lake County, but we assume the fraction holding driver licenses in the county mirrors these national numbers. Utah requires that drivers provide height and weight information at the time they get their license and that it be updated after a change of residence, a name change, loss of license, or at the time of renewal, which is required every ten years. Assuming that most of the 27–30 year olds recently renewed their driver licenses, both age groups should have relatively current height and weight reports. We choose to focus on age in 2000 because it represents the year with the most census data.
Neighborhood characteristics taken from the 2000 Census and measured at the block group and census tract level are linked to individuals in the UPDB based on the UTM coordinates of their residences. Measures of density, housing age, and percentage of residents who walk to work are assessed at the block group level. Pedestrian-friendly design is measured by street connectivity, and our proxy for this is the number of intersections within one kilometer of the resident’s home. Street connectivity is derived from street data in the U.S. Census TIGER/Line file (U. S. Bureau of the Census 2008).
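The connectivity proxy can be illustrated with a short sketch. It assumes that "within one kilometer" means straight-line distance and that the home and all intersections are expressed in the same UTM zone, so coordinates are in metres; the text does not say whether straight-line or network distance was used. The function name and inputs are hypothetical.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (easting, northing) in metres, same UTM zone


def intersections_within(home: Point,
                         intersections: Iterable[Point],
                         radius_m: float = 1000.0) -> int:
    """Count intersections whose straight-line distance from home is <= radius_m."""
    hx, hy = home
    return sum(
        1 for (x, y) in intersections
        if math.hypot(x - hx, y - hy) <= radius_m
    )
```

Because UTM is a projected coordinate system measured in metres, planar Euclidean distance is a reasonable approximation over a one-kilometer radius.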
At the individual level, all analyses control for gender as recorded on the driver license. Additional socio-demographic census variables taken from the 2000 census include neighborhood racial/ethnic composition (the proportion of the block group that is Hispanic, African-American, Hawaiian/Pacific Islander, and Asian), median family income, and median age of individuals in the block group.
The UPDB is a relational database in which parents’ and siblings’ information is linked. We capitalize on these linked records to capture the effects of the parental environment (e.g., parental preferences for foods, exercise, and residential location) by including the mother’s and father’s BMI, calculated from the driver license record closest to the year 2000. These parental BMI measures are adjusted for the age of the parents and the year of the driver license. For individuals whose parents’ BMI data are missing, we use the mean value (26.07 for mothers and 28.04 for fathers) and include a dummy variable that equals 1 if parental BMI information is missing and 0 otherwise.
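The handling of missing parental BMI can be sketched as mean imputation paired with a missing-data indicator. The means are those reported in the text; the function name is hypothetical, and using a separate indicator for each parent is an assumption, since the text does not say whether one combined flag or two were used.

```python
from typing import Optional, Tuple

MOTHER_MEAN_BMI = 26.07  # sample mean for mothers, as reported in the text
FATHER_MEAN_BMI = 28.04  # sample mean for fathers, as reported in the text


def impute_parent_bmi(value: Optional[float], sample_mean: float) -> Tuple[float, int]:
    """Return (bmi_to_use, missing_flag).

    A missing BMI is replaced by the sample mean and flagged with 1;
    an observed BMI is passed through unchanged and flagged with 0.
    """
    if value is None:
        return sample_mean, 1
    return value, 0
```

Including the flag in the regression lets the imputed cases have their own intercept shift, so the coefficient on parental BMI is identified from the observed values.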
Regressions are estimated to assess whether the “three D’s” (diversity, density, and design) relate to BMI and the risk of overweight or obesity when controlling for individual, familial, and neighborhood socio-demographic characteristics. The regressions are estimated separately for males and females. To test explicitly for neighborhood selection effects, we include both the 17–20 year olds and the 27–30 year olds together in the same regression. We also include a dummy variable, set equal to one if the respondent is age 27–30 and zero otherwise, that is interacted with all of the independent variables in the regression. We then repeat the estimation separately for each age group.
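The fully interacted specification can be pictured through the layout of one row of the design matrix. The sketch below only shows how the regressors are arranged; it is not the SAS model used in the study, and the function and argument names are hypothetical.

```python
from typing import List, Sequence


def interacted_row(x: Sequence[float], older: int) -> List[float]:
    """Build [1, x_1..x_k, older, older*x_1..older*x_k] for one respondent.

    `older` is the age-group dummy: 1 for ages 27-30, 0 for ages 17-20.
    """
    if older not in (0, 1):
        raise ValueError("older must be 0 or 1")
    return [1.0, *x, float(older), *(older * v for v in x)]
```

For a 17–20 year old every interaction column is zero, so the main-effect coefficients describe the younger group; the coefficient on each interaction column then measures how the corresponding effect differs for the 27–30 year olds.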
We view the coefficients for the 17–20 year olds as valid estimates of the structural relationships between neighborhood design and individual BMI under the assumption that adolescents have little or no voice in residential choice. That is, adolescents have most likely not chosen where they live and consequently, their neighborhood characteristics may be viewed as predetermined factors that influence BMI and overweight/obesity risk.
We compare the estimates obtained with the 17–20 year olds to: (1) estimates obtained when the two age groups are combined, and (2) estimates obtained using only the 27–30 year olds. Both comparisons provide insights about the relative roles of causation and selection. Differences in the results across either alternative specification relative to the model estimated with the 17–20 year olds are an indication of selection. Unfortunately, we cannot formally test for differences when we compare the estimates based on the 17–20 year olds to those obtained when we combine the two age groups. We can, however, formally test for differences when comparing the models estimated with the 17–20 year olds to models estimated with the 27–30 year olds by including interaction terms between an age dummy and all of the other independent variables in a pooled sample. The tests of statistical significance for these interactions provide further confirmation of any selection effects we observe when contrasting the combined age estimates with the estimates based on the 17–20 year olds.
All estimation uses SAS software (Version 9.1.3, 2002; Cary, NC) with PROC MIXED. Analyses adjust for statistical dependence among observations induced by clustering of cases within block groups (Binder 1983; Särndal, Swenson et al. 1992). The significance level adopted is p