Ground-level ozone is an important atmospheric oxidant, which exhibits considerable spatial and temporal variability in its concentration level. Existing modeling approaches for ground-level ozone include chemical transport models, land-use regression, Kriging, and data fusion of chemical transport models with monitoring data. Each of these methods has both strengths and weaknesses. Combining those complementary approaches could improve model performance. Meanwhile, satellite-based total column ozone, combined with ozone vertical profile, is another potential input. We propose a hybrid model that integrates the above variables to achieve spatially and temporally resolved exposure assessments for ground-level ozone. We used a neural network for its capacity to model interactions and nonlinearity. Convolutional layers, which use convolution kernels to aggregate nearby information, were added to the neural network to account for spatial and temporal autocorrelation. We trained the model with AQS 8-hour daily maximum ozone in the continental United States from 2000 to 2012 and tested it with left out monitoring sites. Cross-validated R2 on the left out monitoring sites ranged from 0.74 to 0.80 (mean 0.76) for predictions on 1 km×1 km grid cells, which indicates good model performance. Model performance remains good even at low ozone concentrations. The prediction results facilitate epidemiological studies to assess the health effect of ozone in the long term and the short term.
Ground-level ozone is a serious public health concern. The adverse effects of ozone are well documented, including respiratory symptoms (Schwartz et al. 1994, Hao et al. 2015, Gent et al. 2003), the development of asthma (McConnell et al. 2002, Sousa, Alvim-Ferraz, and Martins 2013), airway inflammation (Koren et al. 1989, Tank et al. 2011), and mortality (Franklin and Schwartz 2008, Turner et al. 2015, Atkinson et al. 2016, Bell and Dominici 2008). These health effects have been reported for both long- and short-term exposures (Jerrett et al. 2009, Bell 2004). Ozone is one of the criteria pollutants regulated by the Environmental Protection Agency (EPA), based on the maximum 8-hour average. Ground-level ozone is a product of photochemical reactions involving nitrogen oxides (NOx, i.e. NO and NO2) and volatile organic compounds (VOCs), including hydrocarbons. Ground-level ozone concentration typically shows diurnal variability, with peak concentrations occurring during the daytime. Many parameters, including local combustion sources, land-surface characteristics and atmospheric conditions, influence ozone formation and removal, resulting in high spatial and temporal variability of ozone concentration. Therefore, predicting ozone concentrations is challenging, especially at fine resolutions.
Fine spatial and temporal resolutions are critical to assessing human exposures for health studies. Many early epidemiological studies used ozone measurements from the nearest monitoring sites to assign exposure (Jerrett et al. 2009). This approach introduces non-differential measurement error, because it fails to capture ozone scavenging by nitric oxide (NO) and other sources of local variability.
Other approaches to assessing ozone concentrations involve spatial interpolation, land-use regression, satellite-based data modeling, and chemical transport models. Spatial interpolation, such as inverse-distance weighting (Breton et al. 2012) and Kriging (Tranchant and Vincent 2000), has been used to estimate ozone exposures for epidemiology studies. Often, a radius threshold is chosen in interpolation (Bell 2006). Spatial interpolation has the advantage of low computation cost and reduces measurement error, but often generates over-smoothed distributions, which inadequately represent local variability (Abraham and Comrie 2004). Due to complex transport and chemistry, terrain variability can cause ozone concentration to vary remarkably within a short distance, which imposes an even greater challenge for spatial interpolation (Loibi et al. 1994). Land-use regression (LUR) assumes that land-use terms are predictors for ozone level and uses covariates such as traffic, population density and elevation to model ozone (Malmqvist et al. 2014). LUR is relatively easy to implement and performs satisfactorily at small scales, but has limited capacity to capture temporal variations and can miss some short-term and regional patterns (Hoek et al. 2008). Satellite observations measure ozone over larger spatial and temporal scales than most LURs. Most satellite ozone measurements are column-based, such as TOMS (Total Ozone Mapping Spectrometer), GOME (Global Ozone Monitoring Experiment) (Burrows et al. 1999) and OMI (Ozone Monitoring Instrument) (Levelt et al. 2006). Some satellite measurements also provide the vertical distribution of ozone, including SBUV (Solar Backscatter Ultra Violet), GOME and later OMI. Two OMI ozone data products, produced by the OMI-TOMS and the OMI-DOAS retrieval algorithms, demonstrate high agreement with total column ozone observations at a global scale, with about 1% disagreement (Balis et al. 2007, McPeters et al. 2008).
At ground level, OMI ozone observations are close to ground monitor-based mean concentrations but at higher elevations these observations deviate from the monitors (Wang et al. 2011). The discrepancy can be as large as 20% (Liu, Bhartia, et al. 2010).
A chemical transport model (CTM) is a more advanced tool for estimating ozone, which simulates the formation, dispersion and deposition of ozone. CTMs, such as GEOS-Chem (Bey et al. 2001), MOZART (Brasseur et al. 1998) and CMAQ (Byun and Schere 2006), have been applied to estimate ground-level ozone at city level (Lei et al. 2007, Sokhi et al. 2006), country level (Liu, Zhang, et al. 2010, Tong and Mauzerall 2006), continent level (Fusco and Logan 2003, Pfister et al. 2008) or beyond. Due to limitations of both computational capacity and the spatial resolution of emission inventories, ozone estimates from CTMs are usually not spatially resolved enough to assess exposure at the local scale. Typical scales are 4°×5°, 2.0°×2.5°, 0.500°×0.667° or 0.2500°×0.3125°, although CMAQ-Urban can produce very fine scale predictions in selected urban locations with good emission inventories. However, CTMs deviate substantially from real world measurements due to imperfect data and chemistry, and these errors tend to increase at finer time or spatial scales. One limitation of many ozone models is that their performance is only tested against the monitoring sites used to train the models, which does not test the validity of the model in areas without monitoring data. Cross-validation can test model validity at unmonitored areas by leaving out monitors during model training, and subsequently testing the correlation between the model predictions and the left-out monitors.
With both strengths and weaknesses, the aforementioned approaches are complementary to each other. This study proposes a hybrid approach, which integrates informative variables and existing ground-level ozone modeling approaches into a neural network-based framework. Ten-fold cross-validation was used to test model performance and avoid overfitting. After model training, we predicted ground-level ozone at nationwide 1 km×1 km grid cells and produced spatially- and temporally-resolved ozone exposure assessments, which can be used by epidemiologists to assess the acute and chronic health effects of ozone.
A similar hybrid approach has been applied to assess human exposures to PM2.5 mass and chemical components (Di et al. 2015, Di et al. 2016, Kloog et al. 2014, Kloog et al. 2011). This study applies a hybrid approach similar to the previous PM2.5 model, but incorporates additional variables because of ozone’s distinct gaseous nature and chemical characteristics. We present a new model for ground-level ozone that relies on multiple data sources and the application of a neural network with convolutional layers.
The spatial area is the continental United States, which includes the 48 contiguous states and Washington, D.C. The study period is 2000-2012, covering 4,749 days.
Monitoring data for ozone concentrations across the study area were collected by the USEPA Air Quality System (AQS). There were 1,877 monitoring sites available within the study area during the study period, but some of them reported data for a subset of the study period or reported data intermittently. Monitoring sites were densely located in the Eastern United States and the Western Coast, while the Mountain Region and other remote areas had fewer monitoring sites (Fig. 1). We calibrated the model to the 8-hour daily maximum ozone (daily 8hr-max ozone). In this paper, unless specified otherwise, the term “ozone” refers to daily 8hr-max ozone at ground level.
We used GEOS-Chem Version 9.0.2 to simulate ozone formation, dispersion and deposition. GEOS-Chem incorporates meteorological inputs, emission inventories and atmospheric chemical reactions. Its methodology has been described in previous literature (Bey et al. 2001). We first performed a global 2.0°×2.5° simulation and exported boundary conditions. We then performed a nested grid simulation at 0.500°×0.667° for North America. For the years 2000 to 2004, the 2.0°×2.5° outputs were used instead because meteorological inputs at 0.500°×0.667° were not available.
The OMI instrument is on board the EOS-Aura satellite, which was launched in July 2004 (Levelt et al. 2006). OMI’s raw data was processed by two distinct algorithms, which yielded two different data products. Data product OMTO3e (Version 003) was produced from the TOMS Version 8.5 algorithm, which is based on TOMS Version 8 algorithm (Bhartia and Wellemeyer 2002). The other data product OMDOAO3e (Version 003) was produced from OMI-DOAS algorithms (Veefkind et al. 2006). The two algorithms generally agree with each other, with a mean difference in the total column ozone below 3%, though larger differences occur at high latitude areas and over clouds (Kroon et al. 2008). Both data products have a spatial resolution of 0.25°×0.25° and are available since July 2004.
Satellite instruments measure total column ozone; however, the vertical distribution profile is needed to obtain ground-level ozone concentration. We adopted an approach similar to the one used in modeling PM2.5, where AOD is a column measurement of aerosol and researchers used the vertical profile from a chemical transport model to calibrate AOD to ground-based PM2.5 (Liu 2004, van Donkelaar et al. 2010). GEOS-Chem simulates ozone concentrations at different layers. We defined a scaling factor as the fraction of ground-level ozone in the total column ozone, and used this factor to calibrate satellite-based column ozone to ground-level ozone. One advantage of the GEOS-Chem ozone vertical profile is the absence of missing values. GEOS-Chem tropospheric ozone predictions agree with monitor observations in terms of the overall characteristics, but significant differences exist by region and by season (Liu et al. 2006). OMI also provides an ozone vertical profile (data product OMO3PR Version 003) (Ahmad et al. 2003), in which an optimal estimation algorithm adjusts ozone in each atmospheric layer based on a priori information and minimizes the difference between modeled and measured ozone (Rodgers 2000). Although missing values occur occasionally, comparison of retrieved and measured ozone indicates good agreement (Veefkind, Kroon, and de Haan 2009). The OMI ozone profile has a spatial resolution of 13 km×48 km. We filled all missing values by linear interpolation.
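The scaling-factor calibration described above can be sketched as follows. This is a minimal illustration, not the authors' implementation (the paper reports using Matlab): the function name, argument names and numbers are hypothetical, and the sketch assumes the simulated ground-level and column ozone are expressed in the same units so that their ratio is a unitless fraction.

```python
import numpy as np

def ground_level_from_column(sat_column, ctm_ground, ctm_column):
    """Scale a satellite total column by a model-derived ground fraction.

    sat_column : satellite total column ozone (hypothetical input name)
    ctm_ground : CTM-simulated ozone in the lowest layer
    ctm_column : CTM-simulated total column ozone (same units as ctm_ground)
    """
    sat_column = np.asarray(sat_column, dtype=float)
    ctm_ground = np.asarray(ctm_ground, dtype=float)
    ctm_column = np.asarray(ctm_column, dtype=float)
    # Scaling factor: fraction of the simulated column found at ground level.
    scaling = ctm_ground / ctm_column
    return sat_column * scaling

# Illustrative values only; they are not taken from the study.
estimate = ground_level_from_column([300.0, 280.0], [3.0, 7.0], [310.0, 280.0])
```

The scaled quantity is only a predictor; in the study it enters the neural network together with the other inputs rather than serving as the final ozone estimate.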
Ozone precursors include nitrogen oxides (NOx), carbon monoxide (CO), methane (CH4), and volatile organic compounds (VOCs). Ozone precursors react in the presence of sunlight to form ozone. NO, in contrast, decreases ozone concentration by inducing ozone scavenging (Graedel, Farrow, and Weber 1977). Although emission inventories of these compounds are used in the GEOS-Chem model, they lack the temporal resolution of the monitoring data. To account for the relevant atmospheric reactions, we included AQS daily measurements of sulfur dioxide (SO2), nitrogen dioxide (NO2), NOx, and VOCs in our ozone model. AQS measurements are point measurements and sparsely located. We applied distance-decay functions to aggregate point data from monitors into convolutional layers (Section Convolutional Layer, Supplementary Material).
In order to obtain higher spatial and temporal coverage, we also used satellite-based total column SO2 and total column NO2 from OMI data products (OMSO2e Version 003 and OMNO2d Version 003) (Krotkov et al. 2011).
Our model used meteorological fields from the NCEP North American Regional Reanalysis data. This dataset assimilates multiple measurements from land-surface, ship, radiosonde, pibal, aircraft, satellite and other sources, with a resolution of 0.3° (about 32 km) at the daily level (Kalnay et al. 1996). The reanalysis dataset was chosen because it has both relatively high spatiotemporal resolution and no missing values. We used 16 meteorological variables in order to fully capture meteorological conditions and account for complex atmospheric processes. The variables included air temperature, accumulated total precipitation, downward shortwave radiation flux, accumulated total evaporation, planetary boundary layer height, low cloud area fraction, precipitation rate, precipitable water for the entire atmosphere, pressure, specific humidity at 2 m, visibility, wind speed, medium cloud area fraction, high cloud area fraction, and albedo. Wind speed was computed as the magnitude of the vector sum of u-wind (east-west component of the wind) at 10 m and v-wind (north-south component) at 10 m.
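As a small worked example of the wind speed calculation, the magnitude of the 10 m wind vector is the square root of the sum of the squared components. The function and variable names below are illustrative assumptions, and the values are made up.

```python
import numpy as np

def wind_speed(u10, v10):
    """Magnitude of the wind vector from its u (east-west) and v (north-south) components."""
    return np.hypot(u10, v10)

# Two illustrative grid cells, components in m/s.
speed = wind_speed(np.array([3.0, 0.0]), np.array([4.0, -2.0]))
```

The sign of each component carries direction only, so a southward wind of 2 m/s yields a speed of 2 m/s.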
Land-use terms are proxies for ozone formation or removal, and capture spatial variations at the local scale, which may not be measured by satellite or modeled by GEOS-Chem. The detailed procedure for processing elevation, road density, NEI (National Emissions Inventory), population density, percentage of urban area and NDVI (normalized difference vegetation index) has been described elsewhere (Kloog et al. 2012). We used two variables to approximate vegetation: the percentage of vegetation from NCEP North American Regional Reanalysis data and the 16-day 1-km MODIS NDVI data product MOD13A2 (Didan 2015). For days without NDVI values, we linearly interpolated values from neighboring days.
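The temporal gap filling for NDVI can be sketched as below, assuming one value per 16-day composite that is interpolated linearly to every day in between. The helper name, the day indexing and the NDVI values are hypothetical.

```python
import numpy as np

def fill_daily(obs_days, obs_values, all_days):
    """Linearly interpolate sparse observations onto a daily time axis."""
    return np.interp(all_days, obs_days, obs_values)

# Three illustrative 16-day composites (day index, NDVI) for one grid cell.
daily = fill_daily([0, 16, 32], [0.30, 0.46, 0.38], np.arange(0, 33))
```

Note that `np.interp` holds the end values constant outside the observed range, so days before the first or after the last composite would need separate handling.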
Regional and monthly dummy variables were used to capture different associations between the above variables and monitored ozone by season and climate type. The major climate types were used to define the regional dummy variable (Kottek et al. 2006).
We used a neural network for its capacity to model nonlinearity and interactions among variables (Bishop 1995, Haykin and Network 2004). The target variable was monitored ozone from the AQS network and the predictor variables included the aforementioned variables. The input variables were available for the entire study area. Some variables had a small proportion of missing values and we estimated the missing data using linear interpolation (Table S2, supplementary material). Not all variables were available during the entire study period. For each year, we fitted a neural network with the variables available in that year. Most existing studies fitted models with in situ information, that is, the values of each variable at the monitoring sites; however, information about neighboring areas can also be informative. For instance, nearby traffic volume influences in situ ozone levels by either providing ozone precursors or scavenging ozone. To incorporate the nearby information into the neural network, we used convolutional layers (LeCun and Bengio 1995). A convolutional layer is computed by applying a convolution kernel (e.g., mean, inverse distance weighted mean) to the inputs in order to compute a scalar summary of the neighboring cells, which is then used as an additional predictor. By choosing different kernels, we obtained different aggregations of neighboring information, which gave the neural network more flexibility to capture spatial autocorrelation and improved model fit. We computed convolutional layers for each land-use variable, predicted ozone of nearby areas, and predicted ozone of preceding and subsequent days. To create the convolutional layers for predicted ozone, we first fitted the neural network and obtained intermediate ozone predictions. Then we computed spatial and temporal convolutional layers for predicted ozone and fitted the neural network again with those convolutional layers (Fig. S2).
The details of convolutional layers and fitting a neural network are presented in the supplementary material.
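To make the kernel aggregation concrete, the sketch below computes a scalar summary of neighboring values with the two kernels named above (mean and inverse distance weighted mean). It is an illustration under assumptions: the function name, the `power` parameter and the sample values are hypothetical, and the kernels actually used are specified in the supplementary material, not here.

```python
import numpy as np

def kernel_summary(values, distances_km, kernel="idw", power=1.0):
    """Aggregate neighboring values into one scalar predictor.

    values       : variable of interest at the neighboring cells or monitors
    distances_km : distance from the target location to each neighbor (> 0)
    kernel       : "mean" or "idw" (inverse distance weighted mean)
    """
    values = np.asarray(values, dtype=float)
    distances = np.asarray(distances_km, dtype=float)
    if kernel == "mean":
        weights = np.ones_like(values)
    elif kernel == "idw":
        # Closer neighbors receive larger weights.
        weights = 1.0 / distances ** power
    else:
        raise ValueError(f"unknown kernel: {kernel}")
    return float(np.sum(weights * values) / np.sum(weights))

# Illustrative neighbors at 1, 2 and 4 km from the target cell.
neighbor_values = [10.0, 20.0, 40.0]
neighbor_distances = [1.0, 2.0, 4.0]
plain_mean = kernel_summary(neighbor_values, neighbor_distances, kernel="mean")
idw_mean = kernel_summary(neighbor_values, neighbor_distances, kernel="idw")
```

Each summary becomes one extra input column for the neural network, so using several kernels on the same variable gives the network several views of the neighborhood.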
We used ten-fold cross-validation to validate neural network results, in which all monitors were randomly divided into 10 splits. We then trained a neural network with 9 splits of the monitors and made ozone predictions for the remaining split. The process was repeated nine more times, each time holding out a different split, to obtain ozone predictions for the other 9 splits. Combining the predicted ozone from the 10 splits yielded ozone predictions for all monitors. We calculated total R2, spatial R2 and temporal R2 for all monitors as well as by region and season to evaluate model performance. Calculations of R2 and other metrics of model performance (bias and slope) are specified in the supplementary material.
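The monitor-level cross-validation can be sketched as follows. Several points are assumptions made for illustration: an ordinary least-squares fit stands in for the neural network, the data are synthetic, all names are hypothetical, and R2 is taken here as the squared Pearson correlation between predicted and observed values (the paper's exact definitions are in its supplementary material).

```python
import numpy as np

def monitor_folds(monitor_ids, n_folds=10, seed=0):
    """Assign every record of a monitor to the same randomly chosen fold."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.unique(monitor_ids))
    fold_of_monitor = {m: i % n_folds for i, m in enumerate(shuffled)}
    return np.array([fold_of_monitor[m] for m in monitor_ids])

def cross_validated_predictions(X, y, monitor_ids, n_folds=10, seed=0):
    """Predict each fold with a model trained on the other folds."""
    folds = monitor_folds(monitor_ids, n_folds, seed)
    design = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    preds = np.empty(len(y))
    for k in range(n_folds):
        held_out = folds == k
        # Stand-in model: least squares instead of the paper's neural network.
        coef, *_ = np.linalg.lstsq(design[~held_out], y[~held_out], rcond=None)
        preds[held_out] = design[held_out] @ coef
    return preds, folds

# Synthetic example: 50 monitors with 20 daily records each.
rng = np.random.default_rng(1)
monitor_ids = np.repeat(np.arange(50), 20)
X = rng.normal(size=(1000, 3))
y = 40 + X @ np.array([5.0, -2.0, 1.0]) + rng.normal(scale=2.0, size=1000)
preds, folds = cross_validated_predictions(X, y, monitor_ids)
r2 = np.corrcoef(preds, y)[0, 1] ** 2
```

Splitting by monitor rather than by record keeps all days from a held-out site out of training, which is what lets the score speak to performance at unmonitored locations.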
To make ozone predictions, we trained a neural network with all monitors. The trained neural network was used to predict ozone at 1 km×1 km grid cells for the whole study area during the entire study period. We prepared input variables at 1 km×1 km grid cells and made ozone predictions with the trained neural network. We linearly interpolated the data if missing values were present. All programming work was implemented in Matlab (version 2014a, The MathWorks, Inc.).
After conducting ten-fold cross-validation, total R2 ranged from 0.74 to 0.80 with mean R2 = 0.76 (Table 1). Slope was near 1; bias was about 1.20 ppb for the whole concentration range and 2.82 ppb below 75 ppb (Table 1, Table S3). Model performance did not vary much by year; nor was there any temporal trend in model fit. In contrast, model performance varied by season, with the highest R2 observed in autumn, followed by summer, spring and winter (Table 2). By region, model performance in the Middle Atlantic, South Atlantic, East North Central, West South Central and Pacific States was near or above the national average, while the New England, Mountain and West North Central States were below the national average (Table 3). The regional division above is from the U.S. Census Bureau (Table S1, Fig. S7). Figure 2 visualizes model fits for the study area. Wyoming, Montana, Western Colorado, Eastern Washington State, Eastern Tennessee and Maine had lower fits than other states.
Figure 3 visualizes the spatial pattern of ozone in the study area. The Mountain States had the highest ozone levels in all seasons. Areas around the Appalachian Mountains also had high ozone levels, although to a lesser degree. The Eastern United States, with much lower ozone year round, experienced higher ozone levels in summer. Figure 3 also shows low concentrations in cities and along highways. In terms of temporal trend, Figure 4 shows a general decreasing trend of ozone, although the trend is less obvious in some regions.
This study proposed a hybrid model framework, which integrated satellite-based data, CTM outputs, ozone vertical profiles, meteorological variables, land-use terms and other atmospheric compounds that were related to ozone formation or deposition. Convolutional layers aggregated nearby information and improved model fit. The average cross-validated R2 between predicted and monitored daily 8hr-max ozone was 0.76 (0.74~0.80 by year). Few existing studies have modeled 8-hour maximum ground-level ozone on a daily basis or attempted to make predictions at nationwide 1 km×1 km grid cells. We believe that this level of temporal/spatial coverage and model performance is an improvement over previous ozone prediction approaches. Epidemiological studies investigating the acute and chronic effects of ozone will benefit from more accurate and granular exposure assessments.
Our hybrid approach has several advantages and innovations. First, model performance surpasses that of existing studies. Some previous studies adopted land-use regression, Kriging or other methods and achieved RMSE > 10 ppb in Belgium (Hooyberghs et al. 2006); RMSE > 10 ppb in Italy (Carnevale et al. 2008); daily R2 = 0.653 in Quebec (Adam-Poupart et al. 2014). Our hybrid model outperformed land-use regression results, with averaged cross-validated annual R2 = 0.76 and RMSE = 7.36 ppb. Another improvement is that LUR is usually constrained to specific locations, while our hybrid model covers the entire continental United States. In terms of CTMs, some CMAQ simulations achieved normalized mean error (NME) less than 35% over the continental United States in summer (Tong and Mauzerall 2006); improved to NME 17.9% but focused on the Eastern United States (Appel et al. 2007); and continued to obtain NME between 17.7% and 21.7% (Zhang et al. 2009). CMAQ simulations have improved over time, but our hybrid model still outperformed them with a cross-validated NME = 13.13%. Combining multiple CTM simulations and comparing with monitored ozone, some researchers obtained mean R2 = 0.57 for the continental United States for the whole year (Reidmiller et al. 2009), compared with mean R2 = 0.76 in our study. This indicates that our hybrid model surpasses CTM simulations as a whole. In addition, convolutional layers take neighboring information into account, an idea that is also applicable to other studies. Other methods, such as Kriging, have been widely used to aggregate nearby information in ozone modeling. For a convolutional layer, the specific aggregation depends on the kernel function, which is more versatile than Kriging. More importantly, being an input layer of a neural network, a convolutional layer can interact with other variables in complex ways, which can better capture complex nonlinear atmospheric processes.
By introducing convolutional layers, this study introduces a new way of incorporating neighboring information to improve model performance.
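For readers comparing the error metrics quoted above, RMSE and NME can be computed as in the sketch below. The NME form used here, the sum of absolute errors divided by the sum of observations and expressed as a percentage, is the common convention in CTM evaluation and is an assumption about how the cited values were computed; the function names and sample numbers are hypothetical.

```python
import numpy as np

def rmse(pred, obs):
    """Root mean squared error, in the units of the observations."""
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def nme_percent(pred, obs):
    """Normalized mean error: total absolute error over total observed, in percent."""
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    return float(100.0 * np.sum(np.abs(pred - obs)) / np.sum(obs))

# Illustrative values in ppb; not taken from the study.
obs = np.array([40.0, 50.0, 60.0])
pred = np.array([44.0, 47.0, 60.0])
example_rmse = rmse(pred, obs)
example_nme = nme_percent(pred, obs)
```

Because NME is normalized by observed concentrations, it is easier to compare across regions and seasons with different ozone levels than RMSE is.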
We integrated multiple data sources into a single ozone-modeling framework and improved model fit. Not all of the variables contributed equally to model performance. Satellite-based ozone measurement, GEOS-Chem simulations and land-use terms were critical to model performance. Hence, previous studies also combined land-use regression with chemical transport model (Akita et al. 2014), or land-use regression with Kriging (Wang et al. 2015), at the regional or municipal scales. Other variables including regional dummy variables, and certain meteorological variables played an auxiliary role. Some variables are complementary to each other. For example, satellite-based instruments, like OMI, have daily measurements with a large spatial coverage, but their values are averaged column measurements of ozone for a large volume of air. AQS monitors measure ground-level ozone at specific locations. Thus, satellite-based measurements cannot capture variability at small scales like monitors do (Wang et al. 2011). On the other hand, land-use terms are proxies for local emission which gives rise to local variability, but they usually do not provide much information on the temporal variability. Land-use terms and satellite observations are complementary to each other because land-use terms are at small local scales and satellite observations have wide time and space coverage. Combining both data sets overcomes weaknesses and improves the model. The use of neural network rather than a regression did not singularly drive model performance; a study also used neural network with only land-use terms and achieved model performance inferior to ours (RMSE > 10 ppb) (Carnevale et al. 2008).
We found an east-west gradient of ozone concentration (Fig. 3). High concentrations in the Western United States and Mountain States are attributable to factors including high elevation, deep boundary layer, large-scale subsidence, slow ozone deposition to the arid terrain and slow ozone loss caused by dry conditions (Fiore et al. 2002). The high ground-level ozone in the Mountain States also reflects stratospheric intrusion, which can produce transient peak ozone concentrations at ground level (Davies and Schuepbach 1994). Compared with the high concentrations in the Mountain States, urban areas had lower ozone. Other air pollutants (e.g. NO) react with ozone and cause ozone scavenging in urban areas, such as San Francisco, Los Angeles, New York City, Houston and Chicago, as well as in areas along highways (Fig. 3). For the same reason, we observed higher ozone concentrations in rural areas than in urban areas in general (Fig. 5). We found a general trend of decreasing concentrations over time that agrees with trends observed in monitoring data alone (Camalier, Cox, and Dolwick 2007, Cox and Chu 1996), but the trend is less evident at the national level and in several regions (Fig. 4). Figure 5 presents the temporal trend by season. In spring and autumn there is an increasing trend over time, because NO emission controls in recent years have reduced ozone scavenging and raised background ozone levels. The decreasing summer averages reflect the implemented emission control policies for ozone precursors, but this trend was reversed after the recession. The temporal trend in each region may deviate from the national trend (Fig. S4). The regional discrepancies and different effects of emission control in spring, summer, and autumn have been described in previous literature (Cooper et al. 2012). The increasing trend in winter is almost consistent across all regions, which is related to reduced ozone scavenging (NOx titration) as NOx concentrations decrease (Austin et al. 2014, Jhun et al. 2014). This suggests a side effect of controlling air pollutants: emission control (e.g. of NOx) may ironically lead to an ozone increase under certain conditions (Li et al. 2013).
Model performance was good at typical concentrations. Figure 6 shows that the linear relationship between predicted and monitored ozone held below 110 ppb. Furthermore, model performance was still good, with mean R2 almost unchanged, below 75 ppb, the EPA 8hr-max ozone standard (Table S3). This performance will enable epidemiologists to assess the adverse effect of ozone even at low concentrations. Conversely, the model’s linearity had much uncertainty above 120 ppb due to insufficient data (Fig. 6); meanwhile, model performance dropped at high concentrations (Fig. S5). The inability to accurately predict extreme values is a limitation of our model, which may limit its usage in epidemiological studies that focus on peak concentrations. In terms of model performance over time, there was a slight decreasing trend in temporal R2, which may result from out-of-date land-use variables. Population density was retrieved for the year 2000 and assumed to be constant over time, so it does not reflect population density in more recent years. Updating population density to be time-varying may improve model performance. This hybrid approach used daily 8hr-max ozone as the ozone metric, which avoided noisy ozone fluctuations at night and improved model fit. Although our model performed less well on a daily basis in some remote and sparsely populated areas, the annual average demonstrated less discrepancy (Fig. S6).
Some limitations remain in our hybrid approach. First, this hybrid approach combines multiple datasets into a single framework and thus requires many variables that may not be available in countries where publicly available datasets are sparse. Second, the prediction interval is not available in the prediction results. A formal assessment of uncertainty is critical in epidemiological studies to determine statistical power. Both issues are worthy of further investigation.
In this paper, we introduced a hybrid model that predicts daily 8hr-max ozone across the continental United States. The main feature of this model is its ability to integrate information from multiple data sources. Specifically, we integrated data from satellite-based ozone measurements, ozone vertical profiles, CTM outputs, land-use terms, meteorological variables, concentrations of ozone precursors and other air pollutants, NDVI, and regional/monthly dummy variables. The hybrid model used a neural network with convolutional layers, which aggregated information from neighboring areas to improve model fit. We calibrated the model using AQS daily 8hr-max ozone measurements. Mean cross-validated R2 was 0.76, ranging from 0.74 to 0.80 for the entire United States. The model performed better in the Eastern United States. The trained neural network predicted daily 8hr-max ozone at nationwide 1 km×1 km grid cells from 2000 to 2012. These ozone assessments can help scientists investigate the health effects of ozone.
This publication was made possible by USEPA grant R01 ES024332-01A1, RD83479801, and NIEHS grant ES000002. Its contents are solely the responsibility of the grantee and do not necessarily represent the official views of the USEPA. Further, USEPA does not endorse the purchase of any commercial products or services mentioned in the publication. Moreover, we thank the China Section of the Air & Waste Management Association for the generous scholarship we received to cover the cost of page charges, which made the publication of this paper possible.