J Air Waste Manag Assoc. Author manuscript; available in PMC 2018 January 1.
Published in final edited form as:
J Air Waste Manag Assoc. 2017 January; 67(1): 39–52.
doi:  10.1080/10962247.2016.1200159
PMCID: PMC5741295

A Hybrid Model for Spatially and Temporally Resolved Ozone Exposures in the Continental United States

Qian Di, graduate student,1 Sebastian Rowland, graduate student,1 Petros Koutrakis, professor,1 and Joel Schwartz, professor1


Ground-level ozone is an important atmospheric oxidant, which exhibits considerable spatial and temporal variability in its concentration level. Existing modeling approaches for ground-level ozone include chemical transport models, land-use regression, Kriging, and data fusion of chemical transport models with monitoring data. Each of these methods has both strengths and weaknesses. Combining those complementary approaches could improve model performance. Meanwhile, satellite-based total column ozone, combined with the ozone vertical profile, is another potential input. We propose a hybrid model that integrates the above variables to achieve spatially and temporally resolved exposure assessments for ground-level ozone. We used a neural network for its capacity to model interactions and nonlinearity. Convolutional layers, which use convolution kernels to aggregate nearby information, were added to the neural network to account for spatial and temporal autocorrelation. We trained the model with AQS 8-hour daily maximum ozone in the continental United States from 2000 to 2012 and tested it with left-out monitoring sites. Cross-validated R2 on the left-out monitoring sites ranged from 0.74 to 0.80 (mean 0.76) for predictions on 1 km×1 km grid cells, which indicates good model performance. Model performance remains good even at low ozone concentrations. The prediction results facilitate epidemiological studies that assess the long-term and short-term health effects of ozone.


Ground-level ozone is a serious public health concern. The adverse effects of ozone are well documented and include respiratory symptoms (Schwartz et al. 1994, Hao et al. 2015, Gent et al. 2003), the development of asthma (McConnell et al. 2002, Sousa, Alvim-Ferraz, and Martins 2013), airway inflammation (Koren et al. 1989, Tank et al. 2011), and mortality (Franklin and Schwartz 2008, Turner et al. 2015, Atkinson et al. 2016, Bell and Dominici 2008). These health effects have been reported for both long- and short-term exposures (Jerrett et al. 2009, Bell 2004). Ozone is one of the criteria pollutants regulated by the Environmental Protection Agency (EPA), with a standard based on the daily maximum 8-hour average. Ground-level ozone is a product of photochemical reactions involving nitrogen oxides (NOx, that is, NO and NO2) and volatile organic compounds (VOCs), including hydrocarbons. Ground-level ozone concentration typically shows diurnal variability, with peak concentrations occurring in the daytime. Many parameters, including local combustion sources, land-surface characteristics and atmospheric conditions, influence ozone formation and removal, resulting in high spatial and temporal variability of ozone concentration. Therefore, predicting ozone concentrations is challenging, especially at fine resolutions.

Fine spatial and temporal resolutions are critical to assessing human exposures for health studies. Many early epidemiological studies used ozone measurements from the nearest monitoring sites to assign exposure (Jerrett et al. 2009). This approach introduces non-differential measurement error, because it fails to capture ozone scavenging by nitric oxide (NO) and other sources of local variability.

Other approaches to assessing ozone concentrations involve spatial interpolation, land-use regression, satellite-based data modeling, and chemical transport models. Spatial interpolation methods, such as inverse-distance weighting (Breton et al. 2012) and Kriging (Tranchant and Vincent 2000), have been used to estimate ozone exposures for epidemiological studies. Often, a radius threshold is chosen in interpolation (Bell 2006). Spatial interpolation has the advantage of low computational cost and reduces measurement error, but often generates over-smoothed distributions, which inadequately represent local variability (Abraham and Comrie 2004). Due to complex transport and chemistry, terrain variability can cause ozone concentration to vary remarkably within a short distance, which poses an even greater challenge for spatial interpolation (Loibi et al. 1994). Land-use regression (LUR) assumes that land-use terms are predictors of ozone level and uses covariates such as traffic, population density and elevation to model ozone (Malmqvist et al. 2014). LUR is relatively easy to implement and has satisfactory model performance at small scales, but has limited capacity to capture temporal variations and can miss some short-term and regional patterns (Hoek et al. 2008). Satellite observations measure ozone over larger spatial and temporal scales than most LURs. Most satellite ozone measurements are column-based, such as TOMS (Total Ozone Mapping Spectrometer), GOME (Global Ozone Monitoring Experiment) (Burrows et al. 1999) and OMI (Ozone Monitoring Instrument) (Levelt et al. 2006). Some satellite measurements also provide the vertical distribution of ozone, including SBUV (Solar Backscatter Ultra Violet), GOME and later OMI. Two OMI ozone data products, produced by the OMI-TOMS and the OMI-DOAS retrieval algorithms, demonstrate high agreement with total column ozone observations at a global scale, with about 1% disagreement (Balis et al. 2007, McPeters et al. 2008). At ground level, OMI ozone observations are close to ground monitor-based mean concentrations, but at higher elevations these observations deviate from the monitors (Wang et al. 2011). The discrepancy can be as large as 20% (Liu, Bhartia, et al. 2010).

A chemical transport model (CTM) is a more advanced tool for estimating ozone, which simulates the formation, dispersion and deposition of ozone. CTMs, such as GEOS-Chem (Bey et al. 2001), MOZART (Brasseur et al. 1998) and CMAQ (Byun and Schere 2006), have been applied to estimate ground-level ozone at the city level (Lei et al. 2007, Sokhi et al. 2006), country level (Liu, Zhang, et al. 2010, Tong and Mauzerall 2006), continent level (Fusco and Logan 2003, Pfister et al. 2008) or beyond. Due to limitations of both computational capacity and the spatial resolution of emission inventories, ozone estimates from CTMs are usually not spatially resolved enough to assess exposure at the local scale. Typical scales are 4°×5°, 2.0°×2.5°, 0.500°×0.667° or 0.2500°×0.3125°, although CMAQ-Urban can produce very fine scale predictions in selected urban locations with good emission inventories. However, CTMs deviate substantially from real-world measurements due to imperfect data and chemistry, and these errors tend to increase at finer time or spatial scales. One limitation of many ozone models is that their performance is only tested against the monitoring sites used to train the models, which does not test the validity of the model in areas without monitoring data. Cross-validation can test model validity in unmonitored areas by leaving out monitors during model training and subsequently testing the correlation between the model predictions and the left-out monitors.

With both strengths and weaknesses, the aforementioned approaches are complementary to each other. This study proposes a hybrid approach, which integrates informative variables and existing ground-level ozone modeling approaches into a neural network-based framework. Ten-fold cross-validation was used to test model performance and avoid overfitting. After model training, we predicted ground-level ozone at nationwide 1 km×1 km grid cells and produced spatially- and temporally-resolved ozone exposure assessments, which can be used by epidemiologists to assess the acute and chronic health effects of ozone.

A similar hybrid approach has been applied to assess human exposures to PM2.5 mass and chemical components (Di et al. 2015, Di et al. 2016, Kloog et al. 2014, Kloog et al. 2011). This study applies a hybrid approach similar to the previous model of PM2.5, but incorporates additional variables because of ozone’s distinct gaseous nature and chemical characteristics. We present a new model for ground-level ozone that relies on multiple data sources and the application of a neural network with convolutional layers.


Study Domain

The spatial area is the continental United States, which includes the 48 contiguous states and Washington, D.C. The study period is 2000-2012, covering 4,749 days.

Monitoring Data

Monitoring data for ozone concentrations across the study area were collected by the USEPA Air Quality System (AQS). There were 1,877 monitoring sites available within the study area during the study period, but some of them reported data for only a subset of the study period or reported data intermittently. Monitoring sites were densely located in the Eastern United States and on the West Coast, while the Mountain Region and other remote areas had fewer monitoring sites (Fig. 1). We calibrated the model to the 8-hour daily maximum ozone (daily 8hr-max ozone). In this paper, unless specified otherwise, the term “ozone” refers to daily 8hr-max ozone at ground level.
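As a rough illustration of the daily 8hr-max metric, the following Python sketch takes one day of hourly ozone values and returns the largest 8-hour running mean. It is a simplification and not the AQS procedure: it only uses windows that fit entirely within the supplied hours and ignores regulatory data-completeness rules. The function name and the sample values are hypothetical.

```python
def daily_8hr_max(hourly_ppb):
    """Return the largest 8-hour running mean of one day's hourly ozone (ppb)."""
    if len(hourly_ppb) < 8:
        raise ValueError("need at least 8 hourly values")
    # One mean per 8-hour window that fits inside the supplied hours.
    window_means = [
        sum(hourly_ppb[start:start + 8]) / 8.0
        for start in range(len(hourly_ppb) - 7)
    ]
    return max(window_means)


# Hypothetical diurnal profile: low at night, peak in the afternoon.
hourly = [20] * 10 + [40, 50, 60, 70, 70, 60, 50, 40] + [20] * 6
print(daily_8hr_max(hourly))
```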

Figure 1
Ozone monitoring sites in the United States.

Chemical Transport Model Output

We used GEOS-Chem Version 9.0.2 to simulate ozone formation, dispersion and deposition. GEOS-Chem incorporates meteorological inputs, emission inventories and atmospheric chemical reactions. Its methodology has been described in previous literature (Bey et al. 2001). We first performed a global 2.0°×2.5° simulation and exported boundary conditions. We then performed a nested grid simulation at 0.500°×0.667° for North America. For the years 2000 to 2004, 2.0°×2.5° outputs were used instead because meteorological inputs at 0.500°×0.667° were not available.

Satellite-based Ozone Measurements

The OMI instrument is on board the EOS-Aura satellite, which was launched in July 2004 (Levelt et al. 2006). OMI’s raw data was processed by two distinct algorithms, which yielded two different data products. Data product OMTO3e (Version 003) was produced from the TOMS Version 8.5 algorithm, which is based on TOMS Version 8 algorithm (Bhartia and Wellemeyer 2002). The other data product OMDOAO3e (Version 003) was produced from OMI-DOAS algorithms (Veefkind et al. 2006). The two algorithms generally agree with each other, with a mean difference in the total column ozone below 3%, though larger differences occur at high latitude areas and over clouds (Kroon et al. 2008). Both data products have a spatial resolution of 0.25°×0.25° and are available since July 2004.

Ozone Vertical Profile

Satellite instruments measure total column ozone; however, the vertical distribution profile is needed to obtain ground-level ozone concentration. We adopted an approach similar to that used in modeling PM2.5, where AOD is a column measurement of aerosol and researchers used the vertical profile from a chemical transport model to calibrate AOD to ground-based PM2.5 (Liu 2004, van Donkelaar et al. 2010). GEOS-Chem simulates ozone concentrations at different layers. We defined a scaling factor as the fraction of ground-level ozone in the total column ozone, and used this factor to calibrate satellite-based column ozone to ground-level ozone. One advantage of the GEOS-Chem ozone vertical profile is the absence of missing values. GEOS-Chem tropospheric ozone predictions agree with monitor observations in terms of the overall characteristics, but significant differences exist by region and by season (Liu et al. 2006). OMI also provides an ozone vertical profile (data product OMO3PR Version 003) (Ahmad et al. 2003), in which an optimal estimation algorithm adjusts ozone in each atmospheric layer based on a priori information and minimizes the difference between modeled and measured ozone (Rodgers 2000). Although missing values occur occasionally, comparison of retrieved and measured ozone indicates good agreement (Veefkind, Kroon, and de Haan 2009). The OMI ozone profile has a spatial resolution of 13 km×48 km. We filled all missing values by linear interpolation.
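The scaling-factor idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the simulated profile is given as a list of per-layer ozone amounts in the same units with the surface layer first, the function names are hypothetical, and any unit conversion to a surface mixing ratio is left out.

```python
def surface_fraction(layer_ozone):
    """Fraction of the simulated column that sits in the lowest layer.

    layer_ozone: simulated ozone amount per layer, surface layer first.
    """
    total = sum(layer_ozone)
    if total <= 0:
        raise ValueError("simulated column must be positive")
    return layer_ozone[0] / total


def calibrate_column(satellite_column, layer_ozone):
    """Scale a satellite total column by the simulated surface fraction."""
    return satellite_column * surface_fraction(layer_ozone)


# Hypothetical numbers: a four-layer simulated profile and one satellite column.
profile = [2.0, 8.0, 90.0, 200.0]
print(calibrate_column(315.0, profile))
```

In this sketch the simulated profile supplies only the shape of the vertical distribution, while the satellite supplies the magnitude of the column.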

NOx, SO2, VOC Data

Ozone precursors include nitrogen oxides (NOx), carbon monoxide (CO), methane (CH4), and volatile organic compounds (VOCs). Ozone precursors react in the presence of sunlight and form ozone. NO, in contrast, decreases ozone concentration by inducing ozone scavenging (Graedel, Farrow, and Weber 1977). Although emission inventories of these compounds are used in the GEOS-Chem model, they lack the temporal resolution of the monitoring data. To account for those relevant atmospheric reactions, we included AQS daily measurements of sulfur dioxide (SO2), nitrogen dioxide (NO2), NOx, and VOCs in our ozone model. AQS measurements are point measurements and sparsely located. We applied distance-decay functions to aggregate point data from monitors into convolutional layers (Section Convolutional Layer, Supplementary Material).
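To make the distance-decay aggregation concrete, here is a minimal Python sketch. The actual decay function is specified in the supplementary material and is not reproduced here, so the inverse-distance form, the `power` and `min_dist_km` parameters, the planar kilometer coordinates and the function name are all assumptions made for illustration.

```python
import math


def distance_decay_mean(target_xy, monitors, power=1.0, min_dist_km=1.0):
    """Inverse-distance weighted mean of monitor values at a target point.

    target_xy: (x_km, y_km) of the location to be filled.
    monitors:  list of ((x_km, y_km), value) point measurements.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in monitors:
        # Clamp very small distances so a co-located monitor cannot divide by zero.
        dist = max(math.hypot(x - target_xy[0], y - target_xy[1]), min_dist_km)
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no monitors supplied")
    return weighted_sum / weight_total


# Hypothetical NO2 monitors 5 km and 10 km away from the target cell.
no2_monitors = [((3.0, 4.0), 10.0), ((6.0, 8.0), 40.0)]
print(distance_decay_mean((0.0, 0.0), no2_monitors))
```

With `power=1.0` the nearer monitor receives twice the weight of the farther one; a larger exponent would make the aggregate more local.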

In order to obtain higher spatial and temporal coverage, we also used satellite-based total column SO2 and total column NO2 from OMI data products (OMSO2e Version 003 and OMNO2d Version 003) (Krotkov et al. 2011).

Meteorological Data

Our model used meteorological fields from the NCEP North American Regional Reanalysis data. This dataset assimilates multiple measurements from land-surface, ship, radiosonde, pibal, aircraft, satellite and other sources, with a resolution of 0.3° (about 32 km) at the daily level (Kalnay et al. 1996). The reanalysis dataset was chosen because it has both relatively high spatiotemporal resolution and no missing values. We used 16 meteorological variables in order to fully capture meteorological conditions and account for complex atmospheric processes. The variables included air temperature, accumulated total precipitation, downward shortwave radiation flux, accumulated total evaporation, planetary boundary layer height, low cloud area fraction, precipitation rate, precipitable water for the entire atmosphere, pressure, specific humidity at 2 m, visibility, wind speed, medium cloud area fraction, high cloud area fraction, and albedo. Wind speed was computed as the magnitude of the vector sum of the u-wind (east-west component) and v-wind (north-south component) at 10 m.
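The wind-speed calculation can be written as a one-line Python helper. This sketch assumes that "vector sum" refers to the magnitude of the horizontal wind vector formed by the two 10 m components; the function name is hypothetical.

```python
import math


def wind_speed(u_wind, v_wind):
    """Magnitude of the horizontal wind vector from its two components."""
    return math.hypot(u_wind, v_wind)


# Hypothetical components in m/s: 3 eastward, 4 northward.
print(wind_speed(3.0, 4.0))
```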

Land-Use Terms

Land-use terms are proxies for ozone formation or removal, and capture spatial variations at the local scale, which may not be measured by satellite or modeled by GEOS-Chem. The detailed procedure for processing elevation, road density, NEI (National Emissions Inventory), population density, percentage of urban area and NDVI (normalized difference vegetation index) has been described elsewhere (Kloog et al. 2012). We used two variables to approximate vegetation: the percentage of vegetation from NCEP North American Regional Reanalysis data and the 16-day 1-km MODIS NDVI data product MOD13A2 (Didan 2015). For days without NDVI values, we linearly interpolated values from neighboring days.
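Filling daily values from a 16-day product by linear interpolation can be sketched as follows. The function name and sample values are hypothetical, and the handling of gaps at the start or end of the series (copying the nearest observed value) is an assumption that the text does not specify.

```python
def interpolate_gaps(values):
    """Fill None entries by linear interpolation between observed neighbors.

    Gaps before the first or after the last observation copy the nearest
    observed value (an assumption made for this sketch).
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("at least one observed value is required")
    filled = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(float(v))
            continue
        before = [k for k in known if k < i]
        after = [k for k in known if k > i]
        if before and after:
            lo, hi = before[-1], after[0]
            frac = (i - lo) / (hi - lo)
            filled.append(values[lo] + frac * (values[hi] - values[lo]))
        else:
            nearest = before[-1] if before else after[0]
            filled.append(float(values[nearest]))
    return filled


# Hypothetical NDVI series with observations on the first and last day only.
print(interpolate_gaps([0.2, None, None, None, 0.6]))
```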

Regional and Monthly Dummy

Regional and monthly dummy variables were used to capture different associations between the above variables and monitored ozone by season and climate type. The major climate types were used to define the regional dummy variable (Kottek et al. 2006).
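Dummy variables of this kind are commonly built by one-hot encoding. The sketch below shows one way to do so in Python; the region labels are hypothetical placeholders rather than the classes actually used, and the function names are illustrative.

```python
def one_hot(category, categories):
    """Return a 0/1 indicator list with a single 1 at the category's position."""
    if category not in categories:
        raise ValueError(f"unknown category: {category!r}")
    return [1 if category == c else 0 for c in categories]


def month_region_dummies(month, region, regions):
    """Concatenate 12 monthly indicators with one indicator per region."""
    return one_hot(month, list(range(1, 13))) + one_hot(region, regions)


# Hypothetical climate-type labels used only for this example.
climate_regions = ["arid", "warm temperate", "snow"]
print(month_region_dummies(7, "arid", climate_regions))
```

Each record then carries these indicators alongside its continuous predictors, which lets a model learn month-specific and region-specific relationships.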

Neural Network

We used a neural network for its capacity to model nonlinearity and interactions among variables (Bishop 1995, Haykin and Network 2004). The target variable was monitored ozone from the AQS network and the predictor variables included the aforementioned variables. The input variables were available for the entire study area. Some variables had a small proportion of missing values and we estimated the missing data using linear interpolation (Table S2, supplementary material). Not all variables were available during the entire study period. For each year, we fitted a neural network with the variables available in that year. Most existing studies fitted models with in situ information, that is, the values of each variable at the monitoring sites; however, information about neighboring areas can also be informative. For instance, nearby traffic volume influences in situ ozone levels by either providing ozone precursors or scavenging ozone. To incorporate the nearby information into the neural network, we used convolutional layers (LeCun and Bengio 1995). A convolutional layer is computed by applying a convolution kernel (e.g., mean, inverse distance weighted mean) to the inputs in order to compute a scalar summary of the neighboring cells, which is then used as an additional predictor. By choosing different kernels, we obtained different aggregations of neighboring information, which gave the neural network more flexibility to capture spatial autocorrelation and improved model fit. We computed convolutional layers for each land-use variable, for predicted ozone of nearby areas, and for predicted ozone of preceding and subsequent days. To create the convolutional layers for predicted ozone, we first fitted the neural network and obtained intermediate ozone predictions. Then we computed spatial and temporal convolutional layers for predicted ozone and fitted the neural network again with those convolutional layers (Fig. S2). The details of convolutional layers and fitting a neural network are presented in the supplementary material.
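The kernel-based summary of neighboring cells described above can be illustrated with a short Python sketch on a regular grid. The exact kernels are given in the supplementary material, so the square window, the exclusion of the center cell, the `radius` parameter and the function name are assumptions made for this example.

```python
import math


def neighborhood_summary(grid, row, col, radius=1, kernel="mean"):
    """Scalar summary of the cells around (row, col), excluding the cell itself.

    kernel="mean" weights all neighbors equally; kernel="idw" weights each
    neighbor by the inverse of its distance in grid cells.
    """
    if kernel not in ("mean", "idw"):
        raise ValueError(f"unknown kernel: {kernel!r}")
    weighted_sum = 0.0
    weight_total = 0.0
    for r in range(max(0, row - radius), min(len(grid), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(grid[0]), col + radius + 1)):
            if (r, c) == (row, col):
                continue
            weight = 1.0 if kernel == "mean" else 1.0 / math.hypot(r - row, c - col)
            weighted_sum += weight * grid[r][c]
            weight_total += weight
    if weight_total == 0.0:
        raise ValueError("cell has no neighbors")
    return weighted_sum / weight_total


# Hypothetical 3x3 grid of a land-use variable (e.g., road density).
road_density = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
in_situ = road_density[0][0]
# The summaries become extra predictors next to the in situ value.
features = [
    in_situ,
    neighborhood_summary(road_density, 0, 0, kernel="mean"),
    neighborhood_summary(road_density, 0, 0, kernel="idw"),
]
print(features)
```

Applying the same function along the time axis of intermediate predictions (previous and next days instead of neighboring cells) would give a temporal analogue.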

We used ten-fold cross-validation to validate the neural network results. All monitors were randomly divided into 10 splits. We trained a neural network on 9 splits of the monitors and made ozone predictions for the remaining split, then repeated the process nine more times so that each split was held out once. Combining the predicted ozone from the 10 splits yielded ozone predictions for all monitors. We calculated total R2, spatial R2 and temporal R2 for all monitors as well as by region and season to evaluate model performance. Calculations of R2 and other metrics of model performance (bias and slope) are specified in the supplementary material.
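The monitor-level cross-validation procedure can be sketched in Python as follows. The key point is that whole monitors, not individual daily records, are assigned to folds, so every prediction is made for a site the model has not seen. The `fit` and `predict` callables stand in for the neural network and are placeholders; all names and the toy data are hypothetical.

```python
import random


def split_monitors(monitor_ids, n_folds=10, seed=0):
    """Randomly partition monitor IDs into n_folds disjoint groups."""
    ids = list(monitor_ids)
    random.Random(seed).shuffle(ids)
    return [ids[i::n_folds] for i in range(n_folds)]


def cross_validate(records, fit, predict, n_folds=10, seed=0):
    """Predict every record from a model trained without its monitor.

    records: list of (monitor_id, features, observed_ozone).
    Returns a list of (monitor_id, observed_ozone, predicted_ozone).
    """
    monitor_ids = sorted({monitor for monitor, _, _ in records})
    results = []
    for held_out in split_monitors(monitor_ids, n_folds, seed):
        held = set(held_out)
        train = [rec for rec in records if rec[0] not in held]
        test = [rec for rec in records if rec[0] in held]
        model = fit(train)
        results.extend((m, y, predict(model, x)) for m, x, y in test)
    return results


# Placeholder model: predicts the mean ozone of the training records.
def fit_mean(train):
    return sum(y for _, _, y in train) / len(train)


def predict_mean(model, features):
    return model


# Toy data: 25 monitors with 3 daily records each.
toy = [(f"m{i}", [float(i)], 40.0 + i) for i in range(25) for _ in range(3)]
predictions = cross_validate(toy, fit_mean, predict_mean)
print(len(predictions))
```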

To make ozone predictions, we trained a neural network with all monitors. The trained neural network was used to predict ozone at 1 km×1 km grid cells for the whole study area during the entire study period. We prepared input variables at 1 km×1 km grid cells and made ozone predictions with the trained neural network. We linearly interpolated the data if missing values were present. All programming work was implemented in Matlab (version 2014a, The MathWorks, Inc.).


After conducting ten-fold cross-validation, total R2 ranged from 0.74 to 0.80 with mean R2 = 0.76 (Table 1). Slope was near 1; bias was about 1.20 ppb for the whole concentration range and 2.82 ppb below 75 ppb (Table 1, Table S3). Model performance did not vary much by year, nor was there any temporal trend in model fit. In contrast, model performance varied by season, with the highest R2 observed in autumn, followed by summer, spring and winter (Table 2). By region, model performance in the Middle Atlantic, South Atlantic, East North Central, West South Central and Pacific States was near or above the national average, while the New England, Mountain and West North Central States were below the national average (Table 3). The regional division used above is from the U.S. Census Bureau (Table S1, Fig. S7). Figure 2 visualizes model fits for the study area. Wyoming, Montana, Western Colorado, Eastern Washington State, Eastern Tennessee and Maine had lower fits than other states.

Figure 2
Model performance in the continental United States. This figure visualizes the total R2 between monitored and predicted ozone. We interpolated R2 to areas without monitors using Kriging interpolation. Spring was defined as March to May; summer was defined ...
Table 1
Cross-validated total R2, spatial R2, temporal R2, and corresponding MSE between monitored and predicted ozone in each year for the study area.
Table 2
Cross-validated total R2, spatial R2, temporal R2, and corresponding MSE between monitored and predicted ozone in each year divided by season. The definition of seasons was specified in Fig. 2.
Table 3
Cross-validated total R2, spatial R2, temporal R2, and corresponding MSE between monitored and predicted ozone in each year divided by U.S. Census Divisions.

Figure 3 visualizes the spatial pattern of ozone in the study area. The Mountain States had the highest ozone levels in all seasons. Areas around the Appalachian Mountains also had high ozone levels, although to a lesser degree. The Eastern United States, with much lower ozone year round, experienced higher ozone levels in summer. Figure 3 also shows low concentrations in cities and along highways. In terms of temporal trend, Figure 4 shows a general decreasing trend of ozone, although it is less obvious in some regions.

Figure 3
Spatial distribution of predicted ozone. The trained neural network predicted ozone at 1 km×1 km grid cells. Those figures visualize annual averages and seasonal averages of predicted ozone for 2000~2004, 2005~2008 and 2009~2012.
Figure 4
Regional trend in ozone levels at national and regional levels. Concentration is defined by annual fourth maximums of daily 8hr-max.
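The metric named in the Figure 4 caption, the annual fourth maximum of daily 8hr-max ozone, can be computed as in this minimal Python sketch. The function name and the sample values are hypothetical.

```python
def annual_fourth_max(daily_8hr_max_ppb):
    """Return the fourth-highest daily 8hr-max value in one year of data."""
    if len(daily_8hr_max_ppb) < 4:
        raise ValueError("need at least four daily values")
    return sorted(daily_8hr_max_ppb, reverse=True)[3]


# Hypothetical daily 8hr-max values (ppb) for a short illustrative period.
print(annual_fourth_max([50, 82, 61, 90, 75, 88, 40]))
```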


This study proposed a hybrid model framework, which integrated satellite-based data, CTM outputs, ozone vertical profiles, meteorological variables, land-use terms and other atmospheric compounds that were related to ozone formation or deposition. Convolutional layers aggregated nearby information and improved model fit. The average cross-validated R2 between predicted and monitored daily 8hr-max ozone was 0.76 (0.74~0.80 by year). Few existing studies have modeled 8-hour maximum ground-level ozone on a daily basis or attempted to make predictions at nationwide 1 km×1 km grid cells. We believe that this level of temporal/spatial coverage and model performance is an improvement over previous ozone prediction approaches. Epidemiological studies investigating the acute and chronic effects of ozone will benefit from more accurate and granular exposure assessments.

Our hybrid approach has several advantages and innovations. First, model performance surpasses that of existing studies. Some previous studies adopted land-use regression, Kriging or other methods and achieved RMSE > 10 ppb in Belgium (Hooyberghs et al. 2006); RMSE > 10 ppb in Italy (Carnevale et al. 2008); daily R2 = 0.653 in Quebec (Adam-Poupart et al. 2014). Our hybrid model outperformed land-use regression results, with averaged cross-validated annual R2 = 0.76 and RMSE = 7.36 ppb. Another improvement is that LUR is usually constrained to specific locations, while our hybrid model covers the entire continental United States. In terms of CTMs, some CMAQ simulations achieved normalized mean error (NME) less than 35% over the continental United States in summer (Tong and Mauzerall 2006); improved to NME 17.9% but focused on the Eastern United States (Appel et al. 2007); and continued to obtain NME between 17.7% and 21.7% (Zhang et al. 2009). CMAQ simulations improved over time, but our hybrid model still outperformed them with a cross-validated NME = 13.13%. Combining multiple CTM simulations and comparing with monitored ozone, some researchers obtained mean R2 = 0.57 for the continental United States for the whole year (Reidmiller et al. 2009), compared with mean R2 = 0.76 in our study. This indicates that our hybrid model surpasses CTM simulations as a whole. In addition, convolutional layers take neighboring information into account, an approach that is also applicable to other studies. Other methods, such as Kriging, have been widely used to aggregate nearby information in ozone modeling. For a convolutional layer, the specific aggregation depends on the kernel function, which is more versatile than Kriging. More importantly, being an input layer of a neural network, a convolutional layer can have complex interactions with other variables, which can better capture complex nonlinear atmospheric processes. By introducing convolutional layers, this study introduces a new way of incorporating neighboring information to improve model performance.
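For readers comparing the error metrics quoted above, the following Python sketch computes RMSE and NME from paired observations and predictions. It uses the definition of NME that is common in CMAQ evaluations (sum of absolute errors divided by the sum of observations); whether the supplementary material uses exactly this form is an assumption, and the function names and sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nme_percent(observed, predicted):
    """Normalized mean error in percent: sum(|pred - obs|) / sum(obs) * 100."""
    abs_errors = sum(abs(p - o) for o, p in zip(observed, predicted))
    return 100.0 * abs_errors / sum(observed)


# Hypothetical monitored and predicted ozone (ppb).
obs = [40.0, 50.0, 60.0]
pred = [44.0, 47.0, 60.0]
print(rmse(obs, pred), nme_percent(obs, pred))
```

RMSE is expressed in ppb and penalizes large errors more heavily, whereas NME is unitless, which is why it is convenient for comparisons across studies with different concentration ranges.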

We integrated multiple data sources into a single ozone-modeling framework and improved model fit. Not all of the variables contributed equally to model performance. Satellite-based ozone measurements, GEOS-Chem simulations and land-use terms were critical to model performance. Consistent with this, previous studies have also combined land-use regression with chemical transport models (Akita et al. 2014), or land-use regression with Kriging (Wang et al. 2015), at regional or municipal scales. Other variables, including regional dummy variables and certain meteorological variables, played an auxiliary role. Some variables are complementary to each other. For example, satellite-based instruments, like OMI, have daily measurements with a large spatial coverage, but their values are averaged column measurements of ozone for a large volume of air. AQS monitors measure ground-level ozone at specific locations. Thus, satellite-based measurements cannot capture variability at small scales like monitors do (Wang et al. 2011). On the other hand, land-use terms are proxies for local emissions, which give rise to local variability, but they usually do not provide much information on temporal variability. Land-use terms and satellite observations are complementary to each other because land-use terms describe small local scales and satellite observations have wide coverage in time and space. Combining both data sets overcomes the weaknesses of each and improves the model. The use of a neural network rather than a regression did not by itself drive model performance; another study used a neural network with only land-use terms and achieved model performance inferior to ours (RMSE > 10 ppb) (Carnevale et al. 2008).

We found an east-west gradient of ozone concentration (Fig. 3). High concentrations in the Western United States and Mountain States are attributable to factors including high elevation, deep boundary layer, large-scale subsidence, slow ozone deposition to the arid terrain and slow ozone loss caused by dry conditions (Fiore et al. 2002). The high ground-level ozone in the Mountain States reflects stratospheric intrusion, which can produce some transient peak ozone concentrations at ground level (Davies and Schuepbach 1994). Compared with the high concentrations in the Mountain States, urban areas had lower ozone. Other air pollutants (e.g. NO) react with ozone and cause ozone scavenging in urban areas, such as San Francisco, Los Angeles, New York City, Houston and Chicago, as well as in areas along highways (Fig. 3). For the same reason, we observed higher ozone concentrations in rural areas than in urban areas in general (Fig. 5). We found a general trend of decreasing concentrations over time that agrees with trends observed in monitoring data alone (Camalier, Cox, and Dolwick 2007, Cox and Chu 1996), but the trend is less evident at the national level and in several regions (Fig. 4). Figure 5 presents the temporal trend by season. In spring and autumn there is an increasing trend over time, because NO emission controls in recent years have reduced ozone scavenging and raised background ozone levels. The decreasing summer averages reflect the implemented emission control policies for ozone precursors, but this trend was reversed after the recession. The temporal trend in each region may deviate from the national trend (Fig. S4). The regional discrepancies and different effects of emission control in spring, summer, and autumn have been described in previous literature (Cooper et al. 2012). The increasing trend in winter is consistent across almost all regions, which is related to weaker ozone scavenging (NOx titration) as NOx concentrations decrease (Austin et al. 2014, Jhun et al. 2014). This suggests a side effect of controlling air pollutants: emission control (e.g. of NOx) may ironically lead to an ozone increase under certain conditions (Li et al. 2013).

Figure 5
Seasonal averages by urban and rural area. Seasonal averages were computed by averaging predicted ozone at all 1 km×1 km grid cells in urban or rural areas. Urban areas are defined by developed areas above 50% based on National Land Cover Database ...

Model performance was good at typical concentrations. Figure 6 shows that the linearity between predicted and monitored ozone held below 110 ppb. Furthermore, model performance was still good, with mean R2 almost unchanged, below 75 ppb, the EPA 8hr-max ozone standard (Table S3). This performance will enable epidemiologists to assess the adverse effect of ozone even at low concentrations. Conversely, the model’s linearity had much uncertainty above 120 ppb due to insufficient data (Fig. 6); meanwhile, model performance dropped at high concentrations (Fig. S5). The inability to accurately predict extreme values is a limitation of our model, which may limit its usage in epidemiological studies that focus on peak concentrations. In terms of model performance over time, there was a slight decreasing trend in temporal R2, which may result from out-of-date land-use variables. Population density was retrieved for the year 2000 and assumed to be constant over time, so it does not reflect population density in more recent years. Updating population density to be time-varying may improve model performance. This hybrid approach used daily 8hr-max ozone as the ozone metric, which avoided noisy ozone fluctuations at night and improved model fit. Although our model performed less well in some remote and sparsely populated areas on a daily basis, the annual average demonstrated less discrepancy (Fig. S6).

Figure 6
Relationship between measured and predicted ozone. We fitted a regression of predicted ozone on monitored ozone with penalized spline. To assess the linearity between predicted and monitored ozone, we did not specify the degrees of freedom. This figure ...

Some limitations remain in our hybrid approach. First, this hybrid approach combines multiple datasets into a single framework and thus requires many variables that may not be available in countries where publicly available datasets are sparse. Second, prediction intervals are not available in the prediction results. A formal assessment of the uncertainty level is critical in epidemiological studies to determine statistical power. Both issues are worthy of further investigation.


In this paper, we introduced a hybrid model that predicts daily 8hr-max ozone across the continental United States. The main feature of this model is its ability to integrate information from multiple data sources. Specifically, we integrated data from satellite-based ozone measurements, ozone vertical profile, CTM outputs, land-use terms, meteorological variables, concentrations of ozone precursors and other air pollutants, NDVI, and regional/monthly dummy variables. The hybrid model used a neural network with convolutional layers, which aggregated information from neighboring areas to improve model fit. We calibrated the model using AQS daily 8hr-max ozone measurements. Mean cross-validated R2 was 0.76, ranging from 0.74 to 0.80 for the entire United States. The model performed better in the Eastern United States. The trained neural network predicted daily 8hr-max ozone at nationwide 1 km×1 km grid cells from 2000 to 2012. These ozone assessments can help scientists investigate the health effects of ozone.

Supplementary Material



This publication was made possible by USEPA grant R01 ES024332-01A1, RD83479801, and NIEHS grant ES000002. Its contents are solely the responsibility of the grantee and do not necessarily represent the official views of the USEPA. Further, USEPA does not endorse the purchase of any commercial products or services mentioned in the publication. Moreover, we thank the China Section of the Air & Waste Management Association for the generous scholarship we received to cover the cost of page charges, which made the publication of this paper possible.


  • Abraham JS, Comrie AC. Real-time ozone mapping using a regression-interpolation hybrid approach, applied to Tucson, Arizona. J Air Waste Ma. 2004;54:914–925. doi: 10.1080/10473289.2004.10470960. [PubMed] [Cross Ref]
  • Adam-Poupart A, Brand A, Fournier M, Jerrett M, Smargiassi A. Spatiotemporal modeling of ozone levels in Quebec (Canada): a comparison of kriging, land-use regression (LUR), and combined Bayesian maximum entropy–LUR approaches. Environ Health Persp. 2014;122:970–976. doi: 10.1289/ehp.1306566. [PMC free article] [PubMed] [Cross Ref]
  • Ahmad SP, Levelt PF, Bhartia PK, Hilsenrath E, Leppelmeier GW, Johnson JE. Atmospheric products from the ozone monitoring instrument (OMI), Optical Science and Technology. Proceedings of SPIE conference on Earth Observing Systems VIII; San Diego, California. 3-8 August 2003; 2003. pp. 619–630.
  • Akita Y, Baldasano JM, Beelen R, Cirach M, De Hoogh K, Hoek G, Nieuwenhuijsen M, Serre ML, De Nazelle A. Large scale air pollution estimation method combining land use regression and chemical transport modeling in a geostatistical framework. Environ Sci Technol. 2014;48:4452–4459. doi: 10.1021/es405390e. [PubMed] [Cross Ref]
  • Appel KW, Gilliland AB, Sarwar G, Gilliam RC. Evaluation of the Community Multiscale Air Quality (CMAQ) model version 4.5: sensitivities impacting model performance: part I---ozone. Atmos Environ. 2007;41:9603–9615. doi: 10.1016/j.atmosenv.2007.08.044. [Cross Ref]
  • Atkinson RW, Butland BK, Dimitroulopoulou C, Heal MR, Stedman JR, Carslaw N, Jarvis D, Heaviside C, Vardoulakis S, Walton H. Long-term exposure to ambient ozone and mortality: a quantitative systematic review and meta-analysis of evidence from cohort studies. BMJ Open. 2016;6:e009493. doi: 10.1136/bmjopen-2015-009493. [PMC free article] [PubMed] [Cross Ref]
  • Austin E, Zanobetti A, Coull B, Schwartz J, Gold DR, Koutrakis P. Ozone trends and their relationship to characteristic weather patterns. J Expo Sci Env Epid. 2014;25:532–542. doi: 10.1038/jes.2014.45. [PMC free article] [PubMed] [Cross Ref]
  • Balis D, Kroon M, Koukouli M, Brinksma E, Labow G, Veefkind J, McPeters R. Validation of Ozone Monitoring Instrument total ozone column measurements using Brewer and Dobson spectrophotometer ground-based observations. J Geophys Res -Atmos. 2007;112:D24S46. doi: 10.1029/2007JD008796. [Cross Ref]
  • Bell ML, McDermott A, Zeger SL, Samet JM, Dominici F. Ozone and Short-term Mortality in 95 US Urban Communities, 1987-2000. JAMA. 2004;292:2372–2378. doi: 10.1001/jama.292.19.2372. [PMC free article] [PubMed] [Cross Ref]
  • Bell ML. The use of ambient air quality modeling to estimate individual and population exposure for human health research: a case study of ozone in the Northern Georgia Region of the United States. Environ Int. 2006;32:586–593. doi: 10.1016/j.envint.2006.01.005. [PubMed] [Cross Ref]
  • Bell ML, Dominici F. Effect modification by community characteristics on the short-term effects of ozone exposure and mortality in 98 US communities. Am J Epidemiol. 2008;167:986–997. doi: 10.1093/aje/kwm396. [PMC free article] [PubMed] [Cross Ref]
  • Bey I, Jacob DJ, Yantosca RM, Logan JA, Field BD, Fiore AM, Li Q, Liu HY, Mickley LJ, Schultz MG. Global modeling of tropospheric chemistry with assimilated meteorology: Model description and evaluation. J Geophys Res. 2001;106:23073–23095. doi: 10.1029/2001JD000807. [Cross Ref]
  • Bhartia PK, Wellemeyer C. Greenbelt, MD: 2002. [31st December 2015]. TOMS-V8 total O3 algorithm, OMI Algorithm Theoretical Basis Document. available at:
  • Bishop CM. Neural networks for pattern recognition. United Kingdom: Oxford University Press; 1995.
  • Brasseur G, Hauglustaine D, Walters S, Rasch P, Müller JF, Granier C, Tie X. MOZART, a global chemical transport model for ozone and related chemical tracers: 1. Model description. J Geophys Res -Atmos. 1998;103:28265–28289. doi: 10.1029/98JD02397. [Cross Ref]
  • Breton CV, Wang X, Mack WJ, Berhane K, Lopez M, Islam TS, Feng M, Lurmann F, McConnell R, Hodis HN. Childhood air pollutant exposure and carotid artery intima-media thickness in young adults. Circulation. 2012;126:1614–1620. doi: 10.1161/CIRCULATIONAHA.112.096164. [PMC free article] [PubMed] [Cross Ref]
  • Burrows JP, Weber M, Buchwitz M, Rozanov V, Ladstätter-Weißenmayer A, Richter A, DeBeek R, Hoogen R, Bramstedt K, Eichmann K-U. The global ozone monitoring experiment (GOME): Mission concept and first scientific results. J Atmos Sci. 1999;56:151–175.<0151:TGOMEG>2.0.CO;2.
  • Byun D, Schere KL. Review of the governing equations, computational algorithms, and other components of the Models-3 Community Multiscale Air Quality (CMAQ) modeling system. Appl Mech Rev. 2006;59:51–77. doi: 10.1115/1.2128636. [Cross Ref]
  • Camalier L, Cox W, Dolwick P. The effects of meteorology on ozone in urban areas and their use in assessing ozone trends. Atmos Environ. 2007;41:7127–7137. doi: 10.1016/j.atmosenv.2007.04.061. [Cross Ref]
  • Carnevale C, Finzi G, Pisoni E, Singh V, Volta M. Neural networks and co-kriging techniques to forecast ozone concentrations in urban areas. Proceedings of the iEMSs fourth biennial meeting; Barcelona, Spain. 21 September 2008; 2008. pp. 1125–1132.
  • Cooper OR, Gao RS, Tarasick D, Leblanc T, Sweeney C. Long-term ozone trends at rural ozone monitoring sites across the United States, 1990–2010. J Geophys Res -Atmos. 2012;117:D22307. doi: 10.1029/2012JD018261. [Cross Ref]
  • Cox WM, Chu S-H. Assessment of interannual ozone variation in urban areas from a climatological perspective. Atmos Environ. 1996;30:2615–2625. doi: 10.1016/1352-2310(95)00346-0. [Cross Ref]
  • Davies T, Schuepbach E. Episodes of high ozone concentrations at the earth’s surface resulting from transport down from the upper troposphere/lower stratosphere: a review and case studies. Atmos Environ. 1994;28:53–68. doi: 10.1016/1352-2310(94)90022-1. [Cross Ref]
  • Di Q, Kloog I, Koutrakis P, Lyapustin A, Wang Y, Schwartz J. Assessing PM2.5 Exposures with High Spatio-Temporal Resolution across the Continental United States. Envir Sci Tech. 2015 doi: 10.1021/acs.est.5b06121. [PubMed] [Cross Ref]
  • Di Q, Schwartz J, Koutrakis P. A Hybrid Prediction Model for PM 2.5 Mass and Components Using a Chemical Transport Model and Land Use Regression. Atmos Environ. 2016;131:390–399. doi: 10.1016/j.atmosenv.2016.02.002. [Cross Ref]
  • Didan K. MOD13A2 MODIS/Terra Vegetation Indices 16-Day L3 Global 1km SIN Grid V006. NASA EOSDIS Land Processes DAAC. 2015
  • Fiore AM, Jacob DJ, Bey I, Yantosca RM, Field BD, Fusco AC, Wilkinson JG. Background ozone over the United States in summer: Origin, trend, and contribution to pollution episodes. J Geophys Res -Atmos. 2002;107:D15. doi: 10.1029/2001JD000982. [Cross Ref]
  • Franklin M, Schwartz J. The impact of secondary particles on the association between ambient ozone and mortality. Environ Health Persp. 2008;116:453–458. doi: 10.1289/ehp.10777. [PMC free article] [PubMed] [Cross Ref]
  • Fry JA, Xian G, Jin S, Dewitz JA, Homer CG, LIMIN Y, Barnes CA, Herold ND, Wickham JD. Completion of the 2006 national land cover database for the conterminous United States. Photogramm Eng Rem S. 2011;77:858–864.
  • Fusco AC, Logan JA. Analysis of 1970–1995 trends in tropospheric ozone at Northern Hemisphere midlatitudes with the GEOS-CHEM model. J Geophys Res -Atmos. 2003;108:D15. doi: 10.1029/2002JD002742. [Cross Ref]
  • Gent JF, Triche EW, Holford TR, Belanger K, Bracken MB, Beckett WS, Leaderer BP. Association of low-level ozone and fine particles with respiratory symptoms in children with asthma. JAMA. 2003;290:1859–1867. doi: 10.1001/jama.290.14.1859. [PubMed] [Cross Ref]
  • Graedel TE, Farrow LA, Weber TA. Photochemistry of the “Sunday Effect” Environ Sci Technol. 1977;11:690–694. doi: 10.1021/es60130a005. [Cross Ref]
  • Hao Y, Balluz L, Strosnider H, Wen X, Li C, Qualters JR. Ozone, fine particulate matter, and chronic lower respiratory disease mortality in the United States. Am J Resp Crit Care. 2015;192:337–341. doi: 10.1164/rccm.201410-1852OC. [PMC free article] [PubMed] [Cross Ref]
  • Haykin S, Network N. A comprehensive foundation, Neural Networks. Second Edition. Canada: McMaster University; 2004.
  • Hoek G, Beelen R, de Hoogh K, Vienneau D, Gulliver J, Fischer P, Briggs D. A review of land-use regression models to assess spatial variation of outdoor air pollution. Atmos Environ. 2008;42:7561–7578. doi: 10.1016/j.atmosenv.2008.05.057. [Cross Ref]
  • Hooyberghs J, Mensink C, Dumont G, Fierens F. Spatial interpolation of ambient ozone concentrations from sparse monitoring points in Belgium. J Environ Monitor. 2006;8:1129–1135. doi: 10.1039/B612607N. [PubMed] [Cross Ref]
  • Jerrett M, Burnett RT, Pope CA, III, Ito K, Thurston G, Krewski D, Shi Y, Calle E, Thun M. Long-term ozone exposure and mortality. New Engl J Med. 2009;360:1085–1095. doi: 10.1056/NEJMoa0803894. [PMC free article] [PubMed] [Cross Ref]
  • Jhun I, Coull BA, Zanobetti A, Koutrakis P. The impact of nitrogen oxides concentration decreases on ozone trends in the USA. Air Qual Atmos Health. 2014;8:283–292. doi: 10.1007/s11869-014-0279-2. [PMC free article] [PubMed] [Cross Ref]
  • Kalnay E, Kanamitsu M, Kistler R, Collins W, Deaven D, Gandin L, Iredell M, Saha S, White G, Woollen J, Zhu Y, Leetmaa A, Reynolds R, Chelliah M, Ebisuzaki W, Higgins W, Janowiak J, Mo KC, Ropelewski C, Wang J, Jenne R, Joseph D. The NCEP/NCAR 40-Year Reanalysis Project. B Am Meteorol Soc. 1996;77:437–471. doi: 10.1175/1520-0477(1996)077<0437:TNYRP>2.0.CO;2. [Cross Ref]
  • Kloog I, Nordio F, Coull BA, Schwartz J. Incorporating local land use regression and satellite aerosol optical depth in a hybrid model of spatiotemporal PM2.5 exposures in the Mid-Atlantic States. Environ Sci Technol. 2012;46:11913–11921. doi: 10.1021/es302673e. [PMC free article] [PubMed] [Cross Ref]
  • Koren HS, Devlin RB, Graham DE, Mann R, McGee MP, Horstman DH, Kozumbo WJ, Becker S, House DE, McDonnell WF. Ozone-induced inflammation in the lower airways of human subjects. Am Rev Respir Dis. 1989;139:407–415. doi: 10.1164/ajrccm/139.2.407. [PubMed] [Cross Ref]
  • Kottek M, Grieser J, Beck C, Rudolf B, Rubel F. World map of the Köppen-Geiger climate classification updated. Meteorol Z. 2006;15:259–263.
  • Kroon M, Veefkind JP, Sneep M, McPeters R, Bhartia P, Levelt P. Comparing OMI-TOMS and OMI-DOAS total ozone column data. J Geophys Res -Atmos. 2008;113:D16. doi: 10.1029/2007JD008798. [Cross Ref]
  • Krotkov N, Leonard P, Walters M, Leonard P, Lead P. OMI/Aura Sulfur Dioxide (SO2) Total Column Daily L3 Best Pixel Global 0.25deg Lat/Lon Grid, version 003. Greenbelt, MD, USA: Goddard Space Flight Center Distributed Active Archive Center (GSFC DAAC); 2012. [30 August 2015]. [Cross Ref]
  • LeCun Y, Bengio Y. The handbook of brain theory and neural networks. 1995. Convolutional networks for images, speech, and time series.
  • Lei W, Foy Bd, Zavala M, Volkamer R, Molina L. Characterizing ozone production in the Mexico City Metropolitan Area: a case study using a chemical transport model. Atmos Chem Phys. 2007;7:1347–1366. doi: 10.5194/acp-7-1347-2007. [Cross Ref]
  • Levelt PF, Van den Oord GH, Dobber MR, Mälkki A, Visser H, De Vries J, Stammes P, Lundell JO, Saari H. The ozone monitoring instrument. IEEE T Geosci Remote. 2006;44:1093–1101. doi: 10.1109/TGRS.2006.872333. [Cross Ref]
  • Li Y, Lau AK, Fung JC, Zheng J, Liu S. Importance of NOx control for peak ozone reduction in the Pearl River Delta region. J Geophys Res Atmos. 2013;118:9428–9443. doi: 10.1002/jgrd.50659. [Cross Ref]
  • Liu X, Bhartia P, Chance K, Froidevaux L, Spurr R, Kurosu T. Validation of Ozone Monitoring Instrument (OMI) ozone profiles and stratospheric ozone columns with Microwave Limb Sounder (MLS) measurements. Atmos Chem Phys. 2010b;10:2539–2549. doi: 10.5194/acp-10-2539-2010. [Cross Ref]
  • Liu X-H, Zhang Y, Xing J, Zhang Q, Wang K, Streets DG, Jang C, Wang W-X, Hao J-M. Understanding of regional air pollution over China using CMAQ, part II. Process analysis and sensitivity of ozone and particulate matter to precursor emissions. Atmos Environ. 2010a;44:3719–3727. doi: 10.1016/j.atmosenv.2010.03.036. [Cross Ref]
  • Liu X, Chance K, Sioris CE, Kurosu TP, Spurr RJ, Martin RV, Fu TM, Logan JA, Jacob DJ, Palmer PI. First directly retrieved global distribution of tropospheric column ozone from GOME: Comparison with the GEOS-Chem model. J Geophys Res -Atmos. 2006;111:D2. doi: 10.1029/2005JD006564. [Cross Ref]
  • Liu Y, Park RJ, Jacob DJ, Li Q, Kilaru V, Sarnat JA. Mapping annual mean ground-level PM2.5 concentrations using Multiangle Imaging Spectroradiometer aerosol optical thickness over the contiguous United States. J Geophys Res. 2004;109:D22. doi: 10.1029/2004JD005025. [Cross Ref]
  • Loibi W, Winiwarter W, Kopsca A, Zufger J, Baumann R. Estimating the spatial distribution of ozone concentrations in complex terrain. Atmos Environ. 1994;28:2557–2566. doi: 10.1016/1352-2310(94)90430-8. [Cross Ref]
  • Malmqvist E, Olsson D, Hagenbjörk-Gustafsson A, Forsberg B, Mattisson K, Stroh E, Strömgren M, Swietlicki E, Rylander L, Hoek G. Assessing ozone exposure for epidemiological studies in Malmö and Umeå, Sweden. Atmos Environ. 2014;94:241–248. doi: 10.1016/j.atmosenv.2014.05.038. [Cross Ref]
  • McConnell R, Berhane K, Gilliland F, London SJ, Islam T, Gauderman WJ, Avol E, Margolis HG, Peters JM. Asthma in exercising children exposed to ozone: a cohort study. Lancet. 2002;359:386–391. doi: 10.1016/S0140-6736(02)07597-9. [PubMed] [Cross Ref]
  • McPeters R, Kroon M, Labow G, Brinksma E, Balis D, Petropavlovskikh I, Veefkind JP, Bhartia P, Levelt P. Validation of the AURA Ozone Monitoring Instrument total column ozone product. J Geophys Res -Atmos. 2008;113:D15. doi: 10.1029/2007JD008802. [Cross Ref]
  • Pfister G, Emmons L, Hess P, Lamarque JF, Thompson A, Yorks J. Analysis of the Summer 2004 ozone budget over the United States using Intercontinental Transport Experiment Ozonesonde Network Study (IONS) observations and Model of Ozone and Related Tracers (MOZART-4) simulations. J Geophys Res -Atmos. 2008;113:D23. doi: 10.1029/2008JD010190. [Cross Ref]
  • Reidmiller D, Fiore AM, Jaffe D, Bergmann D, Cuvelier C, Dentener F, Duncan BN, Folberth G, Gauss M, Gong S. The influence of foreign vs North American emissions on surface ozone in the US. Atmos Chem Phys. 2009;9:5027–5042. doi: 10.5194/acp-9-5027-2009. [Cross Ref]
  • Rodgers CD. Inverse methods for atmospheric sounding: theory and practice. World scientific; Singapore: 2000.
  • Schwartz J, Dockery DW, Neas LM, Wypij D, Ware JH, Spengler JD, Koutrakis P, Speizer FE, Ferris BG., Jr Acute effects of summer air pollution on respiratory symptom reporting in children. Am J Resp Crit Care. 1994;150:1234–1242. doi: 10.1164/ajrccm.150.5.7952546. [PubMed] [Cross Ref]
  • Sokhi RS, San José R, Kitwiroon N, Fragkou E, Pérez JL, Middleton D. Prediction of ozone levels in London using the MM5–CMAQ modelling system. Environ Modell Softw. 2006;21:566–576. doi: 10.1016/j.envsoft.2004.07.016. [Cross Ref]
  • Sousa S, Alvim-Ferraz M, Martins F. Health effects of ozone focusing on childhood asthma: what is now known–a review from an epidemiological point of view. Chemosphere. 2013;90:2051–2058. doi: 10.1016/j.chemosphere.2012.10.063. [PubMed] [Cross Ref]
  • Tank J, Biller H, Heusser K, Holz O, Diedrich A, Framke T, Koch A, Grosshennig A, Koch W, Krug N. Effect of acute ozone induced airway inflammation on human sympathetic nerve traffic: a randomized, placebo controlled, crossover study. PLoS ONE. 2011;6:e18737. doi: 10.1371/journal.pone.0018737. [PMC free article] [PubMed] [Cross Ref]
  • Tong DQ, Mauzerall DL. Spatial variability of summertime tropospheric ozone over the continental United States: Implications of an evaluation of the CMAQ model. Atmos Environ. 2006;40:3041–3056. doi: 10.1016/j.atmosenv.2005.11.058. [Cross Ref]
  • Tranchant B, Vincent A. Statistical interpolation of ozone measurements from satellite data (TOMS, SBUV and SAGE II) using the kriging method. Ann Geophys. 2000;18:666–678. doi: 10.1007/s00585-000-0666-x. [Cross Ref]
  • Turner MC, Jerrett MC, Pope A, Krewski D, Gapstur SM, Diver WR, Beckerman BS, Marshall JD, Su J, Crouse DL. Long-Term Ozone Exposure and Mortality in a Large Prospective Study. Am J Resp Crit Care. 2015 doi: 10.1164/rccm.201508-1633OC. [PMC free article] [PubMed] [Cross Ref]
  • vanDonkelaar A, Martin RV, Brauer M, Kahn R, Levy R, Verduzco C, Villeneuve PJ. Global Estimates of Ambient Fine Particulate Matter Concentrations from Satellite-Based Aerosol Optical Depth: Development and Application. Environ Health Persp. 2010;118:847–855. doi: 10.1289/ehp.0901623. [PMC free article] [PubMed] [Cross Ref]
  • Veefkind JP, De Haan JF, Brinksma EJ, Kroon M, Levelt PF. Total ozone from the Ozone Monitoring Instrument (OMI) using the DOAS technique. IEEE T Geosci Remote. 2006;44:1239–1244. doi: 10.1109/TGRS.2006.871204. [Cross Ref]
  • Wang L, Newchurch M, Biazar A, Liu X, Kuang S, Khan M, Chance K. Evaluating AURA/OMI ozone profiles using ozonesonde data and EPA surface measurements for August 2006. Atmos Environ. 2011;45:5523–5530. doi: 10.1016/j.atmosenv.2011.06.012. [Cross Ref]
  • Wang M, Keller JP, Adar SD, Kim SY, Larson TV, Olives C, Sampson PD, Sheppard L, Szpiro AA, Vedal S. Development of long-term spatiotemporal models for ambient ozone in six metropolitan regions of the United States: The MESA Air study. Atmos Environ. 2015;123:79–87. doi: 10.1016/j.atmosenv.2015.10.042. [PMC free article] [PubMed] [Cross Ref]
  • Xian G, Homer C, Dewitz J, Fry J, Hossain N, Wickham J. Change of impervious surface area between 2001 and 2006 in the conterminous United States. Photogramm Eng Rem S. 2011;77:758–762.
  • Zhang Y, Vijayaraghavan K, Wen XY, Snell HE, Jacobson MZ. Probing into regional ozone and particulate matter pollution in the United States: 1. A 1 year CMAQ simulation and evaluation using surface and satellite data. J Geophys Res -Atmos. 2009;114:D22304. doi: 10.1029/2009JD011898. [Cross Ref]