Home | About | Journals | Submit | Contact Us | Français |

**|**Int J Environ Res Public Health**|**v.14(7); 2017 July**|**PMC5551172

Formats

Article sections

Authors

Related links

Int J Environ Res Public Health. 2017 July; 14(7): 734.

Published online 2017 July 6. doi: 10.3390/ijerph14070734

PMCID: PMC5551172

Alejandro Ivan Aguirre-Salado,^{1,}^{*} Humberto Vaquera-Huerta,^{2} Carlos Arturo Aguirre-Salado,^{3} Silvia Reyes-Mora,^{1} Ana Delia Olvera-Cervantes,^{1} Guillermo Arturo Lancho-Romero,^{1} and Carlos Soubervielle-Montalvo^{3}

Kim Natasha Dirks, Academic Editor

Received 2017 May 22; Accepted 2017 July 3.

Copyright © 2017 by the authors.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

We implemented a spatial model for analysing ${\mathrm{PM}}_{10}$ maxima across the Mexico City metropolitan area during the period 1995–2016. We assumed that these maxima follow a non-identical generalized extreme value (GEV) distribution and modeled the trend by introducing multivariate smoothing spline functions into the probability GEV distribution. A flexible, three-stage hierarchical Bayesian approach was developed to analyse the distribution of the ${\mathrm{PM}}_{10}$ maxima in space and time. We evaluated the statistical model’s performance by using a simulation study. The results showed strong evidence of a positive correlation between the ${\mathrm{PM}}_{10}$ maxima and the longitude and latitude. The relationship between time and the ${\mathrm{PM}}_{10}$ maxima was negative, indicating a decreasing trend over time. Finally, a high risk of ${\mathrm{PM}}_{10}$ maxima presenting levels above 1000 $\mathsf{\mu}$g/m${}^{3}$ (return period: 25 yr) was observed in the northwestern region of the study area.

Air pollution in urban areas has become a major problem [1]. Increases in industry and urban traffic due to economic and population growth have led to an increase in gas and particulate emission that contribute to air pollution [2]. Among air pollutants, fine inhalable particles, known as particulate matter (PM) are associated with respiratory illnesses such as bronchitis, emphysema, asthma and other chronic obstructive pulmonary diseases [3]. PM can be classified by size; particles of 10 $\mathsf{\mu}$m or less in diameter are called ${\mathrm{PM}}_{10}$. They can consist of diverse solid and liquid particles, which may contain chemical constituents such as nitrates, sulfates and organic carbon [4]. PM originates from factories, internal combustion engines, heating systems, volcanoes, and deserts, and can include dust particles formed through the mechanical breakdown of larger particles during agricultural and mining processes [5]. In the Mexico City metropolitan area, a study conducted by the U.S. Department of Energy (DOE) and Mexico’s Petróleos Mexicanos (PEMEX) showed that approximately 50% of the ${\mathrm{PM}}_{10}$ was composed of dust from roadways, construction, and bare land [6].

Previous studies on the damage caused to human health by breathable particulate matter have revealed an association between high concentrations of ${\mathrm{PM}}_{10}$ and mortality due to respiratory diseases [1,7]. To reduce exposure and minimize the adverse effects of these particulates on public health, several studies have been conducted with a focus on understanding the causes and factors related to the origin and flow of these particles [4,8]. Most of these studies have relied on continuous measurements of ${\mathrm{PM}}_{10}$ to predict future concentrations based on various models, such as multiple regression models, neural networks [9] and generalized linear models [5]. However, these short-term forecasting methodologies were developed for use in locations with limited infrastructure, and where obtaining continuous measurements of ${\mathrm{PM}}_{10}$ concentrations is difficult. In other, more densely populated regions, such as the Mexico City metropolitan area, there are systems that measure these concentrations and send information in real time to a central location for the immediate activation or deactivation of alerts or emergency procedures. In these cases, it is more convenient to study air pollutants through the use of robust methodologies such as the theory of extreme values.

Extreme value theory is used in many fields, such as environmental studies, engineering and finance, to monitor the quantitative risk of future extreme events [10,11]. In the environmental sciences, it is used to model extremes of temperature, rainfall, wind and pollution, among other phenomena. The asymptotic distribution of maxima is known as the generalized extreme value distribution, which is described by three parameters corresponding to location, scale and shape [12]. These parameters are estimated using the maximum likelihood method [13]. However, this method is not robust with a small sample size, so many other estimation methods have also been proposed, such as the method of moments, the use of L-moments and the use of weighted probability moments [14,15,16].

Recently, new methodologies have been proposed for the study of extreme values, mainly for application to hydroclimatological and environmental data, all of them based on the generalized extreme value distribution. For example: Gaetan and Grigoletto [17] proposed the use of Markov random fields approximated based on smoothing kernel parameters for modeling the parameters of the GEV distribution. Reich et al. [18] studied heat waves using a Bayesian hierarchical model with the generalized Pareto distribution (GPD). Cooley and Sain [19] studied maximum rainfall events by assigning a normal prior to the parameters of the GPD. Sang and Gelfand [11] studied the extreme values of spatial stochastic processes and modeled the observed trend as a function of covariates.

In a real scenario, it is common that conditions change and that the assumptions of stationarity that are required in a traditional analysis of extreme values not met; this is because there are often trends of extreme values [20,21,22]. Recent studies have introduced covariate functions for describing extreme value distributions to model either the location parameter, the scale parameter or both. Regarding the location parameter, Weissman [23] used a sine function, Tawn [24] and Scarf [25] proposed a linear function, Rosen and Cohen [26] and Pauli and Coles [27] used splines, and Bocci et al. [28] used a geoadditive model. In the case of the scale parameter, because this parameter is assumed to be positive, it is more common to model its logarithm; therefore, Yee and Stephenson [29] used additive models, El Adlouni et al. [10] and Rodriguez et al. [30] used linear functions, and Cannon [31] proposed the use of neural networks.

In this article, we present a spatio-temporal analysis of extreme ${\mathrm{PM}}_{10}$ values in the Mexico City metropolitan area. The objective of this study was to analyse ${\mathrm{PM}}_{10}$ data collected over time and in different spatial locations to gain insight into the distribution of ${\mathrm{PM}}_{10}$ maxima and to quantify the risk of future extreme ${\mathrm{PM}}_{10}$ pollution events, even in non-monitored regions.

The Mexico City metropolitan area is one of the world’s largest urban agglomerations, with approximately $25.4$ million inhabitants spread over $7866\text{}{\mathrm{km}}^{2}$ at an average elevation of 2240 m above mean sea level. It is surrounded by mountains to the east and west, creating a basin with low points to the the north which leads to air pollution problems because of limited ventilation [6]. Two synoptic wind regimes prevail throughout the year: an anticyclonic westerly wind from November to April and a moist trade wind associated with the rainy season from May to October. Despite weak prevailing synoptic winds, the thermally induced local wind converges toward the city, which tends to restrict the ventilation of polluted air [32]. The study area and the locations of the primary sampling sites are shown in Figure 1.

Let ${Y}_{1},\dots ,{Y}_{n}$ be an independent and identically distributed set of random variables with distribution function ${F}_{X}\left(x\right)$ and let ${M}_{n}=max\left({Y}_{1},\dots ,{Y}_{n}\right)$. Let ${F}_{{M}_{n}}$ be the distribution function of ${M}_{n}$, because ${F}_{{M}_{n}}={F}_{X}{\left(x\right)}^{n}$ we have that ${M}_{n}$ is a degenerate distribution when $n\to \infty $. The extreme value theory considers that the only nondegenerate limiting distribution ${G}_{n}=({M}_{n}-{a}_{n})/{b}_{n}$ (if such a sequences of constants $\left\{{b}_{n}>0\right\}$ and $\left\{{a}_{n}\right\}$ exist) as $n\to \infty $ is the generalized extreme value (GEV) distribution [12]:

$${F}_{GEV}\left(y\right)=\left\{\begin{array}{cc}exp\left\{-{\left(1+\kappa \frac{\left(y-\mu \right)}{\sigma}\right)}^{-\frac{1}{\kappa}}\right\},\hfill & \kappa \ne 0,\hfill \\ exp\left\{-exp\left(-\frac{(y-\mu )}{\sigma}\right)\right\},\hfill & \kappa =0\hfill \end{array}\right.$$

for $y:1+\kappa \frac{\left(y-\mu \right)}{\sigma}>0$ when $\kappa \ne 0$, where $-\infty \le y\le \mu +\sigma /\kappa $ when $\kappa <0$ (Weibull), $-\infty \le y\le +\infty $ when $\kappa =0$ (Gumbel) and $\mu +\sigma /\kappa \le y\le +\infty $ when $\kappa >0$ (Fréchet). Here, $\mu \in \mathbb{R}$, $\sigma >0$ and $\kappa \in \mathbb{R}$ are the location, scale and shape parameters, respectively; see [33].

In the nonstationary case, the parameters are expressed as a function of a vector of covariates ${x}_{t}$: $GEV\left(\mu \left({x}_{t}\right),\sigma \left({x}_{t}\right),\kappa \left({x}_{t}\right)\right)$ [34]. To ensure a positive value for the scale parameter, $log\phantom{\rule{4pt}{0ex}}\sigma \left({x}_{t}\right)$ is used instead of $\sigma \left({x}_{t}\right)$ in the estimation process.

For the non-stationary case, consider assigning a linear predictor to the location and scale parameters. The $\kappa $ parameter is usually assumed to be constant [29]. Therefore, we propose using the following conditions for the above parameters:

$$\begin{array}{c}{\mu}_{t}={\sum}_{i=1}^{{P}_{1}}{X}_{ti}{\beta}_{1i}+{\sum}_{i=1}^{{P}_{2}}{Z}_{ti}{u}_{1i},\\ log\phantom{\rule{4pt}{0ex}}{\sigma}_{t}={\sum}_{i=1}^{{P}_{1}}{X}_{ti}{\beta}_{2i}+{\sum}_{i=1}^{{P}_{2}}{Z}_{ti}{u}_{2i},\\ {\kappa}_{t}=\kappa \end{array}$$

(1)

where *X* is a scaled and centered $n\times {p}_{1}$ matrix of covariates that includes the intercept; ${\beta}_{i}$, $i=1,2$, is a vector of length ${p}_{1}$; ${u}_{i}$, $i=1,2$, is a vector of length ${p}_{2}$; *Z* is an $n\times {p}_{2}$ matrix such that ${\left\{Z\right\}}_{ij}=exp\left[-{\left(\u2225{\underset{\_}{x}}_{i}-{\underset{\_}{k}}_{j}\u2225\right)}^{2}\right]$, $i=1,\dots ,n$, $j=1,\dots ,{p}_{2}$; ${\underset{\_}{x}}_{i}$ is the vector of covariates for the *i*-th observation, scaled and centered; and ${\underset{\_}{k}}_{j}$ the *j*-th centroid obtained using the method of average linkage hierarchical clustering [28,35].

For a sample of n observations: $\underset{\_}{y}=({y}_{1},\dots ,{y}_{n})$, the maximum likelihood estimator for the non-stationary GEV can be determined by maximizing the likelihood function

$$\begin{array}{c}L({\mu}_{t},{\sigma}_{t},{\kappa}_{t}\mid \underset{\_}{y})=\prod _{t=1}^{n}\frac{1}{{\sigma}_{t}}exp\left\{-{\left[1+{\kappa}_{t}\left(\frac{{y}_{t}-{\mu}_{t}}{{\sigma}_{t}}\right)\right]}^{-\frac{1}{{\kappa}_{t}}}\right\}\times {\left[1+{\kappa}_{t}\left(\frac{{y}_{t}-{\mu}_{t}}{{\sigma}_{t}}\right)\right]}^{-\left(1+\frac{1}{{\kappa}_{t}}\right)}\end{array}$$

where *n* is the number of observations. The function of ${\kappa}_{t}$ is usually assumed constant [29], so the log-likelihood is:

$$\begin{array}{c}\ell ({\mu}_{t},{\sigma}_{t},\kappa \mid \underset{\_}{y})=-nlog{\sigma}_{t}-\sum _{t=1}^{n}{\left[1+\kappa \left(\frac{{y}_{t}-{\mu}_{t}}{{\sigma}_{t}}\right)\right]}^{-\frac{1}{\kappa}}-\sum _{t=1}^{n}\left(1+\frac{1}{\kappa}\right)log\left[1+\kappa \left(\frac{{y}_{t}-{\mu}_{t}}{{\sigma}_{t}}\right)\right]\end{array}$$

Let $C=\u230aX\phantom{\rule{4pt}{0ex}}Z\u230b$ and ${\underset{\_}{b}}_{\left(i\right)}^{\prime}=\u230a{\underset{\_}{\beta}}_{\left(i\right)}^{\prime}\text{}{\underset{\_}{u}}_{\left(i\right)}^{\prime}\u230b,$ where *C* is a $n\times p$ matrix, with $p={p}_{1}+{p}_{2};$${\underset{\_}{b}}_{\left(i\right)}^{\prime}$ is a vector of $p\times 1$ parameters, the linear predictors can be written as:

$${\mu}_{t}=C{b}_{1}\phantom{\rule{4pt}{0ex}};\phantom{\rule{4pt}{0ex}}log\phantom{\rule{4pt}{0ex}}{\sigma}_{t}=C{b}_{2}\phantom{\rule{4pt}{0ex}};\phantom{\rule{4pt}{0ex}}{\kappa}_{t}=\kappa $$

For this study, we assigned a penalization $\left(P\right)$ to the vector of parameters, such that:

$$P=\left[\begin{array}{cc}\frac{1}{{\sigma}_{\beta}^{2}}\otimes {I}_{{p}_{1}}& 0\\ 0& \frac{1}{{\sigma}_{u}^{2}}\otimes {I}_{{p}_{2}}\end{array}\right]$$

where ${I}_{{p}_{1}}$ and ${I}_{{p}_{2}}$ are identity matrices of order ${p}_{1}$ and ${p}_{1}$ respectively, ${\sigma}_{\beta}^{2}$ and ${\sigma}_{u}^{2}$ are values that control the degree of regularization of the model.

Therefore the penalized log-likelihood of the model is:

$${\ell}_{n}^{p}({\mu}_{t},{\sigma}_{t},\kappa \mid \underset{\_}{y})=\sum _{t=1}^{n}{\ell}_{t}({\mu}_{t},{\sigma}_{t},\kappa \mid \underset{\_}{y})-\sum _{i=1}^{2}{\underset{\_}{b}}_{\left(i\right)}^{\prime}P{\underset{\_}{b}}_{\left(i\right)}-\frac{1}{{\sigma}_{\kappa}^{2}}{\kappa}^{2}$$

where:

$${\ell}_{t}({\mu}_{t},{\sigma}_{t},\kappa \mid \underset{\_}{y})=-log{\sigma}_{t}-{\left[1+\kappa \left(\frac{{y}_{t}-{\mu}_{t}}{{\sigma}_{t}}\right)\right]}^{\frac{1}{\kappa}}log\left[1+\kappa \left(\frac{{y}_{t}-{\mu}_{t}}{{\sigma}_{t}}\right)\right]$$

Defining ${\ell}_{n}({\mu}_{t},{\sigma}_{t},\kappa \mid \underset{\_}{y})={\displaystyle \sum _{t=1}^{n}}{\ell}_{t}({\mu}_{t},{\sigma}_{t},\kappa \mid \underset{\_}{y})$ and ${\ell}_{PL}({b}_{\left(1\right)},{b}_{\left(2\right)},\kappa )=\left[-{\displaystyle \sum _{i=1}^{2}}{\underset{\_}{b}}_{\left(i\right)}^{\prime}P{\underset{\_}{b}}_{\left(i\right)}-\frac{1}{{\sigma}_{\kappa}^{2}}\kappa \right]$ we can rewrite the log-likelihood as:

$$\begin{array}{c}{\ell}_{n}^{p}({\mu}_{t},{\sigma}_{t},\kappa \mid \underset{\_}{y})={\ell}_{n}({\mu}_{t},{\sigma}_{t},\kappa \mid \underset{\_}{y})+{\ell}_{PL}({b}_{\left(1\right)},{b}_{\left(2\right)},\kappa )\end{array}$$

(2)

In order to compare two models, let ${M}_{1}$ with ${\theta}_{1}$ a parametric vector against another model ${M}_{0}$ with ${\theta}_{0}$ a subset vector such as ${M}_{1}\subset {M}_{0}$, a simple way to compare two models is to use the deviance statistic defined by [34]

$$D=2\left({l}_{n}^{*}\left({M}_{1}\right)-{l}_{n}^{*}\left({M}_{0}\right)\right)$$

where ${l}_{n}^{*}\left(M\right)$ is the maximized log likelihood function of model M. Values of D greater than ${\chi}_{k}^{2}$ are considered significant, where *k* is the difference between the dimensions of ${M}_{1}$ and ${M}_{0}$, thus model ${M}_{1}$ explains better data variation than model ${M}_{0}$.

Penalized maximum likelihood estimators are used in this work for two reasons, the first is to use these estimators to perform a procedure of variables selection through the deviance, and second, we will use these estimators as initial values for the Bayesian hierarchical model to reduce the number of samples necessary to reach the stationary distribution of the MCMC algorithm.

Under the assumption that $\pi \left({y}_{t}|{\theta}_{t}\right)$ is the GEV distribution with parameters of ${\theta}_{t}=\left({\mu}_{t},{\sigma}_{t},\kappa \right)$, a Bayesian formulation for the model of extreme values is as follows:

$$\begin{array}{c}\pi \left({\theta}_{t},\omega |{y}_{t}\right)\propto \pi \left({y}_{t}|{\theta}_{t}\right)\pi \left({\theta}_{t}|\omega \right)\pi \left(\omega \right)\end{array}$$

(3)

where $\omega =\left({\beta}_{1},{\beta}_{2},{u}_{1},{u}_{2}\right)$ is such that the set of equations given in Equation (1) is satisfied. This hierarchical model consists of three levels: a data level, denoted by $\pi \left({y}_{t}|{\theta}_{t}\right)$; a process level, denoted by $\pi \left({\theta}_{t}|\omega \right)$; and a prior level, denoted by $\pi \left(\omega \right)$. Alternatively, the model given in Equation (3) with the conditions given in Equation (1) can be reformulated as a function of the parameter space of ${\omega}^{*}=\left({\beta}_{1},{\beta}_{2},{u}_{1},{u}_{2},\kappa \right)$ and ${\omega}^{**}=\left\{{\sigma}_{1},{\sigma}_{2}\right\}$; therefore, we can write the posterior distribution as follows:

$$\begin{array}{c}\pi \left({\omega}^{*},{\omega}^{**}|{y}_{t}\right)\propto \pi \left({y}_{t}|{\omega}^{*}\right)\pi \left({\omega}^{*}|{\omega}^{**}\right)\pi \left({\omega}^{**}\right)\end{array}$$

(4)

where $\pi \left({y}_{t}|{\omega}^{*}\right)$ is the GEV density under the conditions on the parameters given in Equation (1). The prior distribution for ${\omega}^{*}$ is such that

$$\begin{array}{c}{\beta}_{1}\sim N\left(0,{10}^{4}I\right)\\ {\beta}_{2}\sim N\left(0,{10}^{4}I\right)\\ {u}_{1}|{\sigma}_{1}\sim N\left(0,{\sigma}_{1}^{2}{I}_{{K}_{x}}\right)\\ {u}_{2}|{\sigma}_{2}\sim N\left(0,{\sigma}_{2}^{2}{I}_{{K}_{x}}\right)\\ \kappa \sim Uniform(-5,5)\end{array}$$

The prior distribution for the hyperparameters ${\omega}^{**}$ is given by

$$\begin{array}{c}{\sigma}_{1}^{2}\sim Half-Cauchy\left(25\right)\\ \hfill {\sigma}_{2}^{2}\sim Half-Cauchy\left(25\right)\end{array}$$

To sample the a posteriori distribution, we use a MCMC method with an acceptance probability given by

$$\alpha \left({\theta}^{*}|\theta \right)=min\left(1,\frac{\pi \left(x|{\theta}^{*}\right)\pi \left({\theta}^{*}\right)Q\left({\theta}^{*},\theta \right)}{\pi \left(x|\theta \right)\pi \left(\theta \right)Q\left(\theta ,{\theta}^{*}\right)}\right)$$

where $\pi \left(\theta \right)$ is the prior distribution for the parameters, $\pi \left(x|\theta \right)$ is the likelihood, and *Q* is the proposal function.

In this section, we examine the performance of the hierarchical GEV model defined in Equation (4) using simulated data. We simulated 500 extremes from a GEV model with two covariates, ${x}_{1}$ and ${x}_{2}$, corresponding to the longitude and latitude, respectively. The ${x}_{1}$ values were generated with equally spaced data in the range of 0 to $4\pi $, and the values of the covariate ${x}_{2}$ were randomly selected from the interval $\left[0,4\right]$, with

$$\begin{array}{c}{\mu}_{t}=sin\left(\frac{{\left(\sqrt{{\left({x}_{1}-2\pi \right)}^{2}+{\left({x}_{2}-2\pi \right)}^{2}}\right)}^{\frac{3}{2}}}{5}\right)\\ {\sigma}_{t}=\sigma =0.3\\ {\kappa}_{t}=\kappa =-0.1\end{array}$$

(5)

For our model, we ran 70,000 iterations to obtain samples of the a *posteriori* density, after a burn-in period of 60,000 iterations, and applied thinning in every fifth iteration. The hierarchical model given in Equation (4) was fitted by setting ${p}_{2}=10$ in equation set Equation (1). The estimate of the shape parameter was $-0.14$. The true functions of $\mu $ and $\sigma $ as functions of the covariates ${x}_{1}$ and ${x}_{2}$ are shown in Figure 1a,c, respectively. The function corresponding to $\mu $ that was chosen for the simulation is a function that cannot be separated based only on the main effects of the covariates, so it cannot be adjusted as in most traditional models for extreme values. The function for sigma is a flat function in the space covariate.

The true surface and local spatial patterns of the location parameter (Figure 2a) that were used to simulate the extreme values were recovered reasonably well by the smoothing function (Figure 2b). Similarly, a comparison of Figure 2c,d reveals that the smoothing function for the scale parameter of the extreme values based on the model given in Equation (1) recovered the true flat function for $\sigma $ given in Equation (5).

Real functions (**a**,**c**) and functions obtained by fitting the parameters (**b**,**d**) of a non-stationary generalized extreme value (GEV) model to simulated data with a sample size of $n=500$.

To evaluate the performance of our model, we reserved a set of 1000 locations to serve as the testing set and then calculated the correlation with respect to their estimated values. Specifically, for the location parameter, we obtained a correlation of 0.99 between the predicted values and the testing data. Typically, when the trend found in the maxima can be modeled using nonlinear functions of a set of covariates, the interpretations of the individual coefficients of the smoothing function lose importance, and they are often meaningless. Therefore, the main information about the trend is provided by the smoothing function constructed from the complete set of estimated parameters.

The data correspond to 1238 observations of quarterly block maxima in ${\mathrm{PM}}_{10}$ between 1 January 1995, and 31 December 2016, obtained at 31 fixed monitoring stations of the Red Automática de Monitoreo Atmosférico (RAMA) network established by the Comision Ambiental Metropolitana of Mexico City to monitor compliance with ambient air quality standards; this network is part of the Sistema de Monitoreo Atmosférico (SIMAT), a program responsible for ongoing measurements of the main air pollutants in Mexico City.

We constructed a GEV model of the ${\mathrm{PM}}_{10}$ maxima in the Mexico City metropolitan area, using multivariate smoothing functions of spatio-temporal covariates, latitude ($s1$), longitude ($s2$) and time (*t*), grouped into ${X}_{\left(1\right)}$ and meteorological covariates, wind speed ($ws$) and relative humidity ($rh$), grouped into ${X}_{\left(2\right)}$, to fit the trends in the non stationary GEV model. We defined two models to estimate the GEV parameters, the GEV0 model which included the spatio-temporal covariates and the GEV1 model that included the spatio-temporal covariates as well as the meteorological covariates.

According to the proposed approach in Equation (1), we define the joint contribution ${\delta}_{{\theta}_{t}}\left(X\right)$ of the set of covarites *X* corresponding to the ${\theta}_{t}$ parameter of the GEV distribution, as follows

$$\begin{array}{c}{\delta}_{{\mu}_{t}}\left(X\right)={\sum}_{i=1}^{{p}_{1}}{X}_{ti}{\beta}_{1i}+{\sum}_{i=1}^{{p}_{2}}{Z}_{ti}{u}_{1i},\\ {\delta}_{log\phantom{\rule{4pt}{0ex}}{\sigma}_{t}}\left(X\right)={\sum}_{i=1}^{{p}_{1}}{X}_{ti}{\beta}_{2i}+{\sum}_{i=1}^{{p}_{2}}{Z}_{ti}{u}_{2i},\\ {\delta}_{{\kappa}_{t}}\left(X\right)=\kappa \end{array}$$

(6)

The GEV0 model is a baseline model that incorporates the spatio-temporal covariates ${X}_{\left(1\right)}$ using a multivariate spline structure that implicitly includes the interaction between the covariables of the group, as follows

$$\begin{array}{c}{\mu}_{t}={\delta}_{{\mu}_{t}}\left({X}_{\left(1\right)}\right),\\ log\phantom{\rule{4pt}{0ex}}{\sigma}_{t}={\delta}_{log\phantom{\rule{4pt}{0ex}}{\sigma}_{t}}\left({X}_{\left(1\right)}\right),\\ \kappa ={\delta}_{{\kappa}_{t}}\left({X}_{\left(1\right)}\right)\end{array}$$

(7)

The GEV1 model is an extension of the GEV0 model that incorporates in an additive way the meteorological set of covariates ${X}_{\left(2\right)}$ with the structure given by Equation (6).

$$\begin{array}{c}{\mu}_{t}={\delta}_{{\mu}_{t}}\left({X}_{\left(1\right)}\right)+{\delta}_{{\mu}_{t}}\left({X}_{\left(2\right)}\right),\\ log\phantom{\rule{4pt}{0ex}}{\sigma}_{t}={\delta}_{log\phantom{\rule{4pt}{0ex}}{\sigma}_{t}}\left({X}_{\left(1\right)}\right)+{\delta}_{log\phantom{\rule{4pt}{0ex}}{\sigma}_{t}}\left({X}_{\left(2\right)}\right),\\ \kappa ={\delta}_{{\kappa}_{t}}\left({X}_{\left(1\right)}\right)+{\delta}_{{\kappa}_{t}}\left({X}_{\left(2\right)}\right)\end{array}$$

(8)

A descriptive summary of the data is shown at Table 1. Four examples of quarterly block maxima of ${\mathrm{PM}}_{10}$ levels are presented in Figure 3. The U.S. and Mexican standard for ${\mathrm{PM}}_{10}$ pollution levels is 150 $\mathsf{\mu}$g/m${}^{3}$. An analysis of Table 1 shows that in the area of Villa de las Flores, the permissible level was exceeded by more than three quarters of all measured extreme values. At three of the monitoring stations, CUA = Cuautitlán, NET = Netzahualcoyotl and XAL = Xalostoc, all observations exceeded the permitted standard level, this is because these locations have a high population density and more concentrated industry. The peak ${\mathrm{PM}}_{10}$ concentrations exceeded 1000 $\mathsf{\mu}$g/m${}^{3}$ at CES = Cerro de la Estrella, MER = Merced, SAG = San Agustín, VIF = Villa de las Flores and XAL = Xalostoc; at SAG = San Agustín station, the level recorded was 10 times higher than the recommended safe limit.

Example of quarterly block maxima of ${\mathrm{PM}}_{10}$ levels at (**a**) VIF = Villa Flores; (**b**) MER = Merced; (**c**) XAL = Xalostoc and (**d**) TLA = Tlanepantla. Linear trend over time are indicated in red.

Descriptive summary information on the extreme values of particulate matter less than 10 micrograms in diameter (${\mathrm{PM}}_{10}$) at the stations considered in the study.

Figure 4 shows boxplots of the ${\mathrm{PM}}_{10}$ maxima at each of the 31 monitoring stations considered in the study. The sites with lower ${\mathrm{PM}}_{10}$ maxima, less than 151 $\mathsf{\mu}$g/m${}^{3}$ on average, are located in MGH = Miguel Hidalgo, MPA = Milpa Alta, AJM = Ajusco Medio and CUA = Cuajimalpa. Relatively well-preserved areas can still be found at these sites, which also have the lowest industrial activity and urban growth in the study area. By contrast, we also found sites with more than 723 $\mathsf{\mu}$g/m${}^{3}$${\mathrm{PM}}_{10}$ on average, such as NET = Netzahualcoyotl, XAL = Xalostoc, CES = Cerro de la Estrella and MER = Merced, which are characterized by high industrial activity and heavily traveled paved and unpaved roads with heavy vehicular traffic [6].

Boxplots of the ${\mathrm{PM}}_{10}$ maxima at 31 monitoring stations in the Mexico City metropolitan area.

In order to determine the significance of the meteorological variables in the non-stationary GEV model, we adjusted the GEV0 and GEV1 models by using the method of maximum likelihood (ML) and penalized maximum likelihood (PML) and we compared these two models through the deviance statistics. The results presented in Table 2 show that the model GEV1 is not statistically better than the baseline model GEV0, therefore the set of meteorological covariates wind speed ($ws$) and relative humidity ($rh$) did not present evidence at the 95% level to be included in the non-stationary model GEV for the modeling of ${\mathrm{PM}}_{10}$ maxima. Table 2 shows the effect of penalization of the parameters in the model, in the case of maximum likelihood method none regularization was performed, therefore ${\ell}_{PL}({b}_{\left(1\right)},{b}_{\left(2\right)},\kappa )$ is considerably greater than the case of the penalized maximum likelihood. The maximum likelihood estimators have desirable statistical properties, however penalized estimators are preferred because they control the overfitting and reduce the instability of the estimates. In this work penalized estimators are used as initial values for the Bayesian hierarchical model, in order to reduce the number of chains necessary to reach the stationary distribution of the sampling MCMC algorithm.

Once the set of covariates involved in the model is determined, a hierarchical Bayesian model was fitted to analyse the ${\mathrm{PM}}_{10}$ maxima in the Mexico City metropolitan area. At the first level, we modeled the ${\mathrm{PM}}_{10}$ maxima using the generalized extreme value distribution; at the second level, we used multivariate smoothing spline functions to model nonstationary spatio-temporal extremes; and finally, at the third level, we assumed a priori distribution functions for the parameters of the model. Unfortunately, the posterior density in Equation (4) is not analytically tractable, so we implemented our model via the random walk Metropolis-Hastings algorithm to draw samples of the unknown parameters.

We performed the analysis using the statistical software *R* (version 3.3.1, R Foundation for Statistical Computing, Vienna, Austria). We fitted our Bayesian hierarchical model with the conditions expressed in Equation (6), setting ${p}_{2}=10$. We ran 70,000 MCMC samples after a burn-in period of 60,000 iterations, with thinning every fifth iteration. A look at the log-likelihood of the chain revealed convergence toward the stationary distribution of the MCMC algorithm.

Evidently, the extreme values of ${\mathrm{PM}}_{10}$ vary over time and from area to area because of local conditions, such as the topographic setting or the local wind. This fact justifies the modeling of the location parameter with respect to time and space. We verified this behavior by means of the estimates presented in Table 3, which show that the magnitudes of the studied maxima tend to decrease over time. Similarly, the boxplots in Figure 4 show that the distribution of the maxima is not the same in all monitored locations; therefore, a suitable model for the ${\mathrm{PM}}_{10}$ maxima must include temporal and spatial variations and their possible interactions. One of the strengths of our study is that the proposed model implicitly includes interactions between covariates, whereas most models consider only the main effects through additive or linear functions.

Estimates and 95% credible intervals of the nonstationary GEV model for the ${\mathrm{PM}}_{10}$ maxima.

The estimates for the parameters of model Equation (1) corresponding to the coefficients of the main effects of the space-time covariates in the location and scale smoothing functions as well as the shape parameter are shown in Table 3. In this case, the vector of parameters ${\beta}_{1}$ corresponds to $\mu $, and the vector ${\beta}_{2}$ corresponds to $log\phantom{\rule{4pt}{0ex}}{\sigma}_{t}$. We noticed that the effect ${\beta}_{\left(1\right)t}$ of time on the location parameter is negative and that the effects ${\beta}_{\left(1\right)s1}$ and ${\beta}_{\left(1\right)s2}$ of the longitude and latitude are both positive. Because of the smoothing term in Equation (6) and the standardization of the covariates used in the model, it is not possible to establish a direct relation between the parameters estimated in Table 1 and the parameters of the GEV distribution in model Equation (6), except through the terms ${\beta}_{\left(1\right)0}$ and ${\beta}_{\left(2\right)0}$, which can be interpreted as the values of the location and scale parameters, respectively, of the GEV distribution in the middle of the studied period and in the center of the geographic area considered in the study.

The spatial smoothing for the location and scale functions is shown in Figure 5. The estimation function for the location parameter in the latitude–longitude coordinate system shows that the ${\mathrm{PM}}_{10}$ maxima tend to increase in the northwesterly direction. The scale parameter also increases in the northwesterly direction, but unlike for the location parameter, a slight decrease is observed in the region close to Iztacalco. The estimate of the shape parameter is 0.1715, and its 95% credible interval is (0.1694, 0.1737).

We have developed and validated a model of extreme values using a robust and solidly-based theory, an important application of the results of our study is through a risk map or return level map. The return level ${Z}_{p}$ is the threshold at which an extreme value is exceeded with probability p, which is expected to occur once every 1/p years [34]. Figure 6 shows the return levels for a return period of 25 years. Accordingly, the trend of increased risk in the northwest of the study area is preserved, with the greatest levels of risk in areas close to VIF = Villa Flores, SAG = San Agustín and ACO = Acolman and a lower level of risk in the area surrounding Pedregal.

Previous studies have analysed extremes of air pollutants using the generalized extreme value distribution [30,36]. However, the majority of these investigations have used information from a single site and consequently have ignored the aspect of spatial variability, which may result in underestimation of the GEV parameters. Several approaches to the spatio-temporal modeling of extreme values [19,28] have been implemented for environmental extreme data analysis. Therefore, on the basis of these studies, we have proposed a model that offers structural advantages for the modeling of extreme values.

Similar to the findings reported by studies in other countries [3], the extreme ${\mathrm{PM}}_{10}$ concentrations studied here exhibited spatial variations within the study area.

These results were consistent with the environmental characteristics of each monitored region: the locations with the highest industrial and population densities showed higher concentrations of ${\mathrm{PM}}_{10}$, and consequently, in these locations, the GEV model yielded high estimates of the location and scale parameters and the 25-year return map showed greater risks of extreme future events.

Most ${\mathrm{PM}}_{10}$ modeling studies on short-term dynamics have found that weather covariables such as wind speed or temperature are significant in the model [37]. This is not always the case in studies with a longer time horizon, where the conclusions for the long term are more robust. One of the reasons is due to the temporality of the data, daily maxima present a different association with respect to the meteorological covariates than the maxima obtained in a longer time window, such as quarterly maxima.

One of the findings of our study is the negative contribution of the effect of time on the trend of ${\mathrm{PM}}_{10}$ maxima; an explanation of this phenomenon is the continuous implementation of new emission control policies and the continuous revision of environmental norms in the Mexico City, which have recently become increasingly strict, mainly by regulating the traffic and restricting the automotive vehicles that can be used i.e., vehicle’s model year. Additionally, the Mexico City government have implemented a website for monitoring air quality (http://www.aire.cdmx.gob.mx/default.php), thus environmental contingency alerts are activated to mitigate the adverse effect of air pollutants on public health.

Recent years have seen a growing interest in the monitoring of ${\mathrm{PM}}_{10}$ air pollution because of the multiple health problems it causes in densely populated areas. In the Mexico City metropolitan area, ${\mathrm{PM}}_{10}$ air pollution often originates from dust from roadways and organic black carbon formed during combustion processes. This has led to is an ongoing public health problem over the past several years affecting a large section of the population. Recently, due to population growth and the increase in the number of combustion vehicles and industries, the ${\mathrm{PM}}_{10}$ air pollution problem has worsened. Therefore, it is extremely important to study the spatio-temporal trend of the ${\mathrm{PM}}_{10}$ maxima and provide this information to the management authorities so that prevention standards and policies can be reviewed and adapted to prevent extreme cases of ${\mathrm{PM}}_{10}$ air pollution.

Because wind speed ($ws$) and relative humidity ($rh$) are statistically significant covariates in most models for the short-term forecast for ${\mathrm{PM}}_{10}$ concentration, we consider including these covariates in the non-stationary GEV model. However by using the deviance statistic and the chi-square test, we found that at a credible level of 95%, the model with these covariates was not statistically better than the model that did not include these variables. Therefore, wind speed and relative humidity were not statistically significant in the non-stationary GEV model with quarterly block maxima of ${\mathrm{PM}}_{10}$ levels. The return levels for a return period of 25 years revealed a clear spatial trend of increased levels of ${\mathrm{PM}}_{10}$ in the northwest study area and a decreasing trend in the extreme values over time.

In this study, we implemented a methodology to analyse the nonstationary extreme data and to perform a spatio-temporal study of the maxima of breathable particulate matter pollution (${\mathrm{PM}}_{10}$) levels in the Mexico City metropolitan area. These ${\mathrm{PM}}_{10}$ levels usually vary in space and time, and can potentially include significant spatio-temporal interactions. Therefore, the commonly used models can underperform the GEV estimates and consequently result in inaccuracies. We achieved this using an analysis of non-stationary extreme values, in which we used a combination of existing traditional statistical techniques. We used the maximum likelihood method to perform a variable selection step, the penalized maximum likelihood estimators to obtain initial values and reduce the convergence time of the MCMC algorithm and fitted the estimators using a Bayesian approach to eliminate potential invalid parameter values. The combination of these statistical techniques gives support and solidity to our results.

We proposed a modification to the Bayesian estimation methods used in the previous studies related to the analysis of extreme values applied to environmental areas. In our model, the covariates are included in the generalized extreme value distribution through multivariate spline functions and, therefore, interactions between the covariates are also considered in the model. We observed that this approach can be easily extended to the modeling of extreme events and the generation of risk maps for air pollution, rainfall, heat waves, wind speed, etc. The results of the simulations conducted led us to conclude that the methodology is adequate and reliable for this type of study. Additionally, as a first attempt, time has been treated as a covariate. Extensions of this work should consider a more general model for spatio-temporal analysis, a specific analysis of the prior distributions of the parameters, and a method for determining the correct number of nodes in the multivariate spline functions.

This work was supported by the Program for Professional Teaching Development (PRODEP) under Grant Number DSA/103.5/16/10481.

Author Contributions

Alejandro Ivan Aguirre-Salado and Humberto Vaquera-Huerta conceived and designed the experiment and, wrote some parts of the paper. Alejandro Ivan Aguirre-Salado designed the programming code for processing the data and wrote the paper. Carlos Arturo Aguirre-Salado and Carlos Soubervielle-Montalvo helped in some spatial analyses and wrote some parts of the paper. Silvia Reyes-Mora, Ana Delia Olvera-Cervantes and Guillermo Arturo Lancho-Romero contributed in the mathematical rationale of the model and wrote some parts of the paper. All authors revised and approved the final version of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

1. Joseph A., Sawant A., Srivastava A. PM_{10} and its impacts on health—A case study in Mumbai. Int. J. Environ. Health Res. 2003;13:207–214. doi: 10.1080/0960312031000098107. [PubMed] [Cross Ref]

2. Kumar P., Jain S., Gurjar B., Sharma P., Khare M., Morawska L., Britter R. New Directions: Can a “blue sky” return to Indian megacities? Atmos. Environ. 2013;71:198–201. doi: 10.1016/j.atmosenv.2013.01.055. [Cross Ref]

3. Wang X., Guo Y., Li G., Zhang Y., Westerdahl D., Jin X., Pan X., Chen L. Spatiotemporal analysis for the effect of ambient particulate matter on cause-specific respiratory mortality in Beijing, China. Environ. Sci. Pollut. Res. 2016;23:10946–10956. doi: 10.1007/s11356-016-6273-5. [PubMed] [Cross Ref]

4. Kim K.H., Kabir E., Kabir S. A review on the human health impact of airborne particulate matter. Environ. Int. 2015;74:136–143. doi: 10.1016/j.envint.2014.10.005. [PubMed] [Cross Ref]

5. Garcia J., Teodoro F., Cerdeira R., Coelho L., Kumar P., Carvalho M. Developing a methodology to predict PM_{10} concentrations in urban areas using generalized linear models. Environ. Technol. 2016;37:2316–2325. doi: 10.1080/09593330.2016.1149228. [PubMed] [Cross Ref]

6. Edgerton S.A., Bian X., Doran J., Fast J.D., Hubbe J.M., Malone E.L., Shaw W.J., Whiteman C.D., Zhong S., Arriaga J., et al. Particulate air pollution in Mexico City: A collaborative research project. J. Air Waste Manag. Assoc. 1999;49:1221–1229. doi: 10.1080/10473289.1999.10463915. [PubMed] [Cross Ref]

7. Thishan Dharshana K., Coowanitwong N. Ambient PM_{10} and respiratory illnesses in Colombo city, Sri Lanka. J. Environ. Sci. Health A. 2008;43:1064–1070. doi: 10.1080/10934520802060035. [PubMed] [Cross Ref]

8. Elbayoumi M., Ramli N.A., Yusof N.F.F.M., Al Madhoun W. Spatial and seasonal variation of particulate matter (PM_{10} and PM_{2.5}) in Middle Eastern classrooms. Atmos. Environ. 2013;80:389–397. doi: 10.1016/j.atmosenv.2013.07.067. [Cross Ref]

9. Paschalidou A.K., Karakitsios S., Kleanthous S., Kassomenos P.A. Forecasting hourly PM_{10} concentration in Cyprus through artificial neural networks and multiple regression models: Implications to local environmental management. Environ. Sci. Pollut. Res. 2011;18:316–327. doi: 10.1007/s11356-010-0375-2. [PubMed] [Cross Ref]

10. El Adlouni S., Ouarda T., Zhang X., Roy R., Bobée B. Generalized maximum likelihood estimators for the nonstationary generalized extreme value model. Water Resour. Res. 2007;43:W03410. doi: 10.1029/2005WR004545. [Cross Ref]

11. Sang H., Gelfand A.E. Continuous spatial process models for spatial extreme values. J. Agric. Biol. Environ. Stat. 2010;15:49–65. doi: 10.1007/s13253-009-0010-1. [Cross Ref]

12. Jenkinson A.F. The frequency distribution of the annual maximum (or minimum) values of meteorological elements. Q. J. R. Meteorol. Soc. 1955;81:158–171. doi: 10.1002/qj.49708134804. [Cross Ref]

13. Smith R.L. Maximum likelihood estimation in a class of non-regular cases. Biometrika. 1985;72:67–92. doi: 10.1093/biomet/72.1.67. [Cross Ref]

14. Hosking J.R.M., Wallis J.R., Wood E.F. Estimation of the generalized extreme-value distribution by the method of probability-weighted moments. Technometrics. 1985;27:251–261. doi: 10.1080/00401706.1985.10488049. [Cross Ref]

15. Hosking J.R. L-moments: Analysis and estimation of distributions using linear combinations of order statistics. J. R. Stat. Soc. Ser. B. 1990;52:105–124.

16. Madsen H., Rasmussen P.F., Rosbjerg D. Comparison of annual maximum series and partial duration series methods for modeling extreme hydrologic events: 1. At-site modeling. Water Resour. Res. 1997;33:747–757. doi: 10.1029/96WR03848. [Cross Ref]

17. Gaetan C., Grigoletto M. A hierarchical model for the analysis of spatial rainfall extremes. J. Agric. Biol. Environ. Stat. 2007;12:434–449. doi: 10.1198/108571107X250193. [Cross Ref]

18. Reich B., Shaby B., Cooley D. A Hierarchical model for serially-dependent Extremes: A study of heat waves in the Western US. J. Agric. Biol. Environ. Stat. 2014;19:119–135. doi: 10.1007/s13253-013-0161-y. [Cross Ref]

19. Cooley D., Sain S.R. Spatial hierarchical modeling of precipitation extremes from a regional climate model. J. Agric. Biol. Environ. Stat. 2010;15:381–402. doi: 10.1007/s13253-010-0023-9. [Cross Ref]

20. Leadbetter M.R., Lindgren G., Rootzen H. Extremes and Related Properties of Random Sequences and Processes. Springer; New York, NY, USA: 1983. p. 336.

21. Wang X.L., Zwiers F.W., Swail V. North Atlantic Ocean wave climate scenarios for the 21st century. J. Clim. 2004;17:2368–2383. doi: 10.1175/1520-0442(2004)017<2368:NAOWCC>2.0.CO;2. [Cross Ref]

22. Kharin V.V., Zwiers F.W. Estimating extremes in transient climate change simulations. J. Clim. 2005;18:1156–1173. doi: 10.1175/JCLI3320.1. [Cross Ref]

23. Weissman I. Estimation of parameters and large quantiles based on the k largest observations. J. Am. Stat. Assoc. 1978;73:812–815.

24. Tawn J. Bivariate extreme value theory: Models and estimation. Biometrika. 1988;75:397–415. doi: 10.1093/biomet/75.3.397. [Cross Ref]

25. Scarf P.A. Estimation for a four parameter generalized extreme value distribution. Commun. Stat. Theory M. 1992;21:2185–2201. doi: 10.1080/03610929208830906. [Cross Ref]

26. Rosen O., Cohen A. Statistical Theory and Computational Aspects of Smoothing. Springer; New York, NY, USA: 1996. Extreme percentile regression; pp. 200–214.

27. Pauli F., Coles S. Penalized likelihood inference in extreme value analyses. J. Appl. Stat. 2001;28:547–560. doi: 10.1080/02664760120047889. [Cross Ref]

28. Bocci C., Caporali E., Petrucci A. Geoadditive modeling for extreme rainfall data. ASTA Adv. Stat. Anal. 2013;97:181–193. doi: 10.1007/s10182-012-0192-7. [Cross Ref]

29. Yee T.W., Stephenson A.G. Vector generalized linear and additive extreme value models. Extremes. 2007;10:1–19. doi: 10.1007/s10687-007-0032-4. [Cross Ref]

30. Rodriguez S., Reyes H., Perez P., Vaquera H. Selection of a subset of meteorological variables for ozone analysis: Case study of pedregal station in Mexico City. Environ. Sci. Eng. A. 2012;1:11–20.

31. Cannon A.J. A flexible nonlinear modelling framework for nonstationary generalized extreme value analysis in hydroclimatology. Hydrol. Process. 2010;24:673–685. doi: 10.1002/hyp.7506. [Cross Ref]

32. Jauregui E. Local wind and air pollution interaction in the Mexico basin. Atmósfera. 1988;1:131–140.

33. Fisher R.A., Tippett L.H.C. Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proc. Camb. Philos. Soc. 1928;24:180–190. doi: 10.1017/S0305004100015681. [Cross Ref]

34. Coles S. An Introduction to Statistical Modeling of Extreme Values. Springer; London, UK: 2001.

35. Figueiredo M.A. On Gaussian radial basis function approximations: Interpretation, extensions, and learning strategies; Proceedings of the 15th International Conference on Pattern Recognition (ICPR-2000); Barcelona, Spain. 3–7 September 2000; pp. 618–621.

36. Kütchenhoff H., Thamerus M. Extreme value analysis of Munich air pollution data. Environ. Ecol. Stat. 1996;3:127–141. doi: 10.1007/BF02427858. [Cross Ref]

37. Lin C.Y., Chiang M.L., Lin C.Y. Empirical model for evaluating PM_{10} concentration caused by river dust episodes. Int. J. Environ. Res. Public Health. 2016;13:553 doi: 10.3390/ijerph13060553. [PMC free article] [PubMed] [Cross Ref]

Articles from International Journal of Environmental Research and Public Health are provided here courtesy of **Multidisciplinary Digital Publishing Institute (MDPI)**