3.2. Multimodel Combination Based on Predictor State Space—Algorithm

Let us suppose that we have streamflow forecasts from *M* different models, where *m* = 1, 2, …, *M* denotes the model, *i* = 1, 2, …, *N*_{m} indexes the ensemble members representing the conditional distribution of streamflows, with *N*_{m} denoting the total number of ensembles under each model, and *t* denotes the time (season/month) for which the forecast is issued. Assume that the forecasts are available for a total of *t* = 1, 2, …, *n* years and that the models share a common predictor vector, *X*_{t} = [*x*_{1t} *x*_{2t} … *x*_{pt}], which influences the conditional distribution of the hydroclimatic attributes represented by the ensembles. The predictor could be either SST conditions modulating the flow/moisture pathways for the given site, or a winter snow pack that could potentially determine the spring-melt flows. The accompanying figure provides a flow chart indicating the steps in implementing the proposed multimodel combination conditioned on the predictor state. It is important to note that the proposed approach requires at least one common predictor among the *M* competing models. Even if the models do not have a common predictor, particularly in the context of GCM forecasts, one could use the leading principal components (PCs) of the underlying boundary conditions (for instance, SSTs) as the common predictor across all the models. As mentioned before, developing multimodel ensembles based on the optimal combination method requires the observed climatic/streamflow variables *O*_{t}, with which one could assess the skill of the probabilistic forecasts based on the rank probability score (RPS) [*Murphy*, 1970; *Candille and Talagrand*, 2005; *Anderson*, 1996] to obtain the weights. It is important to note that the RPS is evaluated for each year using the ensembles (*N*_{m} = 10000) representing the conditional distribution, which is quite different from correlation, for which one needs a minimum of two years of forecasts. The main advantage of using the RPS is that it quantifies the error in estimating the entire conditional distribution, whereas measures such as correlation and root-mean-square error (RMSE) consider only the error in predicting the conditional mean. One could also employ the continuous rank probability score, which compares the observed event with the forecast probabilities using a parametric functional form [*Candille and Talagrand*, 2005; *Wilks*, 1995]. The rank probability skill score (RPSS) represents the level of improvement of the forecast RPS over the RPS of a reference forecast, which is usually taken to be climatology.

Appendix A provides details on obtaining the RPS and RPSS for given probabilistic forecasts.
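The RPS evaluation for a single year can be sketched as follows. This is a minimal illustration, not a reproduction of Appendix A: it assumes categorical (e.g., tercile) forecasts, and the function names and category edges are chosen for exposition.

```python
from bisect import bisect_right

def rank_probability_score(ensemble, observed, edges):
    """RPS of one ensemble forecast against one observation.

    ensemble : ensemble members (e.g., N_m = 10000 streamflow values)
    observed : the observed streamflow for that year
    edges    : category boundaries (e.g., tercile edges from climatology)
    """
    n_cat = len(edges) + 1
    # Forecast probability of each category, estimated from the ensemble
    counts = [0] * n_cat
    for q in ensemble:
        counts[bisect_right(edges, q)] += 1
    p_forecast = [c / len(ensemble) for c in counts]
    obs_cat = bisect_right(edges, observed)
    # RPS: squared error between the cumulative forecast distribution and
    # the cumulative step function of the observed category
    rps, cum_f = 0.0, 0.0
    for k in range(n_cat):
        cum_f += p_forecast[k]
        cum_obs = 1.0 if k >= obs_cat else 0.0
        rps += (cum_f - cum_obs) ** 2
    return rps

def rpss(rps_forecast, rps_reference):
    """Skill score relative to a reference (usually climatology) forecast."""
    return 1.0 - rps_forecast / rps_reference
```

A forecast whose entire ensemble falls in the observed category scores an RPS of zero, while concentrating the ensemble in the wrong category is penalized by the cumulative squared differences.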

Let us denote the RPS and RPSS of the probabilistic forecasts for each time step *t* and model *m* as RPS_{t,m} and RPSS_{t,m}, respectively. For the purpose of multimodel combination, we assess the skill of each model by analyzing its predictability under similar climatic conditions. One could identify similar predictor conditions in the predictor state space by choosing a metric that computes the distance between the current predictor state, *X*_{t}, and the historical predictor vectors, *X*. For instance, if the current state of the predictors indicates El Niño conditions, then past El Niño conditions could be considered as similar conditions. The distance metric could be either the Euclidean distance or a more generalized distance measure such as the Mahalanobis distance, which is more useful if the predictors exhibit correlation among themselves. We compute the distance *d*_{tl} between the current conditioning state, *X*_{t}, and the historical predictor vector, *X*_{l}, using the Mahalanobis distance:

*d*_{tl} = [(*X*_{t} − *X*_{l}) *S*^{−1} (*X*_{t} − *X*_{l})^{T}]^{1/2}    (1)

where *S* denotes the variance-covariance matrix of the historical predictor vector *X*. One can note that if *l* = *t*, the distance metric *d*_{tl} reduces to zero. Similarly, if *X*_{t} consists of the principal components of the original SST fields, then the Mahalanobis distance reduces to the Euclidean distance. Using the distance vector *d*, the ordered set of nearest-neighbor indices *J* can be identified; the *j*th element of *J* gives the *j*th closest historical state *X*_{l} to the current state *X*_{t}. Using this information, we assess the performance of each model in the predictor state space as

RPS̄_{t,m} = (1/*K*) ∑_{j=1}^{K} RPS_{J(j),m}    (2)

where RPS_{J(j),m} denotes the skill of forecasting model *m* for the year that represents the *j*th closest condition (obtained from *J*) to the current condition *X*_{t}. In other words, RPS̄_{t,m} summarizes the average skill of forecasting model *m* over the *K* years that most closely resemble the current condition, *X*_{t}. Using RPS̄_{t,m} obtained for each model at each time step, we obtain the weights for the multimodel combination so that a model with better performance under particular climatic conditions is represented with more ensemble members than a model with lower predictability under those conditions. It is important to note that the RPS is a measure of error in predicting the probabilities, and it is evaluated based on the entire ensemble that represents the conditional distribution of streamflows. We can use the average skill of the model, RPS̄_{t,m}, to obtain the weights *w*_{t,m} for each model based on equation (3):

*w*_{t,m} = (1/RPS̄_{t,m}) / ∑_{m=1}^{M} (1/RPS̄_{t,m})    (3)
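The neighbor identification and weighting steps can be sketched as follows. This is a minimal illustration: the Euclidean distance is used (which, as noted above, coincides with the Mahalanobis distance for principal-component predictors), the inverse-average-RPS weighting is the form implied by equation (3), and the function names are ours.

```python
def nearest_neighbor_indices(X_hist, x_current):
    """Order historical years by distance to the current predictor state.

    Euclidean distance is used here; for principal-component predictors
    this is equivalent to the Mahalanobis distance of equation (1).
    Returns the ordered neighbor index set J.
    """
    d = [sum((a - b) ** 2 for a, b in zip(x, x_current)) ** 0.5
         for x in X_hist]
    return sorted(range(len(X_hist)), key=lambda l: d[l])

def model_weights(rps, J, K):
    """Weights for each model from its average RPS over the K neighbors.

    rps[m][l] is the RPS of model m in historical year l.  Weights are
    inversely proportional to the K-neighbor average RPS (equation (3));
    models with zero average RPS split the total weight equally.
    """
    avg = [sum(r[j] for j in J[:K]) / K for r in rps]  # equation (2)
    zero = [m for m, a in enumerate(avg) if a == 0.0]
    if zero:  # models with perfect skill get all the weight, shared equally
        return [1.0 / len(zero) if m in zero else 0.0
                for m in range(len(rps))]
    inv = [1.0 / a for a in avg]
    s = sum(inv)
    return [v / s for v in inv]
```

For example, with two models whose average RPS over the *K* neighbors is 0.2 and 0.4, the weights come out 2/3 and 1/3; if one model has an average RPS of zero, it receives the full weight.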

If RPS̄_{t,m} is zero for a subset *M*_{1} ≤ *M* of the models, then the weights are distributed equally (1.0/*M*_{1}) among the models for which RPS̄_{t,m} is zero, and the remaining (*M* − *M*_{1}) models are assigned zero weight. For instance, if two models out of five have an average RPS of zero taken over the *K* neighbors, then these two models will each be assigned a weight of 0.5 and the remaining three models will be assigned a weight of zero. The multimodel forecast for each time step could then be developed by drawing ensemble members from each model in proportion to its weight to constitute the multimodel ensembles. Thus one has to specify the number of neighbors *K* to implement this approach. It is also important to note that choosing a small *K* does not imply that the multimodel forecasts are developed using only the observed predictand and predictors from the identified *K* similar conditions. In fact, the forecasts are developed based on the observed values of the predictand and predictor over a particular training period (for leave-one-out cross-validated forecasts, we use *n* − 1 years of record as the training period; for adaptive forecasts, we develop the model using *n*_{1} years and consider the remaining (*n* − *n*_{1}) years for validation). Thus we use the weights only to draw the ensembles from the individual model forecasts, which are themselves obtained from the chosen training period. The simplest approach for selecting the number of neighbors is to find a fixed *K* that corresponds to the lowest average RPS of the multimodel forecast. The other approach is to choose a different number of neighbors, *K*_{t}, for each year. Under this approach, "Varying *K*_{t}", the chosen *K*_{t} corresponds to the minimum RPS that could be obtained from the multimodel ensembles for that year. Obviously, "Varying *K*_{t}" will ensure the lowest average RPS over the entire forecast period, but it is computationally intensive. The performance of the multimodel ensembles is compared with each individual model's predictive skill using various verification measures, such as the average RPS, the average RPSS, and the anomaly correlation.
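Putting the pieces together, drawing the multimodel ensemble in proportion to the weights and selecting a fixed *K* can be sketched as below. The function names are illustrative, and the resampling-with-replacement scheme for drawing members is an assumption rather than the paper's stated procedure.

```python
import random

def multimodel_ensemble(forecasts, weights, size=10000, seed=0):
    """Draw a combined ensemble, sampling each model per its weight.

    forecasts : list of per-model ensembles (lists of streamflow values)
    weights   : per-model weights from the K-neighbor skill assessment
    size      : total number of multimodel ensemble members to draw
    """
    rng = random.Random(seed)
    combined = []
    for ens, w in zip(forecasts, weights):
        n_draw = round(w * size)  # members contributed by this model
        combined.extend(rng.choice(ens) for _ in range(n_draw))
    return combined

def select_fixed_K(candidate_Ks, avg_rps_for_K):
    """Fixed-K selection: the K giving the lowest average multimodel RPS.

    avg_rps_for_K(K) is assumed to rebuild the multimodel forecast for
    every year with that K and return the RPS averaged over all years.
    """
    return min(candidate_Ks, key=avg_rps_for_K)
```

The "Varying *K*_{t}" strategy would instead call the inner minimization once per year, which is why it is the more computationally intensive option.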