Details of the simulated GAW17 data set can be found in this same issue [5]. I defined variants with a minor allele frequency (MAF) less than 0.01 as rare but potentially the most biologically interesting, because extremely rare mutations are expected to have the greatest deleterious effects on phenotype. Of all the SNPs provided in the data set, 73% (18,131) fall within this MAF range. For each gene, I applied the collapsing procedure described in the CMC method [2], grouping rare SNPs into one of two bins defined by their predicted impact on protein (i.e., synonymous or nonsynonymous variant). Any bin with a MAF less than 0.01 after the collapsing procedure was excluded from further analysis. Common SNPs, defined as those having a MAF ≥ 0.01, were not collapsed with any other SNPs in the gene. For conciseness, I use the term *variable* to denote either a single common SNP or a SNP bin. The final marker panel included 7,385 variables: 1,029 bins containing collapsed rare variants and 6,356 individual common SNPs. I experimented with higher threshold values for bin definition (e.g., MAF = 0.05), but this strategy did not recover an appreciable number of bins from the filtering step because most genes in the data set were small and harbored private mutations. True log relative risks (denoted **β**) for each SNP are provided in the simulation answer key, which quantifies each SNP's effect on the quantitative traits Q1 and Q2. Thus, to assess how accurately my method can recover the true values of **β** at each SNP, I constrained the analyses to models where the outcome phenotype was either trait Q1 or Q2.
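The collapsing step can be sketched in code. The function and variable names below are my own illustration of the CMC-style grouping described above, not the actual GAW17 pipeline:

```python
import numpy as np

def collapse_gene(genotypes, mafs, is_nonsyn, rare_cutoff=0.01):
    """CMC-style collapsing for one gene (an illustrative sketch).

    genotypes : (n_subjects, n_snps) array of minor-allele counts
    mafs      : (n_snps,) array of minor allele frequencies
    is_nonsyn : (n_snps,) boolean array, True for nonsynonymous variants

    Rare SNPs (MAF < rare_cutoff) are grouped into two bins by predicted
    protein impact; a bin scores 1 for a subject carrying any rare allele
    in it. Common SNPs are kept as individual variables.
    """
    rare = mafs < rare_cutoff
    variables = {}
    for label, mask in (("nonsyn_bin", rare & is_nonsyn),
                        ("syn_bin", rare & ~is_nonsyn)):
        if mask.any():
            bin_score = (genotypes[:, mask].sum(axis=1) > 0).astype(int)
            # bins still below the frequency cutoff after collapsing
            # are discarded, as in the filtering step described above
            if bin_score.mean() >= rare_cutoff:
                variables[label] = bin_score
    for j in np.where(~rare)[0]:
        variables[f"snp{j}"] = genotypes[:, j]
    return variables
```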

The statistical model I used was a two-level hierarchical model, described in detail by Chen and Thomas [6]. One property of a hierarchical model that is appealing when analyzing variants of low frequency, where maximum-likelihood estimates (MLEs) of association can be highly unstable, is the ability to smooth these point estimates (and their variances) toward prior distributions defined in a second level. At the first level, I apply ordinary least-squares regression, which produces MLEs of association between a continuous trait (i.e., either Q1 or Q2) and a random set of *m* model variables. A design matrix **X** stores the variable values, and the vector **Y** stores values of Q1 or Q2 across all individuals:

**Y** = **X** **β** + **e**,  **e** ~ *N*(**0**, *σ*_{e}^{2} **I**). (1)
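As a sketch of this first level, the MLEs can be obtained by least squares; the simulated genotypes and effect sizes below are my own choices, not the GAW17 values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 500, 4                                   # subjects, model variables
X = rng.binomial(2, 0.3, size=(n, m)).astype(float)  # genotype scores 0/1/2
beta_true = np.array([0.5, 0.0, -0.3, 0.0])     # hypothetical true effects
Y = X @ beta_true + rng.normal(0.0, 1.0, n)     # quantitative trait (e.g., Q1)

# first-level MLEs of beta via ordinary least squares (intercept added)
X1 = np.column_stack([np.ones(n), X])
beta_hat = np.linalg.lstsq(X1, Y, rcond=None)[0][1:]
```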

I define a prior distribution on **β** in Eq. (1) using the annotation information provided by GAW17. For variable *k*, **β**_{k} is distributed as a mixed-effects model, originally defined by Besag et al. [7] as:

**β**_{k} = **Z**_{k} *π* + *λ*_{k} + *θ*_{k}, (2)

where the latent fixed effect is *π* and the random-effects components are:

*λ*_{k} ~ *N*(*μ*_{k}, *τ*^{2}/*v*_{k}),  *θ*_{k} ~ *N*(0, *σ*^{2}).

The **Z** matrix stores external knowledge about each of the *m* variables currently in the model. To encode my belief that deleterious mutations would have higher or lower values of **β** relative to other types of mutations, I assigned a value of 1 in the second column (after reserving the first column as the intercept) of the *m* × 2 design matrix **Z** for nonsynonymous mutations and a value of 0 for any other SNP category. The term *π*, estimated using ordinary linear regression, relates the magnitude of **β** in Eq. (1) to values in **Z**. Furthermore, to encode my belief that mutations within the same gene should have similar effects on disease, I specified an indicator encoding whether predictor *k* and any other model variable are in the same gene by means of an *m* × *m* adjacency matrix

**A**. Specifically, the parameter *μ*_{k} stores the mean of the first-level MLEs of **β**, taken across the neighbors of variable *k* (i.e., all other variables that are in the same gene) as defined by **A**. The variance term *τ*^{2} is scaled inversely by *v*_{k}, the number of neighbors of *k*, which weights the uncertainty about the neighborhood mean. Finally, *θ*_{k} soaks up any remaining variation in the second level of the model through the variance term *σ*^{2}.
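Putting the second-level ingredients together, **Z**, **A**, and the neighbor-mean shrinkage target can be computed directly; the gene labels, annotations, and MLE values below are invented to show the mechanics:

```python
import numpy as np

# hypothetical annotation for m = 4 model variables
genes    = ["G1", "G1", "G2", "G2"]         # gene membership
nonsyn   = [1, 0, 0, 1]                     # 1 = nonsynonymous bin/SNP
beta_hat = np.array([0.8, 0.6, 0.1, -0.2])  # first-level MLEs (made up)
tau2 = 0.5                                  # second-level variance hyperparameter

m = len(genes)
Z = np.column_stack([np.ones(m), nonsyn])   # m x 2: intercept + mutation class
A = np.array([[int(i != j and genes[i] == genes[j]) for j in range(m)]
              for i in range(m)])           # m x m same-gene adjacency

v = A.sum(axis=1)                           # neighbor counts v_k
mu = (A @ beta_hat) / v                     # mean of neighbors' MLEs (mu_k)
prior_var = tau2 / v                        # variance shrinks as neighbors grow
```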

A posterior density is defined on the basis of the likelihood and normal density functions corresponding to the first (Eq. (1)) and second (Eq. (2)) levels of the hierarchical model. I use the product of this density function and a model transition function as the objective function of a reversible-jump Markov chain Monte Carlo (MCMC) algorithm to stochastically explore the search space, fitting all possible sets of model variables to the data. The model transition kernel itself is informed through empirical Bayes estimates of the hyperparameters (e.g., *π*), so that regions of the search space that have strong empirical support and prior evidence are prioritized. Further details on how the variable selection algorithm works can be found in Chen and Thomas [6].
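The flavor of this stochastic search can be conveyed with a toy Metropolis sampler over inclusion indicators. This is a deliberate simplification, using a made-up fit-plus-penalty score rather than the hierarchical posterior and transition kernel of Chen and Thomas:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 300, 6
X = rng.normal(size=(n, m))
Y = 0.8 * X[:, 0] + rng.normal(size=n)       # only variable 0 is truly associated

def log_score(gamma):
    """Toy log model score: profile-likelihood fit plus a complexity penalty."""
    idx = np.flatnonzero(gamma)
    if idx.size == 0:
        resid = Y
    else:
        coef = np.linalg.lstsq(X[:, idx], Y, rcond=None)[0]
        resid = Y - X[:, idx] @ coef
    return -0.5 * n * np.log(resid @ resid) - 2.0 * idx.size

gamma = np.zeros(m, dtype=int)               # current inclusion indicators
incl = np.zeros(m)
n_iter = 2000
for _ in range(n_iter):
    prop = gamma.copy()
    prop[rng.integers(m)] ^= 1               # propose adding or dropping one variable
    if np.log(rng.random()) < log_score(prop) - log_score(gamma):
        gamma = prop                         # Metropolis accept/reject
    incl += gamma
incl /= n_iter                               # crude posterior inclusion frequencies
```

The truly associated variable should dominate the visited models, while null variables drift in and out only briefly.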

In the next section I present results comparing a more conventional method with my proposed method. The first method is an ordinary least-squares regression between the quantitative trait (i.e., Q1 or Q2) and each vector of variable scores, which I denote as the MLE method. This approach is equivalent to a conventional genome-wide association scan testing for marginal effects. I compared this to four variations of the multivariate MCMC method. Specifically, I varied the degree of informativeness in the prior distribution by modifying the definitions of the matrices **A** and **Z**. The most informative prior distribution (denoted FULL) stores both gene membership and SNP mutation type information in the **A** and **Z** matrices, respectively. In the second variant of the prior distribution (denoted **Z** only), I removed gene membership information so that matrix **A** was simply the identity matrix. Conversely, in the third variant of the prior distribution (denoted **A** only), I removed mutation class information so that the **Z** matrix included only the intercept. The last variant of the prior distribution (denoted UNINF) includes both the uninformative **Z** and **A** matrices and is equivalent to the ridge-style prior distribution (i.e., **β** ~ *N*(0, *σ*^{2})).
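The four prior variants differ only in how **Z** and **A** are constructed, so they can be generated from one helper; the function name and inputs below are my own sketch:

```python
import numpy as np

def prior_matrices(genes, nonsyn, variant="FULL"):
    """Return (Z, A) for one of the four prior variants (illustrative sketch)."""
    m = len(genes)
    Z = np.column_stack([np.ones(m), nonsyn])    # intercept + mutation class
    A = np.array([[int(i != j and genes[i] == genes[j]) for j in range(m)]
                  for i in range(m)])            # same-gene adjacency
    if variant in ("Z only", "UNINF"):
        A = np.eye(m, dtype=int)                 # drop gene membership
    if variant in ("A only", "UNINF"):
        Z = np.ones((m, 1))                      # intercept only, drop mutation class
    return Z, A
```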

For each of the MCMC analyses, I sampled 2 million realizations from the posterior distribution, retaining statistics on only the last million realizations to minimize any correlation with the initial parameter values. Run time on a 2-GHz Xeon processor was approximately 8 h. I verified that the retained statistics had reached convergence by comparing their distributions across multiple chains, requiring a nonsignificant *p*-value from the Kolmogorov-Smirnov test.
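One way to run such a cross-chain comparison is `scipy.stats.ks_2samp`; the chains below are simulated stand-ins for retained draws of a single statistic:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
# retained draws of one statistic from two independent chains (simulated here)
chain1 = rng.normal(0.0, 1.0, 5000)
chain2 = rng.normal(0.0, 1.0, 5000)

# two-sample Kolmogorov-Smirnov test of whether the empirical
# distributions differ; a nonsignificant p-value suggests agreement
stat, p = ks_2samp(chain1, chain2)
converged = p > 0.05
```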

To quantify evidence for any specific variable (either common SNP or SNP bin), I empirically estimated Bayes factors (BFs) for each variable by dividing the posterior odds by the prior odds, as described by Chen and Thomas [6]. BFs quantify the increase in evidence for a hypothesis (in this case, inclusion of a variable into the model) in light of observed data relative to a prior hypothesis [8].
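The posterior-odds-to-prior-odds calculation is simple arithmetic; the draw counts and prior inclusion probability below are hypothetical:

```python
# empirical Bayes factor for one variable: posterior odds over prior odds
n_draws = 1_000_000          # retained MCMC realizations (hypothetical)
n_included = 40_000          # draws in which the variable was in the model
prior_prob = 0.01            # prior inclusion probability (hypothetical)

post_prob = n_included / n_draws
posterior_odds = post_prob / (1 - post_prob)
prior_odds = prior_prob / (1 - prior_prob)
bf = posterior_odds / prior_odds
```

A BF well above 1 indicates that the data have increased support for including the variable relative to the prior.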