The process by which we arrived at the model outlined above involved an iterative procedure of model checking and evaluation. Our understanding of the biology coupled with results of an exploratory analysis (results not reported here) suggested that the log ratio measurements depend on batch, context and variant. We began with a model for the individual replicates, fυrbc
, but found that it was sufficient to analyze the mean log ratio measurement for each variant within a batch, υbc
. This greatly simplifies the model, but does not affect our substantive conclusions (the correlation between estimates of Pr(Dυ
Data) derived from the two models is 0.9998). The batch and variant effects βbc
are not identifiable through the model's likelihood, but can be made so either by specifying a prior distribution that is informative for one of their means or by setting one effect to a constant. We experimented with several such choices and found that model fit was improved by fixing the WT effect. , panel (a) is a normal quantile–quantile plot of standardized residuals from our final model, averaged over posterior parameter uncertainty. A simultaneous 95% interval estimate for the empirical quantiles (green lines) includes the observed quantiles, indicating that the error structure of the model accurately describes residual variability in the data.
Figure 1 Model goodness of fit. Panel (a): Normal quantile–quantile plot of standardized residuals from our fit of the functional assay data. Standardized residuals were averaged over posterior uncertainty in the model's parameters. The green lines denote (more ...)
In order to provide a basis for comparison with the Bayesian model, we fit a fixed effects regression of fυrbc on the factors variant and batch. This analysis provided us with least squares estimates of the variant effects adjusted for batch. , panel (b), plots posterior means of the variant–level effects from the Bayesian hierarchical model against their corresponding least squares estimates. In this plot, the black line is the diagonal and points are highlighted according to whether the variant was studied in AA context 1396 to 1863 only (green), in AA context 1560 to 1863 only (blue) or both (teal). A modest amount of Bayesian shrinkage to the component–specific means (depicted by a black dashed line) is evident. This effect is highlighted by the red lines which plot the least squares fits among variants with a posterior mean greater than 3.0 (likely members of the protein neutral component) and less than 1.2 (likely members of the protein damaging component). Despite the shrinkage, the Bayesian estimates remain highly correlated (0.9964) with their frequentist fixed effects counterparts, suggesting that influence of the model's higher level parameters is negligible and that the model is not over–fitting the data.
summarizes the marginal prior and posterior distributions of the model's higher level parameters, i.e. those not indexed by batch or variant. In all cases, the prior distributions are diffuse and relatively non–informative while the marginal posterior distributions are concentrated. Inferences regarding the pathogenicity of the 76 VUSs are driven by the two component normal mixture model for the variant–specific random effects, ηυ. With the benign variants centered at the WT effect (assumed to be 3.8 for purposes of model identifiability), we estimate that the variants damaging to protein function have effects centered at -0.025 with a standard deviation of 0.226 adjusting for batch and context. There is considerable batch–to–batch variability in the average level of the log–ratio data: we estimate the standard deviation of the batch random effects distribution, λ, to be 1.381. Estimates of these parameters were insensitive to the scaling factor a used in the prior distributions: in both cases the posterior means estimated for these parameters in the sensitivity analyses had correlation > 0.999 with those reported in .
Table 3 Summaries of the marginal posterior (columns 2–4) and marginal prior (columns 5–6) distributions of the model's parameters excluding the batch– and variant–specific random effects and classification indicators. Columns (more ...)
provides a graphical summary of the variant–specific effects and of their mixture model marginal distribution. In this plot, each variant is represented by a boxplot summarizing the marginal posterior distribution of its random effect, ηυ. The wildtype variant is plotted in green, the known pathogenic variants are plotted in red, the VUSs are plotted in blue and the variants previously classified as pathogenic/benign and included in the model blinded to this information are plotted in orange/yellow. A point estimate of the mixture model is plotted in the right margin. Its left (upper) component corresponds to benign variants, while its right (lower) component corresponds to damaging variants; estimates of the means of the components are plotted as green and red dotted lines, respectively. The WT is the only assumed known benign variant in this analysis and, as such, serves to anchor its component. Its variant–specific effect is fixed at 3.8 as described above. There are four known pathogenic variants that serve to anchor the component representing protein damaging variants. These have variant-specific effects ranging from -0.76 (5673insC) up to 0.68 (W1837X). As the plot suggests, the damaging variant effects are more variable than their benign counterparts (the standard deviation of their component is estimated to be 2.30 times as large as the standard deviation of the benign variant component).
Figure 2 Boxplots summarizing the marginal posterior distributions of the variant–specific effect parameters, ηυ. The wildtype variant is indicated in green, the known pathogenic variants in red and the VUSs in blue. VUSs previously classified (more ...)
provides variant–specific summaries from the analysis. Each variant is labeled according to the classification assumed for it prior to the analysis: pathogenic, neutral, VUS, VUS previously classified as “neutral” with an estimated posterior probability of pathogenicity less than 0.0001, VUS previously classified as pathogenic with an estimated posterior probability greater than 0.999, and VUS previously classified as pathogenic with an estimated posterior probability of between 0.99 and 0.999. The last three classes of VUS were included in the analysis “blinded” to their prior classification for purposes of model evaluation. In addition, the table reports the posterior probabilities and the natural logarithm of the posterior odds for each variant being damaging. Estimates of these posterior probabilities were also insensitive to the scaling factor a used in the prior distributions: in both cases the estimates of these probabilities in the sensitivity analyses had correlation > 0.999 with those reported in .
For purposes of this analysis, we assumed that the prior probability of a variant being protein damaging for each VUS is 0.5 (see Equation 3
). This is a common “non–informative” choice when there is no prior data to inform the distribution of a binary variable. Our choice to isolate this analysis from the influence of other variables allows us to clearly summarize the utility of the functional data as a form of evidence for classifying VUSs. It also allows one to interpret the posterior odds of variant being damaging as a Bayes factor (BF) (23
), a Bayesian integrated likelihood ratio statistic that measures the degree to which the data — in this case the functional annotation measurements — support the hypothesis that a variant is protein damaging. Jeffreys (24
) provides a commonly cited metric for judging BFs in which BFs between 1 and 3.2 (0 and 1.16 on the loge
scale used in columns 4 and 8 in ) provide ‘indeterminate’ evidence in favor of the hypothesis, those between 3.2 and 10 (1.16 and 2.30) provide ‘positive’ evidence of the hypothesis, those between 10 and 31.2 (2.30 and 3.44) provide ‘strong’ evidence of the hypothesis, those between 31.2 and 100 (3.44 and 4.61) provide ‘very strong’ evidence of the hypothesis and those above 100 (4.61) provide ‘decisive’ evidence of the hypothesis. The BF is symmetric: its multiplicative inverse provides a comparable measure of evidence against
the hypothesis. Hence, for example, a BF between 1/100 and 1/31.2 (-4.61 and -3.44) provides ‘very strong’ evidence against the hypothesis on Jeffreys' scale of evidence.
The primary goal of this analysis is to determine how informative and accurate the functional data are for classifying VUSs before including it in a composite model that incorporates family, co–segregation, sequence alignment and other forms of data. Our analysis suggests that the functional data are very informative. Indeed, 65 of 76 VUSs have BFs, either in favor of or against the variant being protein damaging, in the ‘decisive’ range. Under classification scheme S1, 27 variants would be classified as damaging, 43 as neutral and six would remain unclassified; while under the much more stringent scheme S2, 27 variants would be classified as damaging, one as neutral and 48 would be left unclassified.
Further, results for the blinded samples suggest that classification probabilities derived from the model are both sensitive and specific. Our estimates of the posterior probabilities (PPs) of the variants being protein damaging for the VUSs previously classified as neutral range from 0.00089 (S1613G) up to 0.0057 (V1804D); all have BFs falling in the ‘decisive’ category of evidence against the variants being protein damaging. Our PP estimates for the VUSs previously classified as pathogenic by other methods were 0.999 (V1688del), 1.00 (L1764P), 1.00 (T1685I), 1.00 (G1706E), 1.00 (M1775R) and 1.00 (G1738R); their BFs where all in the ‘decisive’ range in favor of being protein damaging. All of these variants are classified correctly (i.e. the same as they had been previously) using classification scheme S1. In particular, our approach is both sensitive (observed relative frequency, RF=6/6, PM=0.93, CI=(0.67,1.00)) and specific (RF=9/9, PM=0.95, CI=(0.76,1.00)) when the variants are classified using this rule. In contrast, while sensitivity remains high (RF=6/6, PM=0.93, CI=(0.67,1.00)), specificity (RF=1/9, PM=0.15, CI=(0.01,0.41)) declines markedly under the more stringent scheme S2. This suggests that, while the functional data are very informative, they will need to be combined with other data, such as in silico and family–based results, in order to meet the more stringent requirements of clinical application.
In principal, a classification scheme with similar operating characteristics could be constructed by applying empirically determined thresholds to estimates of the variant–specific coefficients in the batch adjusted linear model. Classification against thresholds of 1 and 3 (x–axis, , Panel (b)), for example, would correctly call all blinded variants and would correspond to a procedure with estimated sensitivity of 0.93 and specificity of 0.95. Overall, 26 of the 76 variants would be classified as pathogenic and 43 as neutral, while seven would remain unclassified. While the predictive utility of the functional data is evident from these simple summaries of the data, the two component mixture model establishes the formal framework required for inference in this vexing problem. It provides a natural yardstick for classification (the posterior probability) and a modeling approach that can readily be extended to include the other forms of data relevant to classification of VUSs.