gives the AUC when relative risks are 1.1, 1.3 and 1.5, and genotype frequencies are 0.1, 0.2 and 0.3, for 40 and 80 identical markers. As expected, the lowest AUC, 0.55, corresponds to the lowest relative risk, 1.1, and the lowest genotype frequency, 0.1, for 40 markers. When the relative risk is 1.1 for 40 markers, the increase in AUC obtained by increasing the genotype frequency from 0.1 to 0.3 is only 0.03, or 5%; when the relative risk is 1.5, the increase in AUC is 0.09, or 12.5%. Similarly, when the genotype frequency is 0.1 for 40 markers, the increase in AUC obtained by increasing the relative risk from 1.1 to 1.5 is 0.17, or 31%; when the genotype frequency is 0.3, the increase in AUC is 0.23, or 40%. Overall, the AUC increases when the number of markers is doubled to 80, with the smallest increase corresponding to the lowest values of relative risk and genotype frequency and the largest increase corresponding to the highest values of relative risk and genotype frequency. Genotypes with low relative risk (around 1.1) and low genotype frequency (around 0.1) require about 1000 markers in the genomic profile to have reasonable discriminative power (AUC around 0.74).
Area under the ROC curve (AUC) for genotype frequency (G) equal to 0.1, 0.2 and 0.3, and relative risks (R) equal to 1.1, 1.3 and 1.5, for 40 markers (AUC_1) and 80 markers (AUC_2).
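The AUC values quoted for identical markers can be approximated with a short calculation. The sketch below is not taken from the text: it assumes that the genotype frequency G applies to non-cases, that the frequency among cases is G·R/(1 − G + G·R), that markers are independent, and that the marker count is scored with a binormal AUC. The function names are illustrative only.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def auc_identical_markers(g, r, n):
    """Binormal AUC for the count of n independent, identical binary markers.

    Assumptions (not stated in the text): g is the risk-genotype frequency
    in non-cases, and the frequency in cases is g*r / (1 - g + g*r).
    """
    g_case = g * r / (1.0 - g + g * r)
    mean_diff = n * (g_case - g)
    var_sum = n * (g * (1.0 - g) + g_case * (1.0 - g_case))
    return normal_cdf(mean_diff / sqrt(var_sum))


# Grid of relative risks and genotype frequencies for 40 and 80 markers.
for n in (40, 80):
    for r in (1.1, 1.3, 1.5):
        for g in (0.1, 0.2, 0.3):
            auc = auc_identical_markers(g, r, n)
            print(f"n={n} R={r} G={g} AUC={auc:.2f}")
```

Under these assumptions the values agree, to two decimals, with the figures quoted in the paragraph above for 40 markers, and with the AUC of about 0.74 quoted for 1000 markers at R = 1.1 and G = 0.1.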
When the relative risk is 1.1, the genotype frequency 0.1, the disease prevalence 5% and the heritability 5%, the number of markers that contribute to risk of disease is 758 (AUC=0.71); if we increase heritability to 10%, the number of markers that contribute to risk of disease is 1208 (AUC=0.76). With heritability at 5%, when the disease prevalence is increased to 10%, the number of markers declines to 422 (AUC=0.66). When heritability and relative risk are also increased to 10% and 1.5, respectively, the number of markers that contribute to disease declines to 32 (AUC=0.70); when genotype frequency is increased to 0.3, the number of markers declines further to 17 (AUC=0.71). Interestingly, the AUC as a function of heritability and disease prevalence remains approximately the same over the ranges of relative risks (1.1–1.5) and genotype frequencies (0.1–0.3) considered. For example, when heritability is 5% and disease prevalence is 5%, the AUC remains around 0.71; when disease prevalence is increased to 10%, the AUC declines to 0.66. When heritability is 10% and disease prevalence is 5%, the AUC is 0.76; increasing disease prevalence to 10% results in a decline of the AUC to 0.71.
Even when the PAF is 50%, the AUC remains no greater than 0.64 for the ranges of genotype frequency and relative risk considered. When the PAF is 50%, the genotype frequency 0.1 and the relative risk 1.1, the number of markers that contribute to risk of disease is 70 (AUC=0.57). When we increase the relative risk to 1.5, the number of markers declines to 14 (AUC=0.64). This is the maximum AUC obtained when the PAF is 50%. Increasing the genotype frequency to 0.3 results in the number of markers declining further to 5 (AUC=0.62). These results show that, for a given value of PAF, increasing the genotype frequency reduces the number of markers that contribute to risk of disease and lowers the AUC, whereas increasing the relative risk also reduces the number of markers but raises the AUC. When the relative risk is 1.5, the PAF has to be at least 84% to have reasonable discriminative power (AUC around 0.70) for the range of genotype frequency considered. When the relative risk is 1.1 and the genotype frequency 0.3, even with a PAF of 99% the AUC is only 0.65.
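The relationship between PAF, number of markers and AUC can be illustrated numerically. The sketch below rests on assumptions that are not spelled out in this section: each marker has PAF G(R − 1)/(1 + G(R − 1)), the joint PAF of n independent identical markers is 1 − (1 − PAF_marker)^n, the marker count is rounded to the nearest integer, and the AUC is the binormal AUC of the marker count with case frequency G·R/(1 − G + G·R). All function names are hypothetical.

```python
from math import erf, log, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def markers_for_paf(paf, g, r):
    """Number of identical markers needed to reach a joint PAF.

    Assumes joint PAF = 1 - (1 - paf_marker)**n for independent markers,
    with paf_marker = g*(r - 1) / (1 + g*(r - 1)); rounds to nearest integer.
    """
    paf_marker = g * (r - 1.0) / (1.0 + g * (r - 1.0))
    return round(log(1.0 - paf) / log(1.0 - paf_marker))


def auc_identical_markers(g, r, n):
    """Binormal AUC for the count of n independent, identical markers.

    Assumes g is the frequency in non-cases and g*r / (1 - g + g*r) in cases.
    """
    g_case = g * r / (1.0 - g + g * r)
    mean_diff = n * (g_case - g)
    var_sum = n * (g * (1.0 - g) + g_case * (1.0 - g_case))
    return normal_cdf(mean_diff / sqrt(var_sum))


# PAF fixed at 50%, varying genotype frequency and relative risk.
for g, r in ((0.1, 1.1), (0.1, 1.5), (0.3, 1.5)):
    n = markers_for_paf(0.5, g, r)
    print(f"G={g} R={r} markers={n} AUC={auc_identical_markers(g, r, n):.2f}")
```

With these assumptions the three cases give 70, 14 and 5 markers, in line with the counts reported above, which suggests that the assumed forms are at least close to those used in the calculations.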
When genotype frequencies and relative risks differ between markers, we used the formula derived for the best linear combination of markers to calculate the AUC. We also simulated one million observations for a case-control study with specified genotype frequencies in cases and controls for multiple markers, assuming a disease prevalence of 10%. The logistic procedure in SAS (Cary, NC, USA) was used to calculate the concordance statistic for the simulated data. The three tables below give the AUC calculated using the best linear combination of markers (AUCB) and the AUC calculated from the logistic procedure (AUCS) for different values of relative risks and different genotype frequencies for two, three and five markers, respectively. For two or three markers, AUCB is always greater than AUCS. This is to be expected because AUCS is based on the empirical ROC curve and AUCB is based on the binormal ROC curve. The maximum difference between AUCS and AUCB was less than 0.024 over the ranges of genotype frequencies (0.1–0.4) and relative risks (1.1–2.0) considered. For five markers, there is almost no difference between AUCB and AUCS. These tables show again that the AUC increases with increasing genotype frequency, increasing relative risk and increasing number of markers in the genomic profile.
Areas under the ROC curves for the best linear combination of markers (AUCB) and for simulated data using logistic regression (AUCS) for two markers
Areas under the ROC curves for the best linear combination of markers (AUCB) and for simulated data using logistic regression (AUCS) for three markers
Areas under the ROC curves for the best linear combination of markers (AUCB) and for simulated data using logistic regression (AUCS) for five markers
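For markers with different genotype frequencies and relative risks, a binormal AUC for the best linear combination can be sketched as follows. This is an illustration under assumptions, not the exact formula used for the tables: markers are taken to be independent, so the covariance matrices are diagonal and the AUC reduces to Φ(√Σ d_i²/(v0_i + v1_i)), where d_i is the case minus non-case difference in genotype frequency and v0_i, v1_i are the Bernoulli variances. The case frequency G·R/(1 − G + G·R), the function name and the example inputs are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def auc_best_linear_combination(freqs, risks):
    """Binormal AUC of the best linear combination of independent markers.

    Assumes independence between markers (diagonal covariance) and a case
    genotype frequency of g*r / (1 - g + g*r) for each marker.
    """
    total = 0.0
    for g, r in zip(freqs, risks):
        g_case = g * r / (1.0 - g + g * r)
        diff = g_case - g
        var_sum = g * (1.0 - g) + g_case * (1.0 - g_case)
        total += diff * diff / var_sum
    return normal_cdf(sqrt(total))


# Hypothetical two-marker profile, chosen only for illustration.
print(round(auc_best_linear_combination([0.1, 0.3], [1.5, 1.1]), 3))
```

For identical markers this expression collapses to the equal-weight case, so 40 markers with G = 0.1 and R = 1.1 give an AUC of about 0.55, as in the first table.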
Zheng et al.18 studied the genetic predisposition to prostate cancer by examining the association between prostate cancer and five SNPs that map to the three 8q24 loci, to 17q12 and to 17q24.3. Individually, the risk ratios associated with these loci ranged from 1.22 to 1.53. The table below gives the odds ratios, adjusted for age and geographic region, and the genotype frequencies of the five SNPs for prostate cancer susceptibility (AUC=0.58).
Genotype frequency and odds ratios of five SNPs for prostate cancer susceptibility
Let the genotype frequencies for the five SNPs be given by G1=0.77 and G5=0.26, and the relative risks by R1=1.37 and R5=1.22. The AUC calculated using the formula (AUCB) is 0.58. The joint PAF for the five SNPs, using the formula given in Zheng et al.18, is 0.4045. When we consider five identical markers with average relative risk 1.36, average genotype frequency 0.33 and PAF 0.40, the AUC calculated using the formula for PAF is 0.59.