1.  Assessing the Probability that a Finding Is Genuine for Large-Scale Genetic Association Studies 
PLoS ONE  2015;10(5):e0124107.
Genetic association studies routinely involve massive numbers of statistical tests accompanied by P-values. Whole genome sequencing technologies have increased the potential number of tested variants to tens of millions. The more tests are performed, the smaller the P-value required for a result to be deemed significant. However, a small P-value is not equivalent to a small chance of a spurious finding, and significance thresholds may fail to serve as efficient filters against false results. While the Bayesian approach can provide a direct assessment of the probability that a finding is spurious, its adoption in association studies has been slow, due in part to the ubiquity of P-values and the automated way they are, as a rule, produced by software packages. Attempts to design simple ways to convert an association P-value into the probability that a finding is spurious have met with difficulties. The False Positive Report Probability (FPRP) method has gained increasing popularity. However, FPRP is not designed to estimate the probability for a particular finding, because it is defined for an entire region of hypothetical findings with P-values at least as small as the one observed for that finding. Here we propose a method that lets researchers extract the probability that a finding is spurious directly from a P-value. Considering the counterpart of that probability, we term this method POFIG: the Probability that a Finding is Genuine. Our approach shares FPRP's simplicity, but gives a valid probability that a finding is spurious given a P-value. In addition to straightforward interpretation, POFIG has desirable statistical properties. The POFIG average across a set of tentative associations provides an estimated proportion of false discoveries in that set. POFIGs are easily combined across studies and are immune to multiple testing and selection bias. We illustrate an application of the POFIG method via an analysis of GWAS associations with Crohn's disease.
PMCID: PMC4425705  PMID: 25955023
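The FPRP quantity that the abstract above contrasts with POFIG can be sketched in a few lines. This is a minimal illustration of the standard FPRP formula, not the POFIG method itself, which the abstract does not specify. The function name `fprp` and the example values for prior and power are assumptions chosen for illustration only. Setting `alpha` to the observed P-value shows why FPRP describes the whole region of P-values at least that small rather than one particular finding.

```python
def fprp(alpha: float, power: float, prior: float) -> float:
    """False Positive Report Probability for the rejection region P <= alpha.

    alpha -- significance threshold (often set to the observed P-value)
    power -- probability that a true association yields P <= alpha
    prior -- prior probability that the association is genuine
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if not (0.0 < alpha <= 1.0 and 0.0 < power <= 1.0):
        raise ValueError("alpha and power must lie in (0, 1]")
    false_positive_mass = alpha * (1.0 - prior)  # null is true, yet rejected
    true_positive_mass = power * prior           # signal is real and rejected
    return false_positive_mass / (false_positive_mass + true_positive_mass)


if __name__ == "__main__":
    # Hypothetical inputs: the same threshold and power under two priors.
    for prior in (0.5, 0.01):
        print(prior, round(fprp(alpha=0.05, power=0.8, prior=prior), 4))
```

With these hypothetical inputs, the same threshold gives very different probabilities of a spurious report depending on the prior, which is the abstract's point that a small P-value alone does not imply a small chance of a false finding.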
2.  Correction: The Ranking Probability Approach and Its Usage in Design and Analysis of Large-Scale Studies 
PLoS ONE  2014;9(1):10.1371/annotation/6bc48309-0280-4066-b937-e88debd1579c.
PMCID: PMC3905079
3.  The Ranking Probability Approach and Its Usage in Design and Analysis of Large-Scale Studies 
PLoS ONE  2013;8(12):e83079.
In experiments with many statistical tests there is a need to balance type I and type II error rates while taking multiplicity into account. In the traditional approach, the nominal α-level, such as 0.05, is adjusted by the number of tests, i.e., set to 0.05 divided by the number of tests. Assuming that some proportion of tests represent “true signals”, that is, originate from a scenario where the null hypothesis is false, power depends on the number of true signals and the respective distribution of effect sizes. One way to define power is as the probability of making at least one correct rejection at the assumed α-level. We advocate an alternative way of establishing how “well-powered” a study is. In our approach, useful for studies with multiple tests, the ranking probability is controlled, defined as the probability of making at least a specified number of correct rejections while rejecting the hypotheses with the smallest P-values. The two approaches are statistically related. The probability that the smallest P-value is a true signal is equal, to an excellent approximation, to the power at the level […]. Ranking probabilities are also related to the false discovery rate and to the Bayesian posterior probability of the null hypothesis. We study properties of our approach when the effect size distribution is replaced for convenience by a single “typical” value taken to be the mean of the underlying distribution. We conclude that its performance is often satisfactory under this simplification; however, substantial imprecision is to be expected when […] is very large and […] is small. Precision is largely restored when three values with the respective abundances are used instead of a single typical effect size value.
PMCID: PMC3869737  PMID: 24376639
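The two notions of being "well-powered" described in the abstract above can be compared with a small Monte Carlo sketch. It is not the authors' procedure. It assumes independent one-sided z-tests, a single "typical" effect size shared by all true signals, and a Bonferroni-adjusted threshold. The function name `simulate_ranking` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_ranking(n_tests, n_signals, effect, alpha=0.05, n_sim=2000, seed=1):
    """Estimate two quantities by simulation (assumed one-sided z-tests).

    Returns (ranking_probability, bonferroni_power), where
    ranking_probability -- P(the smallest P-value belongs to a true signal)
    bonferroni_power    -- P(at least one true signal passes alpha / n_tests)
    """
    if not 1 <= n_signals < n_tests:
        raise ValueError("need at least one signal and at least one null test")
    rng = random.Random(seed)
    # A one-sided P-value <= alpha / n_tests is equivalent to z >= z_crit.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / n_tests)
    top_is_signal = 0
    any_correct_rejection = 0
    for _ in range(n_sim):
        # The largest z-statistic corresponds to the smallest one-sided P-value.
        best_signal = max(rng.gauss(effect, 1.0) for _ in range(n_signals))
        best_null = max(rng.gauss(0.0, 1.0) for _ in range(n_tests - n_signals))
        top_is_signal += best_signal > best_null
        any_correct_rejection += best_signal >= z_crit
    return top_is_signal / n_sim, any_correct_rejection / n_sim


if __name__ == "__main__":
    rank_prob, power = simulate_ranking(n_tests=200, n_signals=5, effect=3.0)
    print("P(top-ranked test is a true signal):", rank_prob)
    print("P(at least one correct Bonferroni rejection):", power)
```

Replacing the single `effect` value with draws from a distribution of effect sizes would be one way to probe the simplification the abstract discusses.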
4.  Evidence of association of APOE with age-related macular degeneration - a pooled analysis of 15 studies 
Human mutation  2011;32(12):1407-1416.
Age-related macular degeneration (AMD) is the most common cause of incurable visual impairment in high-income countries. Previous studies report inconsistent associations between AMD and apolipoprotein E (APOE), a lipid transport protein involved in low-density cholesterol modulation. Potential interactions between APOE and sex, and between APOE and smoking status, have been reported. We present a pooled analysis (n=21,160) demonstrating associations between late AMD and APOE ε4 (OR=0.72 per haplotype; CI: 0.65–0.74; P=4.41×10⁻¹¹) and APOE ε2 (OR=1.83 for homozygote carriers; CI: 1.04–3.23; P=0.04), following adjustment for age-group and sex within each study and smoking status. No evidence of interaction between APOE and sex or smoking was found. Ever smokers had significantly increased risk relative to never smokers for both neovascular (OR=1.54; CI: 1.38–1.72; P=2.8×10⁻¹⁵) and atrophic (OR=1.38; CI: 1.18–1.61; P=3.37×10⁻⁵) AMD but not early AMD (OR=0.94; CI: 0.86–1.03; P=0.16), implicating smoking as a major contributing factor to disease progression from early signs to the visually disabling late forms. Extended haplotype analysis incorporating rs405509 did not identify additional risks beyond the ε2 and ε4 haplotypes. Our expanded analysis substantially improves our understanding of the association between the APOE locus and AMD. It further provides evidence supporting the role of cholesterol modulation, and low-density cholesterol specifically, in AMD disease etiology.
PMCID: PMC3217135  PMID: 21882290
age-related macular degeneration; AMD; apolipoprotein E; APOE; case-control association study
5.  Variations in Apolipoprotein E Frequency With Age in a Pooled Analysis of a Large Group of Older People 
American Journal of Epidemiology  2011;173(12):1357-1364.
Variation in the apolipoprotein E gene (APOE) has been reported to be associated with longevity in humans. The authors assessed the allelic distribution of APOE isoforms ε2, ε3, and ε4 among 10,623 participants from 15 case-control and cohort studies of age-related macular degeneration (AMD) in populations of European ancestry (study dates ranged from 1990 to 2009). The authors included only the 10,623 control subjects from these studies who were classified as having no evidence of AMD, since variation within the APOE gene has previously been associated with AMD. In an analysis stratified by study center, gender, and smoking status, there was a decreasing frequency of the APOE ε4 isoform with increasing age (χ² for trend = 14.9 (1 df); P = 0.0001), with a concomitant increase in the ε3 isoform (χ² for trend = 11.3 (1 df); P = 0.001). The association with age was strongest in ε4 homozygotes; the frequency of ε4 homozygosity decreased from 2.7% for participants aged 60 years or less to 0.8% for those over age 85 years, while the proportion of participants with the ε3/ε4 genotype decreased from 26.8% to 17.5% across the same age range. Gender had no significant effect on the isoform frequencies. This study provides strong support for an association of the APOE gene with human longevity.
PMCID: PMC3145394  PMID: 21498624
aged; apolipoprotein E2; apolipoprotein E3; apolipoprotein E4; apolipoproteins E; longevity; meta-analysis; multicenter study

