Our meta-analysis adhered to the MOOSE and PRISMA statements for reporting on systematic reviews and the STREGA recommendations for reporting of genetic association studies. We conducted the meta-analysis in accordance with the general guidelines of the Cochrane Handbook for Systematic Reviews of Interventions, version 5.0.2, and the specific recommendations of genetic meta-analysis of the HuGE Review Handbook, version 1.0.
Information sources and search strategy
We searched Medline (from 1966 to 31 December 2010), Embase (from 1974 to 31 December 2010), and the Cochrane Library (from 1980 to 31 December 2010) without language restrictions. The search algorithm combined the categories for “drug”, “gene”, and “outcome” by the Boolean operator “AND”. The search terms (medical subject headings and text words) in each category were combined with the operator “OR”. The algorithm was trimmed for maximum sensitivity by sequentially adding search items in each category until the total number of hits did not increase further. At each step we determined the explanatory power of the algorithm by testing that the removal of any item resulted in a lower number of total hits. We determined the specificity of the algorithm by testing that the combination of the three search categories yielded fewer hits than any combination of two categories. The following search strategy was applied: (clopidogrel OR plavix OR iscover OR thienop* OR P2Y12) AND (associat* OR cytochrome OR cyp OR polymorph* OR genetic* OR metabolis* OR enzyme) AND (cardiov* OR vascular OR coronar* OR stent* OR thrombos* OR myocar* OR heart OR infarct* OR death OR stroke OR ischem*).
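The way the search string is assembled can be sketched in a few lines of Python. This is an illustration only, not the procedure actually run against the databases: the helper names `build_query` and `redundant_terms` are hypothetical, and `count_hits` stands in for whatever database interface returns the number of hits for a query.

```python
# Terms are taken from the search strategy quoted in the text.
DRUG = ["clopidogrel", "plavix", "iscover", "thienop*", "P2Y12"]
GENE = ["associat*", "cytochrome", "cyp", "polymorph*", "genetic*",
        "metabolis*", "enzyme"]
OUTCOME = ["cardiov*", "vascular", "coronar*", "stent*", "thrombos*",
           "myocar*", "heart", "infarct*", "death", "stroke", "ischem*"]


def build_query(categories):
    """Join terms within a category with OR and the categories with AND."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in categories]
    return " AND ".join(groups)


def redundant_terms(categories, count_hits):
    """Return terms whose removal does not lower the total number of hits.

    count_hits is a caller-supplied function (hypothetical here) that maps
    a query string to the number of records retrieved.
    """
    baseline = count_hits(build_query(categories))
    redundant = []
    for i, terms in enumerate(categories):
        for term in terms:
            reduced = [t for t in terms if t != term]
            if not reduced:
                continue  # never empty a whole category
            trial = categories[:i] + [reduced] + categories[i + 1:]
            if count_hits(build_query(trial)) >= baseline:
                redundant.append(term)
    return redundant


query = build_query([DRUG, GENE, OUTCOME])
```

An empty list from `redundant_terms` corresponds to the check described above, in which removing any single item lowers the total number of hits.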
In addition, we searched the online databases of general medical, cardiovascular, pharmacological, and genetic journals as indexed by ISI Web of Science to identify advance online publications. We hand searched the contents pages of the 2005-10 issues of these journals and the bibliographies of relevant articles to retrieve further potential publications.
We fixed 31 December 2010 as the cut-off date for inclusion of new studies, which avoids subjectivity in choosing the time point at which new information is reviewed and allows assessments based on calendar years.
Study selection and eligibility criteria
We included original peer reviewed reports of observational studies and clinical trials if published in full text or if we had full access to all original data and protocols. We excluded studies that were published only as abstracts or conference reports. We considered reports that evaluated the association of reduced function and increased function genetic variants of CYP2C19 with the occurrence of clinical outcomes in patients with established coronary artery disease who were treated with clopidogrel. Two researchers (DT and TB) independently retrieved studies. The strength of agreement was measured by Cohen’s κ coefficient with approximate standard errors,[21] with κ=0.41-0.60 indicating moderate agreement, 0.61-0.80 good agreement, and ≥0.81 representing very good agreement.[22] Disagreements were resolved by consensus.
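For illustration, Cohen’s κ for two reviewers making include/exclude decisions can be computed from a 2×2 agreement table as sketched below. The standard error uses a common large-sample approximation, which is an assumption here because the text does not state the formula that was applied. The function names and the example counts are hypothetical, and values below 0.41 are not graded in the text, so the sketch reports them as ungraded.

```python
import math


def cohen_kappa(both_yes, a_only, b_only, both_no):
    """Cohen's kappa and an approximate standard error for two raters."""
    n = both_yes + a_only + b_only + both_no
    p_obs = (both_yes + both_no) / n              # observed agreement
    a_yes = (both_yes + a_only) / n               # rater A "include" rate
    b_yes = (both_yes + b_only) / n               # rater B "include" rate
    p_exp = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)  # chance agreement
    kappa = (p_obs - p_exp) / (1 - p_exp)
    # Simple large-sample approximation (assumed, not taken from the text).
    se = math.sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))
    return kappa, se


def agreement_label(kappa):
    """Map kappa to the agreement categories quoted in the text."""
    if kappa >= 0.81:
        return "very good"
    if kappa >= 0.61:
        return "good"
    if kappa >= 0.41:
        return "moderate"
    return "below moderate (not graded in the text)"


# Hypothetical counts: 40 joint inclusions, 50 joint exclusions, 10 disagreements.
kappa, se = cohen_kappa(40, 5, 5, 50)
label = agreement_label(kappa)
```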
Clinical eligibility criteria and outcome definitions
Eligible studies were conducted in unrelated men and women (aged ≥18) of any ethnicity with a clinical presentation of stable angina pectoris or acute coronary syndrome who were scheduled for administration of a loading dose of at least 300 mg clopidogrel and subsequent maintenance treatment with 75-150 mg clopidogrel a day for at least three months. For eligibility, studies must have reported follow-up data for at least 30 days after entry (inclusion or randomisation) of participants. We included studies if they provided absolute numbers of the cumulative incidence (first occurrence during follow-up) of the clinical efficacy end points: major adverse cardiovascular events or fatal or non-fatal stent thrombosis.
The primary definition of major adverse cardiovascular events was the composite of death from cardiovascular causes, non-fatal myocardial infarction, and non-fatal stroke or the composite of death from any cause, non-fatal myocardial infarction, and non-fatal stroke as evaluated by universal clinical guidelines. Other eligible definitions were death from cardiovascular causes and myocardial infarction; death from any cause and myocardial infarction; death from cardiovascular causes; and fatal and non-fatal myocardial infarction. We required that the components subsumed under the definition of major adverse cardiovascular events were sensitive and unbiased measurable events of the same underlying disease process. We excluded studies reporting only all cause mortality because of the high likelihood of bias by events without an underlying cardiovascular cause. Studies reporting only composite end points, including the clinician driven proxy outcomes of revascularisation or admission to hospital, were excluded because of uncertainties about precision, reproducibility, correlation with clinical end points, the use of non-standardised definitions, and a high risk of reporting bias.[23, 24]
Studies of patients who underwent percutaneous coronary intervention and reporting definite stent thrombosis events were eligible if the stent thrombosis was evaluated according to the definition from the Academic Research Consortium.[25] Probable stent thrombosis was considered only if definite and probable stent thrombosis were reported as a composite outcome. Possible stent thrombosis was not considered.
Genetic eligibility criteria
For unambiguous determination, polymorphisms of CYP2C19 needed to be designated by their NCBI dbSNP identifiers (“rs numbers”), their nucleotide exchange, or their common harmonised star allele nomenclature. We considered reports on the loss of function (reduced function) variants CYP2C19*2 (rs4244285), CYP2C19*3 (rs4986893), CYP2C19*4 (rs28399504), CYP2C19*5 (rs56337013), CYP2C19*6 (rs72552267), CYP2C19*7 (rs72558186), or CYP2C19*8 (rs41291556), and the gain of function variant CYP2C19*17 (rs12248560). The variant gene carrier status was required to be given as the distribution of genotypes among patients with and without the outcome event or as the number of individuals carrying at least one loss of function or gain of function allele. Studies reporting associations with a loss of function variant were eligible if they had genotyped at least the CYP2C19*2 allele because it accounts for more than 95% of the loss of function allele carrier status in white and black African populations and for more than 75% in Asian populations.[26]
Genetic model assumptions
Pharmacokinetic studies using different CYP2C19 substrates indicate an additive or dominant mode of inheritance of the loss of function metaboliser trait, showing either a per allele decrease in enzymatic activity[27] or a similar drop of activity in carriers of one or two loss of function variants compared with non-carriers.[28, 29] Pharmacokinetic studies assessing the metabolic activity of CYP2C19 in relation to the gain of function CYP2C19*17 variant likewise indicate either additive inheritance with a per allele increase in activity between *1/*1, *1/*17, and *17/*17 carriers[30] or dominant inheritance with a similar gain of the enzymatic activities in carriers of one or two *17 alleles compared with wild type *1/*1 carriers.[31]
From the available mechanistic evidence we inferred that the CYP2C19 metaboliser phenotype of clopidogrel is determined by additive or dominant genetic models of inheritance and that these models also apply to the associations of CYP2C19 with clinical events. We chose dominant genetic models to quantify the effect size in each study by comparing the genotype contrasts of carriers with one or two variant alleles with non-carriers of the variant alleles because dominant or additive models have a similar statistical power,[32] genotype contrasts are more directly relevant at the individual and the clinical level compared with allele contrasts obtained from additive models, and allele contrasts could not be extracted from all studies.
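The dominant genotype contrast can be made concrete with a short Python sketch. It collapses genotype counts into carriers (one or two variant alleles) versus non-carriers and returns a crude odds ratio with a Wald 95% confidence interval on the log scale. The function name and the example counts are hypothetical, and the interval method is an assumption, since the text does not specify one.

```python
import math


def dominant_odds_ratio(events, non_events):
    """Crude odds ratio for carriers versus non-carriers (dominant model).

    events / non_events: genotype counts ordered as
    (non-carrier homozygotes, heterozygotes, variant homozygotes).
    """
    a = events[1] + events[2]          # carriers with the outcome
    b = non_events[1] + non_events[2]  # carriers without the outcome
    c = events[0]                      # non-carriers with the outcome
    d = non_events[0]                  # non-carriers without the outcome
    if min(a, b, c, d) == 0:
        raise ValueError("zero cell: odds ratio undefined without a correction")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - 1.96 * se)
    upper = math.exp(log_or + 1.96 * se)
    return math.exp(log_or), lower, upper


# Hypothetical study: 70 events and 930 event-free patients.
odds_ratio, lower, upper = dominant_odds_ratio((40, 25, 5), (660, 240, 30))
```

Heterozygotes and variant homozygotes are pooled before the 2×2 table is formed, which is what distinguishes this contrast from the per allele comparison of an additive model.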
Interaction between loss of function and gain of function polymorphisms
Except for one study,[19] all association studies modelled loss of function and gain of function polymorphisms independently of each other, implying complete linkage equilibrium and absence of functional interaction. An examination of the polymorphic loci of the CYP2C19 gene in white populations (HapMap-CEU database, Haploview, version 4.2 software, Broad Institute, Harvard, MA, USA) showed a low pairwise correlation coefficient (r²=0.047) between the most common loss of function polymorphism (*2) and the gain of function polymorphism *17. This excludes a substantial interaction between the single nucleotide polymorphisms at the haplotype level but entails the risk of interaction at the phenotype level. In single locus assessments of the CYP2C19*2 variant, about a quarter of all individuals classified as *2 allele carriers and about half of those classified as non-carriers were harbouring at least one *17 allele.[18] In turn, in single locus analyses of the CYP2C19*17 variant, about 20% of people classified as carriers of the *17 allele and about 35% of those classified as non-carriers were harbouring at least one *2 allele.[18] Pharmacokinetic analyses[31, 33] and pharmacodynamic platelet response studies[17] suggest that the metaboliser phenotypes of mixed carriers of *2 and *17 alleles are comparable with individuals who are homozygous carriers of the wild type (*1) alleles at both loci. Therefore, the independently analysed associations of loss of function and gain of function single nucleotide polymorphisms with clinical outcomes are probably systematically distorted by the counteracting metaboliser trait. This does not necessarily invalidate the associations or increase the heterogeneity between studies, if it is assumed that a proportional bias applies to all studies. Because of higher frequencies of misclassified phenotypes among non-carriers of a certain variant allele, compared with the respective carriers, however, a trend towards overestimation of the single locus effect sizes is expected.
The rare loss of function variant loci CYP2C19*3-*8 are found within the same linkage disequilibrium block with the *2 variant. The pairwise correlation coefficients (r²≤0.019) indicate that their alleles assort independently of the *2 allele. Under the assumption of stochastic independence we pooled the data on loss of function alleles and analysed the combined values as data derived from a single bi-allelic loss of function locus.
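The pairwise correlation coefficient r² quoted above is a standard linkage disequilibrium measure that can be computed from allele and haplotype frequencies. The sketch below is not the Haploview calculation itself. The allele frequencies are illustrative values chosen for the example, and the assumption that the two variant alleles never occur on the same haplotype is ours, not a statement from the text.

```python
def ld_r_squared(p_a, p_b, p_ab):
    """Squared correlation between two bi-allelic loci.

    p_a, p_b: frequencies of the variant allele at each locus.
    p_ab: frequency of the haplotype carrying both variant alleles.
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Illustrative (assumed) frequencies: 0.15 and 0.21, with no haplotype
# carrying both variant alleles.
r2 = ld_r_squared(0.15, 0.21, 0.0)
```

Under these assumed inputs r² stays small even though the two alleles are mutually exclusive on a haplotype, which shows why a low r² does not by itself rule out interaction at the phenotype level.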
Data collection process and extracted items
For standardisation of data extraction we adopted the Cochrane Consumers and Communication Review extraction template, modified it by the recommendations in the HuGE Review Handbook for abstracting genetic information, pilot tested five randomly selected included studies, and refined it accordingly. Two investigators independently extracted data from the included studies. Inconsistencies were resolved by consensus.
Extracted data included study identifiers (the first author’s name, year of publication, country or geographical origin of investigation, single or multicentre study); characteristics of study design (type of study, prospective or retrospective design, follow-up); characteristics of study participants (diagnosis and procedural characteristics at study entry; demographic characteristics (number, sex, age, body mass index (BMI)); and cardiovascular risk factors (smoking, hypertension, dyslipidaemia, diabetes mellitus)); characteristics of study intervention (loading dose of clopidogrel, duration of treatment, comedication with aspirin); and outcome measures (type and number of events, carriers with loss of function or gain of function alleles) (table 1).
Table 1 Characteristics of included studies of effect of variants of the cytochrome P450 (CYP) 2C19 genotype on clinical efficacy of clopidogrel
We included data from multiple published reports from the same study population only once. We used only the data from the initial report, and, in the case of overlapping samples, we used the data of the largest follow-up study. If a report referred to a previous publication for the description of study design, setting, and patients’ characteristics, we extracted these data. Missing data for one study[34] were extracted from a subsequent meta-analysis[14] conducted by the same corresponding author.
To avoid the risk of retrieval bias, we did not contact the original investigators for more detailed information when we were unable to obtain complete data and protocols from all studies or to check the accuracy and reliability of the data obtained.
Methods for assessing the risk of bias in individual studies
To explore the risk of bias in individual studies, we investigated indicators general to the quality of epidemiological studies and specific to the quality of genetic association studies.[35, 36] We extracted quality information on loss to follow-up, funding sources, comparability of groups (demographic and clinical homogeneity between groups with and without outcome, demographic and clinical homogeneity between carriers and non-carriers of the allele of interest, absence of population stratification (ethnic homogeneity)), reliability and validity of phenotype assessment (use of standardised definitions of disease phenotypes, blinding of clinical outcome assessors to patients’ genetic information), and reliability and validity of genotype assessment (consistency of observed genotype frequencies with the Hardy-Weinberg equilibrium, use of an appropriate genotyping method, high call rate, blinding of investigators who performed genotyping to clinical outcome).
Additionally, in adopting the guidelines of the Cochrane Handbook for Systematic Reviews of Interventions, we graded the methodological quality of the selected studies with a summary score using the Newcastle-Ottawa quality assessment scale.[37] We applied a modified scale (adapted for genetic association studies), awarding a maximum score of 8, with one point each for representativeness of the exposed group (carriers of the genotype of interest) for the underlying population; selection of the unexposed group (non-carriers) from the same population as the exposed individuals; adequate measurement of exposure; adequate ascertainment of the absence of the outcome of interest at the beginning of the study; demographic and clinical comparability between groups (carriers and non-carriers of the genotype); appropriate measurement of outcome; adequate length of follow-up; and completeness of follow-up. Study quality was considered to be good when the score was ≥6 and poor to moderate when the score was <6. Two investigators independently scored quality, and the inter-rater agreement was determined by the κ statistic.
Consistency of the observed genotyping frequencies with the Hardy-Weinberg equilibrium provides an overall (albeit insensitive and non-specific) indication for the absence of any strong bias by the selection of patient groups, population stratification, or genotyping errors. We checked departure from Hardy-Weinberg equilibrium using Fisher’s exact test[38] instead of the χ² test reported in the individual studies as it yields increased statistical power. For the cohort studies and the genetic subgroup analyses, testing for Hardy-Weinberg equilibrium was performed in the whole population, and for the case-control study in the control group. We considered that significant departure from Hardy-Weinberg equilibrium (P<0.05) necessitated a correction of individual risk estimates.[35]
Summary effect measures and sample size estimation
We calculated crude unadjusted odds ratios and 95% confidence intervals for each study based on genotype contrasts of a dominant model comparing heterozygous and homozygous genotypes of the minor allele with homozygous genotypes of the major allele. As primary summary effect estimates we calculated summary odds ratios and 95% confidence intervals according to the DerSimonian and Laird random effects model, which uses weights that incorporate variance within and between studies. In addition, we calculated the fixed effects summary estimates according to the Mantel-Haenszel method, which includes only variance within studies. Fixed effects meta-analysis assumes that the genetic effects are the same across all studied populations. Random effects calculations assume that the genetic effects might vary across populations because of genuine differences (such as population specific gene-environment or gene-gene interactions) or differential biases (such as population stratification; genotyping error; phenotype misclassification; and population differences in correlation of clinical phenotypes, correlation between molecular and clinical phenotype, and linkage disequilibrium of gene variants). Because heterogeneity between studies is anticipated in meta-analyses of genetic association studies, random effects models are generally preferred to fixed effects models as the frequentist approach.[39, 40, 41] In the presence of high variance between studies, however, the random effects analysis has considerably less power to reject the null hypothesis of no association. Random effects models also give relatively more weight to smaller studies, which carries the risk of generating higher summary estimates in the presence of bias from small study effects. Therefore, we present both random and fixed effects analyses. P<0.05 indicates a nominally significant overall association (according to the z test statistic for the null hypothesis of no association (odds ratio 1)).
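The two pooling approaches can be sketched in Python as follows. This is a textbook illustration and not the code of the statistical packages used for the analyses: the function name `pool` and the example tables are hypothetical, and the between-study variance is estimated here from inverse-variance weights on the log odds ratio scale, which is an assumption that may differ in detail from the software implementation.

```python
import math


def pool(studies):
    """Pool 2x2 tables (a, b, c, d) = (carrier events, carrier non-events,
    non-carrier events, non-carrier non-events)."""
    # Fixed effect: Mantel-Haenszel summary odds ratio.
    mh_num = sum(a * d / (a + b + c + d) for a, b, c, d in studies)
    mh_den = sum(b * c / (a + b + c + d) for a, b, c, d in studies)
    or_mh = mh_num / mh_den

    # Random effects: DerSimonian and Laird on log odds ratios.
    y = [math.log(a * d / (b * c)) for a, b, c, d in studies]
    v = [1 / a + 1 / b + 1 / c + 1 / d for a, b, c, d in studies]
    w = [1 / vi for vi in v]
    k = len(studies)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    c_term = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c_term) if c_term > 0 else 0.0

    w_re = [1 / (vi + tau2) for vi in v]  # within plus between variance
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_re = math.sqrt(1 / sum(w_re))
    z = y_re / se_re
    return {
        "or_mh": or_mh,
        "or_random": math.exp(y_re),
        "ci_random": (math.exp(y_re - 1.96 * se_re),
                      math.exp(y_re + 1.96 * se_re)),
        "tau2": tau2,
        "p_random": math.erfc(abs(z) / math.sqrt(2)),  # two sided z test
    }


# Three hypothetical studies with identical tables: no heterogeneity.
result = pool([(30, 270, 40, 660)] * 3)
```

When the estimated between-study variance is zero, the random effects weights reduce to the inverse-variance fixed effect weights; as it grows, the weights become more equal across studies, which is why smaller studies gain relative weight.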
To estimate the total sample size needed to be included in a meta-analysis to detect a significant association at low summary odds ratios, we performed Monte-Carlo simulations using PBAT software, version 3.61.[42] For a dominant genotype contrast, to detect a summary odds ratio of 1.15 (the threshold for epidemiological credibility[43]) at a significance level α=0.05 with a power of 0.80 for cumulative event rates in the range of 1-10% and minor allele frequencies in the range of 15-25%, the required sample size was about 8000-9000, and to detect an odds ratio of 1.20 it was about 4500-5000 (based on 10 000 simulations each).
Heterogeneity measures
The presence of heterogeneity between studies was explored with the Cochran’s Q statistic, which is the weighted sum of squares of the deviations of individual study odds ratios from the Mantel-Haenszel summary odds ratio. The statistic follows a χ² distribution with k−1 degrees of freedom (where k is the number of studies); P<0.10 indicated significant heterogeneity. The extent of variance between studies was estimated by the τ² metric. The percentage of total variance attributable to heterogeneity between studies was quantified with the I² metric and its approximate 95% confidence intervals (where I²=((Q−(k−1))/Q)×100%). I² is independent of the number of studies and, in contrast with Q and τ², allows comparison across different meta-analyses.[44] Values of I² <25%, 25% to <50%, and ≥50% were considered to represent low, modest, and large heterogeneity, respectively. Both Q and I², however, have only low statistical power to detect heterogeneity with small numbers of studies[45] and provide no information about the causes of heterogeneity. Hence, we performed additional prespecified heterogeneity and sensitivity analyses. Formal evaluations required that at least four studies were included in the meta-analysis.
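As a small worked illustration, the I² formula and the thresholds given above translate into Python as follows. Truncating negative values at zero is the usual convention and is an assumption here, since the formula in the text does not mention it. The function names are hypothetical.

```python
def i_squared(q, k):
    """I² (%) from Cochran's Q and the number of studies k."""
    if q <= 0 or k < 2:
        return 0.0
    # Negative values are truncated at zero (assumed convention).
    return max(0.0, (q - (k - 1)) / q * 100.0)


def heterogeneity_label(i2):
    """Classify I² with the thresholds stated in the text."""
    if i2 < 25:
        return "low"
    if i2 < 50:
        return "modest"
    return "large"


# Example: Q = 20 across 11 studies gives (20 - 10) / 20 = 50%.
example = i_squared(20, 11)
```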
Assessment of bias across studies
Inconsistency in replication is an important issue in genetic association. The first studies often suggest a stronger genetic effect than is found by subsequent studies.[46] To assess the replication validity in the meta-analyses we compared the odds ratio of the first published studies with the random effects summary odds ratios without the first studies using the z test statistic, with P<0.05 indicating a significant inconsistency in replication.[35] To explore the evolution and robustness of the summary effect estimates over time, we conducted cumulative and recursive cumulative meta-analyses. In cumulative meta-analysis, the random effects summary odds ratios are calculated with publication of each new study. In recursive cumulative meta-analysis, the ratio of the cumulative odds ratios in year n+1 to year n is calculated.[47] These analyses facilitate the identification of early extreme contradictory estimates in genetic associations and of potential time lag bias resulting from a more rapid publication of studies with significant results compared with studies with non-significant results.[35]
To assess potential bias from small study effects we constructed funnel plots displaying the log odds ratios of individual studies on the horizontal axis and the standard errors of the log odds ratios (precision) on the vertical axis. Funnel plot asymmetry is a graphical means of indicating whether effect estimates of small studies differ from those in larger studies,[48] but visual inspection is an unreliable method to detect bias. We carried out formal statistical assessment of funnel plot asymmetry with the Harbord-Egger regression test, which yields lower false positive and false negative rates when applied to dichotomous outcomes compared with the traditional Egger regression test or the Begg-Mazumdar rank correlation test.[49, 50] P<0.10 was assumed to indicate a significant difference of the precision in large versus small studies. For more specific evaluation of the presence and extent of publication bias we used the non-parametric trim and fill method according to Duval and Tweedie,[51] which imputes missing studies in the funnel plot based on symmetry assumptions.
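The idea behind the modified regression test can be sketched as follows, assuming the usual formulation in which the efficient score Z and its variance V are computed for each 2×2 table and Z/√V is regressed on √V, with the intercept measuring asymmetry. This is a sketch rather than the implementation used in the analyses. It returns the intercept, its standard error, and the degrees of freedom, and leaves the P value to a t distribution lookup. The function names and example tables are hypothetical.

```python
import math


def score_and_variance(a, b, c, d):
    """Efficient score Z and its variance V for the log odds ratio.

    (a, b, c, d) = (carrier events, carrier non-events,
                    non-carrier events, non-carrier non-events).
    """
    n_e, n_c = a + b, c + d
    n = n_e + n_c
    events = a + c
    z = a - events * n_e / n
    v = n_e * n_c * events * (n - events) / (n * n * (n - 1))
    return z, v


def harbord_intercept(studies):
    """Ordinary least squares of Z/sqrt(V) on sqrt(V); returns
    (intercept, standard error of intercept, degrees of freedom)."""
    points = [score_and_variance(*s) for s in studies]
    x = [math.sqrt(v) for _, v in points]
    y = [z / math.sqrt(v) for z, v in points]
    k = len(x)
    if k < 3:
        raise ValueError("at least three studies are needed")
    x_bar, y_bar = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(sse / (k - 2) * (1 / k + x_bar ** 2 / sxx))
    return intercept, se_intercept, k - 2


# Hypothetical tables from four studies of different sizes.
tables = [(10, 90, 20, 180), (5, 45, 6, 94), (30, 270, 40, 660), (8, 42, 9, 91)]
intercept, se_intercept, df = harbord_intercept(tables)
```

The ratio of the intercept to its standard error would be compared with a t distribution on k−2 degrees of freedom; an intercept far from zero suggests that smaller studies report systematically different effects.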
Sensitivity analyses
We performed a prespecified combinatorial exclusion sensitivity analysis to identify the individual studies or clusters of studies that provide the strongest contribution to the heterogeneity of the meta-analysis.[52] Potential differences in the main characteristics of excluded studies in comparison with the remaining studies were investigated (considering the structured PICOS information of the Cochrane Handbook for Systematic Reviews, chapter 5).
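One way to picture a combinatorial exclusion analysis is to remove every possible subset of a given size and record which removal lowers I² the most. The sketch below is a simplified reading of that idea, not the published algorithm. It assumes inverse-variance weights on log odds ratios, and the names `i2_of` and `best_exclusions` as well as the `min_remaining` default of four studies are choices made for the illustration.

```python
from itertools import combinations


def i2_of(effects):
    """I² (%) for a list of (log odds ratio, variance) pairs."""
    k = len(effects)
    w = [1 / v for _, v in effects]
    y_bar = sum(wi * y for wi, (y, _) in zip(w, effects)) / sum(w)
    q = sum(wi * (y - y_bar) ** 2 for wi, (y, _) in zip(w, effects))
    return max(0.0, (q - (k - 1)) / q * 100.0) if q > 0 else 0.0


def best_exclusions(effects, max_excluded=2, min_remaining=4):
    """For each subset size, find the exclusion that minimises I².

    Returns a list of (excluded indices, I² of the remaining studies).
    """
    results = []
    for r in range(1, max_excluded + 1):
        if len(effects) - r < min_remaining:
            break
        best = None
        for idx in combinations(range(len(effects)), r):
            rest = [e for i, e in enumerate(effects) if i not in idx]
            i2 = i2_of(rest)
            if best is None or i2 < best[1]:
                best = (idx, i2)
        results.append(best)
    return results


# Hypothetical data: five concordant studies and one outlier (index 5).
effects = [(0.1, 0.04)] * 5 + [(1.5, 0.04)]
ranking = best_exclusions(effects, max_excluded=1)
```

The number of subsets grows quickly with the number of studies, so an exhaustive search of this kind is practical only for small meta-analyses or small subset sizes.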
Grading the evidence of meta-analyses
Reliable and valid instruments for evaluating the quality of the evidence derived from systematic reviews and meta-analyses are essential for developing transparent and unbiased clinical recommendations and avoiding implicit subjectivity. For judgment of the strength of the meta-analysed evidence for recommendations we applied the GRADE methods, proposed by WHO for producing practice guidelines.[53, 54] To evaluate the quality of evidence with specific regard to genetic topics, we used the Venice consensus criteria[43] for rating the cumulative epidemiological evidence of meta-analyses of genetic association studies that yield significant (P<0.05) summary estimates.
Within the GRADE evaluation process observational studies are basically considered to present low quality of evidence. According to the GRADE handbook, version 3.2, and pragmatic instructions to guide the grading process,[54] judgments on the included studies are made with respect to the following five criteria that lower the quality of evidence: limitations of individual studies (risk of bias within a study); inconsistency (heterogeneity of results across studies); indirectness of evidence; imprecision (total number of events <300)[55]; and publication bias. Studies not downgraded for any reason are judged for three factors that increase the quality of evidence: dose-response gradient (gene-dose effect); large magnitude of effect (relative risk >2.0 or <0.5); and reduction of effect by all plausible biases present (potential underestimation of effect).
The Venice consensus criteria assign three levels for the amount of evidence, the consistency of replication, and the protection from bias. For amount of evidence, grade “A” is assigned when the total number of minor alleles of cases and controls combined in the meta-analyses exceeds 1000, “B” when it is between 100 and 1000, and “C” when it is less than 100. For replication and consistency, grade A is assigned for I² <25%, B for I² 25-50%, and C for I² >50%. For protection from bias, grade A implies that there is probably no bias that can affect the presence of the association, grade B that there is no demonstrable bias but important information is missing for its appraisal, and grade C that there is evidence for potential or clear bias that can invalidate the association. Specifically, whenever the summary odds ratio deviates less than 1.15-fold from null association (odds ratio=1), occult publication and selective reporting biases alone might invalidate the association, regardless of the presence or absence of other biases, and therefore a grade of C is assigned. When the summary odds ratio is ≥1.15-fold from null association, a grade of C is assigned when a significant modified regression test suggests the possibility of bias or when the association is no longer nominally significant on exclusion of the initial study or of studies violating the Hardy-Weinberg equilibrium. The composite epidemiological credibility is rated as “strong” if three A grades are assigned, “moderate” if at least one B grade but no C grades are assigned, and “weak” if a C grade in any of the three assessment criteria is assigned.
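The grading rules described above are largely mechanical and can be written out as a short Python sketch. The function names are hypothetical, and the grade for protection from bias is passed in by the caller because it rests on judgment rather than on a single threshold.

```python
def amount_grade(total_minor_alleles):
    """Grade for amount of evidence (minor alleles, cases plus controls)."""
    if total_minor_alleles > 1000:
        return "A"
    if total_minor_alleles >= 100:
        return "B"
    return "C"


def consistency_grade(i2):
    """Grade for replication consistency from I² (%)."""
    if i2 < 25:
        return "A"
    if i2 <= 50:
        return "B"
    return "C"


def composite_credibility(amount, consistency, bias):
    """Combine the three grades into strong, moderate, or weak."""
    grades = (amount, consistency, bias)
    if "C" in grades:
        return "weak"
    if all(g == "A" for g in grades):
        return "strong"
    return "moderate"


# Example with assumed inputs: 2400 minor alleles, I² of 30%, bias grade A.
rating = composite_credibility(amount_grade(2400), consistency_grade(30.0), "A")
```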
The grading was done independently by two investigators and repeated by a third investigator if disagreement occurred.
Statistical software programs
The statistical analyses were performed with PASW version 18.0.1 (SPSS, IL, USA), Cochrane Review Manager 5.0 (Cochrane Library Software, Oxford, UK), and MIX version 1.7 (Department of Medical Informatics of Kitasato University, Japan).