We have identified a genetic variant and region of the ANGPT2 gene demonstrating a consistent association with increased risk of trauma-associated ALI in two separate populations, two ethnicities, and across multiple genotyping platforms. Sequencing of the ALI-associated region revealed no untyped coding variants, and in silico modeling predicted splice site alteration for SNPs in LD with rs1868554. Immunoblotting of ALI subject plasma revealed an alteration in the isoform pattern of ANG2 in carriers of the rs1868554T allele. Thus, we have identified a shift in ANG2 isoform ratio in the plasma of rs1868554T carriers with ALI.
ANG2 protein was first described in 1997 as a naturally occurring antagonist for ANG1, an angiogenic factor essential for normal vascular development (47). In the absence of angiogenic stimuli, ANG2 induces endothelial cell apoptosis and vascular regression, enhances vascular leak, and destabilizes blood vessels (48). In recent years, ANG2 has been implicated in pulmonary vascular leak syndromes including ALI and sepsis in both animal and human studies (49). ANG2-rich serum from patients with sepsis disrupts endothelial architecture when applied exogenously (52), and elevated levels of ANG2 have been detected in the blood and bronchoalveolar lavage fluid of patients with ALI (49). Among patients with trauma, plasma ANG2 was among the top performing biomarkers distinguishing patients who developed ALI from those who did not (20). Other vascular permeability regulating genes, such as MYLK and VEGFA, have also shown association with ALI, supporting the critical role of endothelial barrier regulation in the pathophysiology of ALI (54).
We identified two ANGPT2 SNPs (rs2442598 and rs1868554) strongly associated with the development of ALI in patients with major trauma. The area demonstrating association with ALI was consistent both on haplotype (Stage I) and regional association (Stages I and II) analysis (, , and E2). This genomic region spans the first to the second intron of ANGPT2 and includes an exon that is variably spliced (58), termed Ang2(443) or isoform C (NCBI ref NP_001112360.1). The alternatively spliced isoform, which lacks the second exon and alters the coiled-coil but not the signal sequence or fibrinogen-like domain, is expressed in primary endothelial cell lines at approximately 10% the abundance of isoform A (58). The ratio of ANG2 isoforms in peripheral blood has not previously been published.
To test the in silico prediction that individual SNPs might cause splice variation, we performed ELISA of plasma ANG2 followed by immunoprecipitation and Western blotting. In our samples, plasma ANG2 did not vary predictably by rs1868554 genotype, with a very wide range in values for each genotype. However, we found that carriers of the rs1868554T allele had a shift in the isoform ratio for ANG2. We do not yet know whether isoforms 1 and 2 described here correspond to ANG2 isoforms C and A, because we did not have subjects' RNA (endothelial or circulating) available to analyze the coding sequence. Nor is it known whether the proprietary antibody for the ELISA used to quantitate plasma ANG2 (R&D Systems catalog DANG20) discriminates between isoforms. Isoform C was first described in human endothelial cell lines (58), and we know very little about whether or how circulating ANG2 might differ from intracellular ANG2. We observed a slightly higher migration of ANG2 bands in plasma relative to endothelial cell lysate. This might represent a post-translational modification or some other variation between endothelial and circulating protein. In the future it will be important to analyze the messenger RNA associated with these isoforms and compare them with the reference sequences for ANG2 isoforms A and C, and to characterize circulating versus cellular protein. If isoforms lacking exon 2 enhance vascular permeability relative to the reference protein, the role of the ANG2 coiled-coil domain in vascular permeability regulation may warrant reexamination.
We did not observe a difference in isoform ratio between heterozygous and homozygous carriers of rs1868554T, suggesting that there are additional factors beyond genotype regulating the splicing of this protein. We used an additive model because of its superior power relative to a dominant model; rs1868554 was nonetheless associated with ALI in each stage when a dominant model was assumed for the T allele, although the association was less pronounced (OR, 1.88 and 95% CI, 1.01–3.49 in Stage I; OR, 1.26 and 95% CI, 1.05–1.51 in Stage II).
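The dominant-model estimates above are reported as an OR with a 95% CI. As a rough check on how close each interval sits to the null, the standard error and an approximate two-sided P value can be back-calculated from those numbers. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which the text does not state; the function name `wald_from_or_ci` is hypothetical, and this is not the analysis code used in the study.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_or_ci(odds_ratio, ci_low, ci_high):
    """Return (se, z, p) on the log-odds scale.

    Assumption: the reported interval is a Wald 95% CI, i.e.
    log(OR) +/- Z_95 * SE. This is an illustrative approximation only.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Dominant-model estimates quoted in the text (OR, lower CI, upper CI)
reported = {
    "Stage I": (1.88, 1.01, 3.49),
    "Stage II": (1.26, 1.05, 1.51),
}

for stage, (or_, lo, hi) in reported.items():
    se, z, p = wald_from_or_ci(or_, lo, hi)
    print(f"{stage}: SE={se:.3f}, z={z:.2f}, approximate P={p:.3f}")
```

Under these assumptions both intervals imply an approximate P value below 0.05, and the Stage II estimate is the more precise of the two (smaller standard error) despite its smaller OR.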
We focused our bioinformatic investigation on splice site regulation given the proximity of our association signal to the variably spliced exon and because transcriptional regulation seemed less likely given the distance (35 kb) of our signal from the transcription start site. SNP analysis of variants in LD with rs1868554 predicted the creation of a novel splice site for rs2515478, a variant with strong LD to rs1868554 in EA and weak LD in AA subjects, and for rs1301303, a variant showing complete LD (r² = 1.0) with rs1868554 in EA subjects. It may be that one of these SNPs, or one or more of the SNPs we did not uncover during sequencing, is the functional polymorphism. Alternatively, there may yet be a more obvious splice disruption signal within exon 2 that our sequencing could not adequately capture. We found no coding variation in this exon, despite HapMap data reporting one synonymous SNP with relatively high frequency (rs6559167).
Ours represents the second report of ANGPT2 genetic variants associated with the development of ALI. Previously, Su and coworkers (17) reported that two ANGPT2 tagging SNPs and their haplotype block were significantly associated with the development of ARDS in a primarily septic population (haplotype OR, 1.42; 95% CI, 1.09–1.85). Our study may be complementary in that our main finding, rs1868554, resides in the same haplotype block as rs2515475, the strongest SNP reported by Su and coworkers (17). However, despite residing in the same block, the pairwise LD between rs1868554 and rs2515475 is marginal in the African (r² = 0.40) and absent in the European (r² = 0.08) ancestral population (42). Stage II did not replicate an association between ALI and rs2515475 (OR, 1.0; P = 0.97), although there was a strong association between ALI and SNPs in close proximity to rs2515475. Possible explanations for this discrepancy include the presence of recombination hotspots within the ANGPT2 gene close to our region of association, which could account for neighboring loci failing to display tight LD and failing to demonstrate a consistent association with the ALI phenotype, or that the observed ALI associations for rs1868554 and rs2515475 are completely independent.
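To make the pairwise LD statistic concrete, the following minimal sketch computes r² for two biallelic loci from the standard definition, D = p(AB) − p(A)p(B) and r² = D² / [p(A)(1 − p(A))p(B)(1 − p(B))]. The function name `ld_r2` and all frequencies in the example are hypothetical illustrations, not values from this study or from the reference populations cited.

```python
def ld_r2(p_ab, p_a, p_b):
    """Pairwise r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1
          and allele B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom <= 0:
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


# Hypothetical frequencies: both alleles at 30%, varying how often they
# travel together on the same haplotype.
print(ld_r2(p_ab=0.30, p_a=0.30, p_b=0.30))  # alleles always co-occur
print(ld_r2(p_ab=0.15, p_a=0.30, p_b=0.30))  # partial co-occurrence
print(ld_r2(p_ab=0.09, p_a=0.30, p_b=0.30))  # statistically independent
```

The example shows how two SNPs with identical allele frequencies can range from complete LD to none depending only on the haplotype frequency, which is why residing in the same block does not guarantee that one SNP tags the other.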
The ORs for the ANGPT2 variants in Stage I were farther from the null than in Stage II, which may reflect an example of the “winner's curse” phenomenon (59), or it may reflect differences caused by the use of healthy control subjects in Stage II, or different clinical or demographic factors between the populations. Most Stage I subjects (53%) experienced penetrating trauma, whereas Stage II cases were more likely to have experienced blunt trauma (92%). It is also possible that despite a similar minor allele frequency (~30%) in both African and European ancestral populations (30), rs1868554 may be more closely linked to the functional variant in subjects of African ethnicity. Our sequencing data revealed expected differences between EA and AA subjects but did not highlight a novel variant more associated with either ancestry.
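The winner's curse mentioned above can be illustrated with a small simulation: when a modest true effect is estimated in an underpowered discovery sample and only significant results are carried forward, the selected estimates are biased away from the null. The sketch below uses a normal approximation on the log-OR scale; the true OR of 1.3, the standard error of 0.3, and the function name `simulate_winners_curse` are hypothetical choices for illustration and are not derived from the study data.

```python
import math
import random


def simulate_winners_curse(true_log_or, se, n_studies=20000, z_crit=1.96, seed=0):
    """Compare the average estimate over all simulated discovery studies with
    the average over only those reaching significance in the risk direction."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_log_or, se) for _ in range(n_studies)]
    # Keep only studies whose Wald z statistic exceeds the critical value
    significant = [b for b in estimates if b / se > z_crit]
    if not significant:
        raise RuntimeError("no simulated study reached significance")
    mean_all = sum(estimates) / len(estimates)
    mean_sig = sum(significant) / len(significant)
    return math.exp(mean_all), math.exp(mean_sig), len(significant) / n_studies


or_all, or_selected, power = simulate_winners_curse(math.log(1.3), se=0.3)
print(f"OR averaged over all studies:         {or_all:.2f}")
print(f"OR averaged over significant studies: {or_selected:.2f}")
print(f"Fraction of studies significant:      {power:.2f}")
```

Because a study is only selected when its estimate exceeds `z_crit * se`, the selected average necessarily overstates the true effect, which is one reason a replication estimate closer to the null is expected even when the association is real.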
The major strengths of our approach include the discovery cohort study design, multiethnic investigation with adjustment for population stratification, large replication population, and the association with an alteration in plasma protein (34). The use of a replication stage and functional correlation minimized the risk of false-positive associations. We demonstrated a consistent association with ALI in a large distinct population recruited from five centers across the United States. Although there was some variability by site, the association of rs1868554 showed a consistent direction of effect across all centers (Table E7).
This study has several limitations. The Stage I sample size is relatively small, limiting our power to detect all but the strongest effects (relative risk ≥ 1.8) in the discovery phase. Negative findings from Stage I should be interpreted with caution because this study was not designed to evaluate more modest effect sizes or rarer variants, and false-negative gene associations may have occurred in Stage I because of this limited power. In addition, this study by design could only assess genetic risk factors shared by both AAs and EAs. Because no appropriate replication population was available for polymorphisms restricted to individuals of African ancestry, we cannot determine whether the Stage I associations observed in STAT1 represent false-positives or true associations. Although the rarity of these SNPs may limit their clinical significance, there is evidence suggesting that STAT1 in particular may play a role in lung injury pathogenesis. Furthermore, scientists involved in the design of the HumanCVD chip have suggested that in the AA population, a P value of 1.9 × 10⁻⁶ might be considered “chip-wide significance,” analogous to P less than 5 × 10⁻⁸ in genome-wide studies (62). Our ANGPT2 Stage I results would not have met this threshold. However, independent replication of the association, coupled with an association with altered plasma isoform ratios, provides good evidence for its legitimacy as a risk variant.
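One way to compare the two significance thresholds quoted above is to ask how many independent tests each would correspond to under a Bonferroni-style correction. The sketch below assumes a family-wise alpha of 0.05 and a simple alpha / n correction; neither assumption is stated in the text, and the chip-wide threshold may have been derived differently. The function name `implied_independent_tests` is hypothetical.

```python
def implied_independent_tests(p_threshold, alpha=0.05):
    """Number of independent tests for which a Bonferroni correction at the
    given family-wise alpha would yield p_threshold (assumed model)."""
    if not 0 < p_threshold < alpha:
        raise ValueError("p_threshold must lie between 0 and alpha")
    return alpha / p_threshold


chip_wide = implied_independent_tests(1.9e-6)    # threshold quoted for the AA population
genome_wide = implied_independent_tests(5e-8)    # conventional genome-wide threshold

print(f"Chip-wide threshold implies roughly {chip_wide:,.0f} independent tests")
print(f"Genome-wide threshold implies roughly {genome_wide:,.0f} independent tests")
```

Read this way, the chip-wide threshold is far less stringent than the genome-wide one, reflecting the smaller number of variants tested on a candidate gene platform.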
Although using a trauma-specific cohort diminishes heterogeneity caused by different precipitating factors of ALI, the generalizability of our findings to other at-risk populations may be limited. Our sequencing did not definitively identify the causal variant leading to this splice variation, but suggests that splice enhancer variation or novel splice site creation is possible for SNPs in this region. Given the observed recombination hotspots close to the ALI-associated region of ANGPT2, future studies may seek to perform high-density genotyping of this region in subjects with sepsis- or pneumonia-associated ALI.
Our phenotype was based on the AECC definition of ALI, and this definition may be problematic (63). We performed a sensitivity analysis in our most densely phenotyped population (Stage I) to test the extent to which phenotypic misspecification might influence our findings, and found the associations between ANGPT2 variants and ALI were robust to a more stringent control definition, despite lower sample size.
Subjects comprising the control group for Stage II were not critically ill patients with trauma, and they were predominantly children. Despite the theoretical risk for selection bias when using population-based control subjects, accumulating genome-wide data support the use of population-based control subjects provided that the phenotype of interest is rare in the general population (65). We used the Stage II population for replication only, choosing to limit our focus to SNPs already manifesting association with ALI in a critically ill cohort to minimize the risk of significant confounding between gene variants and severe trauma. The use of pediatric control subjects might introduce a survival bias when studying an outcome in adults, although previous genome-wide studies have not identified significant bias associated with birth cohort populations (66), and the control group used in this study has performed well in other genome-wide analyses of adult traits (22). Furthermore, if significant misclassification were to occur because of control subjects never being exposed to an ALI-precipitating event, one might expect a loss of power to reject the null hypothesis. This should be considered in our inability to replicate the PPARGC1B
By design, our associations were limited to those genes and SNPs assayed in Stage I by the HumanCVD BeadChip. This platform, designed for cardiovascular, metabolic, and pulmonary conditions, provided coverage for 85% of previous ALI-associated genes (Table E1), suggesting it is a reasonable candidate gene platform for the ALI phenotype. Because of the chip's design, there may be important genetic variation, such as copy number variation or structural variation, which we did not detect. Our sequencing makes copy number variation unlikely as a cause for the ANGPT2 association, but other candidate genes may be significantly influenced by structural variation. An alternative strategy is to use whole-genome analysis in the first stage, which could potentially highlight novel loci or other candidates of interest. The rationale for our chosen genotyping strategy was to have dense genotyping in AAs, to use the ethnically diverse Penn trauma cohort to its fullest extent, and to have an adequately powered replication population.
ALI remains a significant source of morbidity and mortality in patients experiencing major trauma. The development of ALI in critically ill patients with trauma is associated with an almost threefold increased risk of mortality compared with those who do not develop ALI (9). A molecular model of ALI susceptibility may aid in the development of specific, targeted therapy for high-risk individuals. Further characterization of ANGPT2 genetic variation and expression, and further mechanistic investigation into the effects of ANG2 isoform variation, may lead to novel therapeutic paradigms in trauma-associated ALI.