1.  Exome Array Analysis Identifies a Common Variant in IL27 Associated with Chronic Obstructive Pulmonary Disease 
Rationale: Chronic obstructive pulmonary disease (COPD) susceptibility is in part related to genetic variants. Most genetic studies have been focused on genome-wide common variants without a specific focus on coding variants, but common and rare coding variants may also affect COPD susceptibility.
Objectives: To identify coding variants associated with COPD.
Methods: We tested nonsynonymous, splice, and stop variants derived from the Illumina HumanExome array for association with COPD in five study populations enriched for COPD. We evaluated single variants with a minor allele frequency greater than 0.5% using logistic regression. Results were combined using a fixed effects meta-analysis. We replicated novel single-variant associations in three additional COPD cohorts.
Measurements and Main Results: We included 6,004 control subjects and 6,161 COPD cases across five cohorts for analysis. Our top result was rs16969968 (P = 1.7 × 10⁻¹⁴) in CHRNA5, a locus previously associated with COPD susceptibility and nicotine dependence. Additional top results were found in AGER, MMP3, and SERPINA1. A nonsynonymous variant, rs181206, in IL27 (P = 4.7 × 10⁻⁶) was just below the level of exome-wide significance but attained exome-wide significance (P = 5.7 × 10⁻⁸) when combined with results from other cohorts. Gene expression datasets revealed an association of rs181206 and the surrounding locus with expression of multiple genes; several were differentially expressed in COPD lung tissue, including TUFM.
Conclusions: In an exome array analysis of COPD, we identified nonsynonymous variants at previously described loci and a novel exome-wide significant variant in IL27. This variant is at a locus previously described in genome-wide associations with diabetes, inflammatory bowel disease, and obesity and appears to affect genes potentially related to COPD pathogenesis.
PMCID: PMC4960630  PMID: 26771213
chronic obstructive pulmonary disease; genetics; exome; IL-27
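The fixed effects meta-analysis named in the Methods can be illustrated with a minimal sketch. It assumes the common inverse-variance weighting of per-cohort log odds ratios; the abstract does not state which software or weighting the authors used, and the function name and inputs here are hypothetical.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance fixed-effects combination of per-cohort estimates.

    betas: per-cohort effect estimates (e.g. log odds ratios).
    standard_errors: matching per-cohort standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("need equally sized, non-empty inputs")
    # Each cohort is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical per-cohort log odds ratios and standard errors.
beta, se, z, p = fixed_effects_meta([0.20, 0.20], [0.10, 0.10])
```

With two equally precise cohorts the pooled estimate is unchanged and the pooled standard error shrinks by a factor of the square root of two, which is why combining cohorts can move a variant across a significance threshold.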
2.  Novel evidence of association with NSCL/P was shown for SNPs in FOXF2 gene in an Asian population 
The forkhead box F2 gene (FOXF2) located in chromosome 6p25.3 has been shown to play a crucial role in palatal development in mouse and rat models. To date, no evidence of linkage or association has been reported for this gene in humans with oral clefts.
Allelic transmission disequilibrium tests were used to robustly assess evidence of linkage and association with nonsyndromic cleft lip with or without cleft palate (NSCL/P) for 9 SNPs in and around FOXF2 in both Asian and European trios using PLINK.
Statistically significant evidence of linkage and association was shown for two SNPs (rs1711968 and rs732835) in 216 Asian trios, where the empiric P values with permutation tests were 0.0016 and 0.005, respectively. The corresponding estimated odds ratios for carrying the minor allele at these SNPs were 2.05 (95%CI=1.41, 2.98) and 1.77 (95%CI=1.26, 2.49), respectively.
Our results provided statistical evidence of linkage and association between FOXF2 and NSCL/P.
PMCID: PMC5180447  PMID: 26278207
FBAT; FOXF2; nonsyndromic cleft lip with or without cleft palate; PLINK; SNP; transmission disequilibrium test
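The allelic transmission disequilibrium test used in this entry can be sketched as follows. This is the classic asymptotic (McNemar-type) form of the test, not the permutation-based empiric P values the authors obtained from PLINK; the function name and the counts are hypothetical.

```python
import math


def allelic_tdt(transmitted, untransmitted):
    """Asymptotic allelic TDT from heterozygous-parent transmission counts.

    transmitted: times heterozygous parents passed the minor allele to the
        affected child.
    untransmitted: times they passed the other allele instead.
    Returns (chi_square, p_value, odds_ratio).
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative transmissions")
    chi_square = (transmitted - untransmitted) ** 2 / informative
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    # The transmission ratio estimates the odds ratio for the minor allele.
    odds_ratio = transmitted / untransmitted if untransmitted else math.inf
    return chi_square, p_value, odds_ratio


# Hypothetical counts, not taken from the study.
chi_square, p_value, odds_ratio = allelic_tdt(60, 30)
```

Only heterozygous parents contribute, which is what makes the test robust to population stratification.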
3.  Targeted Deep Sequencing Identifies Rare ‘loss-of-function’ Variants in IFNGR1 for Risk of Atopic Dermatitis Complicated by Eczema Herpeticum 
A subset of atopic dermatitis (AD) is associated with increased susceptibility to eczema herpeticum (ADEH+). We previously reported that common single nucleotide polymorphisms (SNPs) in interferon-gamma (IFNG) and interferon-gamma receptor 1 (IFNGR1) were associated with the ADEH+ phenotype.
To interrogate the role of rare variants in IFN-pathway genes for risk of ADEH+.
We performed targeted sequencing of interferon-pathway genes (IFNG, IFNGR1, IFNAR1 and IL12RB1) in 228 European American (EA) AD patients selected according to their EH status and severity measured by Eczema Area and Severity Index (EASI). Replication genotyping was performed in independent samples of 219 EA and 333 African Americans (AA). Functional investigation of ‘loss-of-function’ variants was conducted using site-directed mutagenesis.
We identified 494 single nucleotide variants (SNVs) encompassing 105 kb of sequence, including 145 common, 349 (70.6%) rare (minor allele frequency (MAF) <5%) and 86 (17.4%) novel variants, of which 2.8% were coding-synonymous, 93.3% were non-coding (64.6% intronic), and 3.8% were missense. We identified six rare IFNGR1 missense variants, including three damaging variants (Val14Met (V14M), Val61Ile and Tyr397Cys (Y397C)), conferring a higher risk for ADEH+ (P=0.031). Variants V14M and Y397C were confirmed to be deleterious, leading to partial IFNGR1 deficiency. Seven common IFNGR1 SNPs, along with common protective haplotypes (2 to 7-SNPs), conferred a reduced risk of ADEH+ (P=0.015-0.002, P=0.0015-0.0004, respectively), and both SNP and haplotype associations were replicated in an independent AA sample (P=0.004-0.0001 and P=0.001-0.0001, respectively).
Our results provide evidence that both rare and common genetic variants in the gene encoding IFNGR1 are implicated in susceptibility to the ADEH+ phenotype.
We provided the first evidence that rare functional IFNGR1 mutations contribute to a defective systemic IFN-γ immune response that accounts for the propensity of AD patients to disseminated viral skin infections.
PMCID: PMC4679503  PMID: 26343451
IFNGR1; genetic variants; atopic dermatitis; eczema herpeticum
4.  Genetic factors influencing risk to orofacial clefts: today’s challenges and tomorrow’s opportunities 
F1000Research  2016;5:2800.
Orofacial clefts include cleft lip (CL), cleft palate (CP), and cleft lip and palate (CLP), which combined represent the largest group of craniofacial malformations in humans with an overall prevalence of one per 1,000 live births. Each of these birth defects shows strong familial aggregation, suggesting a major genetic component to their etiology. Genetic studies of orofacial clefts extend back centuries, but it has proven difficult to define any single etiologic mechanism because many genes appear to influence risk. Both linkage and association studies have identified several genes influencing risk, but these differ across families and across populations. Genome-wide association studies have identified almost two dozen different genes achieving genome-wide significance, and there are broad classes of ‘causal genes’ for orofacial clefts: a few genes strongly associated with risk and possibly directly responsible for Mendelian syndromes which include orofacial clefts as a key phenotypic feature of the syndrome, and multiple genes with modest individual effects on risk but capable of disrupting normal craniofacial development under the right circumstances (which may include exposure to environmental risk factors). Genomic sequencing studies are now underway which will no doubt reveal additional genes/regions where variants (sequence and structural) can play a role in controlling risk to orofacial clefts. The real challenge to medicine and public health is twofold: to identify specific genes and other etiologic factors in families with affected members and then to devise effective interventions for these different biological mechanisms controlling risk to complex and heterogeneous birth defects such as orofacial clefts.
PMCID: PMC5133690  PMID: 27990279
orofacial clefts; CHD1; ADAMTS9; whole genome sequencing
5.  Filaggrin Mutations That Confer Risk of Atopic Dermatitis Confer Greater Risk for Eczema Herpeticum 
Loss-of-function null mutations R501X and 2282del4 in the skin barrier gene, filaggrin (FLG), represent the most replicated genetic risk factors for atopic dermatitis (AD). Associations have not been reported in African ancestry populations. Eczema herpeticum (ADEH) is a rare but serious complication of AD resulting from disseminated cutaneous HSV infections.
We aimed to determine whether FLG polymorphisms contribute to ADEH susceptibility.
Two common loss-of-function mutations plus nine FLG single nucleotide polymorphisms (SNPs) were genotyped in 278 European American AD patients, of whom 112 had ADEH, and 157 non-atopic controls. Replication was performed on 339 African Americans.
Significant associations were observed for both the R501X and 2282del4 mutations and AD among European Americans (P=1.46×10⁻⁵ and 3.87×10⁻⁵, respectively), but the frequency of the R501X mutation was three times higher (25% vs 9%) for ADEH compared to AD without EH (odds ratio [OR]=3.4 (1.7–6.8), P=0.0002). Associations with ADEH were stronger with the combined null mutations (OR=10.1 (4.7–22.1), P=1.99×10⁻¹¹). Associations with the R501X mutation were replicated in the African American population; the null mutation was absent among healthy African Americans, but present among AD (3.2%, P=0.035) and common among ADEH (9.4%; P=0.0049) patients. However, the 2282del4 mutation was absent among African American ADEH patients and rare (<1%) among healthy individuals.
The R501X mutation in the gene encoding filaggrin, one of the strongest genetic predictors of AD, confers an even greater risk for ADEH in both European and African ancestry populations, suggesting a role for defective skin barrier in this devastating condition.
Clinical Implications
The Filaggrin (FLG) R501X Mutation, a major risk factor for atopic dermatitis, confers a greater risk of the severe, HSV-associated complication, eczema herpeticum in diverse ethnic groups.
Capsule Summary
Mutations in the skin barrier function protein, filaggrin, are strong predictors of atopic dermatitis. This report demonstrates an even greater association between one of these mutations (R501X) and eczema herpeticum in ethnically diverse American populations.
PMCID: PMC5103856  PMID: 19733298
Atopic dermatitis; Eczema herpeticum; filaggrin; R501X; 2282del4; Single Nucleotide Polymorphisms
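Odds ratios with interval estimates, as reported in this entry, can be computed from a 2×2 table of carrier counts. The sketch below assumes a Wald interval on the log odds ratio; the abstract does not say how its intervals were derived, and the function name and counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    "Exposed" means carrying the mutation; z=1.96 gives a 95% interval.
    Returns (odds_ratio, lower, upper).
    """
    cells = (exposed_cases, unexposed_cases,
             exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(sum(1.0 / cell for cell in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
odds_ratio, lower, upper = odds_ratio_wald_ci(20, 80, 10, 90)
```

A zero cell, such as a mutation absent in one group, makes this estimator undefined, so exact or continuity-corrected methods would be needed there.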
6.  Challenges and disparities in the application of personalized genomic medicine to populations with African ancestry 
Nature Communications  2016;7:12521.
To characterize the extent and impact of ancestry-related biases in precision genomic medicine, we use 642 whole-genome sequences from the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) project to evaluate typical filters and databases. We find significant correlations between estimated African ancestry proportions and the number of variants per individual in all variant classification sets but one. The source of these correlations is highlighted in more detail by looking at the interaction between filtering criteria and the ClinVar and Human Gene Mutation databases. ClinVar's correlation, representing African ancestry-related bias, has changed over time amidst monthly updates, with the most extreme switch happening between March and April of 2014 (r=0.733 to r=−0.683). We identify 68 SNPs as the major drivers of this change in correlation. As long as ancestry-related bias when using these clinical databases is minimally recognized, the genetics community will face challenges with implementation, interpretation and cost-effectiveness when treating minority populations.
Personalized medicine requires accurate and ethnicity-optimized reference genome panels. Here, the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) evaluates typical variant filters and existing genome databases against newly sequenced African-ancestry populations.
PMCID: PMC5062569  PMID: 27725664
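The correlations reported in this entry relate each individual's estimated African ancestry proportion to that individual's variant count. A minimal sketch of a Pearson correlation on such paired values follows; the data are hypothetical and the abstract does not specify the estimator the authors used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ancestry proportions and per-individual variant counts.
ancestry = [0.10, 0.50, 0.90]
variant_counts = [10, 20, 30]
r = pearson_r(ancestry, variant_counts)
```

A sign flip in r between two database releases, like the one described above, would indicate that the filtered variant set changed from over-representing to under-representing variants common in African-ancestry genomes.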
7.  A continuum of admixture in the Western Hemisphere revealed by the African Diaspora genome 
Nature Communications  2016;7:12522.
The African Diaspora in the Western Hemisphere represents one of the largest forced migrations in history and had a profound impact on genetic diversity in modern populations. To date, the fine-scale population structure of descendants of the African Diaspora remains largely uncharacterized. Here we present genetic variation from deeply sequenced genomes of 642 individuals from North and South American, Caribbean and West African populations, substantially increasing the lexicon of human genomic variation and suggesting much variation remains to be discovered in African-admixed populations in the Americas. We summarize genetic variation in these populations, quantifying the postcolonial sex-biased European gene flow across multiple regions. Moreover, we refine estimates on the burden of deleterious variants carried across populations and how this varies with African ancestry. Our data are an important resource for empowering disease mapping studies in African-admixed individuals and will facilitate gene discovery for diseases disproportionately affecting individuals of African ancestry.
The Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) aims to better understand population genetics of the African diaspora. Here, it uses deeply sequenced whole-genomes to describe the impact of admixture and potential disease burden of deleterious variants.
PMCID: PMC5062574  PMID: 27725671
8.  Hemizygous Deletion on Chromosome 3p26.1 Is Associated with Heavy Smoking among African American Subjects in the COPDGene Study 
PLoS ONE  2016;11(10):e0164134.
Many well-powered genome-wide association studies have identified genetic determinants of self-reported smoking behaviors and measures of nicotine dependence, but most have not considered the role of structural variants, such as copy number variants (CNVs), influencing these phenotypes. Here, we included 2,889 African American and 6,187 non-Hispanic White subjects from the COPDGene cohort to carefully investigate the role of polymorphic CNVs across the genome on various measures of smoking behavior. We identified a CNV component (a hemizygous deletion) on chromosome 3p26.1 associated with two quantitative phenotypes related to smoking behavior among African Americans. This polymorphic hemizygous deletion is significantly associated with pack-years and cigarettes smoked per day among African American subjects in the COPDGene study. We sought evidence of replication in African Americans from the population-based Atherosclerosis Risk in Communities (ARIC) study. While we observed similar CNV counts, the extent of exposure to cigarette smoking among ARIC subjects was quite different and the smaller sample size of heavy smokers in ARIC severely limited statistical power, so we were unable to replicate our findings from the COPDGene cohort. Nevertheless, meta-analyses of COPDGene and ARIC study subjects strengthened our association signal. In addition, a few linkage studies have reported suggestive linkage to the 3p26.1 region, and a few genome-wide association studies (GWAS) have reported that markers in the gene (GRM7) nearest to this 3p26.1 area of polymorphic deletions are associated with measures of nicotine dependence among subjects of European ancestry.
PMCID: PMC5053531  PMID: 27711239
9.  A Genome-Wide Association Study of Emphysema and Airway Quantitative Imaging Phenotypes 
Rationale: Chronic obstructive pulmonary disease (COPD) is defined by the presence of airflow limitation on spirometry, yet subjects with COPD can have marked differences in computed tomography imaging. These differences may be driven by genetic factors. We hypothesized that a genome-wide association study (GWAS) of quantitative imaging would identify loci not previously identified in analyses of COPD or spirometry. In addition, we sought to determine whether previously described genome-wide significant COPD and spirometric loci were associated with emphysema or airway phenotypes.
Objectives: To identify genetic determinants of quantitative imaging phenotypes.
Methods: We performed a GWAS on two quantitative emphysema and two quantitative airway imaging phenotypes in the COPDGene (non-Hispanic white and African American), ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints), NETT (National Emphysema Treatment Trial), and GenKOLS (Genetics of COPD, Norway) studies and on percentage gas trapping in COPDGene. We also examined specific loci reported as genome-wide significant for spirometric phenotypes related to airflow limitation or COPD.
Measurements and Main Results: The total sample size across all cohorts was 12,031, of whom 9,338 were from COPDGene. We identified five loci associated with emphysema-related phenotypes, one with airway-related phenotypes, and two with gas trapping. These loci included previously reported associations, including the HHIP, 15q25, and AGER loci, as well as novel associations near SERPINA10 and DLC1. All previously reported COPD and a significant number of spirometric GWAS loci were at least nominally (P < 0.05) associated with either emphysema or airway phenotypes.
Conclusions: Genome-wide analysis may identify novel risk factors for quantitative imaging characteristics in COPD and also identify imaging features associated with previously identified lung function loci.
PMCID: PMC4595690  PMID: 26030696
emphysema; airway; genetics; chronic obstructive pulmonary disease
10.  Common Genetic Polymorphisms Influence Blood Biomarker Measurements in COPD 
PLoS Genetics  2016;12(8):e1006011.
Implementing precision medicine for complex diseases such as chronic obstructive pulmonary disease (COPD) will require extensive use of biomarkers and an in-depth understanding of how genetic, epigenetic, and environmental variations contribute to phenotypic diversity and disease progression. A meta-analysis from two large cohorts of current and former smokers with and without COPD [SPIROMICS (N = 750); COPDGene (N = 590)] was used to identify single nucleotide polymorphisms (SNPs) associated with measurement of 88 blood proteins (protein quantitative trait loci; pQTLs). The pQTLs consistently replicated between the two cohorts. Features of pQTLs were compared to previously reported expression QTLs (eQTLs). Inference of causal relations of pQTL genotypes, biomarker measurements, and four clinical COPD phenotypes (airflow obstruction, emphysema, exacerbation history, and chronic bronchitis) were explored using conditional independence tests. We identified 527 highly significant (p < 8 × 10⁻¹⁰) pQTLs in 38 (43%) of the blood proteins tested. Most pQTL SNPs were novel with low overlap to eQTL SNPs. The pQTL SNPs explained >10% of measured variation in 13 protein biomarkers, with a single SNP (rs7041; p = 10⁻³⁹²) explaining 71%-75% of the measured variation in vitamin D binding protein (gene = GC). Some of these pQTLs [e.g., pQTLs for VDBP, sRAGE (gene = AGER), surfactant protein D (gene = SFTPD), and TNFRSF10C] have been previously associated with COPD phenotypes. Most pQTLs were local (cis), but distant (trans) pQTL SNPs in the ABO blood group locus were the top pQTL SNPs for five proteins. The inclusion of pQTL SNPs improved the clinical predictive value for the established association of sRAGE and emphysema, and the explanation of variance (R²) for emphysema improved from 0.3 to 0.4 when the pQTL SNP was included in the model along with clinical covariates. Causal modeling provided insight into specific pQTL-disease relationships for airflow obstruction and emphysema.
In conclusion, given the frequency of highly significant local pQTLs, the large amount of variance potentially explained by pQTL, and the differences observed between pQTLs and eQTLs SNPs, we recommend that protein biomarker-disease association studies take into account the potential effect of common local SNPs and that pQTLs be integrated along with eQTLs to uncover disease mechanisms. Large-scale blood biomarker studies would also benefit from close attention to the ABO blood group.
Author Summary
Precision medicine is an emerging approach that takes into account variability in genes, gene and protein expression, environment and lifestyle. Recent advances in high-throughput genome-wide genotyping, genomics, and proteomics, coupled with the creation of large, highly-phenotyped clinical cohorts, now allow for integration of these molecular data sets at the individual level. Here we use genome-wide genotyping and blood measurements of 88 biomarkers in 1,340 subjects from two large NIH-supported clinical cohorts of smokers (SPIROMICS and COPDGene) to identify more than 300 novel DNA variants that influence measurement of blood protein levels (pQTLs). We find that many DNA variants explain a large portion of the variability of measured protein expression in blood. Furthermore, we show that integration of DNA variants with blood biomarker levels can improve the ability of predictive models to reflect the relationship between biomarker and disease features (e.g., emphysema) within chronic obstructive pulmonary disease (COPD).
PMCID: PMC4988780  PMID: 27532455
11.  A Genome-wide analysis of the response to inhaled beta2-agonists in Chronic Obstructive Pulmonary Disease 
The pharmacogenomics journal  2015;16(4):326-335.
Short-acting β2-agonist bronchodilators are the most common medications used in treating chronic obstructive pulmonary disease (COPD). Genetic variants determining bronchodilator responsiveness (BDR) in COPD have not been identified.
We performed a genome-wide association study (GWAS) of BDR in 5789 current or former smokers with COPD in one African American and four white populations. BDR was defined as the quantitative spirometric response to inhaled β2-agonists. We combined results in a meta-analysis.
In the meta-analysis, SNPs in the genes KCNK1 (P=2.02×10⁻⁷) and KCNJ2 (P=1.79×10⁻⁷) were the top associations with BDR. Among African Americans, SNPs in CDH13 were significantly associated with BDR (P=5.1×10⁻⁹). A nominal association with CDH13 was identified in a gene-based analysis in all subjects.
We identified suggestive association with BDR among COPD subjects for variants near two potassium channel genes (KCNK1 and KCNJ2). SNPs in CDH13 were significantly associated with BDR in African Americans.
PMCID: PMC4848212  PMID: 26503814
12.  Genome-Wide Association Study Identification of Novel Loci Associated with Airway Responsiveness in Chronic Obstructive Pulmonary Disease 
Increased airway responsiveness is linked to lung function decline and mortality in subjects with chronic obstructive pulmonary disease (COPD); however, the genetic contribution to airway responsiveness remains largely unknown. A genome-wide association study (GWAS) was performed using the Illumina (San Diego, CA) Human660W-Quad BeadChip on European Americans with COPD from the Lung Health Study. Linear regression models with correlated meta-analyses, including data from baseline (n = 2,814) and Year 5 (n = 2,657), were used to test for common genetic variants associated with airway responsiveness. Genotypic imputation was performed using reference 1000 Genomes Project data. Expression quantitative trait loci (eQTL) analyses in lung tissues were assessed for the top 10 markers identified, and immunohistochemistry assays assessed protein staining for SGCD and MYH15. Four genes were identified within the top 10 associations with airway responsiveness. Markers on chromosome 9p21.2 flanked by LINGO2 met a predetermined threshold of genome-wide significance (P < 9.57 × 10⁻⁸). Markers on chromosomes 3q13.1 (flanked by MYH15), 5q33 (SGCD), and 6q21 (PDSS2) yielded suggestive evidence of association (9.57 × 10⁻⁸ < P ≤ 4.6 × 10⁻⁶). Gene expression studies in lung tissue showed single nucleotide polymorphisms on chromosomes 5 and 3 to act as eQTL for SGCD (P = 2.57 × 10⁻⁹) and MYH15 (P = 1.62 × 10⁻⁶), respectively. Immunohistochemistry confirmed localization of SGCD protein to airway smooth muscle and vessels and MYH15 to airway epithelium, vascular endothelium, and inflammatory cells. We identified novel loci associated with airway responsiveness in a GWAS among smokers with COPD. Risk alleles on chromosomes 5 and 3 acted as eQTLs for SGCD and MYH15 messenger RNA, and these proteins were expressed in lung cells relevant to the development of airway responsiveness.
PMCID: PMC4566043  PMID: 25514360
COPD; airway reactivity; bronchial responsiveness; eQTL; δ-sarcoglycan
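The two significance tiers stated in this entry (genome-wide significance and suggestive evidence) amount to a simple classification of association P values. The sketch below encodes the thresholds given in the abstract; the function and constant names are hypothetical, and assigning a P value exactly equal to the lower threshold to the suggestive tier is an assumption, since the abstract leaves that boundary unspecified.

```python
# Thresholds quoted in the abstract.
GENOME_WIDE_THRESHOLD = 9.57e-8
SUGGESTIVE_THRESHOLD = 4.6e-6


def classify_association(p_value):
    """Label a GWAS P value using the two tiers described in the abstract."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < GENOME_WIDE_THRESHOLD:
        return "genome-wide significant"
    # Assumption: a value exactly at the genome-wide threshold falls here.
    if p_value <= SUGGESTIVE_THRESHOLD:
        return "suggestive"
    return "not significant"


# Hypothetical P values for illustration.
labels = [classify_association(p) for p in (1e-8, 1e-6, 0.01)]
```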
13.  IRF6 mutation screening in nonsyndromic orofacial clefting: analysis of 1521 families 
Clinical genetics  2015;90(1):28-34.
Van der Woude syndrome (VWS) is an autosomal dominant malformation syndrome characterized by orofacial clefting (OFC) and lower lip pits. The clinical presentation of VWS is variable and can present as an isolated OFC, making it difficult to distinguish VWS cases from individuals with nonsyndromic OFCs. About 70% of causal VWS mutations occur in IRF6, a gene that is also associated with nonsyndromic OFCs. Screening for IRF6 mutations in apparently nonsyndromic cases has been performed in several modestly sized cohorts with mixed results. In the current study we screened 1521 trios with presumed nonsyndromic OFCs to determine the frequency of causal IRF6 mutations. We identified seven likely causal IRF6 mutations, although a posteriori review identified two misdiagnosed VWS families based on the presence of lip pits. We found no evidence for association between rare IRF6 polymorphisms and nonsyndromic OFCs. We combined our results with other similar studies (totaling 2,472 families) and conclude that causal IRF6 mutations are found in 0.24%-0.44% of apparently nonsyndromic OFC families. We suggest that clinical mutation screening for IRF6 be considered for certain family patterns such as families with mixed types of OFCs and/or autosomal dominant transmission.
PMCID: PMC4783275  PMID: 26346622
nonsyndromic oral clefts; syndromic cleft; interferon regulatory factor 6; mutation screening
14.  Genome-wide site-specific differential methylation in the blood of individuals with Klinefelter Syndrome 
Klinefelter syndrome (KS) (47 XXY) is a common sex-chromosome aneuploidy with an estimated prevalence of 1 in every 660 male births. Investigations into the associations between DNA methylation and the highly variable clinical manifestations of KS have largely focused on the supernumerary X chromosome; systematic investigations of the epigenome have been limited. We obtained genome-wide DNA methylation data from peripheral blood using the Illumina HumanMethylation450K platform in 5 KS (47 XXY), 102 male (46 XY), and 113 female (46 XX) control subjects participating in the chronic obstructive pulmonary disease (COPD) Gene Study. Empirical Bayes-mediated models were used to test for differential methylation by KS status. CpG sites with a false-discovery rate <0.05 from the first-generation HumanMethylation27K platform were further examined in an independent replication cohort of 2 KS subjects, 590 male, and 495 female controls drawn from the International COPD Genetics Network (ICGN). Differential methylation was identified at sites throughout the genome, including 86 CpG sites that were differentially methylated in KS subjects relative to both male and female controls. CpG sites annotated to the HEN1 methyltransferase homolog 1 (HENMT1), calcyclin-binding protein (CACYBP), and GTPase-activating protein (SH3 domain)-binding protein 1 (G3BP1) genes were among the “KS-specific” loci that were replicated in ICGN. We therefore conclude that site-specific differential methylation exists throughout the genome in KS. The functional impact and clinical relevance of these differentially methylated loci should be explored in future studies.
PMCID: PMC4439255  PMID: 25988574
[MeSH]: Klinefelter syndrome; DNA methylation; epigenomics; XXY syndrome
15.  A genome-wide study of inherited deletions identified two regions associated with non-syndromic isolated oral clefts 
DNA copy number variants play an important part in the development of common birth defects such as oral clefts. Individual patients with multiple birth defects (including oral clefts) have been shown to carry small and large chromosomal deletions.
We investigated the role of polymorphic copy number deletions by comparing transmission rates of deletions from parents to offspring in case-parent trios of European ancestry ascertained through a cleft proband with trios ascertained through a normal offspring. DNA copy numbers in trios were called using the joint hidden Markov model in the freely available PennCNV software. All statistical analyses were performed using Bioconductor tools in the open source environment R.
We identified a 67 kilo-base (kb) region in the gene MGAM on chromosome 7q34, and a 206 kb region overlapping genes ADAM3A and ADAM5 on chromosome 8p11, where deletions are more frequently transmitted to cleft offspring than control offspring.
These genes or nearby regulatory elements may be involved in the etiology of oral clefts.
PMCID: PMC4415613  PMID: 25776870
Oral clefts; DNA copy numbers; inherited deletions; case-parent trios; arrays
16.  Clinical and Radiologic Disease in Smokers With Normal Spirometry 
JAMA internal medicine  2015;175(9):1539-1549.
Airflow obstruction on spirometry is universally used to define chronic obstructive pulmonary disease (COPD), and current or former smokers without airflow obstruction may assume that they are disease free.
To identify clinical and radiologic evidence of smoking-related disease in a cohort of current and former smokers who did not meet spirometric criteria for COPD, for whom we adopted the discarded label of Global Initiative for Obstructive Lung Disease (GOLD) 0.
Individuals from the Genetic Epidemiology of COPD (COPDGene) cross-sectional observational study completed spirometry, chest computed tomography (CT) scans, a 6-minute walk, and questionnaires. Participants were recruited from local communities at 21 sites across the United States. The GOLD 0 group (n = 4388) (ratio of forced expiratory volume in the first second of expiration [FEV1] to forced vital capacity >0.7 and FEV1 ≥80% predicted) from the COPDGene study was compared with a GOLD 1 group (n = 794), COPD groups (n = 3690), and a group of never smokers (n = 108). Recruitment began in January 2008 and ended in July 2011.
Physical function impairments, respiratory symptoms, CT abnormalities, use of respiratory medications, and reduced respiratory-specific quality of life.
One or more respiratory-related impairments were found in 54.1% (2375 of 4388) of the GOLD 0 group. The GOLD 0 group had worse quality of life (mean [SD] St George’s Respiratory Questionnaire total score, 17.0 [18.0] vs 3.8 [6.8] for the never smokers; P < .001) and a lower 6-minute walk distance, and 42.3% (127 of 300) of the GOLD 0 group had CT evidence of emphysema or airway thickening. The FEV1 percent predicted distribution and mean for the GOLD 0 group were lower but still within the normal range for the population. Current smoking was associated with more respiratory symptoms, but former smokers had greater emphysema and gas trapping. Advancing age was associated with smoking cessation and with more CT findings of disease. Individuals with respiratory impairments were more likely to use respiratory medications, and the use of these medications was associated with worse disease.
Lung disease and impairments were common in smokers without spirometric COPD. Based on these results, we project that there are 35 million current and former smokers older than 55 years in the United States who may have unrecognized disease or impairment. The effect of chronic smoking on the lungs and the individual is substantially underestimated when using spirometry alone.
PMCID: PMC4564354  PMID: 26098755
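The spirometric definition of the GOLD 0 group given in this entry (FEV1/FVC ratio >0.7 and FEV1 ≥80% predicted) can be written as a small predicate. This is only a sketch of that stated rule; the function name and argument units are hypothetical, and it does not reproduce the study's other inclusion criteria, such as smoking history.

```python
def is_gold0_spirometry(fev1_liters, fvc_liters, fev1_percent_predicted):
    """Return True when spirometry meets the GOLD 0 criteria in the abstract.

    Criteria: FEV1/FVC ratio above 0.7 and FEV1 at least 80% of predicted.
    """
    if fvc_liters <= 0:
        raise ValueError("FVC must be positive")
    ratio = fev1_liters / fvc_liters
    return ratio > 0.7 and fev1_percent_predicted >= 80.0


# Hypothetical measurements: ratio 0.75 and 95% predicted.
normal_spirometry = is_gold0_spirometry(3.0, 4.0, 95.0)
```

The entry's central finding is that passing this check does not rule out smoking-related disease, so a result of True here should not be read as "disease free".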
17.  Genetic control of gene expression at novel and established chronic obstructive pulmonary disease loci 
Human Molecular Genetics  2014;24(4):1200-1210.
Genetic risk loci have been identified for a wide range of diseases through genome-wide association studies (GWAS), but the relevant functional mechanisms have been identified for only a small proportion of these GWAS-identified loci. By integrating results from the largest current GWAS of chronic obstructive pulmonary disease (COPD) with expression quantitative trait locus (eQTL) analysis in whole blood and sputum from 121 subjects with COPD from the ECLIPSE Study, this analysis identifies loci that are simultaneously associated with COPD and the expression of nearby genes (COPD eQTLs). After integrative analysis, 19 COPD eQTLs were identified, including all four previously identified genome-wide significant loci near HHIP, FAM13A, and the 15q25 and 19q13 loci. For each COPD eQTL, fine mapping and colocalization analysis to identify causal shared eQTL and GWAS variants identified a subset of sites with moderate-to-strong evidence of harboring at least one shared variant responsible for both the eQTL and GWAS signals. Transcription factor binding site (TFBS) analysis confirms that multiple COPD eQTL lead SNPs disrupt TFBS, and enhancer enrichment analysis for loci with the strongest colocalization signals showed enrichment for blood-related cell types (CD3 and CD4+ T cells, lymphoblastoid cell lines). In summary, integrative eQTL and GWAS analysis confirms that genetic control of gene expression plays a key role in the genetic architecture of COPD and identifies specific blood-related cell types as likely participants in the functional pathway from GWAS-associated variant to disease phenotype.
PMCID: PMC4806382  PMID: 25315895
18.  Dissecting genetics for chronic mucus hypersecretion in smokers with and without COPD 
Smoking is a notorious risk factor for chronic mucus hypersecretion (CMH). CMH frequently occurs in Chronic Obstructive Pulmonary Disease (COPD). The question arises whether the same single nucleotide polymorphisms (SNPs) are related to CMH in smokers with and without COPD.
We performed two genome wide association studies on CMH under an additive genetic model in male heavy smokers (≥20 pack-years) with COPD (n=849, 39.9% CMH) and without COPD (n=1,348, 25.4% CMH), followed by replication and meta-analysis in comparable populations, and assessment of the functional relevance of significantly associated SNPs.
GWA analysis on CMH in COPD and non-COPD yielded no genome wide significance after replication. In COPD, our top SNP (rs10461985, p=5.43×10−5) was located in the GDNF-antisense gene that is functionally associated with the GDNF gene. Expression of GDNF in bronchial biopsies of COPD patients was significantly associated with CMH (p=0.007). In non-COPD, 4 SNPs had a p-value <10−5 in the meta-analysis, including a SNP (rs4863687) in the MAML3 gene, the T-allele showing modest association with CMH (p=7.57×10−6, OR=1.48) and with significantly increased MAML3 expression in lung tissue (p=2.59×10−12).
Our data suggest the potential for differential genetic backgrounds of CMH in individuals with and without COPD.
PMCID: PMC4498483  PMID: 25234806
19.  A genome-wide association study identifies risk loci for spirometric measures among smokers of European and African ancestry 
BMC Genetics  2015;16:138.
Pulmonary function decline is a major contributor to morbidity and mortality among smokers. Post bronchodilator FEV1 and FEV1/FVC ratio are considered the standard assessment of airflow obstruction. We performed a genome-wide association study (GWAS) in 9919 current and former smokers in the COPDGene study (6659 non-Hispanic Whites [NHW] and 3260 African Americans [AA]) to identify associations with spirometric measures (post-bronchodilator FEV1 and FEV1/FVC). We also conducted meta-analysis of FEV1 and FEV1/FVC GWAS in the COPDGene, ECLIPSE, and GenKOLS cohorts (total n = 13,532).
Among NHW in the COPDGene cohort, both measures of pulmonary function were significantly associated with SNPs at the 15q25 locus [containing CHRNA3/5, AGPHD1, IREB2, CHRNB4] (lowest p-value = 2.17 × 10−11), and FEV1/FVC was associated with a genomic region on chromosome 4 [upstream of HHIP] (lowest p-value = 5.94 × 10−10); both regions have been previously associated with COPD. For the meta-analysis, in addition to confirming associations to the regions near CHRNA3/5 and HHIP, genome-wide significant associations were identified for FEV1 on chromosome 1 [TGFB2] (p-value = 8.99 × 10−9), 9 [DBH] (p-value = 9.69 × 10−9) and 19 [CYP2A6/7] (p-value = 3.49 × 10−8) and for FEV1/FVC on chromosome 1 [TGFB2] (p-value = 8.99 × 10−9), 4 [FAM13A] (p-value = 3.88 × 10−12), 11 [MMP3/12] (p-value = 3.29 × 10−10) and 14 [RIN3] (p-value = 5.64 × 10−9).
In a large genome-wide association study of lung function in smokers, we found genome-wide significant associations at several previously described loci with lung function or COPD. We additionally identified a novel genome-wide significant locus with FEV1 on chromosome 9 [DBH] in a meta-analysis of three study populations.
Electronic supplementary material
The online version of this article (doi:10.1186/s12863-015-0299-4) contains supplementary material, which is available to authorized users.
PMCID: PMC4668640  PMID: 26634245
Chronic obstructive pulmonary disease; DBH; FEV1; FEV1/FVC; Genome-wide association study; Spirometry
20.  Analytic power and sample size calculation for the genotypic transmission/disequilibrium test in case-parent trio studies 
Case-parent trio studies considering genotype data from children affected by a disease and from their parents are frequently used to detect single nucleotide polymorphisms (SNPs) associated with disease. The most popular statistical tests in this study design are transmission/disequilibrium tests (TDTs). Several types of these tests have been developed, e.g., procedures based on alleles or genotypes. Therefore, it is of great interest to examine which of these tests have the highest statistical power to detect SNPs associated with disease. Comparisons of the allelic and the genotypic TDT for individual SNPs have so far been conducted based on simulation studies, since the test statistic of the genotypic TDT was determined numerically. Recently, however, it has been shown that this test statistic can be presented in closed form. In this article, we employ this analytic solution to derive equations for calculating the statistical power and the required sample size for different types of the genotypic TDT. The power of this test is then compared with that of the corresponding score test assuming the same mode of inheritance as well as the allelic TDT based on a multiplicative mode of inheritance, which is equivalent to the score test assuming an additive mode of inheritance. This is, thus, the first time that the power of these tests is compared based on equations, yielding instant results and omitting the need for time-consuming simulation studies. This comparison reveals that the tests have almost the same power, with the score test being slightly more powerful.
PMCID: PMC4206700  PMID: 25123830
Case-parent trio design; Conditional logistic regression; Genome-wide association studies; Power calculation; Wald test
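As a point of reference for the allelic TDT mentioned in the abstract above, the following is a minimal sketch of the standard McNemar-type allelic TDT statistic. It is not the genotypic TDT or the power equations derived in the article; the function name and the example counts are illustrative assumptions. The statistic counts, among heterozygous parents, how often the allele of interest was transmitted versus not transmitted to the affected child, and refers the result to a chi-square distribution with one degree of freedom.

```python
import math


def allelic_tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Return (chi-square statistic, p-value) for the allelic TDT.

    transmitted / untransmitted: counts of heterozygous parents who did or
    did not pass the allele of interest to the affected child.
    (Hypothetical helper written for illustration only.)
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    # McNemar-type statistic: (b - c)^2 / (b + c)
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up counts, used only to show the call
    stat, p = allelic_tdt(transmitted=60, untransmitted=40)
    print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

Only heterozygous parents contribute to the counts, since a homozygous parent transmits the same allele regardless of association.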
21.  Common Genetic Variants Associated with Resting Oxygenation in Chronic Obstructive Pulmonary Disease 
Hypoxemia is a major complication of chronic obstructive pulmonary disease (COPD) that correlates with disease prognosis. Identifying genetic variants associated with oxygenation may provide clues for deciphering the heterogeneity in prognosis among patients with COPD. However, previous genetic studies have been restricted to investigating COPD candidate genes for association with hypoxemia. To report results from the first genome-wide association study (GWAS) of resting oxygen saturation (as measured by pulse oximetry [Spo2]) in subjects with COPD, we performed a GWAS of Spo2 in two large, well characterized COPD populations: COPDGene, including both the non-Hispanic white (NHW) and African American (AA) groups, and Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE). We identified several suggestive loci (P < 1 × 10−5) associated with Spo2 in COPDGene in the NHW (n = 2810) and ECLIPSE (n = 1758) groups, and two loci on chromosomes 14 and 15 in the AA group (n = 820) from COPDGene achieving a level of genome-wide significance (P < 5 × 10−8). The chromosome 14 single-nucleotide polymorphism, rs6576132, located in an intergenic region, was nominally replicated (P < 0.05) in the NHW group from COPDGene. The chromosome 15 single-nucleotide polymorphisms were rare in subjects of European ancestry, so the results could not be replicated. The chromosome 15 region contains several genes, including TICRR and KIF7, and is proximal to RHCG (Rh family C glycoprotein gene). We have identified two loci associated with resting oxygen saturation in AA subjects with COPD, and several suggestive regions in subjects of European descent with COPD. Our study highlights the importance of investigating the genetics of complex traits in different racial groups.
PMCID: PMC4224086  PMID: 24825563
chronic obstructive pulmonary disease; hypoxemia; pulse oximetry; genome-wide association study; oxygen saturation
22.  Oesophageal squamous cell carcinoma in high-risk Chinese populations: Possible role for vascular epithelial growth factor A 
Mechanisms involved in wound healing play some role in carcinogenesis in multiple organs, likely by creating a chronic inflammatory milieu. This study sought to assess the role of genetic markers in selected inflammation-related genes involved in wound healing (interleukin (IL)-1a, IL-1b, IL-1 Receptor type I (IL-1Ra), IL-1 Receptor type II (IL-1Rb), tumour necrosis factor (TNF)-α, tumour necrosis factor receptor superfamily member (TNFRSF)1A, nuclear factor kappa beta (NF-kB)1, NF-kB2, inducible nitric oxide synthase (iNOS), cyclooxygenase (COX)-2, hypoxia induced factor (HIF)-1α, vascular endothelial growth factor (VEGF)A and P-53) in risk to oesophageal squamous cell carcinoma (OSCC).
We genotyped 125 tag single nucleotide polymorphism (SNP)s in 410 cases and 377 age and sex matched disease-free individuals from Nutritional Intervention Trial (NIT) cohort, and 546 cases and 556 controls individually matched for age, sex and neighbourhood from Shanxi case–control study, both conducted in high-risk areas of north-central China (1985–2007). Cox proportional-hazard models and conditional logistic regression models were used for SNPs analyses for NIT and Shanxi, respectively. Fisher's inverse test statistics were used to obtain gene-level significance.
Multiple SNPs were significantly associated with OSCC in both studies; however, none retained their significance after a conservative Bonferroni adjustment. Empiric p-values for tag SNPs in VEGFA in NIT were highly concentrated in the lower tail of the distribution, suggesting this gene may be influencing risk. Permutation tests confirmed the significance of this pattern. At the gene level, VEGFA yielded an empiric significance (P = 0.027) in NIT. We also observed some evidence for interaction between environmental factors and some VEGFA tag SNPs.
Our finding adds further evidence for a potential role for markers in the VEGFA gene in the development and progression of early precancerous lesions of the oesophagus.
PMCID: PMC4363989  PMID: 25172294
Oesophageal squamous cell carcinoma; Inflammation; Wound-healing; Genetic marker; Genetics; Inflammation-related events; Vascular endothelial growth factor A; VEGFA
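The gene-level step described in the abstract above (combining SNP p-values with Fisher's inverse test) and the Bonferroni adjustment can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: the function names, the 0.05 family-wise alpha, and the example p-values are assumptions. The analytic chi-square reference used here presumes independent p-values; correlated tag SNPs within a gene would generally call for an empirical (permutation-based) reference instead, and the abstract does report empiric p-values.

```python
import math


def chi2_sf_even_df(x: float, k: int) -> float:
    """Survival function of a chi-square variable with 2*k degrees of freedom.

    For even degrees of freedom this has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined(p_values: list[float]) -> tuple[float, float]:
    """Fisher's combined probability test: X = -2 * sum(ln p_i), df = 2k."""
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in p_values)
    return stat, chi2_sf_even_df(stat, len(p_values))


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni adjustment."""
    return alpha / n_tests


if __name__ == "__main__":
    # Made-up SNP p-values for one hypothetical gene
    stat, p_gene = fisher_combined([0.04, 0.01, 0.20])
    print(f"Fisher statistic = {stat:.3f}, gene-level p = {p_gene:.4f}")
    # 125 tag SNPs were genotyped; alpha = 0.05 is an assumed convention
    print(f"Bonferroni threshold = {bonferroni_threshold(0.05, 125):.4g}")
```

With 125 tag SNPs and an assumed alpha of 0.05, the per-SNP Bonferroni threshold is 0.0004, which illustrates why nominally significant SNPs may not survive the adjustment.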
23.  Admixture Mapping Identifies a Quantitative Trait Locus Associated with FEV1/FVC in the COPDGene Study 
Genetic epidemiology  2014;38(7):652-659.
African Americans are admixed with genetic contributions from European and African ancestral populations. Admixture mapping leverages this information to map genes influencing differential disease risk across populations. We performed admixture and association mapping in 3300 African American current or former smokers from the COPDGene Study. We analyzed estimated local ancestry and SNP genotype information to identify regions associated with FEV1/FVC, the ratio of forced expiratory volume in one second to forced vital capacity, measured by spirometry performed after bronchodilator administration. Global African ancestry was inversely associated with FEV1/FVC (p = 0.035). Genome-wide admixture analysis, controlling for age, gender, body mass index, current smoking status, pack-years smoked, and four principal components summarizing the genetic background of African Americans in the COPDGene Study, identified a region on chromosome 12q14.1 associated with FEV1/FVC (p = 2.1 × 10−6) when regressed on local ancestry. Allelic association in this region of chromosome 12 identified an intronic variant in FAM19A2 (rs348644) as associated with FEV1/FVC (p = 1.76 × 10−6). By combining admixture and association mapping, a marker on chromosome 12q14.1 was identified as being associated with reduced FEV1/FVC ratio among African Americans in the COPDGene Study.
PMCID: PMC4190160  PMID: 25112515
admixture mapping; lung function; COPD; African Americans
24.  Genome-wide interaction studies reveal sex-specific asthma risk alleles 
Human Molecular Genetics  2014;23(19):5251-5259.
Asthma is a complex disease with sex-specific differences in prevalence. Candidate gene studies have suggested that genotype-by-sex interaction effects on asthma risk exist, but this has not yet been explored at a genome-wide level. We aimed to identify sex-specific asthma risk alleles by performing a genome-wide scan for genotype-by-sex interactions in the ethnically diverse participants in the EVE Asthma Genetics Consortium. We performed male- and female-specific genome-wide association studies in 2653 male asthma cases, 2566 female asthma cases and 3830 non-asthma controls from European American, African American, African Caribbean and Latino populations. Association tests were conducted in each study sample, and the results were combined in ancestry-specific and cross-ancestry meta-analyses. Six sex-specific asthma risk loci had P-values < 1 × 10−6, of which two were male specific and four were female specific; all were ancestry specific. The most significant sex-specific association in European Americans was at the interferon regulatory factor 1 (IRF1) locus on 5q31.1. We also identify a Latino female-specific association in RAP1GAP2. Both of these loci included single-nucleotide polymorphisms that are known expression quantitative trait loci and have been associated with asthma in independent studies. The IRF1 locus is a strong candidate region for male-specific asthma susceptibility due to the association and validation we demonstrate here, the known role of IRF1 in asthma-relevant immune pathways and prior reports of sex-specific differences in interferon responses.
PMCID: PMC4159149  PMID: 24824216
25.  Detecting disease variants in case-parent trio studies using the Bioconductor software package trio 
Genetic epidemiology  2014;38(6):516-522.
Case-parent trio studies are commonly employed in genetics to detect variants underlying common complex disease risk. Both commercial and freely available software suites for genetic data analysis usually contain methods for case-parent trio designs. A user might, however, experience limitations with these packages, which can include missing functionality to extend the software if a desired analysis has not been implemented, and the inability to programmatically capture all the software versions used for low-level processing and high-level inference of genomic data, a critical consideration in particular for high-throughput experiments. Here, we present a software vignette (i.e., a manual with step by step instructions and examples to demonstrate software functionality) for reproducible genome-wide analyses of case-parent trio data using the open source Bioconductor package trio. The workflow for the practitioner uses data from previous genetic trio studies to illustrate functions for marginal association tests, assessment of parent-of-origin effects, power and sample size calculations, and functions to detect gene-gene and gene-environment interactions associated with disease.
PMCID: PMC4139708  PMID: 25048299
Software; Case-parent trios; Transmission disequilibrium tests; Gene-environment interactions; Parent-of-origin effects

