Genome-wide association studies of human gene expression promise to identify functional regulatory genetic variation that contributes to phenotypic diversity. However, it is unclear how useful this approach will be for the identification of disease-susceptibility variants. We generated gene expression profiles for 22 184 mRNA transcripts using RNA derived from peripheral blood CD4+ lymphocytes, and genome-wide genotype data for 516 512 autosomal markers in 200 subjects. We screened for cis-acting variants by testing variants mapping within 50 kb of expressed transcripts for association with transcript abundance using generalized linear models. Significant associations were identified for 1585 genes at a false discovery rate of 0.05 (corresponding to P-values ranging from 1 × 10−91 to 7 × 10−4). Importantly, we identified evidence of regulatory variation for 119 previously mapped disease genes, including 24 examples where the variant with the strongest evidence of disease-association demonstrates strong association with specific transcript abundance. The prevalence of cis-acting variants among disease-associated genes was 63% higher than the genome-wide rate in our data set (P = 6.41 × 10−6), and although many of the implicated loci were associated with immune-related diseases (including asthma, connective tissue disorders and inflammatory bowel disease), associations with genes implicated in non-immune-related diseases including lipid profiles, anthropometric measurements, cancer and neurologic disease were also observed. Genetic variants that confer inter-individual differences in gene expression represent an important subset of variants that contribute to disease susceptibility. Population-based integrative genetic approaches can help identify such variation and enhance our understanding of the genetic basis of complex traits.
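The false discovery rate threshold reported above can be illustrated with a short sketch. The abstract does not state which FDR procedure was used, so the Benjamini-Hochberg step-up rule shown here is an assumption, and the function name and example P-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per test: True if rejected at the given FDR.

    Assumption: Benjamini-Hochberg step-up procedure; the abstract
    does not name the FDR method that was actually applied.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical P-values for ten transcript-variant tests.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example, fdr=0.05))
```

Under this rule the largest rejected P-value depends on how many tests are run and how many are rejected, which is why an FDR of 0.05 can correspond to a nominal cutoff far below 0.05.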
A 900-kb inversion exists within a large region of conserved linkage disequilibrium (LD) on chromosome 17. CRHR1 is located within the inversion region and associated with inhaled corticosteroid response in asthma. We hypothesized that CRHR1 variants are in LD with the inversion, supporting a potential role for natural selection in the genetic response to corticosteroids. We genotyped 6 single nucleotide polymorphisms (SNPs) spanning chr17:40,410,565–42,372,240, including 4 SNPs defining inversion status. Similar allele frequencies and strong LD were noted between the inversion and a CRHR1 SNP previously associated with lung function response to inhaled corticosteroids. Each inversion-defining SNP was strongly associated with inhaled corticosteroid response in adult asthma (p-values 0.002–0.005). The CRHR1 response to inhaled corticosteroids may thus be explained by natural selection resulting from inversion status or by long-range LD with another gene. Additional pharmacogenetic investigations into regions of chromosomal diversity, including copy number variation and inversions, are warranted.
CRHR1; tau haplotype; MAPT; inversion; asthma; corticosteroid; pharmacogenetics
Corticotropin-releasing hormone receptor 2 (CRHR2) participates in smooth muscle relaxation response and may influence acute airway bronchodilator response to short-acting β2 agonist treatment of asthma. We aim to assess associations between genetic variants of CRHR2 and acute bronchodilator response in asthma.
We investigated 28 single nucleotide polymorphisms in CRHR2 for associations with acute bronchodilator response to albuterol in 607 Caucasian asthmatic subjects recruited as part of the Childhood Asthma Management Program (CAMP). Replication was conducted in two Caucasian adult asthma cohorts – a cohort of 427 subjects enrolled in a completed clinical trial conducted by Sepracor Inc. (MA, USA) and a cohort of 152 subjects enrolled in the Clinical Trial of Low-Dose Theophylline and Montelukast (LODO) conducted by the American Lung Association Asthma Clinical Research Centers.
Five variants were significantly associated with acute bronchodilator response in at least one cohort (p-value ≤ 0.05). Variant rs7793837 was associated in CAMP and LODO (p-value = 0.05 and 0.03, respectively) and haplotype blocks residing at the 5’ end of CRHR2 were associated with response in all three cohorts.
We report for the first time, at the gene level, replicated associations between CRHR2 and acute bronchodilator response. While no single variant was significantly associated in all three cohorts, the finding that variants at the 5’ end of CRHR2 are associated in each of three cohorts strongly suggests that the causative variants reside in this region and that their genetic effect, although present, is likely to be weak.
Asthma; genetics; corticotrophin releasing hormone receptor 2; CRHR2; bronchodilator response; polymorphism; β2 adrenergic receptor agonist
Electronic health records (EHRs) have the potential to improve completeness and timeliness of tuberculosis (TB) surveillance relative to traditional reporting, particularly for culture-negative disease. We report on the development and validation of a TB detection algorithm for EHR data followed by implementation in a live surveillance and reporting system.
We used structured electronic data from an ambulatory practice in eastern Massachusetts to develop a screening algorithm aimed at achieving 100% sensitivity for confirmed active TB with the highest possible positive predictive value (PPV) for physician-suspected disease. We validated the algorithm in 16 years of retrospective electronic data and then implemented it in a real-time EHR-based surveillance system. We assessed PPV and the completeness of case capture relative to conventional reporting in 18 months of prospective surveillance.
The final algorithm required any one of the following: a prescription for pyrazinamide; an International Classification of Diseases, Ninth Revision (ICD-9) code for TB together with prescriptions for two antituberculous medications; or an ICD-9 code for TB together with an order for a TB diagnostic test. During validation, this algorithm had a PPV of 84% (95% confidence interval 78, 88) for physician-suspected disease. One-third of confirmed cases were culture-negative. All false-positives were instances of latent TB. In 18 months of prospective EHR-based surveillance with this algorithm, seven additional cases of physician-suspected active TB were detected, including two patients with culture-negative disease. A review of state health department records revealed no cases missed by the algorithm.
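The screening rule can be sketched as a simple predicate, reading the criteria as three alternatives, any one of which flags a patient. The record fields and function name below are hypothetical, and treating "two antituberculous medications" as "at least two distinct medications" is an assumption rather than something the abstract states.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical summary of the structured EHR data for one patient."""
    pyrazinamide_rx: bool             # any prescription for pyrazinamide
    tb_icd9_code: bool                # any ICD-9 code for TB
    n_anti_tb_drugs: int              # distinct antituberculous medications prescribed
    tb_diagnostic_test_ordered: bool  # any order for a TB diagnostic test


def flags_suspected_tb(rec: PatientRecord) -> bool:
    """Return True if the record meets any of the three screening criteria."""
    return (
        rec.pyrazinamide_rx
        # Assumption: "two medications" is read as "at least two".
        or (rec.tb_icd9_code and rec.n_anti_tb_drugs >= 2)
        or (rec.tb_icd9_code and rec.tb_diagnostic_test_ordered)
    )


# A TB code with a single drug and no diagnostic order is not flagged.
print(flags_suspected_tb(PatientRecord(False, True, 1, False)))
```

Because the criteria are joined by "or", adding a criterion can only raise sensitivity and can only lower or preserve PPV, which matches the stated design goal of full sensitivity first and the highest achievable PPV second.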
Live, prospective TB surveillance using EHR data is feasible and promising.
Rationale: Several family-based studies have identified genetic linkage for lung function and airflow obstruction to chromosome 2q.
Objectives: We hypothesized that merging results of high-resolution single nucleotide polymorphism (SNP) mapping in four separate populations would lead to the identification of chronic obstructive pulmonary disease (COPD) susceptibility genes on chromosome 2q.
Methods: Within the chromosome 2q linkage region, 2,843 SNPs were genotyped in 806 COPD cases and 779 control subjects from Norway, and 2,484 SNPs were genotyped in 309 patients with severe COPD from the National Emphysema Treatment Trial and 330 community control subjects. Significant associations from the combined results across the two case-control studies were followed up in 1,839 individuals from 603 families from the International COPD Genetics Network (ICGN) and in 949 individuals from 127 families in the Boston Early-Onset COPD Study.
Measurements and Main Results: Merging the results of the two case-control analyses, 14 of the 790 overlapping SNPs had a combined P < 0.01. Two of these 14 SNPs were consistently associated with COPD in the ICGN families. The association with one SNP, located in the gene XRCC5, was replicated in the Boston Early-Onset COPD Study, with a combined P = 2.51 × 10−5 across the four studies, which remains significant when adjusted for multiple testing (P = 0.02). Genotype imputation confirmed the association with SNPs in XRCC5.
Conclusions: By combining data from COPD genetic association studies conducted in four independent patient samples, we have identified XRCC5, an ATP-dependent DNA helicase, as a potential COPD susceptibility gene.
emphysema; genetic linkage; metaanalysis; single nucleotide polymorphism
Pathogens have represented an important selective force during the adaptation of modern human populations to changing social and other environmental conditions. The evolution of the immune system has therefore been influenced by these pressures. Genomic scans have revealed that the immune system is one of the functions enriched with genes under adaptive selection.
Here, we describe how the innate immune system has responded to these challenges, through the analysis of resequencing data for 132 innate immunity genes in two human populations. Results are interpreted in the context of the functional and interaction networks defined by these genes. Nucleotide diversity is lower in the adaptors and modulators functional classes, and is negatively correlated with the centrality of the proteins within the interaction network. We also produced a list of candidate genes under positive or balancing selection in each population detected by neutrality tests and showed that some functional classes are preferential targets for selection.
We found evidence that the role of each gene in the network conditions its capacity to evolve, that is, its evolvability: genes at the core of the network are more constrained, while adaptation mostly occurred at particular positions at the network edges. Interestingly, the functional classes containing most of the genes with signatures of balancing selection are involved in autoinflammatory and autoimmune diseases, suggesting a counterbalance between the beneficial and deleterious effects of the immune response.
Network modeling of whole transcriptome expression data enables characterization of complex epistatic (gene-gene) interactions that underlie cellular functions. Though numerous methods have been proposed and successfully implemented to develop these networks, there are no formal methods for comparing differences in network connectivity patterns as a function of phenotypic trait.
Here we describe a novel approach for quantifying the differences in gene-gene connectivity patterns across disease states based on Graphical Gaussian Models (GGMs). We compare the posterior probabilities of connectivity for each gene pair across two disease states, expressed as a posterior odds-ratio (postOR) for each pair, which can be used to identify network components most relevant to disease status. The method can also be generalized to model differential gene connectivity patterns within previously defined gene sets, gene networks and pathways. We demonstrate that the GGM method reliably detects differences in network connectivity patterns in datasets of varying sample size. Applying this method to two independent breast cancer expression data sets, we identified numerous reproducible differences in network connectivity across histological grades of breast cancer, including several published gene sets and pathways. Most notably, our model identified two gene hubs (MMP12 and CXCL13) that each exhibited differential connectivity to more than 30 transcripts in both datasets. Both genes have been previously implicated in breast cancer pathobiology, but themselves are not differentially expressed by histologic grade in either dataset, and would thus not have been identified using traditional differential gene expression testing approaches. In addition, 16 curated gene sets demonstrated significant differential connectivity in both data sets, including the matrix metalloproteinases, PPAR alpha sequence targets, and the PUFA synthesis pathway.
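As a rough illustration of the postOR idea, the sketch below computes an odds ratio from two posterior probabilities of connectivity for one gene pair. The abstract does not give the formula, so the conventional odds-ratio definition, the clipping constant eps, and the function name are all assumptions.

```python
def posterior_odds_ratio(p_state_a, p_state_b, eps=1e-6):
    """Odds of an edge in disease state A relative to disease state B.

    p_state_a, p_state_b: posterior probabilities that a given gene pair
    is connected in each state. Assumption: the standard odds-ratio form;
    probabilities are clipped to [eps, 1 - eps] to avoid division by zero.
    """
    pa = min(max(p_state_a, eps), 1.0 - eps)
    pb = min(max(p_state_b, eps), 1.0 - eps)
    odds_a = pa / (1.0 - pa)
    odds_b = pb / (1.0 - pb)
    return odds_a / odds_b


# A pair that is likely connected in one state and unlikely in the other
# yields a ratio far from 1; equal probabilities yield a ratio of 1.
print(posterior_odds_ratio(0.9, 0.1))
print(posterior_odds_ratio(0.5, 0.5))
```

On this reading, gene pairs with ratios far from 1 in either direction are the candidates for differential connectivity, independent of whether either gene is differentially expressed.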
Our results suggest that GGM can be used to formally evaluate differences in global interactome connectivity across disease states, and can serve as a powerful tool for exploring the molecular events that contribute to disease at a systems level.
Rationale: Animal models demonstrate that aberrant gene expression in utero can result in abnormal pulmonary phenotypes.
Objectives: We sought to identify genes that are differentially expressed during in utero airway development and test the hypothesis that variants in these genes influence lung function in patients with asthma.
Methods: Stage 1 (Gene Expression): Differential gene expression analysis across the pseudoglandular (n = 27) and canalicular (n = 9) stages of human lung development was performed using regularized t tests with multiple comparison adjustments. Stage 2 (Genetic Association): Genetic association analyses of lung function (FEV1, FVC, and FEV1/FVC) for variants in five differentially expressed genes were conducted in 403 parent-child trios from the Childhood Asthma Management Program (CAMP). Associations were replicated in 583 parent-child trios from the Genetics of Asthma in Costa Rica study.
Measurements and Main Results: Of the 1,776 differentially expressed genes between the pseudoglandular (gestational age: 7–16 wk) and the canalicular (gestational age: 17–26 wk) stages, we selected 5 genes in the Wnt pathway for association testing. Thirteen single nucleotide polymorphisms in three genes demonstrated association with lung function in CAMP (P < 0.05), and associations for two of these genes were replicated in the Costa Ricans: Wnt1-inducible signaling pathway protein 1 with FEV1 (combined P = 0.0005) and FVC (combined P = 0.0004), and Wnt inhibitory factor 1 with FVC (combined P = 0.003) and FEV1/FVC (combined P = 0.003).
Conclusions: Wnt signaling genes are associated with impaired lung function in two childhood asthma cohorts. Furthermore, gene expression profiling of human fetal lung development can be used to identify genes implicated in the pathogenesis of lung function impairment in individuals with asthma.
asthma; lung development; lung function; genetic variation; gene expression
Prior studies suggest a role for a variant (rs5743836) in the promoter of toll-like receptor 9 (TLR9) in asthma and other inflammatory diseases. We performed detailed genetic association studies of the functional variant rs5743836 with asthma susceptibility and asthma-related phenotypes in three independent cohorts.
rs5743836 was genotyped in two family-based cohorts of children with asthma and a case-control study of adult asthmatics. Association analyses were performed using chi square, family-based and population-based testing. A luciferase assay was performed to investigate whether rs5743836 genotype influences TLR9 promoter activity.
Contrary to prior reports, rs5743836 was not associated with asthma in any of the three cohorts. Marginally significant associations were found with FEV1 and FVC (p = 0.003 and p = 0.008, respectively) in one of the family-based cohorts, but these associations were not significant after correcting for multiple comparisons. Higher promoter activity of the CC genotype was demonstrated by luciferase assay, confirming the functional importance of this variant.
Although rs5743836 confers regulatory effects on TLR9 transcription, this variant does not appear to be an important asthma-susceptibility locus.
Genetic variation at the MYH9 locus is linked to the high incidence of focal segmental glomerulosclerosis (FSGS) and non-diabetic end-stage renal disease among African Americans. To further define risk alleles with FSGS we performed a genome-wide association analysis using more than one million single nucleotide polymorphisms in 56 African and 61 European American patients with biopsy-confirmed FSGS. Results were compared to 1641 European and 1800 African Americans as unselected controls. While no association was observed in the cohort of European Americans, the case-control comparison of African Americans found variants within a 60 kb region of chromosome 22 containing part of the APOL1 and MYH9 genes associated with increased risk of FSGS. This region spans different linkage disequilibrium blocks and variants associating with disease within this region are in linkage disequilibrium with variants which have shown signals of natural selection. APOL1 is a strong candidate for a gene that has undergone recent natural selection and is known to be involved in the infection by Trypanosoma brucei, a parasite common in Africa that has recently adapted to infect human hosts. Further studies will be required to establish which variants are causally related to kidney disease, what mutations caused the selective sweep, and to ultimately determine if these are the same.
focal segmental glomerulosclerosis; end stage kidney disease; genetic renal disease
Low plasma B-vitamin levels and elevated homocysteine have been associated with cancer, cardiovascular disease and neurodegenerative disorders. Common variants in FUT2 on chromosome 19q13 were associated with plasma vitamin B12 levels among women in a genome-wide association study in the Nurses’ Health Study (NHS) NCI-Cancer Genetic Markers of Susceptibility (CGEMS) project. To identify additional loci associated with plasma vitamin B12, homocysteine, folate and vitamin B6 (active form pyridoxal 5′-phosphate, PLP), we conducted a meta-analysis of three GWA scans (total n = 4763, consisting of 1658 women in NHS-CGEMS, 1647 women in Framingham-SNP-Health Association Resource (SHARe) and 1458 men in SHARe). On chromosome 19q13, we confirm the association of plasma vitamin B12 with rs602662 and rs492602 (P = 1.83 × 10−15 and 1.30 × 10−14, respectively) in strong linkage disequilibrium (LD) with rs601338 (P = 6.92 × 10−15), the FUT2 W143X nonsense mutation. We identified additional genome-wide significant loci for plasma vitamin B12 on chromosomes 6p21 (P = 4.05 × 10−8), 10p12 (P = 2.87 × 10−9) and 11q11 (P = 2.25 × 10−10) in genes with biological relevance. We confirm the association of the well-studied functional candidate SNP 5,10-methylene tetrahydrofolate reductase (MTHFR) Ala222Val (dbSNP ID: rs1801133; P = 1.27 × 10−8) on chromosome 1p36 with plasma homocysteine and identify an additional genome-wide significant locus on chromosome 9q22 (P = 2.06 × 10−8) associated with plasma homocysteine. We also identified genome-wide associations with variants on chromosome 1p36 with plasma PLP (P = 1.40 × 10−15). Genome-wide significant loci were not identified for plasma folate. These data reveal new biological candidates and confirm prior candidate genes for plasma homocysteine, plasma vitamin B12 and plasma PLP.
Asthma is a chronic respiratory disease whose genetic basis has been explored for over two decades, most recently via genome-wide association studies. We sought to find asthma-susceptibility variants by using probands from a single population in both family-based and case-control association designs.
We used probands from the Childhood Asthma Management Program (CAMP) in two primary genome-wide association study designs: (1) probands were combined with publicly available population controls in a case-control design, and (2) probands and their parents were used in a family-based design. We followed a two-stage replication process utilizing three independent populations to validate our primary findings.
We found that single nucleotide polymorphisms with similar case-control and family-based association results were more likely to replicate in the independent populations than those with the smallest p-values in either the case-control or family-based design alone. The single nucleotide polymorphism that showed the strongest evidence for association with asthma was rs17572584, which replicated in 2/3 independent populations with an overall p-value among replication populations of 3.5E-05. This variant is near a gene that encodes an enzyme that has been implicated in acting coordinately with modulators of Th2 cell differentiation and is expressed in human lung.
Our results suggest that using probands from family-based studies in case-control designs, and combining results of both family-based and case-control approaches, may be a way to augment our ability to find SNPs associated with asthma and other complex diseases.
Rationale: Association studies have implicated many genes in asthma pathogenesis, with replicated associations between single-nucleotide polymorphisms (SNPs) and asthma reported for more than 30 genes. Genome-wide genotyping enables simultaneous evaluation of most of this variation, and facilitates more comprehensive analysis of other common genetic variation around these candidate genes for association with asthma.
Objectives: To use available genome-wide genotypic data to assess the reproducibility of previously reported associations with asthma and to evaluate the contribution of additional common genetic variation surrounding these loci to asthma susceptibility.
Methods: Illumina Human Hap 550Kv3 BeadChip (Illumina, San Diego, CA) SNP arrays were genotyped in 422 nuclear families participating in the Childhood Asthma Management Program. Genes with at least one SNP demonstrating prior association with asthma in two or more populations were tested for evidence of association with asthma, using family-based association testing.
Measurements and Main Results: We identified 39 candidate genes from the literature, using prespecified criteria. Of the 160 SNPs previously genotyped in these 39 genes, 10 SNPs in 6 genes were significantly associated with asthma (including the first independent replication for asthma-associated integrin β3 [ITGB3]). Evaluation of 619 additional common variants included in the Illumina 550K array revealed additional evidence of asthma association for 15 genes, although none were significant after adjustment for multiple comparisons.
Conclusions: We replicated asthma associations for a minority of candidate genes. Pooling genome-wide association study results from multiple studies will increase the power to appreciate marginal effects of genes and further clarify which candidates are true “asthma genes.”
asthma; replication; single-nucleotide polymorphism; integrin β3; association
Although asthma is highly prevalent among certain Hispanic subgroups, genetic determinants of asthma and asthma‐related traits have not been conclusively identified in Hispanic populations. A study was undertaken to identify genomic regions containing susceptibility loci for pulmonary function and bronchodilator responsiveness (BDR) in Costa Ricans.
Eight extended pedigrees were ascertained through schoolchildren with asthma in the Central Valley of Costa Rica. Short tandem repeat (STR) markers were genotyped throughout the genome at an average spacing of 8.2 cM. Multipoint variance component linkage analyses of forced expiratory volume in 1 second (FEV1) and FEV1/forced vital capacity (FVC; both pre‐bronchodilator and post‐bronchodilator) and BDR were performed in these eight families (pre‐bronchodilator spirometry, n = 640; post‐bronchodilator spirometry and BDR, n = 624). Nine additional STR markers were genotyped on chromosome 7. Secondary analyses were repeated after stratification by cigarette smoking.
Among all subjects, the highest logarithm of the odds of linkage (LOD) score for FEV1 (post‐bronchodilator) was found on chromosome 7q34–35 (LOD = 2.45, including the additional markers). The highest LOD scores for FEV1/FVC (pre‐bronchodilator) and BDR were found on chromosomes 2q (LOD = 1.53) and 9p (LOD = 1.53), respectively. Among former and current smokers there was near‐significant evidence of linkage to FEV1/FVC (post‐bronchodilator) on chromosome 5p (LOD = 3.27) and suggestive evidence of linkage to FEV1 on chromosomes 3q (pre‐bronchodilator, LOD = 2.74) and 4q (post‐bronchodilator, LOD = 2.66).
In eight families of children with asthma in Costa Rica, there is suggestive evidence of linkage to FEV1 on chromosome 7q34–35. In these families, FEV1/FVC may be influenced by an interaction between cigarette smoking and a locus (loci) on chromosome 5p.
Bayesian hierarchical models that characterize the distributions of (transformed) gene profiles have proven very useful and flexible in selecting differentially expressed genes across different types of tissue samples (e.g. Lo and Gottardo, 2007). However, the marginal mean and variance of these models are assumed to be the same for different gene clusters and for different tissue types. Moreover, it is not easy to determine which of the many competing Bayesian hierarchical models provides the best fit for a specific microarray data set. To address these two issues, we propose a marginal mixture model that directly models the marginal distribution of transformed gene profiles. Specifically, we approximate the marginal distributions of transformed gene profiles via a three-component mixture of multivariate Normal distributions, each component of which has the same structures of marginal mean vector and covariance matrix as those for Bayesian hierarchical models, but the values can differ. Based on the proposed model, a method is derived to select genes differentially expressed across two types of tissue samples. The derived gene selection method performs well on a real microarray data set and consistently has the best performance (based on class agreement indices) compared with several other gene selection methods on simulated microarray data sets generated from three different mixture models.
Rationale: Inhaled β-agonists are one of the most widely used classes of drugs for the treatment of asthma. However, a substantial proportion of patients with asthma do not have a favorable response to these drugs, and identifying genetic determinants of drug response may aid in tailoring treatment for individual patients.
Objectives: To screen variants in candidate genes in the steroid and β-adrenergic pathways for association with response to inhaled β-agonists.
Methods: We genotyped 844 single nucleotide polymorphisms (SNPs) in 111 candidate genes in 209 children and their parents participating in the Childhood Asthma Management Program. We screened the association of these SNPs with acute response to inhaled β-agonists (bronchodilator response [BDR]) using a novel algorithm implemented in a family-based association test that ranked SNPs in order of statistical power. Genes that had SNPs with median power in the highest quartile were then taken for replication analyses in three other asthma cohorts.
Measurements and Main Results: We identified 17 genes from the screening algorithm and genotyped 99 SNPs from these genes in a second population of patients with asthma. We then genotyped 63 SNPs from four genes with significant associations with BDR, for replication in a third and fourth population of patients with asthma. Evidence for association from the four asthma cohorts was combined, and SNPs from ARG1 were significantly associated with BDR. SNP rs2781659 survived Bonferroni correction for multiple testing (combined P value = 0.00048, adjusted P value = 0.047).
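The relationship between a combined P value and its Bonferroni-adjusted counterpart can be sketched as follows. The number of tests used for the adjustment is not stated in the abstract, so n_tests is a hypothetical parameter and the values in the example are illustrative only.

```python
def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted P value: the raw P value times the number of
    tests, capped at 1.0.

    n_tests is hypothetical here; the abstract does not report how many
    tests entered the correction.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


# Illustrative values only, not taken from the study.
print(bonferroni_adjust(0.001, 20))
```

A SNP "survives" the correction when its adjusted value stays below the chosen significance level, conventionally 0.05.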
Conclusions: These findings identify ARG1 as a novel gene for acute BDR in both children and adults with asthma.
pharmacogenetics; asthma; bronchodilator agents
Health care providers are legally obliged to report cases of specified diseases to public health authorities, but existing manual, provider-initiated reporting systems generally result in incomplete, error-prone, and tardy information flow. Automated laboratory-based reports are more likely to be accurate and timely, but lack clinical information and treatment details. Here, we describe the Electronic Support for Public Health (ESP) application, a robust, automated, secure, portable public health detection and messaging system for cases of notifiable diseases. The ESP application applies disease specific logic to any complete source of electronic medical data in a fully automated process, and supports an optional case management workflow system for case notification control. All relevant clinical, laboratory and demographic details are securely transferred to the local health authority as an HL7 message. The ESP application has operated continuously in production mode since January 2007, applying rigorously validated case identification logic to ambulatory EMR data from more than 600,000 patients. Source code for this highly interoperable application is freely available under an approved open-source license at http://esphealth.org.
With the recent development of microarray technologies, the comparability of gene expression data obtained from different platforms poses an important problem. We evaluated two widely used platforms, Affymetrix U133 Plus 2.0 and the Illumina HumanRef-8 v2 Expression BeadChips, for comparability in a biological system in which changes may be subtle, namely fetal lung tissue as a function of gestational age.
We performed the comparison via sequence-based probe matching between the two platforms. "Significance grouping" was defined as a measure of comparability. Using both expression correlation and significance grouping as measures of comparability, we demonstrated that despite overall cross-platform differences at the single gene level, increased correlation between the two platforms was found in genes with higher expression level, higher probe overlap, and lower p-value. We also demonstrated that biological function as determined via KEGG pathways or GO categories is more consistent across platforms than single gene analysis.
We conclude that while the comparability of the platforms at the single gene level may be increased by increasing sample size, they are highly comparable ontologically even for subtle differences in a relatively small sample size. Biologically relevant inference should therefore be reproducible across laboratories using different platforms.
Automatic identification of notifiable diseases from electronic medical records can potentially improve the timeliness and completeness of public health surveillance. We describe the development and implementation of an algorithm for prospective surveillance of patients with acute hepatitis B using electronic medical record data.
Initial algorithms were created by adapting Centers for Disease Control and Prevention diagnostic criteria for acute hepatitis B into electronic terms. The algorithms were tested by applying them to ambulatory electronic medical record data spanning 1990 to May 2006. A physician reviewer classified each case identified as acute or chronic infection. Additional criteria were added to algorithms in serial fashion to improve accuracy. The best algorithm was validated by applying it to prospective electronic medical record data from June 2006 through April 2008. Completeness of case capture was assessed by comparison with state health department records.
A final algorithm including a positive hepatitis B specific test, elevated transaminases and bilirubin, absence of prior positive hepatitis B tests, and absence of an ICD-9 code for chronic hepatitis B identified 112/113 patients with acute hepatitis B (sensitivity 97.4%, 95% confidence interval 94–100%; specificity 93.8%, 95% confidence interval 87–100%). Application of this algorithm to prospective electronic medical record data identified 8 cases without false positives. These included 4 patients who had not been reported to the health department. There were no known cases of acute hepatitis B missed by the algorithm.
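The final algorithm is a conjunction of inclusion and exclusion criteria and can be sketched as a single boolean rule. The abstract gives no numeric cutoffs for "elevated" transaminases or bilirubin, so the sketch takes already-evaluated flags as inputs; all argument names are hypothetical.

```python
def flags_acute_hepatitis_b(
    positive_hbv_test,
    transaminases_elevated,
    bilirubin_elevated,
    prior_positive_hbv_test,
    chronic_hbv_icd9_code,
):
    """Return True if a patient meets every criterion of the final rule.

    Inputs are booleans computed upstream; the thresholds that define
    "elevated" are not given in the abstract and are left to the caller.
    """
    return (
        positive_hbv_test
        and transaminases_elevated
        and bilirubin_elevated
        # Exclusions that separate acute from chronic infection.
        and not prior_positive_hbv_test
        and not chronic_hbv_icd9_code
    )


# A prior positive test points to chronic infection, so no case is raised.
print(flags_acute_hepatitis_b(True, True, True, True, False))
```

The two exclusion terms reflect the serial refinement described in the methods: each was added to filter out chronic infections that the inclusion criteria alone would flag.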
An algorithm using codified electronic medical record data can reliably detect acute hepatitis B. The completeness of public health surveillance may be improved by automatically identifying notifiable diseases from electronic medical record data.
Rationale: Computed tomography (CT) scanning of the lung may reduce phenotypic heterogeneity in defining subjects with chronic obstructive pulmonary disease (COPD), and allow identification of genetic determinants of emphysema severity and distribution.
Objectives: We sought to identify genes associated with CT scan distribution of emphysema in individuals without α1-antitrypsin deficiency but with severe COPD.
Methods: We evaluated baseline CT densitometry phenotypes in 282 individuals with emphysema enrolled in the Genetics Ancillary Study of the National Emphysema Treatment Trial, and used regression models to identify genetic variants associated with emphysema distribution.
Measurements and Main Results: Emphysema distribution was assessed by two methods: assessment by radiologists and computerized density mask quantitation, using a threshold of −950 Hounsfield units. A total of 77 polymorphisms in 20 candidate genes were analyzed for association with distribution of emphysema. GSTP1, EPHX1, and MMP1 polymorphisms were associated with the densitometric, apical-predominant distribution of emphysema (p value range = 0.001–0.050). When an apical-predominant phenotype was defined by the radiologist scoring method, GSTP1 and EPHX1 single-nucleotide polymorphisms were found to be significantly associated. In a case–control analysis of COPD susceptibility limited to cases with densitometric upper-lobe–predominant emphysema, the EPHX1 His139Arg single-nucleotide polymorphism was associated with COPD (p = 0.005).
Conclusions: Apical and basal emphysematous destruction appears to be influenced by different genes. Polymorphisms in the xenobiotic enzymes, GSTP1 and EPHX1, are associated with apical-predominant emphysema. Altered detoxification of cigarette smoke metabolites may contribute to emphysema distribution, and these findings may lead to further insight into genetic determinants of emphysema.
COPD; genetics; association analysis; computed tomography; emphysema
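Density mask quantitation at −950 Hounsfield units amounts to counting the share of lung voxels below the threshold. The sketch below illustrates that idea under stated assumptions: voxels are supplied as plain lists of Hounsfield values already segmented into upper and lower lung zones, "below" is taken as strictly less than the threshold, and the upper-minus-lower difference is a hypothetical summary of distribution, not the phenotype definition used in the study.

```python
THRESHOLD_HU = -950  # density mask threshold named in the abstract


def low_attenuation_fraction(voxels_hu, threshold=THRESHOLD_HU):
    """Fraction of voxels with attenuation strictly below the threshold."""
    voxels = list(voxels_hu)
    if not voxels:
        raise ValueError("at least one voxel is required")
    below = sum(1 for value in voxels if value < threshold)
    return below / len(voxels)


def upper_minus_lower(upper_voxels_hu, lower_voxels_hu):
    """Hypothetical distribution summary: positive values suggest apical predominance."""
    return (
        low_attenuation_fraction(upper_voxels_hu)
        - low_attenuation_fraction(lower_voxels_hu)
    )
```

For example, `upper_minus_lower([-1000, -960, -900, -800], [-970, -900, -850, -800])` compares an upper zone with half its voxels below threshold against a lower zone with a quarter below threshold.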
Although there is now plenty of genomic data and no shortage of analysis methods for translational genomic research, many biologists do not have efficient and transparent access to the computational resources they need. No single data resource or analysis application is ever likely to efficiently address all aspects of any individual researcher’s needs, so most researchers are forced to manually integrate data and outputs from multiple resources. The inevitable heterogeneity of data formats and of command syntax between data resources and software applications presents a major obstacle, particularly to those biologists lacking practical informatics skills. We describe some design and implementation features of an open-source application that supports the integration of the best available third-party genomics software applications, data and annotation resources into a coherent framework, substantially overcoming many practical challenges associated with actually doing translational genomic research.
Rationale: Patients with severe chronic obstructive pulmonary disease (COPD) may have varying levels of disability despite similar levels of lung function. This variation may reflect different COPD subtypes, which may have different genetic predispositions.
Objectives: To identify genetic associations for COPD-related phenotypes, including measures of exercise capacity, pulmonary function, and respiratory symptoms.
Methods: In 304 subjects from the National Emphysema Treatment Trial, we genotyped 80 markers in 22 positional and/or biologically plausible candidate genes. Regression models were used to test for association, using a test–replication approach to guard against false-positive results. For significant associations, effect estimates were recalculated using the entire cohort. Positive associations with dyspnea were confirmed in families from the Boston Early-Onset COPD Study.
Results: The test–replication approach identified four genes—microsomal epoxide hydrolase (EPHX1), latent transforming growth factor-β binding protein-4 (LTBP4), surfactant protein B (SFTPB), and transforming growth factor-β1 (TGFB1)—that were associated with COPD-related phenotypes. In all subjects, single-nucleotide polymorphisms (SNPs) in EPHX1 (p ⩽ 0.03) and in LTBP4 (p ⩽ 0.03) were associated with maximal output on cardiopulmonary exercise testing. Markers in LTBP4 (p ⩽ 0.05) and SFTPB (p = 0.005) were associated with 6-min walk test distance. SNPs in EPHX1 were associated with carbon monoxide diffusing capacity (p ⩽ 0.04). Three SNPs in TGFB1 were associated with dyspnea (p ⩽ 0.002), one of which replicated in the family study (p = 0.02).
Conclusions: Polymorphisms in several genes seem to be associated with COPD-related traits other than FEV1. These associations may identify genes in pathways important for COPD pathogenesis.
dyspnea; emphysema; exercise tolerance; genetic association; pulmonary function tests
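The test–replication approach described in the Methods can be pictured as a two-stage filter: a marker is carried forward only if it is significant in the test subset and again in the replication subset. The sketch below assumes p-values have already been computed for each subset; the function name, the marker labels in the usage note, and the 0.05 cut-off are illustrative assumptions, and the abstract does not state how the cohort was divided.

```python
def replicated_markers(test_p, replication_p, alpha=0.05):
    """Return markers with p <= alpha in both the test and replication subsets.

    test_p and replication_p map a marker label to its p-value.
    Markers missing from the replication subset are treated as not replicated.
    """
    return sorted(
        marker
        for marker, p_value in test_p.items()
        if p_value <= alpha and replication_p.get(marker, 1.0) <= alpha
    )
```

With hypothetical inputs `{"snp_a": 0.01, "snp_b": 0.04}` and `{"snp_a": 0.03, "snp_b": 0.30}`, only `snp_a` survives both stages. A fuller version would also require the effect to point in the same direction in both subsets.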
Rationale: T-bet (TBX21 or T-box 21) is a critical regulator of T-helper 1 lineage commitment and IFN-γ production. Knockout mice lacking T-bet develop airway hyperresponsiveness (AHR) to methacholine, peribronchial eosinophilic and lymphocytic inflammation, and increased type III collagen deposition below the bronchial epithelium basement membrane, reminiscent of both acute and chronic asthma histopathology. Little is known regarding the role of genetic variation surrounding T-bet in the development of human AHR.
Objectives: To assess the relationship between T-bet polymorphisms and asthma-related phenotypes using family-based association.
Methods: Single nucleotide polymorphism discovery was performed by resequencing the T-bet genomic locus in 30 individuals (including 22 patients with asthma). Sixteen variants were genotyped in 580 nuclear families ascertained through offspring with asthma from the Childhood Asthma Management Program clinical trial. Haplotype patterns were determined from these genotype data. Family-based tests of association were performed with asthma, AHR, lung function, total serum immunoglobulin E, and blood eosinophil levels.
Main Results: We identified 24 variants. Evidence of association was observed between c.−7947 and asthma in white families using both additive (p = 0.02) and dominant models (p = 0.006). c.−7947 and three other variants were also associated with AHR (log-methacholine PC20, p = 0.02–0.04). Haplotype analysis suggested that an AHR locus is in linkage disequilibrium with variants in the 3′UTR. Evidence of association of AHR with c.−7947, but not with other 3′UTR SNPs, was replicated in an independent cohort of adult males with AHR.
Conclusions: These data suggest that T-bet variation contributes to airway responsiveness in asthma.
immunoglobulin E; single nucleotide polymorphism; T-box; TBX21
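As background on what a family-based test of association measures, the simplest member of that class is the transmission disequilibrium test, which compares how often heterozygous parents transmit an allele to an affected child against how often they do not. The sketch below shows only that basic statistic; it is an illustration chosen for brevity, not the specific test, software, or models (additive, dominant) used in the study, and the function name is hypothetical.

```python
def tdt_statistic(transmitted, untransmitted):
    """Transmission disequilibrium test statistic.

    transmitted:   times heterozygous parents passed the allele to an affected child
    untransmitted: times heterozygous parents did not pass it
    The result is compared with a chi-square distribution with 1 degree of freedom.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    return (transmitted - untransmitted) ** 2 / total
```

Because only transmissions within families are compared, the statistic is not inflated by allele frequency differences between populations, which is the usual reason for preferring family-based designs.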
Many systems for routine public health surveillance rely on centralized collection of potentially identifiable, individual personal health information (PHI) records. Although individual, identifiable patient records are essential for conditions for which there is mandated reporting, such as tuberculosis or sexually transmitted diseases, they are not routinely required for effective syndromic surveillance. Public concern about the routine collection of large quantities of PHI to support non-traditional public health functions may make alternative surveillance methods that do not rely on centralized identifiable PHI databases increasingly desirable.
The National Bioterrorism Syndromic Surveillance Demonstration Program (NDP) is an example of one alternative model. All PHI in this system is initially processed within the secured infrastructure of the health care provider that collects and holds the data, using uniform software distributed and supported by the NDP. Only highly aggregated count data is transferred to the datacenter for statistical processing and display.
Detailed, patient level information is readily available to the health care provider to elucidate signals observed in the aggregated data, or for ad hoc queries. We briefly describe the benefits and disadvantages associated with this distributed processing model for routine automated syndromic surveillance.
For well-defined surveillance requirements, the model can be successfully deployed with very low risk of inadvertent disclosure of PHI – a feature that may make participation in surveillance systems more feasible for organizations and more appealing to the individuals whose PHI they hold. It is possible to design and implement distributed systems to support non-routine public health needs if required.
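The distributed model described above can be sketched as a provider-side aggregation step: patient-level records stay inside the provider's infrastructure, and only counts leave it. The example below is a minimal illustration under stated assumptions; the record fields (`patient_id`, `visit_date`, `syndrome`, `zip`), the choice of stratification keys, and the function name are hypothetical and are not taken from the NDP software.

```python
from collections import Counter


def aggregate_counts(records):
    """Reduce patient-level records to count rows suitable for transfer.

    Each input record is a dict that may contain identifiers such as
    patient_id; only visit_date, syndrome, and zip are read, so no
    patient-level field appears in the output.
    """
    counts = Counter(
        (rec["visit_date"], rec["syndrome"], rec["zip"]) for rec in records
    )
    return [
        {"visit_date": date, "syndrome": syndrome, "zip": zip_code, "count": n}
        for (date, syndrome, zip_code), n in sorted(counts.items())
    ]
```

The original records remain available locally, so a signal seen in the aggregated counts can still be investigated by the provider without the datacenter ever holding identifiable data. Very small counts in fine strata can themselves be identifying, which is one reason the transferred data are described as highly aggregated.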