1.  The Value of Statistical or Bioinformatics Annotation for Rare Variant Association with Quantitative Trait 
Genetic epidemiology  2013;37(7):666-674.
In the past few years, a plethora of methods for rare variant association with phenotype have been proposed. These methods aggregate information from multiple rare variants across genomic region(s), but there is little consensus as to which method is most effective. The weighting scheme adopted when aggregating information across variants is one of the primary determinants of effectiveness. Here we present a systematic evaluation of multiple weighting schemes through a series of simulations intended to mimic large sequencing studies of a quantitative trait. We evaluate existing phenotype-independent and -dependent methods, as well as weights estimated by penalized regression approaches including Lasso, Elastic Net and SCAD. We find that the difference in power between phenotype-dependent schemes is negligible when high-quality functional annotations are available. When functional annotations are unavailable or incomplete, all methods suffer from power loss; however, the variable selection methods outperform the others at the cost of increased computational time. Therefore, in the absence of good annotation, we recommend variable selection methods (which can be viewed as “statistical annotation”) on top regions implicated by a phenotype-independent weighting scheme. Further, once a region is implicated, variable selection can help to identify potential causal SNPs for biological validation. These findings are supported by an analysis of a high-coverage targeted sequencing study of 1898 individuals.
doi:10.1002/gepi.21747
PMCID: PMC4083762  PMID: 23836599
rare variants; association; weighting; variable selection; variant annotation
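The idea of a phenotype-independent weighting scheme can be sketched as follows. This is a minimal illustration, not the authors' implementation: the inverse-standard-deviation frequency weight (in the spirit of commonly used allele-frequency weights), the function names, and the toy genotype matrix are all assumptions introduced here.

```python
import math

def frequency_weights(genotypes):
    """Return one weight per variant, 1 / sqrt(q * (1 - q)), where q is the
    allele frequency estimated from the sample. Rarer variants get larger
    weights; monomorphic variants get weight 0. (Illustrative choice only.)"""
    n = len(genotypes)
    n_variants = len(genotypes[0])
    weights = []
    for j in range(n_variants):
        q = sum(row[j] for row in genotypes) / (2 * n)
        if q <= 0.0 or q >= 1.0:
            weights.append(0.0)
        else:
            weights.append(1.0 / math.sqrt(q * (1.0 - q)))
    return weights

def burden_scores(genotypes, weights):
    """Collapse each individual's variants into one weighted burden score."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]

# Hypothetical data: 4 individuals x 3 variants, coded as minor-allele counts.
genotypes = [
    [0, 0, 1],
    [1, 0, 1],
    [0, 0, 0],
    [0, 1, 1],
]
weights = frequency_weights(genotypes)
scores = burden_scores(genotypes, weights)
```

The resulting per-individual scores would then be tested against the quantitative trait, for example by linear regression; phenotype-dependent and penalized-regression weights replace `frequency_weights` with weights estimated from the trait itself.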
2.  Unified Analysis of Secondary Traits in Case-Control Association Studies 
Journal of the American Statistical Association  2013;108(502):10.1080/01621459.2013.793121.
It has been repeatedly shown that in case-control association studies, analysis of a secondary trait which ignores the original sampling scheme can produce highly biased risk estimates. Although a number of approaches have been proposed to properly analyze secondary traits, most approaches fail to reproduce the marginal logistic model assumed for the original case-control trait and/or do not allow for interaction between secondary trait and genotype marker on primary disease risk. In addition, the flexible handling of covariates remains challenging. We present a general retrospective likelihood framework to perform association testing for both binary and continuous secondary traits which respects marginal models and incorporates the interaction term. We provide a computational algorithm, based on a reparameterized approximate profile likelihood, for obtaining the maximum likelihood (ML) estimate and its standard error for the genetic effect on secondary trait, in presence of covariates. For completeness we also present an alternative pseudo-likelihood method for handling covariates. We describe extensive simulations to evaluate the performance of the ML estimator in comparison with the pseudo-likelihood and other competing methods.
doi:10.1080/01621459.2013.793121
PMCID: PMC3881430  PMID: 24409003
3.  Biomarkers of Exposure and Effect in Human Lymphoblastoid TK6 Cells Following [13C2]-Acetaldehyde Exposure 
Toxicological Sciences  2013;133(1):1-12.
Editor’s Highlight: Byproducts of constitutive metabolism may themselves be toxic, complicating the risk assessment of the same chemicals encountered from external sources. The application of stable labeled compounds offers insight into the source of chemicals producing biological effects and provides a basis to quantify the contribution of exogenous exposure to biological events. This report describes the concentration dependent contributions of exogenous [13C2]-acetaldehyde and endogenously produced acetaldehyde to adduct formation in human lymphoblastoid cells in vitro. — Jeffrey Fisher
The dose-response relationship for biomarkers of exposure (N2-ethylidene-dG adducts) and effect (cell survival and micronucleus formation) was determined across 4.5 orders of magnitude (50nM–2mM) using [13C2]-acetaldehyde exposures to human lymphoblastoid TK6 cells for 12h. There was a clear increase in exogenous N2-ethylidene-dG formation at exposure concentrations ≥ 1µM, whereas the endogenous adducts remained nearly constant across all exposure concentrations, with an average of 3.0 adducts/10^7 dG. Exogenous adducts were lower than endogenous adducts at concentrations ≤ 10µM and were greater than endogenous adducts at concentrations ≥ 250µM. When the endogenous and exogenous adducts were summed together, statistically significant increases in total adduct formation over the endogenous background occurred at 50µM. Cell survival and micronucleus formation were monitored across the exposure range and statistically significant decreases in cell survival and increases in micronucleus formation occurred at ≥ 1000µM. This research supports the hypothesis that endogenously produced reactive species, including acetaldehyde, are always present and constitute the majority of the observed biological effects following very low exposures to exogenous acetaldehyde. These data can replace default assumptions of linear extrapolation to very low doses of exogenous acetaldehyde for risk prediction.
doi:10.1093/toxsci/kft029
PMCID: PMC3627555  PMID: 23425604
acetaldehyde; DNA adduct; micronucleus; biomarker of exposure; biomarker of effect; liquid chromatography–mass spectrometry
4.  Standardizing Benchmark Dose Calculations to Improve Science-Based Decisions in Human Health Assessments 
Environmental Health Perspectives  2014;122(5):499-505.
Background: Benchmark dose (BMD) modeling computes the dose associated with a prespecified response level. While offering advantages over traditional points of departure (PODs), such as no-observed-adverse-effect-levels (NOAELs), BMD methods have lacked consistency and transparency in application, interpretation, and reporting in human health assessments of chemicals.
Objectives: We aimed to apply a standardized process for conducting BMD modeling to reduce inconsistencies in model fitting and selection.
Methods: We evaluated 880 dose–response data sets for 352 environmental chemicals with existing human health assessments. We calculated benchmark doses and their lower limits [10% extra risk, or change in the mean equal to 1 SD (BMD/L10/1SD)] for each chemical in a standardized way with prespecified criteria for model fit acceptance. We identified study design features associated with acceptable model fits.
Results: We derived values for 255 (72%) of the chemicals. Batch-calculated BMD/L10/1SD values were significantly and highly correlated (R² of 0.95 and 0.83, respectively, n = 42) with PODs previously used in human health assessments, with values similar to reported NOAELs. Specifically, the median ratio of BMDs10/1SD:NOAELs was 1.96, and the median ratio of BMDLs10/1SD:NOAELs was 0.89. We also observed a significant trend of increasing model viability with increasing number of dose groups.
Conclusions: BMD/L10/1SD values can be calculated in a standardized way for use in health assessments on a large number of chemicals and critical effects. This facilitates the exploration of health effects across multiple studies of a given chemical or, when chemicals need to be compared, providing greater transparency and efficiency than current approaches.
Citation: Wignall JA, Shapiro AJ, Wright FA, Woodruff TJ, Chiu WA, Guyton KZ, Rusyn I. 2014. Standardizing benchmark dose calculations to improve science-based decisions in human health assessments. Environ Health Perspect 122:499–505; http://dx.doi.org/10.1289/ehp.1307539
doi:10.1289/ehp.1307539
PMCID: PMC4014768  PMID: 24569956
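To make the "dose associated with a prespecified response level" concrete, the sketch below computes a benchmark dose at 10% extra risk for a dichotomous endpoint. It is only an illustration under stated assumptions: the quantal-linear model, its parameter values, and the function names are hypothetical choices made here, and the abstract does not say which models or software the authors used.

```python
import math

def extra_risk(prob, dose):
    """Extra risk at a dose: (P(d) - P(0)) / (1 - P(0))."""
    p0 = prob(0.0)
    return (prob(dose) - p0) / (1.0 - p0)

def benchmark_dose(prob, bmr=0.10, upper=1e6, tol=1e-10):
    """Find the dose whose extra risk equals `bmr` by bisection.
    Assumes `prob` is a monotone increasing dose-response function."""
    lo, hi = 0.0, upper
    if extra_risk(prob, hi) < bmr:
        raise ValueError("benchmark response not reached below `upper`")
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if extra_risk(prob, mid) < bmr:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical quantal-linear model: P(d) = g + (1 - g) * (1 - exp(-b * d)).
background, slope = 0.05, 0.05

def quantal_linear(dose):
    return background + (1.0 - background) * (1.0 - math.exp(-slope * dose))

bmd10 = benchmark_dose(quantal_linear, bmr=0.10)
# For this particular model the extra risk is 1 - exp(-b * d),
# so the benchmark dose also has a closed form.
bmd10_closed_form = -math.log(1.0 - 0.10) / slope
```

The lower confidence limit (BMDL) requires fitting the model to data and profiling the likelihood, which is beyond this sketch.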
5.  Physiologically Based Pharmacokinetic (PBPK) Modeling of Interstrain Variability in Trichloroethylene Metabolism in the Mouse 
Environmental Health Perspectives  2014;122(5):456-463.
Background: Quantitative estimation of toxicokinetic variability in the human population is a persistent challenge in risk assessment of environmental chemicals. Traditionally, interindividual differences in the population are accounted for by default assumptions or, in rare cases, are based on human toxicokinetic data.
Objectives: We evaluated the utility of genetically diverse mouse strains for estimating toxicokinetic population variability for risk assessment, using trichloroethylene (TCE) metabolism as a case study.
Methods: We used data on oxidative and glutathione conjugation metabolism of TCE in 16 inbred and 1 hybrid mouse strains to calibrate and extend existing physiologically based pharmacokinetic (PBPK) models. We added one-compartment models for glutathione metabolites and a two-compartment model for dichloroacetic acid (DCA). We used a Bayesian population analysis of interstrain variability to quantify variability in TCE metabolism.
Results: Concentration–time profiles for TCE metabolism to oxidative and glutathione conjugation metabolites varied across strains. Median predictions for the metabolic flux through oxidation were less variable (5-fold range) than that through glutathione conjugation (10-fold range). For oxidative metabolites, median predictions of trichloroacetic acid production were less variable (2-fold range) than DCA production (5-fold range), although the uncertainty bounds for DCA exceeded the predicted variability.
Conclusions: Population PBPK modeling of genetically diverse mouse strains can provide useful quantitative estimates of toxicokinetic population variability. When extrapolated to lower doses more relevant to environmental exposures, mouse population-derived variability estimates for TCE metabolism closely matched population variability estimates previously derived from human toxicokinetic studies with TCE, highlighting the utility of mouse interstrain metabolism studies for addressing toxicokinetic variability.
Citation: Chiu WA, Campbell JL Jr, Clewell HJ III, Zhou YH, Wright FA, Guyton KZ, Rusyn I. 2014. Physiologically based pharmacokinetic (PBPK) modeling of interstrain variability in trichloroethylene metabolism in the mouse. Environ Health Perspect 122:456–463; http://dx.doi.org/10.1289/ehp.1307623
doi:10.1289/ehp.1307623
PMCID: PMC4014769  PMID: 24518055
6.  Genomewide Association for Schizophrenia in the CATIE Study: Results of Stage 1 
Molecular psychiatry  2008;13(6):570-584.
Background
Little is known for certain about the genetics of schizophrenia. The advent of genomewide association has been widely anticipated as holding promise as a means to identify reproducible DNA sequence variation associated with this important and debilitating disorder.
Methods
738 cases with DSM-IV schizophrenia (all participants in the CATIE study) and 733 group-matched controls were genotyped for 492,900 single nucleotide polymorphisms (SNPs) using the Affymetrix 500K two chip genotyping platform plus a custom 164K fill-in chip. Following multiple quality control steps for both subjects and SNPs, logistic regression analyses were used to assess the evidence for association of all SNPs with schizophrenia.
Results
We identified a number of promising SNPs for follow-up studies, although no SNP or multi-marker combination of SNPs achieved genomewide statistical significance. Although a few signals coincided with genomic regions previously implicated in schizophrenia, chance could not be excluded.
Conclusions
These data do not provide evidence for the involvement of any genomic region with schizophrenia detectable with moderate sample size. However, planned GWAS for response phenotypes and inclusion of individual phenotype and genotype data from this study in meta-analyses hold promise for the eventual identification of susceptibility and protective variants.
doi:10.1038/mp.2008.25
PMCID: PMC3910086  PMID: 18347602
schizophrenia; genome-wide association; CATIE
7.  ToxPi GUI: an interactive visualization tool for transparent integration of data from diverse sources of evidence 
Bioinformatics  2012;29(3):402-403.
Motivation: Scientists and regulators are often faced with complex decisions, where use of scarce resources must be prioritized using collections of diverse information. The Toxicological Prioritization Index (ToxPi™) was developed to enable integration of multiple sources of evidence on exposure and/or safety, transformed into transparent visual rankings to facilitate decision making. The rankings and associated graphical profiles can be used to prioritize resources in various decision contexts, such as testing chemical toxicity or assessing similarity of predicted compound bioactivity profiles. The amount and types of information available to decision makers are increasing exponentially, while the complex decisions must rely on specialized domain knowledge across multiple criteria of varying importance. Thus, the ToxPi bridges a gap, combining rigorous aggregation of evidence with ease of communication to stakeholders.
Results: An interactive ToxPi graphical user interface (GUI) application has been implemented to allow straightforward decision support across a variety of decision-making contexts in environmental health. The GUI allows users to easily import and recombine data, then analyze, visualize, highlight, export and communicate ToxPi results. It also provides a statistical metric of stability for both individual ToxPi scores and relative prioritized ranks.
Availability: The ToxPi GUI application, complete user manual and example data files are freely available from http://comptox.unc.edu/toxpi.php.
Contact: reif.david@gmail.com
doi:10.1093/bioinformatics/bts686
PMCID: PMC3988461  PMID: 23202747
8.  Sex differences in the human peripheral blood transcriptome 
BMC Genomics  2014;15:33.
Background
Genomes of men and women differ in only a limited number of genes located on the sex chromosomes, whereas the transcriptome is far more sex-specific. Identification of sex-biased gene expression will contribute to understanding the molecular basis of sex-differences in complex traits and common diseases.
Results
Sex differences in the human peripheral blood transcriptome were characterized using microarrays in 5,241 subjects, accounting for menopause status and hormonal contraceptive use. Sex-specific expression was observed for 582 autosomal genes, of which 57.7% were upregulated in women (female-biased genes). Female-biased genes were enriched for several immune system GO categories, genes linked to rheumatoid arthritis (16%) and genes regulated by estrogen (18%). Male-biased genes were enriched for genes linked to renal cancer (9%). Sex-differences in gene expression were smaller in postmenopausal women, larger in women using hormonal contraceptives and not caused by sex-specific eQTLs, confirming the role of estrogen in regulating sex-biased genes.
Conclusions
This study indicates that sex-bias in gene expression is extensive and may underlie sex-differences in the prevalence of common diseases.
doi:10.1186/1471-2164-15-33
PMCID: PMC3904696  PMID: 24438232
10.  THE INTERACTIVE DECISION COMMITTEE FOR CHEMICAL TOXICITY ANALYSIS 
Journal of statistical research  2012;46(2):157-186.
SUMMARY
We introduce the Interactive Decision Committee method for classification when high-dimensional feature variables are grouped into feature categories. The proposed method uses the interactive relationships among feature categories to build base classifiers which are combined using decision committees. A two-stage or a single-stage 5-fold cross-validation technique is utilized to decide the total number of base classifiers to be combined. The proposed procedure is useful for classifying biochemicals on the basis of toxicity activity, where the feature space consists of chemical descriptors and the responses are binary indicators of toxicity activity. Each descriptor belongs to at least one descriptor category. The support vector machine, the random forests, and the tree-based AdaBoost algorithms are utilized as classifier inducers. Forward selection is used to select the best combinations of the base classifiers given the number of base classifiers. Simulation studies demonstrate that the proposed method outperforms a single large, unaggregated classifier in the presence of interactive feature category information. We applied the proposed method to two toxicity data sets associated with chemical compounds. For these data sets, the proposed method improved classification performance for the majority of outcomes compared to a single large, unaggregated classifier.
PMCID: PMC3887560  PMID: 24415822
Chemical toxicity; Decision committee method; Ensemble; Ensemble feature selection; QSAR modeling; Statistical learning
12.  GENETIC MODIFIERS OF LIVER DISEASE IN CYSTIC FIBROSIS 
Context
A subset (~3–5%) of patients with cystic fibrosis (CF) develops severe liver disease (CFLD) with portal hypertension.
Objective
To assess whether any of 9 polymorphisms in 5 candidate genes (SERPINA1, ACE, GSTP1, MBL2, and TGFB1) are associated with severe liver disease in CF patients.
Design, Setting, and Participants
A 2-stage design was used in this case–control study. CFLD subjects were enrolled from 63 U.S., 32 Canadian, and 18 CF centers outside of North America, with the University of North Carolina at Chapel Hill (UNC) as the coordinating site. In the initial study, we studied 124 CFLD patients (enrolled 1/1999–12/2004) and 843 CF controls (patients without CFLD) by genotyping 9 polymorphisms in 5 genes previously implicated as modifiers of liver disease in CF. In the second stage, the SERPINA1 Z allele and TGFB1 codon 10 genotype were tested in an additional 136 CFLD patients (enrolled 1/2005–2/2007) and 1088 CF controls.
Main Outcome Measures
We compared differences in distribution of genotypes in CF patients with severe liver disease versus CF patients without CFLD.
Results
The initial study showed CFLD to be associated with the SERPINA1 (also known as α1-antiprotease and α1-antitrypsin) Z allele (P=3.3×10^−6; odds ratio (OR) 4.72, 95% confidence interval (CI) 2.31–9.61), and with transforming growth factor β-1 (TGFB1) codon 10 CC genotype (P=2.8×10^−3; OR 1.53, CI 1.16–2.03). In the replication study, CFLD was associated with the SERPINA1 Z allele (P=1.4×10^−3; OR 3.42, CI 1.54–7.59), but not with TGFB1 codon 10. A combined analysis of the initial and replication studies by logistic regression showed CFLD to be associated with SERPINA1 Z allele (P=1.5×10^−8; OR 5.04, CI 2.88–8.83).
Conclusion
The SERPINA1 Z allele is a risk factor for liver disease in CF. Patients who carry the Z allele are at greater odds (OR ~5) to develop severe liver disease with portal hypertension.
doi:10.1001/jama.2009.1295
PMCID: PMC3711243  PMID: 19738092
13.  Exome sequencing of extreme phenotypes identifies DCTN4 as a modifier of chronic Pseudomonas aeruginosa infection in cystic fibrosis 
Nature genetics  2012;44(8):886-889.
Exome sequencing has become a powerful and effective strategy for discovery of genes underlying Mendelian disorders [1]. However, use of exome sequencing to identify variants associated with complex traits has been more challenging, partly because the sample sizes needed for adequate power may be very large [2]. One strategy to increase efficiency is to sequence individuals who are at both ends of a phenotype distribution (i.e., extreme phenotypes). Because alleles that contribute to the trait are enriched in one or both extremes of phenotype, a modest sample size can potentially identify novel candidate genes/alleles [3]. As part of the National Heart, Lung, and Blood Institute Exome Sequencing Project (ESP), we used an extreme phenotype design to discover that variants in DCTN4, encoding a dynactin protein, are associated with time to first Pseudomonas aeruginosa (P. aeruginosa) airway infection, chronic P. aeruginosa infection and mucoid P. aeruginosa among individuals with cystic fibrosis (MIM 219700).
doi:10.1038/ng.2344
PMCID: PMC3702264  PMID: 22772370
14.  Quantitative High-Throughput Screening for Chemical Toxicity in a Population-Based In Vitro Model 
Toxicological Sciences  2012;126(2):578-588.
A shift in toxicity testing from in vivo to in vitro may efficiently prioritize compounds, reveal new mechanisms, and enable predictive modeling. Quantitative high-throughput screening (qHTS) is a major source of data for computational toxicology, and our goal in this study was to aid in the development of predictive in vitro models of chemical-induced toxicity, anchored on interindividual genetic variability. Eighty-one human lymphoblast cell lines from 27 Centre d’Etude du Polymorphisme Humain trios were exposed to 240 chemical substances (12 concentrations, 0.26nM–46.0μM) and evaluated for cytotoxicity and apoptosis. qHTS screening in the genetically defined population produced robust and reproducible results, which allowed for cross-compound, cross-assay, and cross-individual comparisons. Some compounds were cytotoxic to all cell types at similar concentrations, whereas others exhibited interindividual differences in cytotoxicity. Specifically, the qHTS in a population-based human in vitro model system has several unique aspects that are of utility for toxicity testing, chemical prioritization, and high-throughput risk assessment. First, standardized and high-quality concentration-response profiling, with reproducibility confirmed by comparison with previous experiments, enables prioritization of chemicals for variability in interindividual range in cytotoxicity. Second, genome-wide association analysis of cytotoxicity phenotypes allows exploration of the potential genetic determinants of interindividual variability in toxicity. Furthermore, highly significant associations identified through the analysis of population-level correlations between basal gene expression variability and chemical-induced toxicity suggest plausible mode of action hypotheses for follow-up analyses. 
We conclude that as the improved resolution of genetic profiling can now be matched with high-quality in vitro screening data, the evaluation of the toxicity pathways and the effects of genetic diversity are now feasible through the use of human lymphoblast cell lines.
doi:10.1093/toxsci/kfs023
PMCID: PMC3307611  PMID: 22268004
chemical cytotoxicity; apoptosis; HapMap; lymphoblasts; qHTS
15.  Molecular Subtypes in Head and Neck Cancer Exhibit Distinct Patterns of Chromosomal Gain and Loss of Canonical Cancer Genes 
PLoS ONE  2013;8(2):e56823.
Head and neck squamous cell carcinoma (HNSCC) is a frequently fatal heterogeneous disease. Beyond the role of human papilloma virus (HPV), no validated molecular characterization of the disease has been established. Using an integrated genomic analysis and validation methodology we confirm four molecular classes of HNSCC (basal, mesenchymal, atypical, and classical) consistent with signatures established for squamous carcinoma of the lung, including deregulation of the KEAP1/NFE2L2 oxidative stress pathway, differential utilization of the lineage markers SOX2 and TP63, and preference for the oncogenes PIK3CA and EGFR. For potential clinical use the signatures are complementary to classification by HPV infection status as well as the putative high risk marker CCND1 copy number gain. A molecular etiology for the subtypes is suggested by statistically significant chromosomal gains and losses and differential cell of origin expression patterns. Model systems representative of each of the four subtypes are also presented.
doi:10.1371/journal.pone.0056823
PMCID: PMC3579892  PMID: 23451093
16.  Empirical pathway analysis, without permutation 
Biostatistics (Oxford, England)  2013;14(3):573-585.
Resampling-based expression pathway analysis techniques have been shown to preserve type I error rates, in contrast to simple gene-list approaches that implicitly assume the independence of genes in ranked lists. However, resampling is intensive in computation time and memory requirements. We describe accurate analytic approximations to permutations of score statistics, including novel approaches for Pearson's correlation, and summed score statistics, that have good performance for even relatively small sample sizes. Our approach preserves the essence of permutation pathway analysis, but with greatly reduced computation. Extensions for inclusion of covariates and censored data are described, and we test the performance of our procedures using simulations based on real datasets. These approaches have been implemented in the new R package safeExpress.
doi:10.1093/biostatistics/kxt004
PMCID: PMC3677738  PMID: 23428933
Gene sets; Multiple hypothesis testing; Permutation approximation
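Why analytic approximations to permutation are possible at all can be seen from a classical result: over all permutations of one variable, Pearson's correlation has mean exactly 0 and variance exactly 1/(n − 1). The sketch below checks this by exhaustive enumeration on a tiny made-up data set. It illustrates only these textbook moments, not the novel approximations described in the abstract or the safeExpress package; all names and values are hypothetical.

```python
import itertools
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data; n = 5 gives 5! = 120 permutations, small enough to enumerate.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [3.0, 1.0, 4.0, 1.5, 9.0]

perm_r = [pearson(x, p) for p in itertools.permutations(y)]
mean_r = sum(perm_r) / len(perm_r)
var_r = sum((r - mean_r) ** 2 for r in perm_r) / len(perm_r)

# Exact permutation moments, available without any resampling.
analytic_mean = 0.0
analytic_var = 1.0 / (len(x) - 1)
```

With realistic sample sizes full enumeration is impossible, which is where moment-based approximations replace random resampling.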
17.  Computational tools for discovery and interpretation of expression quantitative trait loci 
Pharmacogenomics  2012;13(3):343-352.
Expression quantitative trait locus (eQTL) analysis is rapidly moving from a cutting-edge concept in genomics to a mature area of investigation, with important connections to genome-wide association studies for human disease, pharmacogenomics and toxicogenomics. Despite the importance of the topic, many investigators must develop their own code or use tools not specifically suited for eQTL analysis. Convenient computational tools are becoming available, but they are not widely publicized, and investigators who are interested in discovery of eQTL, or in using them to interpret genome-wide association study results, may have difficulty navigating the available resources. The purpose of this review is to help investigators find appropriate programs for eQTL analysis and interpretation.
doi:10.2217/pgs.11.185
PMCID: PMC3295835  PMID: 22304583
bioinformatics; fast linear modeling; gene expression
18.  A powerful and flexible approach to the analysis of RNA sequence count data 
Bioinformatics  2011;27(19):2672-2678.
Motivation: A number of penalization and shrinkage approaches have been proposed for the analysis of microarray gene expression data. Similar techniques are now routinely applied to RNA sequence transcriptional count data, although the value of such shrinkage has not been conclusively established. If penalization is desired, the explicit modeling of mean–variance relationships provides a flexible testing regimen that ‘borrows’ information across genes, while easily incorporating design effects and additional covariates.
Results: We describe BBSeq, which incorporates two approaches: (i) a simple beta-binomial generalized linear model, which has not been extensively tested for RNA-Seq data and (ii) an extension of an expression mean–variance modeling approach to RNA-Seq data, involving modeling of the overdispersion as a function of the mean. Our approaches are flexible, allowing for general handling of discrete experimental factors and continuous covariates. We report comparisons with other alternate methods to handle RNA-Seq data. Although penalized methods have advantages for very small sample sizes, the beta-binomial generalized linear model, combined with simple outlier detection and testing approaches, appears to have favorable characteristics in power and flexibility.
Availability: An R package containing examples and sample datasets is available at http://www.bios.unc.edu/research/genomic_software/BBSeq
Contact: yzhou@bios.unc.edu; fwright@bios.unc.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btr449
PMCID: PMC3179656  PMID: 21810900
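The beta-binomial distribution at the core of approach (i) can be sketched as below. This is a minimal illustration of the distribution itself, assuming a mean/overdispersion parametrization chosen here for readability; it is not the BBSeq code, and the function names and numerical values are hypothetical.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_logpmf(k, n, a, b):
    """Log-probability of k reads out of n under a beta-binomial(a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)

def shapes_from_mean_dispersion(mu, rho):
    """Convert mean proportion mu and overdispersion rho (0 < rho < 1)
    to shape parameters, using rho = 1 / (a + b + 1)."""
    total = (1.0 - rho) / rho
    return mu * total, (1.0 - mu) * total

# Hypothetical gene: 20 total reads, mean proportion 0.3, overdispersion 0.1.
n, mu, rho = 20, 0.3, 0.1
a, b = shapes_from_mean_dispersion(mu, rho)
pmf = [math.exp(beta_binomial_logpmf(k, n, a, b)) for k in range(n + 1)]

mean_count = sum(k * p for k, p in enumerate(pmf))
var_count = sum((k - mean_count) ** 2 * p for k, p in enumerate(pmf))
# Variance exceeds the binomial value n*mu*(1-mu) by the factor 1 + (n-1)*rho.
expected_var = n * mu * (1.0 - mu) * (1.0 + (n - 1) * rho)
```

In a generalized linear model, `mu` would be linked to design factors and covariates, and the mean-variance approach (ii) would model `rho` as a function of the mean rather than fitting it freely per gene.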
19.  Control of population stratification by correlation-selected principal components 
Biometrics  2010;67(3):967-974.
Summary
In genome-wide association studies, population stratification is recognized as producing inflated type I error due to the inflation of test statistics. Principal component-based methods applied to genotypes provide information about population structure, and have been widely used to control for stratification. Here we explore the precise relationship between genotype principal components and inflation of association test statistics, thereby drawing a connection between principal component-based stratification control and the alternative approach of genomic control. Our results provide an inherent justification for the use of principal components, but call into question the popular practice of selecting principal components based on significance of eigenvalues alone. We propose a new approach, called EigenCorr, which selects principal components based on both their eigenvalues and their correlation with the (disease) phenotype. Our approach tends to select fewer principal components for stratification control than does testing of eigenvalues alone, providing substantial computational savings and improvements in power. Analyses of simulated and real data demonstrate the usefulness of the proposed approach.
doi:10.1111/j.1541-0420.2010.01520.x
PMCID: PMC3117098  PMID: 21133882
Genomic Control; GWAS; PCA; Population Stratification
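The genomic control approach mentioned as the alternative to principal components can be illustrated with its standard inflation factor: the median of the observed 1-df chi-square association statistics divided by the median of the null chi-square distribution (about 0.4549). The sketch below shows that calculation only; it is not the EigenCorr procedure, whose selection statistic is not given in the abstract, and the test statistics are made up.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(chi2_stats):
    """Genomic control inflation factor (lambda) for 1-df chi-square statistics."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN

def genomic_control_adjust(chi2_stats):
    """Deflate the statistics by lambda; lambda is floored at 1 so that
    the statistics are never inflated by the correction."""
    lam = max(1.0, genomic_inflation(chi2_stats))
    return [s / lam for s in chi2_stats]

# Hypothetical association statistics from a stratified sample.
observed = [0.1, 0.3, 0.91, 2.0, 5.0]
lam = genomic_inflation(observed)
adjusted = genomic_control_adjust(observed)
```

A lambda near 1 suggests little stratification; values well above 1 indicate inflated statistics, the same inflation that principal component adjustment aims to remove at the level of individual genotypes.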
20.  A Novel Lung Disease Phenotype Adjusted for Mortality Attrition for Cystic Fibrosis Genetic Modifier Studies 
Pediatric pulmonology  2011;46(9):857-869.
SUMMARY
Genetic studies of lung disease in Cystic Fibrosis are hampered by the lack of a severity measure that accounts for chronic disease progression and mortality attrition. Further, combining analyses across studies requires common phenotypes that are robust to study design and patient ascertainment.
Using data from the North American Cystic Fibrosis Modifier Consortium (Canadian Consortium for CF Genetic Studies, Johns Hopkins University CF Twin and Sibling Study, and University of North Carolina/Case Western Reserve University Gene Modifier Study), the authors calculated age-specific CF percentile values of FEV1 which were adjusted for CF age-specific mortality data.
The phenotype was computed for 2061 patients representing the Canadian CF population, 1137 extreme phenotype patients in the UNC/Case Western study, and 1323 patients from multiple CF sib families in the CF Twin and Sibling Study. Despite differences in ascertainment and median age, our phenotype score was distributed in all three samples in a manner consistent with ascertainment differences, reflecting the lung disease severity of each individual in the underlying population. The new phenotype score was highly correlated with the previously recommended complex phenotype, but the new phenotype is more robust for shorter follow-up and for extreme ages.
A disease progression and mortality adjusted phenotype reduces the need for stratification or additional covariates, increasing statistical power and avoiding possible distortions. This approach will facilitate large scale genetic and environmental epidemiological studies which will provide targeted therapeutic pathways for the clinical benefit of patients with CF.
doi:10.1002/ppul.21456
PMCID: PMC3130075  PMID: 21462361
Forced Expiratory Volume; Age Effects; Severity of Illness Index
21.  MicroRNA Expression in the Livers of Inbred Mice 
Mutation research  2011;714(1-2):126-133.
MicroRNAs are short, non-coding RNA sequences that regulate genes at the post-transcriptional level and have been shown to be important in development, tissue differentiation, and disease. Limited attention has been given to the natural variation in miRNA expression across genetically diverse populations even though it is well established that genetic polymorphisms can have a profound effect on mRNA levels. Expression level of 577 miRNAs in the livers of 70 strains of inbred mice was assessed, and we found that miRNA expression is highly stable across different strains. Globally, the expression of miRNA target transcripts does not correlate with miRNA expression, primarily due to the low variance of miRNA but high variance of mRNA expression across strains. Our results show that there is little genetic effect on the baseline miRNA levels in murine liver. The stability of mouse liver miRNA expression in a genetically diverse population suggests that treatment-induced disruptions in liver miRNA expression, a phenomenon established for a large number of toxicants, may indicate an important mechanism for the disturbance of normal liver function, and may prove to be a useful genetic background-independent biomarker of toxicant effect.
doi:10.1016/j.mrfmmm.2011.05.007
PMCID: PMC3166582  PMID: 21616085
micro RNA; liver; mouse; gene expression
22.  NF-κB is activated by radiotherapy and is prognostic for overall survival in patients with rectal cancer treated with preoperative fluorouracil-based chemoradiation 
Purpose
Rectal cancer is often clinically resistant to radiotherapy, and there would be value in identifying molecular markers to define the biological basis for this phenomenon. NF-κB is a potentially anti-apoptotic transcription factor that has been associated with resistance to radiotherapy in model systems. This study was designed to evaluate NF-κB activation in rectal cancers being treated with chemoradiation to determine whether NF-κB activity correlates with outcome in rectal cancer.
Methods and Materials
22 patients were biopsied at multiple time points in a prospective study, and another 50 were analyzed retrospectively. Pre-treatment tumor tissue was analyzed for multiple NF-κB subunits by immunohistochemistry (IHC). Serial tumor biopsies were analyzed for NF-κB-regulated gene expression by RT-PCR and for NF-κB subunit nuclear localization by IHC.
Results
Several NF-κB target genes (Bcl-2, cIAP-2, IL-8 and TRAF1) were significantly upregulated by a single fraction of radiotherapy at 24 hours, demonstrating for the first time that NF-κB is activated by radiotherapy in human rectal tumors. Baseline NF-κB p50 nuclear expression did not correlate with pathologic response to radiotherapy, but increasing baseline p50 was prognostic for overall survival (HR 2.15, p = 0.040).
Conclusions
NF-κB nuclear expression at baseline in rectal cancer is prognostic for overall survival but not predictive of response to radiotherapy. Larger patient numbers would be needed to assess the effect of NF-κB target gene upregulation on response to RT. Our results suggest that NF-κB may play an important role in tumor metastasis as opposed to resistance to chemoradiotherapy.
doi:10.1016/j.ijrobp.2010.02.063
PMCID: PMC3010530  PMID: 20630669
23.  Multiple apical plasma membrane constituents are associated with susceptibility to meconium ileus in individuals with cystic fibrosis 
Nature Genetics  2012;44(5):562-569.
Variants associated with meconium ileus in cystic fibrosis (CF) were identified in 3,763 patients by GWAS. Five SNPs at two loci near SLC6A14 (min P=1.28×10^−12 at rs3788766), chr Xq23-24 and SLC26A9 (min P=9.88×10^−9 at rs4077468), chr 1q32.1 accounted for ~5% of the phenotypic variability, and were replicated in an independent patient collection (n=2,372; P=0.001 and 0.0001 respectively). By incorporating the knowledge that disease-causing mutations in CFTR alter electrolyte and fluid flux across epithelia into a hypothesis-driven genome-wide analysis (GWAS-HD), we identified the same SLC6A14 and SLC26A9 associated SNPs, while establishing evidence for the involvement of SNPs in a third solute carrier gene, SLC9A3. In addition, GWAS-HD provided evidence of association between meconium ileus and multiple constituents of the apical plasma membrane where CFTR resides (P=0.0002, testing 155 apical genes jointly and replicated, P=0.022). These findings suggest that modulating activities of apical membrane constituents could complement current therapeutic paradigms for cystic fibrosis.
doi:10.1038/ng.2221
PMCID: PMC3371103  PMID: 22466613
24.  Genome-wide association and linkage identify modifier loci of lung disease severity in cystic fibrosis at 11p13 and 20q13.2 
Nature Genetics  2011;43(6):539-546.
A combined genome-wide association and linkage study was used to identify loci causing variation in CF lung disease severity. A significant association (P=3.34 × 10^-8) near EHF and APIP (chr11p13) was identified in F508del homozygotes (n=1,978). The association replicated in F508del homozygotes (P=0.006) from a separate family-based study (n=557), with P=1.49 × 10^-9 for the three-study joint meta-analysis. Linkage analysis of 486 sibling pairs from the family-based study identified a significant QTL on chromosome 20q13.2 (LOD=5.03). Our findings provide insight into the causes of variation in lung disease severity in CF and suggest new therapeutic targets for this life-limiting disorder.
doi:10.1038/ng.838
PMCID: PMC3296486  PMID: 21602797
25.  Understanding the Population Structure of North American Patients with Cystic Fibrosis 
Clinical genetics  2011;79(2):136-146.
Rationale
It is generally presumed that the Cystic Fibrosis (CF) population is relatively homogeneous, and predominantly of European origin. The complex ethnic make-up observed in the CF patients collected by the North American CF Modifier Gene Consortium has brought this assumption into question, and suggested the potential for population substructure in the three CF study samples collected from North America. It is well appreciated that population substructure can result in spurious genetic associations.
Objectives
To understand the ethnic composition of the North American CF population, and to assess the need for population structure adjustment in genetic association studies with North American CF patients.
Methods
Genome-wide single-nucleotide polymorphisms on 3076 unrelated North American CF patients were used to perform population structure analyses. We compared self-reported ethnicity to genotype-inferred ancestry, and also examined whether geographic distribution and CFTR mutation type could explain the structure observed.
Main Results
Although largely Caucasian, our analyses identified a considerable number of CF patients with admixed African-Caucasian, Mexican-Caucasian and Indian-Caucasian ancestries. Population substructure was present and comparable across the three studies of the consortium. Neither geographic distribution nor mutation type explained the population structure.
Conclusion
Given the ethnic diversity of the North American CF population, it is essential to carefully detect, estimate and adjust for population substructure to guard against potential spurious findings in CF genetic association studies. Other Mendelian diseases that are presumed to predominantly affect single ethnic groups may also benefit from careful analysis of population structure.
doi:10.1111/j.1399-0004.2010.01502.x
PMCID: PMC2995003  PMID: 20681990
ethnicity; principal component analysis; population substructure; population stratification
