Recent studies have demonstrated the use of genomic data, particularly gene expression signatures, as clinical prognostic factors in complex diseases. Such studies herald the future for genomic medicine and the opportunity for personalized prognosis in a variety of clinical contexts that utilize genome-scale molecular information. Several key areas represent logical and critical next steps in the use of complex genomic profiling data towards the goal of personalized medicine. First, analyses should be geared toward the development of molecular profiles that predict future events – such as major clinical events or the response, resistance, or adverse reaction to therapy. Second, these must move into actual clinical practice by forming the basis for the next generation of clinical trials that will employ these methodologies to stratify patients. Lastly, there remain formidable challenges in the translation of genomic technologies into clinical medicine that will need to be addressed: professional and public education, health outcomes research, reimbursement, regulatory oversight and privacy protection.
genomic medicine, personalized medicine, human genome.
Staphylococcus aureus causes a spectrum of human infection. Diagnostic delays and uncertainty lead to treatment delays and inappropriate antibiotic use. A growing literature suggests the host’s inflammatory response to the pathogen represents a potential tool to improve upon current diagnostics. The hypothesis of this study is that the host responds differently to S. aureus than to E. coli infection in a quantifiable way, providing a new diagnostic avenue. This study uses Bayesian sparse factor modeling and penalized binary regression to define peripheral blood gene-expression classifiers of murine and human S. aureus infection. The murine-derived classifier distinguished S. aureus infection from healthy controls and Escherichia coli-infected mice across a range of conditions (mouse and bacterial strain, time post infection) and was validated in outbred mice (AUC>0.97). A S. aureus classifier derived from a cohort of 94 human subjects distinguished S. aureus blood stream infection (BSI) from healthy subjects (AUC 0.99) and E. coli BSI (AUC 0.84). Murine and human responses to S. aureus infection share common biological pathways, allowing the murine model to classify S. aureus BSI in humans (AUC 0.84). Both murine and human S. aureus classifiers were validated in an independent human cohort (AUC 0.95 and 0.92, respectively). The approach described here lends insight into the conserved and disparate pathways utilized by mice and humans in response to these infections. Furthermore, this study advances our understanding of S. aureus infection and the host response to it, and identifies new diagnostic and therapeutic avenues.
There is great potential for host-based gene expression analysis to impact the early diagnosis of infectious diseases. In particular, the influenza pandemic of 2009 highlighted the challenges and limitations of traditional pathogen-based testing for suspected upper respiratory viral infection. We inoculated human volunteers with influenza A (either A/Brisbane/59/2007 (H1N1) or A/Wisconsin/67/2005 (H3N2)), and assayed the peripheral blood transcriptome every 8 hours for 7 days. Of 41 inoculated volunteers, 18 (44%) developed symptomatic infection. Using unbiased sparse latent factor regression analysis, we generated a gene signature (or factor) for symptomatic influenza capable of detecting 94% of infected cases. This gene signature is detectable as early as 29 hours post-exposure and achieves maximal accuracy on average 43 hours (p = 0.003, H1N1) and 38 hours (p = 0.005, H3N2) before peak clinical symptoms. In order to test the relevance of these findings in naturally acquired disease, a composite influenza A signature built from these challenge studies was applied to Emergency Department patients, where it discriminated between swine-origin influenza A/H1N1 (2009) infected and non-infected individuals with 92% accuracy. The host genomic response to influenza infection is robust and may provide the means for detection before typical clinical symptoms are apparent.
Current understanding of chronic diseases is based on crude clinical characterization, imaging studies, and laboratory testing that has evolved over decades. The Measurement to Understand Reclassification of Disease of Cabarrus/Kannapolis (MURDOCK) Study is a multi-tiered, longitudinal study designed to enable classification of chronic diseases using clinically annotated biospecimen collections, -omic technologies, electronic health records, and standard epidemiological methods. We expect that detailed molecular classification will improve mechanistic understanding of chronic diseases, augmenting discovery and testing of new treatments, and allowing refined selection of prevention and treatment strategies. The MURDOCK Study Community Registry and Biorepository will serve as a bridge for validation of initial exploratory studies, a platform for future prospective studies in targeted populations, and a resource of both data (analytical and clinical) and samples for cross-registry meta-analyses and comparative population studies. Participation of local health care providers and the Cabarrus County/Kannapolis, NC, community will facilitate future medical research and provide the opportunity to educate and inform the public about genomic research, actively engaging them in shaping the future of medical discovery and treatment of chronic diseases. We present the rationale and study design for the MURDOCK Community Registry and Biorepository and baseline characteristics of the first 6000 participants.
Disease reclassification; community registry; biorepository
Identify SNPs associated with mild statin-induced side effects.
Statin-induced side effects can interfere with therapy. SNPs in cytochrome P450 enzymes impair statin metabolism; the reduced function SLCO1B1*5 allele impairs statin clearance and is associated with simvastatin-induced myopathy with CK elevation.
The STRENGTH study was a pharmacogenetics study of statin efficacy and safety. Subjects (n=509) were randomized to atorvastatin 10mg, simvastatin 20mg, or pravastatin 10mg followed by 80mg, 80mg, and 40mg, respectively. We defined a composite adverse event (CAE) as discontinuation for any side effect, myalgia, or CK>3× baseline during follow-up. We sequenced CYP2D6, CYP2C8, CYP2C9, CYP3A4, and SLCO1B1 and tested seven reduced function alleles for association with the CAE.
The CAE occurred in 99 subjects (54 discontinuations, 49 myalgias, and nine CK elevations). Sex was associated with CAE (percent female in CAE vs. no CAE groups, 66% vs. 50%, p<0.01). SLCO1B1*5 was associated with CAE (percent with ≥ 1 allele in CAE vs. no CAE groups, 37% vs. 25%, p=0.03) and with CAE without significant CK elevation (p ≤ 0.03). Furthermore, there was evidence for a gene-dose effect (percent with CAE in those with 0, 1, or 2 alleles: 19%, 27%, and 50%, trend p = 0.01). Finally, the CAE risk appeared to be highest in those carriers assigned to simvastatin.
SLCO1B1*5 genotype and female sex were associated with mild statin-induced side effects. These findings extend the results of a recent genome-wide association study of statin myopathy with CK > 3 times normal to milder, statin-induced muscle side effects.
hydroxymethylglutaryl-CoA Reductase Inhibitors; pharmacogenetics; single nucleotide polymorphisms; muscular diseases; clinical trial; myopathy
Genomic risk profiling involves the analysis of genetic variations linked through statistical associations to a range of disease states. There is considerable controversy as to how, and even whether, to incorporate these tests into routine medical care.
To assess physician attitudes and uptake of genomic risk profiling among an ‘early adopter’ practice group.
We surveyed members of MDVIP, a national group of primary care physicians (PCPs), currently offering genomic risk profiling as part of their practice.
All physicians in the MDVIP network (N = 356)
We obtained a 44% response rate. One third of respondents had ordered a test for themselves and 42% for a patient. The odds of having ordered personal testing were 10.51-fold higher for those who felt well-informed about genomic risk testing (p < 0.0001). Of those who had not ordered a test for themselves, 60% expressed concerns for patients regarding discrimination by life and long-term/disability insurers, 61% about test cost, and 62% about clinical utility. The odds of ordering testing for their patients were 8.29-fold higher among respondents who had ordered testing for themselves (p < 0.0001). Of those who had ordered testing for patients, concerns about insurance coverage (p = 0.014) and uncertain clinical utility (p = 0.034) were associated with a lower relative frequency of intention to order testing again in the future.
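The fold differences in odds reported above are odds ratios. As a minimal sketch of how such a ratio is formed from a 2x2 table, the Python below computes an unadjusted odds ratio with a Woolf (log-scale) 95% confidence interval. The abstract does not state how its odds ratios were estimated (they may come from a regression model), and the counts used here are hypothetical, chosen only for illustration; they are not the survey data.

```python
import math


def odds_ratio(exposed_yes, exposed_no, unexposed_yes, unexposed_no, z=1.96):
    """Unadjusted odds ratio for a 2x2 table with a Woolf (log) interval.

    All four cell counts must be positive. Returns (estimate, lower, upper).
    """
    estimate = (exposed_yes * unexposed_no) / (exposed_no * unexposed_yes)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log = math.sqrt(
        1 / exposed_yes + 1 / exposed_no + 1 / unexposed_yes + 1 / unexposed_no
    )
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts (NOT from the survey): rows are "ordered a test for
# self" vs. "did not"; columns are "ordered for a patient" vs. "did not".
estimate, lower, upper = odds_ratio(30, 20, 15, 85)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An odds ratio above 1 with an interval that excludes 1 corresponds to the kind of association the abstract describes; adjusted estimates from a multivariable model would generally differ from this unadjusted form.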
Our findings demonstrate that respondent familiarity was a key predictor of physician ordering behavior and clinical utility was a primary concern for genomic risk profiling. Educational and interpretive support may enhance uptake of genomic risk profiling.
Electronic supplementary material
The online version of this article (doi:10.1007/s11606-011-1651-7) contains supplementary material, which is available to authorized users.
primary care; genetic testing; risk; education
We describe the study design, procedures, and development of the risk counseling protocol used in a randomized controlled trial to evaluate the impact of genetic testing for diabetes mellitus (DM) on psychological, health behavior, and clinical outcomes.
Eligible patients are aged 21 to 65 years with body mass index (BMI) ≥27 kg/m2 and no prior diagnosis of DM. At baseline, conventional DM risk factors are assessed, and blood is drawn for possible genetic testing. Participants are randomized to receive conventional risk counseling for DM with eye disease counseling or with genetic test results. The counseling protocol was pilot tested to identify an acceptable graphical format for conveying risk estimates and to match the length of the eye disease counseling to that of the genetic counseling. Risk estimates are presented with a vertical bar graph denoting risk level with colors and descriptors. After receiving either genetic counseling regarding risk for DM or control counseling on eye disease, brief lifestyle counseling for prevention of DM is provided to all participants.
A standardized risk counseling protocol is being used in a randomized trial of 600 participants. Results of this trial will inform policy about whether risk counseling should include genetic counseling.
ClinicalTrials.gov Identifier NCT01060540
Genetic testing; Type II diabetes; Weight loss
Facing critically low return per dollar invested on clinical research and clinical care, the American biomedical enterprise is in need of a significant transformation. A confluence of high-throughput “omic” technologies and increasing adoption of the electronic health record has fueled excitement for a new paradigm for biomedical research and practice. The ability to simultaneously measure thousands of molecular variables and assess their relationships with clinical data collected during the course of care could enable reclassification of disease not only by gross phenotypic observation but according to underlying molecular mechanism and influence of social determinants. In turn, this reclassification could enable development of targeted therapeutic interventions as well as disease prevention strategies at the individual and population levels.
The MURDOCK Study consists of distinct project “horizons” or stages. Horizon 1 entailed the generation and analysis of molecular data for existing large, clinically well-annotated cohorts in four disease areas. Horizon 1.5 involves creating and maintaining a 50,000-person, community volunteer registry for biomarker signature validation and prospective studies, including integration of environmental and social data. Horizon 2 leverages and prospectively recruits Horizon 1.5 volunteers, and extends the study to additional disease areas of interest. Horizon 3 will expand the study through regional, national, and international partnerships.
The MURDOCK Study embodies a new model of team science investigation and represents a significant resource for translational research. The study team invites inquiries to form new collaborations to exploit the rich resources provided by these biospecimens and associated study data.
Stratified medicine; personalized medicine; biomarkers; disease reclassification; community registry; biorepository
The biomedical research community relies on a diverse set of resources, both within their own institutions and at other research centers. In addition, an increasing number of shared electronic resources have been developed. Without effective means to locate and query these resources, it is challenging, if not impossible, for investigators to be aware of the myriad resources available, or to effectively perform resource discovery when the need arises. In this paper, we describe the development and use of the Biomedical Resource Ontology (BRO) to enable semantic annotation and discovery of biomedical resources. We also describe the Resource Discovery System (RDS) which is a federated, inter-institutional pilot project that uses the BRO to facilitate resource discovery on the Internet. Through the RDS framework and its associated Biositemaps infrastructure, the BRO facilitates semantic search and discovery of biomedical resources, breaking down barriers and streamlining scientific research that will improve human health.
Ontology; Biositemaps; Resources; Biomedical research; Resource annotation; Resource discovery; Search; Semantic web; Web 2.0; Clinical and Translational Science Awards
Type 2 diabetes is a prevalent chronic condition globally that results in extensive morbidity, decreased quality of life, and increased health services utilization. Lifestyle changes can prevent the development of diabetes, but require patient engagement. Genetic risk testing might represent a new tool to increase patients' motivation for lifestyle changes. Here we describe the rationale, development, and design of a randomized controlled trial (RCT) assessing the clinical and personal utility of incorporating type 2 diabetes genetic risk testing into comprehensive diabetes risk assessments performed in a primary care setting.
Patients are recruited in the laboratory waiting areas of two primary care clinics and enrolled into one of three study arms. Those interested in genetic risk testing are randomized to receive either a standard risk assessment (SRA) for type 2 diabetes incorporating conventional risk factors plus upfront disclosure of the results of genetic risk testing ("SRA+G" arm), or the SRA alone ("SRA" arm). Participants not interested in genetic risk testing will not receive the test, but will receive SRA (forming a third, "no-test" arm). Risk counseling is provided by clinic staff (not study staff external to the clinic). Fasting plasma glucose, insulin levels, body mass index (BMI), and waist circumference are measured at baseline and 12 months, as are patients' self-reported behavioral and emotional responses to diabetes risk information. Primary outcomes are changes in insulin resistance and BMI after 12 months; secondary outcomes include changes in diet patterns, physical activity, waist circumference, and perceived risk of developing diabetes.
The utility, feasibility, and efficacy of providing patients with genetic risk information for common chronic diseases in primary care remain unknown. The study described here will help to establish whether providing type 2 diabetes genetic risk information in a primary care setting can help improve patients' clinical outcomes, risk perceptions, and/or their engagement in healthy behavior change. In addition, study design features such as the use of existing clinic personnel for risk counseling could inform the future development and implementation of care models for the use of individual genetic risk information in primary care.
genetic information clinical utility; genetic testing; preventive health behavior; RCT protocol; risk perception; type 2 diabetes
During the recent H1N1 influenza pandemic, excess morbidity and mortality were seen in young but not older adults, suggesting that prior infection with influenza strains may have protected older subjects. In contrast, a history of recent seasonal trivalent vaccine in younger adults was not associated with protection.
Methods and Findings
To study hemagglutinin (HA) antibody responses in influenza immunization and infection, we have studied the day 7 plasma cell repertoires of subjects immunized with seasonal trivalent inactivated influenza vaccine (TIV) and compared them to the plasma cell repertoires of subjects experimentally infected (EI) with influenza H3N2 A/Wisconsin/67/2005. The majority of circulating plasma cells after TIV produced influenza-specific antibodies, while most plasma cells after EI produced antibodies that did not react with influenza HA. While anti-HA antibodies from TIV subjects were primarily reactive with single or few HA strains, anti-HA antibodies from EI subjects were isolated that reacted with multiple HA strains. Plasma cell-derived anti-HA antibodies from TIV subjects showed more evidence of clonal expansion compared with antibodies from EI subjects. From an H3N2-infected subject, we isolated a 4-member clonal lineage of broadly cross-reactive antibodies that bound to multiple HA subtypes and neutralized both H1N1 and H3N2 viruses. This broad reactivity was not detected in post-infection plasma suggesting this broadly reactive clonal lineage was not immunodominant in this subject.
The presence of broadly reactive subdominant antibody responses in some EI subjects suggests that improved vaccine designs that make broadly reactive antibody responses immunodominant could protect against novel influenza strains.
The CDC's Family History Public Health Initiative encourages adoption and increased awareness of family health history. To meet these goals and develop a personalized medicine implementation science research agenda, the Genomedical Connection is using an implementation research (T3 research) framework to develop and integrate a self-administered computerized family history system with built-in decision support into 2 primary care clinics in North Carolina.
The family health history system collects a three-generation family history on 48 conditions and provides decision support (pedigree and tabular family history, provider recommendation report and patient summary report) for 4 pilot conditions: breast cancer, ovarian cancer, colon cancer, and thrombosis. All adult, English-speaking, non-adopted patients scheduled for well-visits are invited to complete the family health system prior to their appointment. Decision support documents are entered into the medical record and available to providers prior to the appointment. In order to optimize integration, components were piloted by stakeholders prior to and during implementation. Primary outcomes are change in appropriate testing for hereditary thrombophilia and screening for breast cancer, colon cancer, and ovarian cancer one year after study enrollment. Secondary outcomes include implementation measures related to the benefits and burdens of the family health system and its impact on clinic workflow, patients' risk perception, and intention to change health related behaviors. Outcomes are assessed through chart review, patient surveys at baseline and follow-up, and provider surveys. Clinical validity of the decision support is calculated by comparing its recommendations to those made by a genetic counselor reviewing the same pedigree; and clinical utility is demonstrated through reclassification rates and changes in appropriate screening (the primary outcome).
This study integrates a computerized family health history system within the context of a routine well-visit appointment to overcome many of the existing barriers to collection and use of family history information by primary care providers. Results of the implementation process, its acceptability to patients and providers, modifications necessary to optimize the system, and impact on clinical care can serve to guide future implementation projects for both family history and other tools of personalized medicine, such as health risk assessments.
Exposure to influenza viruses is necessary, but not sufficient, for healthy human hosts to develop symptomatic illness. The host response is an important determinant of disease progression. In order to delineate host molecular responses that differentiate symptomatic and asymptomatic Influenza A infection, we inoculated 17 healthy adults with live influenza (H3N2/Wisconsin) and examined changes in host peripheral blood gene expression at 16 timepoints over 132 hours. Here we present distinct transcriptional dynamics of host responses unique to asymptomatic and symptomatic infections. We show that symptomatic hosts invoke, simultaneously, multiple pattern recognition receptors-mediated antiviral and inflammatory responses that may relate to virus-induced oxidative stress. In contrast, asymptomatic subjects tightly regulate these responses and exhibit elevated expression of genes that function in antioxidant responses and cell-mediated responses. We reveal an ab initio molecular signature that strongly correlates to symptomatic clinical disease and biomarkers whose expression patterns best discriminate early from late phases of infection. Our results establish a temporal pattern of host molecular responses that differentiates symptomatic from asymptomatic infections and reveals an asymptomatic host-unique non-passive response signature, suggesting novel putative molecular targets for both prognostic assessment and ameliorative therapeutic intervention in seasonal and pandemic influenza.
The transcriptional responses of human hosts towards influenza viral pathogens are important for understanding virus-mediated immunopathology. Despite great advances gained through studies using model organisms, the complete temporal host transcriptional responses in a natural human system are poorly understood. In a human challenge study using live influenza (H3N2/Wisconsin) viruses, we conducted a clinically uninformed (unsupervised) factor analysis on gene expression profiles and established an ab initio molecular signature that strongly correlates to symptomatic clinical disease. This is followed by the identification of 42 biomarkers whose expression patterns best differentiate early from late phases of infection. In parallel, a clinically informed (supervised) analysis revealed over-stimulation of multiple viral sensing pathways in symptomatic hosts and linked their temporal trajectory with development of diverse clinical signs and symptoms. The resultant inflammatory cytokine profiles were shown to contribute to the pathogenesis because their significant increase preceded disease manifestation by 36 hours. In subclinical asymptomatic hosts, we discovered strong transcriptional regulation of genes involved in inflammasome activation, genes encoding virus interacting proteins, and evidence of active anti-oxidant and cell-mediated innate immune response. Taken together, our findings offer insights into influenza virus-induced pathogenesis and provide a valuable tool for disease monitoring and management in natural environments.
Identifying sources of variation in expression microarray data and the effect of variance in gene expression measurements on complex predictive and diagnostic models is essential when translating microarray-based experimental approaches into clinical assays. The technical reproducibility of microarray platforms is well established. Here, we investigate the additional impact of intratumor heterogeneity, a largely unstudied component of variance, on the performance of several microarray-based assays in breast cancer.
Patients and Methods
Genome-wide expression profiling was performed on 50 core needle biopsies from 18 breast cancer patients using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Global profiles of expression were characterized using unsupervised clustering methods and variance components models. Array-based measures of estrogen receptor (ER) and progesterone receptor (PR) status were compared with immunohistochemistry. The precision of genomic predictors of ER pathway status, recurrence risk, and sensitivity to chemotherapeutics was evaluated by intraclass correlation.
Global patterns of gene expression demonstrated that intratumor variation was substantially less than the total variation observed across the patient population. Nevertheless, a fraction of genes exhibited significant intratumor heterogeneity in expression. A high degree of reproducibility was observed in single-gene predictors of ER (intraclass correlation coefficient [ICC] = 0.94) and PR expression (ICC = 0.90), and in a multigene predictor of ER pathway activation (ICC = 0.98) with high concordance with immunohistochemistry. Substantial agreement was also observed for multigene signatures of cancer recurrence (ICC = 0.71) and chemotherapeutic sensitivity (ICC = 0.72 and 0.64).
Intratumor heterogeneity, although present at the level of individual gene expression, does not preclude precise microarray-based predictions of tumor behavior or clinical outcome in breast cancer patients.
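The intraclass correlation coefficients reported above summarize how much of the total variance in a predictor lies between patients rather than between biopsies of the same tumor. The abstract does not say which ICC form was used; the Python below is a minimal sketch of one common choice, the one-way random-effects ICC, written to allow unequal numbers of biopsies per patient. The function name and the example scores are hypothetical and are not data from the study.

```python
def icc_oneway(groups):
    """One-way random-effects ICC for replicate measurements.

    `groups` is a list of lists: one inner list of replicate scores per
    patient. Group sizes may differ; at least one group needs 2+ values.
    """
    num_groups = len(groups)
    total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / total
    means = [sum(g) / len(g) for g in groups]

    # Between-patient and within-patient sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (num_groups - 1)
    ms_within = ss_within / (total - num_groups)

    # Effective replicate count for unbalanced designs.
    k0 = (total - sum(len(g) ** 2 for g in groups) / total) / (num_groups - 1)
    return (ms_between - ms_within) / (ms_between + (k0 - 1) * ms_within)


# Hypothetical predictor scores: three patients with 2, 3, and 2 biopsies.
scores = [[0.9, 1.1], [2.0, 2.2, 1.8], [3.1, 2.9]]
print(round(icc_oneway(scores), 3))
```

Values near 1 indicate that repeat biopsies from the same tumor give nearly the same prediction, which is the sense in which the abstract describes the predictors as reproducible.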
The National Heart, Lung, and Blood Institute convened a working group to provide basic and clinical research recommendations to the National Heart, Lung, and Blood Institute on the development of an integrated approach for identifying those individuals who are at high risk for a cardiovascular event such as acute coronary syndromes (ACS) or sudden cardiac death in the “near term.” The working group members defined near-term as occurring within 1 year of the time of assessment. The participants reviewed current clinical cardiology practices for risk assessment and state-of-the-science techniques in several areas, including biomarkers, proteomics, genetics, psychosocial factors, imaging, coagulation, and vascular and myocardial susceptibility. This report presents highlights of these reviews and a summary of suggested research directions.
cardiovascular diseases; death, sudden; myocardial infarction; risk factors; risk prediction
Alterations in gene expression in peripheral blood cells have been shown to be sensitive to the presence and extent of coronary artery disease (CAD). A non-invasive blood test that could reliably assess obstructive CAD likelihood would have diagnostic utility.
Microarray analysis of RNA samples from a 195-patient Duke CATHGEN registry case:control cohort yielded 2,438 genes with significant CAD association (p < 0.05), and identified the clinical/demographic factors with the largest effects on gene expression as age, sex, and diabetic status. RT-PCR analysis of 88 CAD classifier genes confirmed that diabetic status was the largest clinical factor affecting CAD-associated gene expression changes. A second microarray cohort analysis limited to non-diabetics from the multi-center PREDICT study (198 patients; 99 case:control pairs matched for age and sex) evaluated gene expression, clinical, and cell population predictors of CAD and yielded 5,935 CAD genes (p < 0.05) with an intersection of 655 genes with the CATHGEN results. Biological pathway (gene ontology and literature) and statistical analyses (hierarchical clustering and logistic regression) were used in combination to select 113 genes for RT-PCR analysis including CAD classifiers, cell-type specific markers, and normalization genes.
RT-PCR analysis of these 113 genes in a PREDICT cohort of 640 non-diabetic subject samples was used for algorithm development. Gene expression correlations identified clusters of CAD classifier genes which were reduced to meta-genes using LASSO. The final classifier for assessment of obstructive CAD was derived by Ridge Regression and contained sex-specific age functions and 6 meta-gene terms, comprising 23 genes. This algorithm showed a cross-validated estimated AUC = 0.77 (95% CI 0.73-0.81) in ROC analysis.
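The AUC quoted above has a direct interpretation: it is the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control. The Python below is a minimal sketch of that pairwise (Mann-Whitney) definition, with ties counted as one half. It does not reproduce the study's Ridge Regression model or its cross-validation; in a cross-validated estimate, each score would come from a model fitted without that subject. The function name and example scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    `labels` holds 1 for cases and 0 for controls; higher scores are taken
    to indicate higher likelihood of being a case.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    # Each case/control pair scores 1 if the case ranks higher, 0.5 on a tie.
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical scores: two cases (label 1) and two controls (label 0).
print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))  # 3 of 4 pairs ordered correctly -> 0.75
```

The pairwise form is quadratic in sample size, which is acceptable for cohorts of a few hundred subjects; a rank-based formula gives the same value more efficiently for larger data.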
We have developed a whole blood classifier based on gene expression, age and sex for the assessment of obstructive CAD in non-diabetic patients from a combination of microarray and RT-PCR data derived from studies of patients clinically indicated for invasive angiography.
Clinical trial registration information
PREDICT, Personalized Risk Evaluation and Diagnosis in the Coronary Tree, http://www.clinicaltrials.gov, NCT00500617
Atherosclerosis; gene expression; whole blood classifier
There is interindividual variation in low-density lipoprotein cholesterol (LDLc) lowering by statins and limited study into the genetic associations of the dose-dependent LDLc lowering by statins.
Methods and Results
Five hundred nine patients with hyperlipidemia were randomly assigned atorvastatin 10 mg, simvastatin 20 mg, or pravastatin 10 mg (low-dose phase) followed by 80 mg, 80 mg, and 40 mg (high-dose phase), respectively. Thirty-one genes in statin, cholesterol, and lipoprotein metabolism were sequenced and 489 single nucleotide polymorphisms with minor allele frequencies >2% were tested for associations with percentage LDLc lowering at low doses using multivariable adjusted general linear regression. Significant associations from the analysis at low dose were then repeated at high-dose statins. At low doses, only 1 single nucleotide polymorphism met our experiment-wide significance level, ABCA1 rs12003906. Twenty-six subjects carried the minor allele of rs12003906, which was associated with an attenuated LDLc reduction (LDLc reduction in carriers versus noncarriers −24.1±2.6% versus −32.2±1.5%; P=0.0001). In addition, we replicated the association with the APOE ε3 allele and a reduced LDLc reduction. At high doses, carriers of the minor allele of ABCA1 rs12003906 and the APOE ε3 allele improved their LDLc reduction but continued to have a diminished LDLc reduction compared with noncarriers (−30.5±4.0% versus −42.0±2.4%; P=0.005) and (−38.5±1.9% versus −45.3±2.8%; P=0.009), respectively.
An intronic single nucleotide polymorphism in ABCA1 and the APOE ε3 allele are associated with reduced LDLc lowering by statins and identify individuals who may be resistant to maximal LDLc lowering by statins.
cholesterol; genetics; hypercholesterolemia; pharmacogenetics; HMG-CoA
Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis.
Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), rhinovirus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of performing nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches.
Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
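To illustrate why the Indian Buffet Process lets the number of factors be inferred rather than fixed in advance, the Python below sketches a single draw from the standard IBP prior: each sample (a "customer") reuses an existing factor (a "dish") with probability proportional to how many earlier samples used it, and then adds a Poisson-distributed number of new factors. This is only the textbook generative prior, not the authors' model or their Gibbs/VB inference; the function names, the concentration value, and the seed are assumptions made for the example.

```python
import math
import random


def poisson_draw(rate, rng):
    """Draw from Poisson(rate) by multiplying uniforms (Knuth's method)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_samples, alpha, seed=0):
    """Return a binary sample-by-factor matrix drawn from the IBP prior."""
    rng = random.Random(seed)
    usage = []  # usage[k] = number of samples so far that use factor k
    rows = []
    for i in range(1, num_samples + 1):
        # Reuse existing factor k with probability usage[k] / i.
        row = [1 if rng.random() < m / i else 0 for m in usage]
        # Then introduce Poisson(alpha / i) brand-new factors.
        new_factors = poisson_draw(alpha / i, rng)
        usage = [m + z for m, z in zip(usage, row)] + [1] * new_factors
        rows.append(row + [1] * new_factors)
    # Pad earlier rows with zeros for factors introduced later.
    width = len(usage)
    return [r + [0] * (width - len(r)) for r in rows]


# Hypothetical settings: 10 samples, concentration alpha = 2.0.
matrix = sample_ibp(10, alpha=2.0, seed=1)
print(len(matrix), "samples,", len(matrix[0]), "factors in this draw")
```

The number of columns differs from draw to draw and grows slowly with the number of samples, which is the property that allows a posterior over the number of factors instead of a fixed choice.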
Recent advances in genomic research have demonstrated a substantial role for genomic factors in predicting response to cancer therapies. Researchers in the fields of cancer pharmacogenomics and pharmacoepidemiology seek to understand why individuals respond differently to drug therapy, in terms of both adverse effects and treatment efficacy. To identify research priorities as well as the resources and infrastructure needed to advance these fields, the National Cancer Institute (NCI) sponsored a workshop titled “Cancer Pharmacogenomics: Setting a Research Agenda to Accelerate Translation” on July 21, 2009, in Bethesda, MD. In this commentary, we summarize and discuss five science-based recommendations and four infrastructure-based recommendations that were identified as a result of discussions held during this workshop. Key recommendations include 1) supporting the routine collection of germline and tumor biospecimens in NCI-sponsored clinical trials and in some observational and population-based studies; 2) incorporating pharmacogenomic markers into clinical trials; 3) addressing the ethical, legal, social, and biospecimen- and data-sharing implications of pharmacogenomic and pharmacoepidemiologic research; and 4) establishing partnerships across NCI, with other federal agencies, and with industry. Together, these recommendations will facilitate the discovery and validation of clinical, sociodemographic, lifestyle, and genomic markers related to cancer treatment response and adverse events, and they will improve both the speed and efficiency by which new pharmacogenomic and pharmacoepidemiologic information is translated into clinical practice.
The increasing availability of personal genomic tests has led to discussions about the validity and utility of such tests and the balance of benefits and harms. A multidisciplinary workshop was convened by the National Institutes of Health and the Centers for Disease Control and Prevention to review the scientific foundation for using personal genomics in risk assessment and disease prevention and to develop recommendations for targeted research. The clinical validity and utility of personal genomics is a moving target with rapidly developing discoveries but little translation research to close the gap between discoveries and health impact. Workshop participants made recommendations in five domains: (1) developing and applying scientific standards for assessing personal genomic tests; (2) developing and applying a multidisciplinary research agenda, including observational studies and clinical trials to fill knowledge gaps in clinical validity and utility; (3) enhancing credible knowledge synthesis and information dissemination to clinicians and consumers; (4) linking scientific findings to evidence-based recommendations for use of personal genomics; and (5) assessing how the concept of personal utility can affect health benefits, costs, and risks by developing appropriate metrics for evaluation. To fulfill the promise of personal genomics, a rigorous multidisciplinary research agenda is needed.
behavioral sciences; epidemiologic methods; evidence-based medicine; genetics; genetic testing; genomics; medicine; public health
Lipoprotein-associated phospholipase A2 (Lp-PLA2) is an emerging risk factor and therapeutic target for cardiovascular disease. The activity and mass of this enzyme are heritable traits, but major genetic determinants have not been explored in a systematic, genome-wide fashion. We carried out a genome-wide association study of Lp-PLA2 activity and mass in 6,668 Caucasian subjects from the population-based Framingham Heart Study. Clinical data and genotypes from the Affymetrix 550K SNP array were obtained from the open-access Framingham SHARe project. Each polymorphism that passed quality control was tested for associations with Lp-PLA2 activity and mass using linear mixed models implemented in the R statistical package, accounting for familial correlations, and controlling for age, sex, smoking, lipid-lowering-medication use, and cohort. For Lp-PLA2 activity, polymorphisms at four independent loci reached genome-wide significance, including the APOE/APOC1 region on chromosome 19 (p = 6×10^−24); CELSR2/PSRC1 on chromosome 1 (p = 3×10^−15); SCARB1 on chromosome 12 (p = 1×10^−8) and ZNF259/BUD13 in the APOA5/APOA1 gene region on chromosome 11 (p = 4×10^−8). All of these remained significant after accounting for associations with LDL cholesterol, HDL cholesterol, or triglycerides. For Lp-PLA2 mass, 12 SNPs achieved genome-wide significance, all clustering in a region on chromosome 6p12.3 near the PLA2G7 gene. Our analyses demonstrate that genetic polymorphisms may contribute to inter-individual variation in Lp-PLA2 activity and mass.
Blood levels of lipoprotein-associated phospholipase A2 (Lp-PLA2) show a strong association with atherosclerosis in humans. This enzyme is made by certain cells of the immune system, associates with lipoproteins (HDL and LDL), and is thought to be involved in inflammation. Studies have shown that Lp-PLA2 is a good predictor of cardiovascular disease, independent of HDL and LDL cholesterol levels. This has led to the development of drugs aimed at inhibiting Lp-PLA2 as a way to treat or prevent cardiovascular disease. The activity and mass of Lp-PLA2 are heritable traits, but major genetic determinants have not been explored in a systematic fashion. We examined genetic variants across the human genome to identify genes influencing Lp-PLA2 activity and mass. We studied 6,668 Caucasian subjects from the population-based Framingham Heart Study. Clinical data and genetic data on 550,000 genetic variants were available for association analysis. There was no overlap in the most significantly associated SNPs for activity and mass. We identified four distinct gene regions showing highly significant associations with Lp-PLA2 activity, all of which are known to include genes involved in cholesterol metabolism. The only locus associated with Lp-PLA2 mass was a region harboring PLA2G7, the gene that encodes lipoprotein-associated phospholipase A2.
Acute respiratory infections (ARI) are a common reason for seeking medical attention and the threat of pandemic influenza will likely add to these numbers. Using human viral challenge studies with live rhinovirus, respiratory syncytial virus, and influenza A, we developed peripheral blood gene expression signatures that distinguish individuals with symptomatic ARI from uninfected individuals with > 95% accuracy. We validated this “acute respiratory viral” signature - encompassing genes with a known role in host defense against viral infections - across each viral challenge. We also validated the signature in an independently acquired dataset for influenza A and classified infected individuals from healthy controls with 100% accuracy. In the same dataset, we could also distinguish viral from bacterial ARIs (93% accuracy). These results demonstrate that ARIs induce changes in human peripheral blood gene expression that can be used to diagnose a viral etiology of respiratory infection and triage symptomatic individuals.
Coronary artery disease and acute myocardial infarction are complex traits that have been the focus of recent research to identify the principal genes that engender susceptibility or provide protection. Although there has been exceptional progress in the technology, which now allows genotyping of hundreds of thousands of single‐nucleotide polymorphisms in each individual, there remains a pattern of inconsistency in the studies performed to date, in part owing to the difficulties in defining cases and controls. In this paper, salient issues to facilitate research in this important field are reviewed.
genetics; coronary artery disease; myocardial infarction; atherosclerosis
Several studies have noted that genetic variants of SCARB1, a lipoprotein receptor involved in reverse cholesterol transport, are associated with serum lipid levels in a sex-dependent fashion. However, the mechanism underlying this gene by sex interaction has not been explored.
We utilized both epidemiological and molecular methods to study how estrogen and gene variants interact to influence SCARB1 expression and lipid levels. Interaction between 35 SCARB1 haplotype-tagged polymorphisms and endogenous estradiol levels was assessed in 498 postmenopausal Caucasian women from the population-based Rancho Bernardo Study. We further examined associated variants with overall and SCARB1 splice variant (SR-BI and SR-BII) expression in 91 human liver tissues using quantitative real-time PCR.
Several variants on a haplotype block spanning intron 11 to intron 12 of SCARB1 showed significant gene by estradiol interaction affecting serum lipid levels, the strongest for rs838895 with HDL-cholesterol (p = 9.2 × 10^-4) and triglycerides (p = 1.3 × 10^-3) and the triglyceride:HDL cholesterol ratio (p = 2.7 × 10^-4). These same variants were associated with expression of the SR-BI isoform in a sex-specific fashion, with the strongest association found among liver tissue from 52 young women <45 years old (p = 0.002).
Estrogen and SCARB1 genotype may act synergistically to regulate expression of SCARB1 isoforms and impact serum levels of HDL cholesterol and triglycerides. This work highlights the importance of considering sex-dependent effects of gene variants on serum lipid levels.
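A gene by estradiol interaction of this kind is commonly assessed by adding a product term to a linear regression. The sketch below shows that general idea on synthetic data. It is an assumption-laden illustration, not the study's actual model: the function name `interaction_fit`, the additive 0/1/2 genotype coding, the absence of covariates, and all simulated values are hypothetical.

```python
import numpy as np


def interaction_fit(genotype, estradiol, lipid):
    """Ordinary least squares fit of
    lipid ~ intercept + genotype + estradiol + genotype * estradiol.

    Returns the coefficients and their t-statistics; index 3 is the
    interaction term.
    """
    g = np.asarray(genotype, dtype=float)
    e = np.asarray(estradiol, dtype=float)
    y = np.asarray(lipid, dtype=float)
    X = np.column_stack([np.ones_like(g), g, e, g * e])

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                 # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)        # coefficient covariance
    return beta, beta / np.sqrt(np.diag(cov))


# Synthetic data only: a true interaction effect of 0.5 is simulated.
rng = np.random.default_rng(0)
n = 498
genotype = rng.binomial(2, 0.3, size=n)          # minor-allele count (assumed coding)
estradiol = rng.normal(size=n)                   # standardized hormone level
lipid = (1.0 + 0.2 * genotype + 0.1 * estradiol
         + 0.5 * genotype * estradiol + rng.normal(scale=0.5, size=n))

beta, t_stats = interaction_fit(genotype, estradiol, lipid)
print("interaction coefficient:", beta[3], "t-statistic:", t_stats[3])
```

A large t-statistic on the product term indicates that the genotype effect on the lipid trait changes with estradiol level, which is the pattern the abstract describes.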
Systemic and local inflammation plays a prominent role in the pathogenesis of atherosclerotic coronary artery disease, but the relationship of whole blood gene expression changes with coronary disease remains unclear. We have investigated whether gene expression patterns in peripheral blood correlate with the severity of coronary disease and whether these patterns correlate with the extent of atherosclerosis in the vascular wall.
Patients were selected according to their coronary artery disease index (CADi), a validated angiographical measure of the extent of coronary atherosclerosis that correlates with outcome. RNA was extracted from blood of 120 patients with at least one stenosis greater than 50% (CADi≥23) and from 121 controls without evidence of coronary stenosis (CADi = 0).
160 individual genes were found to correlate with CADi (rho>0.2, P<0.003). Prominent differential expression was observed especially in genes involved in cell growth, apoptosis and inflammation. Using these 160 genes, a partial least squares multivariate regression model resulted in a highly predictive model (r^2 = 0.776, P<0.0001). The expression pattern of these 160 genes in aortic tissue also predicted the severity of atherosclerosis in human aortas, showing that peripheral blood gene expression associated with coronary atherosclerosis mirrors gene expression changes in atherosclerotic arteries.
In conclusion, the simultaneous expression pattern of 160 genes in whole blood correlates with the severity of coronary artery disease and mirrors expression changes in the atherosclerotic vascular wall.
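The gene-selection step can be sketched as a rank-correlation screen of each gene against the disease index. This example assumes that "rho" denotes Spearman's rank correlation and that the threshold applies to its absolute value; neither is confirmed by the abstract. The function names and the toy data are hypothetical, and the subsequent partial least squares step is not reproduced.

```python
import numpy as np


def average_ranks(x):
    """Rank values from 1..n, giving tied values their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tie = x == value
        ranks[tie] = ranks[tie].mean()
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return np.corrcoef(average_ranks(x), average_ranks(y))[0, 1]


def screen_genes(expression, score, threshold=0.2):
    """Return column indices of genes whose |rho| with the score exceeds threshold.

    expression: array of shape (n_subjects, n_genes)
    score:      array of shape (n_subjects,), e.g. a disease severity index
    """
    expression = np.asarray(expression, dtype=float)
    rhos = np.array([spearman_rho(expression[:, j], score)
                     for j in range(expression.shape[1])])
    return np.flatnonzero(np.abs(rhos) > threshold), rhos


# Toy data: gene 0 rises with the score, gene 1 is unrelated, gene 2 falls.
score = np.arange(10)
expression = np.column_stack([
    score ** 2,
    [0, 1, 1, 0, 0, 1, 1, 0, 0, 1],
    -score,
])
selected, rhos = screen_genes(expression, score)
print("selected gene indices:", selected)
```

Average ranks are used because a severity index such as CADi contains many tied values (all controls share a score of 0), and a rank correlation that ignores ties would be distorted.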