To identify loci for age at menarche, we performed a meta-analysis of 32 genome-wide association studies in 87,802 women of European descent, with replication in up to 14,731 women. In addition to the known loci at LIN28B (P=5.4×10−60) and 9q31.2 (P=2.2×10−33), we identified 30 novel menarche loci (all P<5×10−8) and found suggestive evidence for a further 10 loci (P<1.9×10−6). New loci included four previously associated with BMI (in/near FTO, SEC16B, TRA2B and TMEM18), three in/near other genes implicated in energy homeostasis (BSX, CRTC1, and MCHR2), and three in/near genes implicated in hormonal regulation (INHBA, PCSK2 and RXRG). Ingenuity and MAGENTA pathway analyses identified coenzyme A and fatty acid biosynthesis as biological processes related to menarche timing.
Analysis of the biological gene networks involved in a disease may lead to the identification of therapeutic targets. Such analysis requires exploring network properties, in particular the importance of individual network nodes (i.e., genes). There are many measures that consider the importance of nodes in a network and some may shed light on the biological significance and potential optimality of a gene or set of genes as therapeutic targets. This has been shown to be the case in cancer therapy. A dilemma exists, however, in finding the best therapeutic targets based on network analysis since the optimal targets should be nodes that are highly influential in, but not toxic to, the functioning of the entire network. In addition, cancer therapeutics targeting a single gene often result in relapse since compensatory, feedback and redundancy loops in the network may offset the activity associated with the targeted gene. Thus, multiple genes reflecting parallel functional cascades in a network should be targeted simultaneously, which in turn requires a way of identifying such targets. We propose a methodology that exploits centrality statistics characterizing the importance of nodes within a gene network that is constructed from the gene expression patterns in that network. We consider centrality measures based on both graph theory and spectral graph theory. We also consider the origins of a network topology, and show how different available representations yield different node importance results. We apply our techniques to tumor gene expression data and suggest that the identification of optimal therapeutic targets involving particular genes, pathways and sub-networks based on an analysis of the nodes in that network is possible and can facilitate individualized cancer treatments.
The proposed methods also have the potential to identify candidate cancer therapeutic targets that are not thought to be oncogenes but nonetheless play important roles in the functioning of a cancer-related network or pathway.
network analysis; centrality; cancer; pathway; drug targets; personalized treatment; gene expression
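As a rough illustration of the two families of centrality measures mentioned above, the following minimal sketch computes a graph-theoretic measure (degree) and a spectral measure (eigenvector centrality by power iteration) on a toy undirected network. The gene names, the edge list, and the helper functions are hypothetical and are not taken from the study; a real analysis would first derive the network from gene expression data.

```python
import math

# Hypothetical toy network: gene names and edges are illustrative only.
edges = [("G1", "G2"), ("G1", "G3"), ("G2", "G3"), ("G3", "G4")]


def build_adjacency(edge_list):
    """Return an undirected adjacency map {node: sorted list of neighbours}."""
    adj = {}
    for a, b in edge_list:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return {node: sorted(neigh) for node, neigh in adj.items()}


def degree_centrality(adj):
    """Graph-theoretic measure: number of direct neighbours of each node."""
    return {node: len(neigh) for node, neigh in adj.items()}


def eigenvector_centrality(adj, iterations=200):
    """Spectral measure: leading eigenvector of the adjacency matrix.

    Power iteration is applied to (A + I) rather than A. The shift keeps the
    same eigenvectors but avoids oscillation on bipartite graphs.
    """
    nodes = sorted(adj)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {n: score[n] + sum(score[m] for m in adj[n]) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new.values()))
        score = {n: v / norm for n, v in new.items()}
    return score


adjacency = build_adjacency(edges)
degree = degree_centrality(adjacency)
spectral = eigenvector_centrality(adjacency)
ranked = sorted(spectral, key=spectral.get, reverse=True)

print("degree:", degree)
print("nodes ranked by eigenvector centrality:", ranked)
```

In this toy network both measures rank G3 first, but on larger networks the two can disagree, which is one reason to compare several centrality statistics before nominating candidate targets.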
Body mass index (BMI) is a well-known measure of obesity with a multitude of genetic and non-genetic determinants. Identifying the underlying factors associated with BMI is difficult because of its multifactorial etiology that varies as a function of geoethnic background and socioeconomic setting. Thus, we pursued a study exploring the influence of the degree of Native American admixture on BMI (as well as weight and height individually) in a community sample of Native Americans (n=846) while accommodating a variety of socioeconomic and cultural factors.
Participants’ degree of Native American (NA) ancestry was estimated using a genome-wide panel of markers. The participants also completed an extensive survey of cultural and social identity measures: the Indian Culture Scale (ICS) and the Orthogonal Cultural Identification Scale (OCIS). Multiple linear regression was used to examine the relation between these measures and BMI.
Our results suggest that BMI is correlated positively with the proportion of NA ancestry. Age was also significantly associated with BMI, while gender and socioeconomic measures (education and income) were not. For the two cultural identity measures, the ICS showed a positive correlation with BMI, while OCIS was not associated with BMI.
Taken together, these results suggest that genetic and cultural environmental factors, rather than socioeconomic factors, account for a substantial proportion of variation in BMI in this population. Further, significant correlations between degree of NA ancestry and BMI suggest that admixture mapping may be appropriate to identify loci associated with BMI in this population.
Genetic ancestry; admixture; body habitus; obesity; Native Americans
There is considerable debate about the most efficient way to interrogate rare coding variants in association studies. The options include direct genotyping of specific known coding variants in genes or, alternatively, sequencing across the entire exome to capture known as well as novel variants. Each strategy has advantages and disadvantages, but the availability of cost-efficient exome arrays has made the former appealing. Here we consider the utility of a direct genotyping chip, the Illumina HumanExome array (HE), by evaluating its content based on: 1. functionality; and 2. amenability to imputation. We explored these issues by genotyping a large, ethnically diverse cohort on the HumanOmniExpressExome array (HOEE) which combines the HE with content from the GWAS array (HOE). We find that the use of the HE is likely to be a cost-effective way of expanding GWAS, but does have some drawbacks that deserve consideration when planning studies.
Illumina HumanExome array; expanding GWAS; genotyping rare SNPs; coding variants
Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it is not likely to amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants.
Sequencing; functional analysis; computer modeling; genomic variation
Dental caries remains a significant public health problem and is considered pandemic worldwide. The prediction of dental caries based on profiling of microbial species involved in disease and, equally important, the identification of species conferring dental health have proven more difficult than anticipated due to high interpersonal and geographical variability of dental plaque microbiota. We have used RNA-Seq to perform global gene expression analysis of dental plaque microbiota derived from 19 twin pairs that were either concordant (caries-active or caries-free) or discordant for dental caries. The transcription profiling allowed us to define a functional core microbiota consisting of nearly 60 species. Similarities in gene expression patterns allowed a preliminary assessment of the relative contribution of human genetics, environmental factors and caries phenotype on the microbiota's transcriptome. Correlation analysis of transcription allowed the identification of numerous functional networks, suggesting that interpersonal environmental variables may co-select for groups of genera and species. Analysis of functional role categories allowed the identification of dominant functions expressed by dental plaque biofilm communities, which highlight the biochemical priorities of dental plaque microbes to metabolize diverse sugars and cope with the acid and oxidative stress resulting from sugar fermentation. The wealth of data generated by deep sequencing of expressed transcripts enables a greatly expanded perspective concerning the functional expression of dental plaque microbiota.
caries; oral microbiota; dental plaque; biofilm; transcriptome
Next-generation sequencing (NGS) technologies have become much more efficient, allowing whole human genomes to be sequenced faster and cheaper than ever before. However, processing the raw sequence reads associated with NGS technologies requires care and sophistication in order to draw compelling inferences about phenotypic consequences of variation in human genomes. It has been shown that different approaches to variant calling from NGS data can lead to different conclusions. Ensuring appropriate accuracy and quality in variant calling can come at a computational cost.
We describe our experience implementing and evaluating a group-based approach to calling variants on large numbers of whole human genomes. We explore the influence of many factors that may impact the accuracy and efficiency of group-based variant calling, including group size, the biogeographical backgrounds of the individuals who have been sequenced, and the computing environment used. We make efficient use of the Gordon supercomputer cluster at the San Diego Supercomputer Center by incorporating job-packing and parallelization considerations into our workflow while calling variants on 437 whole human genomes generated as part of a large association study.
We ultimately find that our workflow resulted in high-quality variant calls in a computationally efficient manner. We argue that studies like ours should motivate further investigations combining hardware-oriented advances in computing systems with algorithmic developments to tackle emerging ‘big data’ problems in biomedical research brought on by the expansion of NGS technologies.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0736-4) contains supplementary material, which is available to authorized users.
Variant calling; Supercomputing; Whole-genome sequencing
The limitations of genome-wide association (GWA) studies that focus on the phenotypic influence of common genetic variants have motivated human geneticists to consider the contribution of rare variants to phenotypic expression. The increasing availability of high-throughput sequencing technology has enabled studies of rare variants, but will not be sufficient for their success since appropriate analytical methods are also needed. We consider data analysis approaches to testing associations between a phenotype and collections of rare variants in a defined genomic region or set of regions. Ultimately, although a wide variety of analytical approaches exist, more work is needed to refine them and determine their properties and power in different contexts.
Insulin-like growth factors (IGF) 1 and 2 are known as potential mitogens for normal and neoplastic cells. IGF2 is a main fetal growth factor, while IGF1 is activated through growth hormone action during postnatal growth and development. However, there is strong evidence that activation of IGF2 by its transcription factor, E2F transcription factor 3 (E2F3), is present in different types of cancer. High levels of IGF1 also strongly correlate with cancer development due to anti-apoptotic properties and enhancement of cancer cell differentiation, which can be attenuated by IGFBP3. Head and neck cancer is one of the six most common human cancers. The main risk factors for head and neck cancer are consumption of tobacco and alcohol, as well as viral and bacterial infections that stimulate chronic local inflammation. There is also a genetic basis for this form of cancer; however, the genetic markers are not yet established. In this study we investigated the expression levels of IGF2, IGF1, E2F3 and IGFBP3 in cancer tissue and in healthy tissue surrounding the tumor obtained from each of 41 patients. Our study indicated no alteration in the expression levels of IGF2, E2F3 and IGF1 in the head and neck squamous cell carcinoma (HNSCC) cases studied in the selected experimental population, but there was evidence for upregulation of pro-apoptotic IGFBP3 in cancer compared with healthy tissue. These findings indicate that insulin-like growth factors are not directly associated with HNSCC, although there is some variability between patients and tumor locations. However, the elevated level of IGFBP3 suggests a possible regulatory role of the binding protein in IGF signaling in this type of tumor.
oral cancer; IGF1; IGF2; survival
The determination of the ancestry and genetic backgrounds of the subjects in genetic and general epidemiology studies is a crucial component in the analysis of relevant outcomes or associations. Although there are many methods for differentiating ancestral subgroups among individuals based on genetic markers, only a few of these methods provide actual estimates of the fraction of an individual’s genome that is likely to be associated with different ancestral populations. We propose a method for assigning ancestry that works in stages to refine estimates of ancestral population contributions to individual genomes. The method leverages genotype data in the public domain obtained from individuals with known ancestries. Although we showcase the method in the assessment of ancestral genome proportions leveraging largely continental populations, the strategy can be used for assessing within-continent or more subtle ancestral origins with the appropriate data.
genetic ancestry; admixture; population genetics; admixture proportions
The ongoing controversy surrounding direct-to-consumer (DTC) personal genomic tests intensified last year when the U.S. Government Accountability Office (GAO) released results of an undercover investigation of four companies that offer such testing. Among their findings, they reported that some of their donors received DNA-based predictions that conflicted with their actual medical histories. We aimed to more rigorously evaluate the relationship between DTC genomic risk estimates and self-reported disease by leveraging data from the Scripps Genomic Health Initiative (SGHI). We prospectively collected self-reported personal and family health history data for 3,416 individuals who went on to purchase a commercially available DTC genomic test. For 5 out of 15 total conditions studied, we found that risk estimates from the test were significantly associated with self-reported family and/or personal health history. The 5 conditions included Graves’ disease, Type 2 Diabetes, Lupus, Alzheimer’s disease, and Restless Leg Syndrome. To further investigate these findings, we ranked each of the 15 conditions based on published heritability estimates and conducted post-hoc power analyses based on the number of individuals in our sample who reported significant histories of each condition. We found that high heritability, coupled with high prevalence in our sample and thus adequate statistical power, explained the pattern of associations observed. Our study represents one of the first evaluations of the relationship between risk estimates from a commercially available DTC personal genomic test and self-reported health histories in the consumers of that test.
direct-to-consumer; genetic testing; genetic risk estimates; clinical validity; consumer genomics
There have been a number of recent successes in the use of whole genome sequencing and sophisticated bioinformatics techniques to identify pathogenic DNA sequence variants responsible for individual idiopathic congenital conditions. However, the success of this identification process is heavily influenced by the ancestry or genetic background of a patient with an idiopathic condition. This is so because potential pathogenic variants in a patient’s genome must be contrasted with variants in a reference set of genomes made up of other individuals’ genomes of the same ancestry as the patient. We explored the effect of ignoring the ancestries of both an individual patient and the individuals used to construct reference genomes. We pursued this exploration in two major steps. We first considered variation in the per-genome number and rates of likely functional derived (i.e., non-ancestral, based on the chimp genome) single nucleotide variants and small indels in 52 individual whole human genomes sampled from 10 different global populations. We took advantage of a suite of computational and bioinformatics techniques to predict the functional effect of over 24 million genomic variants, both coding and non-coding, across these genomes. We found that the typical human genome harbors ∼5.5–6.1 million total derived variants, of which ∼12,000 are likely to have a functional effect (∼5000 coding and ∼7000 non-coding). We also found that the rates of functional genotypes per the total number of genotypes in individual whole genomes differ dramatically between human populations. We then created tables showing how the use of comparator or reference genome panels comprised of genomes from individuals that do not have the same ancestral background as a patient can negatively impact pathogenic variant identification. Our results have important implications for clinical sequencing initiatives.
clinical sequencing; congenital disease; whole genome sequencing; population genetics
To identify the cause of childhood onset involuntary paroxysmal choreiform and dystonic movements in 2 unrelated sporadic cases and to investigate the functional effect of missense mutations in adenylyl cyclase 5 (ADCY5) in sporadic and inherited cases of autosomal dominant familial dyskinesia with facial myokymia (FDFM).
Whole exome sequencing was performed on 2 parent–child trios. The effect of mutations in ADCY5 was studied by measurement of cyclic adenosine monophosphate (cAMP) accumulation under stimulatory and inhibitory conditions.
The same de novo mutation (c.1252C>T, p.R418W) in ADCY5 was found in both studied cases. An inherited missense mutation (c.2176G>A, p.A726T) in ADCY5 was previously reported in a family with FDFM. The significant phenotypic overlap with FDFM was recognized in both cases only after discovery of the molecular link. The inherited mutation in the FDFM family and the recurrent de novo mutation affect residues in different protein domains, the first cytoplasmic domain and the first membrane-spanning domain, respectively. Functional studies revealed a statistically significant increase in β-receptor agonist-stimulated intracellular cAMP consistent with an increase in adenylyl cyclase activity for both mutants relative to wild-type protein, indicative of a gain-of-function effect.
FDFM is likely caused by gain-of-function mutations in different domains of ADCY5—the first definitive link between adenylyl cyclase mutation and human disease. We have illustrated the power of hypothesis-free exome sequencing in establishing diagnoses in rare disorders with complex and variable phenotype. Mutations in ADCY5 should be considered in patients with undiagnosed complex movement disorders even in the absence of a family history.
Multivariate distance matrix regression (MDMR) analysis is a statistical technique that allows researchers to relate P variables to an additional M factors collected on N individuals, where P ≫ N. The technique can be applied to a number of research settings involving high-dimensional data types such as DNA sequence data, gene expression microarray data, and imaging data. MDMR analysis involves computing the distance between all pairs of individuals with respect to P variables of interest and constructing an N × N matrix whose elements reflect these distances. Permutation tests can be used to test linear hypotheses that consider whether or not the M additional factors collected on the individuals can explain variation in the observed distances between and among the N individuals as reflected in the matrix. Despite its appeal and utility, properties of the statistics used in MDMR analysis have not been explored in detail. In this paper we consider the level accuracy and power of MDMR analysis assuming different distance measures and analysis settings. We also describe the utility of MDMR analysis in assessing hypotheses about the appropriate number of clusters arising from a cluster analysis.
regression analysis; multivariate analysis; distance matrix; simulation
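The steps described above (pairwise distances, an N × N matrix, and a permutation test) can be sketched for the simplest case of a single predictor. This is a minimal illustration that assumes a Gower-centered matrix and a pseudo-F statistic of the usual distance-based regression form; the toy points, the grouping variable, and all function names are hypothetical, and the code is not the authors' implementation.

```python
import math
import random


def gower_center(dist):
    """Build the centered matrix G from A = -0.5 * d^2."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]  # A is symmetric, so rows = columns
    grand = sum(row_mean) / n
    return [[a[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]


def pseudo_f(g, x):
    """Pseudo-F for one predictor x plus an intercept.

    Because G is centered, tr(HGH) reduces to c'Gc / c'c with c = x - mean(x).
    """
    n = len(x)
    mean_x = sum(x) / n
    c = [xi - mean_x for xi in x]
    sxx = sum(ci * ci for ci in c)
    explained = sum(c[i] * g[i][j] * c[j]
                    for i in range(n) for j in range(n)) / sxx
    total = sum(g[i][i] for i in range(n))
    return explained / ((total - explained) / (n - 2))


def mdmr_permutation_test(dist, x, n_perm=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    g = gower_center(dist)
    observed = pseudo_f(g, x)
    rng = random.Random(seed)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the predictor, keep distances fixed
        if pseudo_f(g, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: eight individuals measured on P = 2 variables.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
group = [0, 0, 0, 0, 1, 1, 1, 1]
dist = [[math.dist(p, q) for q in points] for p in points]

f_stat, p_value = mdmr_permutation_test(dist, group)
print("pseudo-F:", f_stat, "permutation p:", p_value)
```

With Euclidean distances the pseudo-F coincides with the classical ratio of between-group to within-group sums of squares; the appeal of the approach is that any other distance measure can be substituted for `math.dist` without changing the rest of the procedure.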
Human skull and brain morphology are strongly influenced by genetic factors, and skull size and shape vary worldwide. However, the relationship between specific brain morphology and genetically-determined ancestry is largely unknown.
We used two independent data sets to characterize variation in skull and brain morphology among individuals of European ancestry. The first data set is a historical sample of 1,170 male skulls with 37 shape measurements drawn from 27 European populations. The second data set includes 626 North American individuals of European ancestry participating in the Alzheimer's Disease Neuroimaging Initiative (ADNI) with magnetic resonance imaging, height and weight, neurological diagnosis, and genome-wide single nucleotide polymorphism (SNP) data.
We found that both skull and brain morphological variation exhibit a population-genetic fingerprint among individuals of European ancestry. This fingerprint shows a Northwest to Southeast gradient, is independent of body size, and involves frontotemporal cortical regions.
Our findings are consistent with prior evidence for gene flow in Europe due to historical population movements and indicate that genetic background should be considered in studies seeking to identify genes involved in human cortical development and neuropsychiatric disease.
Biological anthropology; Cortex; Craniometry; Genetic drift; Imaging genomics; Neuroimaging; Population genetics
Cortical thickness is a highly heritable structural brain measurement and reduced thickness has been associated with both schizophrenia and bipolar disorder as well as decreased cognitive performance among healthy controls. Identifying genes that contribute to variation in cortical thickness provides a path to elucidate some of the biological mechanisms underlying these diseases as well as general cognitive abilities.
To identify common genetic variants that affect cortical thickness in schizophrenia, bipolar disorder, and controls and secondarily to test these variants for association with cognitive performance.
597,198 single nucleotide polymorphisms (SNPs) were tested for association with average cortical thickness in a genome-wide association study (GWAS). Significantly associated SNPs were tested for their effect on several measures of cognitive performance.
Four major hospitals in Oslo, Norway.
The GWAS included controls (n = 181) and individuals with DSM-IV diagnosed schizophrenia spectrum disorder (n = 94), bipolar spectrum disorder (n = 97), and other psychotic and affective disorders (n = 49). The follow-up cognitive study included an additional 622 cases and controls.
Main Outcome Measures
Cortical thickness measured with magnetic resonance imaging and cognitive performance as assessed by several neuropsychological tests.
Two closely linked genetic variants (rs4906844 and rs11633924) within the Prader-Willi/Angelman syndrome region on chromosome 15q12 showed genome-wide significant association (p = 1.08 × 10−8) with average cortical thickness as well as modest association with cognitive performance (p = 0.028) specifically among subjects diagnosed with schizophrenia.
This is the first GWAS to identify a common genetic variant that contributes to the heritable reduction of cortical thickness in schizophrenia. These results highlight the utility of cortical thickness as an intermediate phenotype for neuropsychiatric diseases. Future independent replication studies are required to confirm these findings.
Waist circumference (WC) and waist-to-hip ratio (WHR) are surrogate measures of central adiposity that are associated with adverse cardiovascular events, type 2 diabetes and cancer independent of body mass index (BMI). WC and WHR are highly heritable with multiple susceptibility loci identified to date. We assessed the association between SNPs and BMI-adjusted WC and WHR and unadjusted WC in up to 57 412 individuals of European descent from 22 cohorts collaborating with the NHLBI's Candidate Gene Association Resource (CARe) project. The study population consisted of women and men aged 20–80 years. Study participants were genotyped using the ITMAT/Broad/CARE array, which includes ∼50 000 cosmopolitan tagged SNPs across ∼2100 cardiovascular-related genes. Each trait was modeled as a function of age, study site and principal components to control for population stratification, and we conducted a fixed-effects meta-analysis. No new loci for WC were observed. For WHR analyses, three novel loci were significantly associated (P < 2.4 × 10−6). Previously unreported rs2811337-G near TMCC1 was associated with increased WHR (β ± SE, 0.048 ± 0.008, P = 7.7 × 10−9) as was rs7302703-G in HOXC10 (β = 0.044 ± 0.008, P = 2.9 × 10−7) and rs936108-C in PEMT (β = 0.035 ± 0.007, P = 1.9 × 10−6). Sex-stratified analyses revealed two additional novel signals among females only, rs12076073-A in SHC1 (β = 0.10 ± 0.02, P = 1.9 × 10−6) and rs1037575-A in ATBDB4 (β = 0.046 ± 0.01, P = 2.2 × 10−6), supporting an already established sexual dimorphism of central adiposity-related genetic variants. Functional analysis using ENCODE and eQTL databases revealed that several of these loci are in regulatory regions or regions with differential expression in adipose tissue.
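A fixed-effects meta-analysis of per-cohort SNP effects is commonly carried out with inverse-variance weighting. The sketch below assumes that form; the abstract does not state which weighting was used, and the effect sizes and standard errors shown are invented for illustration rather than taken from the CARe cohorts.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effects estimate.

    Returns the pooled beta, its standard error, and a two-sided
    normal-approximation p-value.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value


# Hypothetical per-cohort estimates for one SNP (illustrative values only).
cohort_betas = [0.10, 0.20]
cohort_ses = [0.05, 0.10]

pooled_beta, pooled_se, p_value = fixed_effects_meta(cohort_betas, cohort_ses)
print(pooled_beta, pooled_se, p_value)
```

More precise cohorts (smaller standard errors) dominate the pooled estimate, which is why the pooled beta here sits closer to the first cohort's value than to the second's.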
With a number of common variants identified mainly through genome-wide association studies (GWASs), there is great interest in incorporating these findings into screening for individuals at high risk of psoriasis. The purpose of this study was to establish genetic prediction models and evaluate their discriminatory ability for psoriasis in the Han Chinese population. We built the genetic prediction models through a weighted polygenic risk score (PRS) using 14 susceptibility variants in 8,819 samples. We found that the risk of psoriasis among individuals in the top quartile of the PRS was significantly higher than among those in the lowest quartile (OR = 28.20, P < 2.0×10−16). We also observed statistically significant associations between the PRS and both family history and early age of onset of psoriasis. We also built a predictive model with all 14 known susceptibility variants and alcohol consumption, which achieved an area under the curve statistic of ~0.88. Our study suggests that the 14 known psoriasis susceptibility loci have discriminatory potential and are also associated with family history and age of onset. This is the most accurate genetic predictive model for psoriasis to date.
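A weighted PRS is typically the sum of risk-allele counts multiplied by per-variant log odds ratios, and discrimination is often summarized by the area under the ROC curve. The following minimal sketch assumes that construction; the odds ratios, genotypes, and scores are hypothetical, and the rank-based AUC shown here is computed from scores alone, not from a model that also includes alcohol consumption.

```python
import math


def weighted_prs(risk_allele_counts, odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2) weighted by log odds ratios."""
    return sum(count * math.log(odds)
               for count, odds in zip(risk_allele_counts, odds_ratios))


def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.

    Ties count as one half, which matches the area under the ROC curve.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical two-variant example (values are illustrative only).
example_odds_ratios = [1.5, 2.0]
example_genotype = [2, 1]  # risk-allele counts at each variant
example_score = weighted_prs(example_genotype, example_odds_ratios)

# Hypothetical scores for a handful of cases and controls.
example_auc = auc([3.0, 2.0], [1.0, 2.0])
print(example_score, example_auc)
```

Using log odds ratios as weights means that variants with larger reported effects contribute more to the score than variants with effects close to the null.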
We coupled two strategies – trait extremes and genome-wide pooling – to discover a novel BP locus that encodes a previously uncharacterized thiamine transporter.
Hypertension is a heritable trait that remains the most potent and widespread cardiovascular risk factor, though details of its genetic determination are poorly understood.
Representative genomic DNA pools were created from male and female subjects in the highest and lowest 5th percentiles of BP in a primary care population of >50,000 individuals. The peak associated SNPs were typed in individual DNA samples, as well as twins/siblings phenotyped for cardiovascular and autonomic traits. Biochemical properties of the associated transporter were evaluated in cellular assays.
After chip hybridization and calculation of relative allele scores, the peak associations were typed in individual samples, revealing association of hypertension, SBP, and DBP to the previously uncharacterized solute carrier SLC35F3. The BP genetic association at SLC35F3 was validated by meta-analysis in an independent sample from the original source population, as well as the ICBP (across North America and Western Europe). Sequence homology to a putative yeast thiamine (vitamin B1) transporter prompted us to express human SLC35F3 in E. coli, which catalyzed [3H]-thiamine uptake. SLC35F3 risk allele (T/T) homozygotes displayed decreased erythrocyte thiamine content on microbiological assay. In twin pairs, the SLC35F3 risk allele predicted heritable cardiovascular traits previously associated with thiamine deficiency, including elevated cardiac stroke volume with decreased vascular resistance, and elevated pressor responses to environmental (cold) stress. Allelic expression imbalance (AEI) confirmed that cis-variation at the human SLC35F3 locus influenced expression of that gene, and the AEI peak coincided with the hypertension peak.
Novel strategies were coupled to position a new hypertension susceptibility locus, uncovering a previously unsuspected thiamine transporter whose genetic variants predicted several disturbances in cardiac and autonomic function. The results have implications for the pathogenesis and treatment of systemic hypertension.
SLC35F3; thiamine; transporter; hypertension
The enormous advances in genetics and genomics of the past decade have the potential to revolutionize health care, including mental health care, and bring about a system predominantly characterized by the practice of genomic and personalized medicine. We briefly review the history of genetics and genomics and present heritability estimates for major chronic diseases of aging and neuropsychiatric disorders. We then assess the extent to which the results of genetic and genomic studies are currently being leveraged clinically for disease treatment and prevention and identify priority research areas in which further work is needed. Pharmacogenomics has emerged as one area of genomics that already has had notable impacts on disease treatment and the practice of medicine. Little evidence, however, for the clinical validity and utility of predictive testing based on genomic information is available, and thus has, to some extent, hindered broader-scale preventive efforts for common, complex diseases. Furthermore, although other disease areas have had greater success in identifying genetic factors responsible for various conditions, progress in identifying the genetic basis of neuropsychiatric diseases has lagged behind. We review social, economic, and policy issues relevant to genomic medicine, and find that a new model of health care based on proactive and preventive health planning and individualized treatment will require major advances in health care policy and administration. Specifically, incentives for relevant stakeholders are critical, as are realignment of incentives and education initiatives for physicians, and updates to pertinent legislation. Moreover, the translational behavioral and public health research necessary for fully integrating genomics into health care is lacking, and further work in these areas is needed. 
In short, while the pace of advances in genetic and genomic science and technology has been rapid, more work is needed to fully realize the potential for impacting disease treatment and prevention generally, and mental health specifically.
genomics; genetic testing; genetic risk assessment; public health genomics; pharmacogenomics
Available statistical preprocessing or quality control analysis tools for gene expression microarray datasets are known to greatly affect downstream data analysis, especially when degraded samples, unique tissue samples, or novel expression assays are used. It is therefore important to assess the validity and impact of the assumptions built into preprocessing schemes for a dataset. We developed and assessed a data preprocessing strategy for use with the Illumina DASL-based gene expression assay with partially degraded postmortem prefrontal cortex samples. The samples were obtained from individuals with autism as part of an investigation of the pathogenic factors contributing to autism. Using statistical analysis methods and metrics such as those associated with multivariate distance matrix regression and mean inter-array correlation, we developed a DASL-based assay gene expression preprocessing pipeline to accommodate and detect problems with microarray-based gene expression values obtained with degraded brain samples. Key steps in the pipeline included outlier exclusion, data transformation and normalization, and batch effect and covariate corrections. Our goal was to produce a clean dataset for subsequent downstream differential expression analysis. We ultimately settled on available transformation and normalization algorithms in the R/Bioconductor package lumi based on an assessment of their use in various combinations. A log2-transformed, quantile-normalized, and batch- and seizure-corrected procedure was likely the most appropriate for our data. We empirically tested different components of our proposed preprocessing strategy, and our results suggest that a strategy that effectively identifies outliers, normalizes the data, and corrects for batch effects can be applied to all studies, even those pursued with degraded samples.
gene expression; microarray; data preprocessing; quality control
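The transformation, normalization, and outlier-screening steps named in the abstract above can be illustrated with a minimal sketch. The study itself used the R/Bioconductor package lumi; the Python/NumPy code below is a stand-in written for illustration only and does not reproduce lumi's API. The function names, the pseudocount of 1, and the 2-standard-deviation cutoff on mean inter-array correlation are all assumptions, and batch and covariate correction are omitted.

```python
import numpy as np


def log2_transform(x, offset=1.0):
    """Return log2(x + offset). The offset is an assumed pseudocount that avoids log2(0)."""
    return np.log2(np.asarray(x, dtype=float) + offset)


def quantile_normalize(x):
    """Give every column (array) the same empirical distribution.

    Rows are probes and columns are arrays. Ties are broken by order of
    appearance, which is a simplification of what production tools do.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")   # rank of each value within its column
    reference = np.sort(x, axis=0).mean(axis=1)        # mean of the k-th smallest values
    return reference[ranks]


def mean_inter_array_correlation(x):
    """Mean Pearson correlation of each array with every other array."""
    corr = np.corrcoef(np.asarray(x, dtype=float), rowvar=False)
    n_arrays = corr.shape[0]
    return (corr.sum(axis=0) - 1.0) / (n_arrays - 1)   # drop the self-correlation of 1


def flag_outlier_arrays(x, n_sd=2.0):
    """Flag arrays whose mean inter-array correlation is more than n_sd SDs below the mean.

    The cutoff n_sd is an assumed, illustrative threshold.
    """
    iac = mean_inter_array_correlation(x)
    return iac < iac.mean() - n_sd * iac.std()
```

A typical order of use would be `flag_outlier_arrays` on the transformed data, removal of flagged arrays, and then `quantile_normalize` on the remaining columns, although the appropriate order and thresholds depend on the dataset.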
Individuals with anorexia nervosa (AN) and bulimia nervosa (BN) show altered measures of serotonin (5-HT) and dopamine (DA) function that persist after long-term recovery and are associated with elevated harm avoidance (HA), a measure of anxiety and behavioral inhibition.
Based on theories that 5-HT is an aversive motivational system that may oppose a DA-related appetitive system, we explored interactions of positron emission tomography (PET) radioligand measures that reflect portions of these systems.
Twenty-seven individuals recovered (REC) from eating disorders (EDs) (7 AN-BN, 11 AN, 9 BN) and 9 control women (CW) were analyzed for correlations between [11C]McN5652 and [11C]raclopride binding.
There was a positive correlation between [11C]McN5652 nondisplaceable binding potential (BPND) and [11C]raclopride BPND for the dorsal caudate (r(27) = .62; p < .001), antero-ventral striatum (r(27) = .55; p = .003), middle caudate (r(27) = .68; p < .001), ventral putamen (r(27) = .64; p < .001), and dorsal putamen (r(27) = .42; p = .03). No significant correlations were found in CW. [11C]raclopride BPND, but not [11C]McN5652 BPND, was significantly related to HA in REC EDs. A linear regression analysis showed that the interaction between [11C]McN5652 BPND and [11C]raclopride BPND in the dorsal putamen significantly predicted HA (b = 140.04; t(22) = 2.21; p = .04).
This is the first study using PET and the radioligands [11C]McN5652 and [11C]raclopride to show a direct relationship between 5-HT transporter and striatal DA D2/D3 receptor binding in humans, supporting the possibility that 5-HT and DA interactions contribute to HA behaviors in EDs.
anorexia nervosa; bulimia nervosa; positron emission tomography; dopamine; serotonin; harm avoidance
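The interaction analysis reported in the abstract above has the general form of a linear model with a product term. The sketch below fits such a model by ordinary least squares with NumPy. It is an illustration under stated assumptions, not the authors' analysis: the function name and argument names are hypothetical, the model includes only an intercept, two main effects, and their product, and the abstract does not say which covariates or software the authors used.

```python
import numpy as np


def fit_interaction_model(x1, x2, y):
    """Fit y = b0 + b1*x1 + b2*x2 + b3*(x1*x2) by ordinary least squares.

    Returns the coefficients, their t statistics, and the residual degrees
    of freedom. Here x1 and x2 stand in for two binding measures and y for
    an outcome score (hypothetical names).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    y = np.asarray(y, dtype=float)

    # Design matrix: intercept, main effects, and the product (interaction) term.
    design = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    residuals = y - design @ coef
    dof = len(y) - design.shape[1]
    sigma2 = residuals @ residuals / dof               # residual variance estimate
    cov = sigma2 * np.linalg.inv(design.T @ design)    # coefficient covariance matrix
    t_stats = coef / np.sqrt(np.diag(cov))
    return coef, t_stats, dof
```

The last element of `coef` and `t_stats` corresponds to the interaction term. With 27 observations this four-parameter model leaves 23 residual degrees of freedom; the degrees of freedom reported in the abstract reflect the authors' own model specification, which is not reproduced here.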
The Marine Resiliency Study (MRS) is a prospective study of factors predictive of posttraumatic stress disorder (PTSD) among approximately 2,600 Marines in 4 battalions deployed to Iraq or Afghanistan. We describe the MRS design and predeployment participant characteristics. Starting in 2008, our research team conducted structured clinical interviews on Marine bases and collected data 4 times: at predeployment and at 1 week, 3 months, and 6 months postdeployment. Integrated with these data are medical and career histories from the Career History Archival Medical and Personnel System (CHAMPS) database. The CHAMPS database showed that 7.4% of the Marines enrolled in MRS had at least 1 mental health diagnosis. Of enrolled Marines, approximately half (51.3%) had prior deployments. We found a moderate positive relationship between deployment history and PTSD prevalence in these baseline data.
One goal of aging research is to find drugs that delay the onset of age-associated disease. Studies in invertebrates, particularly C. elegans, have uncovered numerous genes involved in aging, many conserved in mammals. However, which of these encode proteins suitable for drug targeting is unknown. To investigate this question, we screened a library of compounds with known mammalian pharmacology for compounds that increase C. elegans lifespan. We identified 60 compounds that increase longevity in C. elegans, 33 of which also increased resistance to oxidative stress. Many of these compounds are drugs approved for human use. Enhanced resistance to oxidative stress was associated primarily with compounds that target receptors for biogenic amines, such as dopamine or serotonin. A pharmacological network constructed with these data reveals that lifespan extension and increased stress resistance cluster together in a few pharmacological classes, most involved in intercellular signaling. These studies identify compounds that can now be explored for beneficial effects on aging in mammals, as well as tools that can be used to further investigate the mechanisms underlying aging in C. elegans.
aging; oxidative stress; drugs; pharmaceutical; serotonin; dopamine