Results 1-25 (7994)

1.  Genomic features separating ten strains of Neorhizobium galegae with different symbiotic phenotypes 
BMC Genomics  2015;16(1):348.
The symbiotic phenotype of Neorhizobium galegae, with strains specifically fixing nitrogen with either Galega orientalis or G. officinalis, has made it a target in research on determinants of host specificity in nitrogen fixation. The genomic differences between representative strains of the two symbiovars are, however, relatively small. This introduced a need for a dataset representing a larger bacterial population in order to make better conclusions on characteristics typical for a subset of the species. In this study, we produced draft genomes of eight strains of N. galegae having different symbiotic phenotypes, both with regard to host specificity and nitrogen fixation efficiency. These genomes were analysed together with the previously published complete genomes of N. galegae strains HAMBI 540T and HAMBI 1141.
The results showed that the presence of an additional rpoN sigma factor gene in the symbiosis gene region is a characteristic specific to symbiovar orientalis, required for nitrogen fixation. Also the nifQ gene was shown to be crucial for functional symbiosis in both symbiovars. Genome-wide analyses identified additional genes characteristic of strains of the same symbiovar and of strains having similar plant growth promoting properties on Galega orientalis. Many of these genes are involved in transcriptional regulation or in metabolic functions.
The results of this study confirm that the only symbiosis-related gene that is present in one symbiovar of N. galegae but not in the other is an rpoN gene. The specific function of this gene remains to be determined, however. New genes that were identified as specific for strains of one symbiovar may be involved in determining host specificity, while others are defined as potential determinant genes for differences in efficiency of nitrogen fixation.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1576-3) contains supplementary material, which is available to authorized users.
PMCID: PMC4417242  PMID: 25933608
Neorhizobium galegae; Symbiosis; Genome; rpoN; nifQ; Nitrogen fixation
2.  Early response to nanoparticles in the Arabidopsis transcriptome compromises plant defence and root-hair development through salicylic acid signalling 
BMC Genomics  2015;16(1):341.
The impact of nano-scaled materials on photosynthetic organisms needs to be evaluated. Plants represent the largest interface between the environment and biosphere, so understanding how nanoparticles affect them is especially relevant for environmental assessments. Nanotoxicology studies in plants allude to quantum size effects and other properties specific to the nano-stage to explain increased toxicity relative to bulk compounds. However, gene expression profiles after exposure to nanoparticles and other sources of environmental stress have not been compared, and the impact on plant defence has not been analysed.
Arabidopsis plants were exposed to TiO2-nanoparticles, Ag-nanoparticles, and multi-walled carbon nanotubes as well as different sources of biotic (microbial pathogens) or abiotic (saline, drought, or wounding) stresses. Changes in gene expression profiles and plant phenotypic responses were evaluated. Transcriptome analysis shows similarity of expression patterns for all plants exposed to nanoparticles and a low impact on gene expression compared to other stress inducers. Nanoparticle exposure repressed transcriptional responses to microbial pathogens, resulting in increased bacterial colonization during an experimental infection. Inhibition of root hair development and transcriptional patterns characteristic of phosphate starvation response were also observed. The exogenous addition of salicylic acid prevented some nano-specific transcriptional and phenotypic effects, including the reduction in root hair formation and the colonization of distal leaves by bacteria.
This study integrates the effect of nanoparticles on gene expression with plant responses to major sources of environmental stress and paves the way to remediate the impact of these potentially damaging compounds through hormonal priming.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1530-4) contains supplementary material, which is available to authorized users.
PMCID: PMC4417227  PMID: 25903678
Nanoparticles; Nanotoxicology; Arabidopsis; Defence; Transcriptome; Stress; Systemic acquired response
3.  Transcriptomic profiles of aging in purified human immune cells 
BMC Genomics  2015;16(1):333.
Transcriptomic studies hold great potential for understanding the human aging process. Previous transcriptomic studies have identified many genes with age-associated expression levels; however, small sample sizes and mixed cell types often make these results difficult to interpret.
Using transcriptomic profiles in CD14+ monocytes from 1,264 participants of the Multi-Ethnic Study of Atherosclerosis (aged 55–94 years), we identified 2,704 genes differentially expressed with chronological age (false discovery rate, FDR ≤ 0.001). We further identified six networks of co-expressed genes that included prominent genes from three pathways: protein synthesis (particularly mitochondrial ribosomal genes), oxidative phosphorylation, and autophagy, with expression patterns suggesting these pathways decline with age. Expression of several chromatin remodeler and transcriptional modifier genes strongly correlated with expression of oxidative phosphorylation and ribosomal protein synthesis genes. 17% of genes with age-associated expression harbored CpG sites whose degree of methylation significantly mediated the relationship between age and gene expression (p < 0.05). Lastly, 15 genes with age-associated expression were also associated (FDR ≤ 0.01) with pulse pressure independent of chronological age.
Comparing transcriptomic profiles of CD14+ monocytes to CD4+ T cells from a subset (n = 423) of the population, we identified 30 age-associated (FDR < 0.01) genes in common, while larger sets of differentially expressed genes were unique to either T cells (188 genes) or monocytes (383 genes). At the pathway level, a decline in ribosomal protein synthesis machinery gene expression with age was detectable in both cell types.
An overall decline in expression of ribosomal protein synthesis genes with age was detected in CD14+ monocytes and CD4+ T cells, demonstrating that some patterns of aging are likely shared between different cell types. Our findings also support cell-specific effects of age on gene expression, illustrating the importance of using purified cell samples for future transcriptomic studies. Longitudinal work is required to establish the relationship between identified age-associated genes/pathways and aging-related diseases.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1522-4) contains supplementary material, which is available to authorized users.
PMCID: PMC4417516  PMID: 25898983
Aging; Monocyte; T cell; Transcriptome; Mitochondrial ribosome; Translation; Protein synthesis; Ribonucleoprotein complex; Oxidative phosphorylation; Autophagy; Methylation
4.  The RNAi machinery controls distinct responses to environmental signals in the basal fungus Mucor circinelloides 
BMC Genomics  2015;16(1):237.
RNA interference (RNAi) is a conserved mechanism of genome defence that can also have a role in the regulation of endogenous functions through endogenous small RNAs (esRNAs). In fungi, knowledge of the functions regulated by esRNAs has been hampered by lack of clear phenotypes in most mutants affected in the RNAi machinery. Mutants of Mucor circinelloides affected in RNAi genes show defects in physiological and developmental processes, thus making Mucor an outstanding fungal model for studying endogenous functions regulated by RNAi. Some classes of Mucor esRNAs map to exons (ex-siRNAs) and regulate expression of the genes from which they derive. To have a broad picture of genes regulated by the silencing machinery during vegetative growth, we have sequenced and compared the mRNA profiles of mutants in the main RNAi genes by using RNA-seq. In addition, we have achieved a more complete phenotypic characterization of silencing mutants.
Deletion of any main RNAi gene had a strong impact on mRNA accumulation at exponential and stationary growth. Genes showing increased mRNA levels, as expected for direct ex-siRNA targets, but also genes with decreased expression were detected, suggesting that, most probably, the initial ex-siRNA targets regulate the expression of other genes, which can be up- or down-regulated. Expression of 50% of the genes was dependent on more than one RNAi gene, in agreement with the existence of several classes of ex-siRNAs produced by different combinations of RNAi proteins. These combinations of proteins have also been implicated in the regulation of different cellular processes. Besides genes regulated by the canonical RNAi pathway, this analysis identified processes, such as growth at low pH and sexual interaction, that are regulated by a dicer-independent non-canonical RNAi pathway.
This work shows that the RNAi pathways play a relevant role in the regulation of a significant number of endogenous genes in M. circinelloides during exponential and stationary growth phases and opens up an important avenue for in-depth study of genes involved in the regulation of physiological and developmental processes in this fungal model.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1443-2) contains supplementary material, which is available to authorized users.
PMCID: PMC4417260  PMID: 25880254
Asexual sporulation; Sexual interaction; pH regulation; Non-canonical RNAi pathway; esRNAs; mRNA profiling
5.  Distribution in microbial genomes of genes similar to lodA and goxA which encode a novel family of quinoproteins with amino acid oxidase activity 
BMC Genomics  2015;16(1):231.
L-Amino acid oxidases (LAOs) have generally been described as flavoproteins that oxidize amino acids, releasing the corresponding ketoacid, ammonium and hydrogen peroxide. The generation of hydrogen peroxide gives these enzymes antimicrobial properties. They are involved in processes such as biofilm development and microbial competition. LAOs are of great biotechnological interest in applications such as the design of biosensors, biotransformations and biomedicine.
The marine bacterium Marinomonas mediterranea synthesizes LodA, the first known LAO that contains a quinone cofactor. LodA is encoded in an operon that contains a second gene coding for LodB, a protein required for the post-translational modification generating the cofactor. Recently, GoxA, a quinoprotein with sequence similarity to LodA but with a different enzymatic activity (glycine oxidase instead of lysine-ε-oxidase) has been described. The aim of this work has been to study the distribution of genes similar to lodA and/or goxA in sequenced microbial genomes and to get insight into the evolution of this novel family of proteins through phylogenetic analysis.
Genes encoding LodA-like proteins have been detected in several bacterial classes. However, they are absent in Archaea and detected only in a small group of fungi of the class Agaricomycetes. The vast majority of the genes detected are in a genome region with a nearby lodB-like gene, suggesting a specific interaction between both partner proteins.
Sequence alignment of the LodA-like proteins allowed the detection of several conserved residues. All of them showed a Cys and a Trp that aligned with the residues that form part of the cysteine tryptophylquinone (CTQ) cofactor in LodA. Phylogenetic analysis revealed that LodA-like proteins can be clustered in different groups. Interestingly, LodA and GoxA are in different groups, indicating that those groups are related to the enzymatic activity of the proteins detected.
Genome mining has revealed for the first time the broad distribution of LodA-like proteins containing a CTQ cofactor in many different microbial groups. This study provides a platform to explore the potentially novel enzymatic activities of the proteins detected, the mechanisms of post-translational modifications involved in their synthesis, as well as their biological relevance.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1455-y) contains supplementary material, which is available to authorized users.
PMCID: PMC4417212  PMID: 25886995
L-amino acid oxidase; Quinone cofactor; Post-translational modification; Lysine oxidase; Glycine oxidase
6.  A genetic map of cassava (Manihot esculenta Crantz) with integrated physical mapping of immunity-related genes 
BMC Genomics  2015;16(1):190.
Cassava, Manihot esculenta Crantz, is one of the most important crops worldwide, representing staple food security for more than one billion people. The development of dense genetic and physical maps, as the basis for implementing genetic and molecular approaches to accelerate the rate of genetic gains in breeding programs, represents a significant challenge. A reference genome sequence for cassava has recently been made available and community efforts are underway to improve its quality. Cassava is threatened by several pathogens, but the mechanisms of defense are far from being understood. In addition, there has been a lack of information about the number of genes related to immunity as well as their distribution and genomic organization in the cassava genome.
A high-density genetic map of cassava containing 2,141 SNPs has been constructed. Eighteen linkage groups were resolved with an overall size of 2,571 cM and an average distance of 1.26 cM between markers. More than half of the mapped SNPs (57.4%) are located in coding sequences. Physical mapping of scaffolds of the cassava whole-genome sequence draft using the mapped markers as anchors resulted in the orientation of 687 scaffolds covering 45.6% of the genome. One hundred eighty-nine new scaffolds are anchored to the cassava genetic map, extending the present cassava physical map by 30.7 Mb. Comparative analysis using anchor markers showed strong co-linearity with previously reported cassava genetic and physical maps. In silico searching for conserved domains allowed the annotation of a repertory of 1,061 cassava genes coding for immunity-related proteins (IRPs). Based on the physical map of the corresponding sequencing scaffolds, unambiguous genetic localization was possible for 569 IRPs.
This is the first reported study of an integrated high-density genetic map using SNPs with integrated genetic and physical localization of newly annotated immunity-related genes in cassava. These data build a solid basis for future studies to map and associate markers with single loci or quantitative trait loci for agronomically important traits. The enrichment of the physical map with novel scaffolds is in line with the efforts of the cassava genome sequencing consortium.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1397-4) contains supplementary material, which is available to authorized users.
PMCID: PMC4417308  PMID: 25887443
Linkage mapping; Physical mapping; Genotyping by sequencing; Single nucleotide polymorphisms; Immunity-related genes
7.  Factors to preserve CpG-rich sequences in methylated CpG islands 
BMC Genomics  2015;16(1):144.
Mammalian CpG islands (CGIs) normally escape DNA methylation in all adult tissues and developmental stages. However, in our previous study we unexpectedly identified many methylated CGIs in human peripheral blood leukocytes. Methylated CpG dinucleotides convert to TpG dinucleotides through deamination of their cytosine bases more frequently than hypomethylated CpG dinucleotides. Therefore, we wondered how methylated CGIs in germline or non-germline cells maintain their CpG-rich sequences. It is known that events such as germline hypomethylation, CpG selection, biased gene conversion (BGC), and frequent CpG fixation can contribute to the maintenance of CpG-rich sequences in methylated CGIs in germline or non-germline cells. However, it has not been investigated which of these processes maintain the CpG-rich sequences of methylated CGIs at each genomic position.
In this study, we comprehensively examined the contribution of the processes described above to the maintenance of CpG-rich sequences in methylated CGIs in germline and non-germline cells which were classified by genomic positions. Approximately 60–80% of CGIs with high methylation in H1 cell line (H1-HM) in all the genomic positions showed a low average CpG → TpG/CpA substitution rate. In contrast, fewer than half the numbers of CGIs with H1-HM in all the genomic positions showed a low average CpG → TpG/CpA substitution rate and low levels of methylation in sperm cells (SPM-LM). Furthermore, a small fraction of CGIs with a low average CpG → TpG/CpA substitution rate and high levels of methylation in sperm cells (SPM-HM) showed CpG selection.
On the other hand, independent of the positions in genes, most CGIs with SPM-HM showed a slightly higher average TpG/CpA → CpG substitution rate compared with those with SPM-LM.
Relatively high numbers (approximately 60–80%) of CGIs with H1-HM in all the genomic positions preserve their CpG-rich sequences by a low CpG → TpG/CpA substitution rate caused mainly by their SPM-LM, and for those with SPM-HM partly by CpG selection and TpG/CpA → CpG fixation. BGC has little contribution to the maintenance of CpG-rich sequences of CGIs with SPM-HM which were classified by genomic positions.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1286-x) contains supplementary material, which is available to authorized users.
PMCID: PMC4417305  PMID: 25879481
CpG island; DNA methylation; CpG selection; CpG fixation; Biased gene conversion
8.  Mixture SNPs effect on phenotype in genome-wide association studies 
BMC Genomics  2015;16(1):3.
Recently, mixed linear models have been used to address the issue of “missing” heritability in traditional genome-wide association studies (GWAS). The models assume that all single-nucleotide polymorphisms (SNPs) are associated with the phenotypes of interest. However, it is more common that only a small proportion of SNPs have significant effects on the phenotypes, while most SNPs have no or very small effects. To incorporate this feature, we propose an efficient Hierarchical Bayesian Model (HBM) that extends the existing mixed models to enforce automatic selection of significant SNPs. The HBM models the SNP effects using a mixture distribution of a point mass at zero and a normal distribution, where the point mass corresponds to the non-associated SNPs.
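The mixture described above is commonly written as a spike-and-slab prior. The following is a sketch of that general form only; the symbols (effect size \beta_j for SNP j, inclusion proportion \pi, slab variance \sigma^2, number of SNPs p) are notational assumptions and are not taken from the article, whose full hierarchy may include further levels.

```latex
\beta_j \mid \pi, \sigma^2 \;\sim\; (1 - \pi)\,\delta_0 \;+\; \pi\,\mathcal{N}(0, \sigma^2),
\qquad j = 1, \dots, p
```

Here \delta_0 is the point mass at zero, so \pi plays the role of the proportion of SNPs with a non-zero effect on the phenotype.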
We estimate the HBM using Gibbs sampling. The estimation performance of our method is first demonstrated through two simulation studies. We make the simulation setups realistic by using parameters fitted on the Framingham Heart Study (FHS) data. The simulation studies show that our method can accurately estimate the proportion of SNPs associated with the simulated phenotype and identify these SNPs, and that it adapts better to certain model mis-specification than the standard mixed models. In addition, we analyze data from the FHS and the Health and Retirement Study (HRS) to study the association between Body Mass Index (BMI) and SNPs on Chromosome 16, and replicate the identified genetic associations. The analysis of the FHS data identifies 0.3% of SNPs on Chromosome 16 that affect BMI, including rs9939609 and rs9939973 on the FTO gene. These two SNPs are in strong linkage disequilibrium with rs1558902 (Rsq = 0.901 for rs9939609 and Rsq = 0.905 for rs9939973), which has been reported to be linked with obesity in previous GWAS. We then replicate the findings using the HRS data: the analysis finds 0.4% of SNPs associated with BMI on Chromosome 16. Furthermore, around 25% of the genes that are identified to be associated with BMI are common between the two studies.
The results demonstrate that the HBM and the associated estimation algorithm offer a powerful tool for identifying significant genetic associations with phenotypes of interest, among a large number of SNPs that are common in modern genetics studies.
PMCID: PMC4417323  PMID: 25649116
Bayesian variable selection; Genome-wide association studies; Gibbs sampling
9.  Analyzing allele specific RNA expression using mixture models 
BMC Genomics  2015;16(1):566.
Measuring allele-specific RNA expression provides valuable insights into cis-acting genetic and epigenetic regulation of gene expression. Widespread adoption of high-throughput sequencing technologies for studying RNA expression (RNA-Seq) permits measurement of allelic RNA expression imbalance (AEI) at heterozygous single nucleotide polymorphisms (SNPs) across the entire transcriptome, and this approach has become especially popular with the emergence of large databases, such as GTEx. However, the existing binomial-type methods used to model allelic expression from RNA-seq assume a strong negative correlation between reference and variant allele reads, which may not be reasonable biologically.
Here we propose a new strategy for AEI analysis using RNA-seq data. Under the null hypothesis of no AEI, a group of SNPs (possibly across multiple genes) is considered comparable if their respective total sums of the allelic reads are of similar magnitude. Within each group of “comparable” SNPs, we identify SNPs with AEI signal by fitting a mixture of folded Skellam distributions to the absolute values of read differences. By applying this methodology to RNA-Seq data from human autopsy brain tissues, we identified numerous instances of moderate to strong imbalanced allelic RNA expression at heterozygous SNPs. Findings with SLC1A3 mRNA exhibiting known expression differences are discussed as examples.
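To make the distribution named above concrete, the sketch below evaluates a folded Skellam probability mass function, that is, the distribution of |X - Y| for independent Poisson counts X and Y, and a finite mixture of such components. It is an illustration only: the function names, the truncation cutoff `n_max` and the example parameters are assumptions, and the abstract does not specify the authors' parameterisation or fitting procedure.

```python
import math


def poisson_pmf(n, mu):
    """P(N = n) for N ~ Poisson(mu), evaluated in log space for stability."""
    if n < 0:
        return 0.0
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))


def skellam_pmf(k, mu1, mu2, n_max=200):
    """P(X - Y = k) for independent X ~ Poisson(mu1), Y ~ Poisson(mu2).

    Computed as a truncated convolution sum over Y = n; n_max is an
    illustrative cutoff that is adequate for small means only.
    """
    return sum(
        poisson_pmf(n + k, mu1) * poisson_pmf(n, mu2)
        for n in range(max(0, -k), n_max + 1)
    )


def folded_skellam_pmf(k, mu1, mu2):
    """P(|X - Y| = k): fold the Skellam distribution at zero."""
    if k < 0:
        return 0.0
    if k == 0:
        return skellam_pmf(0, mu1, mu2)
    return skellam_pmf(k, mu1, mu2) + skellam_pmf(-k, mu1, mu2)


def mixture_pmf(k, weights, params):
    """Finite mixture of folded Skellam components.

    weights: mixing proportions summing to 1 (hypothetical values).
    params:  one (mu1, mu2) pair per component (hypothetical values).
    """
    return sum(
        w * folded_skellam_pmf(k, m1, m2)
        for w, (m1, m2) in zip(weights, params)
    )
```

In such a sketch, a component with equal means would represent balanced allelic expression, while a component with unequal means would capture SNPs whose absolute read difference is larger than expected by chance.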
The folded Skellam mixture model searches for SNPs with significant difference between reference and variant allele reads (adjusted for different library sizes), using information from a group of “comparable” SNPs across multiple genes. This model is particularly suitable for performing AEI analysis on genes with few heterozygous SNPs available from RNA-seq, and it can fit over-dispersed read counts without specifying the direction of the correlation between reference and variant alleles.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1749-0) contains supplementary material, which is available to authorized users.
PMCID: PMC4521363  PMID: 26231172
Allelic RNA expression imbalance (AEI); Allele-specific expression (ASE); RNA-seq; Poisson mixture; Folded Skellam mixture; Human brain
10.  Identification of genes associated with shell color in the black-lipped pearl oyster, Pinctada margaritifera 
BMC Genomics  2015;16(1):568.
Color polymorphism in the nacre of pteriomorphian bivalves is of great interest for the pearl culture industry. The nacreous layer of the Polynesian black-lipped pearl oyster Pinctada margaritifera exhibits a large array of color variation among individuals including reflections of blue, green, yellow and pink in all possible gradients. Although the heritability of nacre color variation patterns has been demonstrated by experimental crossing, little is known about the genes involved in these patterns. In this study, we identify a set of genes differentially expressed among extreme color phenotypes of P. margaritifera using a suppressive and subtractive hybridization (SSH) method comparing black phenotypes with full and half albino individuals.
Out of the 358 and 346 expressed sequence tags (ESTs) obtained from two SSH libraries, respectively, the expression patterns of 37 genes were tested with a real-time quantitative PCR (RT-qPCR) approach by pooling five individuals of each phenotype. The expression of 11 genes was subsequently estimated for each individual in order to detect inter-individual variation. Our results suggest that the color of the nacre is partially under the influence of genes involved in the biomineralization of the calcitic layer. A few genes involved in the formation of the aragonite tablets of the nacre layer and in the biosynthesis chain of melanin also showed differential expression patterns. Finally, high variability in gene expression levels was observed within the black phenotypes.
Our results revealed that three main genetic processes were involved in color polymorphisms: the biomineralization of the nacreous and calcitic layers and the synthesis of pigments such as melanin, suggesting that color polymorphism takes place at different levels in the shell structure. The high variability of gene expression found within black phenotypes suggests that the present work should serve as a basis for future studies exploring more thoroughly the expression patterns of candidate genes within black phenotypes with different dominant iridescent colors.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1776-x) contains supplementary material, which is available to authorized users.
PMCID: PMC4521380  PMID: 26231360
Differential expression; Biomineralization; Nacre; Pearl; Pigmentation; Albino
11.  MAC: identifying and correcting annotation for multi-nucleotide variations 
BMC Genomics  2015;16(1):569.
Next-Generation Sequencing (NGS) technologies have rapidly advanced our understanding of human variation in cancer. To accurately translate the raw sequencing data into practical knowledge, annotation tools, algorithms and pipelines must be developed that keep pace with the rapidly evolving technology. Currently, a challenge exists in accurately annotating multi-nucleotide variants (MNVs). These tandem substitutions, when affecting multiple nucleotides within a single protein codon of a gene, result in a translated amino acid involving all nucleotides in that codon. Most existing variant callers report an MNV as individual single-nucleotide variants (SNVs), often resulting in multiple triplet codon sequences and incorrect amino acid predictions. To correct potentially misannotated MNVs among reported SNVs, a primary challenge resides in haplotype phasing, which is to determine whether the neighboring SNVs are co-located on the same chromosome.
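The annotation problem described above can be illustrated with a small sketch that translates a codon carrying two SNVs, first one SNV at a time and then jointly. This is not MAC's implementation: the function names and the example codon are hypothetical, and the sketch simply assumes the two SNVs are already known to lie on the same haplotype, which is the phasing question MAC addresses using the BAM file.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_snvs(codon, snvs):
    """Return the codon after applying (offset, alt_base) substitutions."""
    bases = list(codon)
    for offset, alt in snvs:
        bases[offset] = alt
    return "".join(bases)


def annotate_separately(codon, snvs):
    """Translate the codon once per SNV, as an SNV-only annotator would."""
    return [CODON_TABLE[apply_snvs(codon, [snv])] for snv in snvs]


def annotate_jointly(codon, snvs):
    """Translate the codon with all SNVs applied together (the MNV view)."""
    return CODON_TABLE[apply_snvs(codon, snvs)]


# Hypothetical example: reference codon GGA (Gly) with two phased SNVs.
reference = "GGA"
snvs = [(0, "A"), (1, "T")]  # G>A at codon position 1, G>T at position 2
print(CODON_TABLE[reference], annotate_separately(reference, snvs),
      annotate_jointly(reference, snvs))
```

In this example the per-SNV annotation reports two different substitutions (Gly to Arg and Gly to Val), neither of which matches the single substitution (Gly to Ile) obtained when both changes are read as one codon.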
Here we describe MAC (Multi-Nucleotide Variant Annotation Corrector), an integrative pipeline developed to correct potentially mis-annotated MNVs. MAC was designed as an application that requires only an SNV file and the matching BAM file as data inputs. Using an example data set containing 3024 SNVs and the corresponding whole-genome sequencing BAM files, we show that MAC identified eight potentially mis-annotated SNVs, and accurately updated the amino acid predictions for seven of the variant calls.
MAC can identify and correct amino acid predictions that result from MNVs affecting multiple nucleotides within a single protein codon, which cannot be handled by most existing SNV-based variant pipelines. The MAC software is freely available and represents a useful tool for the accurate translation of genomic sequence to protein function.
PMCID: PMC4521406  PMID: 26231518
12.  Identification and functional analysis of early gene expression induced by circadian light-resetting in Drosophila 
BMC Genomics  2015;16(1):570.
The environmental light–dark cycle is the dominant cue that maintains 24-h biological rhythms in multicellular organisms. In Drosophila, light entrainment is mediated by the photosensitive protein CRYPTOCHROME, but the role and extent of transcription regulation in light resetting of the dipteran clock is yet unknown. Given the broad transcriptional changes in response to light previously identified in mammals, we have sought to analyse light-induced global transcriptional changes in the fly’s head by using Affymetrix microarrays. Flies were subjected to a 30-min light pulse during the early night (3 h after lights-off), a stimulus which causes a substantial phase delay of the circadian rhythm. We then analysed changes in gene expression 1 h after the light stimulus.
We identified 200 genes whose transcripts were significantly altered in response to the light pulse at a false discovery rate cut-off of 10 %. Analysis of these genes and their biological functions suggests the involvement of at least six biological processes in light-induced delay phase shifts of rhythmic activities. These processes include signalling, ion channel transport, receptor activity, synaptic organisation, signal transduction, and chromatin remodelling. Using RNAi, the expression of 22 genes was downregulated in the clock neurons, leading to significant effects on circadian output. For example, while continuous light normally causes arrhythmicity in wild-type flies, the knockdown of Kr-h1, Nipped-A, Thor, nrv1, Nf1, CG11155 (ionotropic glutamate receptor), and Fmr1 resulted in flies that were rhythmic, suggesting a disruption in the light input pathway to the clock.
Our analysis provides a first insight into the early responsive genes that are activated by light and their contribution to light resetting of the Drosophila clock. The analysis suggests multiple domains and pathways that might be associated with light entrainment, including a mechanism that was represented by a light-activated set of chromatin remodelling genes.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1787-7) contains supplementary material, which is available to authorized users.
PMCID: PMC4521455  PMID: 26231660
Circadian clock; Transcriptome; Light entrainment; Drosophila; Microarrays; Chromatin remodelling; Gene expression; Circadian phase shift
13.  POTION: an end-to-end pipeline for positive Darwinian selection detection in genome-scale data through phylogenetic comparison of protein-coding genes 
BMC Genomics  2015;16(1):567.
Detection of genes evolving under positive Darwinian evolution in genome-scale data is nowadays a prevailing strategy in comparative genomics studies to identify genes potentially involved in adaptation processes. Despite the large number of studies aiming to detect and contextualize such gene sets, there is virtually no software available to perform this task in a general, automatic, large-scale and reliable manner. This certainly occurs due to the computational challenges involved in this task, such as the appropriate modeling of data under analysis, the computation time to perform several of the required steps when dealing with genome-scale data and the highly error-prone nature of the sequence and alignment data structures needed for genome-wide positive selection detection.
We present POTION, an open-source, modular, end-to-end software pipeline for genome-scale detection of positive Darwinian selection in groups of homologous coding sequences. Our software represents a key step towards genome-scale, automated detection of positive selection, from predicted coding sequences and their homology relationships to high-quality groups of positively selected genes. POTION reduces false positives through several sequence and group filters based on numeric, phylogenetic, quality and conservation criteria that remove spurious data, and through multiple hypothesis corrections; it also considerably reduces computation time thanks to a parallelized design. Our software achieved high classification performance when used to evaluate a curated dataset of Trypanosoma brucei paralogs previously surveyed for positive selection. In a case study analyzing predicted groups of homologous genes from 19 strains of Mycobacterium tuberculosis, we demonstrated that the filters implemented in POTION remove sources of error that commonly inflate error rates in positive selection detection. A thorough literature review found no other software similar to POTION in terms of customization, scale and automation.
To the best of our knowledge, POTION is the first tool to allow users to construct and check hypotheses regarding the occurrence of site-based evidence of positive selection in non-curated, genome-scale data within a feasible time frame and with no human intervention after initial configuration. POTION is available at
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1765-0) contains supplementary material, which is available to authorized users.
PMCID: PMC4521464  PMID: 26231214
Genome-scale positive selection detection; Comparative genomics; Molecular Darwinian positive selection
14.  Sequence diversity and differential expression of major phenylpropanoid-flavonoid biosynthetic genes among three mango varieties 
BMC Genomics  2015;16(1):561.
Mango fruits contain a broad spectrum of phenolic compounds which impart potential health benefits; their biosynthesis is catalysed by enzymes in the phenylpropanoid-flavonoid (PF) pathway. The aim of this study was to reveal the variability in genes involved in the PF pathway in three varieties of mango (Mangifera indica L., a member of the family Anacardiaceae), Kensington Pride (KP), Irwin (IW) and Nam Doc Mai (NDM), and to determine associations with gene expression and mango flavonoid profiles.
A close evolutionary relationship between mango genes and those from the woody species poplar of the Salicaceae family (Populus trichocarpa) and grape of the Vitaceae family (Vitis vinifera), was revealed through phylogenetic analysis of PF pathway genes. We discovered 145 SNPs in total within coding sequences with an average frequency of one SNP every 316 bp. Variety IW had the highest SNP frequency (one SNP every 258 bp) while KP and NDM had similar frequencies (one SNP every 369 bp and 360 bp, respectively). The position in the PF pathway appeared to influence the extent of genetic diversity of the encoded enzymes. The entry point enzymes phenylalanine lyase (PAL), cinnamate 4-mono-oxygenase (C4H) and chalcone synthase (CHS) had low levels of SNP diversity in their coding sequences, whereas anthocyanidin reductase (ANR) showed the highest SNP frequency followed by flavonoid 3’-hydroxylase (F3’H). Quantitative PCR revealed characteristic patterns of gene expression that differed between mango peel and flesh, and between varieties.
The combination of mango expressed sequence tags and availability of well-established reference PF biosynthetic genes from other plant species allowed the identification of coding sequences of genes that may lead to the formation of important flavonoid compounds in mango fruits and facilitated characterisation of single nucleotide polymorphisms between varieties. We discovered an association between the extent of sequence variation and position in the pathway for up-stream genes. The high expression of PAL, C4H and CHS genes in mango peel compared to flesh is associated with high amounts of total phenolic contents in peels, which suggests that these genes have an influence on total flavonoid levels in mango fruit peel and flesh. In addition, the particularly high expression levels of ANR in KP and NDM peels compared to IW peel and the significant accumulation of its product epicatechin gallate (ECG) in those extracts reflect the rate-limiting role of ANR in ECG biosynthesis in mango.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1784-x) contains supplementary material, which is available to authorized users.
PMCID: PMC4518526  PMID: 26220670
Mango fruit; Expressed sequence tags; Phenylpropanoid-flavonoid pathway; Nucleotide diversity; Gene expression
15.  Deep sequencing-based characterization of transcriptome of trifoliate orange (Poncirus trifoliata (L.) Raf.) in response to cold stress 
BMC Genomics  2015;16(1):555.
Trifoliate orange (Poncirus trifoliata (L.) Raf.) is extremely cold hardy after full acclimation; however, the molecular mechanisms underlying this economically valuable trait remain poorly understood. In this study, global transcriptome profiles of trifoliate orange under cold conditions (4 °C) over a time course were generated by high-throughput sequencing.
More than 68 million high-quality reads were produced and assembled into a non-redundant set of 77,292 unigenes with an average length of 1112 bp (N50 = 1778 bp). Of these, 23,846 had significant sequence similarity to known genes and these were assigned to 61 gene ontology (GO) categories and 25 clusters of orthologous groups (COG) involved in 128 KEGG pathways. Sequences derived from cold-treated and control plants were mapped to the assembled transcriptome, resulting in the identification of 5549 differentially expressed genes (DEGs). These comprised 600 (462 up-regulated, 138 down-regulated), 2346 (1631 up-regulated, 715 down-regulated), and 5177 (2702 up-regulated, 2475 down-regulated) genes from the cold-treated samples at 6, 24 and 72 h, respectively. The accuracy of the RNA-seq derived transcript expression data was validated by analyzing the expression patterns of 17 DEGs by qPCR. Plant hormone signal transduction, plant-pathogen interaction, and secondary metabolism were the most significantly enriched GO categories among the DEGs. A total of 60 transcription factors were shown to be cold responsive. In addition, a number of genes involved in the catabolism and signaling of hormones, such as abscisic acid, ethylene and gibberellin, were affected by the cold stress. Meanwhile, levels of putrescine progressively increased under cold, which was consistent with up-regulation of an arginine decarboxylase gene.
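For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed from a list of sequence lengths: sort the lengths from longest to shortest and report the length at which the running total first reaches half of the summed length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study or its assembly pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not data from the study):
# the total is 32, and the two longest sequences (8 + 8) reach half of it.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))  # prints 8
```

Because N50 weights long sequences heavily, it can sit well above the mean length, as it does for the 1112 bp average and 1778 bp N50 reported here.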
This dataset provides valuable information regarding the trifoliate orange transcriptome changes in response to cold stress and may help guide future identification and functional analysis of genes that are important for enhancing cold hardiness.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1629-7) contains supplementary material, which is available to authorized users.
PMCID: PMC4518522  PMID: 26219960
Poncirus trifoliata; RNA-seq; Cold stress; Transcriptome profiling; Digital gene expression; Citrus
16.  Global transcriptome and gene regulation network for secondary metabolite biosynthesis of tea plant (Camellia sinensis) 
BMC Genomics  2015;16(1):560.
Major secondary metabolites, including flavonoids, caffeine, and theanine, are important components of tea products and are closely related to the taste, flavor, and health benefits of tea. Secondary metabolite biosynthesis in Camellia sinensis is differentially regulated in different tissues during growth and development. Until now, little was known about the expression patterns of genes involved in secondary metabolic pathways or their regulatory mechanisms. This study aimed to generate expression profiles for C. sinensis tissues and to build a gene regulation model of the secondary metabolic pathways.
RNA sequencing was performed on 13 different tissue samples from various organs and developmental stages of tea plants, including buds and leaves of different ages, stems, flowers, seeds, and roots. A total of 43.7 Gbp of raw sequencing data were generated, from which 347,827 unigenes were assembled and annotated. There were 46,693, 8446, 3814, 10,206, and 4948 unigenes specifically expressed in the buds and leaves, stems, flowers, seeds, and roots, respectively. In total, 1719 unigenes were identified as being involved in the secondary metabolic pathways in C. sinensis, and the expression patterns of the genes involved in flavonoid, caffeine, and theanine biosynthesis were characterized, revealing the dynamic nature of their regulation during plant growth and development. The possible transcription factor regulation network for the biosynthesis of flavonoid, caffeine, and theanine was built, encompassing 339 transcription factors from 35 families, namely bHLH, MYB, and NAC, among others. Remarkably, not only did the data reveal the possible critical check points in the flavonoid, caffeine, and theanine biosynthesis pathways, but also implicated the key transcription factors and related mechanisms in the regulation of secondary metabolite biosynthesis.
Our study generated gene expression profiles for different tissues at different developmental stages in tea plants. The gene network responsible for the regulation of the secondary metabolic pathways was analyzed. Our work elucidated the possible cross talk in gene regulation between the secondary metabolite biosynthetic pathways in C. sinensis. The results increase our understanding of how secondary metabolic pathways are regulated during plant development and growth cycles, and help pave the way for genetic selection and engineering for germplasm improvement.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1773-0) contains supplementary material, which is available to authorized users.
PMCID: PMC4518527  PMID: 26220550
Tea plant; Camellia sinensis; RNA-seq; Secondary metabolite; Transcription factor; Regulation network
17.  The Pkn22 Ser/Thr kinase in Nostoc PCC 7120: role of FurA and NtcA regulators and transcript profiling under nitrogen starvation and oxidative stress 
BMC Genomics  2015;16(1):557.
The filamentous cyanobacterium Nostoc sp. strain PCC 7120 can fix N2 when combined nitrogen is not available. Furthermore, it has to cope with reactive oxygen species generated as byproducts of photosynthesis and respiration. We have previously demonstrated the synthesis of Ser/Thr kinase Pkn22 as an important survival response of Nostoc to oxidative damage. In this study we wished to investigate the possible involvement of this kinase in signalling peroxide stress and nitrogen deprivation.
Quantitative RT-PCR experiments revealed that the pkn22 gene is induced in response to peroxide stress and to combined nitrogen starvation. Electrophoretic mobility assays indicated that the pkn22 promoter is recognized by the global transcriptional regulators FurA and NtcA. Transcriptomic analysis comparing a pkn22-insertion mutant and the wild type strain indicated that this kinase regulates genes involved in important cellular functions such as photosynthesis, carbon metabolism and iron acquisition. Since metabolic changes may lead to oxidative stress, we investigated whether this is the case with nitrogen starvation. Our results rather invalidate this hypothesis, thereby suggesting that the function of Pkn22 under nitrogen starvation is independent of its role in response to peroxide stress.
Our analyses have permitted a more complete functional description of Ser/Thr kinase in Nostoc. We have decrypted the transcriptional regulation of the pkn22 gene, and analysed the whole set of genes under the control of this kinase in response to the two environmental changes often encountered by cyanobacteria in their natural habitat: oxidative stress and nitrogen deprivation.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1703-1) contains supplementary material, which is available to authorized users.
PMCID: PMC4518582  PMID: 26220092
Cyanobacteria; Nostoc; Ser/Thr kinase; Oxidative stress; Nitrogen starvation; Signalling; Microarray
18.  Transcriptional analysis of susceptible and resistant European corn borer strains and their response to Cry1F protoxin 
BMC Genomics  2015;16(1):558.
Despite a number of recent reports of insect resistance to transgenic crops expressing insecticidal toxins from Bacillus thuringiensis (Bt), little is known about the mechanism of resistance to these toxins. The purpose of this study is to identify genes associated with the mechanism of Cry1F toxin resistance in European corn borer (Ostrinia nubilalis Hübner). For this, we compared the global transcriptomic response of laboratory-selected resistant and susceptible O. nubilalis strains to Cry1F toxin. We further identified constitutive transcriptional differences between the two strains.
An O. nubilalis midgut transcriptome of 36,125 transcripts was assembled de novo from 106 million Illumina HiSeq and Roche 454 reads and used as a reference for differential gene expression analysis. Evaluation of gene expression profiles of midgut tissues from the Cry1F susceptible and resistant strains after toxin exposure identified a suite of genes that responded to the toxin in the susceptible strain (n = 1,654), but almost 20-fold fewer in the resistant strain (n = 84). A total of 5,455 midgut transcripts showed significant constitutive expression differences between Cry1F susceptible and resistant strains. Transcripts coding for previously identified Cry toxin receptors, cadherin and alkaline phosphatase and proteases were also differentially expressed in the midgut of the susceptible and resistant strains.
Our current study provides a valuable resource for further molecular characterization of Bt resistance and insect response to Cry1F toxin in O. nubilalis and other pest species.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1751-6) contains supplementary material, which is available to authorized users.
PMCID: PMC4518661  PMID: 26220297
European corn borer; Ostrinia nubilalis; Insect resistance; Cry1F resistance; Bt-toxin; Transcriptomics; Cry1F response; RNA-Seq
19.  Quantitative proteomic analysis of formalin–fixed, paraffin–embedded clear cell renal cell carcinoma tissue using stable isotopic dimethylation of primary amines 
BMC Genomics  2015;16(1):559.
Formalin-fixed, paraffin-embedded (FFPE) tissues represent the most abundant resource of archived human specimens in pathology. Such tissue specimens are emerging as a highly valuable resource for translational proteomic studies. In quantitative proteomic analysis, reductive di-methylation of primary amines using stable isotopic formaldehyde variants is increasingly used due to its robustness and cost-effectiveness.
In the present study we show for the first time that isotopic amine dimethylation can be used in a straightforward manner for the quantitative proteomic analysis of FFPE specimens without interference from formalin employed in the FFPE process. Isotopic amine dimethylation of FFPE specimens showed the same labeling efficiency as for cryopreserved specimens. For both FFPE and cryopreserved specimens, differential labeling of identical samples yielded highly similar ratio distributions within the expected range for dimethyl labeling. In an initial application, we profiled proteome changes in clear cell renal cell carcinoma (ccRCC) FFPE tissue specimens compared to adjacent non-malignant renal tissue. Our findings highlight increased levels of glycolytic enzymes, annexins as well as ribosomal and proteasomal proteins.
Our study establishes isotopic amine dimethylation as a versatile tool for quantitative proteomic analysis of FFPE specimens and underlines proteome alterations in ccRCC.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1768-x) contains supplementary material, which is available to authorized users.
PMCID: PMC4518706  PMID: 26220445
Dimethylation; Formalin-fixation; Paraffin-embedment; Clear cell renal cell carcinoma
20.  Network analysis of temporal functionalities of the gut induced by perturbations in new-born piglets 
BMC Genomics  2015;16(1):556.
Evidence is accumulating that perturbation of early life microbial colonization of the gut induces long-lasting adverse health effects in individuals. Understanding the mechanisms behind these effects will facilitate modulation of intestinal health. The objective of this study was to identify biological processes involved in these long lasting effects and the (molecular) factors that regulate them. We used an antibiotic and the same antibiotic in combination with stress on piglets as an early life perturbation. Then we used host gene expression data from the gut (jejunum) tissue and community-scale analysis of gut microbiota from the same location of the gut, at three different time-points to gauge the reaction to the perturbation. We analysed the data by a new combination of existing tools. First, we analysed the data in two dimensions, treatment and time, with quadratic regression analysis. Then we applied network-based data integration approaches to find correlations between host gene expression and the resident microbial species.
The use of a new combination of data analysis tools allowed us to identify significant long-lasting differences in jejunal gene expression patterns resulting from the early life perturbations. In addition, we were able to identify potential key gene regulators (hubs) for these long-lasting effects. Furthermore, data integration also showed that there are a handful of bacterial groups that were associated with temporal changes in gene expression.
The applied systems-biology approach allowed us to take the first steps in unravelling biological processes involved in long lasting effects in the gut due to early life perturbations. The observed data are consistent with the hypothesis that these long lasting effects are due to differences in the programming of the gut immune system as induced by the temporary early life changes in the composition and/or diversity of microbiota in the gut.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1733-8) contains supplementary material, which is available to authorized users.
PMCID: PMC4518884  PMID: 26220188
Gene expression; Microbiota; Data-integration; Long-term effects; Early life perturbations; Antibiotic; Stress; Pig intestine
21.  Whole transcriptome profiling of the vernalization process in Lilium longiflorum (cultivar White Heaven) bulbs 
BMC Genomics  2015;16(1):550.
Vernalization is an obligatory requirement of extended exposure to low temperatures to induce flowering in certain plants. It is the most important factor affecting flowering time and quality in Easter lily (Lilium longiflorum). Exposing the bulbs to 4 °C gradually decreases flowering time by up to 50 % compared to non-vernalized plants. We aim to understand the molecular regulation of vernalization in Easter lily, for which we characterized the global expression in lily bulb meristems after 0, 2, 5, 7 and 9 weeks of incubation at 4 °C.
We assembled de-novo a transcriptome which, after filtering, yielded 121,572 transcripts and 42,430 genes which hold 15,414 annotated genes, with up to 3,657 GO terms. This extensive annotation was mapped to the more general GO slim plant with a total of 94 terms. The response to cold exposure was summarized in 6 expression clusters, providing useful patterns for dissecting the dynamics of vernalization in lily. The functional annotation (GO and GO slim plant) was used to group transcripts in gene sets. Analysis of these gene sets and profiles revealed that most of the enriched functions among genes up-regulated by cold exposure were related to epigenetic processes and chromatin remodeling. Candidate vernalization genes in lily were selected based on their sequence similarity to known regulators of flowering in other species.
We present a detailed analysis of gene expression dynamics during vernalization in Lilium, covering several time points and accounting for biological variation by the use of replicates. The resulting collection of transcripts and novel isoforms provides a useful resource for studying the changes occurring during vernalization at a fine level. The selected potential candidate genes can shed light on the regulation of this process.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1675-1) contains supplementary material, which is available to authorized users.
PMCID: PMC4515921  PMID: 26216467
22.  Transcriptome analysis of bacteriophage communities in periodontal health and disease 
BMC Genomics  2015;16(1):549.
The role of viruses as members of the human microbiome has gained broader attention with the discovery that human body surfaces are inhabited by sizeable viral communities. The majority of the viruses identified in these communities have been bacteriophages that predate upon cellular microbiota rather than the human host. Phages have the capacity to lyse their hosts or provide them with selective advantages through lysogenic conversion, which could help determine the structure of co-existing bacterial communities. Because conditions such as periodontitis are associated with altered bacterial biota, phage mediated perturbations of bacterial communities have been hypothesized to play a role in promoting periodontal disease. Oral phage communities also differ significantly between periodontal health and disease, but the gene expression of oral phage communities has not been previously examined.
Here, we provide the first report of gene expression profiles from the oral bacteriophage community using RNA sequencing, and find that oral phages are more highly expressed in subjects with relative periodontal health. While lysins were highly expressed, the high proportion of integrases expressed suggests that prophages may account for a considerable proportion of oral phage gene expression. Many of the transcriptome reads matched phages found in the oral cavities of the subjects studied, indicating that phages may account for a substantial proportion of oral gene expression. Reads homologous to siphoviruses that infect Firmicutes were amongst the most prevalent transcriptome reads identified in both periodontal health and disease. Some genes from the phage lytic module were significantly more highly expressed in subjects with periodontal disease, suggesting that periodontitis may favor the expression of some lytic phages.
As we explore the contributions of viruses to the human microbiome, the data presented here suggest varying expression of bacteriophage communities in oral health and disease.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1781-0) contains supplementary material, which is available to authorized users.
PMCID: PMC4515923  PMID: 26215258
Saliva; Bacteriophage; Microbiome; Virome; Metagenome; Transcriptome; Periodontal Disease; Periodontitis
23.  Genomic sequence of the aflatoxigenic filamentous fungus Aspergillus nomius 
BMC Genomics  2015;16(1):551.
Aspergillus nomius is an opportunistic pathogen and one of the three most important producers of aflatoxins in section Flavi. This fungus has been reported to contaminate agricultural commodities, but it has also been sampled in non-agricultural areas so the host range is not well known. Having a similar mycotoxin profile as A. parasiticus, isolates of A. nomius are capable of secreting B- and G- aflatoxins.
In this study we discovered that the A. nomius type strain (NRRL 13137) has a genome size of approximately 36 Mb which is comparable to other Aspergilli whose genomes have been sequenced. Its genome encompasses 11,918 predicted genes, 72 % of which were assigned GO terms using BLAST2GO. More than 1,200 of those predicted genes were identified as unique to A. nomius, and the most significantly enriched GO category among the unique genes was oxidoreductase activity. Phylogenomic inference shows NRRL 13137 as ancestral to the other aflatoxigenic species examined from section Flavi. This strain contains a single mating-type idiomorph designated as MAT1-1.
This study provides a preliminary analysis of the A. nomius genome. Given the recently discovered potential for A. nomius to undergo sexual recombination, and based on our findings, this genome sequence provides an additional evolutionary reference point for studying the genetics and biology of aflatoxin production.
PMCID: PMC4515932  PMID: 26216546
Aspergillus nomius; Genome sequence; Gene ontology; Phylogenomics; Mating-type locus
24.  Evolutionary insights from de novo transcriptome assembly and SNP discovery in California white oaks 
BMC Genomics  2015;16(1):552.
Reference transcriptomes provide valuable resources for understanding evolution within and among species. We de novo assembled and annotated a reference transcriptome for Quercus lobata and Q. garryana and identified single-nucleotide polymorphisms (SNPs) to provide resources for forest genomicists studying this ecologically and economically important genus. We further performed preliminary analyses of genes important in interspecific divergent (positive) selection that might explain ecological differences among species, estimating rates of nonsynonymous to synonymous substitutions (dN/dS) and Fay and Wu’s H. Functional classes of genes were tested for unusually high dN/dS or low H consistent with divergent positive selection.
Our draft transcriptome is among the most complete for oaks, including 83,644 contigs (23,329 ≥ 1 kbp), 14,898 complete and 13,778 partial gene models, and functional annotations for 9,431 Arabidopsis orthologs and 19,365 contigs with Pfam hits. We identified 1.7 million possible sequence variants including 1.1 million high-quality diallelic SNPs — among the largest sets identified in any tree. 11 of 18 functional categories with significantly elevated dN/dS are involved in disease response, including 50+ genes with dN/dS > 1. Other high-dN/dS genes are involved in biotic response, flowering and growth, or regulatory processes. In contrast, median dN/dS was low (0.22), suggesting that purifying selection influences most genes. No functional categories have unusually low H.
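As a reading aid for the dN/dS values reported above, the sketch below shows the conventional interpretation of the ratio: values above 1 are taken as evidence of positive (diversifying) selection, values near 1 as neutral evolution, and values below 1 as purifying selection. The function name `classify_omega` and the `tolerance` band around 1 are illustrative assumptions; the study's own estimation method and significance tests are not reproduced here.

```python
def classify_omega(dn, ds, tolerance=0.05):
    """Return (omega, label) for nonsynonymous and synonymous rates.

    omega = dN/dS. The tolerance band around 1 is an arbitrary,
    hypothetical choice made only for this illustration.
    """
    if ds <= 0:
        raise ValueError("dS must be positive for the ratio to be defined")
    omega = dn / ds
    if omega > 1 + tolerance:
        label = "positive selection"
    elif omega < 1 - tolerance:
        label = "purifying selection"
    else:
        label = "approximately neutral"
    return omega, label


# Hypothetical rates chosen so that omega is about 0.22,
# matching the median dN/dS quoted in the abstract.
omega, label = classify_omega(dn=0.022, ds=0.1)
print(f"{omega:.2f} -> {label}")
```

In practice a single ratio is not a test: whether an elevated dN/dS is significant depends on the model and statistics used, which this sketch does not attempt.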
These results offer preliminary support for the hypothesis that divergent selection on pathogen resistance is an important factor in species divergence in these hybridizing California oaks. Our transcriptome provides a solid foundation for future studies of gene expression, natural selection, and speciation in Quercus.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1761-4) contains supplementary material, which is available to authorized users.
PMCID: PMC4517385  PMID: 26215102
Annotation; De novo assembly; Divergence; dN/dS; Quercus douglasii; Quercus garryana; Quercus lobata; RNA-Seq; Single-nucleotide polymorphism; Transcriptome
25.  The genome of the truffle-parasite Tolypocladium ophioglossoides and the evolution of antifungal peptaibiotics 
BMC Genomics  2015;16(1):553.
Two major mycoparasitic lineages, the family Hypocreaceae and the genus Tolypocladium, exist within the fungal order, Hypocreales. Peptaibiotics are a group of secondary metabolites almost exclusively described from Trichoderma species of Hypocreaceae. Peptaibiotics are produced by nonribosomal peptide synthetases (NRPSs) and have antibiotic and antifungal activities. Tolypocladium species are mainly truffle parasites, but a few species are insect pathogens.
The draft genome sequence of the truffle parasite Tolypocladium ophioglossoides was generated and numerous secondary metabolite clusters were discovered, many of which have no known putative product. However, three large peptaibiotic gene clusters were identified using phylogenetic analyses. Peptaibiotic genes are absent from the predominantly plant and insect pathogenic lineages of Hypocreales, and are therefore exclusive to the largely mycoparasitic lineages. Using NRPS adenylation domain phylogenies and reconciliation of the domain tree with the organismal phylogeny, it is demonstrated that the distribution of these domains is likely not the product of horizontal gene transfer between mycoparasitic lineages, but represents independent losses in insect pathogenic lineages. Peptaibiotic genes are less conserved between species of Tolypocladium and are the product of complex patterns of lineage sorting and module duplication. In contrast, these genes are more conserved within the genus Trichoderma and consistent with diversification through speciation.
Peptaibiotic NRPS genes are restricted to mycoparasitic lineages of Hypocreales, based on current sampling. Phylogenomics and comparative genomics can provide insights into the evolution of secondary metabolite genes, their distribution across a broader range of taxa, and their possible function related to host specificity.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1777-9) contains supplementary material, which is available to authorized users.
PMCID: PMC4517408  PMID: 26215153
Secondary metabolism; Hypocreales; Mycoparasites; Lineage sorting
