Results 1-25 (447)
1.  Transcription analysis on response of porcine alveolar macrophages to Haemophilus parasuis 
BMC Genomics  2012;13:68.
Haemophilus parasuis (H. parasuis) is the etiological agent of Glässer's disease in pigs. Currently, the molecular basis of this infection is largely unknown. The innate immune response is the first line of defense against infectious disease. Systematic analysis of the host innate immune response to infection is important for understanding the pathogenesis of infectious microorganisms.
A total of 428 differentially expressed (DE) genes were identified in the porcine alveolar macrophages (PAMs) 6 days after H. parasuis infection. These genes were principally related to inflammatory response, immune response, microtubule polymerization, regulation of transcription and signal transduction. Pathway analysis showed that the significant pathways were mainly concerned with cell adhesion molecules, cytokine-cytokine receptor interaction, complement and coagulation cascades, the toll-like receptor signaling pathway and the MAPK signaling pathway, suggesting that the host used different strategies to activate immune and inflammatory responses upon H. parasuis infection. The global interaction network and two subnetworks of the proteins encoded by DE genes were analyzed using STRING. Further immunostimulation analysis indicated that mRNA levels of S100 calcium-binding protein A4 (S100A4) and S100 calcium-binding protein A6 (S100A6) in porcine PK-15 cells increased within 48 h and were sustained after administration of lipopolysaccharide (LPS) and Poly (I:C) respectively. The s100a4 and s100a6 genes were found to be significantly up-regulated in lungs, spleen and lymph nodes in H. parasuis infected pigs. We cloned and sequenced the porcine coronin1a gene for the first time. Phylogenetic analysis showed that poCORONIN 1A belonged to the group containing the Bos taurus sequence. Structural analysis indicated that poCORONIN 1A contained putative domains of Trp-Asp (WD) repeats signature, Trp-Asp (WD) repeats profile and Trp-Asp (WD) repeats circular profile at the N-terminus.
Our present study is the first to focus on the response of porcine alveolar macrophages to H. parasuis. Our data demonstrate that a series of genes are activated upon H. parasuis infection. The observed gene expression profile could help in screening potential host agents for reducing the prevalence of H. parasuis and in further understanding the molecular pathogenesis associated with H. parasuis infection in pigs.
PMCID: PMC3296652  PMID: 22330747
2.  Comparative genomics of parasitic silkworm microsporidia reveal an association between genome expansion and host adaptation 
BMC Genomics  2013;14:186.
The microsporidian Nosema bombycis has received much attention because the pébrine disease of domesticated silkworms results in great economic losses in the silkworm industry. So far, no effective treatment has been found for pébrine. Compared to other known Nosema parasites, N. bombycis can parasitize an unusually broad range of hosts. To gain insight into the genetic mechanisms underlying pathogenic ability and host range expansion in this parasite, a comparative genomic approach was used. The genomes of two Nosema parasites, N. bombycis and N. antheraeae (an obligate parasite of the undomesticated silkworm Antheraea pernyi), were sequenced and compared with that of a distantly related species, N. ceranae (an obligate parasite of honey bees).
Our comparative genomic analysis shows that the N. bombycis genome has greatly expanded through three molecular mechanisms: 1) the proliferation of host-derived transposable elements, 2) the acquisition of many horizontally transferred genes from bacteria, and 3) the production of abundant gene duplications. To our knowledge, duplicated genes derived not only from small-scale events (e.g., tandem duplications) but also from large-scale events (e.g., segmental duplications) have never been seen in such abundance in any reported microsporidian genome. Our relative dating analysis further indicated that these duplication events arose recently, over a very short evolutionary time. Furthermore, several duplicated genes involved in the cytotoxic metabolic pathway were found to have undergone positive selection, suggesting a role for duplicated genes in the adaptive evolution of pathogenic ability.
Genome expansion is rarely considered as an evolutionary outcome for the highly reduced and compact genomes of parasitic microsporidia. This study demonstrates, for the first time, that parasitic genomes can expand rather than shrink through several common molecular mechanisms such as gene duplication, horizontal gene transfer, and transposable element expansion. We also show that duplicated genes can serve as raw material for evolutionary innovations, possibly contributing to increased pathogenic ability. Based on our research, we propose that the duplicated genes of N. bombycis should be treated as primary targets when designing treatments against pébrine.
PMCID: PMC3614468  PMID: 23496955
Gene duplication; Horizontal gene transfer; Host-derived transposable element; Host adaptation; Microsporidian; Silkworms
3.  Genomic and transcriptomic analysis of the endophytic fungus Pestalotiopsis fici reveals its lifestyle and high potential for synthesis of natural products 
BMC Genomics  2015;16(1):28.
In recent years, the genus Pestalotiopsis has received increasing attention, not only because of its economic impact as a plant pathogen but also as a commonly isolated endophyte that is an important source of bioactive natural products. Pestalotiopsis fici Steyaert W106-1/CGMCC3.15140, an endophyte of tea, produces numerous novel secondary metabolites, including chloropupukeananin, a derivative of chlorinated pupukeanane discovered for the first time in fungi. Some of these metabolites might be important as drug leads for future pharmaceuticals.
Here, we report the genome sequence of the tea endophytic fungus Pestalotiopsis fici W106-1/CGMCC3.15140. Abundant carbohydrate-active enzymes, especially the significantly expanded pectinases, allow the fungus to utilize the limited intercellular nutrients within host plants, suggesting adaptation of the fungus to an endophytic lifestyle. The P. fici genome encodes a rich set of secondary metabolite synthesis genes, including 27 polyketide synthases (PKSs), 12 non-ribosomal peptide synthases (NRPSs), five dimethylallyl tryptophan synthases, four putative PKS-like enzymes, 15 putative NRPS-like enzymes, 15 terpenoid synthases, seven terpenoid cyclases, seven fatty-acid synthases, and five hybrids of PKS-NRPS. The majority of these core enzymes are distributed among 74 secondary metabolite clusters. The putative Diels-Alderase genes have undergone expansion.
The significant expansion of pectinase-encoding genes provides essential insight into the life strategy of endophytes, and the richness of secondary metabolite gene clusters reveals the high potential of endophytic fungi for producing natural products.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-014-1190-9) contains supplementary material, which is available to authorized users.
PMCID: PMC4320822  PMID: 25623211
Genome; Endophyte; Pestalotiopsis fici; Secondary metabolite
4.  Mapping and analysis of a novel candidate Fusarium wilt resistance gene FOC1 in Brassica oleracea 
BMC Genomics  2014;15(1):1094.
Cabbage Fusarium wilt is a major disease worldwide that can cause severe yield loss in cabbage (Brassica oleracea). Although markers linked to the resistance gene FOC1 have been identified, no candidate gene for it has been determined so far. In this study, we report the fine mapping and analysis of a candidate gene for FOC1 using a doubled haploid (DH) population with 160 lines and an F2 population of 4000 individuals derived from the same parental lines.
We confirmed that the resistance to Fusarium wilt was controlled by a single dominant gene based on the resistance segregation ratio of the two populations. Using InDel primers designed from whole-genome re-sequencing data for the two parental lines (the resistant inbred line 99–77 and the highly susceptible line 99–91) and the DH population, we mapped the resistance gene to a 382-kb genomic region on chromosome C06. Using the F2 population, we narrowed the region to an 84-kb interval that harbored ten genes, including four probable resistance genes (R genes): Bol037156, Bol037157, Bol037158 and Bol037161 according to the gene annotations from BRAD, the genomic database for B. oleracea. After correcting the models of these genes, we re-predicted two R genes in the target region: re-Bol037156 and re-Bol0371578. The latter was excluded after we compared the two genes’ sequences between ten resistant materials and ten susceptible materials. For re-Bol037156, we found high identity among the sequences of the resistant lines, while among the susceptible lines, there were two types of InDels (a 1-bp insertion and a 10-bp deletion), each of which caused a frameshift and terminating mutation in the cDNA sequences. Further sequence analysis of the two InDel loci from 80 lines (40 resistant and 40 susceptible) also showed that none of the 40 R lines had an InDel mutation while 39 out of 40 S lines matched the two types of loci. Thus re-Bol037156 was identified as a likely candidate gene for FOC1 in cabbage.
This work may lay the foundation for marker-assisted selection as well as for further functional analysis of the FOC1 gene.
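The single-dominant-gene inference above rests on segregation ratios, which are usually checked with a chi-square goodness-of-fit test (3:1 resistant to susceptible in an F2, 1:1 in a doubled haploid population). The following is a minimal sketch of that check. The abstract does not report the observed counts, so the counts and the function names `chi_square` and `fits_ratio` are hypothetical and chosen only for illustration.

```python
# Chi-square goodness-of-fit test for Mendelian segregation ratios.
# All counts below are hypothetical; the abstract does not report them.

# Critical value of the chi-square distribution for 1 degree of freedom, alpha = 0.05.
CHI2_CRITICAL_DF1_ALPHA05 = 3.841


def chi_square(observed, expected_ratio):
    """Return the chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_ratio(observed, expected_ratio):
    """True when two-class counts do not deviate significantly from the ratio.

    Only valid for two phenotype classes (1 degree of freedom).
    """
    return chi_square(observed, expected_ratio) < CHI2_CRITICAL_DF1_ALPHA05


# Hypothetical F2 counts (resistant, susceptible): 3:1 expected for one dominant gene.
f2_counts = (3010, 990)
# Hypothetical DH counts (resistant, susceptible): 1:1 expected for one gene.
dh_counts = (83, 77)

print("F2:", round(chi_square(f2_counts, (3, 1)), 3), fits_ratio(f2_counts, (3, 1)))
print("DH:", round(chi_square(dh_counts, (1, 1)), 3), fits_ratio(dh_counts, (1, 1)))
```

A statistic below the critical value means the data are compatible with the expected ratio; it does not by itself prove single-gene control.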
PMCID: PMC4299151  PMID: 25495687
Brassica oleracea; Fusarium wilt; Resistance gene; FOC1; Map-based cloning
5.  Whole genome comparative analysis of channel catfish (Ictalurus punctatus) with four model fish species 
BMC Genomics  2013;14:780.
Comparative mapping is a powerful tool for studying the evolution of genomes. It allows transfer of genome information from well-studied model species to non-model species. Catfish is an economically important aquaculture species in the United States. A large number of genome resources have been developed for catfish, including genetic linkage maps, physical maps, BAC end sequences (BES), integrated linkage and physical maps using BES-derived markers, physical map contig-specific sequences, and draft genome sequences. Application of such genome resources should allow comparative analysis at the genome scale with several other model fish species.
In this study, we conducted whole genome comparative analysis between channel catfish and four model fish species with fully sequenced genomes, zebrafish, medaka, stickleback and Tetraodon. A total of 517 Mb draft genome sequences of catfish were anchored to its genetic linkage map, which accounted for 62% of the total draft genome sequences. Based on the location of homologous genes, homologous chromosomes were determined among catfish and the four model fish species. A large number of conserved syntenic blocks were identified. Analysis of the syntenic relationships between catfish and the four model fishes supported that the catfish genome is most similar to the genome of zebrafish.
The organization of the catfish genome is similar to that of the four teleost species, zebrafish, medaka, stickleback, and Tetraodon such that homologous chromosomes can be identified. Within each chromosome, extended syntenic blocks were evident, but the conserved syntenies at the chromosome level involve extensive inter-chromosomal and intra-chromosomal rearrangements. This whole genome comparative map should facilitate the whole genome assembly and annotation in catfish, and will be useful for genomic studies of various other fish species.
PMCID: PMC3840565  PMID: 24215161
Catfish; Genome; Comparative mapping; Linkage mapping; Conserved synteny
6.  Diversity and evolution of multiple orc/cdc6-adjacent replication origins in haloarchaea 
BMC Genomics  2012;13:478.
While multiple replication origins have been observed in archaea, considerably less is known about their evolutionary processes. Here, we performed a comparative analysis of the predicted (proved in part) orc/cdc6-associated replication origins in 15 completely sequenced haloarchaeal genomes to investigate the diversity and evolution of replication origins in halophilic Archaea.
Multiple orc/cdc6-associated replication origins were predicted in all of the analyzed haloarchaeal genomes following the identification of putative ORBs (origin recognition boxes) that are associated with orc/cdc6 genes. Five of these predicted replication origins in Haloarcula hispanica were experimentally confirmed via autonomous replication activities. Strikingly, several predicted replication origins in H. hispanica and Haloarcula marismortui are located in distinct regions of their highly homologous chromosomes, suggesting that these replication origins might have been introduced as parts of new genomic content. A comparison of the origin-associated Orc/Cdc6 homologs and the corresponding predicted ORB elements revealed that the replication origins in a given haloarchaeon are quite diverse, while different haloarchaea can share a few conserved origins. Phylogenetic and genomic context analyses suggested that there is an original replication origin (oriC1) that was inherited from the ancestor of archaea, and that several other origins likely evolved and/or were translocated within the haloarchaeal species.
This study provides detailed information about the diversity of multiple orc/cdc6-associated replication origins in haloarchaeal genomes, and provides novel insight into the evolution of multiple replication origins in Archaea.
PMCID: PMC3528665  PMID: 22978470
7.  Generation of genome-scale gene-associated SNPs in catfish for the construction of a high-density SNP array 
BMC Genomics  2011;12:53.
Single nucleotide polymorphisms (SNPs) have become the marker of choice for genome-wide association studies. In order to provide the best genome coverage for the analysis of performance and production traits, a large number of relatively evenly distributed SNPs are needed. Gene-associated SNPs may fulfill these requirements of large numbers and genome wide distribution. In addition, gene-associated SNPs could themselves be causative SNPs for traits. The objective of this project was to identify large numbers of gene-associated SNPs using high-throughput next generation sequencing.
Transcriptome sequencing was conducted for channel catfish and blue catfish using Illumina next generation sequencing technology. Approximately 220 million reads (15.6 Gb) for channel catfish and 280 million reads (19.6 Gb) for blue catfish were obtained by sequencing gene transcripts derived from various tissues of multiple individuals from a diverse genetic background. A total of over 35 billion base pairs of expressed short read sequences were generated. Over two million putative SNPs were identified from channel catfish and almost 2.5 million putative SNPs were identified from blue catfish. Of these putative SNPs, a set of filtered SNPs were identified including 342,104 intra-specific SNPs for channel catfish, 366,269 intra-specific SNPs for blue catfish, and 420,727 inter-specific SNPs between channel catfish and blue catfish. These filtered SNPs are distributed within 16,562 unique genes in channel catfish and 17,423 unique genes in blue catfish.
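The read and base counts quoted above can be cross-checked with simple arithmetic. The sketch below uses only the rounded figures from the abstract; the function name `mean_read_length` and the dictionary layout are illustrative assumptions, and the derived read lengths are approximate because the inputs are rounded.

```python
# Consistency check on the sequencing yields quoted in the abstract.
# Inputs are the rounded figures from the text, so results are approximate.

def mean_read_length(total_bases, read_count):
    """Average number of bases per read."""
    return total_bases / read_count


datasets = {
    "channel catfish": {"reads": 220e6, "bases": 15.6e9},
    "blue catfish": {"reads": 280e6, "bases": 19.6e9},
}

for species, d in datasets.items():
    length = mean_read_length(d["bases"], d["reads"])
    print(f"{species}: about {length:.1f} bp per read")

# Combined yield in gigabases, to compare with "over 35 billion base pairs".
total_gb = sum(d["bases"] for d in datasets.values()) / 1e9
print(f"combined yield: {total_gb:.1f} Gb")
```

Both species come out at roughly 70 bp per read, and the combined yield agrees with the "over 35 billion base pairs" stated in the text.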
For aquaculture species, transcriptome analysis of pooled RNA samples from multiple individuals using Illumina sequencing technology is both technically efficient and cost-effective for generating expressed sequences. Such an approach is most effective when coupled to existing EST resources generated using traditional sequencing approaches because the reference ESTs facilitate effective assembly of the expressed short reads. When multiple individuals with different genetic backgrounds are used, RNA-Seq is very effective for the identification of SNPs. The SNPs identified in this report will provide a much needed resource for genetic studies in catfish and will contribute to the development of a high-density SNP array. Validation and testing of these SNPs using SNP arrays will form the material basis for genome association studies and whole genome-based selection in catfish.
PMCID: PMC3033819  PMID: 21255432
8.  Diverse genome structures of Salmonella paratyphi C 
BMC Genomics  2007;8:290.
Salmonella paratyphi C, like S. typhi, is adapted to humans and causes typhoid fever. Previously we reported different genome structures between two strains of S. paratyphi C, which suggests that S. paratyphi C might have a plastic genome (large DNA segments being organized in different orders or orientations on the genome). As many but not all host-adapted Salmonella pathogens have large genomic insertions as well as the supposedly resultant genomic rearrangements, bacterial genome plasticity presents an extraordinary evolutionary phenomenon. Events contributing to genomic plasticity, especially large insertions, may be associated with the formation of particular Salmonella pathogens.
We constructed a high resolution genome map in S. paratyphi C strain RKS4594 and located four insertions totaling 176 kb (including the 90 kb SPI7) and seven deletions totaling 165 kb relative to S. typhimurium LT2. Two rearrangements were revealed, including an inversion of 1602 kb covering the ter region and the translocation of the 43 kb I-CeuI F fragment. The 23 wild type strains analyzed in this study exhibited diverse genome structures, mostly as a result of recombination between rrn genes. In at least two cases, the rearrangements involved recombination between genomic sites other than the rrn genes, possibly homologous genes in prophages. Two strains had a 20 kb deletion between rrlA and rrlB, a highly conserved region in which no deletion has been reported in any other Salmonella lineage.
S. paratyphi C has diverse genome structures among different isolates, possibly as a result of large genomic insertions, e.g., SPI7. Although the Salmonella typhoid agents may not be more closely related to one another than each is to other Salmonella lineages, they may have evolved in similar ways, i.e., acquiring typhoid-associated genes followed by genome structure rearrangements. Comparison of multiple Salmonella typhoid agents at both the single sequenced genome and population levels will facilitate studies on the evolutionary process of typhoid pathogenesis, especially the identification of typhoid-associated genes.
PMCID: PMC2000905  PMID: 17718928
9.  Gene coexpression networks reveal key drivers of phenotypic divergence in porcine muscle 
BMC Genomics  2015;16(1):50.
Domestication of the wild pig has led to obese and lean phenotype breeds, and evolutionary genome research has sought to identify the regulatory mechanisms underlying this phenotypic diversity. However, revealing the molecular mechanisms underlying muscle phenotype variation based on differentially expressed genes has proved to be difficult. To characterize the mechanisms regulating muscle phenotype variation under artificial selection, we aimed to provide an integrated view of genome organization by weighted gene coexpression network analysis.
Our analysis was based on 20 publicly available next-generation sequencing datasets of lean and obese pig muscle generated from 10 developmental stages. The evolution of the constructed coexpression modules was examined using the genome resequencing data of 37 domestic pigs and 11 wild boars. Our results showed the regulation of muscle development might be more complex than had been previously acknowledged, and is regulated by the coordinated action of muscle, nerve and immunity related genes. Breed-specific modules that regulated muscle phenotype divergence were identified, and hundreds of hub genes with major roles in muscle development were determined to be responsible for key functional distinctions between breeds. Our evolutionary analysis showed that the role of changes in the coding sequence under positive selection in muscle phenotype divergence was minor.
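For readers unfamiliar with weighted gene coexpression network analysis, its core step is turning pairwise expression correlations into a weighted adjacency by raising the absolute correlation to a soft-thresholding power. The sketch below illustrates only that step for an unsigned network. The expression values, gene names and the power `beta = 6` are hypothetical; the abstract does not state which parameters or software settings the authors used.

```python
from math import sqrt

# Unsigned WGCNA-style adjacency: a_ij = |cor(x_i, x_j)| ** beta.
# Expression values and beta are hypothetical, for illustration only.


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (raises on zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(expression, beta):
    """Map each ordered gene pair to |correlation| raised to the power beta."""
    genes = list(expression)
    return {
        (g, h): abs(pearson(expression[g], expression[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical expression of three genes across five samples.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.1, 3.9, 6.2, 8.0, 9.8],
    "gene_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}

adjacency = soft_threshold_adjacency(expression, beta=6)
for pair, weight in sorted(adjacency.items()):
    print(pair, round(weight, 4))
```

Raising correlations to a power suppresses weak links while keeping strong ones, which is what allows tightly coexpressed genes to be grouped into modules in later steps.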
Muscle phenotype divergence was found to be regulated by the divergence of coexpression network modules under artificial selection, and not by changes in the coding sequence of genes. Our results present multiple lines of evidence suggesting links between modules and muscle phenotypes, and provide insights into the molecular bases of genome organization in muscle development and phenotype variation.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1238-5) contains supplementary material, which is available to authorized users.
PMCID: PMC4328970  PMID: 25651817
Muscle; Modules; Weighted gene coexpression network analysis; Phenotype variation; Artificial selection
10.  Genome sequencing of Sporisorium scitamineum provides insights into the pathogenic mechanisms of sugarcane smut 
BMC Genomics  2014;15(1):996.
Sugarcane smut can cause losses in cane yield and sugar content that range from 30% to total crop failure. Losses tend to increase with the passage of years. Sporisorium scitamineum is the fungus that causes sugarcane smut. This fungus has the potential to infect all sugarcane species unless a species is resistant to biotrophic fungal pathogens. However, it remains unclear how the fungus breaks through the cell walls of sugarcane and causes the formation of black or gray whip-like structures on the sugarcane plants.
Here, we report the first high-quality genome sequence of S. scitamineum, assembled de novo with a contig N50 of 41 kb, a scaffold N50 of 884 kb and a genome size of 19.8 Mb, containing an estimated 6,636 genes. This phytopathogen can utilize a wide range of carbon and nitrogen sources. A reduced set of genes encoding plant cell wall hydrolytic enzymes leads to its biotrophic lifestyle, in which damage to the host should be minimized. In this bipolar mating fungus, the a and b loci are linked and the mating-type locus segregates as a single locus. The S. scitamineum genome has only 6 G protein-coupled receptors (GPCRs) grouped into five classes, which are responsible for transducing extracellular signals into intracellular responses; however, the genome lacks any PTH11-like GPCR. There are 192 virulence-associated genes in the genome of S. scitamineum, of which 31 are expressed in all stages; these mainly encode enzymes related to energy metabolism and redox of short-chain compounds. Sixty-eight candidates for secreted effector proteins (CSEPs) were found in the genome of S. scitamineum, and 32 of them were expressed in the different stages of sugarcane infection; these are probably involved in infection and/or triggering defense responses. There are two non-ribosomal peptide synthetase (NRPS) gene clusters that are involved in the generation of ferrichrome and ferrichrome A, while the terpene gene cluster is composed of three genes of unknown function and seven biosynthesis-related genes.
The genome of S. scitamineum, a destructive pathogen for the sugar industry, will facilitate future research on the genomic basis and the pathogenic mechanisms of sugarcane smut.
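The contig and scaffold N50 values quoted above follow the standard definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size. A minimal sketch of that calculation is given below. The function name `n50` and the toy lengths are illustrative and are not data from the S. scitamineum assembly.

```python
# N50 of an assembly: the length L such that sequences of length >= L
# together cover at least half of the total assembled bases.
# The lengths used below are toy values, not data from the paper.


def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print("N50 =", n50(toy_contigs))
```

In the toy example the total is 32, and the two longest contigs (8 + 8) already cover half of it, so the N50 is 8.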
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-996) contains supplementary material, which is available to authorized users.
PMCID: PMC4246466  PMID: 25406499
Sporisorium scitamineum; Sugarcane smut; Pathogenic mechanisms; G-protein coupled receptors; Carbohydrate degrading enzymes; Biotrophic properties; Candidates for secreted effector proteins; Secondary metabolic pathways
11.  An improved 2b-RAD approach (I2b-RAD) offering genotyping tested by a rice (Oryza sativa L.) F2 population 
BMC Genomics  2014;15(1):956.
The 2b-RAD (type IIB endonuclease restriction-site associated DNA) approach was invented by Wang in 2012 and has proven to be a simple and flexible method for genome-wide genotyping. However, there is still plenty of room for improvement in the existing 2b-RAD approach. Firstly, unlike other reduced-representation library methods, it does not include sample pooling during library preparation. Secondly, information about 2b-RAD tags, such as tag numbers and distributions, is unknown for most species. The purposes of this research were to develop an improved 2b-RAD approach that supports sample pooling, and to characterize 2b-RAD tags and their application potential by bioinformatics analysis.
Twelve adapter1 sequences and one adapter2 sequence were designed. A library preparation approach comprising digestion, ligation, pooling, PCR and size selection was established. To save costs, we used non-phosphorylated adapters and indexed PCR primers. An F2 population of rice (Oryza sativa L.) was genotyped to validate the new approach. On average, 2,000,332 high-quality reads were obtained per sample, with high evenness. In total, 3598 markers containing 3804 SNPs were discovered, and the missing rate was 18.9%. A genetic linkage map of 1385 markers was constructed, and the order of 92% of the markers in the genetic map agreed with their order on the chromosomes. Meanwhile, bioinformatics simulation in 20 species showed that BsaXI had the most widespread recognition sites, indicating that 2b-RAD tags have strong application potential for high-density genetic maps. Using modified adapters with a fixed base at the 3′ end, 2b-RAD was also suitable for low-cost QTL studies.
An improved 2b-RAD genotyping approach was established in this research and named I2b-RAD. The method is a simple, fast, cost-effective and multiplexed sequencing library approach. It can be adjusted by selecting different enzymes and adapters to suit alternative uses including chromosome assembly, QTL fine mapping and even natural population analysis.
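The in-silico simulation mentioned above amounts to counting enzyme recognition sites in a genome sequence. The sketch below shows one way such a count could be done on both strands. It is not the authors' pipeline: the BsaXI motif `ACNNNNNCTCC` is taken from common enzyme catalogs rather than from the abstract and should be verified before use, and the function names and test sequence are hypothetical.

```python
import re

# Count recognition sites of a restriction enzyme on both strands.
# ASSUMPTION: the BsaXI motif below comes from enzyme catalogs, not from
# the abstract; verify it before relying on the counts.
BSAXI_MOTIF = "ACNNNNNCTCC"

IUPAC_TO_REGEX = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(motif):
    """Reverse complement of a motif written with A, C, G, T and N."""
    return motif.translate(COMPLEMENT)[::-1]


def motif_regex(motif):
    """Compile a motif into a lookahead regex so overlapping sites are found."""
    body = "".join(IUPAC_TO_REGEX[base] for base in motif)
    return re.compile(f"(?=({body}))")


def count_sites(sequence, motif):
    """Count motif occurrences on the forward and reverse strands."""
    sequence = sequence.upper()
    # A set avoids double counting when the motif is palindromic.
    orientations = {motif, reverse_complement(motif)}
    return sum(len(motif_regex(m).findall(sequence)) for m in orientations)


# Hypothetical toy sequence with one forward-strand and one reverse-strand site.
toy_genome = "TTACAAAAACTCCGGTTTGGAGTTTTTGTAA"
print("sites found:", count_sites(toy_genome, BSAXI_MOTIF))
```

Running the same count over several genomes and several type IIB enzymes would give a rough comparison of expected tag numbers per species, which is the kind of comparison the abstract describes.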
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-956) contains supplementary material, which is available to authorized users.
PMCID: PMC4236440  PMID: 25373334
2b-RAD; Genotyping; Genetic linkage map
12.  Large-scale transcriptome comparison reveals distinct gene activations in wheat responding to stripe rust and powdery mildew 
BMC Genomics  2014;15(1):898.
Stripe rust (Puccinia striiformis f. sp. tritici; Pst) and powdery mildew (Blumeria graminis f. sp. tritici; Bgt) are important diseases of wheat (Triticum aestivum) worldwide. Similar mechanisms and gene transcripts are assumed to be involved in the host defense response because both pathogens are biotrophic fungi. The main objective of our study was to identify co-regulated mRNAs that show a change in expression pattern after inoculation with Pst or Bgt, and to identify mRNAs specific to the fungal stress response.
The transcriptome of the hexaploid wheat line N9134 inoculated with the Chinese Pst race CYR 31 was compared with that of the same line inoculated with Bgt race E09 at 1, 2, and 3 days post-inoculation. Infection by Pst and Bgt affected transcription of 23.8% of all T. aestivum genes. Infection by Bgt triggered a more robust alteration in gene expression in N9134 compared with the response to Pst infection. An array of overlapping gene clusters with distinctive expression patterns provided insight into the regulatory differences in the responses to Bgt and Pst infection. The differentially expressed genes were grouped into seven enriched Kyoto Encyclopedia of Genes and Genomes pathways in Bgt-infected leaves and four pathways in Pst-infected leaves, while only two pathways overlapped. In the plant–pathogen interaction pathway, N9134 activated a higher number of genes and pathways in response to Bgt infection than in response to Pst invasion. Genomic analysis revealed that the wheat genome shared some microbial genetic fragments, which were specifically induced in response to Bgt and Pst infection.
Taken together, our findings indicate that the responses of wheat N9134 to infection by Bgt and Pst show differences in the pathways and genes activated. The mass sequence data for wheat–fungus interaction generated in this study provide a powerful platform for future functional and molecular research on wheat–fungus interactions.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-898) contains supplementary material, which is available to authorized users.
PMCID: PMC4201691  PMID: 25318379
Bread wheat; Stripe rust; Powdery mildew; RNA-Seq; Gene expression
13.  Transcriptome differences in the hypopharyngeal gland between Western Honeybees (Apis mellifera) and Eastern Honeybees (Apis cerana) 
BMC Genomics  2014;15(1):744.
Apis mellifera and Apis cerana are two sibling species of Apidae. Apis cerana is adept at collecting sporadic nectar in mountain and forest regions and exhibits greater hardiness and acarid resistance as a result of natural selection, whereas Apis mellifera has the advantage of producing royal jelly. To identify differentially expressed genes (DEGs) that affect the development of the hypopharyngeal gland (HG) and/or the secretion of royal jelly between these two honeybee species, we performed a digital gene expression (DGE) analysis of the HGs of these two species at three developmental stages (newly emerged worker, nurse and forager).
Twelve DGE-tag libraries were constructed and sequenced using the total RNA extracted from the HGs of newly emerged workers, nurses, and foragers of Apis mellifera and Apis cerana. Finally, a total of 1482 genes in Apis mellifera and 1313 in Apis cerana were found to exhibit an expression difference among the three developmental stages. A total of 1417 DEGs were identified between these two species. Of these, 623, 1072, and 462 genes showed an expression difference at the newly emerged worker, nurse, and forager stages, respectively. The nurse stage exhibited the highest number of DEGs between these two species and most of these were found to be up-regulated in Apis mellifera. These results suggest that the higher yield of royal jelly in Apis mellifera may be due to the higher expression level of these DEGs.
In this study, we investigated the DEGs between the HGs of two sibling honeybee species (Apis mellifera and Apis cerana). Our results indicated that the gene expression difference was associated with the difference in the royal jelly yield between these two species. These results provide an important clue for clarifying the mechanisms underlying hypopharyngeal gland development and the production of royal jelly.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-744) contains supplementary material, which is available to authorized users.
PMCID: PMC4158095  PMID: 25174638
Apis mellifera; Apis cerana; Hypopharyngeal gland; Digital gene expression; Differentially expressed gene
14.  Comparative genomic analysis of Klebsiella pneumonia (LCT-KP214) and a mutant strain (LCT-KP289) obtained after spaceflight 
BMC Genomics  2014;15:589.
With the development of space science, it is important to analyze the relationship between the space environment and genome variations that might cause phenotypic changes in microbes. Klebsiella pneumoniae is commonly found on the human body and is resistant to multiple drugs. To study space-environment-induced genome variations and drug resistance changes, K. pneumoniae was carried into outer space by the Shenzhou VIII spacecraft.
The K. pneumoniae strain LCT-KP289 was selected after spaceflight based on its phenotypic differences compared to the ground-control strain. Analysis of genomic structural variations revealed one inversion, 25 deletions, 59 insertions, two translocations and six translocations with inversions. In addition, 155 and 400 unique genes were observed in LCT-KP214 and LCT-KP289, respectively, including the gene encoding dihydroxyacetone kinase, which generates the ATP and NADH required for microbial growth. Furthermore, a large number of mutant genes were related to transport and metabolism. Phylogenetic analysis revealed that most genes in these two strains had a dN/dS value greater than 1, indicating that the strain diversity increased after spaceflight. Analysis of drug-resistance phenotypes revealed that the K. pneumoniae strain LCT-KP289 was resistant to sulfamethoxazole, whereas the control strain, LCT-KP214, was not; both strains were resistant to benzylpenicillin, ampicillin, lincomycin, vancomycin, chloramphenicol and streptomycin. The sulfamethoxazole resistance may be associated with sequences in Scaffold7 in LCT-KP289, which were not observed in LCT-KP214; this scaffold contained the gene sul1. In the strain LCT-KP289, we also observed a drug-resistance integron containing emrE (confers multidrug resistance) and ant (confers resistance to spectinomycin, streptomycin, tobramycin, kanamycin, sisomicin, dibekacin, and gentamicin). The gene ampC (confers resistance to penicillin, cephalosporin-ii and cephalosporin-i) was present near the integron. In addition, 30 and 26 drug-resistance genes were observed in LCT-KP289 and LCT-KP214, respectively.
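The dN/dS value cited above is the ratio of the nonsynonymous substitution rate (dN) to the synonymous substitution rate (dS). By convention, a ratio above 1 is read as a signature of positive selection, a ratio below 1 as purifying selection, and a ratio of 1 as neutral evolution. The sketch below shows only this final ratio and its conventional reading; estimating dN and dS from alignments is a separate step that is not shown, and the input values and function names are hypothetical.

```python
# Conventional interpretation of the dN/dS ratio (omega).
# The dN and dS values used below are hypothetical, not data from the paper.


def dn_ds_ratio(dn, ds):
    """Return dN/dS, or None when dS is zero and the ratio is undefined."""
    if ds == 0:
        return None
    return dn / ds


def selection_regime(dn, ds):
    """Label a gene by the conventional reading of its dN/dS ratio."""
    omega = dn_ds_ratio(dn, ds)
    if omega is None:
        return "undefined"
    if omega > 1:
        return "positive selection"
    if omega < 1:
        return "purifying selection"
    return "neutral"


# Hypothetical per-gene estimates: (dN, dS).
genes = {"gene_1": (0.030, 0.020), "gene_2": (0.004, 0.050), "gene_3": (0.010, 0.0)}
for name, (dn, ds) in genes.items():
    print(name, dn_ds_ratio(dn, ds), selection_regime(dn, ds))
```

In practice a significance test is applied as well, because ratios estimated from very few substitutions between closely related strains are noisy.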
Comparison of a K. pneumoniae strain obtained after spaceflight with the ground-control strain revealed genome variations and phenotypic changes and elucidated the genomic basis of the acquired drug resistance. These data pave the way for future studies on the effects of spaceflight.
PMCID: PMC4226956  PMID: 25015528
Klebsiella pneumoniae; Comparative genomic analysis; Virulence gene; Resistance gene
15.  Comparative genomics of Riemerella anatipestifer reveals genetic diversity 
BMC Genomics  2014;15(1):479.
Riemerella anatipestifer is one of the most important pathogens of ducks. However, the molecular mechanisms of R. anatipestifer infection are poorly understood. In particular, the lack of genomic information from a variety of R. anatipestifer strains has proved severely limiting.
In this study, we present the complete genomes of two R. anatipestifer strains, RA-CH-1 (2,309,519 bp, GenBank accession CP003787) and RA-CH-2 (2,166,321 bp, GenBank accession CP004020). Both strains were isolated from different sick ducks in Sichuan Province, China. A comparative genomics approach was used to identify similarities and key differences between RA-CH-1 and RA-CH-2 and the previously sequenced strain RA-GD, a clinical isolate from Guangdong, China, and ATCC11845.
The genomes of RA-CH-2 and RA-GD were extremely similar, while RA-CH-1 was significantly different from ATCC11845. RA-CH-1 is 140,000 bp larger than the three other strains and has 16 unique gene families. Evolutionary analysis shows that RA-CH-1 and RA-CH-2 are closely related and in a branch with ATCC11845, while RA-GD is located in another branch. Additionally, the detection of several iron/heme-transport related proteins and motility mechanisms will be useful in elucidating factors important in pathogenicity. This information will allow a better understanding of the phenotype of different R. anatipestifer strains and molecular mechanisms of infection.
PMCID: PMC4103989  PMID: 24935762
Riemerella anatipestifer; Comparative genomics; Structural variation
16.  Comparative genomic analysis of Mycobacterium tuberculosis clinical isolates 
BMC Genomics  2014;15(1):469.
Due to excessive antibiotic use, drug-resistant Mycobacterium tuberculosis has become a serious public health threat and a major obstacle to disease control in many countries. To better understand the evolution of drug-resistant M. tuberculosis strains, we performed whole genome sequencing for 7 M. tuberculosis clinical isolates with different antibiotic resistance profiles and conducted comparative genomic analysis of gene variations among them.
We observed that all 7 M. tuberculosis clinical isolates with different levels of drug resistance harbored similar numbers of SNPs, ranging from 1409 to 1464. The numbers of insertion/deletions (Indels) identified in the 7 isolates were also similar, ranging from 56 to 101. A total of 39 types of mutations were identified in drug resistance-associated loci, including 14 previously reported ones and 25 newly identified ones. Sixteen of the identified large Indels spanned PE-PPE-PGRS genes, which represent a major source of antigenic variability. Aside from SNPs and Indels, a CRISPR locus with varied spacers was observed in all 7 clinical isolates, suggesting that they might play an important role in plasticity of the M. tuberculosis genome. The nucleotide diversity (π value) and selection intensity (dN/dS value) of the whole genome sequences of the 7 isolates were similar. The dN/dS values were less than 1 for all 7 isolates (range from 0.608885 to 0.637365), supporting the notion that M. tuberculosis genomes undergo purifying selection. The π values and dN/dS values were comparable between drug-susceptible and drug-resistant strains.
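As an illustration of the nucleotide diversity statistic mentioned above (commonly written π), the following is a minimal sketch of the textbook definition: the average number of pairwise differences per site among aligned sequences. The abstract does not say how the authors computed their values, so the function name `nucleotide_diversity`, the toy sequences, and the choice to treat every column equally (no gap or ambiguity handling) are assumptions made only for this example.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site for equal-length aligned sequences.

    Hypothetical helper for illustration; it does not handle gaps or
    ambiguous bases and is not the method used in the study.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


if __name__ == "__main__":
    # Toy alignment (made-up data): three sequences of four sites.
    toy = ["ACGT", "ACGA", "ACGT"]
    print(nucleotide_diversity(toy))
```

In the toy alignment, two of the three pairs differ at one site, so the result is 2 differences over 3 pairs and 4 sites, or one sixth.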
In this study, we show that clinical M. tuberculosis isolates exhibit distinct variations in terms of the distribution of SNPs, Indels, and the CRISPR-cas locus, as well as the nucleotide diversity and selection intensity, but there are no generalizable differences between drug-susceptible and drug-resistant isolates on the genomic scale. Our study provides evidence strengthening the notion that the evolution of drug resistance among clinical M. tuberculosis isolates is clearly a complex and diversified process.
Electronic supplementary material
The online version of this article (doi: 10.1186/1471-2164-15-469) contains supplementary material, which is available to authorized users.
PMCID: PMC4070564  PMID: 24923884
Mycobacterium tuberculosis; Drug resistance; Single nucleotide polymorphisms; Whole genome sequencing; Evolution
17.  Genomic characteristics and comparative genomics analysis of Penicillium chrysogenum KF-25 
BMC Genomics  2014;15:144.
Penicillium chrysogenum has been used in producing penicillin and derived β-lactam antibiotics for many years. Although the genome of the mutant strain P. chrysogenum Wisconsin 54-1255 has already been sequenced, the versatility and genetic diversity of this species still need to be intensively studied. In this study, the genome of the wild-type P. chrysogenum strain KF-25, which has high activity against Ustilaginoidea virens, was sequenced and characterized.
The genome of KF-25 was about 29.9 Mb in size and contained 9,804 putative open reading frames (ORFs). Thirteen genes were predicted to encode two-component system proteins, of which six were putatively involved in osmolarity adaptation. There were 33 putative secondary metabolism pathways and numerous genes that were essential in metabolite biosynthesis. Several P. chrysogenum virus untranslated region sequences were found in the KF-25 genome, suggesting that there might be a relationship between the virus and P. chrysogenum in evolution. Comparative genome analysis showed that the genomes of KF-25 and Wisconsin 54-1255 were highly similar, except that KF-25 was 2.3 Mb smaller. Three hundred and fifty-five KF-25 specific genes were found and the biological functions of the proteins encoded by these genes were mainly unknown (232, representing 65%), except for some ORFs encoding proteins with predicted functions in transport, metabolism, and signal transduction. Numerous KF-25-specific genes were found to be associated with the pathogenicity and virulence of the strains, which were identical to those of wild-type P. chrysogenum NRRL 1951.
Genome sequencing and comparative analysis are helpful in further understanding the biology, evolution, and environmental adaptation of P. chrysogenum, and provide a new tool for identifying further functional metabolites.
PMCID: PMC3938070  PMID: 24555742
Penicillium chrysogenum; Genome; Comparative genome
18.  Drechslerella stenobrocha genome illustrates the mechanism of constricting rings and the origin of nematode predation in fungi 
BMC Genomics  2014;15:114.
Nematode-trapping fungi are a unique group of organisms that can capture nematodes using sophisticated trapping structures. The genome of Drechslerella stenobrocha, a constricting-ring-forming fungus, has been sequenced and reported, and provided new insights into the evolutionary origins of nematode predation in fungi, the trapping mechanisms, and the dual lifestyles of saprophagy and predation.
The genome of the fungus Drechslerella stenobrocha, which mechanically traps nematodes using a constricting ring, was sequenced. The genome was 29.02 Mb in size and contained fewer transposons and repeat-induced point mutations than that of Arthrobotrys oligospora. The functional proteins involved in nematode-infection, such as chitinases, subtilisins, and adhesive proteins, underwent a significant expansion in the A. oligospora genome, while there were fewer lectin genes that mediate fungus-nematode recognition in the D. stenobrocha genome. The carbohydrate-degrading enzyme catalogs in both species were similar to those of efficient cellulolytic fungi, suggesting a saprophytic origin of nematode-trapping fungi. In D. stenobrocha, the down-regulation of saprophytic enzyme genes and the up-regulation of infection-related genes during the capture of nematodes indicated a transition between dual life strategies of saprophagy and predation. The transcriptional profiles also indicated that trap formation was related to the protein kinase C (PKC) signal pathway and regulated by Zn(2)–C6 type transcription factors.
The genome of D. stenobrocha provides support for the hypothesis that nematode-trapping fungi evolved from saprophytic fungi in a high carbon and low nitrogen environment. It reveals the transition between saprophagy and predation of these fungi and also provides new insights into the mechanisms of mechanical trapping.
PMCID: PMC3924618  PMID: 24507587
Nematode-trapping fungi; Comparative genomic analysis; Origin of nematode predation; Transcriptomes; Trapping mechanism
19.  A comprehensive microRNA expression profile of the backfat tissue from castrated and intact full-sib pair male pigs 
BMC Genomics  2014;15:47.
It is widely known that castration has a significant effect on the accumulation of adipose tissue. microRNAs (miRNAs) are known to be involved in fat deposition and to be regulated by the androgen-induced androgen receptor (AR). However, there is little understanding of the relationship between miRNAs and fat deposition after castration. In this study, the high-throughput SOLiD sequencing approach was used to identify and characterize miRNA expression in backfat from intact and castrated full-sib male 23-week-old pigs. The patterns of adipogenesis and fat deposition were compared between castrated and intact male pigs.
A total of 366 unique miRNA genes were identified, comprising 174 known pre-miRNAs and 192 novel pre-miRNAs. One hundred and sixty-seven pre-miRNAs were common to both castrated (F3) and intact (F4) male pig small RNA libraries. The novel pre-miRNAs encoded 153 miRNAs/miRNA*s and 141 miRNAs/miRNA*s in the F3 and F4 libraries, respectively. One hundred and seventy-seven miRNAs, including 45 up- and 132 down-regulated, had more than 2-fold differential expression between the castrated and intact male pigs (p-value < 0.001). Thirty-five miRNAs were further selected, based on the expression abundance and differentiation between the two libraries, to predict their targets in KEGG pathways. KEGG pathway analyses suggested that miRNAs differentially expressed between the castrated and intact male pigs are involved in proliferation, apoptosis, differentiation, migration, adipose tissue development and other important biological processes. The expression patterns of eight arbitrarily selected miRNAs were validated by stem-loop reverse-transcription quantitative polymerase chain reaction. These data confirmed the expression tendency observed with SOLiD sequencing. miRNA isomiRs and mirtrons were also investigated in this study. Mirtrons are a recently described category of miRNA that relies on splicing rather than processing by the microprocessor complex to enter the RNAi pathway. The functions of miRNAs important for regulating fat deposition were also investigated in this study.
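The 2-fold cutoff described above can be sketched as a simple ratio test on normalized expression values. This is only an illustrative sketch: the function name `classify_fold_change`, the example values, and the choice to report direction as castrated relative to intact are assumptions, and the significance filter (p-value < 0.001) used in the study is deliberately left out.

```python
def classify_fold_change(castrated, intact, threshold=2.0):
    """Label a miRNA as 'up', 'down' or 'unchanged' by expression ratio.

    Hypothetical helper: direction is castrated relative to intact
    (an assumption), and no statistical test is applied.
    """
    if castrated <= 0 or intact <= 0:
        raise ValueError("expression values must be positive")
    if threshold <= 1:
        raise ValueError("threshold must be greater than 1")

    ratio = castrated / intact
    if ratio > threshold:
        return "up"
    if ratio < 1 / threshold:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # Made-up normalized expression values, for illustration only.
    examples = {"miR-A": (50.0, 10.0), "miR-B": (4.0, 20.0), "miR-C": (12.0, 10.0)}
    for name, (f3, f4) in examples.items():
        print(name, classify_fold_change(f3, f4))
```

A strict inequality is used so that a ratio of exactly 2 is not counted, matching the "more than 2-fold" wording.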
This study expands the number of fat-deposition-related miRNAs in pig. The results also indicate that castration can significantly affect the expression patterns of fat-related miRNAs. The differentially expressed miRNAs may play important roles in fat deposition after castration.
PMCID: PMC3901342  PMID: 24443800
Male pig; MicroRNA; Fat deposition; Castration
20.  Specific gene-regulation networks during the pre-implantation development of the pig embryo as revealed by deep sequencing 
BMC Genomics  2014;15:4.
Because few studies exist to describe the unique molecular network regulation behind pig pre-implantation embryonic development (PED), genetic engineering in the pig embryo is limited. Also, this lack of research has hindered derivation and application of porcine embryonic stem cells and porcine induced pluripotent stem cells (iPSCs).
We identified and analyzed the genome wide transcriptomes of pig in vivo-derived and somatic cell nuclear transferred (SCNT) as well as mouse in vivo-derived pre-implantation embryos at different stages using mRNA deep sequencing. Comparison of the pig embryonic transcriptomes with those of mouse and human pre-implantation embryos revealed unique gene expression patterns during pig PED. Pig zygotic genome activation was confirmed to occur at the 4-cell stage via genome-wide gene expression analysis. This activation was delayed to the 8-cell stage in SCNT embryos. Specific gene expression analysis of the putative inner cell mass (ICM) and the trophectoderm (TE) revealed that pig and mouse pre-implantation embryos share regulatory networks during the first lineage segregation and primitive endoderm differentiation, but not during ectoderm commitment. Also, fatty acid metabolism appears to be a unique characteristic of pig pre-implantation embryonic development. In addition, the global gene expression patterns in the pig SCNT embryos were different from those in in vivo-derived pig embryos.
Our results provide a resource for pluripotent stem cell engineering and for understanding pig development.
PMCID: PMC3925986  PMID: 24383959
21.  Bulk segregant RNA-seq reveals expression and positional candidate genes and allele-specific expression for disease resistance against enteric septicemia of catfish 
BMC Genomics  2013;14:929.
The application of RNA-seq has accelerated gene expression profiling and identification of gene-associated SNPs in many species. However, the integrated studies of gene expression along with SNP mapping have been lacking. Coupling of RNA-seq with bulked segregant analysis (BSA) should allow correlation of expression patterns and associated SNPs with the phenotypes.
In this study, we demonstrated the use of bulked segregant RNA-seq (BSR-Seq) for the analysis of differentially expressed genes and associated SNPs with disease resistance against enteric septicemia of catfish (ESC). A total of 1,255 differentially expressed genes were found between resistant and susceptible fish. In addition, 56,419 SNPs residing on 4,304 unique genes were identified as significant SNPs between susceptible and resistant fish. Detailed analysis of these significant SNPs allowed differentiation of significant SNPs caused by genetic segregation and those caused by allele-specific expression. Mapping of the significant SNPs, along with analysis of differentially expressed genes, allowed identification of candidate genes underlying disease resistance against ESC disease.
This study demonstrated the use of BSR-Seq for the identification of genes involved in disease resistance against ESC through expression profiling and mapping of significantly associated SNPs. BSR-Seq is applicable to analysis of genes underlying various performance and production traits without significant investment in the development of large genotyping platforms such as SNP arrays.
PMCID: PMC3890627  PMID: 24373586
Bulk segregant analysis; RNA-seq; Disease resistance; Catfish; Allele-specific expression
22.  Genome wide association studies for body conformation traits in the Chinese Holstein cattle population 
BMC Genomics  2013;14:897.
Genome-wide association study (GWAS) is a powerful tool for revealing the genetic basis of quantitative traits. However, comparatively few studies have used GWAS for conformation traits of cattle. This study aims to use GWAS to find candidate genes for body conformation traits.
The Illumina BovineSNP50 BeadChip was used to identify single nucleotide polymorphisms (SNPs) that are associated with body conformation traits. A least absolute shrinkage and selection operator (LASSO) was applied to detect multiple SNPs simultaneously for 29 body conformation traits with 1,314 Chinese Holstein cattle and 52,166 SNPs. In total, 59 genome-wide significant SNPs associated with 26 conformation traits were detected by genome-wide association analysis; five SNPs were within previously reported QTL regions (Animal Quantitative Trait Loci (QTL) database) and 11 were very close to the reported SNPs. Twenty-two SNPs were located within annotated gene regions, while the remainder were 0.6–826 kb away from known genes. Some of the genes had clear biological functions related to conformation traits. By combining information about the previously reported QTL regions and the biological functions of the genes, we identified DARC, GAS1, MTPN, HTR2A, ZNF521, PDIA6, and TMEM130 as the most promising candidate genes for capacity and body depth, chest width, foot angle, angularity, rear leg side view, teat length, and animal size traits, respectively. We also found four SNPs that affected four pairs of traits, and the genetic correlation between each pair of traits ranged from 0.35 to 0.86, suggesting that these SNPs may have a pleiotropic effect on each pair of traits.
A total of 59 significant SNPs associated with 26 conformation traits were identified in the Chinese Holstein population. Six promising candidate genes were suggested, and four SNPs showed genetic correlation for four pairs of traits.
PMCID: PMC3879203  PMID: 24341352
Dairy cattle; GWAS; Body conformation traits; SNP; Holstein; QTL
23.  De novo transcriptome sequencing of radish (Raphanus sativus L.) and analysis of major genes involved in glucosinolate metabolism 
BMC Genomics  2013;14(1):836.
Radish (Raphanus sativus L.) is an important root vegetable crop worldwide. Glucosinolates in the fleshy taproot significantly affect the flavor and nutritional quality of radish. However, little is known about the molecular mechanisms underlying glucosinolate metabolism in radish taproots. The limited availability of radish genomic information has greatly hindered functional genomic analysis and molecular breeding in radish.
In this study, a high-throughput, large-scale RNA sequencing technology was employed to characterize the de novo transcriptome of radish roots at different stages of development. Approximately 66.11 million paired-end reads representing 73,084 unigenes with an N50 length of 1,095 bp, and a total length of 55.73 Mb were obtained. Comparison with the publicly available protein database indicates that a total of 67,305 (about 92.09% of the assembled unigenes) unigenes exhibit similarity (E-value ≤ 1.0e-5) to known proteins. The functional annotation and classification including Gene Ontology (GO), Clusters of Orthologous Group (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis revealed that the main activated genes in radish taproots are predominantly involved in basic physiological and metabolic processes, biosynthesis of secondary metabolite pathways, signal transduction mechanisms and other cellular components and molecular function related terms. The majority of the genes encoding enzymes involved in glucosinolate (GS) metabolism and regulation pathways were identified in the unigene dataset by targeted searches of their annotations. A number of candidate radish genes in the glucosinolate metabolism related pathways were also discovered, from which, eight genes were validated by T-A cloning and sequencing while four were validated by quantitative RT-PCR expression profiling.
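For readers unfamiliar with the N50 statistic quoted above, here is a minimal sketch of the usual definition: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the combined length. The function name `n50` and the toy lengths are assumptions for illustration; the abstract does not describe which assembler or script produced the reported value.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration: the N50 is the length L such
    that sequences of length >= L cover at least half of the total bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")

    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2

    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Made-up unigene lengths in bp, not data from the study.
    toy_lengths = [1200, 900, 1500, 400, 1095]
    print(n50(toy_lengths))  # prints 1200
```

In the toy data the total is 5,095 bp; the two longest sequences (1,500 and 1,200 bp) already cover more than half of it, so the N50 is 1,200 bp.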
The ensuing transcriptome dataset provides a comprehensive sequence resource for molecular genetics research in radish. It will serve as an important public information platform to further understanding of the molecular mechanisms involved in biosynthesis and metabolism of the related nutritional and flavor components during taproot formation in radish.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-14-836) contains supplementary material, which is available to authorized users.
PMCID: PMC4046679  PMID: 24279309
Radish; De novo assembly; RNA-Seq; Transcriptome; Glucosinolate metabolic pathways
24.  Transcriptome analysis of chicken kidney tissues following coronavirus avian infectious bronchitis virus infection 
BMC Genomics  2013;14:743.
Infectious bronchitis virus (IBV), a prototype of the Coronaviridae family, is an economically important causative agent of infectious bronchitis in chickens and causes an acute and highly contagious upper respiratory tract infection that may lead to nephritis. However, the molecular antiviral mechanisms of chickens to IBV infection remain poorly understood. In this study, we conducted global gene expression profiling of chicken kidney tissue after nephropathogenic IBV infection to better understand the interactions between host and virus.
IBV infection contributed to differential expression of 1777 genes, of which 876 were up-regulated and 901 down-regulated in the kidney compared with control chickens; 103 of these genes, associated with immune and inflammatory responses, may play important roles in the host defense response during IBV infection. Twelve of the altered immune-related genes were confirmed by real-time RT-PCR. Gene ontology category, KEGG pathway, and gene interaction networks (STRING analysis) were analyzed to identify relationships among differentially expressed genes involved in signal transduction, cell adhesion, immune responses, apoptosis regulation, positive regulation of the I-kappaB kinase/NF-kappaB cascade and response to cytokine stimulus. Most of these genes were related and formed a large network, in which IL6, STAT1, MYD88, IRF1 and NFKB2 were key genes.
Our results provided comprehensive knowledge regarding the host transcriptional response to IBV infection in chicken kidney tissues, thereby providing insight into IBV pathogenesis, particularly the involvement of innate immune pathway genes associated with IBV infection.
PMCID: PMC3870970  PMID: 24168272
Infectious bronchitis virus; Kidney; Microarray; Transcriptome
25.  Bolbase: a comprehensive genomics database for Brassica oleracea 
BMC Genomics  2013;14:664.
Brassica oleracea is a morphologically diverse species in the family Brassicaceae and contains a group of nutrition-rich vegetable crops, including common heading cabbage, cauliflower, broccoli, kohlrabi, kale, and Brussels sprouts. This diversity, along with its phylogenetic membership in a group of three diploid and three tetraploid species and the recent availability of genome sequences within Brassica, provides an unprecedented opportunity to study intra- and inter-species divergence and evolution in this species and its close relatives.
We have developed a comprehensive database, Bolbase, which provides access to the B. oleracea genome data and comparative genomics information. The whole genome of B. oleracea is available, including nine fully assembled chromosomes and 1,848 scaffolds, with 45,758 predicted genes, 13,382 transposable elements, and 3,581 non-coding RNAs. Comparative genomics information is available, including syntenic regions among B. oleracea, Brassica rapa and Arabidopsis thaliana, synonymous (Ks) and non-synonymous (Ka) substitution rates between orthologous gene pairs, gene families or clusters, and differences in quantity, category, and distribution of transposable elements on chromosomes. Bolbase provides useful search and data mining tools, including a keyword search, a local BLAST server, and a customized GBrowse tool, which can be used to extract annotations of genome components, identify similar sequences and visualize syntenic regions among species. Users can download all genomic data and explore comparative genomics in a highly visual setting.
Bolbase is the first resource platform for the B. oleracea genome and for genomic comparisons with its relatives, and thus it will help the research community to better study the function and evolution of Brassica genomes as well as enhance molecular breeding research. This database will be updated regularly with new features, improvements to genome annotation, and new genomic sequences as they become available. Bolbase is freely available at
PMCID: PMC3849793  PMID: 24079801
Brassica oleracea; Database; Genome sequence; Synteny; Comparative genomics
