Results 1-12 (12)

1.  Insight into the specific virulence related genes and toxin-antitoxin virulent pathogenicity islands in swine streptococcosis pathogen Streptococcus equi ssp. zooepidemicus strain ATCC35246 
BMC Genomics  2013;14:377.
Streptococcus equi ssp. zooepidemicus (S. zooepidemicus) is an important pathogen causing swine streptococcosis in China. Pathogenicity islands (PAIs) of S. zooepidemicus have been transferred among bacteria through horizontal gene transfer (HGT) and play important roles in the adaptation and increased virulence of S. zooepidemicus. The present study used comparative genomics to examine the different pathogenicities of S. zooepidemicus.
The genome of S. zooepidemicus ATCC35246 (Sz35246) comprises a single circular chromosome of 2,167,264 bp with a GC content of 41.65%. Comparative genome analysis of Sz35246, S. zooepidemicus MGCS10565 (Sz10565), Streptococcus equi ssp. equi 4047 (Se4047) and S. zooepidemicus H70 (Sz70) identified 320 Sz35246-specific genes, clustered into three toxin-antitoxin (TA) system PAIs and one restriction modification system (RM system) PAI. These four acquired PAIs encode proteins that may contribute to the overall pathogenic capacity and fitness of this bacterium to adapt to different hosts. Analysis of the in vivo and in vitro transcriptomes of this bacterium revealed differentially expressed PAI genes and non-PAI genes, suggesting that Sz35246 possesses mechanisms for infecting animals and adapting to a wide range of host environments. Analysis of the genome identified potential Sz35246 virulence genes. Genes of the Fim III operon were presumed to be involved in breaking the host-restriction of Sz35246.
Genome-wide comparisons of Sz35246 with three other strains and transcriptome analysis revealed novel genes related to bacterial virulence and breaking the host-restriction. Four specific PAIs, which were judged to have been transferred into the Sz35246 genome through HGT, were identified for the first time. Further analysis of the TA and RM systems in the PAIs will improve our understanding of the pathogenicity of this bacterium and could lead to the development of diagnostics and vaccines.
PMCID: PMC3750634  PMID: 23742619
2.  Mining genes involved in the stratification of Paris Polyphylla seeds using high-throughput embryo Transcriptome sequencing 
BMC Genomics  2013;14:358.
Paris polyphylla var. yunnanensis is an important medicinal plant. Seed dormancy is one of the main factors restricting artificial cultivation. The molecular mechanisms of seed dormancy remain unclear, and little genomic or transcriptome data are available for this plant.
In this study, massively parallel pyrosequencing on the Roche 454-GS FLX Titanium platform was used to generate a substantial sequence dataset for the P. polyphylla embryo. A total of 369,496 high-quality reads were obtained, ranging from 50 to 1146 bp, with a mean of 219 bp. These reads were assembled into 47,768 unigenes, which included 16,069 contigs and 31,699 singletons. Using BLASTX searches of public databases, 15,757 (32.3%) unique transcripts were identified. Gene Ontology and Cluster of Orthologous Groups of proteins annotations revealed that these transcripts were broadly representative of the P. polyphylla embryo transcriptome. The Kyoto Encyclopedia of Genes and Genomes assigned 5961 of the unique sequences to specific metabolic pathways. Relative expression analysis showed that eleven phytohormone-related genes and five other genes have different expression patterns in the embryo and endosperm during seed stratification.
Gene annotation and quantitative RT-PCR expression analysis identified 464 transcripts that may be involved in phytohormone catabolism and biosynthesis, hormone signaling, seed dormancy, seed maturation, cell wall growth and circadian rhythms. In particular, the relative expression analysis of sixteen genes (CYP707A, NCED, GA20ox2, GA20ox3, ABI2, PP2C, ARP3, ARP7, IAAH, IAAS, BRRK, DRM, ELF1, ELF2, SFR6, and SUS) in embryo and endosperm and at two temperatures indicated that these genes may be candidates for clarifying the molecular basis of seed dormancy in P. polyphylla var. yunnanensis.
PMCID: PMC3679829  PMID: 23718911
Embryo; Stratification; Seed dormancy; High-throughput sequencing; Paris polyphylla
3.  Detection and genotyping of restriction fragment associated polymorphisms in polyploid crops with a pseudo-reference sequence: a case study in allotetraploid Brassica napus 
BMC Genomics  2013;14:346.
The presence of homoeologous sequences and absence of a reference genome sequence make discovery and genotyping of single nucleotide polymorphisms (SNPs) more challenging in polyploid crops.
To address this challenge, we constructed reduced representation libraries (RRLs) for two Brassica napus inbred lines and their 91 doubled haploid (DH) progenies using a modified ddRADseq technique. A bioinformatics pipeline termed RFAPtools was developed to discover and genotype SNPs and presence/absence variations (PAVs). Using this pipeline, a pseudo-reference sequence (PRF) containing 180,991 sequence tags was constructed. By aligning sequence reads to the pseudo-reference sequence, allelic SNPs as well as PAVs were identified and genotyped with RFAPtools. Two parallel linkage maps, one SNP bin map containing 8,780 SNP loci and one PAV linkage map containing 12,423 dominant loci, were constructed. By aligning marker sequences to the sequence scaffolds of B. rapa, whose genome is available, we assigned 44 unassembled sequence scaffolds comprising 8.15 Mb to the B. rapa chromosomes, and also identified 14 instances of misassembly and eight instances of mis-ordered sequence scaffolds.
These results indicate that the modified ddRADseq approach is a cost-effective and simple method for genotyping tens of thousands of SNP and PAV markers in a polyploid plant species. The results also demonstrate that RFAPtools, developed in this study, is a powerful tool for mining allelic SNPs from homoeologous sequences in polyploids, and is therefore generally applicable to either diploid or polyploid species with or without a reference genome sequence.
PMCID: PMC3665465  PMID: 23706002
Polyploid crops; Brassica napus; Pseudo-reference sequence; Single nucleotide polymorphism; Presence/absence variation
4.  Comparative analysis of mitochondrial genomes between a wheat K-type cytoplasmic male sterility (CMS) line and its maintainer line 
BMC Genomics  2011;12:163.
Plant mitochondria, semiautonomous organelles that function as manufacturers of cellular ATP, have their own genome that has a slow rate of evolution and rapid rearrangement. Cytoplasmic male sterility (CMS), a common phenotype in higher plants, is closely associated with rearrangements in mitochondrial DNA (mtDNA), and is widely used to produce F1 hybrid seeds in a variety of valuable crop species. Novel chimeric genes deduced from mtDNA rearrangements causing CMS have been identified in several plants, such as rice, sunflower, pepper, and rapeseed, but there are very few reports about mtDNA rearrangements in wheat. In the present work, we describe the mitochondrial genome of a wheat K-type CMS line and compare it with its maintainer line.
The complete mtDNA sequence of a wheat K-type (with cytoplasm of Aegilops kotschyi) CMS line, Ks3, was assembled into a master circle (MC) molecule of 647,559 bp and found to harbor 34 known protein-coding genes, three rRNAs (18S, 26S, and 5S rRNAs), and 16 different tRNAs. Compared to our previously published sequence of a K-type maintainer line, Km3, we detected Ks3-specific mtDNA (> 100 bp, 11.38%) and repeats (> 100 bp, 29 units) as well as genes that are unique to each line: rpl5 was missing in Ks3 and trnH was absent from Km3. We also defined 32 single nucleotide polymorphisms (SNPs) in 13 protein-coding, albeit functionally irrelevant, genes, and predicted 22 unique ORFs in Ks3, representing potential candidates for K-type CMS. All these sequence variations are candidates for involvement in CMS. A comparative analysis of the mtDNA of several angiosperms, including those from Ks3, Km3, rice, maize, Arabidopsis thaliana, and rapeseed, showed that non-coding sequences underwent the most frequent and divergent reorganizations during mtDNA evolution in higher plants.
The complete mitochondrial genome of the wheat K-type CMS line Ks3 is very different from that of its maintainer line Km3, especially in non-coding sequences. Sequence rearrangement has produced novel chimeric ORFs, which may be candidate genes for CMS. Comparative analysis of several angiosperm mtDNAs indicated that non-coding sequences are the most frequently reorganized during mtDNA evolution in higher plants.
PMCID: PMC3079663  PMID: 21443807
5.  Transcriptome and expression profiling analysis revealed changes of multiple signaling pathways involved in immunity in the large yellow croaker during Aeromonas hydrophila infection 
BMC Genomics  2010;11:506.
The large yellow croaker (Pseudosciaena crocea) is an economically important marine fish in China suffering from severe outbreaks of infectious disease caused by marine bacteria such as Aeromonas hydrophila (A. hydrophila), resulting in great economic losses. However, the mechanisms involved in the immune response of this fish to bacterial infection are not fully understood. To understand the molecular mechanisms underlying the immune response to such pathogenic bacteria, we used high-throughput deep sequencing technology to investigate the transcriptome and comparative expression profiles of the large yellow croaker infected with A. hydrophila.
A total of 13,611,340 reads were obtained from the A. hydrophila-infected large yellow croaker and assembled into 26,313 scaffolds. Via annotation to the NCBI database, we obtained 8216 identified unigenes. In total, 5590 (68%) unigenes were classified into Gene Ontology, and 3094 unigenes were found in 20 KEGG categories. These genes included representatives from almost all functional categories. By using Solexa/Illumina's DeepSAGE, 1996 differentially expressed genes (P value < 0.05) were detected in comparative analysis of the expression profiles between A. hydrophila-infected fish and control fish, including 727 remarkably upregulated genes and 489 remarkably downregulated genes. Dramatic differences were observed in genes involved in the inflammatory response. Bacterial infection affected the gene expression of many components of signaling cascades, including the Toll-like receptor, JAK-STAT, and MAPK pathways. Genes encoding factors involved in T cell receptor (TCR) signaling were also revealed to be regulated by infection in these fish.
Based on our results, we conclude that the inflammatory response may play an important role in the early stages of infection. The signaling cascades such as the Toll-like receptor, JAK-STAT, and MAPK pathways are regulated by A. hydrophila infection. Interestingly, genes encoding factors involved in TCR signaling were revealed to be downregulated by infection, indicating that TCR signaling was suppressed at this early period. These results revealed changes of multiple signaling pathways involved in immunity during A. hydrophila infection, which will facilitate our comprehensive understanding of the mechanisms involved in the immune response to bacterial infection in the large yellow croaker.
PMCID: PMC2997002  PMID: 20858287
6.  Genome evolution driven by host adaptations results in a more virulent and antimicrobial-resistant Streptococcus pneumoniae serotype 14 
BMC Genomics  2009;10:158.
Streptococcus pneumoniae serotype 14 is one of the most common pneumococcal serotypes that cause invasive pneumococcal diseases worldwide. Serotype 14 often expresses resistance to a variety of antimicrobial agents, resulting in difficulties in treatment. To gain insight into the evolution of virulence and antimicrobial resistance traits in S. pneumoniae from the genome level, we sequenced the entire genome of a serotype 14 isolate (CGSP14), and carried out comprehensive comparison with other pneumococcal genomes. Multiple serotype 14 clinical isolates were also genotyped by multilocus sequence typing (MLST).
Comparative genomic analysis revealed that CGSP14 acquired a number of new genes by horizontal gene transfer (HGT), most of which were associated with virulence and antimicrobial resistance and clustered in mobile genetic elements. The most remarkable feature is the acquisition of two conjugative transposons and one resistance island encoding eight resistance genes. Results of MLST suggested that the major driving force for genome evolution is environmental drug pressure.
The genome sequence of S. pneumoniae serotype 14 shows a bacterium with rapid adaptations to its lifecycle in the human community. These include a versatile genome content, with a wide range of mobile elements, and chromosomal rearrangement; the latter re-balanced the genome after events of HGT.
PMCID: PMC2678160  PMID: 19361343
7.  Analysis of tarantula skeletal muscle protein sequences and identification of transcriptional isoforms 
BMC Genomics  2009;10:117.
Tarantula has been used as a model system for studying skeletal muscle structure and function, yet data on the genes expressed in tarantula muscle are lacking.
We constructed a cDNA library from Aphonopelma sp. (Tarantula) skeletal muscle and obtained 2507 high-quality 5' ESTs (expressed sequence tags) from randomly picked clones. EST analysis showed 305 unigenes, among which 81 had more than 2 ESTs. Twenty abundant unigenes had matches to skeletal muscle-related genes including actin, myosin, tropomyosin, troponin-I, T and C, paramyosin, muscle LIM protein, muscle protein 20, α-actinin and tandem Ig/Fn motifs (found in giant sarcomere-related proteins). Matches to myosin light chain kinase and calponin were also identified. These results support the existence of both actin-linked and myosin-linked regulation in tarantula skeletal muscle.
We have predicted full-length as well as partial cDNA sequences both experimentally and computationally for myosin heavy and light chains, actin, tropomyosin, and troponin-I, T and C, and have deduced the putative peptides. A preliminary analysis of the structural and functional properties was also carried out. Sequence similarities suggested multiple isoforms of most myofibrillar proteins, supporting the generality of multiple isoforms known from previous muscle sequence studies. This may be related to a mix of muscle fiber types.
The present study serves as a basis for defining the transcriptome of tarantula skeletal muscle, for future in vitro expression of tarantula proteins, and for interpreting structural and functional observations in this model species.
PMCID: PMC2674065  PMID: 19298669
8.  A gene catalogue for post-diapause development of an anhydrobiotic arthropod Artemia franciscana 
BMC Genomics  2009;10:52.
Diapause is a reversible state of developmental suspension and is found among diverse taxa, from plants to animals, including marsupials and some other mammals. Although previous work has accumulated ample data, the molecular mechanisms underlying diapause and reactivation from it remain elusive.
Using Artemia franciscana, a model organism for studying the development of post-diapause embryos in arthropods, we sequenced random clones up to a total of 28,039 ESTs from four cDNA libraries made from dehydrated cysts and three time points after rehydration/reactivation, which were assembled into 8,018 unigene clusters. We identified 324 differentially-expressed genes (DEGs, P < 0.05) based on pairwise comparisons of the four cDNA libraries. We identified a group of genes that are involved in an anti-water-deficit system, including proteases, protease inhibitors, heat shock proteins, and several novel members of the late embryogenesis abundant (LEA) protein family. In addition, we classified most of the up-regulated genes after cyst reactivation into metabolism, biosynthesis, transcription, and translation, and this result is consistent with the rapid development of the embryo. The specific expression of some DEGs was confirmed experimentally by quantitative real-time PCR.
We found that the first 5-hour period after rehydration is most important for embryonic reactivation of Artemia. As the total number of expressed genes increases significantly, the majority of DEGs were also identified in this period, including a group of water-deficit-induced genes. A group of genes with similar functions have been described in plant seeds; for instance, one of the novel LEA members shares ~70% amino-acid identity with an Arabidopsis EM (embryonic abundant) protein, the closest animal relative to plant LEA families identified thus far. Our findings also suggested that not only nutrients but also mRNAs are produced and stored during cyst formation to support rapid development after reactivation.
PMCID: PMC2649162  PMID: 19173719
9.  Complete genome of Phenylobacterium zucineum – a novel facultative intracellular bacterium isolated from human erythroleukemia cell line K562 
BMC Genomics  2008;9:386.
Phenylobacterium zucineum is a recently identified facultative intracellular species isolated from the human leukemia cell line K562. Unlike the known intracellular pathogens, P. zucineum maintains a stable association with its host cell without affecting the growth and morphology of the latter.
Here, we report the whole genome sequence of the type strain HLK1T. The genome consists of a circular chromosome (3,996,255 bp) and a circular plasmid (382,976 bp). It encodes 3,861 putative proteins, 42 tRNAs, and a 16S-23S-5S rRNA operon. Comparative genomic analysis revealed that it is phylogenetically closest to Caulobacter crescentus, a model species for cell cycle research. Notably, P. zucineum has a gene that is strikingly similar, both structurally and functionally, to the cell cycle master regulator CtrA of C. crescentus, and most of the genes directly regulated by CtrA in the latter have orthologs in the former.
This work presents the first complete bacterial genome in the genus Phenylobacterium. Comparative genomic analysis indicated that the CtrA regulon is well conserved between C. crescentus and P. zucineum.
PMCID: PMC2529317  PMID: 18700039
10.  A complete mitochondrial genome sequence of the wild two-humped camel (Camelus bactrianus ferus): an evolutionary history of camelidae 
BMC Genomics  2007;8:241.
The family Camelidae that evolved in North America during the Eocene survived with two distinct tribes, Camelini and Lamini. To investigate the evolutionary relationship between them and to further understand the evolutionary history of this family, we determined the complete mitochondrial genome sequence of the wild two-humped camel (Camelus bactrianus ferus), the only wild survivor of the Old World camel.
The mitochondrial genome sequence (16,680 bp) from C. bactrianus ferus contains 13 protein-coding, two rRNA, and 22 tRNA genes as well as a typical control region; this basic structure is shared by all metazoan mitochondrial genomes. Its protein-coding region exhibits codon usage common to all mammals and possesses the three cryptic stop codons shared by all vertebrates. C. bactrianus ferus, like the rest of the mammalian species, does not share a triplet nucleotide insertion (GCC) that encodes a proline residue found only in the nd1 gene of the New World camelid Lama pacos. The fact that this lineage-specific insertion in the L. pacos mtDNA occurred after the split between the Old and New World camelids suggests that it may have functional implications, since a proline insertion in a protein backbone usually alters protein conformation significantly, and the nd1 gene has not been found to be as polymorphic as the rest of the ND family genes among camelids. Our phylogenetic study based on complete mitochondrial genomes excluding the control region suggested that the divergence of the two tribes may have occurred in the early Miocene, much earlier than what was deduced from the fossil record (11 million years). An evolutionary history reconstructed for the family Camelidae based on cytb sequences suggested that the split of bactrian camel and dromedary may have occurred in North America before the tribe Camelini migrated from North America to Asia.
Molecular clock analysis of complete mitochondrial genomes from C. bactrianus ferus and L. pacos suggested that the two tribes diverged from their common ancestor about 25 million years ago, much earlier than what was predicted based on fossil records.
PMCID: PMC1939714  PMID: 17640355
11.  Transcriptome analysis of Deinagkistrodon acutus venomous gland focusing on cellular structure and functional aspects using expressed sequence tags 
BMC Genomics  2006;7:152.
The snake venom gland is a specialized organ that synthesizes and secretes complex and abundant toxin proteins. Though gene expression in the snake venom gland has been extensively studied, the focus has been on the components of the venom. As far as the molecular mechanism of toxin secretion and metabolism is concerned, we still know little. A fundamental question therefore arises: which genes are expressed in the snake venom gland besides the many toxin components?
To examine extensively the transcripts expressed in the venom gland of Deinagkistrodon acutus and unveil the potential of its products on cellular structure and functional aspects, we generated 8696 expressed sequence tags (ESTs) from a non-normalized cDNA library. All ESTs were clustered into 3416 clusters, of which 40.16% of total ESTs belong to recognized toxin-coding sequences; 39.85% are similar to cellular transcripts; and 20.00% have no significant similarity to any known sequences. By analyzing cellular functional transcripts, we found high expression of some venom-related genes and gland-specific genes, such as the calglandulin EF-hand protein gene and the protein disulfide isomerase gene. The transcripts of creatine kinase and NADH dehydrogenase were also identified at high levels. Moreover, abundant cellular structural proteins similar to those of mammalian muscle tissues were also identified. The phylogenetic analysis of two snake venom toxin families, group III metalloproteinase and serine protease, in suborder Colubroidea showed an early single recruitment event in the evolutionary process of viperids.
Gene cataloguing and profiling of the venom gland of Deinagkistrodon acutus is an essential prerequisite for providing molecular reagents for the functional genomic studies needed to elucidate the mechanisms of action of toxins and to survey physiological events taking place in this very specialized secretory tissue. This study thus provides the first global view of the genetic programs of the venom gland of Deinagkistrodon acutus and insight into the molecular mechanism of toxin secretion.
All sequence data reported in this paper have been submitted to the public database [GenBank: DV556511-DV565206].
PMCID: PMC1525187  PMID: 16776837
12.  Pigs in sequence space: A 0.66X coverage pig genome survey based on shotgun sequencing 
BMC Genomics  2005;6:70.
Comparative whole genome analysis of Mammalia can benefit from the addition of more species. The pig is an obvious choice due to its economic and medical importance as well as its evolutionary position in the artiodactyls.
We have generated ~3.84 million shotgun sequences (0.66X coverage) from the pig genome. The data are hereby released (NCBI Trace repository with center name "SDJVP", and project name "Sino-Danish Pig Genome Project") together with an initial evolutionary analysis.
The non-repetitive fraction of the sequences was aligned to the UCSC human-mouse alignment and the resulting three-species alignments were annotated using the human genome annotation. Ultra-conserved elements and miRNAs were identified. The results show that for each of these types of orthologous data, pig is much closer to human than mouse is. Purifying selection has been more efficient in pig compared to human, but not as efficient as in mouse, and pig seems to have an isochore structure most similar to the structure in human.
The addition of the pig to the set of species sequenced at low coverage adds to the understanding of selective pressures that have acted on the human genome by bisecting the evolutionary branch between human and mouse, with the mouse branch being approximately 3 times as long as the human branch. Additionally, the joint alignment of the shotgun sequences to the human-mouse alignment offers the investigator a rapid way to define specific regions for analysis and resequencing.
PMCID: PMC1142312  PMID: 15885146
