1.  Increasing gene discovery and coverage using RNA-seq of globin RNA reduced porcine blood samples 
BMC Genomics  2014;15(1):954.
Transcriptome analysis of porcine whole blood has several applications, which include deciphering genetic mechanisms for host responses to viral infection and vaccination. The abundance of alpha- and beta-globin transcripts in blood, however, impedes the ability to cost-effectively detect transcripts of low abundance. Although protocols exist for reduction of globin transcripts from human and mouse/rat blood, preliminary work demonstrated that these are not useful for globin reduction (GR) in porcine blood. Our objectives were to develop a porcine-specific GR protocol and to evaluate the GR effects on gene discovery and sequence read coverage in RNA-sequencing (RNA-seq) experiments.
A GR protocol for porcine blood samples was developed using RNase H with antisense oligonucleotides specifically targeting porcine hemoglobin alpha (HBA) and beta (HBB) mRNAs. Whole blood samples (n = 12) collected in Tempus tubes were used for evaluating the efficacy and effects of GR on RNA-seq. The HBA and HBB mRNA transcripts comprised an average of 46.1% of the mapped reads in pre-GR samples, but these reads were reduced to an average of 8.9% in post-GR samples. Differential gene expression analysis showed that the expression levels of 11,046 genes were increased, whereas 34 genes, excluding HBA and HBB, showed decreased expression after GR (FDR <0.05). An additional 815 genes were detected only in post-GR samples.
Our porcine-specific GR primers and protocol minimize the number of reads of globin transcripts in whole blood samples and provide increased coverage as well as accuracy and reproducibility of transcriptome analysis. Increased detection of low abundance mRNAs will ensure that studies relying on transcriptome analyses do not miss information that may be vital to the success of the study.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-954) contains supplementary material, which is available to authorized users.
PMCID: PMC4230834  PMID: 25374277
Pig; Blood; Globin reduction; RNA-seq; Transcriptome
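The globin percentages reported in entry 1 (share of mapped reads assigned to HBA and HBB before and after GR) can be illustrated with a minimal Python sketch. The function name, the gene labels used as dictionary keys, and the toy read counts are all hypothetical and are not taken from the study; the abstract does not describe how the authors computed this figure, so this shows only the basic arithmetic.

```python
def globin_fraction(counts, globin_genes=("HBA", "HBB")):
    """Return the fraction of mapped reads assigned to globin genes.

    counts: dict mapping gene name -> number of mapped reads (hypothetical input).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped reads")
    globin = sum(counts.get(gene, 0) for gene in globin_genes)
    return globin / total


# Toy counts for illustration only (not data from the study).
pre_gr = {"HBA": 300, "HBB": 160, "GENE_X": 540}
post_gr = {"HBA": 50, "HBB": 40, "GENE_X": 910}

print(f"pre-GR globin share:  {globin_fraction(pre_gr):.1%}")
print(f"post-GR globin share: {globin_fraction(post_gr):.1%}")
```

In practice the per-sample fractions would be averaged across all samples to obtain summary values comparable to those quoted in the abstract.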
2.  Gene co-expression network analysis identifies porcine genes associated with variation in Salmonella shedding 
BMC Genomics  2014;15(1):452.
Salmonella enterica serovar Typhimurium is a gram-negative bacterium that can colonise the gut of humans and several species of food producing farm animals to cause enteric or septicaemic salmonellosis. While many studies have looked into the host genetic response to Salmonella infection, relatively few have used correlation of shedding traits with gene expression patterns to identify genes whose variable expression among different individuals may be associated with differences in Salmonella clearance and resistance. Here, we aimed to identify porcine genes and gene co-expression networks that differentiate distinct responses to Salmonella challenge with respect to faecal Salmonella shedding.
Peripheral blood transcriptome profiles from 16 pigs belonging to extremes of the trait of faecal Salmonella shedding counts recorded up to 20 days post-inoculation (low shedders (LS), n = 8; persistent shedders (PS), n = 8) were generated using RNA-sequencing from samples collected just before (day 0) and two days after (day 2) Salmonella inoculation. Weighted gene co-expression network analysis (WGCNA) of day 0 samples identified four modules of co-expressed genes significantly correlated with Salmonella shedding counts upon future challenge. Two of those modules consisted largely of innate immunity related genes, many of which were significantly up-regulated at day 2 post-inoculation. The connectivity at both days and the mean gene-wise expression levels at day 0 of the genes within these modules were higher in networks constructed using LS samples alone than those using PS alone. Genes within these modules include those previously reported to be involved in Salmonella resistance such as SLC11A1 (formerly NRAMP1), TLR4, CD14 and CCR1 and those for which an association with Salmonella is novel, for example, SIGLEC5, IGSF6 and TNFSF13B.
Our analysis integrates gene co-expression network analysis, gene-trait correlations and differential expression to provide new candidate regulators of Salmonella shedding in pigs. The comparatively higher expression (also confirmed in an independent dataset) and the significantly higher connectivity of genes within the Salmonella shedding associated modules in LS compared to PS even before Salmonella challenge may be factors that contribute to the decreased faecal Salmonella shedding observed in LS following challenge.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-452) contains supplementary material, which is available to authorized users.
PMCID: PMC4070558  PMID: 24912583
3.  Expansion of ruminant-specific microRNAs shapes target gene expression divergence between ruminant and non-ruminant species 
BMC Genomics  2013;14:609.
Understanding how species-specific microRNAs (miRNAs) contribute to species-specific phenotypes is a central topic in biology. This study aimed to elucidate the role of ruminant-specific miRNAs in shaping mRNA expression divergence between ruminant and non-ruminant species.
We analyzed miRNA and mRNA transcriptomes generated by Illumina sequencing from whole blood samples of cattle and a closely related non-ruminant species, pig. We found evidence of expansion of cattle-specific miRNAs by analyzing miRNA conservation among 57 vertebrate species. The emergence of cattle-specific miRNAs was accompanied by accelerated sequence evolution at their target sites. Further, the target genes of cattle-specific miRNAs show markedly reduced expression compared to their pig and human orthologues. We found that target genes with conserved or non-conserved target sites of cattle-specific miRNAs exhibit reduced expression. One of the significantly enriched KEGG pathway terms for the target genes of the cattle-specific miRNAs is the insulin signalling pathway, raising the possibility that some of these miRNAs may modulate insulin resistance in ruminants.
We provide evidence of rapid miRNA-mediated regulatory evolution in the ruminant lineage. Cattle-specific miRNAs play an important role in shaping gene expression divergence between ruminant and non-ruminant species, by influencing the expression of target genes through both conserved and cattle-specific target sites.
PMCID: PMC3847189  PMID: 24020371
4.  Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity 
BMC Genomics  2013;14:519.
Hanwoo (Korean cattle), which originated from natural crossbreeding between taurine and zebu cattle, migrated to the Korean peninsula through North China. Hanwoo were raised as draft animals until the 1970s without the introduction of foreign germplasm. Since 1979, Hanwoo has been bred as beef cattle. Genetic variation was analyzed by whole-genome deep resequencing of a Hanwoo bull. The Hanwoo genome was compared to that of two other breeds, Black Angus and Holstein, and genes within regions of homozygosity were investigated to elucidate the genetic and genomic characteristics of Hanwoo.
The Hanwoo bull genome was sequenced to 45.6-fold coverage using the ABI SOLiD system. In total, 4.7 million single-nucleotide polymorphisms and 0.4 million small indels were identified by comparison with the Btau4.0 reference assembly. Of the total number of SNPs and indels, 58% and 87%, respectively, were novel. The overall genotype concordance between the SNPs and BovineSNP50 BeadChip data was 96.4%. Of 1.6 million genetic differences in Hanwoo, approximately 25,000 non-synonymous SNPs, splice-site variants, and coding indels (NS/SS/Is) were detected in 8,360 genes. Among 1,045 genes containing reliable specific NS/SS/Is in Hanwoo, 109 genes contained more than one novel damaging NS/SS/I. Of the genes containing NS/SS/Is, 610 genes were assigned as trait-associated genes. Moreover, 16, 78, and 51 regions of homozygosity (ROHs) were detected in Hanwoo, Black Angus, and Holstein, respectively. ‘Regulation of actin filament length’ was revealed as a significant gene ontology term and 25 trait-associated genes for meat quality and disease resistance were found in 753 genes that resided in the ROHs of Hanwoo. In Hanwoo, 43 genes located in ROHs common to whole-genome resequencing and SNP chips on BTA2, 10, and 13 coincided with quantitative trait loci for meat fat traits. In addition, the common ROHs in BTA2 and 16 were in agreement between Hanwoo and Black Angus.
We identified 4.7 million SNPs and 0.4 million small indels by whole-genome resequencing of a Hanwoo bull. Approximately 25,000 non-synonymous SNPs, splice-site variants, and coding indels (NS/SS/Is) were detected in 8,360 genes. Additionally, we found 25 trait-associated genes for meat quality and disease resistance among 753 genes that resided in the ROHs of Hanwoo. These findings will provide useful genomic information for identifying genes or causal mutations associated with economically important traits in cattle.
PMCID: PMC3750754  PMID: 23899338
Hanwoo; Resequencing; NS/SS/I; ROH
5.  Comparative analysis of two phenotypically-similar but genomically-distinct Burkholderia cenocepacia-specific bacteriophages 
BMC Genomics  2012;13:223.
Genomic analysis of bacteriophages infecting the Burkholderia cepacia complex (BCC) is an important preliminary step in the development of a phage therapy protocol for these opportunistic pathogens. The objective of this study was to characterize KL1 (vB_BceS_KL1) and AH2 (vB_BceS_AH2), two novel Burkholderia cenocepacia-specific siphoviruses isolated from environmental samples.
KL1 and AH2 exhibit several unique phenotypic similarities: they infect the same B. cenocepacia strains, they require prolonged incubation at 30°C for the formation of plaques at low titres, and they do not form plaques at similar titres following incubation at 37°C. However, despite these similarities, we have determined using whole-genome pyrosequencing that these phages show minimal relatedness to one another. The KL1 genome is 42,832 base pairs (bp) in length and is most closely related to Pseudomonas phage 73 (PA73). In contrast, the AH2 genome is 58,065 bp in length and is most closely related to Burkholderia phage BcepNazgul. Using both BLASTP and HHpred analysis, we have identified and analyzed the putative virion morphogenesis, lysis, DNA binding, and MazG proteins of these two phages. Notably, MazG homologs identified in cyanophages have been predicted to facilitate infection of stationary phase cells and may contribute to the unique plaque phenotype of KL1 and AH2.
The nearly indistinguishable phenotypes but distinct genomes of KL1 and AH2 provide further evidence of both vast diversity and convergent evolution in the BCC-specific phage population.
PMCID: PMC3483164  PMID: 22676492
6.  Comparing thousands of circular genomes using the CGView Comparison Tool 
BMC Genomics  2012;13:202.
Continued sequencing efforts coupled with advances in sequencing technology will lead to the completion of a vast number of small genomes. Whole-genome comparisons represent an important part of the analysis of any new genome sequence, as they can provide a better understanding of the biology and evolution of the source organism. Visualization of the results is important, as it allows information from a variety of sources to be integrated and interpreted. However, existing graphical comparison tools lack features needed for efficiently comparing a new genome to hundreds or thousands of existing sequences. Moreover, existing tools are limited in terms of the types of comparisons that can be performed, the extent to which the output can be customized, and the ease with which the entire process can be automated.
The CGView Comparison Tool (CCT) is a package for visually comparing bacterial, plasmid, chloroplast, or mitochondrial sequences of interest to existing genomes or sequence collections. The comparisons are conducted using BLAST, and the BLAST results are presented in the form of graphical maps that can also show sequence features, gene and protein names, COG (Clusters of Orthologous Groups of proteins) category assignments, and sequence composition characteristics. CCT can generate maps in a variety of sizes, including 400 Megapixel maps suitable for posters. Comparisons can be conducted within a particular species or genus, or all available genomes can be used. The entire map creation process, from downloading sequences to redrawing zoomed maps, can be completed easily using scripts included with the CCT. User-defined features or analysis results can be included on maps, and maps can be extensively customized. To simplify program setup, a CCT virtual machine that includes all dependencies preinstalled is available. Detailed tutorials illustrating the use of CCT are included with the CCT documentation.
CCT can be used to visually compare a reference sequence to thousands of existing genomes or sequence collections (next-generation sequencing reads for example) on a standard desktop computer. It provides analysis and visualization functionality not available in any existing circular genome visualization tool. By visually presenting sequence conservation information along with functional classifications and sequence composition characteristics, CCT can be a useful tool for identifying rapidly evolving or novel sequences, horizontally transferred sequences, or unusual functional properties in newly sequenced genomes. CCT is freely available for download at
PMCID: PMC3469350  PMID: 22621371
7.  Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery 
BMC Genomics  2011;12:559.
One of the goals of livestock genomics research is to identify the genetic differences responsible for variation in phenotypic traits, particularly those of economic importance. Characterizing the genetic variation in livestock species is an important step towards linking genes or genomic regions with phenotypes. The completion of the bovine genome sequence and recent advances in DNA sequencing technology allow for in-depth characterization of the genetic variations present in cattle. Here we describe the whole-genome resequencing of two Bos taurus bulls from distinct breeds for the purpose of identifying and annotating novel forms of genetic variation in cattle.
The genomes of a Black Angus bull and a Holstein bull were sequenced to 22-fold and 19-fold coverage, respectively, using the ABI SOLiD system. Comparisons of the sequences with the Btau4.0 reference assembly yielded 7 million single nucleotide polymorphisms (SNPs), 24% of which were identified in both animals. Of the total SNPs found in Holstein, Black Angus, and in both animals, 81%, 81%, and 75% respectively are novel. In-depth annotations of the data identified more than 16 thousand distinct non-synonymous SNPs (85% novel) between the two datasets. Alignments between the SNP-altered proteins and orthologues from numerous species indicate that many of the SNPs alter well-conserved amino acids. Several SNPs predicted to create or remove stop codons were also found. A comparison between the sequencing SNPs and genotyping results from the BovineHD high-density genotyping chip indicates a detection rate of 91% for homozygous SNPs and 81% for heterozygous SNPs. The false positive rate is estimated to be about 2% for both the Black Angus and Holstein SNP sets, based on follow-up genotyping of 422 and 427 SNPs, respectively. Comparisons of read depth between the two bulls along the reference assembly identified 790 putative copy-number variations (CNVs). Ten randomly selected CNVs, five genic and five non-genic, were successfully validated using quantitative real-time PCR. The CNVs are enriched for immune system genes and include genes that may contribute to lactation capacity. The majority of the CNVs (69%) were detected as regions with higher abundance in the Holstein bull.
Substantial genetic differences exist between the Black Angus and Holstein animals sequenced in this work and the Hereford reference sequence, and some of this variation is predicted to affect evolutionarily conserved amino acids or gene copy number. The deeply annotated SNPs and CNVs identified in this resequencing study can serve as useful genetic tools, and as candidates in searches for phenotype-altering DNA differences.
PMCID: PMC3229636  PMID: 22085807
8.  Genomic analysis and relatedness of P2-like phages of the Burkholderia cepacia complex 
BMC Genomics  2010;11:599.
The Burkholderia cepacia complex (BCC) is comprised of at least seventeen Gram-negative species that cause infections in cystic fibrosis patients. Because BCC bacteria are broadly antibiotic resistant, phage therapy is currently being investigated as a possible alternative treatment for these infections. The purpose of our study was to sequence and characterize three novel BCC-specific phages: KS5 (vB_BceM-KS5 or vB_BmuZ-ATCC 17616), KS14 (vB_BceM-KS14) and KL3 (vB_BamM-KL3 or vB_BceZ-CEP511).
KS5, KS14 and KL3 are myoviruses with the A1 morphotype. The genomes of these phages are between 32317 and 40555 base pairs in length and are predicted to encode between 44 and 52 proteins. These phages have over 50% of their proteins in common with enterobacteria phage P2 and so can be classified as members of the Peduovirinae subfamily and the "P2-like viruses" genus. The BCC phage proteins similar to those encoded by P2 are predominantly structural components involved in virion morphogenesis. As prophages, KS5 and KL3 integrate into an AMP nucleosidase gene and a threonine tRNA gene, respectively. Unlike other P2-like viruses, the KS14 prophage is maintained as a plasmid. The P2 E+E' translational frameshift site is conserved among these three phages and so they are predicted to use frameshifting for expression of two of their tail proteins. The lysBC genes of KS14 and KL3 are similar to those of P2, but in KS5 the organization of these genes suggests that they may have been acquired via horizontal transfer from a phage similar to λ. KS5 contains two sequence elements that are unique among these three phages: an ISBmu2-like insertion sequence and a reverse transcriptase gene. KL3 encodes an EcoRII-C endonuclease/methylase pair and Vsr endonuclease that are predicted to function during the lytic cycle to cleave non-self DNA, protect the phage genome and repair methylation-induced mutations.
KS5, KS14 and KL3 are the first BCC-specific phages to be identified as P2-like. As KS14 has previously been shown to be active against Burkholderia cenocepacia in vivo, genomic characterization of these phages is a crucial first step in the development of these and similar phages for clinical use against the BCC.
PMCID: PMC3091744  PMID: 20973964
9.  Metabolic flexibility revealed in the genome of the cyst-forming α-1 proteobacterium Rhodospirillum centenum 
BMC Genomics  2010;11:325.
Rhodospirillum centenum is a photosynthetic non-sulfur purple bacterium that favors growth in an anoxygenic, photosynthetic N2-fixing environment. It is emerging as a genetically amenable model organism for molecular genetic analysis of cyst formation, photosynthesis, phototaxis, and cellular development. Here, we present an analysis of the genome of this bacterium.
R. centenum contains a single circular chromosome of 4,355,548 base pairs in size harboring 4,105 genes. It has an intact Calvin cycle with two forms of Rubisco, as well as a gene encoding phosphoenolpyruvate carboxylase (PEPC) for mixotrophic CO2 fixation. This dual carbon-fixation system may be required for regulating internal carbon flux to facilitate bacterial nitrogen assimilation. Enzymatic reactions associated with arsenate and mercuric detoxification are rare or unique compared to other purple bacteria. Among numerous newly identified signal transduction proteins, of particular interest is a putative bacteriophytochrome that is phylogenetically distinct from a previously characterized R. centenum phytochrome, Ppr. Genes encoding proteins involved in chemotaxis as well as a sophisticated dual flagellar system have also been mapped.
Remarkable metabolic versatility and a superior capability for photoautotrophic carbon assimilation are evident in R. centenum.
PMCID: PMC2890560  PMID: 20500872
10.  A first generation whole genome RH map of the river buffalo with comparison to domestic cattle 
BMC Genomics  2008;9:631.
The recently constructed river buffalo whole-genome radiation hybrid panel (BBURH5000) has already been used to generate preliminary radiation hybrid (RH) maps for several chromosomes, and buffalo-bovine comparative chromosome maps have been constructed. Here, we present the first-generation whole genome RH map (WG-RH) of the river buffalo generated from cattle-derived markers. The RH maps aligned to bovine genome sequence assembly Btau_4.0, providing valuable comparative mapping information for both species.
A total of 3990 markers were typed on the BBURH5000 panel, of which 3072 were cattle-derived SNPs. The remaining 918 were classified as cattle sequence-tagged sites (STS), including coding genes, ESTs, and microsatellites. Average retention frequency per chromosome was 27.3% calculated with 3093 scorable markers distributed in 43 linkage groups covering all autosomes (24) and the X chromosome at a LOD ≥ 8. The estimated total length of the WG-RH map is 36,933 cR5000. Fewer than 15% of the markers (472) could not be placed within any linkage group at a LOD score ≥ 8. Linkage group order for each chromosome was determined by incorporation of markers previously assigned by FISH and by alignment with the bovine genome sequence assembly (Btau_4.0).
We obtained radiation hybrid chromosome maps for the entire river buffalo genome based on cattle-derived markers. The alignments of our RH maps to the current bovine genome sequence assembly (Btau_4.0) indicate regions of possible rearrangements between the chromosomes of both species. The river buffalo represents an important agricultural species whose genetic improvement has lagged behind other species due to limited prior genomic characterization. We present the first-generation RH map which provides a more extensive resource for positional candidate cloning of genes associated with complex traits and also for large-scale physical mapping of the river buffalo genome.
PMCID: PMC2625372  PMID: 19108729
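The retention frequency quoted in entry 10 is, by the usual radiation hybrid definition, the proportion of hybrid cell lines in the panel that retain a given marker, averaged over markers. The following is a minimal Python sketch under that assumption; the function names, the score encoding (1 retained, 0 absent, None ambiguous), and the toy panel are hypothetical and do not come from the study.

```python
def retention_frequency(scores):
    """Fraction of unambiguously typed hybrid lines that retain a marker.

    scores: list with 1 (retained), 0 (absent), or None (ambiguous typing).
    """
    typed = [s for s in scores if s is not None]
    if not typed:
        raise ValueError("marker has no unambiguous scores")
    return sum(typed) / len(typed)


def mean_retention(panel):
    """Average the per-marker retention frequency over all markers."""
    if not panel:
        raise ValueError("empty panel")
    return sum(retention_frequency(s) for s in panel.values()) / len(panel)


# Toy panel for illustration only: two markers typed on four hybrid lines.
toy_panel = {
    "marker_a": [1, 0, 0, 1],
    "marker_b": [0, 0, 1, None],
}

print(f"mean retention frequency: {mean_retention(toy_panel):.1%}")
```

Excluding ambiguous scores from the denominator is one possible convention; a per-chromosome average would apply the same calculation to the markers assigned to each chromosome.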
11.  Genomic sequence and activity of KS10, a transposable phage of the Burkholderia cepacia complex 
BMC Genomics  2008;9:615.
The Burkholderia cepacia complex (BCC) is a versatile group of Gram-negative organisms that can be found throughout the environment in sources such as soil, water, and plants. While BCC bacteria can be involved in beneficial interactions with plants, they are also considered opportunistic pathogens, specifically in patients with cystic fibrosis and chronic granulomatous disease. These organisms also exhibit resistance to many antibiotics, making conventional treatment often unsuccessful. KS10 was isolated as a prophage of B. cenocepacia K56-2, a clinically relevant strain of the BCC. Our objective was to sequence the genome of this phage and also determine if this prophage encoded any virulence determinants.
KS10 is a 37,635 base pair (bp) transposable phage of the opportunistic pathogen Burkholderia cenocepacia. Genome sequence analysis and annotation of this phage reveals that KS10 shows the closest sequence homology to Mu and BcepMu. KS10 was found to be a prophage in three different strains of B. cenocepacia, including strains K56-2, J2315, and C5424, and seven tested clinical isolates of B. cenocepacia, but no other BCC species. A survey of 23 strains and 20 clinical isolates of the BCC revealed that KS10 is able to form plaques on lawns of B. ambifaria LMG 19467, B. cenocepacia PC184, and B. stabilis LMG 18870.
KS10 is a novel phage with a genomic organization that differs from most phages in that its capsid genes are not aligned into one module but rather separated by approximately 11 kb, giving evidence of one or more prior genetic rearrangements. There were no potential virulence factors identified in KS10, though many hypothetical proteins were identified with no known function.
PMCID: PMC2628397  PMID: 19094239
12.  High resolution radiation hybrid maps of bovine chromosomes 19 and 29: comparison with the bovine genome sequence assembly 
BMC Genomics  2007;8:310.
High resolution radiation hybrid (RH) maps can facilitate genome sequence assembly by correctly ordering genes and genetic markers along chromosomes. The objective of the present study was to generate high resolution RH maps of bovine chromosomes 19 (BTA19) and 29 (BTA29), and compare them with the current 7.1X bovine genome sequence assembly (bovine build 3.1). We have chosen BTA19 and 29 as candidate chromosomes for mapping, since many Quantitative Trait Loci (QTL) for the traits of carcass merit and residual feed intake have been identified on these chromosomes.
We have constructed high resolution maps of BTA19 and BTA29 consisting of 555 and 253 Single Nucleotide Polymorphism (SNP) markers respectively using a 12,000 rad whole genome RH panel. With these markers, the RH maps of BTA19 and BTA29 extended to 4591.4 cR and 2884.1 cR in length respectively. When aligned with the current bovine build 3.1, the order of markers on the RH maps for BTA19 and 29 showed inconsistencies with respect to the genome assembly. Maps of both chromosomes show that there is significant internal rearrangement of the markers involving displacement, inversion and flips within the scaffolds, with some scaffolds being misplaced in the genome assembly. We also constructed cattle-human comparative maps of these chromosomes which showed an overall agreement with the comparative maps published previously. However, minor discrepancies in the orientation of a few homologous synteny blocks were observed.
The high resolution maps of BTA19 (average 1 locus/139 kb) and BTA29 (average 1 locus/208 kb) presented in this study suggest that by the incorporation of RH mapping information, the current bovine genome sequence assembly can be significantly improved. Furthermore, these maps can serve as a potential resource for fine mapping QTL and identification of causative mutations underlying QTL for economically important traits.
PMCID: PMC2064936  PMID: 17784962
13.  A high resolution radiation hybrid map of bovine chromosome 14 identifies scaffold rearrangement in the latest bovine assembly 
BMC Genomics  2007;8:254.
Radiation hybrid (RH) maps are considered to be a tool of choice for fine mapping closely linked loci, considering that the resolution of linkage maps is determined by the number of informative meioses and recombination events, which may require very large mapping populations. Accurately defining the marker order on chromosomes is crucial for correct identification of quantitative trait loci (QTL), haplotype map construction and refinement of candidate gene searches.
A 12 k radiation hybrid map of bovine chromosome 14 was constructed using 843 single nucleotide polymorphism markers. The resulting map was aligned with the latest version of the bovine assembly (Btau_3.1) as well as other previously published RH maps. The resulting map identified distinct regions on bovine chromosome 14 where discrepancies between this RH map and the bovine assembly occur. A major region of discrepancy was found near the centromere involving the arrangement and order of the scaffolds from the assembly. The map further confirms previously published conserved synteny blocks with human chromosome 8. It also identifies an extra breakpoint and conserved synteny block previously undetected due to lower marker density. This conserved synteny block is in a region where markers between the RH map presented here and the latest sequence assembly are in very good agreement.
The increase of publicly available markers shifts the rate limiting step from marker discovery to the correct identification of their order for further use by the research community. This high resolution map of bovine chromosome 14 will facilitate identification of regions in the sequence assembly where additional information is required to resolve marker ordering.
PMCID: PMC1959194  PMID: 17655763

