The Animal QTL Database (QTLdb; http://www.animalgenome.org/QTLdb) has undergone dramatic growth in recent years in terms of new data curated, data downloads and new functions and tools. We have focused our development efforts to cope with challenges arising from rapid growth of newly published data and end users’ data demands, and to optimize data retrieval and analysis to facilitate users’ research. Evidenced by the 27 releases in the past 11 years, the growth of the QTLdb has been phenomenal. Here we report our recent progress which is highlighted by addition of one new species, four new data types, four new user tools, a new API tool set, numerous new functions and capabilities added to the curator tool set, expansion of our data alliance partners and more than 20 other improvements. In this paper we present a summary of our progress to date and an outlook regarding future directions.
The presence of variability in the response of pigs to Porcine Reproductive and Respiratory Syndrome virus (PRRSv) infection, and recent demonstration of significant genetic control of such responses, leads us to believe that selection towards more disease resistant pigs could be a valid strategy to reduce its economic impact on the swine industry. To find underlying molecular differences in PRRS susceptible versus more resistant pigs, 100 animals with extremely different growth rates and viremia levels after PRRSv infection were selected from a total of 600 infected pigs. A microarray experiment was conducted on whole blood RNA samples taken at 0, 4 and 7 days post infection (dpi) from these pigs. From these data, we examined associations of gene expression with weight gain and viral load phenotypes. The single nucleotide polymorphism (SNP) marker WUR10000125 (WUR) on the porcine 60 K SNP chip was shown to be associated with viral load and weight gain after PRRSv infection, and so the effect of the WUR10000125 (WUR) genotype on expression in whole blood was also examined.
Limited information was obtained through linear modeling of blood gene differential expression (DE) that contrasted pigs with extreme phenotypes for growth or viral load, or animals with different WUR genotypes. However, using network-based approaches, molecular pathway differences between extreme phenotypic classes could be identified. Several gene clusters of interest were found when Weighted Gene Co-expression Network Analysis (WGCNA) was applied to 4 dpi contrasted with 0 dpi data. One such cluster of genes, whose expression pattern correlated with weight gain and WUR genotype, contained numerous immune response genes such as cytokines, chemokines, interferon type I stimulated genes, apoptotic genes and genes regulating complement activation. In addition, Partial Correlation and Information Theory (PCIT) identified differentially hubbed (DH) genes between the phenotypically divergent groups. GO enrichment revealed that the target genes of these DH genes are enriched in adaptive immune pathways.
There are molecular differences in blood RNA patterns between pigs with extreme phenotypes or with a different WUR genotype in early responses to PRRSv infection, though they can be quite subtle and more difficult to discover with conventional DE analyses. Co-expression analyses such as WGCNA and PCIT can be used to reveal network differences between such extreme response groups.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1741-8) contains supplementary material, which is available to authorized users.
Pig; PRRS; Microarray; Transcriptomics; WGCNA; PCIT; Immune response
Intramuscular fat (IMF) content is related to insulin resistance, which is an important prediction factor for disorders such as cardiovascular disease, obesity and type 2 diabetes in humans. At the same time, it is an economically important trait, which influences the sensorial and nutritional value of meat. The deposition of IMF is influenced by many factors such as sex, age, nutrition, and genetics. In this study Nellore steers (Bos taurus indicus subspecies) were used to better understand the molecular mechanisms involved in IMF content. This was accomplished by identifying differentially expressed genes (DEG), biological pathways and putative regulatory factors. Animals included in this study had extreme genomic estimated breeding value (GEBV) for IMF. RNA-seq analysis, gene set enrichment analysis (GSEA) and co-expression network methods, such as partial correlation coefficient with information theory (PCIT), regulatory impact factor (RIF) and phenotypic impact factor (PIF) were utilized to better understand intramuscular adipogenesis. A total of 16,101 genes were analyzed in both groups (high (H) and low (L) GEBV) and 77 DEG (FDR 10%) were identified between the two groups. Pathway Studio software identified 13 significantly over-represented pathways, functional classes and small molecule signaling pathways within the DEG list. PCIT analyses identified genes with a difference in the number of gene-gene correlations between the H and L groups and detected putative regulatory factors involved in IMF content. Candidate genes identified by PCIT include ANKRD26, HOXC5 and PPAPDC2. RIF and PIF analyses identified several candidate genes: GLI2 and IGF2 (RIF1), MPC1 and UBL5 (RIF2) and a host of small RNAs, including miR-1281 (PIF). These findings contribute to a better understanding of the molecular mechanisms that underlie fat content and energy balance in muscle and provide important information for the production of healthier beef for human consumption.
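The comparison of gene-gene correlation counts between the H and L groups can be illustrated with a minimal Python sketch. It assumes that the significant correlations for each group have already been filtered (for example by PCIT) into edge lists; the function names, gene labels and edges below are hypothetical and are not taken from the study.

```python
from collections import Counter


def degree(edges):
    """Count connections per gene in an undirected edge list."""
    counts = Counter()
    for gene_a, gene_b in edges:
        counts[gene_a] += 1
        counts[gene_b] += 1
    return counts


def differential_hubbing(edges_high, edges_low):
    """Return {gene: connections in H group minus connections in L group}."""
    deg_h = degree(edges_high)
    deg_l = degree(edges_low)
    genes = set(deg_h) | set(deg_l)
    # Counter returns 0 for genes that have no edges in one of the groups.
    return {gene: deg_h[gene] - deg_l[gene] for gene in genes}


# Hypothetical significant correlations retained in each group (illustration only).
edges_h = [("geneA", "geneB"), ("geneA", "geneC"), ("geneA", "geneD")]
edges_l = [("geneA", "geneB"), ("geneC", "geneD")]

dh = differential_hubbing(edges_h, edges_l)
```

Genes with a large positive or negative difference would be the candidates for closer inspection; how the edges are declared significant is left to the upstream network method.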
Previously, we identified a major quantitative trait locus (QTL) for host response to Porcine Reproductive and Respiratory Syndrome virus (PRRSV) infection in high linkage disequilibrium (LD) with SNP rs80800372 on Sus scrofa chromosome 4 (SSC4).
Within this QTL, guanylate binding protein 5 (GBP5) was differentially expressed (DE) (p < 0.05) in blood from AA versus AB rs80800372 genotyped pigs at 7, 11, and 14 days post PRRSV infection. All variants within the GBP5 transcript in LD with rs80800372 exhibited allele specific expression (ASE) in AB individuals (p < 0.0001). A transcript re-assembly revealed three alternatively spliced transcripts for GBP5. An intronic SNP in GBP5, rs340943904, introduces a splice acceptor site that inserts five nucleotides into the transcript. Individuals homozygous for the unfavorable AA genotype predominantly produced this transcript, with a shifted reading frame and early stop codon that truncates the 88 C-terminal amino acids of the protein. RNA-seq analysis confirmed this SNP was associated with differential splicing by QTL genotype (p < 0.0001) and this was validated by quantitative capillary electrophoresis (p < 0.0001). The wild-type transcript was expressed at a higher level in AB versus AA individuals, whereas the five-nucleotide insertion transcript was the dominant form in AA individuals. Splicing and ASE results are consistent with the observed dominant nature of the favorable QTL allele. The rs340943904 SNP was also 100% concordant with rs80800372 in a validation population that possessed an alternate form of the favorable B QTL haplotype.
GBP5 is known to play a role in inflammasome assembly during immune response. However, the role of GBP5 host genetic variation in viral immunity is novel. These findings demonstrate that rs340943904 is a strong candidate causal mutation for the SSC4 QTL that controls variation in host response to PRRSV.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1635-9) contains supplementary material, which is available to authorized users.
Kinase activity of cGMP-dependent, type II, protein kinase (PRKG2) is required for the proliferative to hypertrophic transition of growth plate chondrocytes during endochondral ossification. Loss of PRKG2 function in rodent and bovine models results in dwarfism. The objective of this study was to identify pathways regulated or impacted by PRKG2 loss of function that may be responsible for disproportionate dwarfism at the molecular level.
Microarray technology was used to compare growth plate cartilage gene expression in dwarf versus unaffected Angus cattle to identify putative downstream targets of PRKG2.
Pathway enrichment of 1284 transcripts (nominal p < 0.05) was used to identify candidate pathways consistent with the molecular phenotype of disproportionate dwarfism. Analysis with the DAVID pathway suite identified differentially expressed genes that clustered in the MHC, cytochrome B, WNT, and Muc1 pathways. A second analysis with Pathway Studio software identified differentially expressed genes in a host of pathways (e.g. CREB1, P21, CTNNB1, EGFR, EP300, JUN, P53, RHOA, and SRC). As a proof of concept, we validated the differential expression of five genes regulated by P53, namely CEBPA, BRCA1, BUB1, CD58, and VDR, by real-time PCR (p < 0.05).
Known and novel targets of PRKG2 were identified as enriched pathways in this study. This study indicates that loss of PRKG2 function results in differential expression of P53 regulated genes as well as additional pathways consistent with increased proliferation and apoptosis in the growth plate due to achondroplastic dwarfism.
Electronic supplementary material
The online version of this article (doi:10.1186/s13104-015-1136-6) contains supplementary material, which is available to authorized users.
Cattle; cGMP-dependent; Type II; Protein kinase (PRKG2); Dwarfism
We describe the organization of a nascent international effort, the Functional Annotation of Animal Genomes (FAANG) project, whose aim is to produce comprehensive maps of functional elements in the genomes of domesticated animal species.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-015-0622-4) contains supplementary material, which is available to authorized users.
Beef cattle require dietary minerals for optimal health, production and reproduction. Concentrations of minerals in tissues are at least partly genetically determined. Mapping genomic regions that affect the mineral content of bovine longissimus dorsi muscle can contribute to the identification of genes that control mineral balance, transportation, absorption and excretion and that could be associated with metabolic disorders.
We applied a genome-wide association strategy and genotyped 373 Nelore steers from 34 half-sib families with the Illumina BovineHD BeadChip. Genome-wide association analysis was performed for mineral content of longissimus dorsi muscle using a Bayesian approach implemented in the GenSel software.
Muscle mineral content in Bos indicus cattle was moderately heritable, with estimates ranging from 0.29 to 0.36. Our results suggest that variation in mineral content is influenced by numerous small-effect QTL (quantitative trait loci) but a large-effect QTL that explained 6.5% of the additive genetic variance in iron content was detected at 72 Mb on bovine chromosome 12. Most of the candidate genes present in the QTL regions for mineral content were involved in signal transduction, signaling pathways via integral (also called intrinsic) membrane proteins, transcription regulation or metal ion binding.
This study identified QTL and candidate genes that affect the mineral content of skeletal muscle. Our findings provide the first step towards understanding the molecular basis of mineral balance in bovine muscle and can also serve as a basis for the study of mineral balance in other organisms.
Electronic supplementary material
The online version of this article (doi:10.1186/s12711-014-0083-3) contains supplementary material, which is available to authorized users.
Myostatin (Mstn) knockout mice exhibit large increases in skeletal muscle mass. However, relatively few of the genes that mediate or modify MSTN effects are known. In this study, we performed co-expression network analysis using whole transcriptome microarray data from MSTN-null and wild-type mice to identify genes involved in important biological processes and pathways related to skeletal muscle and adipose development. Genes differentially expressed between wild-type and MSTN-null mice were further analyzed for shared DNA motifs using DREME. Differentially expressed genes were identified at 13.5 d.p.c. during primary myogenesis and at d35 during postnatal muscle development, but not at 17.5 d.p.c. during secondary myogenesis. In total, 283 and 2034 genes were differentially expressed at 13.5 d.p.c. and d35, respectively. Over-represented transcription factor binding sites in differentially expressed genes included SMAD3, SP1, ZFP187, and PLAGL1. The use of regulatory (RIF) and phenotypic (PIF) impact factor and differential hubbing co-expression analyses identified both known and potentially novel regulators of skeletal muscle growth, including Apobec2, Atp2a2, and Mmp13 at d35 and Sox2, Tmsb4x, and Vdac1 at 13.5 d.p.c. Among the genes with the highest PIF scores were many fiber type specifying genes. The use of RIF, PIF, and differential hubbing analyses identified both known and potentially novel regulators of muscle development. These results provide new details of how MSTN may mediate transcriptional regulation as well as insight into novel regulators of MSTN signal transduction that merit further study regarding their physiological roles in muscle and adipose development.
The Animal QTL database (QTLdb; http://www.animalgenome.org/QTLdb) is designed to house all publicly available QTL and single-nucleotide polymorphism/gene association data on livestock animal species. An earlier version was published in the Nucleic Acids Research Database issue in 2007. Since then, we have continued our efforts to develop new and improved database tools to allow more data types, parameters and functions. Our efforts have transformed the Animal QTLdb into a tool that actively serves the research community as a quality data repository and, more importantly, a provider of easily accessible tools and functions to disseminate QTL and gene association information. The QTLdb has been heavily used by the livestock genomics community since its first public release in 2004. To date, there are 5920 cattle, 3442 chicken, 7451 pig, 753 sheep and 88 rainbow trout data points in the database, and at least 290 publications that cite use of the database. The rapid advancement in genomic studies of cattle, chicken, pigs, sheep and other livestock animals has presented us with challenges, as well as opportunities for the QTLdb to meet the evolving needs of the research community. Here, we report our progress over the recent years and highlight new functions and services available to the general public.
Transcriptome analysis of porcine whole blood has several applications, which include deciphering genetic mechanisms for host responses to viral infection and vaccination. The abundance of alpha- and beta-globin transcripts in blood, however, impedes the ability to cost-effectively detect transcripts of low abundance. Although protocols exist for reduction of globin transcripts from human and mouse/rat blood, preliminary work demonstrated these are not useful for porcine blood Globin Reduction (GR). Our objectives were to develop a porcine specific GR protocol and to evaluate the GR effects on gene discovery and sequence read coverage in RNA-sequencing (RNA-seq) experiments.
A GR protocol for porcine blood samples was developed using RNase H with antisense oligonucleotides specifically targeting porcine hemoglobin alpha (HBA) and beta (HBB) mRNAs. Whole blood samples (n = 12) collected in Tempus tubes were used for evaluating the efficacy and effects of GR on RNA-seq. The HBA and HBB mRNA transcripts comprised an average of 46.1% of the mapped reads in pre-GR samples, but those reads were reduced to an average of 8.9% in post-GR samples. Differential gene expression analysis showed that the expression levels of 11,046 genes were increased, whereas 34 genes, excluding HBA and HBB, showed decreased expression after GR (FDR < 0.05). An additional 815 genes were detected only in post-GR samples.
Our porcine specific GR primers and protocol minimize the number of reads of globin transcripts in whole blood samples and provide increased coverage as well as accuracy and reproducibility of transcriptome analysis. Increased detection of low abundance mRNAs will ensure that studies relying on transcriptome analyses do not miss information that may be vital to the success of the study.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-954) contains supplementary material, which is available to authorized users.
Pig; Blood; Globin reduction; RNA-seq; Transcriptome
Feed efficiency is jointly determined by productivity and feed requirements, both of which are economically relevant traits in beef cattle production systems. The objective of this study was to identify genes/QTLs associated with components of feed efficiency in Nelore cattle using Illumina BovineHD BeadChip (770 k SNP) genotypes from 593 Nelore steers. The traits analyzed included: average daily gain (ADG), dry matter intake (DMI), feed-conversion ratio (FCR), feed efficiency (FE), residual feed intake (RFI), maintenance efficiency (ME), efficiency of gain (EG), partial efficiency of growth (PEG) and relative growth rate (RGR). The Bayes B analysis was completed with GenSel software parameterized to fit fewer markers than animals. Genomic windows containing all the SNP loci in each 1 Mb that accounted for more than 1.0% of genetic variance were considered QTL regions. Candidate genes within windows that explained more than 1% of genetic variance were selected by putative function based on DAVID and Gene Ontology.
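The 1-Mb window rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each SNP is given a hypothetical share of genetic variance, and the share of a window is taken as the simple sum over its SNPs, which is a simplification and may differ from how the analysis software estimates window variance. The function name, chromosome numbers, positions and values are invented for the example.

```python
from collections import defaultdict

WINDOW_BP = 1_000_000   # 1-Mb windows
THRESHOLD = 0.01        # more than 1% of genetic variance


def qtl_windows(snps, window_bp=WINDOW_BP, threshold=THRESHOLD):
    """Return sorted (chromosome, window index) pairs whose summed share exceeds the threshold.

    snps: iterable of (chromosome, position_bp, variance_share) tuples.
    """
    share = defaultdict(float)
    for chrom, pos, var in snps:
        # Integer division assigns each SNP to a non-overlapping 1-Mb window.
        share[(chrom, pos // window_bp)] += var
    return sorted(window for window, total in share.items() if total > threshold)


# Hypothetical per-SNP variance shares (illustration only, not study data).
snps = [
    (1, 72_100_000, 0.004),
    (1, 72_800_000, 0.008),
    (1, 73_200_000, 0.003),
    (5, 10_500_000, 0.002),
]

windows = qtl_windows(snps)
```

In this toy input only the window starting at 72 Mb on chromosome 1 passes the threshold, because its two SNPs together account for 1.2% of the variance.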
Thirty-six QTL (1-Mb SNP window) were identified on chromosomes 1, 2, 3, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 18, 19, 20, 21, 22, 24, 25 and 26 (UMD 3.1). The amount of genetic variance explained by individual QTL windows for feed efficiency traits ranged from 0.5% to 9.07%. Some of these QTL minimally overlapped with previously reported feed efficiency QTL for Bos taurus. The QTL regions described in this study harbor genes with biological functions related to metabolic processes, lipid and protein metabolism, generation of energy and growth. Among the positional candidate genes selected for feed efficiency are: HRH4, ALDH7A1, APOA2, LIN7C, CXADR, ADAM12 and MAP7.
Some genomic regions and some positional candidate genes reported in this study have not been previously reported for feed efficiency traits in Bos indicus. Comparison with published results indicates that different QTLs and genes may be involved in the control of feed efficiency traits in this Nelore cattle population, as compared to Bos taurus cattle.
Electronic supplementary material
The online version of this article (doi:10.1186/s12863-014-0100-0) contains supplementary material, which is available to authorized users.
Bos indicus; Candidate gene; Residual feed intake; Single nucleotide polymorphisms
The domestication and development of cattle has considerably impacted human societies, but the histories of cattle breeds and populations have been poorly understood especially for African, Asian, and American breeds. Using genotypes from 43,043 autosomal single nucleotide polymorphism markers scored in 1,543 animals, we evaluate the population structure of 134 domesticated bovid breeds. Regardless of the analytical method or sample subset, the three major groups of Asian indicine, Eurasian taurine, and African taurine were consistently observed. Patterns of geographic dispersal resulting from co-migration with humans and exportation are recognizable in phylogenetic networks. All analytical methods reveal patterns of hybridization which occurred after divergence. Using 19 breeds, we map the cline of indicine introgression into Africa. We infer that African taurine possess a large portion of wild African auroch ancestry, causing their divergence from Eurasian taurine. We detect exportation patterns in Asia and identify a cline of Eurasian taurine/indicine hybridization in Asia. We also identify the influence of species other than Bos taurus taurus and B. t. indicus in the formation of Asian breeds. We detect the pronounced influence of Shorthorn cattle in the formation of European breeds. Iberian and Italian cattle possess introgression from African taurine. American Criollo cattle originate from Iberia, and not directly from Africa with African ancestry inherited via Iberian ancestors. Indicine introgression into American cattle occurred in the Americas, and not Europe. We argue that cattle migration, movement and trading followed by admixture have been important forces in shaping modern bovine genomic variation.
The DNA of domesticated plants and animals contains information about how species were domesticated, exported, and bred by early farmers. Modern breeds were developed by lengthy and complex processes; however, our use of 134 breeds and new analytical models enabled us to reveal some of the processes that created modern cattle diversity. In Asia, Africa, North and South America, humpless (Bos t. taurus or taurine) and humped (Bos t. indicus or indicine) cattle were crossbred to produce hybrids adapted to the environment and local production systems. The history of Asian cattle involves the domestication and admixture of several species whereas African taurines arose through the introduction of domesticated Fertile Crescent taurines and their hybridization with wild African aurochs. African taurine genetic background is commonly observed among European Mediterranean breeds. The absence of indicine introgression within most European taurine breeds, but presence within three Italian breeds is consistent with at least two separate migration waves of cattle to Europe, one from the Middle East which captured taurines in which indicine introgression had already occurred and the second from western Africa into Spain with no indicine introgression. This second group seems to have radiated from Spain into the Mediterranean resulting in a cline of African taurine introgression into European taurines.
Meat from Bos taurus and Bos indicus breeds is an important source of nutrients for humans, and intramuscular fat (IMF) influences its flavor and nutritional value and impacts human health. Human consumption of fat that contains high levels of monounsaturated fatty acids (MUFA) can reduce the concentration of undesirable cholesterol (LDL) in circulating blood. Different feeding practices and genetic variation within and between breeds influence the amount of IMF and fatty acid (FA) composition in meat. However, it is difficult and costly to determine fatty acid composition, which has precluded beef cattle breeding programs from selecting for a healthier fatty acid profile. In this study, we employed a high-density single nucleotide polymorphism (SNP) chip to genotype 386 Nellore steers, a Bos indicus breed, and a Bayesian approach to identify genomic regions and putative candidate genes that could be involved with deposition and composition of IMF.
Twenty-three genomic regions (1-Mb SNP windows) associated with IMF deposition and FA composition that each explain ≥ 1% of the genetic variance were identified on chromosomes 2, 3, 6, 7, 8, 9, 10, 11, 12, 17, 26 and 27. Many of these regions were not previously detected in other breeds. The genes present in these regions were identified and some can help explain the genetic basis of deposition and composition of fat in cattle.
The genomic regions and genes identified contribute to a better understanding of the genetic control of fatty acid deposition and can lead to DNA-based selection strategies to improve meat quality for human consumption.
Fatty acid; GWAS; Bos indicus; Beef; Positional candidate gene
Host genetics has been shown to play a role in porcine reproductive and respiratory syndrome (PRRS), which is the most economically important disease in the swine industry. A region on Sus scrofa chromosome (SSC) 4 has been previously reported to have a strong association with serum viremia and weight gain in pigs experimentally infected with the PRRS virus (PRRSV). The objective here was to identify haplotypes associated with the favorable phenotype, investigate additional genomic regions associated with host response to PRRSV, and to determine the predictive ability of genomic estimated breeding values (GEBV) based on the SSC4 region and based on the rest of the genome. Phenotypic data and 60 K SNP genotypes from eight trials of ~200 pigs from different commercial crosses were used to address these objectives.
Across the eight trials, heritability estimates were 0.44 and 0.29 for viral load (VL, area under the curve of log-transformed serum viremia from 0 to 21 days post infection) and weight gain to 42 days post infection (WG), respectively. Genomic regions associated with VL were identified on chromosomes 4, X, and 1. Genomic regions associated with WG were identified on chromosomes 4, 5, and 7. Apart from the SSC4 region, the regions associated with these two traits each explained less than 3% of the genetic variance. Due to the strong linkage disequilibrium in the SSC4 region, only 19 unique haplotypes were identified across all populations, of which four were associated with the favorable phenotype. Through cross-validation, accuracies of EBV based on the SSC4 region were high (0.55), while the rest of the genome had little predictive ability across populations (0.09).
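The viral load trait defined above, the area under the curve of log-transformed serum viremia from 0 to 21 days post infection, can be sketched as follows. The trapezoidal rule, the sampling days and the viremia values are assumptions made for illustration; the source does not state how the area was computed or on which days samples were taken.

```python
def viral_load_auc(days, log_viremia):
    """Trapezoidal area under the log-viremia curve over the sampled days."""
    if len(days) != len(log_viremia) or len(days) < 2:
        raise ValueError("need at least two matching time points")
    area = 0.0
    for i in range(1, len(days)):
        width = days[i] - days[i - 1]
        # Area of the trapezoid between two consecutive sampling days.
        area += width * (log_viremia[i] + log_viremia[i - 1]) / 2.0
    return area


# Hypothetical sampling days and log-transformed viremia for one pig.
days = [0, 4, 7, 11, 14, 21]
log_viremia = [0.0, 5.0, 6.0, 5.0, 4.0, 2.0]

vl = viral_load_auc(days, log_viremia)  # 83.0 for these made-up values
```

A single number per animal of this kind is what allows viral load to be analyzed as one quantitative trait alongside weight gain.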
Traits associated with response to PRRSV infection in growing pigs are largely controlled by genomic regions with relatively small effects, with the exception of SSC4. Accuracies of EBV based on the SSC4 region were high compared to the rest of the genome. These results show that selection for the SSC4 region could potentially reduce the effects of PRRS in growing pigs, ultimately reducing the economic impact of this disease.
Shifts in body composition, such as accumulation of body fat, can be a symptom of many chronic human diseases; hence, efforts have been made to investigate the genetic mechanisms that underlie body composition. For example, a few quantitative trait loci (QTL) have been discovered using genome-wide association studies, which will eventually lead to the discovery of causal mutations that are associated with tissue traits. Although some body composition QTL have been identified in mice, limited research has been focused on the imprinting and interaction effects that are involved in these traits. Previously, we found that Myostatin genotype, reciprocal cross, and sex interacted with numerous chromosomal regions to affect growth traits.
Here, we report on the identification of muscle, adipose, and morphometric phenotypic QTL (pQTL), translation and transcription QTL (tQTL) and expression QTL (eQTL) by applying a QTL model with additive, dominance, imprinting, and interaction effects. Using an F2 population of 1000 mice derived from the Myostatin-null C57BL/6 and M16i mouse lines, six imprinted pQTL were discovered on chromosomes 6, 9, 10, 11, and 18. We also identified two IGF1 and two Atp2a2 eQTL, which could be important trans-regulatory elements. pQTL, tQTL and eQTL that interacted with Myostatin, reciprocal cross, and sex were detected as well. Combined with the additive and dominance effects, these variants accounted for a large amount of phenotypic variation in this study.
Our study indicates that both imprinting and interaction effects are important components of the genetic model of body composition traits. Furthermore, the integration of eQTL and traditional QTL mapping may help to explain more phenotypic variation than either alone, thereby uncovering more molecular details of how tissue traits are regulated.
eQTL mapping; QTL mapping; Body composition; Myostatin; Imprinting; Interaction; Mouse
As consumers continue to request food products that have health advantages, it will be important for the livestock industry to supply products that meet these demands. One such class of nutrients is fatty acids, which have been implicated as playing a role in cardiovascular disease. Therefore, the objective of this study was to determine the extent to which molecular markers could account for variation in fatty acid composition of skeletal muscle and identify genomic regions that harbor genetic variation.
Subsets of markers on the Illumina 54K bovine SNP chip were able to account for up to 57% of the variance observed in fatty acid composition. In addition, these markers could be used to calculate direct genomic breeding values (DGV) for a given fatty acid with an accuracy (measured as the simple correlation between DGV and phenotype) ranging from -0.06 to 0.57. Furthermore, 57 1-Mb regions were identified that were associated with at least one fatty acid with a posterior probability of inclusion greater than 0.90. 1-Mb regions on BTA19, BTA26 and BTA29, which harbored the fatty acid synthase, stearoyl-CoA desaturase and thyroid hormone responsive candidate genes, respectively, explained a high percentage of genetic variance in more than one fatty acid. It was also observed that the correlation between DGV for different fatty acids at a given 1-Mb window ranged from almost 1 to -1.
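The accuracy measure mentioned above, a simple correlation between DGV and phenotype, corresponds to a Pearson correlation coefficient. The sketch below shows that calculation in plain Python; the function name and the DGV and phenotype values are hypothetical, and details such as how animals were split into training and validation sets are not covered here.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical DGV and observed phenotypes for four animals (illustration only).
dgv = [0.10, 0.40, 0.35, 0.80]
phenotype = [1.0, 2.0, 1.5, 3.0]

accuracy = pearson(dgv, phenotype)
```

A value near zero or below, as at the lower end of the reported range, would indicate that the markers carry little usable information for that fatty acid.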
Further investigations are needed to identify the causal variants harbored within the identified 1-Mb windows. For the first time, Angus breeders have a tool whereby they could select for altered fatty acid composition. Furthermore, these reported results could improve our understanding of the biology of fatty acid metabolism and deposition.
Intramuscular fat; Genome architecture; Angus
The use of ontologies to standardize biological data and facilitate comparisons among datasets has steadily grown as the complexity and amount of available data have increased. Despite the numerous ontologies available, one area currently lacking a robust ontology is the description of vertebrate traits. A trait is defined as any measurable or observable characteristic pertaining to an organism or any of its substructures. While there are several ontologies to describe entities and processes in phenotypes, diseases, and clinical measurements, one has not been developed for vertebrate traits; the Vertebrate Trait Ontology (VT) was created to fill this void.
Significant inconsistencies in trait nomenclature exist in the literature, and additional difficulties arise when trait data are compared across species. The VT is a unified trait vocabulary created to aid in the transfer of data within and between species and to facilitate investigation of the genetic basis of traits. Trait information provides a valuable link between the measurements that are used to assess the trait, the phenotypes related to the traits, and the diseases associated with one or more phenotypes. Because multiple clinical and morphological measurements are often used to assess a single trait, and a single measurement can be used to assess multiple physiological processes, providing investigators with standardized annotations for trait data will allow them to investigate connections among these data types.
The annotation of genomic data with ontology terms provides unique opportunities for data mining and analysis. Links between data in disparate databases can be identified and explored, a strategy that is particularly useful for cross-species comparisons or in situations involving inconsistent terminology. The VT provides a common basis for the description of traits in multiple vertebrate species. It is being used in the Rat Genome Database and Animal QTL Database for annotation of QTL data for rat, cattle, chicken, swine, sheep, and rainbow trout, and in the Mouse Phenome Database to annotate strain characterization data. In these databases, data are also cross-referenced to applicable terms from other ontologies, providing additional avenues for data mining and analysis. The ontology is available at http://bioportal.bioontology.org/ontologies/50138.
Quantitative trait loci; Gene association; Trait ontology
The Animal Quantitative Trait Loci (QTL) database (AnimalQTLdb) is designed to house all publicly available QTL data on livestock animal species, so that researchers can easily locate and compare QTL within species. Database tools have also been added to link the QTL data to other types of genomic information, such as radiation hybrid (RH) maps, fingerprinted contig (FPC) physical maps, linkage maps and comparative maps to the human genome. Currently, this database contains data on 1287 pig, 630 cattle and 657 chicken QTL, which are dynamically linked to respective RH, FPC and human comparative maps. We plan to apply the tool to other animal species, and add more structural genome information for alignment, in an attempt to aid comparative structural genome studies.
The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems.
The Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type 1 Interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many un-annotated porcine genes. A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited an accelerated evolution as compared to 4.1% across the entire genome.
This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig’s adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response.
Immune response; Porcine; Genome annotation; Co-expression network; Phylogenetic analysis; Accelerated evolution
The availability of gene expression data that correspond to pig immune response challenges provides compelling material for the understanding of the host immune system. Meta-analysis offers the opportunity to confirm and expand our knowledge by combining a large set of independent studies into larger datasets with increased statistical power. In this study, we performed two meta-analyses of porcine transcriptomic data: (i) one that scrutinized the global immune response to different challenges, and (ii) one that determined the specific response to Porcine Reproductive and Respiratory Syndrome Virus (PRRSV) infection. To gain an in-depth knowledge of the pig response to PRRSV infection, we used an original approach comparing and eliminating the common genes from both meta-analyses in order to identify genes and pathways specifically involved in the PRRSV immune response. The software Pointillist was used to cope with the highly disparate data, circumventing the biases generated by the specific responses linked to single studies. Next, we used the Ingenuity Pathways Analysis (IPA) software to survey the canonical pathways, biological functions and transcription factors found to be significantly involved in the pig immune response. We used 779 chips corresponding to 29 datasets for the pig global immune response and 279 chips obtained from 6 datasets for the pig response to PRRSV infection.
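The step of comparing the two meta-analyses and eliminating their common genes amounts to a set difference. The following is a minimal Python sketch of that idea only; the gene identifiers are hypothetical placeholders rather than results from the study, and the function name is an assumption introduced for illustration. The study itself relied on Pointillist and IPA, not on this code.

```python
def prrsv_specific_genes(global_hits, prrsv_hits):
    """Return genes found in the PRRSV meta-analysis but not in the global one."""
    # Set difference removes every gene shared by both analyses.
    return sorted(set(prrsv_hits) - set(global_hits))


# Hypothetical placeholder identifiers, for illustration only.
global_hits = ["GENE_A", "GENE_B", "GENE_C"]
prrsv_hits = ["GENE_B", "GENE_D", "GENE_E"]

specific = prrsv_specific_genes(global_hits, prrsv_hits)
print(specific)
```

Genes shared by both lists (here the placeholder GENE_B) are treated as part of the general immune response and dropped, leaving only candidates for the PRRSV-specific response.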
The pig global immune response analysis showed interconnected canonical pathways involved in the regulation of translation and mitochondrial energy metabolism. Biological functions revealed in this meta-analysis were centred around translation regulation, which included protein synthesis, RNA post-transcriptional gene expression and cellular growth and proliferation. Furthermore, oxidative phosphorylation and mitochondrial dysfunction, associated with stress signalling, were highly regulated. Transcription factors such as MYCN, MYC and NFE2L2 were found in this analysis to be potentially involved in the regulation of the immune response.
The host-specific response to PRRSV infection engendered the activation of well-defined canonical pathways in response to pathogen challenge such as TREM1, toll-like receptor and hyper-cytokinemia/hyper-chemokinemia signalling. Furthermore, this analysis brought forth the central role of the crosstalk between innate and adaptive immune response and the regulation of anti-inflammatory response. The most significant transcription factor potentially involved in this analysis was HMGB1, which is required for the innate recognition of viral nucleic acids. Other transcription factors like interferon regulatory factors IRF1, IRF3, IRF5 and IRF8 were also involved in the pig-specific response to PRRSV infection.
This work reveals key genes, canonical pathways and biological functions involved in the pig global immune response to diverse challenges, including PRRSV infection. The powerful statistical approach led us to consolidate previous findings as well as to gain new insights into the pig immune response either to common stimuli or specifically to PRRSV infection.
Meta-analysis; Microarrays; Pig immune response; PRRSV infection
Infectious Bovine Keratoconjunctivitis (IBK) in beef cattle, commonly known as pinkeye, is a bacterial disease caused by Moraxella bovis. IBK is characterized by excessive tearing and ulceration of the cornea. Perforation of the cornea may also occur in severe cases. IBK is considered the most important ocular disease in cattle production, due to the decreased growth performance of infected individuals and its subsequent economic effects. IBK is an economically important, lowly heritable categorical disease trait. Mass selection of unaffected animals has not been successful at reducing disease incidence. Genome-wide studies can determine chromosomal regions associated with IBK susceptibility. The objective of the study was to detect single-nucleotide polymorphism (SNP) markers in linkage disequilibrium (LD) with genetic variants associated with IBK in American Angus cattle.
The proportion of phenotypic variance explained by markers was 0.06 in the whole-genome analysis of IBK incidence classified as two, three or nine categories. Whole-genome analysis using any categorisation of IBK scores (two, three or nine categories) showed that locations on chromosomes 2, 12, 13 and 21 were associated with IBK disease. The genomic locations on chromosomes 13 and 21 overlap with QTLs associated with bovine spongiform encephalopathy, clinical mastitis or somatic cell count.
Results of these genome-wide analyses indicated that if the underlying genetic factors confer not only IBK susceptibility but also IBK severity, treating IBK phenotypes as a two-categorical trait can cause information loss in the genome-wide analysis. These results help our overall understanding of the genetics of IBK and have the potential to provide information for future use in breeding schemes.
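The information loss mentioned above can be illustrated with a minimal Python sketch that collapses ordered severity scores into fewer categories. The score range (1 to 9, with 1 meaning unaffected) and the cut points are assumptions made for illustration; the categorisation rules actually used in the study are not stated here.

```python
def collapse(score, cutpoints):
    """Map an ordered score to a category index, given ascending cut points."""
    # The category is the number of cut points the score exceeds.
    return sum(score > c for c in cutpoints)


# Hypothetical nine-point eye damage scores (1 assumed to mean unaffected).
scores = [1, 2, 5, 9]

# Hypothetical cut points: two categories (unaffected vs affected)
# and three categories (unaffected, mild, severe).
two_categories = [collapse(s, [1]) for s in scores]
three_categories = [collapse(s, [1, 4]) for s in scores]

print(two_categories)
print(three_categories)
```

Under the two-category coding, the assumed scores 2 and 9 fall into the same class, so any genetic signal tied to severity rather than to incidence is no longer visible to the analysis.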
Keratoconjunctivitis; Pinkeye; BayesB; Threshold model; Genome-wide analysis
For 10,000 years pigs and humans have shared a close and complex relationship. From domestication to modern breeding practices, humans have shaped the genomes of domestic pigs. Here we present the assembly and analysis of the genome sequence of a female domestic Duroc pig (Sus scrofa) and a comparison with the genomes of wild and domestic pigs from Europe and Asia. Wild pigs emerged in South East Asia and subsequently spread across Eurasia. Our results reveal a deep phylogenetic split between European and Asian wild boars ~1 million years ago, and a selective sweep analysis indicates selection on genes involved in RNA processing and regulation. Genes associated with immune response and olfaction exhibit fast evolution. Pigs have the largest repertoire of functional olfactory receptor genes, reflecting the importance of smell in this scavenging animal. The pig genome sequence provides an important resource for further improvements of this important livestock species, and our identification of many putative disease-causing variants extends the potential of the pig as a biomedical model.
The SLA (swine leukocyte antigen, MHC: SLA) genes are the most important determinants of immune, infectious disease and vaccine response in pigs; several genetic associations with immunity and swine production traits have been reported. However, most of the current knowledge on SLA is limited to gene coding regions. MicroRNAs (miRNAs) are small molecules that post-transcriptionally regulate the expression of a large number of protein-coding genes in metazoans, and are suggested to play important roles in fine-tuning immune mechanisms and disease responses. Polymorphisms in either miRNAs or their gene targets may have a significant impact on gene expression by abolishing, weakening or creating miRNA target sites, possibly leading to phenotypic variation. We explored the impact of variants in the 3′-UTR miRNA target sites of genes within the whole SLA region. The combined predictions by TargetScan, PACMIT and TargetSpy, based on different biological parameters, empowered the identification of miRNA target sites and the discovery of polymorphic miRNA target sites (poly-miRTSs). Predictions for three SLA genes characterized by a different range of sequence variation provided proof of principle for the analysis of poly-miRTSs from a total of 144 M RNA-Seq reads collected from different porcine tissues. Twenty-four novel SNPs were predicted to affect miRNA-binding sites in 19 genes of the SLA region. Seven of these genes (SLA-1, SLA-6, SLA-DQA, SLA-DQB1, SLA-DOA, SLA-DOB and TAP1) are linked to antigen processing and presentation functions, which is reminiscent of associations with disease traits reported for altered miRNA binding to MHC genes in humans. An inverse correlation in expression levels was demonstrated between miRNAs and co-expressed SLA targets by exploiting a published dataset (RNA-Seq and small RNA-Seq) of three porcine tissues. Our results support the resource value of RNA-Seq collections to identify SNPs that may lead to altered miRNA regulation patterns.
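How a single SNP can abolish a miRNA target site can be shown with a deliberately simplified Python sketch. It checks only for a perfect match to the miRNA seed (nucleotides 2 to 8), which is a much cruder criterion than the combined predictions of TargetScan, PACMIT and TargetSpy used in the study. The miRNA and 3′-UTR sequences are illustrative assumptions, not data from the SLA region.

```python
# Complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna):
    """Return the 3'-UTR motif (5'->3') complementary to miRNA nucleotides 2-8."""
    seed = mirna[1:8]
    # Reverse complement, because the miRNA pairs antiparallel to its target.
    return seed.translate(COMPLEMENT)[::-1]


def has_seed_site(utr, mirna):
    """Report whether the UTR contains a perfect seed match for the miRNA."""
    return seed_site(mirna) in utr


# Illustrative sequences only (assumptions, not taken from the study).
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
utr_reference = "AAGCUACCUCAGG"
utr_variant = "AAGCUAGCUCAGG"  # one C>G substitution inside the site

print(seed_site(mirna))
print(has_seed_site(utr_reference, mirna))
print(has_seed_site(utr_variant, mirna))
```

The same check run in the other direction, where the variant allele contains the motif and the reference does not, would flag a SNP that creates a target site.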
The newly available pig genome sequence has provided new information to fine map quantitative trait loci (QTL) in order to eventually identify causal variants. With targeted genomic sequencing efforts, we were able to obtain high quality BAC sequences that cover a region on pig chromosome 17 where a number of meat quality QTL have been previously discovered. Sequences from 70 BAC clones were assembled to form an 8-Mbp contig. Subsequently, we successfully mapped five previously identified QTL, three for meat color and two for lactate related traits, to the contig. With an additional 25 genetic markers that were identified by sequence comparison, we were able to carry out further linkage disequilibrium analysis to narrow down the genomic locations of these QTL, which allowed identification of the chromosomal regions that likely contain the causative variants. This research has provided one practical approach to combine genetic and molecular information for QTL mining.
meat quality QTL; pig chromosome 17; integrated analysis
Infectious bovine keratoconjunctivitis (IBK), also known as pinkeye, is characterized by damage to the cornea and is an economically important, lowly heritable, categorical disease trait in beef cattle. Scores of eye damage were collected at weaning on 858 Angus cattle. SNP genotypes for each animal were obtained from BovineSNP50 Infinium BeadChips. Simultaneous associations of all SNP with IBK phenotype were determined using Bayes-C, which treats SNP effects as random with equal variance and assumes that a fraction (π=0.999) of SNP have no effect on IBK scores. Bayes-C threshold models were used to estimate SNP effects by classifying IBK into two, three or nine ordered categories. Magnitudes of genetic variances estimated in localized regions across the genome indicated that SNP within the most informative regions accounted for much of the genetic variance of IBK and showed some degree of association with IBK. There are many candidate genes in these regions, which could include a gene or group of genes associated with bacterial disease in cattle.
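The mixture assumption behind Bayes-C can be sketched in a few lines of Python: each SNP effect is exactly zero with probability π, and otherwise drawn from a normal distribution with a variance common to all SNP. This sketch only samples from that prior; it is not the threshold model or the sampler used in the study. The marker count, the standard deviation `sigma` and the random seed are hypothetical values chosen for illustration.

```python
import random


def sample_bayes_c_prior(n_snps, pi=0.999, sigma=1.0, seed=0):
    """Draw SNP effects from a Bayes-C style mixture prior.

    With probability pi an effect is exactly zero; otherwise it is drawn
    from a normal distribution whose variance is shared by all SNP.
    sigma and seed are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(n_snps):
        if rng.random() < pi:
            effects.append(0.0)
        else:
            effects.append(rng.gauss(0.0, sigma))
    return effects


effects = sample_bayes_c_prior(n_snps=100_000)  # hypothetical marker count
nonzero = sum(1 for e in effects if e != 0.0)
print(f"{nonzero} of {len(effects)} SNP have a nonzero prior effect")
```

With π=0.999, only about one SNP in a thousand is expected to carry a nonzero effect a priori, which is why the genetic variance tends to concentrate in a small number of informative regions.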