Selection for sound conformation has been widely used as a primary approach to reduce lameness and leg weakness in pigs. Identification of genomic regions that affect conformation traits would help to improve selection accuracy for these lowly to moderately heritable traits. Our objective was to identify genetic factors that underlie leg and back conformation traits in three Danish pig breeds by performing a genome-wide association study followed by meta-analyses.
Data on four conformation traits (front leg, back, hind leg and overall conformation) for three Danish pig breeds (23,898 Landrace, 24,130 Yorkshire and 16,524 Duroc pigs) were used for association analyses. Estimated effects of single nucleotide polymorphisms (SNPs) from single-trait association analyses were combined in two meta-analyses: (1) a within-breed meta-analysis for multiple traits to examine if there are pleiotropic genetic variants within a breed; and (2) an across-breed meta-analysis for a single trait to examine if the same quantitative trait loci (QTL) segregate across breeds. SNP annotation was implemented through Sus scrofa Build 10.2 on Ensembl to search for candidate genes.
Among the 14, 12 and 13 QTL that were detected in the single-trait association analyses for the three breeds, the most significant SNPs explained 2, 2.3 and 11.4% of genetic variance for back quality in Landrace, overall conformation in Yorkshire and back quality in Duroc, respectively. Several candidate genes for these QTL were also identified, i.e. LRPPRC, WRAP73, VRTN and PPARD likely control conformation traits through the regulation of bone and muscle development, and IGF2BP2, GH1, CCND2 and MSH2 can have an influence through growth-related processes. Meta-analyses not only confirmed many significant SNPs from single-trait analyses with higher significance levels, but also detected several additional associated SNPs and suggested QTL with possible pleiotropic effects.
Our results imply that conformation traits are complex and may be partly controlled by genes that are involved in bone and skeleton development, muscle and fat metabolism, and growth processes. A reliable list of QTL and candidate genes was provided that can be used in fine-mapping and marker assisted selection to improve conformation traits in pigs.
The online version of this article (doi:10.1186/s12711-017-0289-2) contains supplementary material, which is available to authorized users.
Lameness and leg weakness are issues of concern in pig production due to economic and welfare aspects. Leg weakness has been reported as the second most common reason for involuntary culling in pigs for many years, and accounted for 8.6 to 15% of the sows removed from commercial herds in Nordic countries [1, 2]. Several studies have reported favorable genetic correlations between good legs and litter size and sow stayability [3–5]. Thus, breeding for reduced leg weakness is expected to induce a favorable correlated response on sow reproduction and longevity, and thereby to improve these traits. In fact, conformation traits have been included in breeding goals in almost all Nordic countries, with the aim of reducing lameness. The evaluation of conformation traits has often been carried out subjectively by scoring gait and movement, visual observation of legs and feet, and knee and pastern postures [5, 7]. In the literature, heritability estimates for leg conformation traits range from 0.01 to 0.37 [7–10]. These low to moderate heritabilities suggest that faster genetic progress could be achieved by incorporating genetic marker information in the selection process rather than using a traditional pedigree-based selection scheme.
Reliability of genomic prediction can be increased by including single nucleotide polymorphisms (SNPs) that are significant in genome-wide association studies (GWAS) in the SNP chips used for routine genomic prediction. Thus, once quantitative trait loci (QTL) associated with conformation traits have been identified, they can be used to improve the reliability of genomic estimated breeding values (GEBV). In addition, these QTL can be used in fine-mapping studies to identify the causal genetic factors and thus help to understand the biological processes that underlie conformational development in pigs. Information about which genetic factors are involved in the conformation and locomotion of an animal and how much they affect these traits may contribute to setting up a standard scoring system with more objective criteria for uniform evaluation.
Few studies have focused on the mapping of genes for leg conformation traits in pigs. Thus, the objectives of this study were (1) to identify the QTL that are associated with conformation traits in three Danish pig breeds (Landrace, Yorkshire and Duroc) by performing a GWAS; and (2) to examine if the identified genetic variants are associated with multiple conformation traits within one breed and if the same QTL are segregating across breeds by performing meta-analyses. The biological functions of the genes that were closest to the most significant SNP within the detected QTL were also examined to unravel the genetic background of conformation traits.
Data on conformation traits from the three Danish breeds, Landrace, Yorkshire and Duroc, which were analyzed in this study, were provided by the Danish pig breeding company DanAvl, Axeltorv, Copenhagen, Denmark (http://www.danavl.dk/). In Denmark, purebred pigs in nucleus herds are performance-tested. Conformation traits are evaluated by trained technicians when the pigs are around five months of age and weigh approximately 100 kg. The data used here were recorded from 2002 to 2015 and included four conformation traits: front leg quality (FRONT), back quality (BACK), hind leg quality (HIND) and overall conformation (CONF). The first three traits (FRONT, BACK and HIND) were scored using a three-point scale from 1 to 3, with 3 corresponding to the best conformation. For CONF, a five-point scale from 1 to 5 was used to score the animals, with a score of 1 corresponding to animals that have serious leg or back problems, 3 to average animals, and 5 to animals with excellent conformation. The number of observations in each breed and the means and standard deviations of each trait are in Table 1. Due to the very low frequency of the extreme score categories 1 and 5, the few observations in these categories were merged into the adjacent categories for the association analyses.
Corrected phenotypes, rather than raw phenotypes, were used as the dependent variable in the association analysis. Fixed effects used for routine genetic evaluation in DanAvl were included in the following model (Model 1):
$$\mathbf{y} = \mathbf{Xb} + \mathbf{Zu} + \mathbf{e},$$
where y is a vector of conformation scores; b is a vector of fixed effects including sex, the combination of herd, year and month at performance testing, and body weight at performance testing as a covariate; u is a vector of additive genetic values of the animals; e is a vector of the residual effects; X and Z are incidence matrices that associate b and u with y. The vectors of random effects u and e were assumed to be normally distributed, i.e. $\mathbf{u} \sim N(\mathbf{0}, \mathbf{A}\sigma_u^2)$ and $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)$, where $\sigma_u^2$ is the additive genetic variance, A is the additive relationship matrix derived from pedigree records, $\sigma_e^2$ is the residual variance and I is the identity matrix. The combination of herd, year and month at performance testing also accounts for the effect of the technician who performed the measurements, since all records within each level of this combination were recorded by the same person. The analyses were carried out using the REML method with an R interface to the DMU software package. Heritabilities and genetic correlations between traits were estimated using a bivariate linear mixed model based on Model 1. Corrected phenotypes were obtained as the sum of the estimated breeding value and the residual value for each animal from Model 1 and were later used as dependent variables in the association model (Model 2).
Genotyping was carried out using three types of SNP chips: the Illumina PorcineSNP60 BeadChip (Illumina, San Diego, CA, USA), and two GeneSeek® custom SNP arrays (Neogen Corporation, Lansing, MI, USA), namely the Genomic Profiler (GGP) Porcine LD array featuring over 8500 SNPs (8.5 K) and the GGP Porcine HD array featuring over 70,000 SNPs (70 K). SNPs were quality-controlled within each breed using the following criteria: SNPs with a call-rate lower than 80% across all samples genotyped with each chip or SNPs with a minor allele frequency (MAF) lower than 0.01 were excluded; SNPs that deviated strongly from the Hardy–Weinberg equilibrium ($P < 10^{-7}$) and SNPs that were not mapped in the porcine reference genome build Sscrofa10.2 (http://www.ensembl.org/Sus_scrofa/Info/Index) were also excluded. Missing genotypes for the remaining SNPs on the 60 K chip were imputed using Beagle version 3.3.2. After quality control and imputation, 37,080 SNPs, 36,080 SNPs and 32,376 SNPs on the 18 porcine autosomes were retained for the Landrace, Yorkshire and Duroc breeds, respectively. A total of 44,390 different SNPs were available for the three breeds and were later used in the across-breed meta-analysis. The genotype of an animal was removed if the call frequency was less than 90%. The numbers of animals genotyped by different SNP chips and used for the association analysis in each breed are in Table 1.
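The per-SNP quality-control criteria above can be sketched as a small filter. This is a minimal illustration only: the function names are hypothetical, genotypes are assumed to be coded as allele counts with `None` for missing calls, and a 1-df chi-square goodness-of-fit test is assumed for Hardy–Weinberg equilibrium, since the text does not state which test was used.

```python
import math

def hwe_p_value(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: the study does not specify its HWE test; a goodness-of-fit
    chi-square test on genotype counts is used here for illustration.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: no deviation can be tested
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_qc(genotypes, min_call_rate=0.80, min_maf=0.01, min_hwe_p=1e-7):
    """Apply call-rate, MAF and HWE filters to one SNP.

    genotypes: list of allele counts (0, 1, 2), with None for missing calls.
    """
    called = [g for g in genotypes if g is not None]
    if not called or len(called) / len(genotypes) < min_call_rate:
        return False
    freq = sum(called) / (2 * len(called))
    if min(freq, 1.0 - freq) < min_maf:
        return False
    counts = [called.count(k) for k in (0, 1, 2)]
    return hwe_p_value(counts[0], counts[1], counts[2]) >= min_hwe_p
```

The default thresholds mirror the values given in the text (call rate 80%, MAF 0.01, HWE P-value of 10^-7); the exclusion of unmapped SNPs would be a separate lookup against the reference assembly.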
All steps in the single-trait association analyses were carried out using available options in the software Genome-wide complex trait analysis (GCTA).
The genomic relationship matrix (GRM) between individuals within each breed was used to perform principal component analysis and single-trait association analyses. The method that was used to estimate the GRM between individuals using SNP data is described in Yang et al. In this approach, genotype dosages (0, 1, 2) and the allele frequency at each SNP were used to calculate the “relationship score” between individuals i and j for each SNP. The average “relationship score” across all SNPs was then used as the relationship between individuals i and j.
Principal component analysis was performed to assess the population structure within each breed. The top ten eigenvectors with the largest eigenvalues were subsequently included as covariates in the association analysis model to account for the confounding effect of population structure. Including these ten eigenvectors resulted in acceptable genomic inflation factors (lambda, λ) (see Additional file 1: Figure S1, Additional file 2: Figure S2 and Additional file 3: Figure S3). Lambda is one of the standard quality-control measures used for GWAS and reflects the level of inflation of Chi square values when compared with their expectation under the null hypothesis of no association between the trait and SNPs.
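As a rough sketch of the averaging described above, the per-SNP "relationship score" is taken here to be the product of the two individuals' dosages, each centered by twice the allele frequency, divided by 2p(1 − p). This is the commonly cited form of the estimator; applying the same expression to diagonal elements is a simplifying assumption, and GCTA's exact treatment may differ. The function name `grm` is hypothetical.

```python
def grm(dosages):
    """Genomic relationship matrix from allele dosages.

    dosages[i][k] is the allele count (0, 1 or 2) of individual i at SNP k.
    Returns an n x n nested list. Monomorphic SNPs are skipped because the
    scaling term 2p(1 - p) would be zero.
    """
    n = len(dosages)
    m = len(dosages[0])
    freqs = [sum(row[k] for row in dosages) / (2 * n) for k in range(m)]
    usable = [k for k in range(m) if 0.0 < freqs[k] < 1.0]
    if not usable:
        raise ValueError("no polymorphic SNPs available")
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Per-SNP relationship scores, averaged over all usable SNPs.
            total = sum(
                (dosages[i][k] - 2 * freqs[k]) * (dosages[j][k] - 2 * freqs[k])
                / (2 * freqs[k] * (1 - freqs[k]))
                for k in usable
            )
            matrix[i][j] = matrix[j][i] = total / len(usable)
    return matrix
```

With allele frequencies estimated from the same sample, each row of the resulting matrix sums to zero, so relationships are expressed relative to the average of the genotyped population.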
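For illustration, the genomic inflation factor is usually computed as the median of the observed 1-df Chi square statistics divided by the median expected under the null hypothesis (about 0.455). The sketch below assumes two-sided P-values as input; the function name is hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 df: the square of the
# 75th percentile of the standard normal distribution (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2

def genomic_inflation(p_values):
    """Lambda: median observed chi-square over its null expectation.

    p_values must lie in the interval (0, 1].
    """
    norm = NormalDist()
    # A two-sided P-value p corresponds to chi2 = z^2 with z = -inv_cdf(p / 2).
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

A value close to 1 indicates that the test statistics follow their null expectation; values clearly above 1 point to residual population structure or other sources of inflation.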
Association analyses were carried out using a mixed linear model with the leaving-one-chromosome-out approach described by Yang et al. Mixed-linear-model-based association analysis treats the genotype at a single SNP as a fixed effect and the additive polygenic effect as a random effect, and will be called the SNP model (Model 2):
$$\mathbf{y}_c = \mathbf{1}\mu + \mathbf{Xb} + \mathbf{S}\alpha + \mathbf{Zu} + \mathbf{e},$$
where $\mathbf{y}_c$ is a vector of corrected phenotypes obtained from Model 1; 1 is a vector of 1s; μ is the general mean; X is a matrix containing the 10 eigenvectors that were derived from PCA and included as covariates, and b is a vector of associated effects; S is the design matrix that contains the allele contents at the fitted SNP, i.e. counts (0, 1, 2) of the second allele, and α is the allele substitution effect; u is a vector of additive polygenic effects, and Z is an incidence matrix associating u with $\mathbf{y}_c$. Vectors of random effects u and e were assumed to be normally distributed, i.e. $\mathbf{u} \sim N(\mathbf{0}, \mathbf{G}\sigma_u^2)$ and $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)$, where $\sigma_u^2$ is the polygenic variance, G is the GRM, $\sigma_e^2$ is the residual variance and I is the identity matrix. In the “leaving-one-chromosome-out” approach, association analysis is performed for a candidate SNP using a G that is computed without including the chromosome on which the candidate SNP is located. For instance, an association analysis between a SNP on chromosome 1 and a trait used a G that was computed only from SNPs on chromosomes 2 to 18. This approach avoids “double-fitting” of the candidate SNP in the model (both as a fixed effect and a random effect), which can reduce power, as demonstrated by Listgarten et al.
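The bookkeeping behind the leaving-one-chromosome-out idea can be sketched as follows: for each chromosome being tested, the GRM is built only from SNPs located on the other chromosomes. The helper name and the input layout (one chromosome label per SNP) are assumptions made for illustration; GCTA handles this step internally.

```python
def loco_snp_sets(snp_chromosomes):
    """Map each chromosome to the SNP indices used to build its GRM.

    snp_chromosomes[k] is the chromosome label of SNP k. When SNPs on a
    given chromosome are tested, the GRM is computed from all SNPs that
    are NOT on that chromosome.
    """
    chromosomes = sorted(set(snp_chromosomes))
    return {
        chrom: [k for k, c in enumerate(snp_chromosomes) if c != chrom]
        for chrom in chromosomes
    }

# Example with five SNPs on three chromosomes.
sets = loco_snp_sets([1, 1, 2, 3, 3])
print(sets[1])  # indices of SNPs on chromosomes 2 and 3
```

With 18 autosomes this yields 18 SNP subsets, and therefore 18 different matrices G, one per tested chromosome.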
The Bonferroni correction was applied to correct for multiple testing. Since the number of SNPs was relatively similar for the three breeds, the Bonferroni correction threshold was calculated for all three breeds based on the number of SNPs for the Landrace breed, i.e. $0.05/36{,}080 = 1.38 \times 10^{-6}$. Thus, the 5% genome-wide significance level used to avoid type I errors was $1.38 \times 10^{-6}$ for all three breeds.
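The threshold arithmetic can be checked with a few lines. The helper name is hypothetical; the SNP counts are the ones quoted in the text for the single-trait analyses (36,080) and for the across-breed meta-analysis (44,390).

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests

single_trait = bonferroni_threshold(0.05, 36080)
across_breed = bonferroni_threshold(0.05, 44390)

# The same thresholds on the -log10 scale used in Manhattan plots.
print(f"{single_trait:.3e}", round(-math.log10(single_trait), 2))
print(f"{across_breed:.3e}", round(-math.log10(across_breed), 2))
```

On the −log10 scale both thresholds fall just below 6, which is where the significance line sits in a Manhattan plot.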
A multiple-trait meta-analysis was performed within each breed using the approximate multi-trait test statistic described by Bolormaa et al. Effects of a SNP across all traits were calculated and combined with the genomic correlation matrix between traits to perform a multi-trait Chi square test with the number of degrees of freedom equal to the number of traits. The formula to calculate the multi-trait statistic for each SNP was as follows:
$$\text{Multi-trait } \chi^2 = \mathbf{t}_i' \mathbf{V}^{-1} \mathbf{t}_i,$$
where $\mathbf{t}_i$ is a vector of signed t-values of SNP i across traits (t-value = SNP effect/standard error of SNP effect), and $\mathbf{V}^{-1}$ is the inverse of the genomic correlation matrix between traits calculated from these t-values. The significance threshold from the single-trait association analyses (i.e. P value $< 1.38 \times 10^{-6}$) was applied for these within-breed multi-trait analyses.
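A minimal sketch of this test, assuming the statistic is the quadratic form t'V⁻¹t with V estimated as the correlation matrix of signed t-values across all SNPs. All function names are hypothetical, the linear system is solved with plain Gaussian elimination to keep the example dependency-free, and the P-value helper covers only an even number of traits (such as the four traits analyzed here).

```python
import math

def correlation_matrix(t_values):
    """Correlation between traits; t_values[s][k] is the t-value of SNP s, trait k."""
    n_snps, n_traits = len(t_values), len(t_values[0])
    means = [sum(row[k] for row in t_values) / n_snps for k in range(n_traits)]
    centered = [[row[k] - means[k] for k in range(n_traits)] for row in t_values]
    cov = [[sum(r[a] * r[b] for r in centered) for b in range(n_traits)]
           for a in range(n_traits)]
    return [[cov[a][b] / math.sqrt(cov[a][a] * cov[b][b]) for b in range(n_traits)]
            for a in range(n_traits)]

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting.

    Fails with ZeroDivisionError if the matrix is singular.
    """
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def multi_trait_chi2(t_vector, v):
    """Quadratic form t' V^-1 t for one SNP."""
    x = solve(v, t_vector)  # x = V^-1 t without forming the inverse
    return sum(t * xi for t, xi in zip(t_vector, x))

def chi2_sf_even_df(stat, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total
```

When the traits are uncorrelated (V is the identity matrix), the statistic reduces to the sum of squared t-values; positive correlations between traits shrink the contribution of effects that point in the same direction.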
Meta-analyses across the three breeds for each trait were carried out using the METAL software developed by Willer et al. The direction of the effect and the P value from each study i were converted into a z-score. These z-scores for each SNP were weighted by the sample size of each study and combined in a weighted sum across breeds. The test statistic follows the standard normal distribution and was calculated as follows:
$$Z_i = \Phi^{-1}\left(1 - \frac{P_i}{2}\right) \times \text{sign}(\Delta_i), \qquad w_i = \sqrt{N_i}, \qquad Z = \frac{\sum_i Z_i w_i}{\sqrt{\sum_i w_i^2}},$$
where $P_i$ is the P value for study i; $\Delta_i$ is the direction of effect for study i; $N_i$ is the sample size for study i; and $\Phi^{-1}$ is the inverse of the standard normal cumulative distribution function. A Bonferroni correction was applied with a significance threshold of $0.05/44{,}390 = 1.13 \times 10^{-6}$ for all four traits.
Since the software used does not report model convergence, SNPs with an estimated allele substitution effect outside the range of ±2 SD of the corrected phenotypes were filtered out, because such unrealistically large effects probably result from poor convergence of the parameter estimation. In addition, only SNPs that passed the Bonferroni correction threshold were selected to mark a QTL region. A QTL region was defined by extending from the position of the most significant SNP (top SNP) on either side for as long as all SNPs within the region had a −log10(P-value) higher than the −log10(P-value) of the top SNP minus 3 units.
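One possible reading of this region definition is sketched below: starting from the top SNP, the region grows SNP by SNP in each direction and stops at the first SNP whose −log10(P-value) falls to or below the top value minus 3. The contiguous-extension interpretation and the function name are assumptions; the input is taken to be the SNPs of one chromosome sorted by position.

```python
import math

def qtl_region(positions, p_values, drop=3.0):
    """Return (start, end, top_position) of the QTL region around the top SNP.

    positions: SNP positions on one chromosome, sorted in ascending order.
    p_values: P-values in the same order as positions.
    drop: allowed decrease in -log10(P) relative to the top SNP.
    """
    scores = [-math.log10(p) for p in p_values]
    top = max(range(len(scores)), key=scores.__getitem__)
    cutoff = scores[top] - drop
    left = top
    while left > 0 and scores[left - 1] > cutoff:
        left -= 1
    right = top
    while right < len(scores) - 1 and scores[right + 1] > cutoff:
        right += 1
    return positions[left], positions[right], positions[top]

# Hypothetical example: the region stops at the SNP with P = 1e-3, so the
# significant SNP at position 600 is not merged into it.
region = qtl_region(
    [100, 200, 300, 400, 500, 600],
    [1e-2, 1e-6, 1e-8, 1e-7, 1e-3, 1e-7],
)
```

In a full pipeline, the top SNP would first be required to pass the Bonferroni threshold, and the search would be repeated on the remaining SNPs to find further regions.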
The size of the effect of a QTL, which was defined as the contribution of the most significant SNP within that QTL to the phenotypic variance of the trait, was calculated as $2pq\beta^2/\sigma_P^2$, in which p and q are the allele frequencies, β is the estimated SNP effect, and $\sigma_P^2$ is the phenotypic variance of the trait. The corresponding contribution to the genetic variance for that SNP was calculated as a proportion of the genetic variance $\sigma_G^2$, i.e. $2pq\beta^2/\sigma_G^2$.
Genes that harbored the most significant SNPs within each QTL region from single-trait analyses and across-breed meta-analyses were searched based on the pig genome assembly, Sscrofa10.2 (http://www.ensembl.org/Sus_scrofa/Info/Index).
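As a worked example, the share of variance attributed to a SNP is assumed here to follow the standard expression 2pqβ² divided by the relevant variance. The function name and the numbers are hypothetical and chosen only to show how the phenotypic and genetic shares relate through the heritability.

```python
def variance_explained(allele_freq, beta, variance):
    """Proportion of a variance explained by one SNP: 2pq * beta^2 / variance."""
    p = allele_freq
    q = 1.0 - p
    return 2.0 * p * q * beta ** 2 / variance

# Hypothetical values: allele frequency 0.5, effect 0.1, phenotypic variance 1.0
# and a heritability of 0.10 (so the genetic variance is 0.10).
phenotypic_share = variance_explained(0.5, 0.1, 1.0)
genetic_share = variance_explained(0.5, 0.1, 0.10)
```

Because the genetic variance equals the heritability times the phenotypic variance, the genetic share is the phenotypic share divided by the heritability. This is why a SNP that explains a fraction of a percent of the phenotypic variance can explain several percent of the genetic variance for a lowly heritable trait.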
Means and standard deviations for each trait in the Landrace, Yorkshire and Duroc breeds are in Table 1 (see “Methods” section). Means for CONF were slightly above the middle score 3 and ranged from 3.09 to 3.31, with standard deviations ranging from 0.60 to 0.70 across the three breeds. Means for FRONT, BACK and HIND were all significantly greater than the middle score 2. The trait BACK had the highest means (from 2.79 to 2.91) and low standard deviations (from 0.27 to 0.41), which indicates that most animals obtained the highest score, i.e. 3, for this trait. For FRONT and HIND, means ranged from 2.29 to 2.36 and from 2.32 to 2.51, with standard deviations ranging from 0.45 to 0.48 and from 0.46 to 0.50, respectively.
Heritability estimates in each breed are presented on the diagonals in Table 2. In general, estimated heritabilities were low for all traits, ranging from 0.02 to 0.13. The trait CONF had a slightly higher heritability than the other traits in all three breeds.
Genetic correlations were estimated using the linear mixed model (Model 1) and genomic correlations were estimated from signed t-values of all SNPs between traits in each breed; they are presented on the lower and upper diagonals, respectively in Table 2. The traits FRONT, HIND and CONF were highly correlated with the genetic correlations ranging from 0.72 to 0.97. The trait BACK seems to be genetically more different from the other three traits, especially in the Landrace and Yorkshire breeds, for which the estimated genetic correlations ranged from 0.23 to 0.66. When breeds were compared, genetic correlations between traits were higher in Duroc than in Landrace and Yorkshire. Regardless of the breed, genomic correlations between traits followed the same pattern as the genetic correlations but at lower magnitudes. For instance, genomic correlations of BACK with the other traits ranged from 0.14 to 0.59.
Genomic regions that were found to be associated with conformation traits for Landrace, Yorkshire and Duroc are in Table 3, together with the candidate genes and the top associated SNPs within each region. Visual overviews of the location of the QTL are in the Manhattan plots of Additional file 1: Figure S1, Additional file 2: Figure S2 and Additional file 3: Figure S3. QTL regions were identified on Sus scrofa chromosome (SSC) 1, 2, 3, 4, 5, 6, 7, 10, 12, 13 and 18. In general, the number of associated regions was larger for CONF than for the other traits.
In total, 14 regions were significantly associated with conformation traits in Landrace, of which five were associated with BACK, one with HIND, and eight with CONF, but none with FRONT (Table 3). The most significant SNP identified for the Landrace breed, rs80828473, was located at 36.2 Mb within the PPARD gene on SSC7. This SNP explained 0.2% of the phenotypic variance (2% of the genetic variance) of BACK. The second most significant SNP, rs81389032 (SSC6: 74.4 Mb), was not within any reported coding gene, but was located in a region between the genes ENSSCT00000032147 and ENSSCT00000003913.
For the Yorkshire breed, 12 regions were associated with the traits analyzed: one region with FRONT, one with BACK, three with HIND, and seven with CONF (Table 3). The most significant SNP was rs80783847 (SSC1: 199.4 Mb); it was located close to the gene ENSSSCT00000005518 and explained 0.2% of the phenotypic variance (2.3% of the genetic variance) of CONF. The next two most significant SNPs were located between the genes RIPPLY2 and SNAP91 (SNP on SSC1: 92.7 Mb), and between the genes CD79B and GH1 (SNP rs81440562 on SSC12: 15.0 Mb).
Among the 13 QTL regions identified for the Duroc breed, the region on SSC3 between 100.2 and 100.4 Mb showed the highest peak (Table 3). The most significant SNP in this region, rs81373717, contributed 0.9% of the phenotypic variance (11.4% of the genetic variance) of BACK. This SNP was not located within any coding gene but was between the genes EPAS1 and PRKCE. The next two most significant SNPs were rs81373756 (SSC3: 100.4 Mb) within PRKCE and rs81333163 (SSC6: 57.6 Mb) between A1BG and RPS5. They were both associated with CONF and each explained 0.5% of the phenotypic variance of this trait.
Some regions and genes showed significant association with more than one trait within a breed, suggesting the presence of a variant that may affect multiple conformation traits. Three strong examples were found: (1) PPARD (SSC7: 36.2 Mb) was highly associated with both BACK and CONF in Landrace; (2) the QTL region located between RIPPLY2 (92.6 Mb) and SNAP91 (92.8 Mb) on SSC1 exhibited high association with both CONF and HIND in Yorkshire; and (3) the region that comprises PRKCE (SSC3: 100.5 Mb) showed high association with both CONF and BACK in Duroc. In the single-trait within-breed association analyses, the only region that overlapped between breeds was on SSC7; it was associated with CONF in Yorkshire and Duroc.
The numbers of significant SNPs identified in within-breed multi-trait meta-analyses for Landrace, Yorkshire and Duroc (206, 257 and 306 SNPs, respectively) were larger than in single-trait analyses (not shown). Their distribution on the genome is shown in the Manhattan plots in Additional file 4: Figure S4. Many of the SNPs that were found significantly associated with more than one trait in the single-trait analyses were confirmed in the multi-trait analyses, which suggests that the QTL containing these SNPs have pleiotropic effects. For instance, Fig. 1 shows the significance of the effects of SNPs on SSC6 and SSC3 from single-trait and multi-trait analyses in Duroc, which suggests that the corresponding QTL regions on these chromosomes affect multiple traits. Multi-trait meta-analysis can increase the power of QTL detection, since in general SNPs had lower P-values in the multi-trait analyses than in the single-trait analyses. Table 4 presents examples of such lower P-values for the most significant SNPs detected in the multi-trait analyses compared with the single-trait analyses.
In total, 36 regions were associated with the traits analyzed in the across-breed meta-analyses: three regions were associated with FRONT, eight with BACK, seven with HIND and 18 with CONF (Fig. 2 and Additional file 5: Table S1). Among these 36 regions, several QTL regions that were detected in the single-trait within-breed analyses were confirmed and several additional regions with novel candidate genes were identified. For instance, the most significant SNP associated with CONF, rs81344309 (SSC6: 52.0 Mb), which is located between the two coding genes ENSSCG00000003243 and ZNF614, was not identified in the single-trait analyses. Similarly, the next two most significant SNPs are also new and are intergenic on SSC7: rs342640079 (34.9 Mb) between GRM4 and HMGA1 and rs80894106 (103.5 Mb) between VRTN and SYNDIG1L. The latter SNP was associated with CONF in both Yorkshire and Duroc in the single-trait analyses, but reached a higher level of significance in the across-breed meta-analysis.
The present GWAS detected several chromosomal regions that are associated with conformation traits in the three pig breeds, and in many cases, we were able to identify candidate genes within these regions. Results from GWAS in humans and other species suggest that conformation traits are complex and affected by various factors such as bone and cartilage development, muscle growth, fat accumulation and body weight gain [17, 24, 25]. Changes in these factors and how they interact with each other probably determine the skeletal structure and movement pattern of an individual. Several of the identified candidate genes, e.g. LRPPRC, WRAP73, VRTN and PPARD, are involved in bone and muscle development at different levels. Based on an analysis of its sequence and of the proteins it interacts with, Liu and McKeehan suggested that LRPPRC has a role in the regulation of cytoskeleton network activity. WRAP73 belongs to the WD repeat protein gene family, which includes the WDR8 gene that plays an essential role in the ossification process. Expression of WDR8 was observed in bone-forming cells and in bone and cartilage tissue during the early stage of ectopic ossification in mouse. A polymorphism in the VRTN gene was found to be related to the number of vertebrae in domestic pigs, with one allele resulting in an additional segment in the thoracic vertebrae compared with the wild type allele. PPARD, a candidate gene for BACK and CONF in Landrace, was reported to be linked with muscle development and metabolism in pigs. Expression of porcine PPARD inhibits the formation of myotubes and increases adipocyte differentiation in mouse myoblasts. In addition, several GWAS revealed the association of PPARD with limb bone length and with growth and fatness traits in pigs. Similarly, HMGA1, which is located in a QTL region that we identified on SSC7 in Duroc, was reported to be associated with length of limb bone in pigs [30, 32] and with height and length of hip axis in humans. In vitro studies showed that addition of HMGA1 enhanced the proliferation of chondrocyte cells by regulating the expression of a chondrocyte-specific marker. These findings suggest that VRTN, PPARD and HMGA1 are candidate genes for conformation traits in pigs that may act by regulating bone and muscle development.
Growth-promoting factors, including insulin and IGF, are known to participate in bone and fat metabolism and in the growth process [34, 35]. In this study, several genes involved in the growth pathway were identified, such as IGF2BP2, GH1, CCND2 and MSH2. The protein IGF2BP2 plays an important role in controlling the action of IGF, which is associated with growth and fatness traits in pigs. Similarly, GH1 is necessary for growth promotion and energy metabolism regulation. The candidate gene for BACK in Landrace, CCND2, is essential for the growth of pancreatic islets, which play a role in the regulation of the growth of an animal via their hormones. Another gene related to the growth pathway that was identified in this study, MSH2, regulates the activity of the melanocortin system, which is involved in fat accumulation, feed intake and daily weight gain in pigs. These results suggest that growth-related genes have a regulatory function on conformation traits in pigs, but how they are associated with each other needs further study.
In this study, we identified several SNPs and chromosomal regions that were significantly associated with the traits analyzed. However, some of the top SNPs were located in intergenic regions between coding genes. Some of these genes, such as RIPPLY2 (SSC1: 92.6 Mb) and SYNDIG1L (SSC7: 103.5 Mb), are associated with vertebrae and rib development. RIPPLY2 is involved in somitogenesis during the embryo stage in mice, and knockout mice for this gene die during the perinatal period due to severe vertebrae and rib malformation. In humans, it was shown that mutations in RIPPLY2 are associated with segmentation defects of the vertebrae. In pigs, Verardo et al. reported that SYNDIG1L was associated with number of teats, while Duijvesteijn et al. showed that number of teats and number of vertebrae were controlled by several pleiotropic coding genes. These findings suggest that SYNDIG1L may have a role in the development of vertebrae in pigs. Many causal genetic factors have been previously reported to be located in the regulatory regions of genes, which may explain why several candidate SNPs were identified at intergenic locations in this study. However, it may also indicate that the density of the SNP chips used was not sufficiently high. In that case, an association analysis with imputed whole-genome sequence might be able to further pinpoint the causal mutations.
PRKCE, the gene that was highly significantly associated with CONF in Duroc, is involved in several biological processes. PRKCE is a well-known key factor in cell proliferation and differentiation, muscle contraction, gene expression, cell growth and apoptosis, metabolism and diabetes, as reviewed by Akita and by Geraldes and King. PRKCE plays an essential role in regulating lipogenesis through its interaction with GH or IGF1. However, overexpression of PRKCE results in malignant tumors and diabetes [44, 46]. This relationship between growth-related factors and diabetes and bone metabolism could explain the association of PRKCE with conformation traits.
Not all candidate genes that were identified here have an obvious biological function in relation to conformation traits. For example, some of the candidate genes are expressed in neuronal tissues and are related to the activity of the neuronal system, i.e. SLC14A2, PPM1G and NGFR [47–49]. The role of neuronal genes in the regulation of metabolism, fat accumulation and body weight gain was investigated in humans and pigs [47, 50, 51]. Willer et al. conducted a meta-analysis of 15 GWAS, which detected eight loci that are significantly associated with body mass index in humans. Another association study that combined human and pig data showed that several neuronal genes were associated with subcutaneous fat accumulation, which provides further support for a role of the nervous system in fat metabolism. Another gene, SLC14A2, which is a member of the SLC superfamily, is mainly expressed in brain and is associated with fat thickness in humans and pigs. Since the fat accumulation process probably affects the body structure and movement pattern of an animal, it would be interesting to investigate further how the nervous system influences conformation traits.
In this study, the detection of QTL was enhanced in the multi-trait meta-analyses compared with the single-trait analyses. In other words, larger numbers of significant SNPs and higher significance levels of the top SNPs were observed in the meta-analyses. This phenomenon was also reported by Bolormaa et al. for stature, fatness and reproduction-related traits in beef cattle and more recently by Pausch et al. for mammary gland morphology traits in dairy cattle. The ability of multi-trait meta-analyses to account for the relatedness between traits can explain the enhanced power of this approach here, where the four traits studied were highly genetically correlated. The most significant SNPs detected in the multi-trait analyses for all three breeds reached higher significance levels compared with the single-trait analyses, but they were not located within any annotated genes. Our results confirmed that multi-trait analyses can enhance the power to detect new associated SNPs, pleiotropic QTL and SNPs that were associated with only one of the correlated traits.
Phenotype and genotype data from all three breeds were available, and thus a GWAS analysis with pooled data could have been done instead of the across-breed meta-analysis. However, these breeds have been separated for many generations and have undergone strong artificial selection and genetic drift, which means that different sets of QTL may be segregating in each breed. Moreover, GWAS relies on linkage disequilibrium (LD) between markers and causal variants. The marker-QTL linkage phase can differ by breed and, thus, a joint analysis will have less power to detect such QTL. The genome-wide and local patterns of LD and the persistence of LD phase have been investigated in these three breeds; that investigation showed that the persistence of LD phase was higher between Landrace and Yorkshire than between these two breeds and Duroc.
The results of the meta-analysis confirmed the associations of a number of candidate genes that had been detected in the single-breed analyses, such as PRKCE, SLC14A2, HMGA1, VRTN and SYNDIG1L. The greater significance level of the top SNPs was expected due to the improved power of detection for common SNPs in the meta-analyses compared with single-breed analyses. This advantage of meta-analyses has been reported in pigs , cattle [57, 58], and human [24, 50]. In this study, meta-analyses revealed several new QTL regions and candidate genes associated with the traits studied among which some are linked with bone, skeletal or muscle development, including SOS2, TRIM24 and ELMO1. Mutations in SOS2 are associated with the Noonan syndrome in humans, i.e. patients with this syndrome have short stature, weak muscles and malformed skeleton . They are often diagnosed with GH deficiency and respond well to GH therapy , which suggests that SOS2 is associated with conformation traits via growth-regulated processes. TRIM24 belongs to the superfamily of tripartite motif-containing proteins which have a role in the immune system. A member of this protein family, TRIM76, was reported to be highly expressed in porcine skeletal muscle and significantly associated with carcass traits such as ham percentage and intramuscular fat . The results of our meta-analysis suggested that ELMO1, which is associated with the development and progress of diabetes nephropathy in humans, is a candidate gene for CONF . Another interesting region associated with CONF was detected at 25 Mb on SSC12 where the Antp homeobox (HOX) gene family is located. Four HOX gene clusters (HOXA, HOXB, HOXC and HOXD) are known to be associated with the formation and development of vertebrae . The combined expression of these genes defines somite identities in mammalian embryos, which direct the differentiation and the development of the vertebrae according to their location . 
In this study, HOXB5 and HOXB13 were the closest genes to the top SNP in the associated region. An association between HOXB genes and number of lumbar and thoracolumbar vertebrae was also reported in pigs.
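The gain in significance from combining breeds, noted above for the top SNPs, can be illustrated with a standard fixed-effect, inverse-variance weighted meta-analysis of per-breed SNP effects. The sketch below is a generic textbook formulation and not necessarily the procedure applied in this study; the function name, the result type and the example effect sizes are hypothetical.

```python
import math
from typing import NamedTuple, Sequence


class MetaResult(NamedTuple):
    beta: float     # pooled SNP effect
    se: float       # standard error of the pooled effect
    z: float        # pooled effect divided by its standard error
    p_value: float  # two-sided p-value under a normal approximation


def inverse_variance_meta(betas: Sequence[float],
                          ses: Sequence[float]) -> MetaResult:
    """Fixed-effect meta-analysis of one SNP across breeds.

    betas: estimated SNP effects, one per breed, for the same allele.
    ses:   standard errors of those estimates (all must be > 0).
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / (se * se) for se in ses]  # weight = 1 / variance
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return MetaResult(beta, se, z, p_value)


if __name__ == "__main__":
    # Hypothetical effects of one SNP in three breeds, same direction
    print(inverse_variance_meta([0.10, 0.12, 0.08], [0.05, 0.05, 0.05]))
    # Hypothetical effects with opposite signs, e.g. reversed linkage phase
    print(inverse_variance_meta([0.10, -0.10], [0.05, 0.05]))
```

When the effects point in the same direction in all breeds, the pooled standard error shrinks and the p-value drops below that of any single breed. When the signs differ between breeds, the estimates cancel, which mirrors the loss of power expected when the marker-QTL linkage phase is not conserved.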
The diversity of the candidate genes and their biological functions found in this study confirmed the complexity of the genetic mechanisms that underlie conformation traits in pigs. Bone and skeleton development, muscle and fat metabolism and growth processes probably interact to determine the general conformation and movement of a pig. However, these interactions are still unclear and should be investigated further.
Conformation traits are complex and appear to be controlled by genes that are involved in different biological processes, including bone and skeleton development, muscle and fat metabolism and body growth. Our results suggested the association of the LRPPRC, WRAP73, VRTN, PPARD, IGF2BP2, GH1, CCND2 and MSH2 genes with conformation traits in pigs. We show that meta-analysis is a powerful QTL detection approach since we were able to detect possible QTL with pleiotropic effects in the multi-trait meta-analyses, and novel relevant candidate genes such as SOS2, TRIM24 and ELMO1 in the across-breed meta-analyses. Our findings are reliable and can be used in fine-mapping to confirm the effects of the genes identified, as well as in marker-assisted selection to improve conformation in pigs.
BN extracted the data from the database. GS and OFC conceived and supervised the research project. THL performed the statistical analyses and wrote the manuscript. All authors contributed to the final manuscript. All authors read and approved the final manuscript.
Thu Hong Le benefited from a joint grant from the European Commission within the framework of the Erasmus Mundus joint doctorate “EGS-ABG”. We are grateful to EGS-ABG for funding and to DanAvl for providing the data. Xiaowei Mao and Xiaoping Wu are acknowledged for helpful discussions during the data analyses.
The authors declare that they have no competing interests.
Additional file 1: Figure S1. Manhattan plot of GWAS in Landrace pigs for (a) FRONT, (b) BACK, (c) HIND and (d) CONF. The data provided represent the Manhattan plot of single-trait association analyses in Landrace pigs for four traits studied. (967K, docx)
Additional file 2: Figure S2. Manhattan plot of GWAS in Yorkshire pigs for (a) FRONT, (b) BACK, (c) HIND and (d) CONF. The data provided represent the Manhattan plot of single-trait association analyses in Yorkshire pigs for four traits studied. (954K, docx)
Additional file 3: Figure S3. Manhattan plot of GWAS in Duroc pigs for (a) FRONT, (b) BACK, (c) HIND and (d) CONF. The data provided represent the Manhattan plot of single-trait association analyses in Duroc pigs for four traits studied. (615K, docx)
Additional file 4: Figure S4. Manhattan plot of within-breed multi-trait meta-analyses in (a) Landrace, (b) Yorkshire and (c) Duroc. The data provided represent the Manhattan plot of within-breed multi-trait meta-analyses in three breeds studied. (529K, docx)
Additional file 5: Table S1. QTL regions and the most significant SNP within each region in across-breed meta-analyses. The data provided represent the QTL regions and the information of the most significant SNP within each region in across-breed meta-analyses. (26K, docx)
Thu H. Le, Email: email@example.com.
Ole F. Christensen, Email: firstname.lastname@example.org.
Bjarne Nielsen, Email: kd.seges@INB.
Goutam Sahana, Email: email@example.com.