Genet Sel Evol. 2017; 49: 33.
Published online 2017 March 7. doi:  10.1186/s12711-017-0308-3
PMCID: PMC5439168

Estimation of breeding values for uniformity of growth in Atlantic salmon (Salmo salar) using pedigree relationships or single-step genomic evaluation

Abstract

Background

In farmed Atlantic salmon, heritability for uniformity of body weight is low, indicating that the accuracy of estimated breeding values (EBV) may be low. The use of genomic information could be one way to increase accuracy and, hence, obtain greater response to selection. Genomic information can be merged with pedigree information to construct a combined relationship matrix (H matrix) for a single-step genomic evaluation (ssGBLUP), allowing realized relationships of the genotyped animals to be exploited, in addition to numerator pedigree relationships (A matrix). We compared the predictive ability of EBV for uniformity of body weight in Atlantic salmon, when implementing either the A or H matrix in the genetic evaluation. We used double hierarchical generalized linear models (DHGLM) based either on a sire-dam (sire-dam DHGLM) or an animal model (animal DHGLM) for both body weight and its uniformity.

Results

With the animal DHGLM, the use of H instead of A significantly increased the correlation between the predicted EBV and adjusted phenotypes, which is a measure of predictive ability, for both body weight and its uniformity (41.1 to 78.1%). When log-transformed body weights were used to account for a scale effect, the use of H instead of A produced a small and non-significant increase (1.3 to 13.9%) in predictive ability. The sire-dam DHGLM had lower predictive ability for uniformity compared to the animal DHGLM.

Conclusions

Use of the combined numerator and genomic relationship matrix (H) significantly increased the predictive ability of EBV for uniformity when using the animal DHGLM for untransformed body weight. The increase was only minor when using log-transformed body weights, which may be due to the lower heritability of scaled uniformity, the lower genetic correlation of transformed body weight with its uniformity compared to the untransformed traits, and the small number of genotyped animals in the reference population. This study shows that ssGBLUP increases the accuracy of EBV for uniformity of body weight and is expected to increase response to selection in uniformity.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-017-0308-3) contains supplementary material, which is available to authorized users.

Background

In aquaculture, selection to increase economically important traits such as growth is one of the main breeding goals. However, fish producers are interested in improving not only the mean but also the variance of traits [1]. Uniformity of growth is preferable because more uniform growth allows a more uniform product, harvest of a larger proportion of the population at market size, and reduction of size grading and multiple harvests [2–4]. More uniform growth may also reduce competitive interactions between animals, which helps to reduce feed monopolization and dominant behaviour, and thus improves the well-being of fish [5]. Uniformity is also important for traits that have an intermediate optimal trait value [6], such as fillet lipid percentage, body shape, and condition factor in the aquaculture industry. A fish whose growth is sensitive to non-measurable environmental factors, known as micro-environments, shows micro-environmental sensitivity, which results in high environmental variance and consequently contributes to increased phenotypic variation, leading to increased size variation within a group of fish. A number of empirical studies in terrestrial and aquatic species show that uniformity is partly determined by genetic factors [4, 7–16]. Thus, selective breeding offers one avenue to improve uniformity of fish traits.

Atlantic salmon (Salmo salar L.) is a farmed fish that is of major economic importance. Heritability for uniformity of body weight has been estimated in Atlantic salmon [14], rainbow trout (Oncorhynchus mykiss Walbaum) [4, 8], and Nile tilapia (Oreochromis niloticus) [15, 16]. In general, heritability for uniformity ($h_v^2$) is low in livestock and aquaculture species ($h_v^2$ < 0.05), indicating that the prediction accuracy of breeding values for uniformity may be low [17, 18]. However, the coefficient of genetic variation (GCV) of uniformity of body weight is high in fish species (median GCV = 34.0%; min = 17.4%; max = 64.0%), which indicates high potential for response to selection [4, 8, 14, 16, 19]. One way to increase response to selection for uniformity is to increase the accuracy of estimated breeding values (EBV) for uniformity [20].

In aquaculture, full- and half-sib family sizes are usually large and thus the accuracy of EBV based on full-sibs, half-sibs and own performance is high for body weight, but not for uniformity due to its low heritability [8]. One approach to increase the accuracy of EBV is to use genomic information [21]. With genomic selection, genomic estimated breeding values (GEBV) can be obtained for the selection candidates that are genotyped, even when they have no phenotype records. One reason why genomic selection results in higher accuracy of selection is the more accurate estimation of the Mendelian sampling genetic effects through realized additive genetic relationships among animals [22]. Consequently, individual squared residuals, which are the phenotype for uniformity in a double hierarchical generalized linear model (DHGLM), may also be more accurately estimated when using genomic information.

In many cases, combining numerator pedigree and genomic information in genomic evaluations is implemented in multiple steps, which may introduce bias and requires additional calculations to combine the genomic predictions with pedigree-based EBV [23, 24]. Single-step genomic best linear unbiased prediction (ssGBLUP) avoids this by combining genomic and pedigree information in one step [23], which may lead to less bias and is less prone to double counting of information than genomic evaluation methods that are performed in multiple steps. The ssGBLUP augments the numerator relationship (A) matrix with the genomic relationship (G) matrix in conventional genetic evaluation using BLUP [24]. This combined numerator and genomic relationship matrix is known as the H matrix [25]. In fish breeding, combining pedigree and genomic information allows exploiting the large full- and half-sib families and the more accurate relationships of the genotyped animals, and may yield a higher accuracy of selection for uniformity than the use of the A matrix.

To date, the use of ssGBLUP for uniformity has not been studied. Furthermore, according to a previous study, the sire-dam model, but not the animal model, implemented within the framework of DHGLM provided unbiased (co)variance component estimates [14]. However, an animal DHGLM is expected to perform better than a sire-dam DHGLM for genetic evaluation because the animal DHGLM uses full relationships between animals rather than only among sires and dams. This is particularly important for uniformity, which is quantified by the residuals of individuals, which in the animal model do not contain the Mendelian sampling term. Moreover, for genetic evaluation, the animal DHGLM uses all phenotypic information and, for most breeding programs, at least part of the selection candidates, e.g. females for sex-linked traits, have phenotypes available at the time of selection. Use of the animal DHGLM with ssGBLUP for uniformity has not been tested.

In this study, we implemented ssGBLUP for predicting GEBV for uniformity in Atlantic salmon. Specifically, our aim was to compare the predictive ability of EBV for uniformity of body weight when implementing either BLUP with the A matrix or ssGBLUP with the H matrix. The (co)variance components were estimated from the sire-dam DHGLM with either A or H and compared prior to genetic evaluation.

Methods

Data

The data used in this study originated from an experiment conducted by Nofima AS and the breeding company SalmoBreed in Norway. The experiment followed all the regulations of animal ethical practice and was approved by the Norwegian Research Animal Committee (ID 6489). In 2013, 234 full-sib families were established from the mating of 131 sires to 234 dams (Table 1) over a period of four weeks. Forty-seven percent of the parents were from year class 2009 and the rest from year class 2010. After hatching, fingerlings from each family were held in a 180-L family tank until tagging size (at mean body weight of 50 g). Each animal was tagged using passive integrated transponder (PIT) tags (Satpos AS, Norway). During tagging, a fin sample for genotyping was collected from 21 to 38 sibs of each of 50 full-sib families. Thereafter, all fish were randomly allocated to three experimental tanks and grown for 11 months. At the average age of 16 months, all fish were challenged with sea lice using a cohabitation challenge, and at the end of the challenge test, final body weight (g) was measured for all 3595 fish with an electronic balance. A total of 1416 offspring (39% of all offspring) and the 131 sires and 234 dams were genotyped using the 31K Affymetrix single nucleotide polymorphism (SNP) chip for Atlantic salmon developed by Nofima. Quality control of SNPs was performed in PLINK v1.9 [26] based on the following criteria: SNPs were removed if (1) their call rate was lower than 90%, (2) they deviated from Hardy–Weinberg equilibrium with a P value cut-off of $10^{-15}$, or (3) their minor allele frequency (MAF) was lower than 0.01. After quality control, 921 of 31,013 SNPs were removed (2.9%) and thus 30,092 SNPs remained to create the genomic relationship matrix.
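The three SNP filters can be illustrated with a small sketch. This is not the PLINK implementation: the function name `snp_passes_qc` and the input layout are hypothetical, and the Hardy–Weinberg check uses a simple 1-df chi-square approximation rather than the exact test that PLINK applies, so borderline SNPs could be classified differently.

```python
from math import erfc, sqrt

def snp_passes_qc(genotypes, min_call_rate=0.90, hwe_p_cutoff=1e-15, min_maf=0.01):
    """Return True if one SNP passes the three filters described in the text.

    genotypes: list of calls for one SNP, coded 0/1/2, with None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    # (1) Call rate: proportion of animals with a non-missing genotype.
    if not called or len(called) / len(genotypes) < min_call_rate:
        return False
    n = len(called)
    p = sum(called) / (2.0 * n)   # frequency of the allele counted by the coding
    q = 1.0 - p
    # (3) Minor allele frequency.
    if min(p, q) < min_maf:
        return False
    # (2) Hardy-Weinberg equilibrium, chi-square approximation (an assumption here).
    expected = [n * q * q, 2.0 * n * p * q, n * p * p]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2.0))   # upper tail of a chi-square with 1 df
    return p_value >= hwe_p_cutoff

in_hwe = snp_passes_qc([0] * 25 + [1] * 50 + [2] * 25)
all_het = snp_passes_qc([1] * 100)
low_call = snp_passes_qc([None] * 20 + [0] * 20 + [1] * 40 + [2] * 20)
monomorphic = snp_passes_qc([0] * 100)
```

With such a strict P value cut-off, only SNPs with extreme genotype distortions (for example, all animals called heterozygous) are removed by the Hardy–Weinberg filter.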

Table 1
Population structure of Atlantic salmon

Relationship matrix

The numerator relationship (A) matrix with 814 ancestors in four generations was prepared based on pedigree information using ASReml [27]. The combined numerator and genomic relationship (H) matrix was defined as [23]:

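$$\mathbf{H} = \begin{bmatrix} \mathbf{A}_{11} + \mathbf{A}_{12}\mathbf{A}_{22}^{-1}(\mathbf{G} - \mathbf{A}_{22})\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{G} \\ \mathbf{G}\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{G} \end{bmatrix},$$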

where $\mathbf{A}_{11}$ is the pedigree relationship matrix between non-genotyped animals, $\mathbf{A}_{12}$ and $\mathbf{A}_{21}$ are pedigree relationship matrices between genotyped and non-genotyped animals, $\mathbf{A}_{22}$ is the pedigree relationship matrix between genotyped animals, and $\mathbf{G}$ is the genomic relationship matrix between genotyped animals. The G matrix was computed as [28]: $\mathbf{G} = \mathbf{W}\mathbf{W}'/N$, where $\mathbf{W}$ is the matrix of the scaled SNP genotypes for all loci and $N$ is the total number of SNPs (30,092). The elements of $\mathbf{W}$ were calculated as:

$$w_{ij} = \frac{x_{ij} - 2p_j}{\sqrt{2p_j(1 - p_j)}},$$

where $x_{ij}$ is the SNP genotype (coded 0, 1, or 2) of the $i$th individual at SNP $j$ and $p_j$ is the frequency of the allele whose homozygous genotype is coded as 2.
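The construction of G from scaled genotypes can be sketched as follows. The function name `genomic_relationship` and the toy genotypes are hypothetical, and the allele frequencies are computed from the genotyped sample itself, which is an assumption because the text does not state how $p_j$ was obtained. Monomorphic SNPs would cause a division by zero, so the sketch assumes they were removed during quality control.

```python
from math import sqrt

def genomic_relationship(genotypes):
    """Return G = W W' / N for genotypes coded 0/1/2 (rows = animals, columns = SNPs)."""
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    # Frequency p_j of the allele counted by the 0/1/2 coding (sample-based assumption).
    p = [sum(row[j] for row in genotypes) / (2.0 * n_animals) for j in range(n_snps)]
    # Scaled genotypes: w_ij = (x_ij - 2 p_j) / sqrt(2 p_j (1 - p_j)).
    w = [[(row[j] - 2.0 * p[j]) / sqrt(2.0 * p[j] * (1.0 - p[j]))
          for j in range(n_snps)] for row in genotypes]
    # G = W W' / N, computed element by element.
    return [[sum(w[a][j] * w[b][j] for j in range(n_snps)) / n_snps
             for b in range(n_animals)] for a in range(n_animals)]

# Toy data: 4 animals x 3 SNPs (hypothetical values).
toy_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
G_toy = genomic_relationship(toy_genotypes)
```

Because the genotypes are centred with frequencies from the same sample, each row of the resulting matrix sums to zero, which is a quick sanity check on the scaling.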

However, Aguilar et al. [29] and Christensen and Lund [30] showed that the inverse of the H matrix can be computed as:

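$$\mathbf{H}^{-1} = \mathbf{A}^{-1} + \begin{bmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1} \end{bmatrix},$$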

which is less computationally demanding and simpler than preparing and subsequently inverting the H matrix. The $\mathbf{H}^{-1}$ matrix was prepared with the Calc_grm software [31], which prepares both $\mathbf{A}^{-1}$ and $\mathbf{G}^{-1}$ internally before computing $\mathbf{H}^{-1}$.
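A minimal sketch of this inverse for a toy pedigree is given here. It does not reproduce Calc_grm: the helper names `invert` and `h_inverse`, the dense Gauss-Jordan inversion, and the toy matrices are all hypothetical, and a real evaluation would build $\mathbf{A}^{-1}$ directly from the pedigree with sparse methods.

```python
def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def h_inverse(A, G, genotyped):
    """H^-1 = A^-1 + [[0, 0], [0, G^-1 - A22^-1]].

    genotyped: row indices of A for the genotyped animals, in the row order of G.
    """
    H_inv = invert(A)
    A22 = [[A[i][j] for j in genotyped] for i in genotyped]
    G_inv, A22_inv = invert(G), invert(A22)
    # Only the block of genotyped animals receives the correction term.
    for a, i in enumerate(genotyped):
        for b, j in enumerate(genotyped):
            H_inv[i][j] += G_inv[a][b] - A22_inv[a][b]
    return H_inv

# Toy pedigree: unrelated sire and dam (rows 0, 1) and two genotyped full-sibs (rows 2, 3).
A_toy = [[1.0, 0.0, 0.5, 0.5],
         [0.0, 1.0, 0.5, 0.5],
         [0.5, 0.5, 1.0, 0.5],
         [0.5, 0.5, 0.5, 1.0]]
G_toy2 = [[1.05, 0.60], [0.60, 0.98]]   # hypothetical realized relationships
H_inv_toy = h_inverse(A_toy, G_toy2, [2, 3])
H_toy = invert(H_inv_toy)               # back-inversion, only to check the result
```

Inverting the result recovers G in the block of genotyped animals and $\mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{G}$ in the off-diagonal block, which is how the genomic information propagates to the non-genotyped parents.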

Statistical analysis

Analysis of residuals

Uniformity can be quantified by squared residuals from a BLUP mixed model equation [32]. The use of genomic information to construct realised relationships between animals, especially for full-sibs, is expected to increase the accuracy of residual estimates due to a greater accuracy of EBV for body weight. Therefore, we investigated the effect of ssGBLUP and traditional BLUP on individual residual estimation. Furthermore, we investigated sire-dam and animal models because residual estimates from a sire-dam model contain not only the unexplained environmental effects but also Mendelian sampling genetic effects. Residual estimates from an animal model do not contain the latter when EBV are estimated with an accuracy of 1. In total, residuals from four models were compared, i.e. the sire-dam or animal model with either A or H.

The animal mixed model was:

$$y_{iklmn} = \mu + \beta\,\mathrm{age}_k + t_l + yc_m + a_i + c_n + e_{iklmn}, \qquad (1)$$

where $y_{iklmn}$ is the observation (body weight) of the $i$th individual, $\mu$ is the overall mean, $\mathrm{age}_k$ is the fixed covariate for the age of the fish (in days), calculated from the date of first feeding until the date of measurement, $\beta$ is the fixed linear regression coefficient on age, $t_l$ is the fixed effect of the $l$th communal tank, $yc_m$ is the fixed effect of the $m$th year class of the parents, $a_i$ is the random additive genetic effect, with $\mathbf{a} \sim N(\mathbf{0}, \mathbf{A}\sigma_a^2)$, where $\mathbf{A}$ is the numerator relationship matrix, or $\mathbf{a} \sim N(\mathbf{0}, \mathbf{H}\sigma_a^2)$, where $\mathbf{H}$ is the combined genomic and pedigree relationship matrix, $N$ denotes the normal distribution, and $\sigma_a^2$ is the additive genetic variance for body weight, $c_n$ is the random effect common to full-sibs, with $\mathbf{c} \sim N(\mathbf{0}, \mathbf{I}\sigma_c^2)$, where $\mathbf{I}$ is the identity matrix and $\sigma_c^2$ is the common environmental variance of body weight, and $e_{iklmn}$ is the random residual effect, with $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)$, where $\sigma_e^2$ is the residual variance of body weight, assumed to be homogeneous. For the sire-dam model, the term $a_i$ in Eq. (1) was replaced by the random sire-dam effect ($u_i$), with $\mathbf{u} \sim N(\mathbf{0}, \mathbf{A}\sigma_u^2)$ or $\mathbf{u} \sim N(\mathbf{0}, \mathbf{H}\sigma_u^2)$. The same A and H matrices were used for the sire-dam and the animal models.

Estimation of genetic parameters for uniformity

To estimate genetic parameters for body weight and its uniformity, the sire-dam DHGLM was used [33, 34] because it is expected to provide unbiased (co)variance components for uniformity [14]. Body weight records were treated in two different ways. First, observed body weight was standardized to a mean of 0 and variance of 1, which facilitates convergence of the DHGLM. Second, we used either the natural log or the Box–Cox transformation to account for possible scale effects, because variances typically increase with increasing trait means [35, 36]. For the Box–Cox transformation, each observation was computed as $(y_i^{\lambda} - 1)/\lambda$, where $\lambda$ is the transformation parameter, which was estimated based on Eq. (1) without the random effects [37] by maximum likelihood using the MASS package in R software [38]. The estimate of $\lambda$ was close to 0 (0.076), indicating that the Box–Cox transformation is very similar to log-transformation, which sets $\lambda$ equal to 0. Therefore, the Box–Cox transformed body weight was not used further. The standardized body weight and natural logarithm body weight are abbreviated as stdWT and lnWT, respectively.
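The relationship between the Box–Cox and log transformations can be shown with a short sketch. The function name `box_cox` and the example weight are hypothetical, and the maximum likelihood estimation of $\lambda$ (done with the MASS package in the study) is not reproduced here.

```python
from math import log

def box_cox(y, lam, tol=1e-8):
    """Box-Cox transform (y**lam - 1) / lam, with the log limit when lam is close to 0."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive observations")
    if abs(lam) < tol:
        return log(y)   # limiting case lam -> 0
    return (y ** lam - 1.0) / lam

weight_g = 1000.0                          # hypothetical body weight in grams
as_log = box_cox(weight_g, 0.0)            # identical to log(weight_g)
near_log = box_cox(weight_g, 1e-6)         # approaches the log as lam -> 0
with_estimate = box_cox(weight_g, 0.076)   # lam reported in the text
```

As $\lambda$ approaches 0 the transform converges to the natural logarithm, which is why a small estimate such as 0.076 was taken as support for using the log transformation.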

To estimate genetic parameters, standardized and transformed body weights were modelled using sire-dam DHGLM in ASReml [32]:

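$$\begin{bmatrix} \mathbf{y} \\ \boldsymbol{\Psi} \end{bmatrix} = \begin{bmatrix} \mathbf{X} & \mathbf{0} \\ \mathbf{0} & \mathbf{X}_v \end{bmatrix} \begin{bmatrix} \mathbf{b} \\ \mathbf{b}_v \end{bmatrix} + \begin{bmatrix} \mathbf{Z}_s + \mathbf{Z}_d & \mathbf{0} \\ \mathbf{0} & (\mathbf{Z}_s + \mathbf{Z}_d)_v \end{bmatrix} \begin{bmatrix} \mathbf{u} \\ \mathbf{u}_v \end{bmatrix} + \begin{bmatrix} \mathbf{Q} & \mathbf{0} \\ \mathbf{0} & \mathbf{Q}_v \end{bmatrix} \begin{bmatrix} \mathbf{c} \\ \mathbf{c}_v \end{bmatrix} + \begin{bmatrix} \mathbf{e} \\ \mathbf{e}_v \end{bmatrix}, \qquad (2)$$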

where $\mathbf{y}$ is the vector of stdWT or lnWT records; $\boldsymbol{\Psi}$ is the vector of response variables for the residual variance, with elements $\psi_i = \log(\hat{\sigma}_{e_i}^2) + \frac{\hat{e}_i^2/(1 - h_i) - \hat{\sigma}_{e_i}^2}{\hat{\sigma}_{e_i}^2}$, which was linearized using a Taylor series approximation in ASReml [34]; $\hat{e}_i^2$ is the squared residual of the $i$th body weight record, $h_i$ is the diagonal element of the hat matrix of $\mathbf{y}$ [39], and $\hat{\sigma}_{e_i}^2$ is the estimated residual variance of the $i$th observation in the previous iteration of ASReml; $\mathbf{X}$ and $\mathbf{X}_v$ are incidence matrices of the fixed effects described in Eq. (1) for the trait mean and its uniformity, respectively; $\mathbf{b}$ ($\mathbf{b}_v$) is the solution vector for the corresponding fixed effects; $\mathbf{Z}_s$ and $\mathbf{Z}_d$ are incidence matrices for the random sire (s) and dam (d) effects; $\mathbf{u}$ ($\mathbf{u}_v$) is the vector of additive genetic effects of sires and dams on body weight (uniformity), which was assumed to follow a normal distribution for the A matrix:

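$$\begin{bmatrix} \mathbf{u} \\ \mathbf{u}_v \end{bmatrix} \sim N\left( \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \frac{1}{4} \begin{bmatrix} \sigma_a^2 & \sigma_{a,a_v,\exp} \\ \sigma_{a,a_v,\exp} & \sigma_{a_v,\exp}^2 \end{bmatrix} \otimes \mathbf{A} \right),$$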

and for the H matrix:

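$$\begin{bmatrix} \mathbf{u} \\ \mathbf{u}_v \end{bmatrix} \sim N\left( \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \frac{1}{4} \begin{bmatrix} \sigma_a^2 & \sigma_{a,a_v,\exp} \\ \sigma_{a,a_v,\exp} & \sigma_{a_v,\exp}^2 \end{bmatrix} \otimes \mathbf{H} \right),$$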

where the ¼ accounts for the fact that the sire and dam each explain only a quarter of the additive genetic variance; $\mathbf{Q}$ ($\mathbf{Q}_v$) is the incidence matrix for the random effects common to full-sibs; $\mathbf{c}$ ($\mathbf{c}_v$) is the vector of effects common to full-sibs:

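$$\begin{bmatrix} \mathbf{c} \\ \mathbf{c}_v \end{bmatrix} \sim N\left( \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \begin{bmatrix} \sigma_c^2 & \sigma_{c,c_v,\exp} \\ \sigma_{c,c_v,\exp} & \sigma_{c_v,\exp}^2 \end{bmatrix} \otimes \mathbf{I} \right).$$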

The residuals of y (e) and Ψ (ev) were assumed to be independently normally distributed as follows:

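$$\begin{bmatrix} \mathbf{e} \\ \mathbf{e}_v \end{bmatrix} \sim N\left( \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \begin{bmatrix} \mathbf{W}^{-1}\sigma_{\epsilon}^2 & \mathbf{0} \\ \mathbf{0} & \mathbf{W}_v^{-1}\sigma_{\epsilon_v}^2 \end{bmatrix} \right),$$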

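where $\mathbf{W} = \mathrm{diag}(\hat{\boldsymbol{\Psi}}^{-1})$ and $\mathbf{W}_v = \mathrm{diag}\left(\frac{1 - h}{2}\right)$, and $\sigma_{\epsilon}^2$ ($\sigma_{\epsilon_v}^2$) is a scaled variance that was expected to be 1. The sire-dam DHGLM was fitted iteratively to update $\boldsymbol{\Psi}$, $\mathrm{diag}(\mathbf{W})$ and $\mathrm{diag}(\mathbf{W}_v)$ until the log-likelihood converged [34].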
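The update of the response variable for the variance part of the model can be sketched for a single record. This only illustrates the linearized response $\psi_i$ and the weight $(1 - h_i)/2$ described above, not the iterative ASReml fit; the function names and the input values are hypothetical.

```python
from math import log

def psi_response(residual, leverage, sigma2_prev):
    """Linearized response for the residual variance of one record.

    residual:    estimated residual of the body weight record
    leverage:    diagonal element h_i of the hat matrix (0 <= h_i < 1)
    sigma2_prev: residual variance of this record from the previous iteration
    """
    # Squared residual corrected for leverage.
    adjusted_sq = residual ** 2 / (1.0 - leverage)
    # First-order Taylor expansion of log(adjusted_sq) around sigma2_prev.
    return log(sigma2_prev) + (adjusted_sq - sigma2_prev) / sigma2_prev

def psi_weight(leverage):
    """Diagonal element of W_v = diag((1 - h) / 2)."""
    return (1.0 - leverage) / 2.0

psi_at_expectation = psi_response(residual=1.0, leverage=0.0, sigma2_prev=1.0)
psi_large_residual = psi_response(residual=2.0, leverage=0.2, sigma2_prev=1.0)
weight_example = psi_weight(0.2)
```

When the leverage-corrected squared residual equals the current variance estimate, the response reduces to the log of that variance; larger squared residuals push the response upwards.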

Calculation of genetic parameters

In the sire-dam DHGLM, the estimated variance for sires was set equal to the estimated genetic variance for dams and equal to one quarter of the additive genetic variance. Hence, the additive genetic variance for body weight ($\sigma_a^2$) and its uniformity ($\sigma_{a_v,\exp}^2$) were equal to $4\sigma_u^2$ and $4\sigma_{u_v,\exp}^2$, respectively. Estimates of $\sigma_{u_v,\exp}^2$ and $\sigma_{c_v,\exp}^2$ for uniformity of body weight were on the exponential scale (exp) and were converted to an additive scale ($\sigma_{u_v}^2$ and $\sigma_{c_v}^2$) using the extension of the equations of Mulder et al. [17], as derived by Sae-Lim et al. [8]. The additive genetic variance for uniformity of body weight on the additive scale was equal to $4\sigma_{u_v}^2$. Phenotypic variance ($\sigma_P^2$) of body weight was equal to $2\sigma_u^2 + \sigma_c^2 + \sigma_e^2$, where $\sigma_c^2$ is the variance component for the effect common to full-sibs and $\sigma_e^2$ is the residual variance of body weight. Heritability for body weight ($h^2$) was calculated as $\sigma_a^2/\sigma_P^2$. Heritability for uniformity of body weight ($h_v^2$) on the additive scale was calculated as $\sigma_{a_v}^2/(2\sigma_P^4 + 3\sigma_{a_v}^2 + \sigma_{c_v}^2)$ [8, 40]. Similarly, the common environmental effect was calculated as $c^2 = \sigma_c^2/\sigma_P^2$ for body weight and as $c_v^2 = \sigma_{c_v}^2/(2\sigma_P^4 + 3\sigma_{a_v}^2 + \sigma_{c_v}^2)$ for uniformity of body weight [8]. The genetic coefficient of variation for uniformity of body weight (GCV) was calculated as $\sqrt{\sigma_{a_v,\exp}^2}$. Standard errors of $h_v^2$ and GCV were approximated using the equations presented by Mulder et al. [41].
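The ratios defined in this section can be collected in one small function. This is a sketch with hypothetical names and toy variance components; it takes the uniformity components already converted to the additive scale, because the conversion equations from the exponential scale [8, 17] are not given in the text and are not reproduced here.

```python
def genetic_parameters(var_u, var_c, var_e, var_uv, var_cv):
    """Heritabilities and common environmental ratios from sire-dam variance components.

    var_u:  sire-dam variance of body weight
    var_c:  common full-sib variance of body weight
    var_e:  residual variance of body weight
    var_uv: sire-dam variance of uniformity, additive scale
    var_cv: common full-sib variance of uniformity, additive scale
    """
    var_a = 4.0 * var_u                    # additive genetic variance of body weight
    var_p = 2.0 * var_u + var_c + var_e    # phenotypic variance of body weight
    var_av = 4.0 * var_uv                  # additive genetic variance of uniformity
    # Denominator shared by h_v^2 and c_v^2: 2 sigma_P^4 + 3 sigma_av^2 + sigma_cv^2.
    denom_v = 2.0 * var_p ** 2 + 3.0 * var_av + var_cv
    return {
        "h2": var_a / var_p,
        "c2": var_c / var_p,
        "hv2": var_av / denom_v,
        "cv2": var_cv / denom_v,
    }

# Toy variance components (hypothetical, not estimates from the study).
params = genetic_parameters(var_u=0.075, var_c=0.10, var_e=0.75, var_uv=0.02, var_cv=0.01)
```

The $2\sigma_P^4$ term dominates the denominator for uniformity, which explains why $h_v^2$ stays small even when the genetic variance of uniformity is not negligible.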

Genetic evaluation and cross-validation

Two genetic evaluations, i.e., BLUP with A and ssGBLUP with H, were performed in a 10-fold cross-validation using the genetic parameters estimated based on the sire-dam DHGLM and !BLUP option in ASReml. In total, four models were used in the 10-fold cross-validation, i.e. animal DHGLM with either A or H on stdWT and lnWT.

The 10-fold cross-validation was performed on standardized and transformed body weight data as follows:

  1. Adjusted phenotypes for body weight ($y_i$) and its uniformity ($\psi_i$) were calculated as $y_i = \hat{a}_i + \hat{c}_i + \hat{e}_i$ and $\psi_i = \hat{a}_{v_i} + \hat{c}_{v_i} + \hat{e}_{v_i}$, using the solutions from the analysis with Eq. (2) on the full dataset.
  2. In a modified dataset, approximately 10% of observed phenotypes (yi) of animals from each family were masked (=10% of the full dataset). All phenotypes had an equal chance to be masked, but the animals that were masked in the previous fold were not masked again in the next fold.
  3. The genetic analysis with Eq. (2) was run on the modified dataset using the A and H matrices and EBV for body weight and its uniformity were predicted for the masked animals.
  4. For each fold, two measurements were computed:
    1. The predictive ability of EBV was calculated as the Pearson correlation of adjusted phenotypes (step 1) with the corresponding EBV ($\hat{a}$) (step 3) for the masked animals that were genotyped, i.e., $\mathrm{cor}(y_i, \hat{a}_i)$ for body weight and $\mathrm{cor}(\psi_i, \hat{a}_{v_i})$ for uniformity. Kendall and Spearman correlations were also calculated for uniformity because $\psi_i$ was exponentially rather than normally distributed.
    2. To measure the degree of bias and accuracy of EBV or GEBV of the masked records, the mean square error of prediction (MSEP) was calculated as $\frac{1}{n}\sum_{i=1}^{n}(\hat{a}_i - y_i)^2$ for body weight and $\frac{1}{n}\sum_{i=1}^{n}(\hat{a}_{v_i} - \psi_i)^2$ for uniformity of body weight, where $n$ is the number of masked records in each fold. The MSEP was scaled by the variance of the adjusted phenotypes of the corresponding trait.
  5. Steps (1) to (4) were repeated for each of the 10 folds.
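The fold assignment and the two per-fold measurements listed above can be sketched as follows. The function names, the rotation scheme used to spread each family over the folds, and the use of the sample variance (divisor n − 1) for scaling the MSEP are assumptions of this sketch; the genetic evaluation itself (step 3) is not shown.

```python
import random
from math import sqrt

def assign_folds(family_of, n_folds=10, seed=1):
    """Assign every animal to exactly one fold, spreading each family over all folds."""
    rng = random.Random(seed)
    by_family = {}
    for animal, family in family_of.items():
        by_family.setdefault(family, []).append(animal)
    fold_of = {}
    for members in by_family.values():
        members = sorted(members)
        rng.shuffle(members)               # random order within the family
        offset = rng.randrange(n_folds)    # avoid always starting at fold 0
        for k, animal in enumerate(members):
            fold_of[animal] = (k + offset) % n_folds
    return fold_of

def pearson(x, y):
    """Pearson correlation between EBV and adjusted phenotypes."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def scaled_msep(ebv, adjusted):
    """Mean square error of prediction divided by the variance of adjusted phenotypes."""
    n = len(ebv)
    msep = sum((a - y) ** 2 for a, y in zip(ebv, adjusted)) / n
    mean_y = sum(adjusted) / n
    var_y = sum((y - mean_y) ** 2 for y in adjusted) / (n - 1)
    return msep / var_y

# Toy example: 2 families of 20 animals each (hypothetical identifiers).
fold_of = assign_folds({i: i // 20 for i in range(40)})
r_perfect = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
msep_perfect = scaled_msep([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Because each animal receives a single fold label, an animal masked in one fold is never masked again, and about 10% of every family is masked per fold.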

Finally, average Pearson, Kendall, and Spearman correlations, MSEP and their standard error (SE) over the 10 folds were calculated. A 95% confidence interval of the difference ($d$) in the predictive ability from different models with either A or H was constructed using $d \pm 1.96 \times SE_d$, where $SE_d = \sqrt{\frac{SD_{\text{animal DHGLM}}^2 + SD_{\text{sire-dam DHGLM}}^2}{\text{number of folds}}}$. When 0 was not within the 95% confidence interval, the predictive abilities of two models were considered statistically different (P < 0.05).
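The significance rule above amounts to a short calculation, sketched here with a hypothetical function name and hypothetical input values (they are not results from this study).

```python
from math import sqrt

def difference_ci(mean_1, sd_1, mean_2, sd_2, n_folds=10, z=1.96):
    """95% confidence interval for the difference in predictive ability of two models.

    mean_*: average correlation over folds; sd_*: standard deviation over folds.
    Returns (d, lower, upper, significant).
    """
    d = mean_1 - mean_2
    se_d = sqrt((sd_1 ** 2 + sd_2 ** 2) / n_folds)
    lower, upper = d - z * se_d, d + z * se_d
    # The two models differ (P < 0.05) when the interval excludes 0.
    significant = not (lower <= 0.0 <= upper)
    return d, lower, upper, significant

clear_gain = difference_ci(0.40, 0.05, 0.30, 0.05)   # hypothetical values
small_gain = difference_ci(0.31, 0.10, 0.30, 0.10)   # hypothetical values
```

With only 10 folds, a difference of about 0.01 between models is well within the interval unless the fold-to-fold variation is very small.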

Results

Residual estimates

Individual residuals estimated using the A (BLUP) and H (ssGBLUP) matrices were plotted against each other to examine their relationship. As expected, the range of residual estimates from the animal models was narrower than that from the sire-dam models, since residual estimates from the sire-dam model included the entire Mendelian sampling term (Fig. 1).

Fig. 1
Scatter plot of residuals of body weight from the univariate analysis with A matrix (x-axis) or H matrix (y-axis). Two models were performed; sire-dam univariate model (left) and animal univariate model (right). Red dots are genotyped animals and grey ...

For the sire-dam model, the use of H instead of A hardly affected the estimated residuals: the regression coefficient of the residuals estimated with H on those estimated with A was 0.999 for genotyped animals and 0.998 for non-genotyped animals. The Pearson correlations between the two sets of residuals, and the regression coefficients of the residuals estimated with A on those estimated with H, took the same values (0.999 for genotyped and 0.998 for non-genotyped animals).

In contrast, the use of H in the animal model affected residual estimates of genotyped animals since their distribution was much more scattered (Fig. 1). The slope of estimated residuals using A on estimated residuals using H was lower than 1 and slightly steeper for genotyped animals (regression coefficient = 0.7025) than for non-genotyped animals (regression coefficient = 0.6798). The Pearson correlations between estimated residuals using H and A were equal to 0.922 for genotyped animals and 0.966 for non-genotyped animals.

When using the sire-dam model, the difference in estimated residuals with H and A was small and ranged from −10.8 to 10.0. When using the animal model, this difference was larger and ranged from −95.3 to 104.5.

Genetic parameters of body weight and its uniformity

For body weight, estimates of additive genetic variances from the sire-dam DHGLM with either A or H were similar (Table 2). Likewise, estimates of $h^2$ were similar with A and H for both traits: 0.266 and 0.296, respectively, for stdWT and 0.325 and 0.346, respectively, for lnWT.

Table 2
Estimates of variance components and genetic parameters of body weight and its uniformity based on the sire-dam double hierarchical generalized linear model when using pedigree (A) or combined pedigree and genomic relationships (H) and standard or log-transformed ...

When using A, the estimate of $h_v^2$ was higher for uniformity of stdWT (0.036) than for uniformity of lnWT (0.015), while the use of H did not affect the magnitude of $h_v^2$ for uniformity. Standard errors of $h_v^2$ estimates were, however, high (Table 2).

Although the estimates of $h_v^2$ were low, estimates of GCV were high for uniformity of stdWT (48.0% for A and 52.3% for H), which indicates substantial genetic potential for response to selection. After accounting for scale effects, estimates of GCV for uniformity of lnWT were reduced to 30% (for both A and H), which supports the existence of genetic variation for uniformity beyond the scale effects.

Estimates of $c^2$ for stdWT and lnWT were moderate and similar for A (0.103 to 0.117) and H (0.103 to 0.111), which suggests that part of the phenotypic variation was explained by non-genetic effects that are common to full-sibs. In contrast, the estimates of $c_v^2$ for uniformity of stdWT and lnWT were very low and ranged from 0.001 to 0.022 for A and 0.002 to 0.019 for H (Table 2).

The estimate of the genetic correlation between stdWT and its uniformity was close to 1, using either A (0.952) or H (0.951), which shows the high dependency between mean and variance of body weight. However, the estimate of the genetic correlation between lnWT and its uniformity was reduced to −0.093 with A and to 0.024 with H, which suggests that after accounting for the scale effects, the mean and variance became nearly independent.

Cross-validation

The use of H instead of A with the animal DHGLM resulted in more variation of the within-family GEBV for stdWT and its uniformity, compared to within-family EBV (Fig. 2; see Additional file 1: Figure S1 for the sire-dam DHGLM).

Fig. 2
Boxplots of estimated breeding values for standardized body weight (stdWT) and its uniformity from genotyped animals by family. The breeding values were estimated using the animal double hierarchical generalized linear model. Green boxplots are estimated ...

The average correlation of adjusted phenotypes with predicted breeding values for stdWT and its uniformity was significantly higher with H (stdWT = 0.443; uniformity = 0.217 to 0.317) than with A (stdWT = 0.372; uniformity = 0.128 to 0.192). However, after accounting for scale effects using log-transformation, the average Pearson, Kendall and Spearman correlations of adjusted phenotypes with predicted breeding values for lnWT and their uniformity were only slightly higher with H than with A, and not significantly different from each other (P > 0.05).

The average MSEP for uniformity from the animal DHGLM (0.608 to 0.944) were lower than those from the sire-dam DHGLM (0.973 to 1.112), suggesting that the use of an animal DHGLM increases the accuracy and may reduce bias in predicting breeding values for uniformity (Table 3; Additional file 2: Table S1). However, the average MSEP for uniformity of stdWT and lnWT obtained with H (0.608 to 0.944) were not notably different from those obtained with A (0.625 to 0.936).

Table 3
Average Pearson, Kendall and Spearman correlations and mean square error prediction from a 10-fold cross-validation based on the sire-dam double hierarchical generalized linear model when using pedigree (A) or combined pedigree and genomic relationships ...

The predictive ability of EBV of uniformity was sensitive to the type of correlation used, i.e. Pearson, Kendall and Spearman (Table 3). Spearman correlations were 39.1 to 49.0% higher than Kendall correlations. Predictive abilities of EBV and GEBV for uniformity of lnWT differed more from each other based on Kendall and Spearman correlations, albeit not significant at P < 0.05, than based on Pearson correlations. However, the SE of Kendall correlations were approximately 50% lower than the SE of Pearson and Spearman correlations, suggesting that Kendall correlations provide a more reliable estimate of predictive ability than Pearson and Spearman correlations.

Discussion

To the best of our knowledge, this is the first study that compares the use of the numerator relationship (A) matrix and a combined genomic and numerator relationship (H) matrix for estimating genetic parameters and predicting breeding values for body weight and its uniformity. The use of the animal DHGLM with H significantly improved the predictive ability of GEBV for uniformity of body weight (stdWT) but not for scale-adjusted uniformity.

Genetic parameters

The estimate of heritability for uniformity of stdWT from the sire-dam DHGLM with A was low ($h_v^2$ = 0.036) but higher than estimates of $h_v^2$ obtained in previous studies on rainbow trout [4, 8] and Nile tilapia [15, 16] (mean $h_v^2$ = 0.016; min = 0.010; max = 0.024). However, after accounting for scale effects by logarithm transformations, the estimate of $h_v^2$ decreased to 0.014 to 0.015, which is in line with the previous reports that also used transformations [4, 8, 14–16].

Estimates of $h_v^2$ for stdWT and lnWT using the sire-dam DHGLM with H did not differ from those with A, which is in line with estimates of $h_v^2$ for uniformity of piglet birth weight obtained using either A or only the genomic relationship matrix (G) [42], while lower estimates were reported for environmental variance of somatic cell score in dairy cattle when using G compared to A [43]. The similarity of the estimates of $h_v^2$ obtained by using A or H in this study can be explained by the very similar estimated residuals (proxy of uniformity) between non-genotyped and genotyped animals when using the sire-dam model with A and H. The sire-dam model only exploits relationships between sires and dams and does not exploit the full potential of the genotype-based relationships between animals, and especially between full-sibs. In contrast, residuals of genotyped animals estimated by using the animal model were more differentiated when either A or H was used, and likely more accurate than estimates of residuals for non-genotyped animals. However, in a DHGLM analysis, the sire-dam model provides less biased (co)variance components than the animal model [14], likely because of the dependence between estimates of the breeding value and residual of an individual, which are obtained from the same phenotype of body weight. The use of genomic relationships combined with numerator relationships is expected to reduce the dependency between EBV and estimated residuals because the EBV are more accurate. Therefore, we fitted the animal DHGLM with H, but the model did not converge when the variance components were estimated, which may be due to (1) the dependency between EBV and estimated residuals for body weight remaining high, or (2) the difficulty of disentangling genetic effects from the common environmental effects for uniformity of body weight.

The standard errors of $h_v^2$ estimates were high, which may be due to the large variation in family size (4 to 54). According to Hill and Mulder [18], large family sizes or repeated measurements are recommended for estimating the genetic heteroscedasticity of traits. The optimal full-sib family size is 39 with a GCV of 39% and an $h^2$ of 0.36 [18]. The use of H did not affect the standard errors of the $h_v^2$ estimates, which does not agree with previous studies; for example, Veerkamp et al. [44] reported lower standard errors of $h^2$ estimates for dry matter intake, milk yield, and body weight of heifers when using genomic relationships with an animal model. One possible explanation could be that the benefit of using the genomic relationship matrix may be limited when the sire-dam DHGLM is applied, since the variance of genomic relationships between sires (0.02) was very similar to the variance of numerator relationships between sires (0.01). In contrast, the variance in genomic relationships between animals (0.01) was much larger than the variance of numerator relationships between animals (0.004). Hence, the SE of $h_v^2$ estimates may be lower when using an animal DHGLM with a genomic relationship matrix, compared to the numerator relationship matrix. Nevertheless, in our study, it was not possible to investigate this phenomenon since the log-likelihood did not converge for the animal DHGLM with H.

The GCV of uniformity for stdWT was substantial (48.0%), which indicates high potential for response to selection. This result is in the upper range of previous findings in fish species (17.4 to 64.0%) [4, 8, 14–16, 19] and in terrestrial animals (10.0 to 58.0%) [6, 9–13, 42]. After accounting for scale effects by logarithm transformations, the GCV for uniformity was reduced but still substantial (29.5 to 29.9%), which was also reported in previous studies on Atlantic salmon [14], rainbow trout [8], rabbit, and pig [45]. Thus, scale effects affect estimates of genetic parameters for uniformity of body weight considerably, but there is genetic variation for uniformity beyond the scale effect.

Genomic information slightly increased the GCV of uniformity of stdWT (from 48.0 to 52.3%). In contrast, genomic information did not influence the GCV for uniformity of lnWT (29.8%). Since estimates of genetic parameters obtained with A and H were similar, the GCV for body weight remained similar, which is in agreement with the previous comparison between A (GCV = 11.0 to 12.0%) and G (GCV = 10.0 to 11.0%) for uniformity of birth weight of piglets using a dam model [42].

Genetic and genomic predictions

In this study, we used the Pearson correlation of EBV and GEBV with adjusted phenotype as the measure of predictive ability. The use of H instead of A in the animal DHGLM significantly improved the ability to predict breeding values for stdWT (19%) and its uniformity (41.1 to 78.1%). Furthermore, the use of the animal DHGLM instead of the sire-dam DHGLM significantly increased the predictive ability of EBV and GEBV for uniformity (see Additional file 2: Table S1), as expected.

Our findings indicate that ssGBLUP with an animal DHGLM can increase the accuracy of EBV for uniformity substantially compared to pedigree-based BLUP. However, after accounting for scaling effects by using log transformations, the use of H compared to A only slightly improved the correlation (1.3 to 13.9%) and MSEP between GEBV and adjusted phenotypes, and the improvement was not significant. There are two main reasons why these results differed between uniformity of stdWT and lnWT. First, log-transformation substantially reduced the genetic correlation of body weight with its uniformity. As a result, any increase in predictive ability of GEBV of lnWT when using H instead of A (15.7%) did not carry over to the predictive ability of GEBV of its uniformity. Second, the lower additive genetic variance and hv2 for uniformity after accounting for scale effects reduce the accuracy of EBV for uniformity. Consequently, MSEP increased from 0.63 (stdWT) to 0.94 (lnWT) with A and from 0.61 (stdWT) to 0.94 (lnWT) with H in the animal DHGLM.

The accuracy of genomic selection is expected to increase when the number of genotyped animals in the reference population increases for any trait [46] but in particular for lowly heritable traits, such as uniformity as shown by Sell-Kubiak et al. [42] and somatic cell score by Mulder et al. [43]. In this study, uniformity of lnWT had an even lower hv2 than uniformity of stdWT. The number of genotyped animals in the reference population used for cross-validation was on average equal to 1274, which may have limited the benefit of using genomic information in ssGBLUP. A future empirical study should investigate the effect of the number of genotyped animals in the reference population on the ability to predict breeding values for uniformity to validate our findings and conclusions.
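The dependence of accuracy on reference population size can be illustrated with a deterministic approximation in the spirit of Daetwyler et al. [46], where expected accuracy is sqrt(N·h2 / (N·h2 + Me)). The sketch below is only illustrative: the number of independent chromosome segments (`M_E`) and the heritability values are hypothetical placeholders, not estimates from this study, and the formula ignores the pedigree information that ssGBLUP also uses.

```python
import math


def expected_accuracy(n_ref: int, h2: float, m_e: float) -> float:
    """Approximate accuracy of genomic prediction: sqrt(N*h2 / (N*h2 + Me)).

    n_ref: number of genotyped and phenotyped animals in the reference population
    h2:    heritability of the trait (for uniformity, the heritability hv2)
    m_e:   number of independent chromosome segments (assumed, not from the paper)
    """
    if n_ref < 0 or not 0.0 <= h2 <= 1.0 or m_e <= 0:
        raise ValueError("n_ref >= 0, 0 <= h2 <= 1 and m_e > 0 are required")
    signal = n_ref * h2
    return math.sqrt(signal / (signal + m_e))


if __name__ == "__main__":
    M_E = 1000.0  # hypothetical value, chosen only for illustration
    for h2 in (0.01, 0.05, 0.30):  # hypothetical low to moderate heritabilities
        accuracies = [expected_accuracy(n, h2, M_E) for n in (1274, 5000, 20000)]
        print(h2, [round(a, 3) for a in accuracies])
```

Under these assumptions, enlarging the reference population raises accuracy for every heritability, and the relative gain is largest for the lowest heritability, which is the pattern described above for uniformity.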

Pearson or rank correlations?

Squared residuals or adjusted phenotype for uniformity (ψi) are exponentially rather than normally distributed, which may not justify quantifying predictive ability using a Pearson correlation. Thus, we also calculated distribution-free rank correlations (Kendall and Spearman) and indeed found estimation of the predictive ability for uniformity to be sensitive to the type of correlation used.

Although not significantly different, Kendall and Spearman correlations explained differences in predictive ability of EBV and GEBV for uniformity of lnWT slightly better than the Pearson correlation. Hence, the conclusion that the benefit of using genomic relationships for computing EBV for uniformity of log-transformed body weight is limited remained the same when using Kendall and Spearman correlations. Colwell and Gillett [47] showed that, in general, estimates of Kendall correlations are similar to estimates of Spearman correlations, but in some cases, the magnitude of Spearman correlations can be 50% greater than the magnitude of Kendall correlations. This is in line with our findings since Spearman correlations were 42.2 to 49.0% and 39.1 to 46.1% greater than Kendall correlations for the sire-dam DHGLM (see Additional file 2: Table S1) and the animal DHGLM, respectively. Nevertheless, the SE of Kendall correlations were approximately 50% lower than the SE of Pearson and Spearman correlations, which indicates that the Kendall correlation may be a more reliable estimate of predictive ability than the Pearson and Spearman correlations. Hence, it is recommended to use Kendall instead of Pearson correlations when studying predictive ability for uniformity.
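The three measures of predictive ability compared above can be sketched in a few lines. This is a minimal pure-Python illustration, not the code used in the study: the function names are hypothetical, Kendall's coefficient is computed in its tau-b form (an assumption, since the variant is not stated here), and the demonstration data are simulated rather than taken from the salmon data set.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall tau-b from concordant and discordant pairs, O(n^2)."""
    conc = disc = only_x = only_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: contributes nothing
            if dx == 0:
                only_x += 1  # tied in x only
            elif dy == 0:
                only_y += 1  # tied in y only
            elif dx * dy > 0:
                conc += 1
            else:
                disc += 1
    denom = math.sqrt((conc + disc + only_x) * (conc + disc + only_y))
    return (conc - disc) / denom


if __name__ == "__main__":
    rng = random.Random(1)
    # Simulated breeding values and skewed, exponential-like adjusted phenotypes.
    ebv = [rng.gauss(0.0, 1.0) for _ in range(200)]
    psi = [rng.expovariate(1.0) * math.exp(0.3 * g) for g in ebv]
    for name, fn in (("Pearson", pearson), ("Spearman", spearman),
                     ("Kendall", kendall_tau_b)):
        print(name, round(fn(ebv, psi), 3))
```

The rank-based coefficients depend only on the ordering of the observations, so a few very large squared residuals cannot dominate them in the way they can dominate a Pearson correlation.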

Selection for uniformity

For fish breeding, major goals are to increase mean body weight and reduce variability (more uniformity) of body weight. Nevertheless, definitions of uniformity of stdWT and lnWT are not the same. From a biological point of view, genetic variation for environmental canalization can be quantified after the scale effect is accounted for. However, from an animal breeding point of view, uniformity on the observed scale explains the actual range of fish sizes that are processed by aquaculture industries.

Selection for body weight and uniformity may be challenging because the genetic correlation between body weight and its variability is high and positive, and sometimes approaches 1. A general observation is that the genetic correlation between log-transformed body weight and its variability is zero or even negative, allowing selection to simultaneously increase transformed body weight and reduce variability. Therefore log-transformed body weight and its variability could be included in a selection index. This would require knowledge of the genetic correlation between variability of stdWT and variability of lnWT, which is not available. Hence, we used the sire-EBV and sire-GEBV, obtained from BLUP and ssGBLUP with the animal DHGLM, to calculate the Pearson correlations between EBV for the trait and its variability. Pearson correlations between EBV for variability of stdWT and EBV for variability of lnWT were moderately positive with BLUP (0.48) and ssGBLUP (0.68). The Pearson correlation between EBV for stdWT and EBV for its variability was close to 1 with either BLUP (0.96) or ssGBLUP (0.97), while the Pearson correlation between EBV for lnWT and EBV for its variability was −0.24 for BLUP and −0.005 for ssGBLUP. Not surprisingly, the Pearson correlation between EBV for lnWT and EBV for stdWT was highly positive (0.82). These Pearson correlations suggest that variability of lnWT should be included in the selection index because the GEBV for variability of lnWT are positively correlated with GEBV for variability of stdWT, which indicates that selection against variability of lnWT will indirectly reduce variability on the observed scale. Furthermore, GEBV for variability of lnWT are not correlated with GEBV for lnWT and thus selection against variability of lnWT will not indirectly reduce lnWT.

To elucidate responses to selection for body weight and its variability, we performed truncation selection on sires based on their GEBV from the animal DHGLM with ssGBLUP. The breeding goal could include body weight and variability on the observed scale and their economic values (v): $v_1 \mathrm{BV}_{\mathrm{stdWT}} - v_2 \mathrm{BV}_{\mathrm{variability\,stdWT}}$, while, based on the Pearson correlations discussed above, the selection index (I) could include lnWT and its variability with their relative weighting factors (b):

$I = b_1 \mathrm{GEBV}_{\mathrm{lnWT}} - b_2 \mathrm{GEBV}_{\mathrm{variability\,lnWT}}.$

Selecting the 10% best sires on an index with $b_1$ of 0.3, i.e., $I = 0.3\,\mathrm{GEBV}_{\mathrm{lnWT}} - 0.7\,\mathrm{GEBV}_{\mathrm{variability\,lnWT}}$, provides almost no genetic gain in variability of stdWT (−0.001) but positive genetic gain in stdWT (3.62% of mean body weight in g). In contrast, a selection index based on breeding goal traits, $I = 0.52\,\mathrm{GEBV}_{\mathrm{stdWT}} - 0.48\,\mathrm{GEBV}_{\mathrm{variability\,stdWT}}$, provides zero genetic gains for both stdWT and its variability, showing no possibility to achieve genetic gain on body weight while maintaining stable phenotypic variability. Nevertheless, the genetic gain in stdWT was much greater (17.32% of mean body weight in g) when variability was not included in the selection index. Therefore, although it is possible to increase body weight while keeping variability constant, there is a trade-off in genetic gain for body weight when selecting for reduced variability.
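The index selection described above can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the function names are hypothetical, the weight on variability is assumed to be 1 − b1 (consistent with the 0.3/0.7 and 0.52/0.48 examples), and genetic gain is assumed here to be the mean GEBV of the selected sires minus the mean GEBV of all candidate sires. The simulated GEBV are placeholders, not data from the study.

```python
import random


def selection_index(gebv_trait, gebv_variability, b1):
    """Index I = b1 * GEBV_trait - (1 - b1) * GEBV_variability for each sire."""
    b2 = 1.0 - b1  # assumed: the two weights sum to 1
    return [b1 * t - b2 * v for t, v in zip(gebv_trait, gebv_variability)]


def truncation_select(index, proportion):
    """Return positions of the top `proportion` of candidates ranked on the index."""
    n_selected = max(1, int(len(index) * proportion))
    ranked = sorted(range(len(index)), key=lambda i: index[i], reverse=True)
    return ranked[:n_selected]


def selection_differential(values, selected):
    """Mean of the selected candidates minus the mean of all candidates."""
    mean_all = sum(values) / len(values)
    mean_selected = sum(values[i] for i in selected) / len(selected)
    return mean_selected - mean_all


if __name__ == "__main__":
    rng = random.Random(7)
    n_sires = 100  # hypothetical number of candidate sires
    gebv_lnwt = [rng.gauss(0.0, 1.0) for _ in range(n_sires)]
    gebv_var_lnwt = [rng.gauss(0.0, 1.0) for _ in range(n_sires)]

    index = selection_index(gebv_lnwt, gebv_var_lnwt, b1=0.3)
    best = truncation_select(index, proportion=0.10)
    print("gain in lnWT:", round(selection_differential(gebv_lnwt, best), 3))
    print("gain in variability:",
          round(selection_differential(gebv_var_lnwt, best), 3))
```

Evaluating the selected sires' GEBV for traits that are not in the index (for example stdWT and its variability) with `selection_differential` gives the correlated responses discussed in the text.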

Conclusions

The use of the animal DHGLM instead of the sire-dam DHGLM substantially increased the predictive ability for breeding values of uniformity, because the animal DHGLM fully exploits the relationships between full- and half-sibs. When using the animal DHGLM, the use of a combined numerator and genomic relationship matrix significantly increased the predictive ability for breeding values of uniformity of body weight, but only a slight and non-significant increase was observed after accounting for the scale effects by using transformed body weights. The small increase in predictive ability with transformed body weights may be due to lower heritability for uniformity of transformed body weight, a lower genetic correlation between transformed body weights and their uniformities, and/or a small number of genotyped animals in the reference population. The Kendall correlation had the lowest SE and therefore provided a more reliable estimate of predictive ability for uniformity than the Pearson and Spearman correlations. In conclusion, the use of ssGBLUP increases the accuracy of breeding values for uniformity of harvest weight, which is expected to increase response to selection in uniformity.

Authors’ contributions

PSL analyzed the data. HAM, AK, and ML provided theoretical support for genomic DHGLM and cross-validation. HAM, AK, and ML contributed to the discussion of the results. PSL drafted the manuscript. HAM, AK, and ML improved the manuscript. All authors read and approved the final manuscript.

Acknowledgements

This study is part of the research project STABLEFISH, funded by the Norwegian Research Council (NRC: 234144/E49). We would like to thank SalmoBreed for providing data for this study. Matthew Baranski genotyped the animals and generated the genotype data file for this study. Arthur Gilmour is acknowledged for help in implementing genomic DHGLM in ASReml v4. PSL would like to thank Mario Calus for his guidance on the calc_grm computer software, and Solomon Antwi Boison and Sergio Vela Avitúa for a fruitful discussion during the drafting of this manuscript.

Competing interests

The authors declare that they have no competing interests.

Footnotes

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-017-0308-3) contains supplementary material, which is available to authorized users.

Contributor Information

Panya Sae-Lim, panya.sae-lim@nofima.no.

Antti Kause, antti.kause@luke.fi.

Marie Lillehammer, marie.lillehammer@nofima.no.

Han A. Mulder, han.mulder@wur.nl.

References

1. Sae-Lim P, Komen H, Kause A, van Arendonk JAM, Barfoot AJ, Martin KE, et al. Defining desired genetic gains for rainbow trout breeding objective using analytic hierarchy process. J Anim Sci. 2012;90:1766–1776. doi: 10.2527/jas.2011-4267.
2. Gilmour KM, DiBattista JD, Thomas JB. Physiological causes and consequences of social status in salmonid fish. Integr Comp Biol. 2005;45:263–273. doi: 10.1093/icb/45.2.263.
3. Janhunen M, Kause A, Järvisalo O. Costs of being extreme - Do body size deviations from population or sire means decrease vitality in rainbow trout? Aquaculture. 2012;370–371:123–129. doi: 10.1016/j.aquaculture.2012.10.013.
4. Janhunen M, Kause A, Vehviläinen H, Järvisalo O. Genetics of microenvironmental sensitivity of body weight in rainbow trout (Oncorhynchus mykiss) selected for improved growth. PLoS One. 2012;7:e38766. doi: 10.1371/journal.pone.0038766.
5. Baras E, Jobling M. Dynamics of intracohort cannibalism in cultured fish. Aquacult Res. 2002;33:461–479. doi: 10.1046/j.1365-2109.2002.00732.x.
6. Mulder HA, Bijma P, Hill WG. Selection for uniformity in livestock by exploiting genetic heterogeneity of residual variance. Genet Sel Evol. 2008;40:37–59.
7. Mulder H, Hill W, Vereijken A, Veerkamp R. Estimation of genetic variation in residual variance in female and male broiler chickens. Animal. 2009;3:1673–1680. doi: 10.1017/S1751731109990668.
8. Sae-Lim P, Kause A, Janhunen M, Vehviläinen H, Koskinen H, Gjerde B, et al. Genetic (co)variance of rainbow trout (Oncorhynchus mykiss) body weight and its uniformity across production environments. Genet Sel Evol. 2015;47:46. doi: 10.1186/s12711-015-0122-8.
9. Ros M, Sorensen D, Waagepetersen R, Dupont-Nivet M, SanCristobal M, Bonnet JC, et al. Evidence for genetic control of adult weight plasticity in the snail Helix aspersa. Genetics. 2004;168:2089–2097. doi: 10.1534/genetics.104.032672.
10. Rowe S, White IM, Avendano S, Hill WG. Genetic heterogeneity of residual variance in broiler chickens. Genet Sel Evol. 2006;38:617–635. doi: 10.1186/1297-9686-38-6-617.
11. Wolc A, White IM, Avendano S, Hill WG. Genetic variability in residual variation of body weight and conformation scores in broiler chickens. Poult Sci. 2009;88:1156–1161.
12. Ibáñez-Escriche N, Moreno A, Nieto B, Piqueras P, Salgado C, Gutiérrez JP. Genetic parameters related to environmental variability of weight traits in a selection experiment for weight gain in mice; signs of correlated canalised response. Genet Sel Evol. 2008;40:279–293.
13. Ibáñez-Escriche N, Varona L, Sorensen D, Noguera JL. A study of heterogeneity of environmental variance for slaughter weight in pigs. Animal. 2008;2:19–26. doi: 10.1017/S1751731107001000.
14. Sonesson A, Ødegård J, Rönnegård L. Genetic heterogeneity of within-family variance of body weight in Atlantic salmon (Salmo salar). Genet Sel Evol. 2013;45:41. doi: 10.1186/1297-9686-45-41.
15. Marjanovic J, Mulder H, Khaw H, Bijma P. Genetic parameters for uniformity of harvest weight in the GIFT strain of Nile tilapia estimated using double hierarchical generalized linear models. In: Proceedings of the international symposium on genetics in aquaculture XII, 21–27 June 2015; Santiago de Compostela; 2015. http://isga2015.acuigen.es/isga-2015-Abstract-Book.pdf.
16. Khaw HL, Ponzoni RW, Yee HY, bin Aziz MA, Mulder HA, Marjanovic J, et al. Genetic variance for uniformity of harvest weight in Nile tilapia (Oreochromis niloticus). Aquaculture. 2016;451:113–120. doi: 10.1016/j.aquaculture.2015.09.003.
17. Mulder HA, Bijma P, Hill WG. Prediction of breeding values and selection response with genetic heterogeneity of environmental variance. Genetics. 2007;175:1895–1910. doi: 10.1534/genetics.106.063743.
18. Sae-Lim P, Gjerde B, Nielsen HM, Mulder H, Kause A. A review of genotype-by-environment interaction and micro-environmental sensitivity in aquaculture species. Rev Aquacult. 2015;8:369–393. doi: 10.1111/raq.12098.
19. Marjanovic J, Mulder HA, Khaw HL, Bijma P. Genetic parameters for uniformity of harvest weight and body size traits in the GIFT strain of Nile tilapia. Genet Sel Evol. 2016;48:41. doi: 10.1186/s12711-016-0218-9.
20. Falconer DS, Mackay TFC. Introduction to quantitative genetics. 4th ed. London: Pearson; 1996.
21. Meuwissen THE, Hayes BJ, Goddard ME. Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001;157:1819–1829.
22. Goddard M, Hayes B, Meuwissen THE. Genomic selection in farm animal species-lessons learnt and future perspectives. In: Proceedings of the 9th world congress on genetics applied to livestock production, 1–6 August 2010; Leipzig. 2010.
23. Misztal I, Aggrey SE, Muir WM. Experiences with a single-step genome evaluation. Poult Sci. 2013;92:2530–2534. doi: 10.3382/ps.2012-02739.
24. Misztal I, Legarra A, Aguilar I. Computing procedures for genetic evaluation including phenotypic, full pedigree, and genomic information. J Dairy Sci. 2009;92:4648–4655. doi: 10.3168/jds.2009-2064.
25. Legarra A, Aguilar I, Misztal I. A relationship matrix including full pedigree and genomic information. J Dairy Sci. 2009;92:4656–4663. doi: 10.3168/jds.2009-2061.
26. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559–575. doi: 10.1086/519795.
27. Gilmour AR, Gogel BJ, Cullis BR, Thompson R. ASReml User Guide Release 4.0. Hemel Hempstead: VSM International Ltd; 2012.
28. VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–4423. doi: 10.3168/jds.2007-0980.
29. Aguilar I, Misztal I, Johnson DL, Legarra A, Tsuruta S, Lawlor TJ. Hot topic: a unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J Dairy Sci. 2010;93:743–752. doi: 10.3168/jds.2009-2730.
30. Christensen OF, Lund MS. Genomic prediction when some animals are not genotyped. Genet Sel Evol. 2010;42:2. doi: 10.1186/1297-9686-42-2.
31. Calus MPL, Vandenplas J. calc_grm - a program to compute pedigree, genomic, and combined relationship matrices. Animal Breeding and Genomics Centre: Wageningen; 2015.
32. Hill WG, Mulder HA. Genetic analysis of environmental variation. Genet Res (Camb). 2010;92:381–395. doi: 10.1017/S0016672310000546.
33. Rönnegård L, Felleki M, Fikse F, Mulder H, Strandberg E. Genetic heterogeneity of residual variance - estimation of variance components using double hierarchical generalized linear models. Genet Sel Evol. 2010;42:8. doi: 10.1186/1297-9686-42-8.
34. Felleki M, Lee D, Lee Y, Gilmour AR, Rönnegård L. Estimation of breeding values for mean and dispersion, their variance and correlation using double hierarchical generalized linear models. Genet Res (Camb). 2012;94:307–317. doi: 10.1017/S0016672312000766.
35. Lande R. On comparing coefficients of variation. Syst Zool. 1977;26:214–217. doi: 10.2307/2412845.
36. Box GE, Cox DR. An analysis of transformations. J R Stat Soc Ser B Stat Methodol. 1964;26:211–252.
37. Sakia R. The Box–Cox transformation technique: a review. Statistician. 1992;41:169–178. doi: 10.2307/2348250.
38. R Development Core Team. R: a language and environment for statistical computing. Vienna: The R Foundation for Statistical Computing; 2011.
39. Hoaglin DC, Welsch RE. The hat matrix in regression and ANOVA. Am Stat. 1978;32:17–22.
40. Felleki M, Lundeheim N. Genetic control of residual variance for teat number in pigs. Proc Assoc Advmt Anim Breed Genet. 2013;20:538–541.
41. Mulder HA, Visscher J, Fablet J. Estimating the purebred–crossbred genetic correlation for uniformity of eggshell color in laying hens. Genet Sel Evol. 2016;48:39. doi: 10.1186/s12711-016-0212-2.
42. Sell-Kubiak E, Wang S, Knol EF, Mulder HA. Genetic analysis of within-litter variation in piglets’ birth weight using genomic or pedigree relationship matrices. J Anim Sci. 2015;93:1471–1480. doi: 10.2527/jas.2014-8674.
43. Mulder HA, Crump RE, Calus MPL, Veerkamp RF. Unraveling the genetic architecture of environmental variance of somatic cell score using high-density single nucleotide polymorphism and cow data from experimental farms. J Dairy Sci. 2013;96:7306–7317. doi: 10.3168/jds.2013-6818.
44. Veerkamp RF, Mulder HA, Thompson R, Calus MPL. Genomic and pedigree-based genetic parameters for scarcely recorded traits when some animals are genotyped. J Dairy Sci. 2011;94:4189–4197. doi: 10.3168/jds.2011-4223.
45. Yang Y, Christensen O, Sorensen D. Analysis of a genetically structured variance heterogeneity model using the Box–Cox transformation. Genet Res (Camb). 2011;93:33–46. doi: 10.1017/S0016672310000418.
46. Daetwyler HD, Villanueva B, Woolliams JA. Accuracy of predicting the genetic risk of disease using a genome-wide approach. PLoS One. 2008;3:e3395. doi: 10.1371/journal.pone.0003395.
47. Colwell DJ, Gillett JR. 66.49 Spearman versus Kendall. Math Gaz. 1982;66:307–309. doi: 10.2307/3615525.

Articles from Genetics, Selection, Evolution : GSE are provided here courtesy of BioMed Central