In farmed Atlantic salmon, heritability for uniformity of body weight is low, indicating that the accuracy of estimated breeding values (EBV) may be low. The use of genomic information could be one way to increase accuracy and, hence, obtain greater response to selection. Genomic information can be merged with pedigree information to construct a combined relationship matrix (H matrix) for a single-step genomic evaluation (ssGBLUP), allowing realized relationships of the genotyped animals to be exploited, in addition to numerator pedigree relationships (A matrix). We compared the predictive ability of EBV for uniformity of body weight in Atlantic salmon, when implementing either the A or H matrix in the genetic evaluation. We used double hierarchical generalized linear models (DHGLM) based either on a sire-dam (sire-dam DHGLM) or an animal model (animal DHGLM) for both body weight and its uniformity.
With the animal DHGLM, the use of H instead of A significantly increased the correlation between the predicted EBV and adjusted phenotypes, which is a measure of predictive ability, for both body weight and its uniformity (41.1 to 78.1%). When log-transformed body weights were used to account for a scale effect, the use of H instead of A produced a small and non-significant increase (1.3 to 13.9%) in predictive ability. The sire-dam DHGLM had lower predictive ability for uniformity compared to the animal DHGLM.
Use of the combined numerator and genomic relationship matrix (H) significantly increased the predictive ability of EBV for uniformity when using the animal DHGLM for untransformed body weight. The increase was only minor when using log-transformed body weights, which may be due to the lower heritability of scaled uniformity, the lower genetic correlation of transformed body weight with its uniformity compared to the untransformed traits, and the small number of genotyped animals in the reference population. This study shows that ssGBLUP increases the accuracy of EBV for uniformity of body weight and is expected to increase response to selection in uniformity.
The online version of this article (doi:10.1186/s12711-017-0308-3) contains supplementary material, which is available to authorized users.
In aquaculture, selection to increase economically important traits such as growth is one of the main breeding goals. However, fish producers are interested in improving not only the mean but also the variance of traits. Uniformity of growth is preferable because more uniform growth allows a more uniform product, harvest of a larger proportion of the population at market size, and reduction of size grading and multiple harvests [2–4]. More uniform growth may also reduce competitive interactions between animals, which helps to reduce feed monopolization and dominant behaviour, and thus improves the well-being of fish. Uniformity is also important for traits that have an intermediate optimal trait value, such as fillet lipid%, body shape, and condition factor in the aquaculture industry. A fish whose growth is sensitive to non-measurable environmental factors, known as micro-environments, shows micro-environmental sensitivity, which results in high environmental variance and consequently contributes to increased phenotypic variation, leading to increased size variation within a group of fish. A number of empirical studies in terrestrial and aquatic species show that uniformity is partly determined by genetic factors [4, 7–16]. Thus, selective breeding is one avenue to improve uniformity of fish traits.
Atlantic salmon (Salmo salar L.) is a farmed fish that is of major economic importance. Heritability for uniformity of body weight has been estimated in Atlantic salmon, rainbow trout (Oncorhynchus mykiss Walbaum) [4, 8], and Nile tilapia (Oreochromis niloticus) [15, 16]. In general, heritability for uniformity is low in livestock and aquaculture species (< 0.05), indicating that the prediction accuracy of breeding values for uniformity may be low [17, 18]. However, the coefficient of genetic variation (GCV) of uniformity of body weight is high in fish species (median GCV = 34.0%; min = 17.4% and max = 64.0%), which indicates high potential for response to selection [4, 8, 14, 16, 19]. One way to increase response to selection for uniformity is to increase the accuracy of estimated breeding values (EBV) for uniformity.
In aquaculture, full- and half-sib family sizes are usually large and thus the accuracy of EBV based on full-sibs, half-sibs and own performance is high for body weight, but not for uniformity due to its low heritability. One approach to increase the accuracy of EBV is to use genomic information. With genomic selection, genomic estimated breeding values (GEBV) can be obtained for the selection candidates that are genotyped, even when they have no phenotype records. One reason why genomic selection results in higher accuracy of selection is the more accurate estimation of the Mendelian sampling genetic effects through realized additive genetic relationships among animals. Consequently, individual squared residuals, which are the phenotype for uniformity in a double hierarchical generalized linear model (DHGLM), may also be more accurately estimated when using genomic information.
In many cases, combining numerator pedigree and genomic information in genomic evaluations is implemented in multiple steps, which may introduce bias and requires additional calculations to combine the genomic predictions with pedigree-based EBV [23, 24]. Single-step genomic best linear unbiased prediction (ssGBLUP) avoids this because genomic and pedigree information are combined in one step, which may lead to less bias and is less prone to double counting of information compared to genomic evaluation methods that are performed in multiple steps. The ssGBLUP augments the numerator relationship (A) matrix by the genomic relationship (G) matrix in conventional genetic evaluation using BLUP. This combined numerator and genomic relationship matrix is known as the H matrix. In fish breeding, combining pedigree and genomic information allows exploiting the large full- and half-sib families and the more accurate relationships of the genotyped animals, and may yield a higher accuracy of selection for uniformity than the use of the A matrix.
To date, the use of ssGBLUP for uniformity has not been studied. Furthermore, according to a previous study, the sire-dam model, but not the animal model, implemented within the framework of DHGLM provided unbiased (co)variance component estimates . However, an animal DHGLM is expected to perform better than a sire-dam DHGLM for genetic evaluation because the animal DHGLM uses full relationships between animals rather than only among sires and dams. This is particularly important for uniformity, which is quantified by the residuals of individuals, which in the animal model do not contain the Mendelian sampling term. Moreover, for genetic evaluation, the animal DHGLM uses all phenotypic information and, for most breeding programs, at least part of the selection candidates, e.g. females for sex-linked traits, have phenotypes available at the time of selection. Use of the animal DHGLM with ssGBLUP for uniformity has not been tested.
In this study, we implemented ssGBLUP for predicting GEBV for uniformity in Atlantic salmon. Specifically, our aim was to compare the predictive ability of EBV for uniformity of body weight when implementing either BLUP with the A matrix or ssGBLUP with the H matrix. The (co)variance components were estimated from the sire-dam DHGLM with either A or H and compared prior to genetic evaluation.
The data used in this study originated from an experiment conducted by Nofima AS and the breeding company SalmoBreed in Norway. The experiment followed all the regulations of animal ethical practice and was approved by the Norwegian Research Animal Committee (ID 6489). In 2013, 234 full-sib families were established from the mating of 131 sires to 234 dams (Table 1) during four weeks. Forty-seven percent of the parents were from year class 2009 and the rest from year class 2010. After hatching, fingerlings from each family were held in a 180-L family tank until tagging size (at mean body weight of 50 g). Each animal was tagged using passive integrated transponder (PIT) tags (Satpos AS, Norway). During tagging, a fin sample for genotyping was collected from 21 to 38 sibs of each of 50 full-sib families. Thereafter, all fish were randomly allocated to three experiment tanks and grown for 11 months. At the average age of 16 months, all fish were challenged with sea lice using a co-habitat challenge, and at the end of the challenge test, final body weight (g) was measured for all 3595 fish with an electronic balance. A total of 1416 offspring (39% of all offspring) and the 131 sires and 234 dams were genotyped using the 31 K Affymetrix single nucleotide polymorphism (SNP) chip for Atlantic salmon developed by Nofima. Quality control of SNPs was performed in PLINK v1.9 based on the following criteria: SNPs were removed if (1) their call rate was lower than 90%, (2) they deviated from Hardy–Weinberg equilibrium with a P value cut-off of $10^{-15}$, or (3) their minor allele frequency (MAF) was lower than 0.01. After quality control, 921 of 31,013 SNPs were removed (2.9%) and thus 30,092 SNPs remained to create the genomic relationship matrix.
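The call-rate and MAF filters described above can be illustrated with a minimal NumPy sketch. It is not the PLINK implementation used in the study: the function name `snp_quality_control` and the toy genotype matrix are hypothetical, missing calls are assumed to be coded as `nan`, and the Hardy–Weinberg filter is deliberately left out.

```python
import numpy as np


def snp_quality_control(genotypes, min_call_rate=0.90, min_maf=0.01):
    """Return a boolean mask of SNPs (columns) that pass call-rate and MAF filters.

    genotypes: 2-D array (animals x SNPs) coded 0/1/2, with np.nan for missing calls.
    The Hardy-Weinberg filter mentioned in the text is NOT applied here.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(genotypes)
    call_rate = called.mean(axis=0)

    # Frequency of the counted allele, based on called genotypes only.
    n_called = called.sum(axis=0)
    allele_sum = np.nansum(genotypes, axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        freq = allele_sum / (2.0 * n_called)
    maf = np.minimum(freq, 1.0 - freq)

    # SNPs without any call have a nan MAF and fail the comparison below.
    return (call_rate >= min_call_rate) & (maf >= min_maf)


# Hypothetical toy data: 4 animals x 4 SNPs.
toy = np.array([
    [0, 1, 2, 0],
    [1, 1, 2, np.nan],
    [2, 1, 2, np.nan],
    [1, 0, 2, 0],
])
keep = snp_quality_control(toy)
```

In this toy example the third SNP is monomorphic (MAF of 0) and the fourth has a call rate of 50%, so only the first two SNPs are retained.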
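The numerator relationship (A) matrix with 814 ancestors in four generations was prepared based on pedigree information using ASReml. The combined numerator and genomic relationship (H) matrix was defined as:

$$\mathbf{H} = \begin{bmatrix} \mathbf{A}_{11} + \mathbf{A}_{12}\mathbf{A}_{22}^{-1}(\mathbf{G} - \mathbf{A}_{22})\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{G} \\ \mathbf{G}\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{G} \end{bmatrix}$$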
where $\mathbf{A}_{11}$ is the pedigree relationship matrix between non-genotyped animals, $\mathbf{A}_{12}$ and $\mathbf{A}_{21}$ are pedigree relationship matrices between genotyped and non-genotyped animals, $\mathbf{A}_{22}$ is the pedigree relationship matrix between genotyped animals, and $\mathbf{G}$ is the genomic relationship matrix between genotyped animals. The G matrix was computed as $\mathbf{G} = \mathbf{W}\mathbf{W}'/N$, where W is the matrix of the scaled SNP genotypes for all loci and N is the total number of SNPs (30,092). The elements of W were calculated as:

$$w_{ij} = \frac{x_{ij} - 2p_j}{\sqrt{2p_j(1 - p_j)}}$$

where $x_{ij}$ is the SNP genotype (coded 0, 1, or 2) for the ith individual at SNP j and $p_j$ is the frequency of the allele whose homozygous genotype is coded as 2.
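As an illustration, a genomic relationship matrix of the form $\mathbf{G} = \mathbf{W}\mathbf{W}'/N$, with each genotype centred by $2p_j$ and scaled by $\sqrt{2p_j(1 - p_j)}$, can be sketched in a few lines of NumPy. The function name is hypothetical, allele frequencies are assumed to be estimated from the genotyped animals themselves (the text does not state which frequencies were used), and monomorphic SNPs are assumed to have been removed beforehand.

```python
import numpy as np


def genomic_relationship_matrix(x):
    """Compute G = W W' / N from a complete 0/1/2 genotype matrix (animals x SNPs).

    Assumption: allele frequencies p_j are estimated from the same genotypes,
    and no SNP is monomorphic (otherwise the scaling divides by zero).
    """
    x = np.asarray(x, dtype=float)
    n_snps = x.shape[1]
    p = x.mean(axis=0) / 2.0                       # frequency of the allele coded as 2
    w = (x - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    return w @ w.T / n_snps


# Hypothetical toy data: 4 animals x 3 SNPs, each SNP with p = 0.5.
toy_genotypes = np.array([
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [1, 2, 1],
])
g = genomic_relationship_matrix(toy_genotypes)
```

Because the genotypes are centred with frequencies from the same sample, each row of the resulting matrix sums to zero, which is a quick sanity check on the centring step.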
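The inverse of the H matrix can be computed directly as:

$$\mathbf{H}^{-1} = \mathbf{A}^{-1} + \begin{bmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1} \end{bmatrix}$$

which is less computationally demanding and simpler than preparing and subsequently inverting the H matrix. The $\mathbf{H}^{-1}$ matrix was prepared using the Calc_grm computer software, which prepares both $\mathbf{A}^{-1}$ and $\mathbf{G}^{-1}$ internally before computing $\mathbf{H}^{-1}$.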
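The direct computation of $\mathbf{H}^{-1}$, in which $\mathbf{G}^{-1} - \mathbf{A}_{22}^{-1}$ is added to the genotyped block of $\mathbf{A}^{-1}$, can be checked numerically against the explicit block form of H. The sketch below is not the Calc_grm implementation; the function names, the four-animal pedigree and the G values are hypothetical, and animals are assumed to be ordered with the non-genotyped ones first.

```python
import numpy as np


def h_matrix(a, g, n_non_genotyped):
    """Build H explicitly from A and G (non-genotyped animals ordered first)."""
    k = n_non_genotyped
    a11, a12 = a[:k, :k], a[:k, k:]
    a21, a22 = a[k:, :k], a[k:, k:]
    a22_inv = np.linalg.inv(a22)
    h11 = a11 + a12 @ a22_inv @ (g - a22) @ a22_inv @ a21
    h12 = a12 @ a22_inv @ g
    return np.block([[h11, h12], [h12.T, g]])


def h_inverse(a, g, n_non_genotyped):
    """Build H^-1 directly: A^-1 plus (G^-1 - A22^-1) in the genotyped block."""
    k = n_non_genotyped
    h_inv = np.linalg.inv(a)
    h_inv[k:, k:] += np.linalg.inv(g) - np.linalg.inv(a[k:, k:])
    return h_inv


# Hypothetical pedigree: animals 1 and 2 are unrelated parents (not genotyped),
# animals 3 and 4 are their full-sib offspring (genotyped).
a = np.array([
    [1.0, 0.0, 0.5, 0.5],
    [0.0, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 0.5],
    [0.5, 0.5, 0.5, 1.0],
])
# Hypothetical realized relationships between the two genotyped full-sibs.
g = np.array([[1.00, 0.60], [0.60, 1.05]])

h = h_matrix(a, g, n_non_genotyped=2)
h_inv = h_inverse(a, g, n_non_genotyped=2)
```

Only the small genotyped blocks and the sparse $\mathbf{A}^{-1}$ need to be handled in the direct form, which is why it scales better than building and inverting the full H matrix.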
Uniformity can be quantified by squared residuals from a BLUP mixed model equation . The use of genomic information to construct realised relationships between animals, especially for full-sibs, is expected to increase the accuracy of residual estimates due to a greater accuracy of EBV for body weight. Therefore, we investigated the effect of ssGBLUP and traditional BLUP on individual residual estimation. Furthermore, we investigated sire-dam and animal models because residual estimates from a sire-dam model contain not only the unexplained environmental effects but also Mendelian sampling genetic effects. Residual estimates from an animal model do not contain the latter when EBV are estimated with an accuracy of 1. In total, residuals from four models were compared, i.e. the sire-dam or animal model with either A or H.
The animal mixed model was:

$$y_{iklmn} = \mu + \beta \cdot age_k + t_l + yc_m + a_i + c_n + e_{iklmn} \qquad (1)$$

where $y_{iklmn}$ is the observation (body weight) of the ith individual, μ is the overall mean, $age_k$ is the fixed covariate effect of the age of the fish, calculated from the start feeding date until the date of measurement (days), β is the fixed linear regression coefficient on age, $t_l$ is the lth fixed communal tank effect, $yc_m$ is the mth fixed effect of year class of the parents, $a_i$ is the random additive genetic effect, $\mathbf{a} \sim N(\mathbf{0}, \mathbf{A}\sigma_a^2)$, where A is the numerator relationship matrix, or $\mathbf{a} \sim N(\mathbf{0}, \mathbf{H}\sigma_a^2)$, where H is the combined genomic and pedigree relationship matrix, N is the normal distribution, and $\sigma_a^2$ is the additive genetic variance for body weight, $c_n$ is the random common effect for full-sibs, $\mathbf{c} \sim N(\mathbf{0}, \mathbf{I}\sigma_c^2)$, where I is the identity matrix and $\sigma_c^2$ is the common environmental variance of body weight, and $e_{iklmn}$ is the random residual effect, $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)$, where $\sigma_e^2$ is the residual variance of body weight, assumed to be homogeneous. For the sire-dam model, the term $a_i$ in Eq. (1) was replaced by the random sire-dam effect ($u_i$), $\mathbf{u} \sim N(\mathbf{0}, \mathbf{A}\sigma_u^2)$ or $\mathbf{u} \sim N(\mathbf{0}, \mathbf{H}\sigma_u^2)$, where $\sigma_u^2$ is the sire-dam variance. The same A and H matrices were used for the sire-dam and the animal models.
To estimate genetic parameters for body weight and its uniformity, the sire-dam DHGLM was used [33, 34] because it is expected to provide unbiased (co)variance components for uniformity. Body weight records were treated in two different ways. First, observed body weight was standardized to a mean of 0 and variance of 1, which facilitates convergence of the DHGLM. Second, we used either the natural log or the Box–Cox transformation to account for possible scale effects, because variances typically increase with increasing trait means [35, 36]. For the Box–Cox transformation, each observation was computed as $y^{(\lambda)} = (y^{\lambda} - 1)/\lambda$, where λ is the transformation parameter, which was estimated based on Eq. (1) without the random effects by maximum likelihood using the MASS package in R software. The estimate of λ was close to 0 (0.076), indicating that the Box–Cox transformation is very similar to log-transformation, which sets λ equal to 0. Therefore, the Box–Cox transformed body weight was not used further. The standardized body weight and natural logarithm body weight are abbreviated as stdWT and lnWT, respectively.
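The relationship between the Box–Cox and log transformations can be illustrated with a small sketch of the standard Box–Cox formula, $(y^{\lambda} - 1)/\lambda$ for λ ≠ 0 and $\ln y$ for λ = 0. This is a plain-Python illustration, not the MASS routine used in the study; the function name and the example weight are hypothetical, and the estimation of λ by maximum likelihood is not shown.

```python
import math


def box_cox(y, lam):
    """Standard Box-Cox transformation of a single positive observation."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive observations")
    if lam == 0:
        return math.log(y)           # limiting case of the formula as lam -> 0
    return (y ** lam - 1.0) / lam


# Hypothetical body weight in grams.
weight = 3000.0
near_log = box_cox(weight, 1e-6)     # approaches ln(weight) as lam shrinks
exact_log = box_cox(weight, 0.0)
```

As λ approaches 0 the transformed value converges to the natural logarithm, which is the rationale given above for using lnWT instead of the Box–Cox transformed trait.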
To estimate genetic parameters, standardized and transformed body weights were modelled using sire-dam DHGLM in ASReml :
where y is the vector of stdWT or lnWT records for the ith individual; Ψ is the vector of response variables for the residual variance, where , which was linearized using a Taylor series approximation in ASReml , is the squared residual of the ith body weight record, hi is the diagonal element in the hat-matrix of y , and is the estimated residual variance of the ith observation in the previous iteration of ASReml; X and Xv are incidence matrices of the fixed effects described in Eq. (1) for the trait mean and its uniformity, respectively; b (bv) is the solution vector for the corresponding fixed effects; Zs and Zd are incidence matrices for the random sire (s) and dam (d) effects; u (uv) is the vector of additive genetic effects of sire-dam on the weight (uniformity), which was assumed to follow a normal distribution for the A matrix:
and for the H matrix:
where the ¼ accounts for the fact that the sire and dam each explain only a quarter of the additive genetic variance; Q (Qv) is the incidence matrix for the random common effects to full-sibs; c (cv) is the vector of common effects to full-sibs:
The residuals of y (e) and Ψ (ev) were assumed to be independently normally distributed as follows:
where and , and () is a scaled variance that was expected to be 1. The sire-dam DHGLM was fitted iteratively to update Ψ, diag(W) and diag(Wv) until the log-likelihood converged .
In the sire-dam DHGLM, the estimated variance for sires was set equal to the estimated genetic variance for dams and equal to one quarter of the additive genetic variance. Hence, the additive genetic variance for body weight () and its uniformity () were equal to and , respectively. Estimates for and for uniformity of body weight were on the exponential scale (exp) and were converted to an additive scale ( and ) using the extension of the equations of Mulder et al. , as derived by Sae-Lim et al. . The additive genetic variance for uniformity of body weight on the additive scale was equal to . Phenotypic variance () of body weight was equal to , where is the variance component for the effect common to full-sibs and is the residual variance of body weight. Heritability for body weight (h2) was calculated as . Heritability for uniformity of body weight () on the additive scale was calculated as [8, 40]. Similarly, the common environmental effect was calculated as for body weight and as for uniformity of body weight . The genetic coefficient of variation for uniformity of body weight (GCV) was calculated as . Standard errors of and GCV were approximated using the equations presented by Mulder et al. .
Two genetic evaluations, i.e., BLUP with A and ssGBLUP with H, were performed in a 10-fold cross-validation using the genetic parameters estimated based on the sire-dam DHGLM and !BLUP option in ASReml. In total, four models were used in the 10-fold cross-validation, i.e. animal DHGLM with either A or H on stdWT and lnWT.
The 10-fold cross-validation was performed on standardized and transformed body weight data as follows:
Finally, average Pearson, Kendall, and Spearman correlations, MSEP and their standard error (SE) over the 10 folds were calculated. A 95% confidence interval of the difference (d) in the predictive ability from different models with either A or H was constructed using d ± 1.96 × SEd, where the . When 0 was not within the 95% confidence interval, the predictive abilities of two models were considered statistically different (P < 0.05).
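The summary over folds and the significance rule based on d ± 1.96 × SEd can be sketched as follows. The function names and numbers are hypothetical. The SE over folds is assumed here to be the sample standard deviation divided by the square root of the number of folds, and SEd is taken as an input because its formula is not reproduced in this text.

```python
import math


def mean_and_se(values):
    """Mean and standard error (sample SD / sqrt(n)) of per-fold statistics."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


def difference_ci(d, se_d, z=1.96):
    """95% confidence interval of a difference and whether it excludes zero."""
    lower, upper = d - z * se_d, d + z * se_d
    significant = not (lower <= 0.0 <= upper)
    return lower, upper, significant


# Hypothetical example: a difference in predictive ability of 0.10 with SEd = 0.04.
lower, upper, significant = difference_ci(0.10, 0.04)
```

With these hypothetical values the interval runs from about 0.02 to 0.18 and excludes 0, so the two models would be declared different at P < 0.05.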
Individual residuals estimated from using the A (BLUP) and H matrices (ssGBLUP) were plotted against each other to examine their relationship. As expected, the range of residual estimates from the animal models was lower than that from the sire-dam model since residual estimates from the sire-dam model included the entire Mendelian sampling term (Fig. 1).
For the sire-dam model, the use of H instead of A hardly affected the estimated residuals. For genotyped animals, the regression coefficient of the residuals estimated with H on those estimated with A and the Pearson correlation between the two were both equal to 0.999, which was very similar to the corresponding values for non-genotyped animals (0.998). The regression coefficients of the residuals estimated with A on those estimated with H were the same (0.999 for genotyped and 0.998 for non-genotyped animals).
In contrast, the use of H in the animal model affected residual estimates of genotyped animals since their distribution was much more scattered (Fig. 1). The slope of estimated residuals using A on estimated residuals using H was lower than 1 and slightly steeper for genotyped animals (regression coefficient = 0.7025) than for non-genotyped animals (regression coefficient = 0.6798). The Pearson correlations between estimated residuals using H and A were equal to 0.922 for genotyped animals and 0.966 for non-genotyped animals.
When using the sire-dam model, the difference in estimated residuals with H and A was small and ranged from −10.8 to 10.0. When using the animal model, this difference was larger and ranged from −95.3 to 104.5.
For body weight, estimates of additive genetic variances from the sire-dam DHGLM with either A or H were similar (Table 2). Likewise, estimates of h2 were similar with A and H for both traits: 0.266 and 0.296, respectively for stdWT and 0.325 and 0.346, respectively for lnWT.
When using A, the estimate of heritability for uniformity was higher for stdWT (0.036) than for lnWT (0.015), while the use of H did not affect its magnitude. Standard errors of the estimates were, however, high (Table 2).
Although the estimates of heritability for uniformity were low, estimates of GCV were high for uniformity of stdWT (48.0% for A and 52.3% for H), which indicates substantial genetic potential for response to selection. After accounting for scale effects, estimates of GCV for uniformity of lnWT were reduced to 30% (for both A and H), which supports the existence of genetic variation for uniformity beyond the scale effects.
Estimates of $c^2$ for stdWT and lnWT were moderate and similar for A (0.103 to 0.117) and H (0.103 to 0.111), which suggests that part of the phenotypic variation was explained by non-genetic effects that are common to full-sibs. In contrast, the estimates of the common environmental effect for uniformity of stdWT and lnWT were very low and ranged from 0.001 to 0.022 for A and 0.002 to 0.019 for H (Table 2).
The estimate of the genetic correlation between stdWT and its uniformity was close to 1, using either A (0.952) or H (0.951), which shows the high dependency between mean and variance of body weight. However, the estimate of the genetic correlation between lnWT and its uniformity was reduced to −0.093 with A and to 0.024 with H, which suggests that after accounting for the scale effects, the mean and variance became independent.
The use of H instead of A with the animal DHGLM resulted in more variation of the within-family GEBV for stdWT and its uniformity, compared to within-family EBV (Fig. 2; see Additional file 1: Figure S1 for the sire-dam DHGLM).
The average correlation of adjusted phenotypes with predicted breeding values for stdWT and its uniformity was significantly higher with H (stdWT = 0.443; uniformity = 0.217 to 0.317) than with A (stdWT = 0.372; uniformity = 0.128 to 0.192). However, after accounting for scale effects using log-transformation, the average Pearson, Kendall and Spearman correlations of adjusted phenotypes with predicted breeding values for lnWT and their uniformity were only slightly higher with H than with A, and not significantly different from each other (P > 0.05).
The average MSEP for uniformity from the animal DHGLM (0.608 to 0.944) were lower than those from the sire-dam DHGLM (0.973 to 1.112), suggesting that the use of an animal DHGLM increases the accuracy and may reduce bias in predicting breeding values for uniformity (Table 3; Additional file 2: Table S1). However, the average MSEP for uniformity of stdWT and lnWT obtained with H (0.608 to 0.944) were not notably different from those obtained with A (0.625 to 0.936).
The predictive ability of EBV of uniformity was sensitive to the type of correlation used, i.e. Pearson, Kendall and Spearman (Table 3). Spearman correlations were 39.1 to 49.0% higher than Kendall correlations. Predictive abilities of EBV and GEBV for uniformity of lnWT differed more from each other based on Kendall and Spearman correlations, albeit not significant at P < 0.05, than based on Pearson correlations. However, the SE of Kendall correlations were approximately 50% lower than the SE of Pearson and Spearman correlations, suggesting that Kendall correlations provide a more reliable estimate of predictive ability than Pearson and Spearman correlations.
To the best of our knowledge, this is the first study that compares the use of the numerator relationship (A) and a combined genomics and numerator relationship (H) matrix for estimating genetic parameters and predicting breeding values for body weight and its uniformity. The use of the animal DHGLM with H significantly improved the predictive ability of GEBV for uniformity of body weight (stdWT) but not for scale-adjusted uniformity.
The estimate of heritability for uniformity of stdWT from the sire-dam DHGLM with A was low (0.036) but higher than estimates obtained in previous studies on rainbow trout [4, 8] and Nile tilapia [15, 16] (0.016; min = 0.010; max = 0.024). However, after accounting for scale effects by log-transformation, the estimate decreased to 0.014 to 0.015, which is in line with previous reports that also used transformations [4, 8, 14–16].
Estimates of heritability for uniformity of stdWT and lnWT using the sire-dam DHGLM with H did not differ from those with A, which is in line with corresponding estimates for uniformity of piglet birth weight obtained using either A or only the genomic relationship matrix (G), while lower estimates were reported for environmental variance of somatic cell score in dairy cattle when using G compared to A. The similarity of the estimates obtained by using A or H in this study can be explained by the very similar estimated residuals (proxy of uniformity) between non-genotyped and genotyped animals when using the sire-dam model with A and H. The sire-dam model only exploits relationships between sires and dams and does not exploit the full potential of the genotype-based relationships between animals, especially between full-sibs. In contrast, residuals of genotyped animals estimated by using the animal model differed more between A and H, and were likely more accurate than estimates of residuals for non-genotyped animals. However, in a DHGLM analysis, the sire-dam model provides less biased (co)variance components than the animal model, likely because of the dependence between estimates of the breeding value and residual of an individual, which are obtained from the same phenotype of body weight. The use of genomic relationships combined with numerator relationships is expected to reduce the dependency between EBV and estimated residuals because the EBV are more accurate. Therefore, we fitted the animal DHGLM with H, but the model did not converge when the variance components were estimated, which may be due to (1) the dependency between EBV and estimated residuals for body weight remaining high, or (2) the difficulty of disentangling genetic effects from the common environmental effects for uniformity of body weight.
The standard errors of estimates were high, which may be due to the large variation in family size (4 to 54). According to Hill and Mulder , large family sizes or repeated measurements are recommended for estimating the genetic heteroscedasticity of traits. The optimal full-sib family size is 39 with a GCV of 39% and an h2 of 0.36 . The use of H did not affect the standard errors of the estimates, which does not agree with previous studies, for example Veerkamp et al.  reported lower standard errors of h2 estimates for dry matter intake, milk yield, and body weight of heifers when using genomic relationships with an animal model. One possible explanation could be that the benefit of using the genomic relationship matrix may be limited when the sire-dam DHGLM is applied, since the variance of genomic relationships between sires (0.02) was very similar to the variance of numerator relationships between sires (0.01). In contrast, the variance in genomic relationships between animals (0.01) was much larger than the variance of numerator relationships between animals (0.004). Hence, the SE of estimates may be lower when using an animal DHGLM with a genomic relationship matrix, compared to the numerator relationship matrix. Nevertheless, in our study, it was not possible to investigate this phenomenon since the log-likelihood did not converge for the animal DHGLM with H.
The GCV of uniformity for stdWT was substantial (48.0%), which indicates high potential for response to selection. This result is in the upper range of previous findings in fish species (17.4 to 64.0%) [4, 8, 14–16, 19] and in terrestrial animals (10.0 to 58.0%) [6, 9–13, 42]. After accounting for scale effects by logarithm transformations, the GCV for uniformity was reduced but still substantial (29.5 to 29.9%), which was also reported in previous studies on Atlantic salmon , rainbow trout , rabbit, and pig . Thus, scale effects affect estimates of genetic parameters for uniformity of body weight considerably, but there is genetic variation for uniformity beyond the scale effect.
Genomic information slightly increased the GCV of uniformity of stdWT (from 48.0 to 52.3%). In contrast, genomic information did not influence the GCV for uniformity of lnWT (29.8%). Since estimates of genetic parameters obtained with A and H were similar, the GCV for body weight remained similar, which is in agreement with the previous comparison between A (GCV = 11.0 to 12.0%) and G (GCV = 10.0 to 11.0%) for uniformity of birth weight of piglets using a dam model .
In this study, we used the Pearson correlation of EBV and GEBV with adjusted phenotype as the measure of predictive ability. The use of H instead of A in the animal DHGLM significantly improved the ability to predict breeding values for stdWT (19%) and its uniformity (41.1 to 78.1%). Furthermore, the use of the animal DHGLM instead of the sire-dam DHGLM significantly increased the predictive ability of EBV and GEBV for uniformity (see Additional file 2: Table S1), as expected.
Our findings indicate that ssGBLUP with an animal DHGLM can increase the accuracy of EBV for uniformity substantially compared to pedigree-based BLUP. However, after accounting for scaling effects by using log transformations, the use of H compared to A only slightly improved the correlation (1.6 to 13.9%) and MSEP between GEBV and adjusted phenotypes, and the improvement was not significant. There are two main reasons why these results differed between uniformity of stdWT and lnWT. First, log-transformation substantially reduced the genetic correlation of body weight with its uniformity. As a result, any increase in predictive ability of GEBV of lnWT when using H instead of A (15.7%) did not positively influence the predictive ability of GEBV of its uniformity. Second, the lower additive genetic variance and heritability for uniformity after accounting for scale effects reduce the accuracy of EBV for uniformity. Consequently, MSEP increased from 0.63 (stdWT) to 0.94 (lnWT) with A and from 0.61 (stdWT) to 0.94 (lnWT) with H in the animal DHGLM.
The accuracy of genomic selection is expected to increase when the number of genotyped animals in the reference population increases for any trait, but in particular for lowly heritable traits, such as uniformity as shown by Sell-Kubiak et al. and somatic cell score by Mulder et al. In this study, uniformity of lnWT had an even lower heritability than uniformity of stdWT. The number of genotyped animals in the reference population used for cross-validation was on average equal to 1274, which may have limited the benefit of using genomic information in ssGBLUP. A future empirical study should investigate the effect of the number of genotyped animals in the reference population on the ability to predict breeding values for uniformity to validate our findings and conclusions.
Squared residuals, i.e. the adjusted phenotypes for uniformity, are exponentially rather than normally distributed, which may not justify quantifying predictive ability using a Pearson correlation. Thus, we also calculated distribution-free rank correlations (Kendall and Spearman) and indeed found estimation of the predictive ability for uniformity to be sensitive to the type of correlation used.
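To make the difference between the three correlation types concrete, the following standard-library sketch computes Pearson, Spearman and Kendall correlations for a small hypothetical data set. It is not the software used in the study; the function names and the example values are hypothetical, and the Kendall statistic shown is tau-a, which makes no correction for ties.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical predicted breeding values and adjusted phenotypes.
predicted = [1, 2, 3, 4, 5]
observed = [1, 3, 2, 5, 4]
rho = spearman(predicted, observed)
tau = kendall_tau_a(predicted, observed)
```

For this hypothetical example the Spearman correlation (0.8) is larger than the Kendall correlation (0.6), which illustrates why predictive abilities based on the two rank statistics are not directly comparable in magnitude.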
Although not significantly different, Kendall and Spearman correlations explained differences in predictive ability of EBV and GEBV for uniformity of lnWT slightly better than the Pearson correlation. However, the conclusion that the benefit of using genomic relationships for computing EBV for uniformity of log-transformed body weight is limited remained the same when using Kendall and Spearman correlations. Colwell and Gillett showed that, in general, estimates of Kendall correlations are similar to estimates of Spearman correlations, but in some cases, the magnitude of Spearman correlations can be 50% greater than the magnitude of Kendall correlations. This is in line with our findings since Spearman correlations were 42.2 to 49.0% and 39.1 to 46.1% greater than Kendall correlations for the sire-dam DHGLM (see Additional file 2: Table S1) and the animal DHGLM, respectively. Nevertheless, the SE of Kendall correlations were approximately 50% lower than the SE of Pearson and Spearman correlations, which indicates that the Kendall correlation may be a more reliable estimate of predictive ability than the Pearson and Spearman correlations. Hence, it is recommended to use Kendall instead of Pearson correlations when studying predictive ability for uniformity.
For fish breeding, major goals are to increase mean body weight and reduce variability (more uniformity) of body weight. Nevertheless, definitions of uniformity of stdWT and lnWT are not the same. From a biological point of view, genetic variation for environmental canalization can be quantified after the scale effect is accounted for. However, from an animal breeding point of view, uniformity on the observed scale explains the actual range of fish sizes that are processed by aquaculture industries.
Selection for body weight and uniformity may be challenging because the genetic correlation between body weight and its variability is high and positive, and sometimes approaches 1. A general observation is that the genetic correlation between log-transformed body weight and its variability is zero or even negative, allowing selection to simultaneously increase transformed body weight and reduce variability. Therefore, log-transformed body weight and its variability could be included in a selection index. This would require knowledge of the genetic correlation between variability of stdWT and variability of lnWT, which is not available. Hence, we used the sire-EBV and sire-GEBV, obtained from BLUP and ssGBLUP with the animal DHGLM, to calculate the Pearson correlations between EBV for the trait and its variability. Pearson correlations between EBV for variability of stdWT and EBV for variability of lnWT were moderately positive with BLUP (0.48) and ssGBLUP (0.68). The Pearson correlation between EBV for stdWT and EBV for its variability was close to 1 with either BLUP (0.96) or ssGBLUP (0.97), while the Pearson correlation between EBV for lnWT and EBV for its variability was −0.24 for BLUP and −0.005 for ssGBLUP. Not surprisingly, the Pearson correlation between EBV for lnWT and EBV for stdWT was highly positive (0.82). These Pearson correlations suggest that variability of lnWT should be included in the selection index because the GEBV for variability of lnWT are positively correlated with GEBV for variability of stdWT, which indicates that selection against variability of lnWT will indirectly reduce variability on the observed scale. Furthermore, GEBV for variability of lnWT are not correlated with GEBV for lnWT, and thus selection against variability of lnWT will not indirectly reduce lnWT.
To elucidate responses to selection for body weight and its variability, we performed truncation selection on sires based on their GEBV from the animal DHGLM with ssGBLUP. The breeding goal could include body weight and variability on the observed scale and their economic values (v): $v_1\,\mathrm{BV}_{\mathrm{stdWT}} - v_2\,\mathrm{BV}_{\mathrm{variability\,stdWT}}$, while, based on the Pearson correlations discussed above, the selection index (I) could include lnWT and its variability with their relative weighting factors (b): $I = b_1\,\mathrm{GEBV}_{\mathrm{lnWT}} - b_2\,\mathrm{GEBV}_{\mathrm{variability\,lnWT}}$.
Selecting the best 10% of sires on an index with b1 of 0.3, i.e., $I = 0.3\,\mathrm{GEBV}_{\mathrm{lnWT}} - 0.7\,\mathrm{GEBV}_{\mathrm{variability\,lnWT}}$, provides almost no genetic gain in variability of stdWT (−0.001) but a positive genetic gain in stdWT (3.62% of mean body weight in g). In contrast, a selection index based on the breeding goal traits, $I = 0.52\,\mathrm{GEBV}_{\mathrm{stdWT}} - 0.48\,\mathrm{GEBV}_{\mathrm{variability\,stdWT}}$, provides zero genetic gain for both stdWT and its variability, which shows that an index on the observed-scale traits cannot achieve genetic gain in body weight while keeping phenotypic variability stable. Nevertheless, the genetic gain in stdWT was much greater (17.32% of mean body weight in g) when variability was not included in the selection index. Therefore, although it is possible to increase body weight while keeping variability constant by selecting on lnWT and its variability, there is a trade-off in genetic gain for body weight when selecting for reduced variability.
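The index and truncation selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the GEBV are randomly generated placeholders rather than values from this study, the response is approximated as the mean GEBV of the selected sires minus the mean of all candidates, and the function names (`selection_index`, `truncation_select`, `response`) are hypothetical.

```python
import random


def selection_index(gebv_trait, gebv_variability, b1, b2):
    """I = b1 * GEBV(trait) - b2 * GEBV(variability), one value per sire."""
    return [b1 * t - b2 * v for t, v in zip(gebv_trait, gebv_variability)]


def truncation_select(index_values, proportion):
    """Positions of the top `proportion` of candidates, best first."""
    n_selected = max(1, round(len(index_values) * proportion))
    order = sorted(range(len(index_values)),
                   key=lambda i: index_values[i], reverse=True)
    return order[:n_selected]


def mean(values):
    return sum(values) / len(values)


def response(gebv, selected):
    """Mean GEBV of the selected sires minus the mean of all candidates.

    Assumption: used here as a simple proxy for genetic gain.
    """
    return mean([gebv[i] for i in selected]) - mean(gebv)


if __name__ == "__main__":
    random.seed(7)  # arbitrary seed, for reproducibility only
    n_sires = 100
    # Placeholder GEBV, drawn independently; not data from the study.
    gebv_lnwt = [random.gauss(0.0, 1.0) for _ in range(n_sires)]
    gebv_var_lnwt = [random.gauss(0.0, 1.0) for _ in range(n_sires)]

    index = selection_index(gebv_lnwt, gebv_var_lnwt, b1=0.3, b2=0.7)
    top = truncation_select(index, proportion=0.10)

    print("gain in lnWT:", round(response(gebv_lnwt, top), 3))
    print("gain in variability of lnWT:",
          round(response(gebv_var_lnwt, top), 3))
```

Changing `b1` and `b2` shows the trade-off directly: more weight on the variability term lowers the gain in the trait itself.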
The use of the animal DHGLM instead of the sire-dam DHGLM substantially increased the predictive ability for breeding values of uniformity, because the animal DHGLM fully exploits the relationships between full- and half-sibs. With the animal DHGLM, the use of a combined numerator and genomic relationship matrix significantly increased the predictive ability for breeding values of uniformity of body weight, but only a slight and non-significant increase was observed after accounting for scale effects by using transformed body weights. The small increase in predictive ability with transformed body weights may be due to lower heritability for uniformity of transformed body weight, a lower genetic correlation between transformed body weights and their uniformities, and/or a small number of genotyped animals in the reference population. The use of a Kendall correlation provided the lowest SE of predictive ability for uniformity and a more accurate estimate of predictive ability for uniformity than Pearson and Spearman correlations. In conclusion, the use of ssGBLUP increases the accuracy of breeding values for uniformity of harvest weight, which is expected to increase response to selection in uniformity.
PSL analyzed the data. HAM, AK, and ML provided theoretical support for genomic DHGLM and cross-validation. HAM, AK, and ML contributed to the discussion of the results. PSL drafted the manuscript. HAM, AK, and ML improved the manuscript. All authors read and approved the final manuscript.
This study is part of the research project entitled STABLEFISH, funded by the Norwegian Research Council (NRC: 234144/E49). We would like to thank SalmoBreed for providing data for this study. Matthew Baranski genotyped the animals and generated the genotype data file for this study. Arthur Gilmour is acknowledged for help in implementing genomic DHGLM in ASReml v4. PSL would like to thank Mario Calus for his guidance on the Cal_grm computer software, and Solomon Antwi Boison and Sergio Vela Avitúa for a fruitful discussion during the drafting of this manuscript.
The authors declare that they have no competing interests.
Electronic supplementary material
The online version of this article (doi:10.1186/s12711-017-0308-3) contains supplementary material, which is available to authorized users.
Panya Sae-Lim.
Antti Kause.
Marie Lillehammer.
Han A. Mulder.