Genetic imprinting, by which the expression of a gene depends on the parental origin of its alleles, may be subjected to reprogramming through each generation. Currently, such reprogramming is limited to qualitative description only, lacking more precise quantitative estimation for its extent, pattern and mechanism. Here, we present a computational framework for analyzing the magnitude of genetic imprinting and its transgenerational inheritance mode. This quantitative model is based on the breeding scheme of reciprocal backcrosses between reciprocal F1 hybrids and original inbred parents, in which the transmission of genetic imprinting across generations can be tracked. We define a series of quantitative genetic parameters that describe the extent and transmission mode of genetic imprinting and further estimate and test these parameters within a genetic mapping framework using a new powerful computational algorithm. The model and algorithm described will enable geneticists to identify and map imprinted quantitative trait loci and dictate a comprehensive atlas of developmental and epigenetic mechanisms related to genetic imprinting. We illustrate the new discovery of the role of genetic imprinting in regulating hyperoxic acute lung injury survival time using a mouse reciprocal backcross design.
Genomic imprinting is a genetic phenomenon by which certain genes are expressed or repressed depending on the parent from which the gene was inherited [1–3]. These so-called imprinted genes violate classical Mendelian inheritance: they are expressed either only from the allele inherited from the mother, such as H19 or CDKN1C [4, 5], or only from the allele inherited from the father, such as IGF-2. From a quantitative genetic perspective, genomic imprinting may provide the organisms that possess it with evolutionary advantages by contributing additional genetic variation and conferring a fitness benefit in changing environments [7, 8]. Different forms of genomic imprinting have now been detected in a variety of species and are thought to play an important role in regulating crucial aspects of embryonic growth and development as well as pathogenesis [2, 9]. Recent bioinformatic analyses suggest that the number of imprinted genes may be higher than previously thought, but this remains to be demonstrated experimentally [3, 10].
Although imprinting is a ubiquitous phenomenon, only some properties of its mechanism have been established; these include the modification of DNA and chromosomes in the form of DNA methylation and possibly heritable chromatin structures. Recently, a body of molecular evidence has shown that epigenetic information experiences widespread erasure and reprogramming across generations [11, 12], leading to transgenerational change in genetic imprinting. Despite increasing interest in this area, many important issues remain to be resolved, including the nature of the primary imprints that are inherited from the parental gametes, the genes that control the imprinting process, and the pattern of genetic imprinting that is transmitted into and shapes the epigenome of an individual’s progeny.
A strategy based on genetic mapping has been shown to be powerful for mapping imprinted quantitative trait loci (iQTL) that control complex traits [13–16]. Compared with studies of molecular regulation related to epigenetic variation, the advantage of genetic mapping lies in its quantification of the phenotypic effects of imprinted genes by identifying their number, genome-wide distribution and inheritance mode. Several new iQTLs have been identified for the mediation of body growth in mice, endosperm development in maize and canine hip dysplasia. More recently, Wang et al. proposed an innovative reciprocal F2 design for studying the effects of imprinting loci and their interactions with other genetic effects (including additive, dominant and epistatic effects). A computational model was derived for characterizing how genetic imprinting effects are transmitted from generation to generation through these previously unknown genetic interactions. Their model not only confirms the existence of iQTLs for hyperoxic acute lung injury (HALI) survival time in mice, but also provides new insights into the genetic control mechanisms of this trait.
In this article, we present a reciprocal backcross design in which the pattern of genetic imprinting can be estimated and mapped. Unlike Wang et al.’s F2 design, which focuses on testing interactions between genetic imprinting and other genetic components, the new design quantifies how genetic imprinting that forms in a parental generation affects the performance of a complex trait in subsequent generations. The new design will also have power to study how genetic imprinting affects the epigenomes of parents and their progeny. To this end, we define a series of quantitative genetic parameters that describe the extent and transmission mode of genetic imprinting. A powerful computational algorithm is derived to estimate and test these parameters within the framework of genetic mapping. By analyzing a mouse data set with eight reciprocal backcrosses, the new approach identified the same iQTL for HALI survival time as detected by Wang et al.’s F2 design, but additionally revealed the role of genetic imprinting in this trait through transgenerational transmission. We have performed simulation studies to investigate the statistical properties of the approach, validating its use in building a comprehensive atlas of developmental and epigenetic mechanisms related to genetic imprinting.
Consider two contrasting inbred lines, each of which can serve as a maternal and a paternal parent. Two F1 families are produced from reciprocal crosses. Because parent-of-origin effects are assumed, the two families may differ in phenotypic traits. Progeny of different sexes from each F1 family are reciprocally backcrossed with each original parental line, leading to eight possible backcross families. Using a quantitative trait locus (QTL) with two alleles A and a, we illustrate such a backcross breeding scheme involving the original parents, reciprocal F1 families and reciprocal backcross families (Figure 1).
In each backcross population, the same panel of molecular markers is genotyped and the same trait of interest is phenotyped. An integrative linkage map that covers the genome can be constructed by a linkage analysis with these markers in all these backcross families. This map is used to identify imprinted quantitative trait loci (iQTLs) that control the trait. Next, we describe a new model that has power to study how the effect of an iQTL is transmitted from the parental generation to the next generation.
Two inbred lines with genotypes AA and aa at a QTL, respectively, produce two reciprocal F1 families, Aa and aA, where the first allele is inherited from the maternal parent and the second allele is inherited from the paternal parent. In contrast to traditional Mendelian genetics, we consider Aa and aA as two different genotypes. By reciprocally backcrossing the F1 families with the original inbred lines, eight backcrosses are produced. Figure 1 shows a visual representation of this breeding strategy. Each backcross has two segregating genotypes, giving 16 genotypes in total if the different parents of origin of the alleles are considered (Table 1).
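The count of eight backcrosses and 16 parent-of-origin-specific genotypes can be checked by simple enumeration. The following is a minimal sketch, assuming the convention that the maternal allele is written first; the function and variable names are illustrative and are not taken from the original work.

```python
from collections import Counter
from itertools import product

INBRED = ("AA", "aa")   # the two original inbred lines
F1 = ("Aa", "aA")       # reciprocal F1 families, maternal allele written first


def offspring(mother, father):
    """Ordered offspring genotypes: maternal allele first, paternal allele second."""
    return sorted({m + p for m in mother for p in father})


def backcrosses():
    """Enumerate (mother, father, offspring genotypes) for all reciprocal backcrosses."""
    crosses = []
    for inbred, f1 in product(INBRED, F1):
        crosses.append((inbred, f1, offspring(inbred, f1)))  # inbred line as mother
        crosses.append((f1, inbred, offspring(f1, inbred)))  # F1 as mother
    return crosses


crosses = backcrosses()
group_sizes = Counter(g for _, _, genotypes in crosses for g in genotypes)

for mother, father, genotypes in crosses:
    print(f"{mother} (mother) x {father} (father) -> {genotypes}")
print(dict(group_sizes))
```

Each of the four ordered genotypes (AA, Aa, aA, aa) appears in four different backcrosses, which is why each group needs its own set of transmission parameters.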
Let μkj denote the genotypic value of a parent-of-origin-specific genotype, where k=1,…,8 is the backcross identity and j=1, 2 is the QTL genotype within each backcross. Table 1 lists different genetic components of any μkj. Below, we explain the genetic meaning of each component.
All 16 backcross genotypes are sorted into four groups AA, Aa, aA and aa. Without considering the influence of genetic imprinting produced in the F1 family, these four groups should contain the additive (a), dominant (d) and imprinting effects (i, due to the difference between Aa and aA). If the parental imprinting is considered, we will need to define additional parameters to describe μkj. Let us first consider genotype AA. There are four types of backcrosses which produce AA, which are shown as follows:
where a is the additive effect, which is positive for genotype AA; i1 is the genetic imprinting effect that is produced in the reciprocal F1 crosses of the two inbred lines and transmitted to genotype AA through the paternal parent of the backcross; i2 is the genetic imprinting effect that is produced in the reciprocal F1 crosses of the two inbred lines and transmitted to genotype AA through the maternal parent of the backcross; and I1 is the genetic imprinting effect, transmitted to genotype AA, that arises from the difference between using the imprinted F1 progeny as the maternal and as the paternal parent of the backcross.
Similarly, the genotypic value of genotype Aa can be partitioned into the following components:
where d is the dominant effect due to the allelic interaction between A and a; i is the genetic imprinting effect, which is positive for genotype Aa; i3 is the genetic imprinting effect that is produced in the reciprocal F1 crosses of the two inbred lines and transmitted to genotype Aa through the paternal parent of the backcross; i4 is the genetic imprinting effect that is produced in the reciprocal F1 crosses of the two inbred lines and transmitted to genotype Aa through the maternal parent of the backcross; and I2 is the genetic imprinting effect, transmitted to genotype Aa, that arises from the difference between using the imprinted F1 progeny as the maternal and as the paternal parent of the backcross.
The genotypic value of genotype aA can be partitioned into the following components:
where d is the dominant effect due to the allelic interaction between A and a; i is the genetic imprinting effect, which is negative for genotype aA; i5 is the genetic imprinting effect that is produced in the reciprocal F1 crosses of the two inbred lines and transmitted to genotype aA through the paternal parent of the backcross; i6 is the genetic imprinting effect that is produced in the reciprocal F1 crosses of the two inbred lines and transmitted to genotype aA through the maternal parent of the backcross; and I3 is the genetic imprinting effect, transmitted to genotype aA, that arises from the difference between using the imprinted F1 progeny as the maternal and as the paternal parent of the backcross.
The genotypic value of genotype aa can be partitioned into the following components:
where a is the additive effect, which is negative for genotype aa; i7 is the genetic imprinting effect that is produced in the reciprocal F1 crosses of the two inbred lines and transmitted to genotype aa through the paternal parent of the backcross; i8 is the genetic imprinting effect that is produced in the reciprocal F1 crosses of the two inbred lines and transmitted to genotype aa through the maternal parent of the backcross; and I4 is the genetic imprinting effect, transmitted to genotype aa, that arises from the difference between using the imprinted F1 progeny as the maternal and as the paternal parent of the backcross.
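To make the sign conventions concrete, the sketch below solves only the baseline part of the decomposition, i.e. the case without the transgenerational terms i1–i8 and I1–I4. It assumes the group values μ+a for AA, μ+d+i for Aa, μ+d−i for aA and μ−a for aa, which follow from the signs stated in the text; the full coefficients of Table 1 are not reproduced here, and the function name is hypothetical.

```python
def baseline_effects(mu_AA, mu_Aa, mu_aA, mu_aa):
    """Solve mu, a, d, i from four group means, ignoring transgenerational terms.

    Assumed baseline model (an illustration, not the complete Table 1):
        mu_AA = mu + a
        mu_Aa = mu + d + i
        mu_aA = mu + d - i
        mu_aa = mu - a
    """
    mu = (mu_AA + mu_aa) / 2.0          # overall mean (midpoint of homozygotes)
    a = (mu_AA - mu_aa) / 2.0           # additive effect
    d = (mu_Aa + mu_aA) / 2.0 - mu      # dominant effect
    i = (mu_Aa - mu_aA) / 2.0           # imprinting effect (Aa versus aA)
    return {"mu": mu, "a": a, "d": d, "i": i}


# Illustrative numbers only, not estimates from the mouse data.
effects = baseline_effects(mu_AA=5.0, mu_Aa=4.5, mu_aA=3.5, mu_aa=3.0)
print(effects)
```

A nonzero i means the two reciprocal heterozygotes differ, which is the classical signature of imprinting; the additional parameters defined above describe how that difference carries over into the backcross generation.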
The genetic effect parameters for iQTLs described above can be estimated using a genetic mapping model. The Expectation-Maximization (EM) algorithm was implemented to estimate these parameters. Furthermore, each of these effects can be tested individually or jointly, depending on the purpose of the mapping study. Note that the signs of the estimates of each parameter will explain the direction of expression of the alleles inherited from two original inbred lines.
Consider the same pair of markers, typed for the eight backcrosses, between which a QTL is assumed to be located. For any progeny i within a backcross k, we derive the conditional probability of a QTL genotype j, (expressed as ), conditional upon the genotype for the two markers which this subject carries .
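The conditional probability of a QTL genotype given two flanking markers can be sketched as follows. This is a minimal illustration of standard backcross interval mapping, assuming no crossover interference; genotypes are coded 0 or 1 according to which allele the F1 parent transmitted, and r1 and r2 (hypothetical names) are the recombination fractions between the left marker and the QTL and between the QTL and the right marker.

```python
def qtl_conditional_probs(m1, m2, r1, r2):
    """Return [P(Q=0 | M1=m1, M2=m2), P(Q=1 | M1=m1, M2=m2)] in a backcross.

    m1, m2 : marker genotypes coded 0/1 (allele transmitted by the F1 parent)
    r1     : recombination fraction between the left marker and the QTL
    r2     : recombination fraction between the QTL and the right marker
    Assumes independent crossovers in the two intervals (no interference).
    """
    joint = []
    for q in (0, 1):
        left = r1 if q != m1 else 1.0 - r1    # recombination M1-QTL or not
        right = r2 if q != m2 else 1.0 - r2   # recombination QTL-M2 or not
        joint.append(0.5 * left * right)      # 0.5 = marginal P(M1 = m1)
    total = sum(joint)                        # equals P(M1 = m1, M2 = m2)
    return [p / total for p in joint]


# Non-recombinant flanking markers: the QTL almost surely matches them.
print(qtl_conditional_probs(1, 1, 0.1, 0.1))
# Recombinant flanking markers with a QTL in the middle: both genotypes equally likely.
print(qtl_conditional_probs(1, 0, 0.1, 0.1))
```

Because the probabilities are obtained by normalizing the joint probabilities over the two QTL genotypes, they always sum to one for any marker configuration.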
The joint likelihood of phenotypic values, obtained by combining these eight backcrosses, is written as
where nk is the size of backcross k and fkj(yki) is the normal density of the trait with mean μkj and variance σk².
The EM algorithm is implemented to estimate the genotypic means and variances for each backcross. In the E step, the posterior probability of QTL genotype j carried by progeny i within backcross k is calculated using
In the M step, genotypic values and variances are calculated by the following log-likelihood equations:
The E and M steps are iterated until the estimates converge to stable values. The stable estimates are the maximum likelihood estimates (MLEs) of the parameters.
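The E and M steps can be sketched for a single backcross as a two-component normal mixture in which the mixing proportions are the marker-based conditional probabilities. This is a minimal illustration under stated assumptions, not the authors' implementation: the two genotypes share one variance within the backcross, the prior probabilities lie strictly between 0 and 1, and the function names are hypothetical.

```python
import math


def normal_pdf(y, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(y - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_backcross(y, prior1, tol=1e-8, max_iter=1000):
    """Estimate two genotypic means and a common variance within one backcross.

    y      : phenotypic values of the progeny
    prior1 : P(QTL genotype 1 | flanking markers) for each progeny
    Returns (mu1, mu2, var, loglik), where loglik is from the last E step.
    """
    n = len(y)
    # Starting values: prior-weighted means and the overall sample variance.
    s1 = sum(prior1)
    mu1 = sum(w * v for w, v in zip(prior1, y)) / s1
    mu2 = sum((1.0 - w) * v for w, v in zip(prior1, y)) / (n - s1)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    old_ll = -math.inf
    ll = old_ll
    for _ in range(max_iter):
        # E step: posterior probability of genotype 1 for each progeny.
        post1, ll = [], 0.0
        for w, v in zip(prior1, y):
            p1 = w * normal_pdf(v, mu1, var)
            p2 = (1.0 - w) * normal_pdf(v, mu2, var)
            post1.append(p1 / (p1 + p2))
            ll += math.log(p1 + p2)
        # M step: posterior-weighted means and pooled variance.
        s1 = sum(post1)
        mu1 = sum(p * v for p, v in zip(post1, y)) / s1
        mu2 = sum((1.0 - p) * v for p, v in zip(post1, y)) / (n - s1)
        var = sum(p * (v - mu1) ** 2 + (1.0 - p) * (v - mu2) ** 2
                  for p, v in zip(post1, y)) / n
        if ll - old_ll < tol:   # stop when the log-likelihood stabilizes
            break
        old_ll = ll
    return mu1, mu2, var, ll
```

Since the means and variances are specific to each backcross, the joint likelihood factorizes over backcrosses and the same routine could be run on each of the eight families in turn.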
The MLEs of various genetic components can be estimated by solving a group of regular equations in Table 1. They are expressed as
The existence of a QTL can be tested by formulating the hypotheses
The log-likelihood ratio test statistic is calculated from the likelihoods under H0 and H1. The critical threshold for claiming the existence of a significant QTL is determined from permutation tests. To overcome the heavy computational burden of permutation tests, Chang et al. proposed a score statistic to determine the critical threshold. Each genetic component contributes to the genotypic values of the backcross genotypes. Their individual contributions can be tested by formulating null hypotheses in which each component is set to zero separately.
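A permutation threshold can be sketched as follows: phenotypes are shuffled against the fixed marker data, the genome-wide maximum statistic is recomputed for each shuffle, and the upper quantile of these maxima serves as the critical value. This is a generic illustration, assuming shuffling within a single family; the toy statistic `max_mean_difference` merely stands in for the genome-wide maximum log-likelihood ratio, and all names are hypothetical.

```python
import math
import random


def max_mean_difference(phenotypes, markers):
    """Toy scan statistic: largest absolute difference in group means over markers."""
    best = 0.0
    for genotypes in markers:
        g0 = [y for y, g in zip(phenotypes, genotypes) if g == 0]
        g1 = [y for y, g in zip(phenotypes, genotypes) if g == 1]
        if g0 and g1:
            best = max(best, abs(sum(g1) / len(g1) - sum(g0) / len(g0)))
    return best


def permutation_threshold(phenotypes, max_stat, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the maximum statistic under permutation.

    max_stat : callable taking a phenotype list and returning the maximum
               test statistic over all scanned positions
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)               # break the phenotype-genotype link
        null_stats.append(max_stat(shuffled))
    null_stats.sort()
    index = max(0, min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1))
    return null_stats[index]
```

For the eight-backcross design one would presumably shuffle phenotypes within each backcross family so that family-level differences are preserved; that choice is an assumption here, not something stated in the text.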
It is interesting to test how different backcross genotypes are affected by imprinting effects that are produced in the previous generation. For example, the genotype AA contains imprinting effects i1, i2 and I1. The influences of these imprinting effects on genotypic value of AA can be tested by formulating the null hypothesis,
Similarly, the influences of imprinting effects on genotypic values of other genotypes Aa, aA and aa can be tested using
Except for the overall test (9) of the existence of a QTL, in which the QTL position under H0 is not identifiable, all the hypotheses have an H0 nested within an H1, so that the log-likelihood ratios for these hypotheses can be taken to be asymptotically χ²-distributed, with degrees of freedom equal to the difference in the number of parameters between H0 and H1.
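For the nested hypotheses, a p-value follows directly from the log-likelihood ratio and the χ² distribution. The sketch below uses closed-form expressions for the χ² upper tail with integer degrees of freedom so that it needs only the standard library; the function names and the example log-likelihoods are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series of correction terms.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total


def lrt_pvalue(loglik_h0, loglik_h1, df):
    """P-value of the log-likelihood ratio test for H0 nested within H1."""
    lr = 2.0 * (loglik_h1 - loglik_h0)
    return chi2_sf(lr, df)


# Example: testing i1 = i2 = I1 = 0 removes three parameters, so df = 3.
print(lrt_pvalue(loglik_h0=-1502.4, loglik_h1=-1495.1, df=3))
```

This shortcut does not apply to the overall test for the existence of a QTL, whose threshold has to come from permutation or a related resampling approach.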
Prows et al. detected a pronounced difference in HALI survival time between two reciprocal F1 families derived from inbred lines C57BL/6J and 129X1/SvJ, suggesting a significant imprinting effect formed during the F1 cross. Two mouse strains that contrast in HALI survival time, the sensitive C57BL/6J (B) (i.e. dying early) and the resistant 129X1/SvJ (S), were reciprocally crossed to generate two types of F1 families (BS and SB). Then, using the strategy shown in Figure 1, eight reciprocal backcrosses were generated. The 935 backcross mice comprised 154 for B(BS), 105 for B(SB), 97 for (BS)B, 100 for (SB)B, 122 for S(BS), 94 for S(SB), 106 for (BS)S and 157 for (SB)S. All 935 backcross mice were typed for 78 polymorphic microsatellite markers. An integrated linkage map for the eight backcrosses was constructed using these markers, distributed at 20–25 cM intervals across the 19 autosomes. The phenotype used for QTL mapping, HALI survival time, was log transformed, since the transformed data are closer to a normal distribution. For a description of breeding schemes, DNA analysis, map construction and phenotypic measurement see .
By analyzing the data of reciprocal mouse backcrosses, three significant QTLs were identified for this trait located near marker Mit303 on chromosome 1, between markers Mit17 and Mit145 on chromosome 4, and between markers Mit251 and Mit5 on chromosome 15 (Supplementary Figure S1). The three QTLs were confirmed in F2 reciprocal crosses by Prows et al.  who named these QTLs Shali1, Shali2, and Shali3, respectively. Using a traditional interval mapping approach for analyzing current reciprocal backcross data, the third QTL was detected in the backcrosses inbred with 129X1/SvJ and the F1 parent, but not detected in the backcrosses inbred with C57BL/6J and the F1 parent. Shali3 (chromosome 15) was also identified by our model that incorporates the imprinting inheritance of a QTL.
The model allows the dissection of the phenotypic value for each detected QTL into different genotypic components; results are tabulated in Table 2. Hypothesis tests for each of these components were performed to determine their significance. At QTL Shali1, we did not see significant additive (a) and dominant (d) effects expressed in the backcross, but a significant imprinting effect (i, P=4.44×10⁻⁵) was identified. This imprinting effect is due to the stronger expression of the allele inherited from the C57BL/6J line over the allele from the 129X1/SvJ line. The imprinting effect formed during the F1 cross is transmitted into backcross genotypes in different ways. It can be transmitted into genotypes AA and aa through the paternal F1 parent (i1, P=1.80×10⁻⁷; i7, P=8.34×10⁻⁸) but not through the maternal F1 parent (i2, P=0.80; i8, P=0.026). The use of the imprinted F1 as a maternal or paternal parent for the backcross does not have a significant effect on HALI survival time for these two genotypes (I1, P=0.817; I4, P=0.017). A different pattern was observed for the influences of the F1 imprinting effect on backcross genotypes Aa and aA (Table 2).
At QTL Shali2, a significant dominant effect was found, but neither the additive nor the imprinting effect was significant (Table 2). At QTL Shali3, the additive effect is more significant than the dominant effect, but there is no imprinting effect. Although no significant imprinting effect was detected for these two QTLs, the imprinting effects expressed in the F1 cross can be transmitted to their different backcross genotypes through various patterns of transmission. For example, through both maternal and paternal F1 parents, this imprinting effect is expressed in backcross genotype Aa; yet in genotype aa no imprinting effect was transmitted. It appears that the transmission of the F1 imprinting effect through the maternal parent was important for backcross genotypes AA and aA. Specific patterns of transmission can be identified for different genotypes at the QTL on chromosome 15.
Figure 2 demonstrates the differences in the same backcross genotype at QTL Shali1 when it was generated through different breeding schemes. If there were no imprinting effect, the same genotype should show no difference among these schemes. For example, genotype AA would be expected to have a genotypic value μ+a, but owing to the occurrence of i1, i2 and I1, it differed significantly among breeding schemes (Figure 2A). The cumulative effect of these three parameters was tested using hypothesis (10) and was found to be significant (P<10⁻⁴). There were different genotypic values for backcross genotype Aa under different breeding schemes because the contributions of i3, i4 and I2 are significant (Figure 2A). Similarly, we observed different genotypic values for both backcross genotypes aA (Figure 2A) and aa (Figure 2A) because their underlying components contribute significantly to the overall genotypic values. At QTL Shali2 (Figure 2B) and QTL Shali3 (Figure 2C), the value of each genotype was also found to differ, depending on the breeding scheme from which it is derived.
Within the same backcross, there are two different genotypes at each QTL. A traditional approach can only detect the difference of the two QTL genotypes due to an additive or dominant effect. However, our model can discern these differences by examining the occurrence of imprinting effects and their transmission patterns. For example, within backcross B(BS), two genotypes BB and BS are different purely due to the additive and dominant effects according to classic genetics, but their difference also includes imprinting effects formed in the F1 cross and expressed in each of these two genotypes, i.e. i1, i3, I1 and I2. Figure 3 illustrates these genotypic differences for each backcross.
By mimicking the reciprocal backcross scheme used in the above mouse example, we performed simulation studies to examine the statistical behavior of the new model. We simulated a new set of phenotypic data using the genetic effects estimated from QTL and residual errors scaled for a particular heritability level. Different sample sizes (100 and 400 for each backcross) and heritabilities (0.1 and 0.4) are assumed. Table 3 lists the results of the MLEs of different parameters and their standard errors. In general, all genetic parameters can be reasonably estimated when a sample size is 100 and heritability is 0.1. The excellent precision of the estimation of the imprinting effect derived from the difference of maternal and paternal F1 parent can be obtained by increasing the sample size for each backcross to 400. Of course, increased heritability by minimizing the noise of phenotypic measurement can always enhance our estimation even with a modest sample size (i.e. 100).
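Scaling the residual error for a given heritability can be sketched as below. This is a minimal illustration, assuming that heritability is defined as the ratio of genetic variance to total phenotypic variance within a backcross whose two QTL genotypes are equally frequent; the source does not state the exact definition used, and the function names and genotypic means are hypothetical.

```python
import random


def residual_variance(mu1, mu2, h2):
    """Residual variance that yields heritability h2 for two equally frequent genotypes.

    Genetic variance of a 1:1 mixture of means mu1 and mu2 is (mu1 - mu2)^2 / 4,
    and h2 = genetic / (genetic + residual).
    """
    genetic_var = (mu1 - mu2) ** 2 / 4.0
    return genetic_var * (1.0 - h2) / h2


def simulate_backcross(n, mu1, mu2, h2, seed=0):
    """Simulate (genotype, phenotype) pairs for one backcross of size n."""
    rng = random.Random(seed)
    sd = residual_variance(mu1, mu2, h2) ** 0.5
    data = []
    for _ in range(n):
        genotype = 1 if rng.random() < 0.5 else 2     # 1:1 segregation
        mean = mu1 if genotype == 1 else mu2
        data.append((genotype, rng.gauss(mean, sd)))
    return data


# The four scenarios considered: n = 100 or 400 per backcross, h2 = 0.1 or 0.4.
for n in (100, 400):
    for h2 in (0.1, 0.4):
        sample = simulate_backcross(n, mu1=1.0, mu2=-1.0, h2=h2, seed=42)
        print(n, h2, round(residual_variance(1.0, -1.0, h2), 3), len(sample))
```

With the same genotypic means, moving from a heritability of 0.4 to 0.1 multiplies the residual variance by six, which illustrates why larger samples are needed at low heritability.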
Given the estimated effect values for QTL , we ran an additional simulation to assess the power of our model by assuming different heritabilities and sample sizes (Table 4). Our model has good power to detect most of these genetic effects. For effects of small magnitude, larger sample sizes and/or heritabilities are required. In all cases, we found that our model has acceptable false positive rates, usually within 0.04–0.06 (Table 4). Thus, there is only a small probability that our model detects a significant genetic effect when none exists. This simulation provides general guidance on sample size determination when designing new cross experiments, with the aim of obtaining convincing results.
As an epigenetic form of gene regulation, genomic imprinting influences the expression of marked or imprinted genes during gametogenesis and embryonic development in a parent-of-origin-specific manner. Imprinting functions through methylation and histone modifications, which are established in the germline and maintained throughout all somatic cells of an organism. Emerging evidence suggests that genetic imprinting may not be transferred between generations because epigenetic remodeling and reprogramming erase it, ensuring the totipotency of the zygote and preventing the perpetuation of abnormal epigenetic states [11, 26–29]. To address whether transgenerational epigenetic reprogramming occurs, Wang et al. proposed a new model for mapping imprinted quantitative trait loci (iQTLs) that shows the transgenerational inheritance of imprinting effects using a reciprocal F2 design. By analyzing a published data set of mice, this model identified several iQTLs for survival time due to HALI. The main result was that, while most genetic imprinting effects established in the previous generation might be erased, some were still transmitted to the next generations through their interactions with other genetic effects.
The model proposed for a reciprocal backcross design in this article builds upon Wang et al.’s work to make two key contributions. First, it provides successful inferences about parent-of-origin effects at iQTLs for HALI survival time from an extensive data analysis of mouse reciprocal backcrosses . The new model confirms the QTLs (i.e. and ) discovered by traditional mapping approaches, validating the genetic usefulness of our model. Second, we here put forward a series of new genetic parameters that define the transgenerational inheritance of iQTLs, greatly facilitating our understanding of the genetic mechanisms of imprinting effects. By linking these definitions with a mapping study, the new model allows the genome-wide scan and discovery of iQTLs, their number, the type and magnitude of their effects, their genetic interactions and genotype–environment interactions. Although many iQTLs have been identified for different traits in plants, animals and humans [3, 10, 17–19], this model will, for the first time, make it possible to study the interplay between iQTLs and transgenerational epigenetic inheritance. The newly defined parameters for genomic imprinting for HALI survival time will help understand the genetic control mechanisms of this trait in terms of the underlying inheritance, transmission and interactions.
Acute lung injury and adult respiratory distress syndrome, associated with 38.5% mortality or nearly 75000 deaths/year in the United States, can be fatal in any population, especially among older people. The genetic control of survival time due to this disease has been studied using the mouse as an animal model system. Prows et al. detected the two major QTLs ( and ) for HALI survival time in a reciprocal F2 population derived from sensitive C57BL/6J and resistant 129X1/SvJ inbred mouse strains, and these two QTLs were confirmed by a subsequent mapping study using the reciprocal backcross design produced from the same inbred lines. All these QTLs were further confirmed by Wang et al.’s imprinting model and the model presented here. The new imprinting models provide an explanation of the genetic underpinnings of imprinting inheritance at these QTLs. For example, any identical genotype should be expressed equally under the same condition, but in our study it was found to have different values due to the impact of transgenerational imprinting effects (Figure 2). In Wang et al. and here, we used HALI survival time as an example to assess the usefulness of these models. It can be anticipated that the models can be used to study other quantitative traits. In other species such as maize, similar cross schemes are used, so the models should find immediate application in general genetic studies.
Maternal effects may confound the estimation of genomic imprinting. By incorporating Cui’s model into our four-way reciprocal crosses, it is possible to estimate the confounding maternal effect and remove it from the estimated imprinting effects. In addition, several mechanisms have evolved to erase epigenetic marks, including germline and somatic reprogramming of DNA methylation and chromatin proteins. However, our previous and current studies, which used different designs on the same data set, found that at some iQTLs the epigenetic marks are not cleared across generations. Other examples of this include genomic imprinting in mammals, mating type switching in yeast and paramutation in plants. The resistance of these imprinted loci to reprogramming may be regarded as part of normal development, but it is unlikely to be independent of environmental triggers. Our models can be used to address the fundamental questions of how far epigenetic marks resist transgenerational reprogramming and whether marks established in response to environmental cues are also resistant. Further, when the processes of DNA methylation and chromatin proteins are integrated, our model will enable geneticists to predict which types of epigenetic marks will be erased and which will not.
Joint grant DMS/NIGMS-0540745; the Changjiang Scholars Award; ‘One-thousand Person Plan’ Award at Beijing Forestry University.
Chenguang Wang obtained his PhD in Statistics at the University of Florida in 2010. He is a biostatistician in the Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration. His interest is in developing statistical models for genetic studies.
Zhong Wang obtained his PhD in Engineering Mechanics at Dalian University of Technology in 2000. He founded and managed a software company in Japan from 2000 to 2008. He is a post-doctoral researcher in the Center for Statistical Genetics at the Pennsylvania State University. He writes computer software for statistical genetic models.
Daniel R. Prows obtained his PhD in Pharmaceutical Sciences at the University of Cincinnati in 1995. He is Associate Professor of Genetics at the University of Cincinnati College of Medicine. His interest is the genetic mapping of complex traits in mice. His research program has identified several important QTLs for hyperoxic acute lung injury survival time using backcross and F2 populations.
Rongling Wu obtained his PhD in Quantitative Genetics at the University of Washington in 1995. He is Professor of Statistics and Genetics and the Director of the Center for Statistical Genetics at the Pennsylvania State University. Dr Wu is also Changjiang Scholars Professor of Genetics and the Director of the Center for Computational Biology at Beijing Forestry University. His interest is to unravel the genetic roots for the outcome of a biological trait by dissecting the trait into its biochemical and developmental pathways. He uses multi-disciplinary tools, integrating genetics, molecular biology, statistics, mathematics, computer sciences and engineering, to solve genetic problems.