HHS Public Access. Author manuscript; accepted for publication in a peer-reviewed journal.
Med Image Comput Comput Assist Interv. Author manuscript; available in PMC 2010 July 9.
Published in final edited form as:
Med Image Comput Comput Assist Interv. 2009; 12(Pt 2): 232–239.
PMCID: PMC2900780

Mapping Growth Patterns and Genetic Influences on Early Brain Development in Twins*


Despite substantial progress in understanding the anatomical and functional development of the human brain, little is known about the spatiotemporal patterns of, and genetic influences on, white matter maturation in twins. Neuroimaging data acquired from longitudinal twin studies provide a unique platform for investigating such issues. However, the interpretation of these data is hindered by a lack of appropriate image processing and statistical tools. In this study, we developed a statistical framework for analyzing longitudinal twin neuroimaging data, which consists of second-order generalized estimating equations (GEE2) and a test procedure. The GEE2 method can jointly model imaging measures with genetic effects, environmental effects, and behavioral and clinical variables. A score test statistic is used to test linear hypotheses, such as the association of brain structure and function with the covariates of interest. A resampling method is used to control the family-wise error rate when adjusting for multiple comparisons. Using diffusion tensor imaging (DTI), we demonstrate the application of our statistical methods in quantifying spatiotemporal white matter maturation patterns and in detecting genetic effects in a longitudinal neonatal twin study. The proposed approach can be easily applied to longitudinal twin data with multiple outcomes and can accommodate incomplete and unbalanced data, i.e., subjects with different numbers of measurements.

Keywords: twin DTI study, pediatric imaging, GEE

1 Introduction

Longitudinal neuroimaging studies have grown rapidly as a means of better understanding the progression of neuropsychiatric and neurodegenerative disorders and normal brain development; typical large-scale longitudinal studies include ADNI (Alzheimer's Disease Neuroimaging Initiative) and the NIH MRI study of normal brain development [1]. Compared with cross-sectional neuroimaging studies, longitudinal neuroimaging follow-up may allow characterization of the correlation between individual change in neuroimaging measurements (e.g., volumetric and morphometric) and the covariates of interest (such as age, diagnostic status, gene, and gender). A longitudinal design may also allow one to examine the causal role of a time-dependent covariate (e.g., exposure) in a disease process. A distinctive feature of longitudinal neuroimaging data is the temporal order of the imaging measures (see further discussion in [2, 3]). In particular, imaging measurements of the same individual usually exhibit positive correlation, and the strength of the correlation may decrease as the time separation increases.

Twin neuroimaging studies are invaluable for disentangling the effects of genes and environments on brain functions and structures. The twin design typically compares the similarity of monozygotic twins (MZ, who develop from a single fertilized egg and therefore share 100% of their genes) with that of dizygotic twins (DZ, who develop from two fertilized eggs and therefore share on average 50% of their alleles). These known differences in genetic similarity, together with the assumption of equal environments for MZ and DZ twins, allow us to explore the effects of genetic and environmental variance on a phenotype, such as brain structure. Current neuroimaging twin studies have focused on locating the brain regions subject to either environmental or genetic factors. For instance, high heritability was found for intracranial volume, global gray and white matter volume [4], and cerebral hemisphere volume [5]. Cortical thickness in the sensorimotor cortex, middle frontal cortex, and anterior temporal cortex was found to be under the influence of genetic factors [6]. High heritabilities were also found in paralimbic structures and temporal/parietal neocortical regions [7].

Longitudinal twin neuroimaging studies, which combine the longitudinal design and the twin design, provide a unique platform for examining the effects of genes and environment on the development of brain functions and structures. To properly analyze longitudinal twin imaging measures, any image processing and statistical tools must account for three key features: the temporal correlation among the repeated measures, the different genetic and environmental effects among MZ and DZ twins, and the spatial correlation within each twin pair. Failure to account for these three features can result in misleading scientific inferences [2]. However, advanced image processing and statistical tools designed for complex and correlated image data, along with behavioral and clinical information, remain lacking. Cross-sectional image processing and statistical tools may be useful for longitudinal twin imaging data, but they are not statistically optimal in power. To the best of our knowledge, most existing neuroimaging software platforms, including SPM, AFNI, and FSL, do not have any valid methods to process and analyze neuroimaging data from longitudinal twin studies.

We propose two statistical methods for the analysis of neuroimaging data from longitudinal twin studies. We develop second-order generalized estimating equations (GEE2) for jointly modeling univariate (or multivariate) imaging measures with covariates of interest in longitudinal twin studies (including genetic and environmental factors, behavioral and clinical variables). Compared with the structural equation modeling (SEM) for twin neuroimaging data, GEE2 avoids the assumption that latent genetic and environmental variables follow a Gaussian distribution. We develop a score test statistic to test linear hypotheses such as the associations between brain structure and function and covariates of interest. In order to adjust for multiple comparisons, a resampling method is used to control the family-wise error rate. We demonstrate the utility of the proposed approach in analyzing diffusion tensor imaging (DTI) data to quantify spatiotemporal patterns and detect genetic influences on early postnatal white matter development.

2 Methods

2.1 Image acquisition and preprocessing

Our study was approved by the institutional review board. A total of 30 pairs of same-sex twins were recruited with parental consent. These subjects were followed longitudinally, with scans close to birth and at 1 year and 2 years after birth. Allowing for missing data, a total of 142 datasets were obtained. All subjects were fed and calmed to sleep on a warm blanket inside the scanner while wearing proper ear protection. All images were acquired using a 3T Allegra head-only MR system with 6 encoding gradient directions and an isotropic voxel size of 2 mm³. Two DTI parametric maps, fractional anisotropy (FA) and mean diffusivity (MD), were computed with the diffusion tensor toolbox in FSL. To construct a voxel-based atlas, the FA images from all subjects were co-registered to a template FA image of a two-year-old (not a subject in this study) with HAMMER [8], a widely used elastic registration method that relies on neighborhood intensity distribution and edge information for image alignment instead of image intensity alone.

2.2 Generalized Estimating Equations

We observe imaging, behavioral, and clinical data from $n$ twin pairs at $m_i$ time points $t_{ij}$ for $i = 1, \ldots, n$ and $j = 1, \ldots, m_i$ in a longitudinal study. Let $x_{ij} = (x_{ij,1}, \ldots, x_{ij,q})^T$ be a $q \times 1$ covariate vector, which may contain age, gender, height, gene, and others. Note that the number of time points $m_i$ for the $i$-th twin pair may differ across pairs. There are a total of $N = \sum_{i=1}^{n} m_i$ sets of images in this study. Based on the observed image data, we compute neuroimaging measures, denoted by $Y_i = \{y_{ij}(d) : d \in D,\ j = 1, \ldots, m_i\}$, across all $m_i$ time points from the $i$-th twin pair, where $d$ represents a voxel (or a region of interest) in $D$, a specific brain area. For simplicity, we assume that the imaging measure $y_{ij}(d) = (y_{ij,1}(d), y_{ij,2}(d))^T$ at voxel $d$ is a $2 \times 1$ vector consisting of the same measure from the two subjects within each twin pair.

We apply the second-order GEE method for jointly modeling univariate (or multivariate) imaging measures with covariates of interest in longitudinal twin studies (such as behavioral, clinical variables or genetic and environmental effects). The GEE2 explicitly introduces two sets of estimating equations for regression estimates on original data and covariance parameters, respectively. For notational simplicity, d is dropped from our notation temporarily.

To study the growth trajectories for imaging measures in healthy neonatal/pediatric subjects, we assume that the model for yij,k at the j-th time point for the i-th twin is


for $i = 1, \ldots, n$ and $j = 1, \ldots, m_i$, where $x_{ij,1}$ is usually set to 1, $x_{ij,k}$ ($k \ge 2$) can be chosen as time, gender, gene, and others, and $\beta$ is a $q \times 1$ vector.

For all measurements from the $i$-th twin pair, we can form the $2m_i \times 1$ vectors $Y_i = (y_{i1,1}, y_{i1,2}, \ldots, y_{im_i,1}, y_{im_i,2})^T$ and $U_i(\beta) = (u_{i1,1}, u_{i1,2}, \ldots, u_{im_i,1}, u_{im_i,2})^T$. To estimate the regression coefficients in $\beta = (\beta_{\cdot,1}, \beta_{\cdot,2})^T$, we construct a set of estimating equations given by

$$\sum_{i=1}^{n} D_i^T V_i^{-1} \{Y_i - U_i(\beta)\} = 0, \qquad (2)$$

where $D_i = \partial U_i(\beta)/\partial \beta$ and $V_i$ is a working covariance matrix, such as one with an autoregressive structure. To study the genetic and environmental effects on imaging measures, we assume that


where $\varepsilon_{ij,k}$ is random error; $a_{0:i,k}$, $d_{0:i,k}$, and $c_{0:i,k}$ are, respectively, the additive genetic, dominance genetic, and environmental residual random effects associated with the intercept (the so-called ADE model in twin studies); and $a_{s:i,k}$, $d_{s:i,k}$, and $c_{s:i,k}$ are, respectively, the additive genetic, dominance genetic, and environmental residual random effects associated with time. We assume that $\varepsilon_{ij,k}$, $a_{0:i,k}$, $d_{0:i,k}$, $c_{0:i,k}$, $a_{s:i,k}$, $d_{s:i,k}$, and $c_{s:i,k}$ are independently distributed with zero mean and variances $\sigma_e^2$, $\sigma_{0,a}^2$, $\sigma_{0,d}^2$, $\sigma_{0,c}^2$, $\sigma_{s,a}^2$, $\sigma_{s,d}^2$, and $\sigma_{s,c}^2$, respectively. According to the ADE model, we assume that $\mathrm{cov}(a_{0:i,1}, a_{0:i,2}) = \sigma_{0,a}^2/2$, $\mathrm{cov}(d_{0:i,1}, d_{0:i,2}) = \sigma_{0,d}^2/4$, $\mathrm{cov}(a_{s:i,1}, a_{s:i,2}) = \sigma_{s,a}^2/2$, and $\mathrm{cov}(d_{s:i,1}, d_{s:i,2}) = \sigma_{s,d}^2/4$ for DZ twins, and that $\mathrm{cov}(a_{0:i,1}, a_{0:i,2}) = \sigma_{0,a}^2$, $\mathrm{cov}(d_{0:i,1}, d_{0:i,2}) = \sigma_{0,d}^2$, $\mathrm{cov}(a_{s:i,1}, a_{s:i,2}) = \sigma_{s,a}^2$, and $\mathrm{cov}(d_{s:i,1}, d_{s:i,2}) = \sigma_{s,d}^2$ for MZ twins. For model identifiability, we may drop either the dominance genetic effect or the environmental effect from the model.
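The within-pair covariances listed above can be illustrated with a small Python sketch. It covers only the additive and dominance genetic intercept effects; the environmental term is left out because its within-pair covariance is not stated in this passage. The function name `genetic_intercept_cov` and its arguments are hypothetical and introduced only for illustration.

```python
import numpy as np

def genetic_intercept_cov(var_a, var_d, zygosity):
    """2x2 covariance of (a + d) for the two members of one twin pair.

    var_a, var_d: additive and dominance genetic variances (intercept part).
    zygosity: "MZ" or "DZ".
    Assumption: only the genetic intercept effects are modeled here.
    """
    if zygosity == "MZ":
        z_a, z_d = 1.0, 1.0    # MZ co-twins share the full genetic variance
    elif zygosity == "DZ":
        z_a, z_d = 0.5, 0.25   # DZ co-twins share 1/2 (additive) and 1/4 (dominance)
    else:
        raise ValueError("zygosity must be 'MZ' or 'DZ'")
    total = var_a + var_d                  # variance of each twin's genetic effect
    shared = z_a * var_a + z_d * var_d     # covariance between the two co-twins
    return np.array([[total, shared], [shared, total]])
```

For example, with `var_a = 4.0` and `var_d = 2.0`, the off-diagonal entry is 6.0 for an MZ pair and 2.5 for a DZ pair, which is the contrast that makes the genetic variances estimable.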

Based on these assumptions, we calculate the covariance between $\tilde{y}_{ij,k} = y_{ij,k} - u_{ij,k}$ and $\tilde{y}_{ij',k'} = y_{ij',k'} - u_{ij',k'}$ for any $j, j'$ and $k, k'$. Specifically, $E(\tilde{y}_{ij,k}\tilde{y}_{ij',k'})$ can be expressed as


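in which $(z_{i,1}, z_{i,2})$ takes the value $(1, 1)$ for either $k = k'$ or MZ twins and $(0.5, 0.25)$ for DZ twins. From all products between $\tilde{y}_{ij,k}$ and $\tilde{y}_{ij',k'}$, we can form an $m_i(2m_i+1) \times 1$ vector $S_i = (\tilde{y}_{i1,1}^2, \tilde{y}_{i1,1}\tilde{y}_{i1,2}, \ldots, \tilde{y}_{im_i,2}^2)^T$, with expectation $S_i(\sigma) = (\sigma_{i,(1,1),(1,1)}, \ldots, \sigma_{i,(m_i,m_i),(2,2)})^T$.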
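As a concrete illustration of how the product vector is assembled, the Python sketch below stacks every distinct pairwise product of the $2m_i$ centered measures of one twin pair, which yields $2m_i(2m_i+1)/2 = m_i(2m_i+1)$ entries. The function name `product_vector` and the row-by-row ordering of the products are assumptions made for this example.

```python
import numpy as np

def product_vector(resid):
    """Stack all distinct pairwise products of the centered measures of one twin pair.

    resid: length-2m array ordered (t1 twin1, t1 twin2, ..., tm twin1, tm twin2).
    Returns a vector of length m * (2m + 1).
    """
    resid = np.asarray(resid, dtype=float)
    rows, cols = np.triu_indices(resid.size)   # upper triangle, diagonal included
    return resid[rows] * resid[cols]

# Two time points (m = 2) give 2 * (2*2 + 1) = 10 products.
s_i = product_vector([1.0, 2.0, 3.0, 4.0])
```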

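To estimate the variance components in $\sigma$, we construct a set of estimating equations given by

$$\sum_{i=1}^{n} \tilde{D}_i^T V_{S,i}^{-1} \{S_i - S_i(\sigma)\} = 0, \qquad (5)$$

where $\tilde{D}_i = \partial S_i(\sigma)/\partial \sigma$ and $V_{S,i}$ is a working covariance matrix.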

The GEE2 approach has several attractive properties. First, the model proposed above is very flexible and free of distributional assumptions. Second, the GEE2 estimator is consistent even if we misspecify the covariance structures $V_i$ and $V_{S,i}$. Third, our inferences using the empirical standard errors are robust even if our knowledge of the covariance structure is imperfect. Fourth, our GEE2 method avoids modeling the higher-order moments of imaging measures. Finally, it is computationally straightforward to compute the GEE2 estimators $\hat{\beta}$ and $\hat{\sigma}$ by iterating between Eq. (2) and Eq. (5).
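To make the robustness claims more concrete, the following Python sketch solves only the first set of estimating equations (the one for $\beta$) and computes the empirical ("sandwich") covariance. It is a simplified illustration under stated assumptions: a linear mean $U_i(\beta) = X_i\beta$, an AR(1) working correlation with a fixed `rho`, and no second set of equations for $\sigma$. The names `ar1` and `gee_linear` are hypothetical.

```python
import numpy as np

def ar1(m, rho):
    """AR(1) working correlation matrix of size m x m."""
    idx = np.arange(m)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def gee_linear(X_list, y_list, rho=0.5):
    """Solve sum_i X_i' V_i^{-1} (y_i - X_i beta) = 0; return (beta, robust_cov).

    X_list[i]: (m_i, q) design matrix of cluster i; y_list[i]: length-m_i response.
    Clusters may have different sizes (unbalanced data).
    """
    q = X_list[0].shape[1]
    W_list = [np.linalg.inv(ar1(len(y), rho)) for y in y_list]
    A = np.zeros((q, q))
    b = np.zeros(q)
    for X, y, W in zip(X_list, y_list, W_list):
        A += X.T @ W @ X
        b += X.T @ W @ y
    beta = np.linalg.solve(A, b)          # closed form because the mean is linear
    B = np.zeros((q, q))
    for X, y, W in zip(X_list, y_list, W_list):
        score = X.T @ W @ (y - X @ beta)  # cluster-level score contribution
        B += np.outer(score, score)
    A_inv = np.linalg.inv(A)
    return beta, A_inv @ B @ A_inv        # sandwich covariance estimate
```

Because the sandwich term is built from the observed residuals rather than from the working matrix, the resulting standard errors remain usable when the chosen `rho` is wrong, which is the sense of the third property listed above.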

2.3 Hypothesis and Test Statistics

In longitudinal twin studies, one is interested in answering various scientific questions involving the assessment of brain development across time and the testing of genetic influences on brain structure and function. Questions concerning brain development can often be formulated as testing linear hypotheses about $\beta$ as follows:

$$H_0: R\beta = b_0 \quad \text{vs.} \quad H_1: R\beta \neq b_0, \qquad (6)$$

where $R$ is an $r \times 2q$ matrix of full row rank and $b_0$ is an $r \times 1$ specified vector. Questions concerning genetic effects on the brain are usually formulated as testing


where $R_s$ is a $k \times 7$ matrix of full row rank. For instance, if we are interested in testing the genetic effect $a_{0:i,k}$, then we choose $R_s\sigma = \sigma_{0,a}^2$. To test the hypotheses in Eqs. (6) and (7), we use score test statistics with appropriate asymptotic null distributions [9]. A wild bootstrap method was used to control for multiple comparisons. The proposed test procedure is computationally much more efficient than the permutation method.
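The idea of controlling the family-wise error rate by resampling can be sketched with a max-statistic wild bootstrap. The Python example below is a simplified stand-in, not the score test procedure of [9]: it assumes each subject contributes one mean-zero score per voxel under the null hypothesis, uses Rademacher (±1) weights as one common choice of wild bootstrap multiplier, and takes the upper quantile of the maximum absolute statistic over voxels as the threshold. The names `voxel_statistic` and `wild_bootstrap_threshold` are hypothetical.

```python
import numpy as np

def voxel_statistic(scores):
    """Standardized sum of per-subject score contributions at each voxel."""
    n = scores.shape[0]
    return scores.sum(axis=0) / (np.sqrt(n) * scores.std(axis=0, ddof=1))

def wild_bootstrap_threshold(scores, n_boot=1000, alpha=0.05, seed=0):
    """Threshold on max |statistic| over voxels that targets FWER <= alpha.

    scores: (n_subjects, n_voxels) array, mean zero under the null hypothesis.
    """
    rng = np.random.default_rng(seed)
    n = scores.shape[0]
    max_stats = np.empty(n_boot)
    for b in range(n_boot):
        # One random sign per subject, shared by all voxels, so that the
        # spatial correlation between voxels is preserved in each resample.
        signs = rng.choice([-1.0, 1.0], size=(n, 1))
        max_stats[b] = np.abs(voxel_statistic(scores * signs)).max()
    return np.quantile(max_stats, 1.0 - alpha)
```

Voxels whose observed absolute statistic exceeds the returned threshold would be declared significant. Each resample only re-weights quantities that have already been computed, which is why this kind of procedure tends to be cheaper than refitting the model under permutations.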

3 Results

3.1 Growth Patterns

In the longitudinal analysis of the DTI images using GEE2 (Eq. (2) for growth pattern quantification), the covariates of interest, including intercept, age, age × age, zygote (0 for MZ and 1 for DZ), and zygote × time, were tested for significance (Eq. (8)).


Significant contributions were found only for β1, β2, and β3. Thus, nonlinear patterns of change were observed in the early postnatal stages for FA and MD, but no zygote-related significance was detected. Square ROIs of a fixed size (2×2 pixels) were drawn in axial view bilaterally at the posterior limb of the internal capsule and the external capsule, and at the centers of the genu and splenium. The growth patterns of FA and MD in these regions are shown in Fig. 1 for both MZ and DZ twins. A slight difference exists between the growth curves of MZ and DZ twins. Among these brain regions, the external capsule and the internal capsule have the lowest FA and MD values, respectively, in this period (Fig. 1).
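The nonlinear trend captured by the intercept, age, and age-squared terms can be illustrated with an ordinary least-squares fit to ROI values. This Python sketch corresponds to a GEE fit with an independence working covariance and a linear mean, and it omits the zygote terms; the function name `fit_quadratic_growth` and the numbers in the usage lines are synthetic placeholders, not values from this study.

```python
import numpy as np

def fit_quadratic_growth(age, y):
    """Least-squares fit of y = b1 + b2*age + b3*age**2; returns (b1, b2, b3)."""
    age = np.asarray(age, dtype=float)
    X = np.column_stack([np.ones_like(age), age, age ** 2])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef

# Synthetic example: scans near birth and at 1 and 2 years (ages in years).
age = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
y = 0.2 + 0.15 * age - 0.04 * age ** 2   # made-up curve, for illustration only
coef = fit_quadratic_growth(age, y)
```

A positive linear coefficient combined with a negative quadratic coefficient describes an increase that slows with age, which is the shape of a nonlinear increase such as that reported for FA.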

Fig. 1
Temporal growth patterns for FA (nonlinear increase, left panel) and MD (nonlinear decrease, right panel) in both MZ (top panel) and DZ (bottom panel) twins in external capsule (EC), posterior limb of internal capsule (IC), genu (GE) and splenium (SP). ...

3.2 Genetic Influence

For model identifiability, we use the AE model to estimate genetic influences on brain development. Because the two twins in each pair share a similar nurturing environment, the squared difference $\mathrm{sqd} = [(y_{ij',1} - u_{ij',1}) - (y_{ij',2} - u_{ij',2})]^2$ between the DTI images from the same twin pair should be free of the environmental effect. In this situation, Eq. (4) can be shortened as in Eq. (9). In our current implementation, statistical testing was performed with Eq. (10).


In Eq. (10), the two zygote-related terms can be tested separately for the significance of static and dynamic genetic influences on early brain development. For the zygote term in Eq. (10), significant regions were found in the left parietal white matter with FA, and in the basal ganglia and right frontal white matter with MD. Thus, these regions demonstrate static genetic influence (Fig. 2). Furthermore, for the zygote × age² term, brain regions with significant genetic influence on growth were identified with MD in the frontal, occipital, and parietal white matter, which demonstrates dynamic genetic influence (Fig. 3).
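Why a zygote term in a model for squared differences reflects genetic variance can be checked with a small simulation. Under an AE model with random intercepts only, the within-pair difference has variance $2\sigma_e^2$ for MZ pairs and $\sigma_a^2 + 2\sigma_e^2$ for DZ pairs, so the DZ-minus-MZ gap in the mean squared difference equals $\sigma_a^2$. The Python sketch below illustrates this under simplifying assumptions: Gaussian effects (used only to generate data), no time-varying effects, and the hypothetical name `simulate_squared_differences`.

```python
import numpy as np

def simulate_squared_differences(n_pairs, var_a, var_e, shared_fraction, rng):
    """Squared within-pair differences under an AE model with random intercepts.

    shared_fraction: share of the additive variance common to both co-twins
                     (1.0 for MZ, 0.5 for DZ).
    """
    common = rng.normal(0.0, np.sqrt(shared_fraction * var_a), n_pairs)
    unique = rng.normal(0.0, np.sqrt((1.0 - shared_fraction) * var_a), (n_pairs, 2))
    noise = rng.normal(0.0, np.sqrt(var_e), (n_pairs, 2))
    y = common[:, None] + unique + noise   # centered measures of the two co-twins
    return (y[:, 0] - y[:, 1]) ** 2

rng = np.random.default_rng(1)
sqd_mz = simulate_squared_differences(200_000, var_a=1.0, var_e=0.5, shared_fraction=1.0, rng=rng)
sqd_dz = simulate_squared_differences(200_000, var_a=1.0, var_e=0.5, shared_fraction=0.5, rng=rng)
zygote_effect = sqd_dz.mean() - sqd_mz.mean()   # approximates var_a
```

With zygote coded 0 for MZ and 1 for DZ, this gap is what a regression of the squared difference on zygote would estimate, so a nonzero zygote coefficient points to a nonzero additive genetic variance.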

Fig. 2
Regions under significant static genetic influence on growth in FA (left panel) and MD (right panel).
Fig. 3
Regions under significant dynamic genetic influence on growth in MD.

4 Discussion

In this study, we have demonstrated the potential of GEE2-based statistical methods for analyzing twin images in a longitudinal study. This work may be the first to identify the growth patterns of DTI parameters in a longitudinal twin study. Our preliminary results demonstrate that genetic influences on brain development can be identified with squared difference images under the assumption of equal environmental exposure. Furthermore, our results suggest the existence of a dynamic component of genetic influence on brain development in this early postnatal stage.

Several improvements can be made to the current approach. One is to use the two sets of GEE equations (Eqs. (2) and (5)) iteratively for joint estimation of growth patterns and genetic influences. Another is to use multivariate analysis to improve the sensitivity in detecting genetic influences. Finally, on the image registration side, the statistical analysis will benefit from improved registration of the DTI images across different ages.


*This study was supported in part by grants R01MH070890, R01NS055754, R21AG033387, UL1-RR025747-01, 1R01EB006733, R01EB008374, and 1R01EB009634 from NIH and grant BCS-08-26844 from NSF.


1. Almli CR, Rivkin MJ, McKinstry RC. The NIH MRI Study of Normal Brain Development (Objective-2): Newborns, Infants, Toddlers and Preschoolers. IEEE-TMI. 2007;35:308–25.
2. Diggle P, Heagerty P, Liang KY, Zeger S. Analysis of Longitudinal Data. 2nd ed. Oxford University Press; 2002.
3. Liang KY, Zeger SL. Longitudinal Data Analysis Using Generalized Linear Models. Biometrika. 1986;73:13–22.
4. Baare WF, Hulschoff HE, Boomsma DI, Posthuma D, Schnack HG, van Haren NE, van Oel CJ, Kahn RS. Quantitative Genetic Modeling of Variation in Human Brain Morphology. Cereb. Cortex. 2001;11:816–24.
5. Geschwind DH, Miller BL, DeCarli C, Carmelli D. Heritability of Lobar Brain Volumes in Twins Supports Genetic Models of Cerebral Laterality and Handedness. PNAS. 2002;99:3176–81.
6. Thompson PM, Cannon MD, Narr KL, van Erp T, Poutanen VP, Huttunen M, Lonnqvist J, Standerskjoid-Nordestam CG, Kaprio J, Khaledy M, Dail R, Zoumalan CL, Toga AW. Genetic Influences on Brain Structure. Nat. Neurosci. 2001;4:1253–8.
7. Wright IC, Sham P, Murray RM, Weinberger DR, Bullmore ET. Genetic Contributions to Regional Variability in Human Brain Structure: Methods and Preliminary Results. Neuroimage. 2002;17:256–71.
8. Shen D. Image Registration by Local Histogram Matching. IEEE Trans Med Imaging. 2007;40:1161–1172.
9. Zhu H, Li YM, Tang NS, Bansal R, Hao XJ, Weissman MM, Peterson BS. Statistical Modelling of Brain Morphometric Measures in General Pedigree. Statistica Sinica. 2008;18:1554–1569.