Related Articles

1.  Estimating and Testing Variance Components in a Multi-level GLM 
NeuroImage  2011;59(1):490-501.
Most analysis of multi-subject fMRI data is concerned with determining whether there exists a significant population-wide ‘activation’ in a comparison between two or more conditions. This is typically assessed by testing the average value of a contrast of parameter estimates (COPE) against zero in a general linear model (GLM) analysis. However, important information can also be obtained by testing whether there exist significant individual differences in effect magnitude between subjects, i.e. whether the variance of a COPE is significantly different from zero. Intuitively, such a test amounts to testing whether inter-individual differences are larger than would be expected given the within-subject error variance. We compare several methods for estimating variance components, including a) a naïve estimate using ordinary least squares (OLS); b) linear mixed effects in R (LMER); c) a novel Matlab implementation of iterative generalized least squares (IGLS) and its restricted maximum likelihood variant (RIGLS). All methods produced reasonable estimates of within- and between-subject variance components, with IGLS providing an attractive balance between sensitivity and appropriate control of false positives. Finally, we use the IGLS method to estimate inter-subject variance in a perfusion fMRI study (N = 18) of social evaluative threat, and show evidence for significant inter-individual differences in ventromedial prefrontal cortex (VMPFC), amygdala, hippocampus and medial temporal lobes, insula, and brainstem, with predicted inverse coupling between VMPFC and the midbrain periaqueductal gray only when high inter-individual variance was used to define the seed for functional connectivity analyses. In sum, tests of variance provide a way of selecting regions that show significant inter-individual variability for subsequent analyses that attempt to explain those individual differences.
doi:10.1016/j.neuroimage.2011.07.077
PMCID: PMC3195889  PMID: 21835242
fMRI; variance components; multi-level GLM; likelihood ratio tests; iterative generalized least squares; restricted iterative generalized least squares
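The intuition in the abstract above (between-subject differences judged against within-subject error variance) can be illustrated with a simple method-of-moments calculation for a balanced design. This is a minimal sketch only, loosely in the spirit of the naïve estimate; it is not the authors' IGLS/RIGLS implementation, and the function name and data layout are assumptions made for illustration.

```python
from statistics import mean, variance


def naive_variance_components(data):
    """Method-of-moments variance components for a balanced design.

    data: list of per-subject lists, each holding k repeated measurements
    of the same effect (hypothetical layout, assumed for this sketch).
    Returns (within_subject_variance, between_subject_variance).
    """
    k = len(data[0])
    if len(data) < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need >= 2 subjects with the same number (>= 2) of measurements")

    # Pooled within-subject error variance (equal weights because the design is balanced).
    within = mean([variance(row) for row in data])

    # The variance of subject means contains the true between-subject variance
    # plus within/k of sampling noise, so subtract that noise term.
    subject_means = [mean(row) for row in data]
    between = variance(subject_means) - within / k

    # A negative moment estimate is truncated at zero.
    return within, max(between, 0.0)


within, between = naive_variance_components([[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]])
print(within, between)
```

A between-subject estimate near zero would suggest that the observed spread of subject means is about what within-subject noise alone would produce; a formal test would still require a likelihood-based or similar procedure.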
2.  Variance Components and Genetic Parameters for Milk Production and Lactation Pattern in an Ethiopian Multibreed Dairy Cattle Population 
The objective of this study was to estimate variance components and genetic parameters for lactation milk yield (LY), lactation length (LL), average milk yield per day (YD), initial milk yield (IY), peak milk yield (PY), days to peak (DP) and parameters (ln(a) and c) of the modified incomplete gamma function (MIG) in an Ethiopian multibreed dairy cattle population. The dataset was composed of 5,507 lactation records collected from 1,639 cows in three locations (Bako, Debre Zeit and Holetta) in Ethiopia from 1977 to 2010. Parameters for MIG were obtained from regression analysis of monthly test-day milk data on days in milk. The cows were purebred (Bos indicus) Boran (B) and Horro (H) and their crosses with different fractions of Friesian (F), Jersey (J) and Simmental (S). There were 23 breed groups (B, H, and their crossbreds with F, J, and S) in the population. Fixed and mixed models were used to analyse the data. The fixed model considered herd-year-season, parity and breed group as fixed effects, and residual as random. The single and two-traits mixed animal repeatability models, considered the fixed effects of herd-year-season and parity subclasses, breed as a function of cow H, F, J, and S breed fractions and general heterosis as a function of heterozygosity, and the random additive animal, permanent environment, and residual effects. For the analysis of LY, LL was added as a fixed covariate to all models. Variance components and genetic parameters were estimated using average information restricted maximum likelihood procedures. The results indicated that all traits were affected (p<0.001) by the considered fixed effects. High grade B×F cows (3/16B 13/16F) had the highest least squares means (LSM) for LY (2,490±178.9 kg), IY (10.5±0.8 kg), PY (12.7±0.9 kg), YD (7.6±0.55 kg) and LL (361.4±31.2 d), while B cows had the lowest LSM values for these traits. The LSM of LY, IY, YD, and PY tended to increase from the first to the fifth parity. 
Single-trait analyses yielded low heritability (0.03±0.03 and 0.08±0.02) and repeatability (0.14±0.01 to 0.24±0.02) estimates for LL, DP and parameter c. Medium heritability (0.21±0.03 to 0.33±0.04) and repeatability (0.27±0.02 to 0.53±0.01) estimates were obtained for LY, IY, PY, YD and ln(a). Genetic correlations between LY, IY, PY, YD, ln(a), and LL ranged from 0.59 to 0.99. Spearman’s rank correlations between sire estimated breeding values for LY, LL, IY, PY, YD, ln(a) and c were positive (0.67 to 0.99, p<0.001). These results suggested that selection for IY, PY, YD, or LY would genetically improve lactation milk yield in this Ethiopian dairy cattle population.
doi:10.5713/ajas.2013.13040
PMCID: PMC4093399  PMID: 25049905
Genetic Correlations; Genetic Parameters; Milk Yield; Multibreed; Tropics
3.  Substantial SNP-based heritability estimates for working memory performance 
Translational Psychiatry  2014;4(9):e438-.
Working memory (WM) is an important endophenotype in neuropsychiatric research and its use in genetic association studies is thought to be a promising approach to increase our understanding of psychiatric disease. As for any genetically complex trait, demonstration of sufficient heritability within the specific study context is a prerequisite for conducting genetic studies of that trait. Recently developed methods allow estimating trait heritability using sets of common genetic markers from genome-wide association study (GWAS) data in samples of unrelated individuals. Here we present single-nucleotide polymorphism (SNP)-based heritability estimates ($h^2_{\mathrm{SNP}}$) for a WM phenotype. A Caucasian sample comprising a total of N=2298 healthy and young individuals was subjected to an N-back WM task. We calculated the genetic relationship between all individuals on the basis of genome-wide SNP data and performed restricted maximum likelihood analyses for variance component estimation to derive the $h^2_{\mathrm{SNP}}$ estimates. Heritability estimates for three 2-back derived WM performance measures based on all autosomal chromosomes ranged between 31 and 41%, indicating a substantial SNP-based heritability for WM traits. These results indicate that common genetic factors account for a prominent part of the phenotypic variation in WM performance. Hence, the application of GWAS on WM phenotypes is a valid method to identify the molecular underpinnings of WM.
doi:10.1038/tp.2014.81
PMCID: PMC4203010  PMID: 25203169
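The abstract above mentions calculating the genetic relationship between individuals from genome-wide SNP data. The exact estimator is not stated there, so the sketch below assumes one widely used form: standardize each SNP's allele count by its sample allele frequency and average the cross-products over SNPs. The function name and input layout are hypothetical.

```python
def genetic_relationship_matrix(genotypes):
    """Sketch of a SNP-based genetic relationship matrix (assumed estimator).

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2),
    one entry per SNP. Monomorphic SNPs are skipped because they cannot be
    standardized.
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Sample allele frequency of each SNP.
    freqs = [sum(ind[i] for ind in genotypes) / (2 * n) for i in range(m)]
    usable = [i for i, p in enumerate(freqs) if 0.0 < p < 1.0]
    if not usable:
        raise ValueError("no polymorphic SNPs available")

    # Standardized genotypes: (x - 2p) / sqrt(2p(1 - p)).
    z = [
        [(ind[i] - 2 * freqs[i]) / (2 * freqs[i] * (1 - freqs[i])) ** 0.5 for i in usable]
        for ind in genotypes
    ]

    # Relationship between j and k: mean cross-product over usable SNPs.
    return [
        [sum(a * b for a, b in zip(z[j], z[k])) / len(usable) for k in range(n)]
        for j in range(n)
    ]


grm = genetic_relationship_matrix([[0, 2], [2, 0]])
print(grm)
```

In a real analysis this matrix would feed a restricted maximum likelihood fit with dedicated software; the toy input here only shows the shape of the computation.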
4.  Human metabolic profiles are stably controlled by genetic and environmental variation 
A comprehensive variation map of the human metabolome identifies genetic and stable-environmental sources as major drivers of metabolite concentrations. The data suggest that sample sizes of a few thousand are sufficient to detect metabolite biomarkers predictive of disease.
We designed a longitudinal twin study to characterize the genetic, stable-environmental, and longitudinally fluctuating influences on metabolite concentrations in two human biofluids, urine and plasma, focusing specifically on the representative subset of metabolites detectable by 1H nuclear magnetic resonance (1H NMR) spectroscopy.
We identified widespread genetic and stable-environmental influences on the (urine and plasma) metabolomes, with (30 and 42%) attributable on average to familial sources, and (47 and 60%) attributable to longitudinally stable sources.
Ten of the metabolites annotated in the study are estimated to have >60% familial contribution to their variation in concentration.
Our findings have implications for the design and interpretation of 1H NMR-based molecular epidemiology studies. On the basis of the stable component of variation quantified in the current paper, we specified a model of disease association under which we inferred that sample sizes of a few thousand should be sufficient to detect disease-predictive metabolite biomarkers.
Metabolites are small molecules involved in biochemical processes in living systems. Their concentration in biofluids, such as urine and plasma, can offer insights into the functional status of biological pathways within an organism, and reflect input from multiple levels of biological organization—genetic, epigenetic, transcriptomic, and proteomic—as well as from environmental and lifestyle factors. Metabolite levels have the potential to indicate a broad variety of deviations from the ‘normal' physiological state, such as those that accompany a disease, or an increased susceptibility to disease. A number of recent studies have demonstrated that metabolite concentrations can be used to diagnose disease states accurately. A more ambitious goal is to identify metabolite biomarkers that are predictive of future disease onset, providing the possibility of intervention in susceptible individuals.
If an extreme concentration of a metabolite is to serve as an indicator of disease status, it is usually important to know the distribution of metabolite levels among healthy individuals. It is also useful to characterize the sources of that observed variation in the healthy population. A proportion of that variation—the heritable component—is attributable to genetic differences between individuals, potentially at many genetic loci. An effective, molecular indicator of a heritable, complex disease is likely to have a substantive heritable component. Non-heritable biological variation in metabolite concentrations can arise from a variety of environmental influences, such as dietary intake, lifestyle choices, general physical condition, composition of gut microflora, and use of medication. Variation across a population in stable-environmental influences leads to long-term differences between individuals in their baseline metabolite levels. Dynamic environmental pressures lead to short-term fluctuations within an individual about their baseline level. A metabolite whose concentration changes substantially in response to short-term pressures is relatively unlikely to offer long-term prediction of disease. In summary, the potential suitability of a metabolite to predict disease is reflected by the relative contributions of heritable and stable/unstable-environmental factors to its variation in concentration across the healthy population.
Studies involving twins are an established technique for quantifying the heritable component of phenotypes in human populations. Monozygotic (MZ) twins share the same DNA genome-wide, while dizygotic (DZ) twins share approximately half their inherited DNA, as do ordinary siblings. By comparing the average extent of phenotypic concordance within MZ pairs to that within DZ pairs, it is possible to quantify the heritability of a trait, and also to quantify the familiality, which refers to the combination of heritable and common-environmental effects (i.e., environmental influences shared by twins in a pair). In addition to incorporating twins into the study design, it is useful to quantify the phenotype in some individuals at multiple time points. The longitudinal aspect of such a study allows environmental effects to be decomposed into those that affect the phenotype over the short term and those that exert stable influence.
For the current study, urine and blood samples were collected from a cohort of MZ and DZ twins, with some twins donating samples on two occasions several months apart. Samples were analysed by 1H nuclear magnetic resonance (1H NMR) spectroscopy—an untargeted, discovery-driven technique for quantifying metabolite concentrations in biological samples. The application of 1H NMR to a biological sample creates a spectrum, made up of multiple peaks, with each peak's size quantitatively representing the concentration of its corresponding hydrogen-containing metabolite.
In each biological sample in our study, we extracted a full set of peaks, and thereby quantified the concentrations of all common plasma and urine metabolites detectable by 1H NMR. We developed bespoke statistical methods to decompose the observed concentration variation at each metabolite peak into that originating from familial, individual-environmental, and unstable-environmental sources.
We quantified the variability landscape across all common metabolite peaks in the urine and plasma 1H NMR metabolomes. We annotated a subset of peaks with a total of 65 metabolites; the variance decompositions for these are shown in Figure 1. Ten metabolites' concentrations were estimated to have familial contributions in excess of 60%. The average proportion of stable variation across all extracted metabolite peaks was estimated to be 47% in the urine samples and 60% in the plasma samples; the average estimated familiality was 30% for urine and 42% for plasma. These results comprise the first quantitative variation map of the 1H NMR metabolome. The identification and quantification of substantive widespread stability provides support for the use of these biofluids in molecular epidemiology studies. On the basis of our findings, we performed power calculations for a hypothetical study searching for predictive disease biomarkers among 1H NMR-detectable urine and plasma metabolites. Our calculations suggest that sample sizes of 2000–5000 should allow reliable identification of disease-predictive metabolite concentrations explaining 5–10% of disease risk, while greater sample sizes of 5000–20 000 would be required to identify metabolite concentrations explaining 1–2% of disease risk.
1H Nuclear Magnetic Resonance spectroscopy (1H NMR) is increasingly used to measure metabolite concentrations in sets of biological samples for top-down systems biology and molecular epidemiology. For such purposes, knowledge of the sources of human variation in metabolite concentrations is valuable, but currently sparse. We conducted and analysed a study to create such a resource. In our unique design, identical and non-identical twin pairs donated plasma and urine samples longitudinally. We acquired 1H NMR spectra on the samples, and statistically decomposed variation in metabolite concentration into familial (genetic and common-environmental), individual-environmental, and longitudinally unstable components. We estimate that stable variation, comprising familial and individual-environmental factors, accounts on average for 60% (plasma) and 47% (urine) of biological variation in 1H NMR-detectable metabolite concentrations. Clinically predictive metabolic variation is likely nested within this stable component, so our results have implications for the effective design of biomarker-discovery studies. We provide a power-calculation method which reveals that sample sizes of a few thousand should offer sufficient statistical precision to detect 1H NMR-based biomarkers quantifying predisposition to disease.
doi:10.1038/msb.2011.57
PMCID: PMC3202796  PMID: 21878913
biomarker; 1H nuclear magnetic resonance spectroscopy; metabolome-wide association study; top-down systems biology; variance decomposition
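The comparison of MZ and DZ twin concordance described in the abstract above can be illustrated with the classical ACE approximation (Falconer's formula). This is a textbook sketch and not the bespoke statistical method the authors developed; it assumes that MZ and DZ correlations are already available and that DZ twins share half of the additive genetic effects.

```python
def ace_from_twin_correlations(r_mz, r_dz):
    """Classical ACE decomposition from twin-pair correlations (assumed model).

    r_mz, r_dz: phenotypic correlations within MZ and DZ pairs.
    Under the ACE model r_mz = a2 + c2 and r_dz = a2 / 2 + c2, which gives:
      heritability       a2 = 2 * (r_mz - r_dz)
      common environment c2 = 2 * r_dz - r_mz
      familiality        a2 + c2 = r_mz
    """
    heritability = 2.0 * (r_mz - r_dz)
    common_env = 2.0 * r_dz - r_mz
    familiality = heritability + common_env
    return heritability, common_env, familiality


# Hypothetical correlations, chosen only to show the arithmetic.
h2, c2, fam = ace_from_twin_correlations(0.6, 0.4)
print(round(h2, 3), round(c2, 3), round(fam, 3))
```

Separating stable from fluctuating environmental effects, as the study does, additionally requires repeated samples from the same individuals, which this sketch does not model.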
5.  Single-Tissue and Cross-Tissue Heritability of Gene Expression Via Identity-by-Descent in Related or Unrelated Individuals 
PLoS Genetics  2011;7(2):e1001317.
Family studies of individual tissues have shown that gene expression traits are genetically heritable. Here, we investigate cis and trans components of heritability both within and across tissues by applying variance-components methods to 722 Icelanders from family cohorts, using identity-by-descent (IBD) estimates from long-range phased genome-wide SNP data and gene expression measurements for ∼19,000 genes in blood and adipose tissue. We estimate the proportion of gene expression heritability attributable to cis regulation as 37% in blood and 24% in adipose tissue. Our results indicate that the correlation in gene expression measurements across these tissues is primarily due to heritability at cis loci, whereas there is little sharing of trans regulation across tissues. One implication of this finding is that heritability in tissues composed of heterogeneous cell types is expected to be more dominated by cis regulation than in tissues composed of more homogeneous cell types, consistent with our blood versus adipose results as well as results of previous studies in lymphoblastoid cell lines. Finally, we obtained similar estimates of the cis components of heritability using IBD between unrelated individuals, indicating that transgenerational epigenetic inheritance does not contribute substantially to the “missing heritability” of gene expression in these tissue types.
Author Summary
An important goal in biology is to understand how genotype affects gene expression. Because gene expression varies across tissues, the relationship between genotype and gene expression may be tissue-specific. In this study, we used heritability approaches to study the regulation of gene expression in two tissue types, blood and adipose tissue, as well as the regulation of gene expression that is shared across these tissues. Heritability can be partitioned into cis and trans effects by assessing identity-by-descent (IBD) at the genomic location close to the expressed gene or genome-wide, respectively, and applying variance-components methods to partition the heritability of each gene. We estimated the proportion of gene expression heritability explained by cis regulation as 37% in blood and 24% in adipose tissue. Notably, the heritability shared across tissue types was primarily due to cis regulation. Thus, the relative contribution of cis versus trans regulation is expected to increase with the number of cell types present in the tissue being assayed, just as observed in our study and in a comparison to previous work on lymphoblastoid cell lines (LCL). We specifically ruled out a substantial contribution of transgenerational epigenetic inheritance to heritability of gene expression in these cohorts by repeating our heritability analyses using segments shared IBD in distantly related Icelanders.
doi:10.1371/journal.pgen.1001317
PMCID: PMC3044684  PMID: 21383966
6.  Using Extended Genealogy to Estimate Components of Heritability for 23 Quantitative and Dichotomous Traits 
PLoS Genetics  2013;9(5):e1003520.
Important knowledge about the determinants of complex human phenotypes can be obtained from the estimation of heritability, the fraction of phenotypic variation in a population that is determined by genetic factors. Here, we make use of extensive phenotype data in Iceland, long-range phased genotypes, and a population-wide genealogical database to examine the heritability of 11 quantitative and 12 dichotomous phenotypes in a sample of 38,167 individuals. Most previous estimates of heritability are derived from family-based approaches such as twin studies, which may be biased upwards by epistatic interactions or shared environment. Our estimates of heritability, based on both closely and distantly related pairs of individuals, are significantly lower than those from previous studies. We examine phenotypic correlations across a range of relationships, from siblings to first cousins, and find that the excess phenotypic correlation in these related individuals is predominantly due to shared environment as opposed to dominance or epistasis. We also develop a new method to jointly estimate narrow-sense heritability and the heritability explained by genotyped SNPs. Unlike existing methods, this approach permits the use of information from both closely and distantly related pairs of individuals, thereby reducing the variance of estimates of heritability explained by genotyped SNPs while preventing upward bias. Our results show that common SNPs explain a larger proportion of the heritability than previously thought, with SNPs present on Illumina 300K genotyping arrays explaining more than half of the heritability for the 23 phenotypes examined in this study. Much of the remaining heritability is likely to be due to rare alleles that are not captured by standard genotyping arrays.
Author Summary
Phenotype is a function of a genome and its environment. Heritability is the fraction of variation in a phenotype determined by genetic factors in a population. Current methods to estimate heritability rely on the phenotypic correlations of closely related individuals and are potentially upwardly biased, due to the impact of epistasis and shared environment. We develop new methods to estimate heritability over both closely and distantly related individuals. By examining the phenotypic correlation among different types of related individuals such as siblings, half-siblings, and first cousins, we show that shared environment is the primary determinant of inflated estimates of heritability. For a large number of phenotypes, it is not known how much of the heritability is explained by SNPs included on current genotyping platforms. Existing methods to estimate this component of heritability are biased in the presence of related individuals. We develop a method that permits the inclusion of both closely and distantly related individuals when estimating heritability explained by genotyped SNPs and use it to make estimates for 23 medically relevant phenotypes. These estimates can be used to increase our understanding of the distribution and frequency of functionally relevant variants and thereby inform the design of future studies.
doi:10.1371/journal.pgen.1003520
PMCID: PMC3667752  PMID: 23737753
7.  On estimation of genetic variance within families using genome-wide identity-by-descent sharing 
Background
Traditionally, heritability and other genetic parameters are estimated from between-family variation. With the advent of dense genotyping, it is now possible to compute the proportion of the genome that is shared by pairs of sibs and thus undertake the estimation within families, thereby avoiding environmental covariances of family members. Formulae for the sampling variance of estimates have been derived previously for families with two sibs, which are relevant for humans, but sampling errors are large. In livestock and plants much larger families can be obtained, and simulation has shown sampling variances are then much smaller.
Methods
Based on the assumptions that realised relationship of sibs can be obtained from genomic data and that data are analyzed by restricted maximum likelihood, formulae were derived for the sampling variance of the estimates of genetic variance for arbitrary family sizes. The analysis used statistical differentiation, assuming the variance of relationships is small.
Results
The variance of the estimate of the additive genetic variance was approximately proportional to $1/(f n^2 \sigma_R^2)$, for $f$ families of size $n$ and variance of relationships $\sigma_R^2$.
Conclusions
Because the standard error of the estimate of heritability decreased in proportion to family size, the use of within-family information becomes increasingly efficient as the family size increases. There are, however, limitations, such as near-complete confounding of additive and dominance variances in full-sib families.
doi:10.1186/1297-9686-45-32
PMCID: PMC3871764  PMID: 24007429
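As a short worked step connecting the stated result to the conclusion, taking the proportionality reported above at face value, with $f$ families of size $n$, variance of relationships $\sigma_R^2$, and $N = fn$ introduced here as shorthand for the total number of individuals:

$$
\operatorname{Var}(\hat{\sigma}_A^2) \propto \frac{1}{f\,n^2\,\sigma_R^2}
\quad\Longrightarrow\quad
\operatorname{SE}(\hat{\sigma}_A^2) \propto \frac{1}{n\,\sigma_R\sqrt{f}},
\qquad
\operatorname{Var}(\hat{\sigma}_A^2) \propto \frac{1}{N\,n\,\sigma_R^2}.
$$

Under this approximation, doubling the family size halves the standard error, whereas doubling the number of families reduces it only by a factor of $\sqrt{2}$. For a fixed total $N$, fewer but larger families therefore give the more precise estimate.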
8.  The Robustness of Generalized Estimating Equations for Association Tests in Extended Family Data 
Human heredity  2012;74(1):17-26.
Variance-component analysis (VCA), the traditional method for handling correlations within families in genetic association studies, is computationally intensive for genome-wide analyses, and the computational burden of VCA, a likelihood-based test, increases with family size and the number of genetic markers. Alternative approaches that do not require the computation of familial correlations are preferable, provided that they do not inflate type I error or decrease power. We performed a simulation study to evaluate practical alternatives to VCA that use regression with generalized estimating equations (GEE) in extended family data. We compared the properties of linear regression with GEE applied to an entire extended family structure (GEE-EXT) and GEE applied to nuclear family structures split from these extended families (GEE-SPL) to variance-components likelihood-based methods (FastAssoc). GEE-EXT was evaluated with and without robust variance estimators to estimate the standard errors. We observed similar average type I error rates from GEE-EXT and FastAssoc compared to GEE-SPL. Type I error rates for the GEE-EXT method with a robust variance estimator were marginally higher than the nominal rate when the MAF was < 0.1, but were close to the nominal rate when MAF ≥ 0.2. All methods gave consistent effect estimates and had similar power. In summary, the GEE framework with the robust variance estimator, the computationally fastest and least data management intensive, appears to work well in extended families and thus provides a reasonable alternative to full variance components approaches for extended pedigrees in the GWAS setting.
doi:10.1159/000341636
PMCID: PMC3736986  PMID: 23038411
Generalized estimating equation; Variance components analysis; Family-based association study; Genome-wide scan
9.  Evaluating alternate models to estimate genetic parameters of calving traits in United Kingdom Holstein-Friesian dairy cattle 
Background
The focus in dairy cattle breeding is gradually shifting from production to functional traits and genetic parameters of calving traits are estimated more frequently. However, across countries, various statistical models are used to estimate these parameters. This study evaluates different models for calving ease and stillbirth in United Kingdom Holstein-Friesian cattle.
Methods
Data from first and later parity records were used. Genetic parameters for calving ease, stillbirth and gestation length were estimated using the restricted maximum likelihood method, considering different models, i.e., sire (−maternal grandsire), animal, univariate and bivariate models. Gestation length was fitted as a correlated indicator trait and, for all three traits, genetic correlations between first and later parities were estimated. Potential bias in estimates was avoided by acknowledging a possible environmental direct-maternal covariance. The total heritable variance was estimated for each trait to discuss its theoretical importance and practical value. Prediction error variances and accuracies were calculated to compare the models.
Results and discussion
On average, direct and maternal heritabilities for calving traits were low, except for direct gestation length. Calving ease in first parity had a significant and negative direct-maternal genetic correlation. Gestation length was maternally correlated to stillbirth in first parity and directly correlated to calving ease in later parities. Multi-trait models had a slightly greater predictive ability than univariate models, especially for the lowly heritable traits. The computation time needed for sire (−maternal grandsire) models was much smaller than for animal models with only small differences in accuracy. The sire (−maternal grandsire) model was robust when additional genetic components were estimated, while the equivalent animal model had difficulties reaching convergence.
Conclusions
For the evaluation of calving traits, multi-trait models show a slight advantage over univariate models. Extended sire models (−maternal grandsire) are more practical and robust than animal models. Estimated genetic parameters for calving traits of UK Holstein cattle are consistent with literature. Calculating an aggregate estimated breeding value including direct and maternal values should encourage breeders to consider both direct and maternal effects in selection decisions.
doi:10.1186/1297-9686-44-23
PMCID: PMC3468354  PMID: 22839757
10.  Estimating Heritabilities and Genetic Correlations: Comparing the ‘Animal Model’ with Parent-Offspring Regression Using Data from a Natural Population 
PLoS ONE  2008;3(3):e1739.
Quantitative genetic parameters are nowadays more frequently estimated with restricted maximum likelihood using the ‘animal model’ than with traditional methods such as parent-offspring regressions. These methods have however rarely been evaluated using equivalent data sets. We compare heritabilities and genetic correlations from animal model and parent-offspring analyses, respectively, using data on eight morphological traits in the great reed warbler (Acrocephalus arundinaceus). Animal models were run using either mean trait values or individual repeated measurements to be able to separate between effects of including more extended pedigree information and effects of replicated sampling from the same individuals. We show that the inclusion of more pedigree information by the use of mean traits animal models had limited effect on the standard error and magnitude of heritabilities. In contrast, the use of repeated measures animal model generally had a positive effect on the sampling accuracy and resulted in lower heritabilities; the latter due to lower additive variance and higher phenotypic variance. For most trait combinations, both animal model methods gave genetic correlations that were lower than the parent-offspring estimates, whereas the standard errors were lower only for the mean traits animal model. We conclude that differences in heritabilities between the animal model and parent-offspring regressions were mostly due to the inclusion of individual replicates to the animal model rather than the inclusion of more extended pedigree information. Genetic correlations were, on the other hand, primarily affected by the inclusion of more pedigree information. This study is to our knowledge the most comprehensive empirical evaluation of the performance of the animal model in relation to parent-offspring regressions in a wild population. 
Our conclusions should be valuable for reconciliation of data obtained in earlier studies as well as for future meta-analyses utilizing estimates from both traditional methods and the animal model.
doi:10.1371/journal.pone.0001739
PMCID: PMC2254494  PMID: 18320057
11.  A Kernel of Truth: Statistical Advances in Polygenic Variance Component Models for Complex Human Pedigrees 
Advances in genetics  2013;81:1-31.
Statistical genetic analysis of quantitative traits in large pedigrees is a formidable computational task due to the necessity of taking the non-independence among relatives into account. With the growing awareness that rare sequence variants may be important in human quantitative variation, heritability and association study designs involving large pedigrees will increase in frequency due to the greater chance of observing multiple copies of rare variants amongst related individuals. Therefore, it is important to have statistical genetic test procedures that utilize all available information for extracting evidence regarding genetic association. Optimal testing for marker/phenotype association involves the exact calculation of the likelihood ratio statistic which requires the repeated inversion of potentially large matrices. In a whole genome sequence association context, such computation may be prohibitive. Toward this end, we have developed a rapid and efficient eigensimplification of the likelihood that makes analysis of family data commensurate with the analysis of a comparable sample of unrelated individuals. Our theoretical results which are based on a spectral representation of the likelihood yield simple exact expressions for the expected likelihood ratio test statistic (ELRT) for pedigrees of arbitrary size and complexity. For heritability, the ELRT is $-\sum_i \ln[1+\hat{h}^2(\lambda_{gi}-1)]$, where $\hat{h}^2$ and $\lambda_{gi}$ are respectively the heritability and eigenvalues of the pedigree-derived genetic relationship kernel (GRK). For association analysis of sequence variants, the ELRT is given by $\mathrm{ELRT}[h_q^2>0:\text{unrelateds}]-(\mathrm{ELRT}[h_t^2>0:\text{pedigrees}]-\mathrm{ELRT}[h_r^2>0:\text{pedigrees}])$, where $h_t^2$, $h_q^2$, and $h_r^2$ are the total, quantitative trait nucleotide, and residual heritabilities, respectively. Using these results, fast and accurate analytical power analyses are possible, eliminating the need for computer simulation. 
Additional benefits of eigensimplification include a simple method for calculation of the exact distribution of the ELRT under the null hypothesis which turns out to differ from that expected under the usual asymptotic theory. Further, when combined with the use of empirical GRKs (estimated over a large number of genetic markers), our theory reveals potential problems associated with non positive semi-definite kernels. These procedures are being added to our general statistical genetic computer package, SOLAR.
doi:10.1016/B978-0-12-407677-8.00001-4
PMCID: PMC4019427  PMID: 23419715
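The heritability ELRT quoted in the abstract above is a plain sum over kernel eigenvalues, so it can be evaluated in a few lines. The following is a minimal sketch, not the SOLAR implementation: the function name `expected_lrt_heritability` and the example eigenvalues are assumptions made for illustration, and the guard against non-positive log arguments is an added safety check rather than something the abstract specifies.

```python
import math


def expected_lrt_heritability(h2, eigenvalues):
    """Evaluate -sum_i ln[1 + h2 * (lambda_i - 1)] as stated in the abstract.

    h2          -- heritability, assumed to lie in [0, 1)
    eigenvalues -- eigenvalues of the genetic relationship kernel (GRK)
    """
    total = 0.0
    for lam in eigenvalues:
        arg = 1.0 + h2 * (lam - 1.0)
        # A negative eigenvalue (non positive semi-definite kernel) can push
        # the argument to zero or below, where the logarithm is undefined.
        if arg <= 0.0:
            raise ValueError("log argument is non-positive; check the kernel")
        total += math.log(arg)
    return -total


# Hypothetical eigenvalues: all equal to 1 mimics unrelated individuals,
# for which every term vanishes.
unrelated = expected_lrt_heritability(0.5, [1.0, 1.0, 1.0, 1.0])
# Hypothetical eigenvalues for one pair of relatives.
related = expected_lrt_heritability(0.5, [1.5, 0.5])
```

When every eigenvalue equals 1 the sum is zero, which matches the intuition that a sample of unrelated individuals carries no information about heritability through relatedness.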
12.  Assumption-Free Estimation of Heritability from Genome-Wide Identity-by-Descent Sharing between Full Siblings 
PLoS Genetics  2006;2(3):e41.
The study of continuously varying, quantitative traits is important in evolutionary biology, agriculture, and medicine. Variation in such traits is attributable to many, possibly interacting, genes whose expression may be sensitive to the environment, which makes their dissection into underlying causative factors difficult. An important population parameter for quantitative traits is heritability, the proportion of total variance that is due to genetic factors. Response to artificial and natural selection and the degree of resemblance between relatives are all a function of this parameter. Following the classic paper by R. A. Fisher in 1918, the estimation of additive and dominance genetic variance and heritability in populations is based upon the expected proportion of genes shared between different types of relatives, and explicit, often controversial and untestable models of genetic and non-genetic causes of family resemblance. With genome-wide coverage of genetic markers it is now possible to estimate such parameters solely within families using the actual degree of identity-by-descent sharing between relatives. Using genome scans on 4,401 quasi-independent sib pairs of which 3,375 pairs had phenotypes, we estimated the heritability of height from empirical genome-wide identity-by-descent sharing, which varied from 0.374 to 0.617 (mean 0.498, standard deviation 0.036). The variance in identity-by-descent sharing per chromosome and per genome was consistent with theory. The maximum likelihood estimate of the heritability for height was 0.80 with no evidence for non-genetic causes of sib resemblance, consistent with results from independent twin and family studies but using an entirely separate source of information. Our application shows that it is feasible to estimate genetic variance solely from within-family segregation and provides an independent validation of previously untestable assumptions. 
Given sufficient data, our new paradigm will allow the estimation of genetic variation for disease susceptibility and quantitative traits that is free from confounding with non-genetic factors and will allow partitioning of genetic variation into additive and non-additive components.
Synopsis
Quantitative geneticists attempt to understand variation between individuals within a population for traits such as height in humans and the number of bristles in fruit flies. This has been traditionally done by partitioning the variation in underlying sources due to genetic and environmental factors, using the observed amount of variation between and within families. A problem with this approach is that one can never be sure that the estimates are correct, because nature and nurture can be confounded without one knowing it. The authors got around this problem by comparing the similarity between relatives as a function of the exact proportion of genes that they have in common, looking only within families. Using this approach, the authors estimated the amount of total variation for height in humans that is due to genetic factors from 3,375 sibling pairs. For each pair, the authors estimated the proportion of genes that they share from DNA markers. It was found that about 80% of the total variation can be explained by genetic factors, close to results that are obtained from classical studies. This study provides the first validation of an estimate of genetic variation by using a source of information that is free from nature–nurture assumptions.
doi:10.1371/journal.pgen.0020041
PMCID: PMC1413498  PMID: 16565746
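The idea in the entry above, relating sibling resemblance to the realized proportion of the genome shared identical by descent, can be illustrated with a much simpler estimator than the maximum likelihood procedure the authors used. The sketch below swaps in a Haseman-Elston style regression: with phenotypes standardized to mean 0 and variance 1, the slope of the sib-pair cross-product on IBD sharing estimates the additive genetic proportion of variance. The function names and the toy data are assumptions for illustration only.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def sibpair_heritability(ibd, pheno_sib1, pheno_sib2):
    """Regress the cross-product of standardized sib phenotypes on IBD sharing.

    This is a regression stand-in, not the authors' likelihood method.
    The intercept absorbs resemblance that does not depend on IBD sharing.
    """
    cross_products = [a * b for a, b in zip(pheno_sib1, pheno_sib2)]
    return least_squares_slope(ibd, cross_products)


# Toy data constructed so the cross-products lie exactly on 0.1 + 0.8 * ibd.
ibd = [0.40, 0.45, 0.50, 0.55, 0.60]
sib1 = [1.0] * len(ibd)
sib2 = [0.1 + 0.8 * p for p in ibd]
h2_estimate = sibpair_heritability(ibd, sib1, sib2)
```

Because IBD sharing between full siblings varies only narrowly around 0.5, real estimates from this kind of regression have large sampling error, which is why thousands of pairs are needed.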
13.  Correcting for bias in estimation of quantitative trait loci effects 
Estimates of quantitative trait loci (QTL) effects derived from complete genome scans are biased, if no assumptions are made about the distribution of QTL effects. Bias should be reduced if estimates are derived by maximum likelihood, with the QTL effects sampled from a known distribution. The parameters of the distributions of QTL effects for nine economic traits in dairy cattle were estimated from a daughter design analysis of the Israeli Holstein population including 490 marker-by-sire contrasts. A separate gamma distribution was derived for each trait. Estimates for both the α and β parameters and their SE decreased as a function of heritability. The maximum likelihood estimates derived for the individual QTL effects using the gamma distributions for each trait were regressed relative to the least squares estimates, but the regression factor decreased as a function of the least squares estimate. On simulated data, the mean of least squares estimates for effects with nominal 1% significance was more than twice the simulated values, while the mean of the maximum likelihood estimates was slightly lower than the mean of the simulated values. The coefficient of determination for the maximum likelihood estimates was five-fold the corresponding value for the least squares estimates.
doi:10.1186/1297-9686-37-6-501
PMCID: PMC2697222  PMID: 16093012
genetic markers; quantitative trait loci; genome scans; maximum likelihood; dairy cattle
14.  Heritability of Different Forms of Memory in the Late Onset Alzheimer’s Disease Family Study 
The study aim was to estimate the genetic contribution to individual differences in different forms of memory in a large family-based group of older adults. As part of the Late Onset Alzheimer’s Disease Family Study, 899 persons (277 with Alzheimer’s disease, 622 unaffected) from 325 families completed a battery of memory tests from which previously established composite measures of episodic memory, semantic memory, and working memory were derived. Heritability in these measures was estimated using the maximum likelihood variance component method, controlling for age, sex, and education. In analyses of unaffected family members, the adjusted heritability estimates were 0.62 for episodic memory, 0.49 for semantic memory, and 0.72 for working memory, where a heritability estimate of 1 indicates that genetic factors explain all of the phenotypic variance and a heritability of 0 indicates that genetic factors explain none. Adjustment for APOE genotype had little effect on these estimates. When analyses included affected and unaffected family members, adjusted heritability estimates were lower (0.47 for episodic memory, 0.32 for semantic memory, 0.42 for working memory). Adjusting for APOE slightly reduced the estimate for episodic memory (0.40) but had no effect on the remaining estimates. The results indicate that memory functions are under strong genetic influence in older persons with and without AD, only partly attributable to APOE. This suggests that genetic analyses of memory endophenotypes may help to identify genetic variants associated with AD.
doi:10.3233/JAD-2010-101515
PMCID: PMC3130303  PMID: 20930268
Alzheimer’s disease; memory; heritability; apolipoprotein E
15.  Multilocus estimation of selfing and its heritability 
Heredity  2012;109(3):173-179.
We describe a new method of estimating the selfing rate (S) in a mixed mating population based on a population structure approach that accounts for possible intergenerational correlation in selfing rate, giving rise to an estimate of the upper limit for heritability of selfing rate (h2). A correlation between generations in selfing rate is shown to affect one- and two-locus probabilities of identity by descent. Conventional estimates of selfing rate based on a population structure approach are positively biased by intergenerational correlation in selfing. Multilocus genotypes of individuals are used to give maximum-likelihood estimates of S and h2 in the presence of scoring artifacts. Our multilocus estimation of selfing rate and its heritability (MESH) method was tested with simulated data for a range of conditions. Selfing rate estimates from MESH have low bias and root mean squared error, whereas estimates of the heritability of selfing rate have more uncertainty. Increasing the number of individuals in a sample helps to reduce bias and root mean squared error more than increasing the number of loci of sampled individuals. Improved estimates of selfing rate, as well as estimates of its heritability, can be obtained with this method, although a large number of loci and individuals are needed to achieve best results.
doi:10.1038/hdy.2012.27
PMCID: PMC3424919  PMID: 22617475
selfing; mixed mating system; descent measures; heritability; bias of selfing rate estimates
16.  The estimation of genetic relationships using molecular markers and their efficiency in estimating heritability in natural populations 
Molecular marker data collected from natural populations allows information on genetic relationships to be established without referencing an exact pedigree. Numerous methods have been developed to exploit the marker data. These fall into two main categories: method of moment estimators and likelihood estimators. Method of moment estimators are essentially unbiased, but utilise weighting schemes that are only optimal if the analysed pair is unrelated. Thus, they differ in their efficiency at estimating parameters for different relationship categories. Likelihood estimators show smaller mean squared errors but are much more biased. Both types of estimator have been used in variance component analysis to estimate heritability. All marker-based heritability estimators require that adequate levels of the true relationship be present in the population of interest and that adequate amounts of informative marker data are available. I review the different approaches to relationship estimation, with particular attention to optimizing the use of this relationship information in subsequent variance component estimation.
doi:10.1098/rstb.2005.1675
PMCID: PMC1569511  PMID: 16048788
Markov Chain Monte Carlo; allele frequency; relatedness; pedigree reconstruction; likelihood
17.  Quantitative genetics of cortical bone mass in healthy 10-year-old children from the Fels Longitudinal Study 
Bone  2006;40(2):464-470.
The genetic influences on bone mass likely change throughout the life span, but most genetic studies of bone mass regulation have focused on adults. There is, however, a growing awareness of the importance of genes influencing the acquisition of bone mass during childhood on lifelong bone health. The present investigation examines genetic influences on childhood bone mass by estimating the residual heritabilities of different measures of second metacarpal bone mass in a sample of 600 10-year-old participants from 144 families in the Fels Longitudinal Study. Bivariate quantitative genetic analyses were conducted to estimate genetic correlations between cortical bone mass measures, and measures of bone growth and development. Using a maximum likelihood-based variance components method for pedigree data, we found a residual heritability estimate of 0.71 for second metacarpal cortical index. Residual heritability estimates for individual measures of cortical bone (e.g., lateral cortical thickness, medial cortical thickness) ranged from 0.47 to 0.58, at this pre-pubertal childhood age. Low genetic correlations were found between cortical bone measures and both bone length and skeletal age. However, after Bonferroni adjustment for multiple testing, ρG was not significantly different from 0 for any of these pairs of traits. Results of this investigation provide evidence of significant genetic control over bone mass largely independent of maturation while bones are actively growing and before rapid accrual of bone that typically occurs during puberty.
doi:10.1016/j.bone.2006.09.015
PMCID: PMC1945206  PMID: 17056310
bone size; genetics; radiography; maturation
18.  Twin Studies and Their Implications for Molecular Genetic Studies: Endophenotypes Integrate Quantitative and Molecular Genetics in ADHD Research 
Objective
To describe the utility of twin studies for attention-deficit/hyperactivity disorder (ADHD) research and demonstrate their potential for the identification of alternative phenotypes suitable for genomewide association, developmental risk assessment, treatment response, and intervention targets.
Method
Brief descriptions of the classic twin study and genetic association study methods are provided, with illustrative findings from ADHD research. Biometrical genetics refers to the statistical modeling of data gathered from one or more groups of known biological relation; the term was apparently coined by Francis Galton in the 1860s and led to the “Biometrical School” at the University of London. Twin studies use genetic correlations between pairs of relatives, derived using this theoretical framework, to parse the individual differences in a trait into latent (unmeasured) genetic and environmental influences. This method enables the estimation of heritability, i.e., the percentage of variance due to genetic influences. It is usually implemented with a method called structural equation modeling, which is a statistical technique for fitting models to data, typically using maximum likelihood estimation. Genetic association studies aim to identify those genetic variants that account for the heritability estimated in twin studies. Measurements other than those used for the clinical diagnosis of the disorder are popular phenotype choices in current ADHD research. It is argued that twin studies have great potential to refine phenotypes relevant to ADHD.
Results
Prior studies have consistently found that the majority of the variance in ADHD symptoms is due to genetic factors. To date, genomewide association studies of ADHD have not identified replicable associations that account for the heritable variation. Possibly, the application of genomewide association studies to these alternative phenotypic measurements will assist in identifying the pathways from genetic variants to ADHD.
Conclusion
Power to detect associations should be improved by the study of highly heritable endophenotypes for ADHD and by reducing the number of phenotypes to be considered. Therefore, twin studies are an important research tool in the development of endophenotypes, defined as alternative, more highly heritable traits that act at earlier stages of the pathway from genes to behavior. Although genetic variation in liability to ADHD is likely polygenic, the proposed approach should help to identify improved alternative measurements for genetic association studies.
doi:10.1016/j.jaac.2010.06.006
PMCID: PMC3148177  PMID: 20732624
twin studies; ADHD; genomewide association; endophenotype; translational research
19.  Assumptions and Properties of Limiting Pathway Models for Analysis of Epistasis in Complex Traits 
PLoS ONE  2013;8(7):e68913.
For most complex traits, results from genome-wide association studies show that the proportion of the phenotypic variance attributable to the additive effects of individual SNPs, that is, the heritability explained by the SNPs, is substantially less than the estimate of heritability obtained by standard methods using correlations between relatives. This difference has been called the “missing heritability”. One explanation is that heritability estimates from family (including twin) studies are biased upwards. Zuk et al. revisited overestimation of narrow sense heritability from twin studies as a result of confounding with non-additive genetic variance. They propose a limiting pathway (LP) model that generates significant epistatic variation and its simple parametrization provides a convenient way to explore implications of epistasis. They conclude that over-estimation of narrow sense heritability from family data (‘phantom heritability’) may explain an important proportion of missing heritability. We show that for highly heritable quantitative traits large phantom heritability estimates from twin studies are possible only if a large contribution of common environment is assumed. The LP model is underpinned by strong assumptions that are unlikely to hold, including that all contributing pathways have the same mean and variance and are uncorrelated. Here, we relax the assumptions that underlie the LP model to be more biologically plausible. Together with theoretical, empirical, and pragmatic arguments we conclude that in outbred populations the contribution of additive genetic variance is likely to be much more important than the contribution of non-additive variance.
doi:10.1371/journal.pone.0068913
PMCID: PMC3728313  PMID: 23935903
20.  Potential application of item-response theory to interpretation of medical codes in electronic patient records 
Background
Electronic patient records are generally coded using extensive sets of codes but the significance of the utilisation of individual codes may be unclear. Item response theory (IRT) models are used to characterise the psychometric properties of items included in tests and questionnaires. This study asked whether the properties of medical codes in electronic patient records may be characterised through the application of item response theory models.
Methods
Data were provided by a cohort of 47,845 participants from 414 family practices in the UK General Practice Research Database (GPRD) with a first stroke between 1997 and 2006. Each eligible stroke code, out of a set of 202 OXMIS and Read codes, was coded as either recorded or not recorded for each participant. A two parameter IRT model was fitted using marginal maximum likelihood estimation. Estimated parameters from the model were considered to characterise each code with respect to the latent trait of stroke diagnosis. The location parameter is referred to as a calibration parameter, while the slope parameter is referred to as a discrimination parameter.
Results
There were 79,874 stroke code occurrences available for analysis. Utilisation of codes varied between family practices with intraclass correlation coefficients of up to 0.25 for the most frequently used codes. IRT analyses were restricted to 110 Read codes. Calibration and discrimination parameters were estimated for 77 (70%) codes that were endorsed for 1,942 stroke patients. Parameters were not estimated for the remaining more frequently used codes. Discrimination parameter values ranged from 0.67 to 2.78, while calibration parameter values ranged from 4.47 to 11.58. The two-parameter model gave a better fit to the data than either the one- or three-parameter models. However, high chi-square values for about a fifth of the stroke codes were suggestive of poor item fit.
Conclusion
The application of item response theory models to coded electronic patient records might potentially contribute to identifying medical codes that offer poor discrimination or low calibration. This might indicate the need for improved coding sets or a requirement for improved clinical coding practice. However, in this study estimates were only obtained for a small proportion of participants and there was some evidence of poor model fit. There was also evidence of variation in the utilisation of codes between family practices raising the possibility that, in practice, properties of codes may vary for different coders.
doi:10.1186/1471-2288-11-168
PMCID: PMC3261214  PMID: 22176509
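For readers unfamiliar with the two-parameter IRT model mentioned in the entry above, the standard two-parameter logistic form gives the probability that an item (here, a medical code) is endorsed as a function of the latent trait. The sketch below shows only that response function, not the marginal maximum likelihood fitting the study performed. The function name is an assumption, and the parameter values in the usage lines are simply the ends of the ranges reported in the abstract, used for illustration.

```python
import math


def two_parameter_logistic(theta, discrimination, calibration):
    """Probability of endorsement under the standard 2PL model.

    theta          -- latent trait value for a participant
    discrimination -- slope parameter (how sharply the item separates people)
    calibration    -- location parameter (trait level giving probability 0.5)
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - calibration)))


# At theta equal to the calibration parameter the probability is one half,
# whatever the discrimination.
p_at_location = two_parameter_logistic(4.47, 2.78, 4.47)

# A higher discrimination gives a steeper rise one unit above the location.
p_steep = two_parameter_logistic(5.47, 2.78, 4.47)
p_shallow = two_parameter_logistic(5.47, 0.67, 4.47)
```

Large calibration values such as those reported mean a code is recorded only at high levels of the latent trait, which is one way to read a rarely used code.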
21.  Genetic Architecture of Knee Radiographic Joint Space in Healthy Young Adults 
Human biology  2008;80(1):1-9.
Evidence of a significant genetic component to the age-related degenerative joint disease osteoarthritis has been established, but the nature of genetic influences on normal joint morphology in healthy individuals remains unclear. Following up on our previous findings on the influence of body habitus on phenotypic variation in knee joint space [Duren et al., Human Biology 78:353–364 (2006)], the objective of the current study was to estimate the heritability of radiographic joint space in the knees of healthy young adults from a community-based sample of families. A sample of 253 subjects (mean age = 18.02 years) from 87 randomly ascertained nuclear and extended families was examined. Joint width (JW) and minimum joint space in the medial (MJS) and lateral (LJS) knee compartments were measured. A maximum-likelihood variance components method was used to estimate the heritability of MJS, LJS, and JW. Covariate effects of age, sex, age-by-sex interactions, stature, weight, and BMI were simultaneously estimated. Genetic correlation analyses were then conducted to examine relationships between trait pairs. MJS, LJS, and JW were each significantly heritable (p < 0.001), with heritabilities of 0.52, 0.53, and 0.63, respectively. The genetic correlation between MJS and LJS was not significantly different from 1. Genetic correlations between each joint space measure and JW were not significantly different from 0. This study demonstrates a significant genetic component to radiographic knee joint space during young adulthood in healthy subjects. This suggests that there are specific but as yet unidentified genes that influence the morphology of healthy articular cartilage, the target tissue of osteoarthritis. Genetic correlation analyses indicate complete pleiotropy between MJS and LJS but genetic independence of joint space and JW.
PMCID: PMC3988673  PMID: 18505041
KNEE JOINT SPACE; CARTILAGE; OSTEOARTHRITIS; HERITABILITY OF OSTEOARTHRITIS; X-RAY; FELS LONGITUDINAL STUDY
22.  Quantifying Missing Heritability at Known GWAS Loci 
PLoS Genetics  2013;9(12):e1003993.
Recent work has shown that much of the missing heritability of complex traits can be resolved by estimates of heritability explained by all genotyped SNPs. However, it is currently unknown how much heritability is missing due to poor tagging or additional causal variants at known GWAS loci. Here, we use variance components to quantify the heritability explained by all SNPs at known GWAS loci in nine diseases from WTCCC1 and WTCCC2. After accounting for expectation, we observed all SNPs at known GWAS loci to explain more heritability than GWAS-associated SNPs on average. For some diseases, this increase was individually significant: for Multiple Sclerosis (MS) and for Crohn's Disease (CD); all analyses of autoimmune diseases excluded the well-studied MHC region. Additionally, we found that GWAS loci from other related traits also explained significant heritability. The union of all autoimmune disease loci explained more MS heritability than known MS SNPs and more CD heritability than known CD SNPs, with an analogous increase for all autoimmune diseases analyzed. We also observed significant increases in an analysis of Rheumatoid Arthritis (RA) samples typed on ImmunoChip, with more heritability from all SNPs at GWAS loci and more heritability from all autoimmune disease loci compared to known RA SNPs (including those identified in this cohort). Our methods adjust for LD between SNPs, which can bias standard estimates of heritability from SNPs even if all causal variants are typed. By comparing adjusted estimates, we hypothesize that the genome-wide distribution of causal variants is enriched for low-frequency alleles, but that causal variants at known GWAS loci are skewed towards common alleles. These findings have important ramifications for fine-mapping study design and our understanding of complex disease architecture.
Author Summary
Heritable diseases have an unknown underlying “genetic architecture” that defines the distribution of effect-sizes for disease-causing mutations. Understanding this genetic architecture is an important first step in designing disease-mapping studies, and many theories have been developed on the nature of this distribution. Here, we evaluate the hypothesis that additional heritable variation lies at previously known associated loci but is not fully explained by the single most associated marker. We develop methods based on variance-components analysis to quantify this type of “local” heritability, demonstrating that standard strategies can be falsely inflated or deflated due to correlation between neighboring markers, and propose a robust adjustment. In analysis of nine common diseases we find a significant average increase of local heritability, consistent with multiple common causal variants at an average locus. Intriguingly, for autoimmune diseases we also observe significant local heritability in loci not associated with the specific disease but with other autoimmune diseases, implying a highly correlated underlying disease architecture. These findings have important implications for the design of future studies and our general understanding of common disease.
doi:10.1371/journal.pgen.1003993
PMCID: PMC3873246  PMID: 24385918
23.  Partitioning the Heritability of Tourette Syndrome and Obsessive Compulsive Disorder Reveals Differences in Genetic Architecture 
Davis, Lea K. | Yu, Dongmei | Keenan, Clare L. | Gamazon, Eric R. | Konkashbaev, Anuar I. | Derks, Eske M. | Neale, Benjamin M. | Yang, Jian | Lee, S. Hong | Evans, Patrick | Barr, Cathy L. | Bellodi, Laura | Benarroch, Fortu | Berrio, Gabriel Bedoya | Bienvenu, Oscar J. | Bloch, Michael H. | Blom, Rianne M. | Bruun, Ruth D. | Budman, Cathy L. | Camarena, Beatriz | Campbell, Desmond | Cappi, Carolina | Cardona Silgado, Julio C. | Cath, Danielle C. | Cavallini, Maria C. | Chavira, Denise A. | Chouinard, Sylvain | Conti, David V. | Cook, Edwin H. | Coric, Vladimir | Cullen, Bernadette A. | Deforce, Dieter | Delorme, Richard | Dion, Yves | Edlund, Christopher K. | Egberts, Karin | Falkai, Peter | Fernandez, Thomas V. | Gallagher, Patience J. | Garrido, Helena | Geller, Daniel | Girard, Simon L. | Grabe, Hans J. | Grados, Marco A. | Greenberg, Benjamin D. | Gross-Tsur, Varda | Haddad, Stephen | Heiman, Gary A. | Hemmings, Sian M. J. | Hounie, Ana G. | Illmann, Cornelia | Jankovic, Joseph | Jenike, Michael A. | Kennedy, James L. | King, Robert A. | Kremeyer, Barbara | Kurlan, Roger | Lanzagorta, Nuria | Leboyer, Marion | Leckman, James F. | Lennertz, Leonhard | Liu, Chunyu | Lochner, Christine | Lowe, Thomas L. | Macciardi, Fabio | McCracken, James T. | McGrath, Lauren M. | Mesa Restrepo, Sandra C. | Moessner, Rainald | Morgan, Jubel | Muller, Heike | Murphy, Dennis L. | Naarden, Allan L. | Ochoa, William Cornejo | Ophoff, Roel A. | Osiecki, Lisa | Pakstis, Andrew J. | Pato, Michele T. | Pato, Carlos N. | Piacentini, John | Pittenger, Christopher | Pollak, Yehuda | Rauch, Scott L. | Renner, Tobias J. | Reus, Victor I. | Richter, Margaret A. | Riddle, Mark A. | Robertson, Mary M. | Romero, Roxana | Rosàrio, Maria C. | Rosenberg, David | Rouleau, Guy A. | Ruhrmann, Stephan | Ruiz-Linares, Andres | Sampaio, Aline S. | Samuels, Jack | Sandor, Paul | Sheppard, Brooke | Singer, Harvey S. | Smit, Jan H. | Stein, Dan J. | Strengman, E. | Tischfield, Jay A. 
| Valencia Duarte, Ana V. | Vallada, Homero | Van Nieuwerburgh, Filip | Veenstra-VanderWeele, Jeremy | Walitza, Susanne | Wang, Ying | Wendland, Jens R. | Westenberg, Herman G. M. | Shugart, Yin Yao | Miguel, Euripedes C. | McMahon, William | Wagner, Michael | Nicolini, Humberto | Posthuma, Danielle | Hanna, Gregory L. | Heutink, Peter | Denys, Damiaan | Arnold, Paul D. | Oostra, Ben A. | Nestadt, Gerald | Freimer, Nelson B. | Pauls, David L. | Wray, Naomi R. | Stewart, S. Evelyn | Mathews, Carol A. | Knowles, James A. | Cox, Nancy J. | Scharf, Jeremiah M.
PLoS Genetics  2013;9(10):e1003864.
The direct estimation of heritability from genome-wide common variant data as implemented in the program Genome-wide Complex Trait Analysis (GCTA) has provided a means to quantify heritability attributable to all interrogated variants. We have quantified the variance in liability to disease explained by all SNPs for two phenotypically-related neurobehavioral disorders, obsessive-compulsive disorder (OCD) and Tourette Syndrome (TS), using GCTA. Our analysis yielded a heritability point estimate of 0.58 (se = 0.09, p = 5.64e-12) for TS, and 0.37 (se = 0.07, p = 1.5e-07) for OCD. In addition, we conducted multiple genomic partitioning analyses to identify genomic elements that concentrate this heritability. We examined genomic architectures of TS and OCD by chromosome, MAF bin, and functional annotations. In addition, we assessed heritability for early onset and adult onset OCD. Among other notable results, we found that SNPs with a minor allele frequency of less than 5% accounted for 21% of the TS heritability and 0% of the OCD heritability. Additionally, we identified a significant contribution to TS and OCD heritability by variants significantly associated with gene expression in two regions of the brain (parietal cortex and cerebellum) for which we had available expression quantitative trait loci (eQTLs). Finally we analyzed the genetic correlation between TS and OCD, revealing a genetic correlation of 0.41 (se = 0.15, p = 0.002). These results are very close to previous heritability estimates for TS and OCD based on twin and family studies, suggesting that very little, if any, heritability is truly missing (i.e., unassayed) from TS and OCD GWAS studies of common variation. The results also indicate that there is some genetic overlap between these two phenotypically-related neuropsychiatric disorders, but suggest that the two disorders have distinct genetic architectures.
Author Summary
Family and twin studies have shown that genetic risk factors are important in the development of Tourette Syndrome (TS) and obsessive compulsive disorder (OCD). However, efforts to identify the individual genetic risk factors involved in these two neuropsychiatric disorders have been largely unsuccessful. One possible explanation for this is that many genetic variations scattered throughout the genome each contribute a small amount to the overall risk. For TS and OCD, the genetic architecture (characterized by the number, frequency, and distribution of genetic risk factors) is presently unknown. This study examined the genetic architecture of TS and OCD in a variety of ways. We found that rare genetic changes account for more genetic risk in TS than in OCD; certain chromosomes contribute to OCD risk more than others; and variants that influence the level of genes expressed in two regions of the brain can account for a significant amount of risk for both TS and OCD. Results from this study might help in determining where, and what kind of variants are individual risk factors for TS and OCD and where they might be located in the human genome.
doi:10.1371/journal.pgen.1003864
PMCID: PMC3812053  PMID: 24204291
24.  Somatic Cells Count and Its Genetic Association with Milk Yield in Dairy Cattle Raised under Thai Tropical Environmental Conditions 
Somatic cells count (SCC), milk yield (MY) and pedigree information of 2,791 first lactation cows that calved between 1990 and 2010 on 259 Thai farms were used to estimate genetic parameters and trends for SCC and its genetic association with MY. The SCC were log-transformed (lnSCC) to make them normally distributed. An average information-restricted maximum likelihood procedure was used to estimate variance components. A bivariate animal model that considered herd-yr-season, calving age, and regression additive genetic group as fixed effects, and animal and residual as random effects was used for genetic evaluation. Heritability estimates were 0.12 (SE = 0.19) for lnSCC, and 0.31 (SE = 0.06) for MY. The genetic correlation estimate between lnSCC and MY was 0.26 (SE = 0.59). Mean yearly estimated breeding values during the last 20 years increased for SCC (49.02 cells/ml/yr, SE = 26.81 cells/ml/yr; p = 0.08), but not for MY (0.37 kg/yr, SE = 0.87 kg/yr; p = 0.68). Sire average breeding values for SCC and MY were higher than those of cows and dams (p<0.01). Heritability estimates for lnSCC and MY and their low but positive genetic correlation suggested that selection for low SCC may be feasible in this population as it is in other populations of dairy cows. Thus, selection for high MY and low SCC should be encouraged in Thai dairy improvement programs to increase profitability by improving both cow health and milk yield.
doi:10.5713/ajas.2012.12159
PMCID: PMC4092935  PMID: 25049683
Dairy Cattle; Milk Yield; Selection; Somatic Cell Count; Tropics
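For readers of entry 24, the heritability and genetic correlation estimates are functions of the estimated variance components. The relationship below uses standard quantitative-genetics notation; the symbols are assumptions for illustration, since the abstract reports only the resulting ratios and not the components themselves. Here \sigma_a^2 is the additive genetic (animal) variance, \sigma_e^2 the residual variance, and \sigma_{a,12} the additive genetic covariance between the two traits (lnSCC and MY).

```latex
% Heritability of a single trait under an animal model with
% animal (additive genetic) and residual random effects
h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}

% Genetic correlation between trait 1 (lnSCC) and trait 2 (MY)
r_g = \frac{\sigma_{a,12}}{\sqrt{\sigma_{a,1}^2 \, \sigma_{a,2}^2}}
```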
25.  Ordinary kriging approach to predicting long-term particulate matter concentrations in seven major Korean cities 
Objectives
Cohort studies of associations between air pollution and health have used exposure prediction approaches to estimate individual-level concentrations. A common prediction method used in Korean cohort studies is ordinary kriging. In this study, performance of ordinary kriging models for long-term particulate matter less than or equal to 10 μm in diameter (PM10) concentrations in seven major Korean cities was investigated with a focus on spatial prediction ability.
Methods
We obtained hourly PM10 data for 2010 at 226 urban-ambient monitoring sites in South Korea and computed annual average PM10 concentrations at each site. Given the annual averages, we developed ordinary kriging prediction models for each of the seven major cities and for the entire country by using an exponential covariance reference model and a maximum likelihood estimation method. For model evaluation, cross-validation was performed and mean square error and R-squared (R2) statistics were computed.
Results
Mean annual average PM10 concentrations in the seven major cities ranged between 45.5 and 66.0 μg/m3 (standard deviation=2.40 and 9.51 μg/m3, respectively). Cross-validated R2 values in Seoul and Busan were 0.31 and 0.23, respectively, whereas the other five cities had R2 values of zero. The national model produced a higher cross-validated R2 (0.36) than any of the city-specific models.
Conclusions
In general, the ordinary kriging models performed poorly for the seven major cities and for the entire country of South Korea, although the national model performed better than the city-specific models. To improve model performance, future studies should examine different prediction approaches that incorporate PM10 source characteristics.
doi:10.5620/eht.e2014012
PMCID: PMC4178540  PMID: 25262773
Exposure prediction; Health effect; Kriging; Long-term exposure; Particulate matter
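The evaluation described in entry 25 (ordinary kriging with an exponential covariance model, scored by cross-validated mean square error and R2) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the covariance parameters `sill`, `range_`, and `nugget` are passed in as fixed values rather than estimated by maximum likelihood, leave-one-out is assumed as the cross-validation scheme, and the MSE-based R2 truncated at zero is an assumption about how R2 was defined. All function names are hypothetical.

```python
import numpy as np

def exp_cov(d, sill, range_, nugget):
    """Exponential covariance sill * exp(-d / range_), with a nugget at zero distance."""
    return sill * np.exp(-d / range_) + nugget * (d == 0)

def ordinary_kriging_predict(xy_obs, z_obs, xy_new, sill, range_, nugget):
    """Predict the value at one new location from observed sites."""
    n = len(z_obs)
    d_obs = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=2)
    d_new = np.linalg.norm(xy_obs - xy_new, axis=1)
    # Kriging system augmented with a Lagrange multiplier so that the
    # weights sum to one (the unknown-constant-mean assumption).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_cov(d_obs, sill, range_, nugget)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = sill * np.exp(-d_new / range_)
    w = np.linalg.solve(A, b)
    return float(w[:n] @ z_obs)

def loo_cv(xy, z, sill, range_, nugget):
    """Leave-one-out cross-validation; returns (MSE, MSE-based R2 floored at 0)."""
    idx = np.arange(len(z))
    preds = np.empty(len(z))
    for i in idx:
        keep = idx != i
        preds[i] = ordinary_kriging_predict(
            xy[keep], z[keep], xy[i], sill, range_, nugget
        )
    mse = float(np.mean((z - preds) ** 2))
    r2 = max(0.0, 1.0 - mse / float(np.var(z)))
    return mse, r2

if __name__ == "__main__":
    # Synthetic sites and values, for illustration only (not data from the study).
    rng = np.random.default_rng(0)
    xy = rng.uniform(0, 10, size=(30, 2))
    z = 50 + 2 * xy[:, 0] + rng.normal(0, 0.5, size=30)
    print(loo_cv(xy, z, sill=30.0, range_=5.0, nugget=0.25))
```

Under this definition, an R2 of zero means the kriging predictions were no better than predicting every site with the overall mean, which is one way to read the zero values reported for five of the cities.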
