The limitations of genome-wide association (GWA) studies that focus on the phenotypic influence of common genetic variants have motivated human geneticists to consider the contribution of rare variants to phenotypic expression. The increasing availability of high-throughput sequencing technology has enabled studies of rare variants, but will not be sufficient for their success since appropriate analytical methods are also needed. We consider data analysis approaches to testing associations between a phenotype and collections of rare variants in a defined genomic region or set of regions. Ultimately, although a wide variety of analytical approaches exist, more work is needed to refine them and determine their properties and power in different contexts.
Comparing diversities between groups is a task biologists are frequently faced with, for example in ecological field trials or when dealing with metagenomics data. However, researchers often waver about which measure of diversity to choose since there is a multitude of approaches available. As Jost (2008) has pointed out, widely used measures such as the Shannon or Simpson index have undesirable properties which make them hard to compare and interpret. Many of the problems associated with the use of these “raw” indices can be corrected by transforming them into “true” diversity measures. We introduce a technique that allows the comparison of two or more groups of observations and simultaneously tests a user-defined selection of a number of “true” diversity measures. This procedure yields multiplicity-adjusted p-values according to the method of Westfall & Young (1993), which ensures that the rate of false-positives (type I error) does not rise when the number of groups and/or diversity indices is extended. Software is available in the R package “simboot”.
metagenomics; Simpson index; Shannon entropy; bootstrap; multiple contrasts; Westfall-Young
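The conversion from "raw" indices to "true" diversities described in the abstract above can be illustrated with a minimal sketch. It uses the standard Hill-number formula of order q; the function names are hypothetical and this is not the implementation used in the "simboot" package.

```python
import math

def proportions(counts):
    """Relative abundances of the species that were actually observed."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]

def shannon_entropy(counts):
    """Raw Shannon index H = -sum(p * ln p)."""
    return -sum(p * math.log(p) for p in proportions(counts))

def gini_simpson(counts):
    """Raw Gini-Simpson index 1 - sum(p^2)."""
    return 1.0 - sum(p * p for p in proportions(counts))

def true_diversity(counts, q):
    """Hill number of order q: the effective number of equally common species."""
    p = proportions(counts)
    if abs(q - 1.0) < 1e-12:
        # The limit q -> 1 is the exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))
```

For a community of four equally abundant species, the true diversity is 4 for every order q, whereas the raw Shannon and Gini-Simpson values (ln 4 and 0.75) are on different, non-linear scales. That difference in scale is what makes the raw indices hard to compare directly.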
Understanding the risk for type 2 diabetes (T2D) early in the life course is important for prevention. Whether genetic information improves prediction models for diabetes from adolescence into adulthood is unknown.
With the use of data from 1030 participants in the Bogalusa Heart Study aged 12 to 18 followed into middle adulthood, we built Cox models for incident T2D with risk factors assessed in adolescence (demographics, family history, physical examination, and routine biomarkers). Models with and without a 38 single-nucleotide polymorphism diabetes genotype score were compared by C statistics and continuous net reclassification improvement indices.
Participant mean (± SD) age at baseline was 14.4 ± 1.6 years, and 32% were black. Ninety (8.7%) participants developed T2D over a mean 26.9 ± 5.0 years of follow-up. Genotype score significantly predicted T2D in all models. Hazard ratios ranged from 1.09 per risk allele (95% confidence interval 1.03–1.15) in the basic demographic model to 1.06 (95% confidence interval 1.00–1.13) in the full model. The addition of genotype score did not improve the discrimination of the full clinical model (C statistic 0.756 without and 0.760 with genotype score). In the full model, genotype score yielded only a weak improvement in reclassification (net reclassification improvement index 0.261).
Although a genotype score assessed among white and black adolescents is significantly associated with T2D in adulthood, it does not improve prediction over clinical risk factors. Genetic screening for T2D in its current state is not a useful addition to adolescents’ clinical care.
genetic predisposition to disease; diabetes mellitus, type 2; adolescent medicine
Natural epigenetic variation provides a source for the generation of phenotypic diversity, but to understand its contribution to phenotypic diversity, its interaction with genetic variation requires further investigation. Here, we report population-wide DNA sequencing of genomes, transcriptomes, and methylomes of wild Arabidopsis thaliana accessions. Single cytosine methylation polymorphisms are unlinked to genotype. However, the rate of linkage disequilibrium decay amongst differentially methylated regions targeted by RNA-directed DNA methylation is similar to the rate for single nucleotide polymorphisms. Association analyses of these RNA-directed DNA methylation regions with genetic variants identified thousands of methylQTL, which revealed the first population estimate of genetically dependent methylation variation. Analysis of invariably methylated transposons and genes across this population indicates that loci targeted by RNA-directed DNA methylation are epigenetically activated in pollen and seeds, which facilitates proper development of these structures.
The use of direct-to-consumer genomewide profiling to assess disease risk is controversial, and little is known about the effect of this technology on consumers. We examined the psychological, behavioral, and clinical effects of risk scanning with the Navigenics Health Compass, a commercially available test of uncertain clinical validity and utility.
We recruited subjects from health and technology companies who elected to purchase the Health Compass at a discounted rate. Subjects reported any changes in symptoms of anxiety, intake of dietary fat, and exercise behavior at a mean (±SD) of 5.6±2.4 months after testing, as compared with baseline, along with any test-related distress and the use of health-screening tests.
From a cohort of 3639 enrolled subjects, 2037 completed follow-up. Primary analyses showed no significant differences between baseline and follow-up in anxiety symptoms (P = 0.80), dietary fat intake (P = 0.89), or exercise behavior (P = 0.61). Secondary analyses revealed that test-related distress was positively correlated with the average estimated lifetime risk among all the assessed conditions (β = 0.117, P<0.001). However, 90.3% of subjects who completed follow-up had scores indicating no test-related distress. There was no significant increase in the rate of use of screening tests associated with genomewide profiling, most of which are not considered appropriate for screening asymptomatic persons in any case.
In a selected sample of subjects who completed follow-up after undergoing consumer genomewide testing, such testing did not result in any measurable short-term changes in psychological health, diet or exercise behavior, or use of screening tests. Potential effects of this type of genetic testing on the population at large are not known. (Funded by the National Institutes of Health and Scripps Health.)
The determination of the ancestry and genetic backgrounds of the subjects in genetic and general epidemiology studies is a crucial component in the analysis of relevant outcomes or associations. Although there are many methods for differentiating ancestral subgroups among individuals based on genetic markers, only a few of these methods provide actual estimates of the fraction of an individual’s genome that is likely to be associated with different ancestral populations. We propose a method for assigning ancestry that works in stages to refine estimates of ancestral population contributions to individual genomes. The method leverages genotype data in the public domain obtained from individuals with known ancestries. Although we showcase the method in the assessment of ancestral genome proportions leveraging largely continental populations, the strategy can be used for assessing within-continent or more subtle ancestral origins with the appropriate data.
genetic ancestry; admixture; population genetics; admixture proportions
The ongoing controversy surrounding direct-to-consumer (DTC) personal genomic tests intensified last year when the U.S. Government Accountability Office (GAO) released results of an undercover investigation of four companies that offer such testing. Among their findings, they reported that some of their donors received DNA-based predictions that conflicted with their actual medical histories. We aimed to more rigorously evaluate the relationship between DTC genomic risk estimates and self-reported disease by leveraging data from the Scripps Genomic Health Initiative (SGHI). We prospectively collected self-reported personal and family health history data for 3,416 individuals who went on to purchase a commercially available DTC genomic test. For 5 out of 15 total conditions studied, we found that risk estimates from the test were significantly associated with self-reported family and/or personal health history. The 5 conditions were Graves’ disease, Type 2 Diabetes, Lupus, Alzheimer’s disease, and Restless Leg Syndrome. To further investigate these findings, we ranked each of the 15 conditions based on published heritability estimates and conducted post-hoc power analyses based on the number of individuals in our sample who reported significant histories of each condition. We found that high heritability, coupled with high prevalence in our sample and thus adequate statistical power, explained the pattern of associations observed. Our study represents one of the first evaluations of the relationship between risk estimates from a commercially available DTC personal genomic test and self-reported health histories in the consumers of that test.
direct-to-consumer; genetic testing; genetic risk estimates; clinical validity; consumer genomics
Contemporary sequencing studies often ignore the diploid nature of the human genome because they do not routinely separate or ‘phase’ maternally and paternally derived sequence information. However, many findings — both from recent studies and in the more established medical genetics literature — indicate that relationships between human DNA sequence and phenotype, including disease, can be more fully understood with phase information. Thus, the existing technological impediments to obtaining phase information must be overcome if human genomics is to reach its full potential.
Multivariate distance matrix regression (MDMR) analysis is a statistical technique that allows researchers to relate P variables to an additional M factors collected on N individuals, where P ≫ N. The technique can be applied to a number of research settings involving high-dimensional data types such as DNA sequence data, gene expression microarray data, and imaging data. MDMR analysis involves computing the distance between all pairs of individuals with respect to P variables of interest and constructing an N × N matrix whose elements reflect these distances. Permutation tests can be used to test linear hypotheses that consider whether or not the M additional factors collected on the individuals can explain variation in the observed distances between and among the N individuals as reflected in the matrix. Despite its appeal and utility, properties of the statistics used in MDMR analysis have not been explored in detail. In this paper we consider the level accuracy and power of MDMR analysis assuming different distance measures and analysis settings. We also describe the utility of MDMR analysis in assessing hypotheses about the appropriate number of clusters arising from a cluster analysis.
regression analysis; multivariate analysis; distance matrix; simulation
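The MDMR procedure described in the abstract above can be sketched for the simplest case of a single numeric factor (M = 1) plus an intercept. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and the closed-form hat matrix used here only applies to this one-factor design. The pseudo-F statistic is computed from the Gower-centered distance matrix G as [tr(HG)/M] / [(tr(G) − tr(HG))/(N − M − 1)], and its significance is assessed by permuting the factor.

```python
import random

def gower_center(dist):
    """Return G = -1/2 * C * D^2 * C, where C = I - 11'/n double-centers the matrix."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    col_mean = [sum(a[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_mean) / n
    return [[a[i][j] - row_mean[i] - col_mean[j] + grand for j in range(n)]
            for i in range(n)]

def pseudo_f(g, x):
    """Pseudo-F for one numeric factor x, given the centered matrix g."""
    n = len(x)
    xbar = sum(x) / n
    xc = [v - xbar for v in x]
    sxx = sum(v * v for v in xc)
    # With an intercept and one factor, H = 11'/n + xc xc'/sxx. Because g is
    # double-centered, the intercept part contributes nothing to tr(HG).
    tr_hg = sum(xc[i] * g[i][j] * xc[j] for i in range(n) for j in range(n)) / sxx
    tr_g = sum(g[i][i] for i in range(n))
    m = 1  # number of factors tested
    return (tr_hg / m) / ((tr_g - tr_hg) / (n - m - 1))

def mdmr_permutation_test(dist, x, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value) by shuffling x."""
    rng = random.Random(seed)
    g = gower_center(dist)
    f_obs = pseudo_f(g, x)
    perm = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(perm)
        if pseudo_f(g, perm) >= f_obs:
            hits += 1
    # Count the observed arrangement as one of the permutations.
    return f_obs, (hits + 1) / (n_perm + 1)
```

When the distance matrix holds Euclidean distances on a single variable, this pseudo-F reduces to the ordinary regression F statistic, which gives a convenient sanity check for any implementation.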
In this review we address the subject of dental caries pathogenicity from a genomic and metagenomic perspective. The application of genomic technologies is certain to yield novel insights into the relationship between the bacterial flora, dental health and disease. Three primary attributes of bacterial species are thought to have a direct impact on caries development: adherence to tooth surfaces (biofilm formation), acid production and acid tolerance. The specific aetiological agents of dental caries have proven elusive, supporting the notion that caries aetiology is perhaps complex and multi-faceted. The recently introduced Human Microbiome Project (HMP), which endeavors to characterise the micro-organisms living in and on the human body, is likely to shed new light on these questions, improve our understanding of polymicrobial disease and microbial ecology in the oral cavity, and provide new avenues for therapeutic and molecular diagnostics developments.
Caries; biofilm; bacterial species; genomic; metagenomic; Human Microbiome Project
Integration of clinical evaluations and whole-genome sequence data from eight individuals in a recent study demonstrates that genetic and clinical information can be combined and applied to preventive medicine. Statistical and graphical tools were developed to assess and visualize the genetic risk of common chronic conditions and to show the changes in disease risk that result from monitoring clinical symptoms over time. This approach provides a direction to consider in the adoption of genetic information in health care, but, like all provocative scientific articles, it raises as many questions as it answers.
Please see related Research: http://genomemedicine.com/content/5/6/58
Inhibition of the P50 evoked electroencephalographic response to the second of paired auditory stimuli has been frequently examined as a neurophysiological deficit in schizophrenia. The National Institute of Mental Health Consortium on the Genetics of Schizophrenia (COGS) examined this endophenotype in a 7 center multi-site study. Recordings were analyzed from 181 probands with schizophrenia, 429 of their first degree relatives, and 333 community comparison control subjects. Most probands were being treated with second generation neuroleptic medications. Highly significant differences in P50 inhibition, measured as either the ratio of amplitudes or their difference in response to the two stimuli, were found between the probands and the community comparison sample. There were no differences between the COGS sites for these findings. For the ratio parameter, an admixture analysis indicated that nearly 40% of the relatives demonstrated deficiencies in P50 inhibition that are comparable to the deficit found in the probands. These results indicate that P50 auditory evoked potentials can be recorded across multiple sites and reliably demonstrate a physiological abnormality in schizophrenia. The appearance of the physiological abnormality in a substantial proportion of clinically unaffected first degree relatives is consistent with the hypothesis that deficits in cerebral inhibition may be a familial neurobiological risk factor for the illness.
Schizophrenia; Evoked potentials auditory; Inhibition; Genetics
In clinical medicine, lipids are commonly measured biomarkers used to assess an individual’s risk for cardiovascular disease, heart attack, and stroke. Accurately predicting longitudinal lipid levels based on genomic information can inform therapeutic practices and decrease cardiovascular risk by identifying high-risk patients prior to onset. Using genotyped and imputed genetic data from 523 unrelated Caucasian Americans from the Bogalusa Heart Study, surveyed on 4,026 occasions from 4 to 48 years of age, we generated various lipid genomic risk models based on previously reported markers. We observed a significant improvement in prediction over non-genetic risk models in high density lipoprotein cholesterol (increase in the squared correlation between observed and predicted values, ΔR² = 0.032), low density lipoprotein cholesterol (ΔR² = 0.053), total cholesterol (ΔR² = 0.043), and triglycerides (ΔR² = 0.031). Many of our approaches are based on n-fold cross-validation procedures that are, by design, adaptable to a clinical environment.
lipids; polygenic model; prediction; cardiovascular diseases; statistical methods
The insulin/IGF1 signaling pathways affect lifespan in several model organisms, including worms, flies and mice. To investigate whether common genetic variation in this pathway influences lifespan in humans, we genotyped 291 common variants in 30 genes encoding proteins in the insulin/IGF1 signaling pathway in a cohort of elderly Caucasian women selected from the Study of Osteoporotic Fractures (SOF), including 293 long-lived cases (lifespan ≥ 92 years (y), mean ± standard deviation (SD) = 95.3 ± 2.2y) and 603 average-lifespan controls (lifespan ≤ 79y, mean=75.7 ± 2.6y). Variants were selected for genotyping using a haplotype tagging approach. We found a modest excess of variants nominally associated with longevity. We then replicated nominally significant variants in two additional Caucasian cohorts containing both males and females: the Cardiovascular Health Study (CHS) and Ashkenazi Jewish Centenarians (AJC). An intronic single nucleotide polymorphism (SNP) in AKT1, rs3803304, was significantly associated with lifespan in a meta-analysis across the three cohorts (odds ratio (OR)=0.78 (95% confidence interval (CI)=0.68-0.89), adjusted p=0.043); two intronic SNPs in FOXO3A demonstrated a significant lifespan association among women only (rs1935949, OR=1.35, 95% CI=1.15-1.57, adjusted p=0.0093). Conclusion: common variants in several insulin/IGF1 pathway genes are associated with human lifespan.
IGF1; longevity; gene; SNP; AKT1; FOXO3A
Dental decay is one of the most prevalent chronic diseases worldwide. A variety of factors, including microbial, genetic, immunological, behavioral and environmental, interact to contribute to dental caries onset and development. Previous studies focused on the microbial basis for dental caries have identified species associated with both dental health and disease. The purpose of the current study was to improve our knowledge of the microbial species involved in dental caries and health by performing a comprehensive 16S rDNA profiling of the dental plaque microbiome of both caries-free and caries-active subjects. Analysis of over 50,000 nearly full-length 16S rDNA clones allowed the identification of 1,372 operational taxonomic units (OTUs) in the dental plaque microbiome. Approximately half of the OTUs were common to both caries-free and caries-active microbiomes and present at similar abundance. The majority of differences in OTUs reflected very low abundance phylotypes. This survey allowed us to define the population structure of the dental plaque microbiome and to identify the microbial signatures associated with dental health and disease. The deep profiling of dental plaque allowed the identification of 87 phylotypes that are over-represented in either caries-free or caries-active subjects. Among these signatures, those associated with dental health outnumbered those associated with dental caries by nearly two-fold. A comparison of these data to other published studies indicates significant heterogeneity in study outcomes and suggests that novel approaches may be required to further define the signatures of dental caries onset and progression.
Acute myocardial infarction (MI), which involves the rupture of existing atheromatous plaque, remains highly unpredictable despite recent advances in the diagnosis and treatment of coronary artery disease. Accordingly, a biomarker that can predict an impending MI is desperately needed. Here, we characterize circulating endothelial cells (CECs) using the first automated and clinically feasible CEC 3-channel fluorescence microscopy assay in 50 consecutive patients with ST-elevation myocardial infarction (STEMI) and 44 consecutive healthy controls. CEC counts were significantly elevated in MI cases versus controls with median numbers of 19 and 4 cells/ml respectively (p = 1.1 × 10⁻¹⁰). A receiver-operating characteristic (ROC) curve analysis demonstrated an area under the ROC curve of 0.95, suggesting near dichotomization of MI cases versus controls. We observed no correlation between CECs and typical markers of myocardial necrosis (ρ=0.02, CK-MB; ρ=−0.03, troponin). Morphologic analysis of the microscopy images of CECs revealed a 2.5-fold increase (P<0.0001) in cellular area and 2-fold increase (P<0.0001) in nuclear area of MI CECs versus healthy control, age-matched CECs, as well as CECs obtained from patients with preexisting peripheral vascular disease. The distribution of CEC images containing from 2 up to 10 nuclei demonstrates that MI patients are the only group to contain more than 3 nuclei/image, indicating that multi-cellular and multi-nuclear clusters are specific for acute MI. These data indicate that CECs may serve as promising biomarkers for the prediction of atherosclerotic plaque rupture events.
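The area under the ROC curve reported in the abstract above has a direct interpretation: it is the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. A minimal sketch of that calculation follows; the function name and the example counts are hypothetical and are not data from the study.

```python
def roc_auc(cases, controls):
    """AUC as the fraction of case-control pairs in which the case scores higher.

    Ties contribute one half, which makes this equal to the Mann-Whitney U
    statistic divided by the number of pairs.
    """
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Hypothetical CEC counts (cells/ml), for illustration only.
example_cases = [19, 25, 7, 30]
example_controls = [4, 2, 7, 5]
example_auc = roc_auc(example_cases, example_controls)
```

An AUC close to 1 means that almost every case outranks almost every control, which is the sense in which a value of 0.95 suggests near dichotomization of the two groups.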
A number of recent genome-wide association (GWA) studies have identified unequivocal statistical associations between inherited genetic variations, mostly single nucleotide polymorphisms (SNPs), and common complex diseases such as diabetes, cardiovascular disease, and obesity. Genotyping individuals for these variations has the potential to help redefine how pharmacologic agents undergo clinical development. By identifying carriers of known genomic variants that contribute to susceptibility, a high risk population can be defined as well as individuals with potential for a better response to a drug. We evaluated the potential utility that selecting individuals for a trial on the basis of genotype identified in contemporary GWA studies would have had on recently described clinical trials. We pursued this by constraining both the risks of a disease outcome associated with particular genotypes and overall drug responses to those actually observed in genetic association and clinical trial studies, respectively. We pursued these evaluations in the context of clinical trials investigating drugs for macular degeneration, obesity, heart disease, type II diabetes, prostate cancer and Alzheimer’s disease. We show that the increase in incidence of outcomes in trials restricted to individuals with specific genotypic profiles can result in substantial reductions in requisite sample sizes for such trials. In addition, we derive realistic bounds for sample sizes for clinical trials investigating pharmacogenetic effects that leverage genetic variations identified in recent association studies.
Polymorphism; Translational medicine; Drug validation; DNA sequencing; Study Design
Hyperhomocysteinemia is associated with increased venous thrombosis and cardiovascular disease (CVD). Mutations in the human methylenetetrahydrofolate reductase (MTHFR) gene have been associated with increased homocysteine levels and risks of CVD in various populations including those with kidney disease. Here, we evaluated the influence of MTHFR variants on progressive loss of kidney function.
We analyzed 821 subjects with hypertensive nephrosclerosis from the longitudinal National Institute of Diabetes and Digestive and Kidney Diseases African-American Study of Kidney Disease and Hypertension (AASK) Trial to determine whether decline in glomerular filtration rate (GFR) over ∼4.2 years was predicted by common genetic variation within MTHFR at non-synonymous positions C677T (Ala222Val) and A1298C (Glu429Ala) or by MTHFR haplotypes. The effect on GFR decline was then supported by a study of 1333 subjects from the San Diego Veterans Affairs Hypertension Cohort (VAHC), followed over ∼4.5 years. Linear effect models were utilized to determine both genotype [single-nucleotide polymorphism (SNP)] and genotype (SNP)-by-time interactions.
In AASK, the polymorphism at A1298C predicted the rate of GFR decline: A1298/A1298 major allele homozygosity resulted in a less pronounced decline of GFR, with a significant SNP-by-time interaction. An independent follow-up study in the San Diego VAHC subjects supports that A1298/A1298 homozygotes have the greatest estimated GFR throughout the study. Haplotype analysis with C677T yielded concurring results.
We conclude that the MTHFR-coding polymorphism at A1298C is associated with renal decline in African-Americans with hypertensive nephrosclerosis and is supported by a veteran cohort with a primary care diagnosis of hypertension. Further investigation is needed to confirm such findings and to determine what molecular mechanism may contribute to this association.
AASK; glomerular filtration rate; hypertension; kidney disease; MTHFR
Recent studies investigating the genetic determinants of cancer suggest that some of the genetic alterations contributing to tumorigenesis may be inherited, but the vast majority are somatically acquired during the transition of a normal cell to a cancer cell. A systematic understanding of the genetic and molecular determinants of cancers has already begun to have a transformative effect on the study and treatment of cancer, particularly through the identification of a range of genetic alterations in protein kinase genes, which are highly associated with the disease. Since kinases are prominent therapeutic targets for intervention within the cancer cell, studying the impact that genomic alterations within them have on cancer initiation, progression, and treatment is both logical and timely. In fact, recent sequencing and resequencing (i.e., polymorphism identification) efforts have catalyzed the quest for protein kinase ‘driver’ mutations (i.e., those genetic alterations which contribute to the transformation of a normal cell to a proliferating cancerous cell) in distinction to kinase ‘passenger’ mutations, which merely build up in the course of normal and unchecked (i.e., cancerous) somatic cell replication and proliferation. In this review, we discuss the recent progress in the discovery and functional characterization of protein kinase cancer driver mutations and the implications of this progress for understanding tumorigenesis as well as the design of ‘personalized’ cancer therapeutics that target an individual’s unique mutational profile.
There has been growing debate over the nature of the genetic contribution to individual susceptibility to common complex diseases such as diabetes, osteoporosis, and cancer. The ‘Common Disease, Common Variant (CDCV)’ hypothesis argues that genetic variations with appreciable frequency in the population at large, but relatively low ‘penetrance’ (or the probability that a carrier of the relevant variants will express the disease), are the major contributors to genetic susceptibility to common diseases. The ‘Common Disease, Rare Variant (CDRV)’ hypothesis, on the other hand, argues that multiple rare DNA sequence variations, each with relatively high penetrance, are the major contributors to genetic susceptibility to common diseases. Both hypotheses have their place in current research efforts.
There have been a number of recent successes in the use of whole genome sequencing and sophisticated bioinformatics techniques to identify pathogenic DNA sequence variants responsible for individual idiopathic congenital conditions. However, the success of this identification process is heavily influenced by the ancestry or genetic background of a patient with an idiopathic condition. This is so because potential pathogenic variants in a patient’s genome must be contrasted with variants in a reference set of genomes made up of other individuals’ genomes of the same ancestry as the patient. We explored the effect of ignoring the ancestries of both an individual patient and the individuals used to construct reference genomes. We pursued this exploration in two major steps. We first considered variation in the per-genome number and rates of likely functional derived (i.e., non-ancestral, based on the chimp genome) single nucleotide variants and small indels in 52 individual whole human genomes sampled from 10 different global populations. We took advantage of a suite of computational and bioinformatics techniques to predict the functional effect of over 24 million genomic variants, both coding and non-coding, across these genomes. We found that the typical human genome harbors ∼5.5–6.1 million total derived variants, of which ∼12,000 are likely to have a functional effect (∼5000 coding and ∼7000 non-coding). We also found that the rates of functional genotypes per the total number of genotypes in individual whole genomes differ dramatically between human populations. We then created tables showing how the use of comparator or reference genome panels comprised of genomes from individuals that do not have the same ancestral background as a patient can negatively impact pathogenic variant identification. Our results have important implications for clinical sequencing initiatives.
clinical sequencing; congenital disease; whole genome sequencing; population genetics
Recent genome wide association studies (GWAS) have identified DNA sequence variations that exhibit unequivocal statistical associations with many common chronic diseases. However, the vast majority of these studies identified variations that explain only a very small fraction of disease burden in the population at large, suggesting that other factors, such as multiple rare or low-penetrance variations and interacting environmental factors, are major contributors to disease susceptibility. Identifying multiple low penetrance variations (or ‘polygenes’) contributing to disease susceptibility will be difficult. We present a pathway analysis approach to characterizing the likely polygenic basis of seven common diseases using the Wellcome Trust Case Control Consortium (WTCCC) GWAS results. We identify numerous pathways implicated in disease predisposition that would have not been revealed using standard single-locus GWAS statistical analysis criteria. Many of these pathways have long been assumed to contain polymorphic genes that lead to disease predisposition. Additionally, we analyze the genetic relationships between the seven diseases, and based upon similarities with respect to the associated genes and pathways affected in each, propose a new way of categorizing the diseases.
Pathway; genome-wide; disease; common; diabetes; Crohn’s; coronary; bipolar; arthritis; hypertension
Over the past 18 months, there have been notable developments in the direct-to-consumer (DTC) genomic testing arena, in particular with regard to issues surrounding governmental regulation in the USA. While commentaries continue to proliferate on this topic, actual empirical research remains relatively scant. In terms of DTC genomic testing for disease susceptibility, most of the research has centered on uptake, perceptions and attitudes toward testing among health care professionals and consumers. Only a few available studies have examined actual behavioral response among consumers, and we are not aware of any studies that have examined response to DTC genetic testing for ancestry or for drug response. We propose that further research in this area is desperately needed, despite challenges in designing appropriate studies given the rapid pace at which the field is evolving. Ultimately, DTC genomic testing for common markers and conditions is only a precursor to the eventual cost-effectiveness and wide availability of whole genome sequencing of individuals, although it remains unclear whether DTC genomic information will still be attainable. Either way, however, current knowledge needs to be extended and enhanced with respect to the delivery, impact and use of increasingly accurate and comprehensive individualized genomic data.
Human skull and brain morphology are strongly influenced by genetic factors, and skull size and shape vary worldwide. However, the relationship between specific brain morphology and genetically-determined ancestry is largely unknown.
We used two independent data sets to characterize variation in skull and brain morphology among individuals of European ancestry. The first data set is a historical sample of 1,170 male skulls with 37 shape measurements drawn from 27 European populations. The second data set includes 626 North American individuals of European ancestry participating in the Alzheimer's Disease Neuroimaging Initiative (ADNI) with magnetic resonance imaging, height and weight, neurological diagnosis, and genome-wide single nucleotide polymorphism (SNP) data.
We found that both skull and brain morphological variation exhibit a population-genetic fingerprint among individuals of European ancestry. This fingerprint shows a Northwest to Southeast gradient, is independent of body size, and involves frontotemporal cortical regions.
Our findings are consistent with prior evidence for gene flow in Europe due to historical population movements and indicate that genetic background should be considered in studies seeking to identify genes involved in human cortical development and neuropsychiatric disease.
Biological anthropology; Cortex; Craniometry; Genetic drift; Imaging genomics; Neuroimaging; Population genetics