R1a-M420 is one of the most widely spread Y-chromosome haplogroups; however, its substructure within Europe and Asia has remained poorly characterized. Using a panel of 16 244 male subjects from 126 populations sampled across Eurasia, we identified 2923 R1a-M420 Y-chromosomes and analyzed them to a highly granular phylogeographic resolution. Whole Y-chromosome sequence analysis of eight R1a and five R1b individuals suggests a divergence time of ∼25 000 (95% CI: 21 300–29 000) years ago and a coalescence time within R1a-M417 of ∼5800 (95% CI: 4800–6800) years. The spatial frequency distributions of R1a sub-haplogroups conclusively indicate two major groups, one found primarily in Europe and the other confined to Central and South Asia. Beyond the major European versus Asian dichotomy, we describe several younger sub-haplogroups. Based on spatial distributions and diversity patterns within the R1a-M420 clade, particularly rare basal branches detected primarily within Iran and eastern Turkey, we conclude that the initial episodes of haplogroup R1a diversification likely occurred in the vicinity of present-day Iran.
The Slavic branch of the Balto-Slavic sub-family of Indo-European languages underwent rapid divergence as a result of the spatial expansion of its speakers from Central-East Europe, in early medieval times. This expansion–mainly to East Europe and the northern Balkans–resulted in the incorporation of genetic components from numerous autochthonous populations into the Slavic gene pools. Here, we characterize genetic variation in all extant ethnic groups speaking Balto-Slavic languages by analyzing mitochondrial DNA (n = 6,876), Y-chromosomes (n = 6,079) and genome-wide SNP profiles (n = 296), within the context of other European populations. We also reassess the phylogeny of Slavic languages within the Balto-Slavic branch of Indo-European. We find that genetic distances among Balto-Slavic populations, based on autosomal and Y-chromosomal loci, show a high correlation (0.9) both with each other and with geography, but a slightly lower correlation (0.7) with mitochondrial DNA and linguistic affiliation. The data suggest that genetic diversity of the present-day Slavs was predominantly shaped in situ, and we detect two different substrata: ‘central-east European’ for West and East Slavs, and ‘south-east European’ for South Slavs. A pattern of distribution of segments identical by descent between groups of East-West and South Slavs suggests shared ancestry or a modest gene flow between those two groups, which might derive from the historic spread of Slavic people.
Contemporary inhabitants of the Balkan Peninsula belong to several ethnic groups of diverse cultural background. In this study, three ethnic groups from Bosnia and Herzegovina - Bosniacs, Bosnian Croats and Bosnian Serbs - as well as the populations of Serbians, Croatians, Macedonians from the former Yugoslav Republic of Macedonia, Montenegrins and Kosovars have been characterized for the genetic variation of 660 000 genome-wide autosomal single nucleotide polymorphisms and for haploid markers. New autosomal data of the 70 individuals together with previously published data of 20 individuals from the populations of the Western Balkan region in a context of 695 samples of global range have been analysed. Comparison of the variation data of autosomal and haploid lineages of the studied Western Balkan populations reveals a concordance of the data in both sets and the genetic uniformity of the studied populations, especially of Western South-Slavic speakers. The genetic variation of Western Balkan populations reveals the continuity between the Middle East and Europe via the Balkan region and supports the scenario that one of the major routes of ancient gene flows and admixture went through the Balkan Peninsula.
Fine structural details of glycans attached to the conserved N-glycosylation site not only significantly affect the function of individual immunoglobulin G (IgG) molecules but also mediate inflammation at the systemic level. By analyzing IgG glycosylation in 5,117 individuals from four European populations, we have revealed very complex patterns of changes in IgG glycosylation with age. Several IgG glycans (including FA2B, FA2G2, and FA2BG2) changed considerably with age and the combination of these three glycans can explain up to 58% of variance in chronological age, significantly more than other markers of biological age like telomere lengths. The remaining variance in these glycans strongly correlated with physiological parameters associated with biological age. Thus, IgG glycosylation appears to be closely linked with both chronological and biological ages. Considering the important role of IgG glycans in inflammation, and because the observed changes with age promote inflammation, changes in IgG glycosylation also seem to represent a factor contributing to aging.
Glycosylation is the key posttranslational mechanism that regulates function of immunoglobulins, with multiple systemic repercussions to the immune system. Our study of IgG glycosylation in 5,117 individuals from four European populations has revealed very extensive and complex changes in IgG glycosylation with age. The combined index composed of only three glycans explained up to 58% of variance in age, considerably more than other biomarkers of age like telomere lengths. The remaining variance in these glycans strongly correlated with physiological parameters associated with biological age; thus, IgG glycosylation appears to be closely linked with both chronological and biological ages. The ability to measure human biological aging using molecular profiling has practical applications for diverse fields such as disease prevention and treatment, or forensics.
Aging; Glycome; Glycosylation; Immunoglobulin G; Inflammation.
Haplogroup G, together with J2 clades, has been associated with the spread of agriculture, especially in the European context. However, interpretations based on simple haplogroup frequency clines do not recognize underlying patterns of genetic diversification. Although progress has been recently made in resolving the haplogroup G phylogeny, a comprehensive survey of the geographic distribution patterns of the significant sub-clades of this haplogroup has not been conducted yet. Here we present the haplogroup frequency distribution and STR variation of 16 informative G sub-clades by evaluating 1472 haplogroup G chromosomes belonging to 98 populations ranging from Europe to Pakistan. Although no basal G-M201* chromosomes were detected in our data set, the homeland of this haplogroup has been estimated to be somewhere near eastern Anatolia, Armenia or western Iran, the only areas characterized by the co-presence of deep basal branches as well as the occurrence of high sub-haplogroup diversity. The P303 SNP defines the most frequent and widespread G sub-haplogroup. However, its sub-clades have a more localized distribution, with the U1-defined branch largely restricted to the Near/Middle East and the Caucasus, whereas L497 lineages essentially occur in Europe, where they likely originated. In contrast, the only U1 representative in Europe is the G-M527 lineage, whose distribution pattern is consistent with regions of Greek colonization. No clinal patterns were detected, suggesting that the distributions are rather indicative of isolation by distance and demographic complexities.
Y-chromosome; haplogroup G; human evolution; population genetics
It is a longstanding puzzle why non-coding variants in the complement factor H (CFH) gene are more strongly associated with age-related macular degeneration (AMD) than functional coding variants that directly influence the alternative complement pathway. The situation is complicated by tight genetic associations across the region, including the adjacent CFH-related genes CFHR3 and CFHR1, which may themselves influence the alternative complement pathway and are contained within a common deletion (CNP147) which is associated with protection against AMD. It is unclear whether this association is mediated through a protective effect of low plasma CFHR1 concentrations, high plasma CFH or both. We examined the triangular relationships of CFH/CFHR3/CFHR1 genotype, plasma CFH or CFHR1 concentrations and AMD susceptibility in combined case–control (1256 cases, 1020 controls) and cross-sectional population (n = 1004) studies and carried out genome-wide association studies of plasma CFH and CFHR1 concentrations. A non-coding CFH SNP (rs6677604) and the CNP147 deletion were strongly correlated both with each other and with plasma CFH and CFHR1 concentrations. The plasma CFH-raising rs6677604 allele and raised plasma CFH concentration were each associated with AMD protection. In contrast, the protective association of the CNP147 deletion with AMD was not mediated by low plasma CFHR1, since AMD-free controls showed increased plasma CFHR1 compared with cases, but it may be mediated by the association of CNP147 with raised plasma CFH concentration. The results are most consistent with a regulatory locus within a 32 kb region of the CFH gene, with a major effect on plasma CFH concentration and AMD susceptibility.
A genome-wide association study of serum uric acid levels was performed in a relatively isolated population of European descent from an island of the Adriatic coast of Croatia. The study sample included 532 unrelated and 768 related individuals from 235 pedigrees. Inflation due to relatedness was controlled by using genomic control. Genetic association was assessed with 2,241,249 SNPs in 1300 samples after adjusting for age and gender. Our study replicated four previously reported serum uric acid loci (SLC2A9, ABCG2, RREB1, and SLC22A12). The strongest association was found with a SNP in SLC2A9 (rs13129697, P = 2.33×10⁻¹⁹), which exhibited significant gender-specific effects, 35.76 μmol/L (P = 2.11×10⁻¹⁹) in females and 19.58 μmol/L (P = 5.40×10⁻⁵) in males. Within this region of high linkage disequilibrium, we also detected a strong association with a non-synonymous SNP, rs16890979 (P = 2.24×10⁻¹⁷), a putative causal variant for serum uric acid variation. In addition, we identified several novel loci suggestive of association with uric acid levels (SEMA5A, TMEM18, SLC28A2, and ODZ2), although the P-values (P < 5×10⁻⁶) did not reach the threshold of genome-wide significance. Together, these findings provide further confirmation of previously reported uric acid-related genetic variants and highlight suggestive new loci for additional investigation.
Serum uric acid; genome-wide association; Adriatic island population
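The genomic control step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common formulation in which the inflation factor λ is the median of the observed 1-df χ² statistics divided by the theoretical median, and every statistic is then divided by λ. The function name and the example statistics are hypothetical and are not taken from the study.

```python
from statistics import median

# Theoretical median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for a list of 1-df chi-square values.

    Assumption: lambda is floored at 1.0, so test statistics are only ever
    deflated, never inflated.
    """
    lam = max(median(chi2_stats) / CHI2_1DF_MEDIAN, 1.0)
    return lam, [s / lam for s in chi2_stats]


# Hypothetical association statistics, for illustration only.
example_stats = [0.1, 0.5, 0.91, 2.0, 30.0]
lam, corrected = genomic_control(example_stats)
print(f"lambda = {lam:.3f}")
```

Flooring λ at 1 is a conventional choice; whether the study applied it is not stated in the abstract.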
Genome-wide association studies (GWAS) have identified many common variants associated with complex traits in human populations. Thus far, most reported variants have relatively small effects and explain only a small proportion of phenotypic variance, leading to the issues of ‘missing’ heritability and its explanation. Using height as an example, we examined two possible sources of missing heritability: first, variants with smaller effects whose associations with height failed to reach genome-wide significance and second, allelic heterogeneity due to the effects of multiple variants at a single locus. Using a novel analytical approach we examined allelic heterogeneity of height-associated loci selected from SNPs of different significance levels based on the summary data of the GIANT (stage 1) studies. In a sample of 1,304 individuals collected from an island population of the Adriatic coast of Croatia, we assessed the extent of height variance explained by incorporating the effects of less significant height loci and multiple effective SNPs at the same loci. Our results indicate that approximately half of the 118 loci that achieved stringent genome-wide significance (p-value < 5×10⁻⁸) showed evidence of allelic heterogeneity. Additionally, including less significant loci (i.e., p-value < 5×10⁻⁴) and accounting for effects of allelic heterogeneity substantially improved the variance explained in height.
The Neolithic transition from hunting and gathering to farming and cattle breeding marks one of the most drastic cultural changes in European prehistory. Short stretches of ancient mitochondrial DNA (mtDNA) from skeletons of pre-Neolithic hunter-gatherers as well as early Neolithic farmers support the demic diffusion model where a migration of early farmers from the Near East and a replacement of pre-Neolithic hunter-gatherers are largely responsible for cultural innovation and changes in subsistence strategies during the Neolithic revolution in Europe. In order to test if a signal of population expansion is still present in modern European mitochondrial DNA, we analyzed a comprehensive dataset of 1,151 complete mtDNAs from present-day Europeans. Relying upon ancient DNA data from previous investigations, we identified mtDNA haplogroups that are typical for early farmers and hunter-gatherers, namely H and U respectively. Bayesian skyline coalescence estimates were then used on subsets of complete mtDNAs from modern populations to look for signals of past population expansions. Our analyses revealed a population expansion between 15,000 and 10,000 years before present (YBP) in mtDNAs typical for hunters and gatherers, with a decline between 10,000 and 5,000 YBP. These corresponded to an analogous population increase approximately 9,000 YBP for mtDNAs typical of early farmers. The observed changes over time suggest that the spread of agriculture in Europe involved the expansion of farming populations into Europe followed by the eventual assimilation of resident hunter-gatherers. Our data show that contemporary mtDNA datasets can be used to study ancient population history if only limited ancient genetic data is available.
Twenty-two single-nucleotide polymorphisms (SNPs) in 10 gene regions previously identified in obesity and type 2 diabetes (T2D) genome-wide association studies (GWAS) were evaluated for association with metabolic traits in a sample from an island population of European descent. We performed a population-based study using 18 anthropometric and biochemical traits considered as continuous variables in a sample of 843 unrelated subjects (360 men and 483 women) aged 18–80 years from the island of Hvar on the eastern Adriatic coast of Croatia. All eight GWAS SNPs in FTO were significantly associated with weight, body mass index, waist circumference and hip circumference; 20 of the 32 nominal P-values remained significant after permutation testing to correct for multiple comparisons. The strongest associations were found between the two TCF7L2 GWAS SNPs and fasting plasma glucose and HbA1c levels; all four P-values remained significant after permutation tests. Nominally significant associations were found between several SNPs and other metabolic traits; however, the significance did not hold after permutation tests. Although the sample size was modest, our study strongly replicated the association of FTO variants with obesity-related measures and TCF7L2 variants with T2D-related traits. The estimated effect sizes of these variants were larger than or comparable to those in published studies. This is likely attributable to the homogeneous genetic background of the relatively isolated study population.
genetic association; obesity; type 2 diabetes; FTO; TCF7L2; isolated population
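Permutation-based correction for multiple testing, as referred to in the abstract above, can be sketched as follows. The abstract does not specify the procedure, so this example assumes a max-statistic permutation scheme: the trait is shuffled, the largest absolute SNP-trait correlation across all SNPs is recorded per shuffle, and each observed statistic is compared against that null distribution. All names and data are hypothetical.

```python
import random


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def max_stat_permutation(genotypes, trait, n_perm=999, seed=0):
    """Family-wise adjusted p-values, one per SNP (max-statistic permutation)."""
    rng = random.Random(seed)
    observed = [abs(pearson_r(g, trait)) for g in genotypes]
    exceed = [0] * len(genotypes)
    shuffled = list(trait)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-trait link
        max_null = max(abs(pearson_r(g, shuffled)) for g in genotypes)
        for i, obs in enumerate(observed):
            if max_null >= obs:
                exceed[i] += 1
    # Add-one correction keeps the p-values strictly positive.
    return [(c + 1) / (n_perm + 1) for c in exceed]


# Hypothetical data: 12 subjects, 2 SNPs coded as 0/1/2 allele counts.
snp1 = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
snp2 = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
trait = [1.0, 1.2, 0.9, 1.1, 2.0, 2.1, 1.9, 2.2, 3.0, 3.1, 2.9, 3.2]
adjusted_p = max_stat_permutation([snp1, snp2], trait)
print(adjusted_p)
```

Comparing against the per-permutation maximum controls the family-wise error rate without assuming independence among SNPs, which suits markers in linkage disequilibrium.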
The phylogenetic relationships of numerous branches within the core Y-chromosome haplogroup R-M207 support a West Asian origin of haplogroup R1b, its initial differentiation there followed by a rapid spread of one of its sub-clades carrying the M269 mutation to Europe. Here, we present phylogeographically resolved data for 2043 M269-derived Y-chromosomes from 118 West Asian and European populations assessed for the M412 SNP that largely separates the majority of Central and West European R1b lineages from those observed in Eastern Europe, the Circum-Uralic region, the Near East, the Caucasus and Pakistan. Within the M412 dichotomy, the major S116 sub-clade shows a frequency peak in the upper Danube basin and Paris area with declining frequency toward Italy, Iberia, Southern France and the British Isles. Although this frequency pattern closely approximates the spread of the Neolithic Linearbandkeramik (LBK) culture, an advent leading to a number of pre-historic cultural developments during the past ≤10 thousand years, more complex pre-Neolithic scenarios remain possible for the L23(xM412) components in Southeast Europe and elsewhere.
Y-chromosome; haplogroup R1b; human evolution; population genetics
Human height is a classical example of a polygenic quantitative trait. Recent large-scale genome-wide association studies (GWAS) have identified more than 200 height-associated loci, though these variants explain only 2∼10% of overall variability of normal height. The objective of this study was to investigate the variance explained by these loci in a relatively isolated population of European descent with limited admixture and homogeneous genetic background from the Adriatic coast of Croatia.
In a sample of 1304 individuals from the island population of Hvar, Croatia, we performed genome-wide SNP typing and assessed the variance explained by genetic scores constructed from different panels of height-associated SNPs extracted from five published studies. The combined information of the 180 SNPs reported by Lango Allen et al. explained 7.94% of phenotypic variation in our sample. Genetic scores based on 20∼50 SNPs reported by the remaining individual GWA studies explained 3∼5% of height variance. These percentages of variance explained were within ranges comparable to the original studies, and heterogeneity tests did not detect significant differences in effect size estimates between our study and the original reports when the estimates were obtained from populations of European descent.
We have evaluated the portability of height-associated loci and the overall fitting of estimated effect sizes reported in large cohorts to an isolated population. We found proportions of explained height variability were comparable to multiple reference GWAS in cohorts of European descent. These results indicate similar genetic architecture and comparable effect sizes of height loci among populations of European descent.
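The idea of a genetic score and the proportion of trait variance it explains can be shown with a minimal sketch. It assumes a score formed as the sum of allele counts weighted by published per-allele effect sizes, and measures variance explained as the R² of a simple linear regression of the trait on that score (equal to the squared Pearson correlation). The study's actual score construction and covariate adjustments may differ; all names and numbers here are hypothetical.

```python
def genetic_score(genotypes, effects):
    """Weighted allele score per person: sum of allele count x effect size."""
    return [sum(g * b for g, b in zip(person, effects)) for person in genotypes]


def variance_explained(score, trait):
    """R^2 of a simple linear regression of trait on score."""
    n = len(score)
    ms, mt = sum(score) / n, sum(trait) / n
    sxy = sum((s - ms) * (t - mt) for s, t in zip(score, trait))
    sxx = sum((s - ms) ** 2 for s in score)
    syy = sum((t - mt) ** 2 for t in trait)
    if sxx == 0 or syy == 0:
        raise ValueError("score and trait must both vary")
    return sxy ** 2 / (sxx * syy)


# Hypothetical data: 5 people x 3 SNPs (allele counts) and effect sizes in cm.
genotypes = [[2, 1, 0], [0, 1, 1], [1, 2, 2], [0, 0, 0], [2, 2, 1]]
effects = [0.5, 0.3, 0.2]
heights = [172.0, 169.5, 174.0, 168.0, 171.0]

scores = genetic_score(genotypes, effects)
r2 = variance_explained(scores, heights)
print(f"variance explained: {100 * r2:.1f}%")
```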
The aim of this article is to offer a concise interpretation of the scientific data on Croatian genetic heritage obtained over the past 10 years. We give a short overview of previously published articles by our and other groups, based mostly on Y-chromosome results. The data demonstrate that the Croatian population, like almost any other European population, represents a remarkable genetic mixture. More than 3/4 of contemporary Croatian men are most probably the offspring of Old Europeans who came here before and after the Last Glacial Maximum. The rest of the population is the offspring of people who arrived in this part of Europe through the southeastern route in the last 10 000 years, mostly during the neolithization process. We believe that the latest discoveries made with whole-genome typing techniques using array technology will help us understand the structure of the Croatian population in more detail, as well as aspects of its demographic history.
Human Y-chromosome haplogroup structure is largely circumscribed by continental boundaries. One notable exception to this general pattern is the young haplogroup R1a that exhibits post-Glacial coalescent times and relates the paternal ancestry of more than 10% of men in a wide geographic area extending from South Asia to Central East Europe and South Siberia. Its origin and dispersal patterns are poorly understood as no marker has yet been described that would distinguish European R1a chromosomes from Asian. Here we present frequency and haplotype diversity estimates for more than 2000 R1a chromosomes assessed for several newly discovered SNP markers that introduce the onset of informative R1a subdivisions by geography. Marker M434 has a low frequency and a late origin in West Asia bearing witness to recent gene flow over the Arabian Sea. Conversely, marker M458 has a significant frequency in Europe, exceeding 30% in its core area in Eastern Europe and comprising up to 70% of all M17 chromosomes present there. The diversity and frequency profiles of M458 suggest its origin during the early Holocene and a subsequent expansion likely related to a number of prehistoric cultural developments in the region. Its primary frequency and diversity distribution correlates well with some of the major Central and East European river basins where settled farming was established before its spread further eastward. Importantly, the virtual absence of M458 chromosomes outside Europe speaks against substantial patrilineal gene flow from East Europe to Asia, including to India, at least since the mid-Holocene.
Y chromosome; haplogroup R1a; human evolution; population genetics
Multiple studies have provided compelling evidence that the FTO gene variants are associated with obesity measures. The objective of the study was to investigate whether FTO variants are associated with a broad range of obesity related anthropometric traits in an island population.
We examined genetic association between 29 FTO SNPs and a comprehensive set of anthropometric traits in 843 unrelated individuals from an island population on the eastern Adriatic coast of Croatia. The traits include 11 anthropometrics (height, weight, waist circumference, hip circumference, bicondylar upper arm width, upper arm circumference, and biceps, triceps, subscapular, suprailiac and abdominal skin-fold thicknesses) and two derived measures (BMI and WHR). Using single locus score tests, 15 common SNPs were found to be significantly associated with “body fatness” measures such as weight, BMI, hip and waist circumferences with P-values ranging from 0.0004 to 0.01. Similar but less significant associations were also observed between these markers and bicondylar upper arm width and upper arm circumference. Most of these significant findings could be explained by a mediating effect of “body fatness”. However, one unique association signal between upper arm width and rs16952517 (P-value = 0.00156) could not be explained by this mediating effect. In addition, using a principal component analysis and conditional association tests adjusted for “body fatness”, two novel association signals were identified between upper arm circumference and rs11075986 (P-value = 0.00211) and rs16945088 (P-value = 0.00203).
The current study confirmed the association of common variants of the FTO gene with “body fatness” measures in an isolated island population. We also observed evidence of pleiotropic effects of the FTO gene on fat-free mass, such as frame size and muscle mass, assessed by bicondylar upper arm width and upper arm circumference respectively; these pleiotropic effects might be influenced by variants that are different from the ones associated with “body fatness”.
Genome-wide association studies (GWAS) have identified 38 larger genetic regions affecting classical blood lipid levels without adjusting for important environmental influences. We modeled diet and physical activity in a GWAS in order to identify novel loci affecting total cholesterol, LDL cholesterol, HDL cholesterol, and triglyceride levels. The Swedish (SE) EUROSPAN cohort (N_SE = 656) was screened for candidate genes and the non-Swedish (NS) EUROSPAN cohorts (N_NS = 3,282) were used for replication. In total, 3 SNPs were associated in the Swedish sample and were replicated in the non-Swedish cohorts. While SNP rs1532624 was a replication of the previously published association between CETP and HDL cholesterol, the other two were novel findings. For the latter SNPs, the p-value for association was substantially improved by inclusion of environmental covariates: SNP rs5400 (p_SE,unadjusted = 3.6×10⁻⁵, p_SE,adjusted = 2.2×10⁻⁶, p_NS,unadjusted = 0.047) in the SLC2A2 (Glucose transporter type 2) and rs2000999 (p_SE,unadjusted = 1.1×10⁻³, p_SE,adjusted = 3.8×10⁻⁴, p_NS,unadjusted = 0.035) in the HP gene (Haptoglobin-related protein precursor). Both showed evidence of association with total cholesterol. These results demonstrate that inclusion of important environmental factors in the analysis model can reveal new genetic susceptibility loci.
In this article we report a genome-wide association study on cholesterol levels in the human blood. We used a Swedish cohort to select genetic polymorphisms that showed the strongest association with cholesterol levels adjusted for diet and physical activity. We replicated several genetic loci in other European cohorts. This approach extends present genome-wide association studies on lipid levels, which did not take these lifestyle factors into account, to improve statistical results and discover novel genes. In our analysis, we could identify two genetic loci in the SLC2A2 (Glucose transporter type 2) and the HP (Haptoglobin-related protein precursor) gene whose effects on total cholesterol have not been reported yet. The results show that inclusion of important environmental factors in the analysis model can reveal new insights into genetic determinants of clinical parameters relevant for metabolic and cardiovascular disease.
A complete mitochondrial (mt) genome sequence was reconstructed from a 38,000-year-old Neandertal individual using 8,341 mtDNA sequences identified among 4.8 Gb of DNA generated from ~0.3 grams of bone. Analysis of the assembled sequence unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mtDNAs and allows an estimate of the divergence date between the two mtDNA lineages of 660,000±140,000 years. Of the 13 proteins encoded in the mtDNA, subunit 2 of cytochrome c oxidase of the mitochondrial electron transport chain has experienced the largest number of amino acid substitutions in human ancestors since the separation from Neandertals. There is evidence that purifying selection in the Neandertal mtDNA was reduced compared to other primate lineages suggesting that the effective population size of Neandertals was small.
To perform a comprehensive evaluation of the association of common genetic variants in candidate genes in the dopaminergic pathway with schizophrenia in a sample from the Croatian population.
A case-control association study was performed on 104 unrelated patients with schizophrenia recruited from a psychiatric hospital in Zagreb and 131 phenotypically normal Croatian subjects. Forty-nine tagging single nucleotide polymorphisms (tagSNPs) in 8 candidate genes in the dopaminergic pathway were identified from the HapMap database and tested for association. Genotyping was performed using the SNPlex platform. Statistical analysis was conducted to assess allelic and genotypic associations between cases and controls using a goodness-of-fit χ² test and trend test, respectively; adjustment for multiple testing was done by permutation-based analysis.
Significant allele frequency differences between schizophrenia cases and controls were observed at 4 tagSNPs located in the genes DRD5, HTR1B1, DBH, and TH1 (P < 0.005). A trend test also confirmed the genotypic association (P < 0.001) of these 4 tagSNPs. Additionally, moderate association (P < 0.05) was observed with 8 tagSNPs on SLC6A3, DBH, DRD4, SLC6A4, and COMT.
Common genetic variants in genes involved in the dopaminergic pathway are associated with schizophrenia in populations of Caucasian descent.
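An allelic association test of the kind described in the abstract above can be sketched as a χ² test on a 2×2 table of allele counts (cases versus controls). This is a standard textbook formulation offered as an assumption; it is not necessarily the exact goodness-of-fit variant the authors used, and the allele counts below are invented for illustration (208 case alleles and 262 control alleles correspond to 104 cases and 131 controls).

```python
import math


def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 allele table.

    Returns (chi2, p_value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts for one tagSNP.
chi2, p_value = allelic_chi2(case_a=90, case_b=118, ctrl_a=80, ctrl_b=182)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```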
To investigate the prevalence of chronic respiratory symptoms in 9 metapopulations on Adriatic islands in Croatia, and the relationship between respiratory symptoms and individual genetic background.
We obtained a random sample of 1001 adult inhabitants of 9 Adriatic island villages in Croatia, which also included immigrants to these villages. The European Union respiratory health questionnaire and the World Health Organization non-communicable diseases questionnaire were used. Personal genetic histories were reconstructed based on two-generation ancestral pedigrees. Bivariate and multivariate methods were used in the analysis.
Women reported the occurrence of acute dyspnea (P = 0.017), cough (P = 0.002), and asthma (P = 0.002) more often than men. Gender was the strongest predictor for acute and/or chronic cough (odds ratio [OR], 1.69; 95% confidence interval [CI], 1.23-2.33) and asthma (OR, 2.00; 95% CI, 1.00-4.01), whereas smoking was the strongest risk factor for acute and chronic dyspnea (OR, 1.90; 95% CI, 1.21-2.99) and airway narrowing (OR, 1.84; 95% CI, 1.18-2.87). Residence on the northern islands increased the odds of allergy, whereas the highest odds ratio of 3.20 was associated with the interaction of northern residence and immigrant background. Genetic background was a significant predictor only for the occurrence of allergy symptoms.
Differences in respiratory findings among the island inhabitants were often associated with smoking prevalence. Interaction of residence on northern Adriatic islands and immigrant background proved to be the strongest predictor for the occurrence of allergy symptoms. This study indicated that environmental factors played a very important role in the occurrence of respiratory symptoms.
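Odds ratios with 95% confidence intervals, such as those reported in the abstract above, can be illustrated with a minimal sketch for a single 2×2 table. It assumes the unadjusted odds ratio with a Woolf (log-based) interval; the reported estimates may instead come from multivariate models, which this example does not reproduce. The counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Woolf confidence interval for a 2x2 table.

    a: exposed with symptom      b: exposed without symptom
    c: unexposed with symptom    d: unexposed without symptom
    All four counts must be positive.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: symptom status among smokers versus non-smokers.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An interval whose lower bound sits near 1, as with the asthma estimate in the abstract (1.00-4.01), indicates borderline significance at the 5% level.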