Schizophrenia (SCZ) is a severe psychiatric disorder associated with many different risk factors, both genetic and environmental. A recent genome-wide association study (GWAS) of Han Chinese identified three single-nucleotide polymorphisms (SNPs rs11038167, rs11038172, and rs835784) in the tetraspanin gene TSPAN18 as possible susceptibility loci for schizophrenia. To validate these findings, we conducted a case-control study of Han Chinese with 1093 schizophrenia cases and 1022 healthy controls. Using the LDR-PCR method to genotype polymorphisms in TSPAN18, we found no significant differences (P>0.05) between patients and controls in either the allele or genotype frequency of the SNPs rs11038167 and rs11038172. We did find, however, that the frequency of the ‘A’ allele of SNP rs835784 was significantly higher in patients than in controls. We further observed a significant association (OR = 1.197, 95%CI = 1.047–1.369) between risk for SCZ and this ‘A’ allele. These results confirm the significant association, in Han Chinese populations, between increased SCZ risk and the variant of the TSPAN18 gene containing the ‘A’ allele of SNP rs835784.
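The odds ratio and confidence interval reported above are the kind of quantities that can be derived from a 2x2 table of allele counts. The following is a minimal sketch, assuming a Wald interval on the log odds ratio; the function name and the allele counts are hypothetical illustrations, not values or methods taken from the study.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 allele-count table.

    a: risk-allele count in cases     b: other-allele count in cases
    c: risk-allele count in controls  d: other-allele count in controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical allele counts (NOT from the study): 2 alleles per person,
# so 1093 cases give 2186 alleles and 1022 controls give 2044 alleles.
or_, lo, hi = odds_ratio_ci(600, 1586, 490, 1554)
print(f"OR = {or_:.3f}, 95% CI = {lo:.3f}-{hi:.3f}")
```

An interval that excludes 1 corresponds to a significant association at the chosen level, which is how the reported interval of 1.047–1.369 should be read.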
Hirschsprung disease (HSCR, OMIM 142623) is a developmental disorder characterized by the absence of ganglion cells along variable lengths of the distal gastrointestinal tract, which results in tonic contraction of the aganglionic colon segment and functional intestinal obstruction. The RET proto-oncogene is the major gene associated with HSCR, with differential contributions of its rare and common, coding and noncoding mutations to the multifactorial nature of this pathology. In addition, many other genes have been reported to be associated with this pathology, including the class III semaphorin genes SEMA3A (7p12.1) and SEMA3D (7q21.11), identified through SNP array analyses and next-generation sequencing technologies. Semaphorins are guidance cues for developing neurons, implicated in axonal projection and in determining the migratory pathway of neural crest-derived neural precursors during enteric nervous system development. In addition, increased SEMA3A expression has been proposed as a risk factor for HSCR, through upregulation of the gene in the aganglionic smooth muscle layer of the colon in HSCR patients. Here we present the results of a comprehensive analysis of SEMA3A and SEMA3D in a series of 200 Spanish HSCR patients by mutational screening of their coding sequences, which identified a number of potentially deleterious variants. RET mutations were also detected in some of the patients carrying SEMA variants. We evaluated the A131T-SEMA3A, S598G-SEMA3A and E198K-SEMA3D mutations by immunohistochemistry on colon tissue sections from these patients. All mutants showed increased protein expression in the smooth muscle layer of ganglionic segments. Moreover, A131T-SEMA3A also maintained higher protein levels in the aganglionic muscle layers. These findings strongly suggest that these mutants have a pathogenic effect on the disease.
Furthermore, because of their coexistence with RET mutations, our data substantiate the additive genetic model proposed for this rare disorder and further support the association of SEMA genes with HSCR.
Low back pain is associated with lumbar disc degeneration, which is mainly due to genetic predisposition. The objective of this study was to perform a systematic review to evaluate genetic association studies in lumbar disc degeneration as defined on magnetic resonance imaging (MRI) in humans.
A systematic literature search was conducted in MEDLINE, MEDLINE In-Process, SCOPUS, ISI Web of Science, The Genetic Association Database and The Human Genome Epidemiology Network for information published between 1990–2011 addressing genes and lumbar disc degeneration. Two investigators independently identified studies to determine inclusion, after which they performed data extraction and analysis. The level of cumulative genetic association evidence was analyzed according to The HuGENet Working Group guidelines.
Fifty-two studies were included for review. Forty-eight studies reported at least one positive association between a genetic marker and lumbar disc degeneration. The phenotype definition of lumbar disc degeneration was highly variable between the studies and replications were inconsistent. Most of the associations presented with a weak level of evidence. The level of evidence was moderate for ASPN (D-repeat), COL11A1 (rs1676486), GDF5 (rs143383), SKT (rs16924573), THBS2 (rs9406328) and MMP9 (rs17576).
Based on this first extensive systematic review on the topic, the credibility of reported genetic associations is mostly weak. Clear definition of lumbar disc degeneration phenotypes and large population-based cohorts are needed. An international consortium is needed to standardize genetic association studies in relation to disc degeneration.
Correct annotation of the genetic relationships between samples is essential for population genomic studies, which could be biased by errors or omissions. To this end, we used identity-by-state (IBS) and identity-by-descent (IBD) methods to assess genetic relatedness of individuals within HapMap phase III data. We analyzed data from 1,397 individuals across 11 ethnic populations. Our results support previous studies (Pemberton et al., 2010; Kyriazopoulou-Panagiotopoulou et al., 2011) assessing unknown relatedness present within this population. Additionally, we present evidence for 1,657 novel pairwise relationships across 9 populations. Surprisingly, significant Cotterman's coefficients of relatedness K1 (IBD1) values were detected between pairs of known parents. Furthermore, significant K2 (IBD2) values were detected in 32 previously annotated parent-child relationships. Consistent with a hypothesis of inbreeding, regions of homozygosity (ROH) were identified in the offspring of related parents, of which a subset overlapped those reported in previous studies (Gibson et al. 2010; Johnson et al. 2011). In total, we inferred 28 inbred individuals with ROH that overlapped areas of relatedness between the parents and/or IBD2 sharing at a different genomic locus between a child and a parent. Finally, 8 previously annotated parent-child relationships had unexpected K0 (IBD0) values (resulting from a chromosomal abnormality or genotype error), and 10 previously annotated second-degree relationships along with 38 other novel pairwise relationships had unexpected IBD2 (indicating two separate paths of recent ancestry). These newly described types of relatedness may impact the outcome of previous studies and should inform the design of future studies relying on the HapMap Phase III resource.
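The comparison of estimated IBD sharing against annotated relationships can be illustrated as follows. This is a minimal sketch, assuming the standard expected Cotterman coefficients (K0, K1, K2) for non-inbred pairs; the function names and the tolerance `tol` are hypothetical and are not the thresholds used in the study.

```python
# Expected probabilities of sharing 0, 1 or 2 alleles identical by descent
# for common relationship types, assuming no inbreeding.
EXPECTED = {
    "parent-child": (0.0, 1.0, 0.0),
    "full-sibling": (0.25, 0.5, 0.25),
    "second-degree": (0.5, 0.5, 0.0),
    "unrelated": (1.0, 0.0, 0.0),
}

def relatedness(k1, k2):
    """Coefficient of relatedness implied by IBD1 and IBD2 sharing."""
    return 0.5 * k1 + k2

def unexpected_sharing(annotated, k0, k1, k2, tol=0.05):
    """Names of the coefficients that deviate from the annotated relationship.

    tol is an illustrative tolerance, not a value from the study.
    """
    observed = (k0, k1, k2)
    names = ("K0", "K1", "K2")
    return [
        name
        for name, obs, exp in zip(names, observed, EXPECTED[annotated])
        if abs(obs - exp) > tol
    ]

# A hypothetical annotated parent-child pair with excess IBD2 sharing,
# the pattern the text associates with related parents.
print(unexpected_sharing("parent-child", 0.0, 0.9, 0.1))
```

Under these expectations, non-zero K2 in a parent-child pair or non-zero K0 in a parent-child pair are exactly the deviations the abstract singles out as unexpected.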
The trnH–psbA intergenic spacer region has been used in many DNA barcoding studies. However, a comprehensive evaluation with rigorous sequence preprocessing and statistical testing on the utility of trnH–psbA and its combinations as DNA barcodes is lacking.
Sequences were retrieved from GenBank for a meta-analysis on the usefulness of trnH–psbA and its combinations as DNA barcodes. After preprocessing, we constructed full and matching data sets that contained 17 983 trnH–psbA sequences and 2190 sets of trnH–psbA, matK, rbcL, and ITS2 sequences from the same sample, respectively. These data sets were used to analyze the ability of trnH–psbA and its combinations to discriminate species by the BLAST and BLAST+P methods. Fisher's exact test was used to evaluate the significance of performance differences. For the full data set, the identification success rates of trnH–psbA exceeded 70% in 18 families and 12 genera. For the matching data set, the identification rates of trnH–psbA were significantly higher than those of the other loci in two families and four genera. Similarly, the identification rates of trnH–psbA+ITS2 were significantly higher than those of matK+rbcL in 18 families and 21 genera.
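A comparison of identification success between two barcodes can be framed as a 2x2 table (identified versus not identified, for each marker) and tested with Fisher's exact test. The sketch below is one possible two-sided implementation using only the standard library; the function name and the counts are hypothetical, and the study's exact software and settings are not stated in the text.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows: marker 1, marker 2. Columns: identified, not identified.
    The p-value sums the hypergeometric probabilities of all tables
    (with the same margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = 0.0
    for x in range(lowest, highest + 1):
        p = prob(x)
        if p <= p_obs * (1 + 1e-9):  # small slack for floating-point ties
            total += p
    return total

# Hypothetical counts: marker 1 identifies 45 of 60 species,
# marker 2 identifies 30 of 60.
print(fisher_exact_two_sided(45, 15, 30, 30))
```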
This study provides valuable information on the higher utility of trnH–psbA and its combinations. We found that trnH–psbA+ITS2 combination performs better or equally well compared with other combinations in most taxonomic groups investigated. This information will guide the optimal usage of trnH–psbA and its combinations for species identification.
Background & Aims
Previous studies indicate an association between sleep problems and gastroesophageal reflux disease (GERD). Although both these conditions separately have moderate heritabilities, confounding by genetic factors has not previously been taken into account. This study aimed to reveal the association between sleep problems and GERD, while adjusting for heredity and other potential confounding factors.
This cross-sectional population-based study included all 8,014 same-sex twins of at least 65 years of age and born in Sweden between 1886 and 1958, who participated in telephone interviews in 1998–2002. Three logistic regression models were used: 1) external control analysis, 2) within-pair co-twin analysis with dizygotic (DZ) twin pairs discordant for GERD, and 3) within-pair co-twin analysis with monozygotic (MZ) twin pairs discordant for GERD. Odds ratios (ORs) with 95% confidence intervals (CIs) were calculated and adjusted for established risk factors for GERD, i.e. sex, age, body mass index (BMI), tobacco smoking, and educational level.
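The logic of the within-pair co-twin analysis can be sketched in its simplest, unadjusted form. For twin pairs discordant for the outcome and a binary exposure, the conditional (matched-pair) odds ratio reduces to the ratio of the two kinds of exposure-discordant pairs. This is an illustrative simplification, not the adjusted logistic regression models used in the study; the function name and the pair counts are hypothetical.

```python
import math

def cotwin_odds_ratio(pairs, z=1.96):
    """Matched-pair odds ratio with a Wald confidence interval.

    pairs: sequence of (affected_twin_exposed, unaffected_twin_exposed)
           booleans, one tuple per outcome-discordant twin pair.
    Only pairs that are also discordant for exposure carry information.
    """
    pairs = list(pairs)
    n10 = sum(1 for case, ctrl in pairs if case and not ctrl)
    n01 = sum(1 for case, ctrl in pairs if ctrl and not case)
    odds_ratio = n10 / n01
    se = math.sqrt(1 / n10 + 1 / n01)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical data: in 30 pairs only the twin with GERD reports sleep
# problems, in 15 pairs only the co-twin does, and 10 pairs are concordant.
pairs = [(True, False)] * 30 + [(False, True)] * 15 + [(True, True)] * 10
print(cotwin_odds_ratio(pairs))
```

Because each twin serves as the control for the other, factors shared within a pair (all genes in MZ pairs, on average half of the segregating genes in DZ pairs, and the shared rearing environment) cannot confound the comparison.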
A dose-response association was identified between increasing levels of sleep problems and GERD in the external control analysis. Individuals who often experienced sleep problems had a two-fold increased occurrence of GERD compared to those who seldom had sleep problems (OR 2.0, 95% CI 1.8–2.4). The corresponding association was of similar strength in the co-twin analysis including 356 DZ pairs (OR 2.2, 95% CI 1.6–3.4), and in the co-twin analysis including 210 MZ pairs (OR 1.5, 95% CI 0.9–2.7).
A dose-dependent association between sleep problems and GERD remains after taking heredity and other known risk factors for GERD into account.
Osteoclast activity and the fine balance between bone formation and resorption are affected by inflammatory factors such as cytokines and T lymphocyte activity, mediated by major histocompatibility complex (MHC) molecules, in turn regulated by the MHC class II transactivator (MHC2TA). We investigated the effect of functional polymorphisms in the MHC2TA gene (CIITA) and two additional genes, C-type lectin domain 16A (CLEC16A), which is in linkage disequilibrium with CIITA, and interferon-γ (IFNG), an inducer of CIITA, on bone mineral density (BMD), bone resorption markers, bone loss and fracture risk in 75-year-old women followed for up to 10 years (OPRA n = 1003) and in young adult women (PEAK-25 n = 999). CIITA was associated with BMD at age 75 (lumbar spine p = 0.011; femoral neck (FN) p = 0.049) and age 80 (total body p = 0.015; total hip p = 0.042; FN p = 0.028). Carriers of the CIITA rs3087456(G) allele had 1.8–3.4% higher BMD and displayed an increased rate of bone loss between age 75 and 80 (FN p = 0.013; total hip p = 0.030; total body p = 3.8E−5). Despite increasing bone loss, the rs3087456(G) allele was protective against incident fracture overall (p = 0.002), osteoporotic fracture and hip fracture. Carriers of CLEC16A and IFNG variant alleles had lower BMD (p<0.05) and ultrasound parameters and a lower risk of incident fracture (CLEC16A, p = 0.011). In 25-year-old women, none of the genes were associated with BMD. In conclusion, variation in the inflammatory genes CIITA, CLEC16A and IFNG appears to contribute to bone phenotypes in elderly women and suggests a role for low-grade inflammation and MHC class II expression in osteoporosis pathogenesis.
The objective was to estimate the heritability of height and weight during fetal life and early childhood in two independent studies, one of parents and their singleton offspring and one of mono- and dizygotic twins.
This study was embedded in the Generation R Study (n = 3407, singletons) and the Netherlands Twin Register (n = 33694, twins). For the heritability estimates in Generation R, regression models as proposed by Galton were used. In the Twin Register we used genetic structural equation modelling. Parental height and weight were measured and fetal growth characteristics (femur length and estimated fetal weight) were measured by ultrasounds in 2nd and 3rd trimester (Generation R only). Height and weight were assessed at multiple time-points from birth to 36 months in both studies.
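The regression approach attributed to Galton can be sketched as follows: when offspring values are regressed on the mid-parent value (the mean of the two parents), the slope is an estimate of narrow-sense heritability. This is a minimal sketch under the assumption that measurements are already standardized for age and sex; the function name and the data are hypothetical, and the study's exact model specification is not given in the text.

```python
def midparent_heritability(father, mother, child):
    """Slope of the least-squares regression of child on mid-parent value.

    Each argument is a list of measurements (for example standardized
    height), aligned by family. The slope estimates narrow-sense heritability.
    """
    mid = [(f + m) / 2 for f, m in zip(father, mother)]
    mean_x = sum(mid) / len(mid)
    mean_y = sum(child) / len(child)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mid, child))
    sxx = sum((x - mean_x) ** 2 for x in mid)
    return sxy / sxx

# Hypothetical families (heights in cm); the children are constructed so
# that the regression slope is exactly 0.5.
father = [170.0, 180.0, 175.0, 185.0]
mother = [160.0, 165.0, 170.0, 175.0]
child = [0.5 * (f + m) / 2 + 90.0 for f, m in zip(father, mother)]
print(midparent_heritability(father, mother, child))
```

When only one parent is available, the slope on that single parent is conventionally doubled to obtain the same estimate; the twin-based estimates in the text come from structural equation modelling, which this sketch does not attempt to reproduce.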
Heritability estimates for length increased from 13% in the 2nd trimester to 28% in the 3rd trimester. At birth, heritability estimates for length in singletons and twins were 26% and 27%, respectively, and at 36 months, the estimates for height were 63% and 72%, respectively. Heritability estimates for fetal weight increased from 17% in the 2nd trimester to 27% in the 3rd trimester. For birth weight, heritability estimates were 26% in singletons and 29% in twins. At 36 months, the estimate for weight was higher in twins (71%) than in singletons (42%).
Heritability estimates for height and weight increase from second trimester to infancy. This increase in heritability is observed in singletons and twins. Longer follow-up studies are needed to examine how the heritability develops in later childhood and puberty.
Smoking behavior is a multifactorial phenotype with significant heritability. Identifying the specific loci that influence smoking behavior could provide important etiological insights and facilitate the development of treatments to further reduce smoking related mortality. Although several studies pointed to different candidate genes for smoking, there is still a need for replication especially in samples from different countries. In the present study, we investigated whether 21 positive signals for smoking behavior from these studies are replicated in a sample of 531 blood donors from the Brazilian population. The polymorphisms were chosen based on their representativeness of different candidate biologic systems, strength of previous evidence, location and allele frequencies. By genotyping with the Sequenom MassARRAY iPLEX platform and subsequent statistical analysis using Plink software, we show that two of the SNPs studied, in the SLC1A2 (rs1083658) and ACTN1 (rs2268983) genes, were associated with smoking behavior in our study population. These genes are involved in crucial aspects of nicotine dependence, glutamate system and synaptic plasticity, and as such, are biologically plausible candidates that merit further molecular analyses so as to clarify their potential role in smoking behavior.
Homozygosity mapping has played an important role in detecting recessive mutations using families from consanguineous marriages. However, detecting regions that are homozygous by descent (HBD) when family data are not available, or when the relationship is hidden, is still a challenge. Making use of population data from high-density SNP genotyping may allow detection of regions HBD from recent common founders in singleton patients without genealogy information. We report a novel algorithm that detects such regions by estimating the population haplotype frequencies (HF) for an entire homozygous region. We also developed a simulation method to evaluate the probability of HBD for a homozygous region by examining the best regions in unaffected controls from the host population. The method can be applied to diseases of Mendelian inheritance and can be further extended to complex diseases to detect rare founder mutations using multiplex families or sporadic cases. Testing of the method on both real cases (singleton affected) and simulated data demonstrated its high sensitivity and strong resistance to genetic heterogeneity.
homozygosity mapping; recessive mutation; founder mutation; runs of homozygosity; hidden relationship
Hirschsprung disease (HSCR) is a congenital disorder characterized by aganglionosis of the distal intestine. To assess the contribution of copy number variants (CNVs) to HSCR, we analysed the data generated from our previous genome-wide association study on HSCR patients, whereby we identified NRG1 as a new HSCR susceptibility locus. Analysis of 129 Chinese patients and 331 ethnically matched controls showed that HSCR patients have a greater burden of rare CNVs (p = 1.50×10−5), particularly for those encompassing genes (p = 5.00×10−6). Our study identified 246 rare-genic CNVs exclusive to patients. Among those, we detected a NRG3 deletion (p = 1.64×10−3). Subsequent follow-up (96 additional patients and 220 controls) on NRG3 revealed 9 deletions (combined p = 3.36×10−5) and 2 de novo duplications among patients and two deletions among controls. Importantly, NRG3 is a paralog of NRG1. Stratification of patients by presence/absence of HSCR–associated syndromes showed that while syndromic–HSCR patients carried significantly longer CNVs than the non-syndromic or controls (p = 1.50×10−5), non-syndromic patients were enriched in CNV number when compared to controls (p = 4.00×10−6) or the syndromic counterpart. Our results suggest a role for NRG3 in HSCR etiology and provide insights into the relative contribution of structural variants in both syndromic and non-syndromic HSCR. This would be the first genome-wide catalog of copy number variants identified in HSCR.
Copy number variations (CNVs) are significant genetic risk factors in disease pathogenesis and represent an important portion of missing heritability for some human diseases, making their discovery essential for the identification of genes and risk factors for a wide range of diseases, including Hirschsprung disease (HSCR, congenital colon aganglionosis). Since the discovery of the major HSCR gene, RET, a number of rare mutations have been reported in RET and other genes involved in the development of the enteric nervous system. However, these mutations contribute to only a small proportion of the disease susceptibility. Taking advantage of the recent technical and methodological advances, we have examined the contribution of CNVs to the disease. We have found that HSCR patients are enriched with CNVs encompassing genes. In particular, we found that deletions of NRG3, a paralog of the previously identified HSCR–susceptibility gene NRG1, were associated with the HSCR phenotype.
Hirschsprung disease (HSCR, OMIM 142623) is a developmental disorder characterized by the absence of ganglion cells along variable lengths of the distal gastrointestinal tract, which results in tonic contraction of the aganglionic gut segment and functional intestinal obstruction. The RET proto-oncogene is the major gene for HSCR, with differential contributions of its rare and common, coding and noncoding mutations to the multifactorial nature of this pathology. Many other genes have been reported to be associated with the pathology, such as the NRG1 gene (8p12), which encodes neuregulin 1, is implicated in the development of the enteric nervous system (ENS), and seems to contribute through both common and rare variants. Here we present the results of a comprehensive analysis of the NRG1 gene in the context of the disease in a series of 207 Spanish HSCR patients, by both mutational screening of its coding sequence and evaluation of 3 common tag SNPs as low-penetrance susceptibility factors. We found some potentially damaging variants, which we characterized functionally. All of them were associated with a significant reduction of normal NRG1 protein levels. The fact that the mutations analyzed alter the NRG1 protein suggests that they are related to HSCR not only in Chinese but also in Caucasian populations, which reinforces the implication of the NRG1 gene in this pathology.
The genome-wide association study (GWAS) approach has discovered hundreds of genetic variants associated with diseases and quantitative traits. However, despite clinical overlap and statistical correlation between many phenotypes, GWAS are generally performed one-phenotype-at-a-time. Here we compare the performance of modelling multiple phenotypes jointly with that of the standard univariate approach. We introduce a new method and software, MultiPhen, that models multiple phenotypes simultaneously in a fast and interpretable way. By performing ordinal regression, MultiPhen tests the linear combination of phenotypes most associated with the genotypes at each SNP, and thus potentially captures effects hidden to single phenotype GWAS. We demonstrate via simulation that this approach provides a dramatic increase in power in many scenarios. There is a boost in power for variants that affect multiple phenotypes and for those that affect only one phenotype. While other multivariate methods have similar power gains, we describe several benefits of MultiPhen over these. In particular, we demonstrate that other multivariate methods that assume the genotypes are normally distributed, such as canonical correlation analysis (CCA) and MANOVA, can have highly inflated type-1 error rates when testing case-control or non-normal continuous phenotypes, while MultiPhen produces no such inflation. To test the performance of MultiPhen on real data we applied it to lipid traits in the Northern Finland Birth Cohort 1966 (NFBC1966). In these data MultiPhen discovers 21% more independent SNPs with known associations than the standard univariate GWAS approach, while applying MultiPhen in addition to the standard approach provides 37% increased discovery. 
The most associated linear combinations of the lipids estimated by MultiPhen at the leading SNPs accurately reflect the Friedewald Formula, suggesting that MultiPhen could be used to refine the definition of existing phenotypes or uncover novel heritable phenotypes.
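For reference, the Friedewald formula mentioned above estimates LDL cholesterol as a fixed linear combination of three measured lipid traits, which is why a method that searches for linear combinations of phenotypes can recover it. The sketch below states the formula for concentrations in mg/dL; the function name and the example values are hypothetical.

```python
def friedewald_ldl(total_chol, hdl, triglycerides):
    """Estimate LDL cholesterol in mg/dL as TC - HDL - TG / 5."""
    return total_chol - hdl - triglycerides / 5.0

# The same estimate expressed as weights on the three lipid traits.
FRIEDEWALD_WEIGHTS = {"total_chol": 1.0, "hdl": -1.0, "triglycerides": -0.2}

# Hypothetical measurements in mg/dL.
ldl = friedewald_ldl(200.0, 50.0, 150.0)
print(ldl)
```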
Rheumatoid arthritis (RA) is a chronic inflammatory disorder with a polygenic mode of inheritance. This study examined the hypothesis that runs of homozygosity (ROHs) play a recessive-acting role in the underlying RA genetic mechanism and identified RA-associated ROHs. Ours is the first genome-wide homozygosity association study for RA and characterized the ROH patterns associated with RA in the genomes of 2,000 RA patients and 3,000 normal controls of the Wellcome Trust Case Control Consortium. Genome scans consistently pinpointed two regions within the human major histocompatibility complex region containing RA-associated ROHs. The first region is from 32,451,664 bp to 32,846,093 bp (−log10(p)>22.6591). RA-susceptibility genes, such as HLA-DRB1, are contained in this region. The second region ranges from 32,933,485 bp to 33,585,118 bp (−log10(p)>8.3644) and contains other HLA-DPA1 and HLA-DPB1 genes. These two regions are physically close but are located in different blocks of linkage disequilibrium, and ∼40% of the RA patients' genomes carry these ROHs in the two regions. By analyzing homozygote intensities, an ROH that is anchored by the single nucleotide polymorphism rs2027852 and flanked by HLA-DRB6 and HLA-DRB1 was found associated with increased risk for RA. The presence of this risky ROH provides a 62% accuracy to predict RA disease status. An independent genomic dataset from 868 RA patients and 1,194 control subjects of the North American Rheumatoid Arthritis Consortium successfully validated the results obtained using the Wellcome Trust Case Control Consortium data. In conclusion, this genome-wide homozygosity association study provides an alternative to allelic association mapping for the identification of recessive variants responsible for RA. The identified RA-associated ROHs uncover recessive components and missing heritability associated with RA and other autoimmune diseases.
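The basic notion of a run of homozygosity can be illustrated with a simple scan over one individual's genotypes ordered by chromosomal position. This is a minimal sketch, assuming genotypes coded as minor-allele counts and a minimum run length in SNPs; the function name, the coding and the threshold are hypothetical, and the study's own ROH detection procedure is not described in the text.

```python
def find_roh(genotypes, min_snps=3):
    """Return (start, end) index pairs of homozygous runs, end inclusive.

    genotypes: minor-allele counts per SNP (0 or 2 = homozygous,
               1 = heterozygous, None = missing), ordered by position.
    min_snps:  illustrative minimum number of consecutive homozygous SNPs.
    In this sketch a heterozygous or missing call ends the current run.
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs

# Hypothetical genotype vector for one individual.
print(find_roh([0, 2, 2, 1, 0, 0, 1, 2, 2, 2, 0], min_snps=3))
```

Practical ROH callers usually work in physical length, tolerate a small number of heterozygous or missing calls per window, and account for linkage disequilibrium, none of which is modelled here.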
Hypertension is a complex disorder with high prevalence rates all over the world. We conducted the first genome-wide gene-based association scan for hypertension in a Han Chinese population. By analyzing genome-wide single-nucleotide-polymorphism data of 400 matched pairs of young-onset hypertensive patients and normotensive controls genotyped with the Illumina HumanHap550-Duo BeadChip, 100 susceptibility genes for hypertension were identified and also validated with permutation tests. Seventeen of the 100 genes exhibited differential allelic and expression distributions between patient and control groups. These genes provided a good molecular signature for classifying hypertensive patients and normotensive controls. Among the 17 genes, IGF1, SLC4A4, WWOX, and SFMBT1 were not only identified by our gene-based association scan and gene expression analysis but were also replicated by a gene-based association analysis of the Hong Kong Hypertension Study. Moreover, cis-acting expression quantitative trait loci associated with the differentially expressed genes were found and linked to hypertension. IGF1, which encodes insulin-like growth factor 1, is associated with cardiovascular disorders, metabolic syndrome, decreased body weight/size, and changes of insulin levels in mice. SLC4A4, which encodes the electrogenic sodium bicarbonate cotransporter 1, is associated with decreased body weight/size and abnormal ion homeostasis in mice. WWOX, which encodes the WW domain-containing protein, is related to hypoglycemia and hyperphosphatemia. SFMBT1, which encodes the scm-like with four MBT domains protein 1, is a novel hypertension gene. GRB14, TMEM56 and KIAA1797 exhibited highly significant differential allelic and expressed distributions between hypertensive patients and normotensive controls. GRB14 was also found relevant to blood pressure in a previous genetic association study in East Asian populations. 
TMEM56 and KIAA1797 may be specific to Taiwanese populations, because they were not validated by the two replication studies. Identification of these genes enriches the collection of hypertension susceptibility genes, thereby shedding light on the etiology of hypertension in Han Chinese populations.
Hypertension is caused by the interaction of environmental and genetic factors. The condition, which is very common and affects about 18% of the adult Hong Kong Chinese population and over 50% of older individuals, is responsible for considerable morbidity and mortality. To identify genes influencing hypertension and blood pressure, we conducted a combined linkage and association study using over 500,000 single nucleotide polymorphisms (SNPs) genotyped in 328 individuals comprising 111 hypertensive probands and their siblings. Using a family-based association test, we found an association with SNPs on chromosome 5q31.1 (rs6596140; P<9×10−8) for hypertension. One candidate gene, PDC, was replicated, with rs3817586 on 1q31.1 attaining P = 2.5×10−4 and 2.9×10−5 in the within-family tests for DBP and MAP, respectively. We also identified regions of significant linkage for systolic and diastolic blood pressure on chromosomes 2q22 and 5p13, respectively. Further family-based association analysis of the linkage peak on chromosome 5 yielded a significant association (rs1605685, P<7×10−5) for DBP. This is the first combined linkage and association study of hypertension and its related quantitative traits in individuals of Chinese ancestry. The associations reported here account for the action of common variants, whereas the discovery of linkage regions may point to novel targets for rare variant screening.
Genetic evidence implicates the DISC1 gene in the etiology of a number of mental illnesses. Previously, we have reported association between DISC1 and measures of psychosis proneness, the Revised Social Anhedonia Scale (RSAS) and Revised Physical Anhedonia Scale (RPAS), in the Northern Finland Birth Cohort 1966 (NFBC66). As part of the studies of this Finnish birth cohort genome-wide association analysis has recently been performed.
In the present study, we re-analyzed the genome-wide association data with regard to these two measures of psychosis proneness, conditioning on our previous DISC1 observation. From the original NFBC66 sample (N = 12 058), 4 561 individuals provided phenotype and genotype data. No markers were significant at the genome-wide level. However, several genes with biological relevance to mental illnesses were highlighted through loci displaying suggestive evidence for association (≥3 SNPs with P<10E-4). These included the protein coding genes, CXCL3, KIAA1128, LCT, MED13L, TMCO7, TTN, and the micro RNA MIR620.
By conditioning a previous genome-wide association study on DISC1, we have been able to identify eight genes as associated with psychosis proneness. Further, these molecules predominantly link to the DISC1 pathway, strengthening the evidence for the role of this gene network in the etiology of mental illness. The use of quantitative measures of psychosis proneness in a large population cohort will make these findings, once verified, more generalizable to a broad selection of disorders related to psychoses and psychosis proneness.
Rare (RVs) and common variants of the RET gene contribute to Hirschsprung disease (HSCR; congenital aganglionosis). While RET common variants are strongly associated with the commonest manifestation of the disease (males; short-segment aganglionosis; sporadic), rare coding sequence (CDS) variants are more frequently found in the less common and more severe forms of the disease (females; long/total colonic aganglionosis; familial).
Here we present the screening for RVs in the RET CDS and intron/exon boundaries of 601 Chinese HSCR patients, the largest number of patients ever reported. We identified 61 different heterozygous RVs (50 novel) distributed among 100 patients (16.64%). Those include 14 silent, 29 missense, 5 nonsense, 4 frame-shifts, and one in-frame amino-acid deletion in the CDS, two splice-site deletions, 4 nucleotide substitutions and a 22-bp deletion in the intron/exon boundaries and 1 single-nucleotide substitution in the 5′ untranslated region. Exonic variants were mainly clustered in the RET extracellular domain. RET RVs were more frequent among patients with the most severe phenotype (24% vs. 15% in short-HSCR). Phasing RVs with the RET HSCR-associated haplotype suggests that RVs do not underlie the indisputable association of RET common variants with HSCR. None of the variants were found in 250 Chinese controls.
Current genome-wide association studies (GWAS) use commercial genotyping microarrays that can assay over a million single nucleotide polymorphisms (SNPs). The number of SNPs is further boosted by advanced statistical genotype-imputation algorithms and large SNP databases for reference human populations. The testing of a huge number of SNPs needs to be taken into account in the interpretation of statistical significance in such genome-wide studies, but this is complicated by the non-independence of SNPs because of linkage disequilibrium (LD). Several previous groups have proposed the use of the effective number of independent markers (Me) for the adjustment of multiple testing, but current methods of calculation for Me are limited in accuracy or computational speed. Here, we report a more robust and fast method to calculate Me. Applying this efficient method [implemented in a free software tool named Genetic type 1 error calculator (GEC)], we systematically examined the Me, and the corresponding p-value thresholds required to control the genome-wide type 1 error rate at 0.05, for 13 Illumina or Affymetrix genotyping arrays, as well as for HapMap Project and 1000 Genomes Project datasets which are widely used in genotype imputation as reference panels. Our results suggested the use of a p-value threshold of ~10−7 as the criterion for genome-wide significance for early commercial genotyping arrays, but slightly more stringent p-value thresholds ~5 × 10−8 for current or merged commercial genotyping arrays, ~10−8 for all common SNPs in the 1000 Genomes Project dataset and ~5 × 10−8 for the common SNPs only within genes.
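The relationship between the effective number of independent markers (Me) and a genome-wide p-value threshold can be sketched with two standard corrections. This is an illustration only: the function names are hypothetical, and whether GEC applies a Bonferroni-type or a Šidák-type correction to Me is an assumption not confirmed by the text.

```python
import math

def bonferroni_threshold(m_e, alpha=0.05):
    """Per-test threshold controlling the family-wise error rate at alpha."""
    return alpha / m_e

def sidak_threshold(m_e, alpha=0.05):
    """Sidak threshold 1 - (1 - alpha)**(1 / m_e), computed stably."""
    return -math.expm1(math.log1p(-alpha) / m_e)

# Illustrative value: about one million effectively independent tests
# gives a Bonferroni threshold of about 5e-8.
m_e = 1_000_000
print(bonferroni_threshold(m_e), sidak_threshold(m_e))
```

Read in the other direction, a threshold of about 5 × 10−8 at a type 1 error rate of 0.05 corresponds to roughly one million effectively independent tests under the Bonferroni form.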
Electronic supplementary material
The online version of this article (doi:10.1007/s00439-011-1118-2) contains supplementary material, which is available to authorized users.
We attempt to elucidate whether there might be a causal connection between the socioeconomic status (SES) of the rearing environment and obesity in the offspring using data from two large-scale adoption studies: (1) The Copenhagen Adoption Study of Obesity (CASO), and (2) The Survey of Holt Adoptees and Their Families (HOLT). In CASO, the SES of both biological and adoptive parents was known, but all children were adopted. In HOLT, only the SES of the rearing parents was known, but the children could be either biological or adopted. After controlling for relevant covariates (e.g., adoptee age at measurement, adoptee age at transfer, adoptee sex) the raw (unstandardized) regression coefficients for adoptive and biological paternal SES on adoptee body mass index (BMI: kg/m²) in CASO were -.22 and -.23, respectively, both statistically significant (p = 0.01). Controlling for parental BMI (both adoptive and biological) reduced the coefficient for biological paternal SES by 44% (p = .034) and the coefficient for adoptive paternal SES by 1%. For HOLT, the regression coefficients for rearing parent SES were -.42 and -.25 for biological and adoptive children, respectively. Controlling for the average BMI of the rearing father and mother (i.e., mid-parental BMI) reduced the SES coefficient by 47% in their biological offspring (p≤.0001), and by 12% in their adoptive offspring (p = .09). Thus, despite the differing structures of the two adoption studies, both suggest that shared genetic diathesis and direct environmental transmission contribute about equally to the association between rearing SES and offspring BMI.
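The percentage reductions quoted above can be read as the relative change in a regression coefficient once parental BMI is added to the model. The sketch below shows that arithmetic. The helper names are hypothetical, and the adjusted coefficient it prints is only back-calculated from the rounded figures in the text (-.23 and 44%); it is not a value reported by the study.

```python
def percent_reduction(unadjusted: float, adjusted: float) -> float:
    """Relative shrinkage of a coefficient after adding a covariate, in percent."""
    return 100.0 * (unadjusted - adjusted) / unadjusted


def implied_adjusted(unadjusted: float, reduction_pct: float) -> float:
    """Coefficient implied by a stated percent reduction (approximate, from rounded inputs)."""
    return unadjusted * (1.0 - reduction_pct / 100.0)


# CASO, biological paternal SES: -.23 before adjustment, reduced by 44%.
approx_adjusted = implied_adjusted(-0.23, 44.0)
print(round(approx_adjusted, 4))

# Round trip: recover the stated reduction from the two coefficients.
print(round(percent_reduction(-0.23, approx_adjusted), 1))
```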
Though rooted in genomic expression studies, pathway analysis for genome-wide association studies (GWAS) has gained increasing popularity, since it has the potential to discover hidden disease pathogenic mechanisms by combining statistical methods with biological knowledge. Generally, recently proposed algorithms or programs can be categorized by type of input data, null hypothesis, or number of analysis stages. Because of the complexity of SNP, gene and pathway relationships, re-sampling strategies such as permutation are always utilized to derive an empirical distribution of the test statistic for evaluating the significance of candidate pathways. However, these algorithms need to be evaluated on real GWAS datasets and real biological pathway databases before they can be applied widely with confidence.
Two algorithms that use summary statistics from GWAS as input were implemented in KGG, a novel and user-friendly software tool for GWAS pathway analysis. These two algorithms and five other selected algorithms were compared by analyzing the WTCCC Crohn's Disease dataset with the MSigDB canonical pathways. Because permutation was used to obtain empirical p-values, most of these methods controlled the Type I error rate well, although some were conservative. However, the methods varied greatly in power and running time, with the PLINK truncated set-based test being the most powerful and KGG being the fastest.
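To make the idea of a permutation-based empirical p-value concrete, the following minimal sketch shuffles case/control labels and counts how often the permuted statistic is at least as large as the observed one. It is a generic illustration, not the procedure used by KGG, PLINK or any other tool compared here. The statistic (a difference in means), the function names and the toy data are all assumptions.

```python
import random
from statistics import mean


def mean_difference(values, labels):
    """Toy test statistic: mean score in cases (label 1) minus controls (label 0)."""
    cases = [v for v, lab in zip(values, labels) if lab == 1]
    controls = [v for v, lab in zip(values, labels) if lab == 0]
    return mean(cases) - mean(controls)


def permutation_p_value(values, labels, n_perm=999, seed=0):
    """One-sided empirical p-value obtained by permuting phenotype labels."""
    rng = random.Random(seed)
    observed = mean_difference(values, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_difference(values, shuffled) >= observed:
            exceed += 1
    # Adding 1 to numerator and denominator counts the observed labelling
    # itself, so the estimate can never be exactly zero.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical pathway scores for four cases and four controls.
scores = [2.1, 1.9, 2.4, 2.2, 0.3, 0.1, 0.4, 0.2]
status = [1, 1, 1, 1, 0, 0, 0, 0]
p_value = permutation_p_value(scores, status)
print(p_value)
```

The smallest p-value this estimator can return is 1 / (n_perm + 1), so the number of permutations limits how small an empirical p-value can be. This is one reason running time differs so much between permutation-based methods.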
Raw data-based algorithms, such as those implemented in PLINK, are preferable for GWAS pathway analysis as long as computational capacity is available. It may be worthwhile to apply two or more pathway analysis algorithms on the same GWAS dataset, since the methods differ greatly in their outputs and might provide complementary findings for the studied complex disease.
Acute rheumatic fever is considered to be a heritable condition, but the magnitude of the genetic effect is unknown. The objective of this study was to conduct a systematic review and meta-analysis of twin studies of concordance of acute rheumatic fever in order to derive quantitative estimates of the size of the genetic effect.
We searched PubMed/MEDLINE, ISI Web of Science, EMBASE, and Google Scholar from their inception to 31 January 2011, and bibliographies of retrieved articles, for twin studies of the concordance for acute rheumatic fever or rheumatic heart disease in monozygotic versus dizygotic twins that used accepted diagnostic criteria for acute rheumatic fever and for zygosity, without age, gender or language restrictions. Twin similarity was measured by probandwise concordance rate and odds ratio (OR), and aggregate probandwise concordance risk was calculated by combining raw data from each study. ORs from separate studies were combined by random-effects meta-analysis to evaluate the association between zygosity status and concordance. Heritability was estimated by fitting a variance components model to the data.
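The two similarity measures named above can be sketched for a single study as follows. The code uses the usual probandwise concordance formula, 2C / (2C + D) for C concordant and D discordant pairs, which assumes that both twins in a concordant pair count as probands. It also uses a pairwise odds ratio with a Woolf (log-scale) 95% confidence interval. Whether the review computed its ORs in exactly this way is an assumption, the pair counts are invented for illustration, and the random-effects pooling and variance components steps are not shown.

```python
import math


def probandwise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Probandwise concordance 2C / (2C + D), assuming complete ascertainment."""
    return 2 * concordant_pairs / (2 * concordant_pairs + discordant_pairs)


def concordance_odds_ratio(mz_conc, mz_disc, dz_conc, dz_disc, z=1.96):
    """Odds of concordance in MZ versus DZ pairs, with a Woolf confidence interval."""
    odds_ratio = (mz_conc * dz_disc) / (mz_disc * dz_conc)
    se_log_or = math.sqrt(1 / mz_conc + 1 / mz_disc + 1 / dz_conc + 1 / dz_disc)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for one study (not taken from the review).
mz_rate = probandwise_concordance(concordant_pairs=10, discordant_pairs=20)
dz_rate = probandwise_concordance(concordant_pairs=5, discordant_pairs=45)
or_estimate, ci_low, ci_high = concordance_odds_ratio(10, 20, 5, 45)
print(mz_rate, dz_rate, or_estimate, ci_low, ci_high)
```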
435 twin pairs from six independent studies met the inclusion criteria. The pooled probandwise concordance risk for acute rheumatic fever was 44% in monozygotic twins and 12% in dizygotic twins, and the association between zygosity and concordance was strong (OR 6.39; 95% confidence interval, 3.39 to 12.06; P<0.001), with no significant study heterogeneity (P = 0.768). The estimated heritability across all the studies was 60%.
Acute rheumatic fever is an autoimmune disorder with a high heritability. The discovery of all genetic susceptibility loci through whole genome scanning may provide a clinically useful genetic risk prediction tool for acute rheumatic fever and its sequela, rheumatic heart disease.