1.  Cost effective assay choice for rare disease study designs 
High throughput assays tend to be expensive per subject. Often studies are limited not so much by the number of subjects available as by assay costs, making assay choice a critical issue. We have developed a framework for assay choice that maximises the number of true disease causing mechanisms ‘seen’, given limited resources. Although straightforward, some of the ramifications of our methodology run counter to received wisdom on study design. We illustrate our methodology with examples, and have built a website allowing calculation of quantities of interest to those designing rare disease studies.
PMCID: PMC4334400  PMID: 25648394
WES; WGS; High-throughput assay; Rare disease; Study design
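The budget-limited choice described in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' formulas, so the model below is an assumption: the number of subjects is not limiting, each assay has a fixed cost per subject, and each assay has a fixed probability of 'seeing' the causal mechanism in a subject. The names `Assay`, `expected_mechanisms_seen` and `best_assay`, and all numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Assay:
    name: str
    cost_per_subject: float  # hypothetical cost units
    detection_rate: float    # assumed chance the assay 'sees' a subject's causal mechanism


def expected_mechanisms_seen(assay: Assay, budget: float) -> float:
    """Expected number of mechanisms seen when the whole budget goes to one assay."""
    subjects = int(budget // assay.cost_per_subject)  # subjects the budget can pay for
    return subjects * assay.detection_rate


def best_assay(assays: list[Assay], budget: float) -> Assay:
    """Pick the assay with the highest expected yield under the budget."""
    return max(assays, key=lambda a: expected_mechanisms_seen(a, budget))


# Illustrative values only; these are not taken from the paper.
candidates = [Assay("cheap_assay", 100.0, 0.5), Assay("expensive_assay", 400.0, 0.9)]
choice = best_assay(candidates, budget=2000.0)
```

Under these made-up numbers the cheaper assay wins (20 subjects at 0.5 versus 5 subjects at 0.9), which is the kind of result that can run against the intuition that the most sensitive assay is always the better choice.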
2.  Lumbar disc degeneration is linked to a carbohydrate sulfotransferase 3 variant 
The Journal of Clinical Investigation  2013;123(11):4909-4917.
Lumbar disc degeneration (LDD) is associated with both genetic and environmental factors and affects many people worldwide. A hallmark of LDD is loss of proteoglycan and water content in the nucleus pulposus of intervertebral discs. While some genetic determinants have been reported, the etiology of LDD is largely unknown. Here we report the findings from linkage and association studies on a total of 32,642 subjects consisting of 4,043 LDD cases and 28,599 control subjects. We identified carbohydrate sulfotransferase 3 (CHST3), an enzyme that catalyzes proteoglycan sulfation, as a susceptibility gene for LDD. The strongest genome-wide linkage peak encompassed CHST3 from a Southern Chinese family–based data set, while a genome-wide association was observed at rs4148941 in the gene in a meta-analysis using multiethnic population cohorts. rs4148941 lies within a potential microRNA-513a-5p (miR-513a-5p) binding site. Interaction between miR-513a-5p and mRNA transcribed from the susceptibility allele (A allele) of rs4148941 was enhanced in vitro compared with transcripts from other alleles. Additionally, expression of CHST3 mRNA was significantly reduced in the intervertebral disc cells of human subjects carrying the A allele of rs4148941. Together, our data provide new insights into the etiology of LDD, implicating an interplay between genetic risk factors and miRNA.
PMCID: PMC3809787  PMID: 24216480
3.  Gene Network Analysis of Candidate Loci for Human Anorectal Malformations 
PLoS ONE  2013;8(8):e69142.
Anorectal malformations (ARMs) are birth defects that require surgery and carry significant chronic morbidity. Our earlier genome-wide copy number variation (CNV) study had provided a wealth of candidate loci. To find out whether these candidate loci are related to important developmental pathways, we have performed an extensive literature search coupled with the currently available bioinformatics tools. This has allowed us to assign both genic and non-genic CNVs to interrelated pathways known to govern the development of the anorectal region. We have linked 11 candidate genes to the WNT signalling pathway and 17 genes to the cytoskeletal network. Interestingly, candidate genes with similar functions are disrupted by the same type of CNV. The gene network we discovered provides evidence that rare mutations in different interrelated genes may lead to similar phenotypes, accounting for genetic heterogeneity in ARMs. Classification of patients according to the affected pathway and lesion type should eventually improve the diagnosis and the identification of common genes/molecules as therapeutic targets.
PMCID: PMC3731316  PMID: 23936318
4.  Genetic and Environmental Contributions to General Cognitive Ability Through the First 16 Years of Life 
Developmental psychology  2004;40(5):805-812.
The genetic and environmental contributions to the development of general cognitive ability throughout the first 16 years of life were examined using sibling data from the Colorado Adoption Project. Correlations were analyzed along with structural equation models to characterize the genetic and environmental influences on longitudinal stability and instability. Intraclass correlations reflected both considerable genetic influence at each age and modest shared environmental influence within and across ages. Modeling results suggested that genetic factors mediated phenotypic stability throughout this entire period, whereas most age-to-age instability appeared to be due to nonshared environmental influences.
PMCID: PMC3710702  PMID: 15355167
5.  Genetic Analyses of a Three Generation Family Segregating Hirschsprung Disease and Iris Heterochromia 
PLoS ONE  2013;8(6):e66631.
We present the genetic analyses conducted on a three-generation family (14 individuals) with three members affected with isolated-Hirschsprung disease (HSCR) and one with HSCR and heterochromia iridum (syndromic-HSCR), a phenotype reminiscent of Waardenburg-Shah syndrome (WS4). WS4 is characterized by pigmentary abnormalities of the skin, eyes and/or hair, sensorineural deafness and HSCR. None of the members had sensorineural deafness. The family was screened for copy number variations (CNVs) using Illumina-HumanOmni2.5-Beadchip and for coding sequence mutations in WS4 genes (EDN3, EDNRB, or SOX10) and in the main HSCR gene (RET). Confocal microscopy and immunoblotting were used to assess the functional impact of the mutations. A heterozygous A/G transition in EDNRB was identified in 4 affected and 3 unaffected individuals. While in EDNRB isoforms 1 and 2 (cellular receptor) the transition results in the abolishment of translation initiation (M1V), in isoform 3 (only in the cytosol) the replacement occurs at Met91 (M91V) and is predicted benign. Another heterozygous transition (c.-248G/A; predicted to affect translation efficiency) in the 5′-untranslated region of EDN3 (EDNRB ligand) was detected in all affected individuals but not in healthy carriers of the EDNRB mutation. Also, a de novo CNV encompassing DACH1 was identified in the patient with heterochromia iridum and HSCR.
Since the EDNRB and EDN3 variants only coexist in affected individuals, HSCR could be due to the joint effect of mutations in genes of the same pathway. Iris heterochromia could be due to an independent genetic event and would account for the additional phenotype within the family.
PMCID: PMC3694150  PMID: 23840513
6.  Utility of the trnH–psbA Intergenic Spacer Region and Its Combinations as Plant DNA Barcodes: A Meta-Analysis 
PLoS ONE  2012;7(11):e48833.
The trnH–psbA intergenic spacer region has been used in many DNA barcoding studies. However, a comprehensive evaluation with rigorous sequence preprocessing and statistical testing on the utility of trnH–psbA and its combinations as DNA barcodes is lacking.
Methodology/Principal Findings
Sequences were searched from GenBank for a meta-analysis on the usefulness of trnH–psbA and its combinations as DNA barcodes. After preprocessing, we constructed full and matching data sets that contained 17 983 trnH–psbA sequences and 2190 sets of trnH–psbA, matK, rbcL, and ITS2 sequences from the same sample, respectively. These datasets were used to analyze the ability of trnH–psbA and its combinations to discriminate species by the BLAST and BLAST+P methods. The Fisher's exact test was used to evaluate the significance of performance differences. For the full data set, the identification success rates of trnH–psbA exceeded 70% in 18 families and 12 genera, respectively. For the matching data set, the identification rates of trnH–psbA were significantly higher than those of the other loci in two families and four genera. Similarly, the identification rates of trnH–psbA+ITS2 were significantly higher than those of matK+rbcL in 18 families and 21 genera.
This study provides valuable information on the higher utility of trnH–psbA and its combinations. We found that trnH–psbA+ITS2 combination performs better or equally well compared with other combinations in most taxonomic groups investigated. This information will guide the optimal usage of trnH–psbA and its combinations for species identification.
PMCID: PMC3498263  PMID: 23155412
7.  Correction: A Genome-Wide Linkage and Association Scan Reveals Novel Loci for Hypertension and Blood Pressure Traits 
PLoS ONE  2012;7(6):10.1371/annotation/4415f88f-ab10-44dd-8ba9-1a57ade740c1.
PMCID: PMC3371059
9.  Homozygosity mapping on a single patient--identification of homozygous regions of recent common ancestry by using population data 
Human Mutation  2011;32(3):345-353.
Homozygosity mapping has played an important role in detecting recessive mutations using families of consanguineous marriages. However, detection of homozygous regions identity by descent (HBD) when family data is not available, or when relationship is hidden, is still a challenge. Making use of population data from high-density SNP genotyping may allow detection of regions HBD from recent common founders in singleton patients without genealogy information. We report a novel algorithm that detects such regions by estimating the population haplotype frequencies (HF) for an entire homozygous region. We also developed a simulation method to evaluate the probability of HBD for a homozygous region by examining the best regions in unaffected controls from the host population. The method can be applied to diseases of Mendelian inheritance and can be further extended to complex diseases to detect rare founder mutations using multiplex families or sporadic cases. Testing of the method on both real cases (singleton affected) and simulated data demonstrated its superb sensitivity and great resistance to genetic heterogeneity.
PMCID: PMC3357498  PMID: 21309031
homozygosity mapping; recessive mutation; founder mutation; runs of homozygosity; hidden relationship
10.  Genome-Wide Copy Number Analysis Uncovers a New HSCR Gene: NRG3 
PLoS Genetics  2012;8(5):e1002687.
Hirschsprung disease (HSCR) is a congenital disorder characterized by aganglionosis of the distal intestine. To assess the contribution of copy number variants (CNVs) to HSCR, we analysed the data generated from our previous genome-wide association study on HSCR patients, whereby we identified NRG1 as a new HSCR susceptibility locus. Analysis of 129 Chinese patients and 331 ethnically matched controls showed that HSCR patients have a greater burden of rare CNVs (p = 1.50×10−5), particularly for those encompassing genes (p = 5.00×10−6). Our study identified 246 rare-genic CNVs exclusive to patients. Among those, we detected a NRG3 deletion (p = 1.64×10−3). Subsequent follow-up (96 additional patients and 220 controls) on NRG3 revealed 9 deletions (combined p = 3.36×10−5) and 2 de novo duplications among patients and two deletions among controls. Importantly, NRG3 is a paralog of NRG1. Stratification of patients by presence/absence of HSCR–associated syndromes showed that while syndromic–HSCR patients carried significantly longer CNVs than the non-syndromic or controls (p = 1.50×10−5), non-syndromic patients were enriched in CNV number when compared to controls (p = 4.00×10−6) or the syndromic counterpart. Our results suggest a role for NRG3 in HSCR etiology and provide insights into the relative contribution of structural variants in both syndromic and non-syndromic HSCR. This would be the first genome-wide catalog of copy number variants identified in HSCR.
Author Summary
Copy number variations (CNVs) are significant genetic risk factors in disease pathogenesis and represent an important portion of missing heritability for some human diseases, making their discovery essential for the identification of genes and risk factors for a wide range of diseases, including Hirschsprung disease (HSCR, congenital colon aganglionosis). Since the discovery of the major HSCR gene, RET, a number of rare mutations have been reported in RET and other genes involved in the development of the enteric nervous system. However, these mutations contribute to only a small proportion of the disease susceptibility. Taking advantage of the recent technical and methodological advances, we have examined the contribution of CNVs to the disease. We have found that HSCR patients are enriched with CNVs encompassing genes. In particular, we found that deletions of NRG3, a paralog of the previously identified HSCR–susceptibility gene NRG1, were associated with the HSCR phenotype.
PMCID: PMC3349728  PMID: 22589734
11.  Identification of IGF1, SLC4A4, WWOX, and SFMBT1 as Hypertension Susceptibility Genes in Han Chinese with a Genome-Wide Gene-Based Association Study 
PLoS ONE  2012;7(3):e32907.
Hypertension is a complex disorder with high prevalence rates all over the world. We conducted the first genome-wide gene-based association scan for hypertension in a Han Chinese population. By analyzing genome-wide single-nucleotide-polymorphism data of 400 matched pairs of young-onset hypertensive patients and normotensive controls genotyped with the Illumina HumanHap550-Duo BeadChip, 100 susceptibility genes for hypertension were identified and also validated with permutation tests. Seventeen of the 100 genes exhibited differential allelic and expression distributions between patient and control groups. These genes provided a good molecular signature for classifying hypertensive patients and normotensive controls. Among the 17 genes, IGF1, SLC4A4, WWOX, and SFMBT1 were not only identified by our gene-based association scan and gene expression analysis but were also replicated by a gene-based association analysis of the Hong Kong Hypertension Study. Moreover, cis-acting expression quantitative trait loci associated with the differentially expressed genes were found and linked to hypertension. IGF1, which encodes insulin-like growth factor 1, is associated with cardiovascular disorders, metabolic syndrome, decreased body weight/size, and changes of insulin levels in mice. SLC4A4, which encodes the electrogenic sodium bicarbonate cotransporter 1, is associated with decreased body weight/size and abnormal ion homeostasis in mice. WWOX, which encodes the WW domain-containing protein, is related to hypoglycemia and hyperphosphatemia. SFMBT1, which encodes the scm-like with four MBT domains protein 1, is a novel hypertension gene. GRB14, TMEM56 and KIAA1797 exhibited highly significant differential allelic and expressed distributions between hypertensive patients and normotensive controls. GRB14 was also found relevant to blood pressure in a previous genetic association study in East Asian populations. 
TMEM56 and KIAA1797 may be specific to Taiwanese populations, because they were not validated by the two replication studies. Identification of these genes enriches the collection of hypertension susceptibility genes, thereby shedding light on the etiology of hypertension in Han Chinese populations.
PMCID: PMC3315540  PMID: 22479346
13.  A Genome-Wide Linkage and Association Scan Reveals Novel Loci for Hypertension and Blood Pressure Traits 
PLoS ONE  2012;7(2):e31489.
Hypertension is caused by the interaction of environmental and genetic factors. The condition, which is very common, with about 18% of the adult Hong Kong Chinese population and over 50% of older individuals affected, is responsible for considerable morbidity and mortality. To identify genes influencing hypertension and blood pressure, we conducted a combined linkage and association study using over 500,000 single nucleotide polymorphisms (SNPs) genotyped in 328 individuals comprising 111 hypertensive probands and their siblings. Using a family-based association test, we found an association with SNPs on chromosome 5q31.1 (rs6596140; P<9×10−8) for hypertension. One candidate gene, PDC, was replicated, with rs3817586 on 1q31.1 attaining P = 2.5×10−4 and 2.9×10−5 in the within-family tests for DBP and MAP, respectively. We also identified regions of significant linkage for systolic and diastolic blood pressure on chromosomes 2q22 and 5p13, respectively. Further family-based association analysis of the linkage peak on chromosome 5 yielded a significant association (rs1605685, P<7×10−5) for DBP. This is the first combined linkage and association study of hypertension and its related quantitative traits with Chinese ancestry. The associations reported here account for the action of common variants whereas the discovery of linkage regions may point to novel targets for rare variant screening.
PMCID: PMC3286457  PMID: 22384028
14.  RET Mutational Spectrum in Hirschsprung Disease: Evaluation of 601 Chinese Patients 
PLoS ONE  2011;6(12):e28986.
Rare (RVs) and common variants of the RET gene contribute to Hirschsprung disease (HSCR; congenital aganglionosis). While RET common variants are strongly associated with the commonest manifestation of the disease (males; short-segment aganglionosis; sporadic), rare coding sequence (CDS) variants are more frequently found in the less common and more severe forms of the disease (females; long/total colonic aganglionosis; familial).
Here we present the screening for RVs in the RET CDS and intron/exon boundaries of 601 Chinese HSCR patients, the largest number of patients ever reported. We identified 61 different heterozygous RVs (50 novel) distributed among 100 patients (16.64%). Those include 14 silent, 29 missense, 5 nonsense, 4 frame-shifts, and one in-frame amino-acid deletion in the CDS, two splice-site deletions, 4 nucleotide substitutions and a 22-bp deletion in the intron/exon boundaries and 1 single-nucleotide substitution in the 5′ untranslated region. Exonic variants were mainly clustered in the RET extracellular domain. RET RVs were more frequent among patients with the most severe phenotype (24% vs. 15% in short-HSCR). Phasing RVs with the RET HSCR-associated haplotype suggests that RVs do not underlie the undisputable association of RET common variants with HSCR. None of the variants were found in 250 Chinese controls.
PMCID: PMC3235168  PMID: 22174939
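The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes one reading of the variant list (the two splice-site deletions, the four substitutions and the 22-bp deletion are counted as separate classes); the dictionary keys are labels chosen here, not terms from the paper.

```python
# Rare-variant counts by class, as listed in the abstract (grouping is an assumed reading).
variant_counts = {
    "silent": 14,
    "missense": 29,
    "nonsense": 5,
    "frame-shift": 4,
    "in-frame amino-acid deletion": 1,
    "splice-site deletion": 2,
    "intron/exon boundary substitution": 4,
    "22-bp boundary deletion": 1,
    "5' UTR substitution": 1,
}

total_variants = sum(variant_counts.values())

# Share of the 601 screened patients who carry at least one rare variant.
carrier_percent = round(100 * 100 / 601, 2)
```

Under this reading the classes sum to the 61 variants reported, and 100 carriers out of 601 patients gives the stated 16.64%.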
15.  Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets 
Human Genetics  2011;131(5):747-756.
Current genome-wide association studies (GWAS) use commercial genotyping microarrays that can assay over a million single nucleotide polymorphisms (SNPs). The number of SNPs is further boosted by advanced statistical genotype-imputation algorithms and large SNP databases for reference human populations. The testing of a huge number of SNPs needs to be taken into account in the interpretation of statistical significance in such genome-wide studies, but this is complicated by the non-independence of SNPs because of linkage disequilibrium (LD). Several previous groups have proposed the use of the effective number of independent markers (Me) for the adjustment of multiple testing, but current methods of calculation for Me are limited in accuracy or computational speed. Here, we report a more robust and fast method to calculate Me. Applying this efficient method [implemented in a free software tool named Genetic type 1 error calculator (GEC)], we systematically examined the Me, and the corresponding p-value thresholds required to control the genome-wide type 1 error rate at 0.05, for 13 Illumina or Affymetrix genotyping arrays, as well as for HapMap Project and 1000 Genomes Project datasets which are widely used in genotype imputation as reference panels. Our results suggested the use of a p-value threshold of ~10−7 as the criterion for genome-wide significance for early commercial genotyping arrays, but slightly more stringent p-value thresholds ~5 × 10−8 for current or merged commercial genotyping arrays, ~10−8 for all common SNPs in the 1000 Genomes Project dataset and ~5 × 10−8 for the common SNPs only within genes.
Electronic supplementary material
The online version of this article (doi:10.1007/s00439-011-1118-2) contains supplementary material, which is available to authorized users.
PMCID: PMC3325408  PMID: 22143225
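The link between an effective number of independent tests (Me) and a p-value threshold can be sketched as below. The abstract does not say how GEC converts Me into a threshold, so the Bonferroni-style division used here is an assumption, and the function names are hypothetical.

```python
def significance_threshold(effective_tests: float, alpha: float = 0.05) -> float:
    """Per-test p-value threshold for a family-wise error rate of alpha,
    assuming a Bonferroni-style correction over Me effective tests."""
    if effective_tests <= 0:
        raise ValueError("effective_tests must be positive")
    return alpha / effective_tests


def implied_effective_tests(threshold: float, alpha: float = 0.05) -> float:
    """Invert the same relation: the Me implied by a given threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# One million effective tests corresponds to a threshold of 5e-8 at alpha = 0.05.
gwas_threshold = significance_threshold(1_000_000)
```

Read the other way, a threshold of about 10−7 would correspond to roughly 500,000 effective tests under this assumption.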
16.  Comparisons of seven algorithms for pathway analysis using the WTCCC Crohn's Disease dataset 
BMC Research Notes  2011;4:386.
Though rooted in genomic expression studies, pathway analysis for genome-wide association studies (GWAS) has gained increasing popularity, since it has the potential to discover hidden disease pathogenic mechanisms by combining statistical methods with biological knowledge. Generally, algorithms or programs proposed recently can be categorized by different types of input data, null hypothesis or counts of analysis stages. Due to complexity caused by SNP, gene and pathway relationships, re-sampling strategies like permutation are always utilized to derive an empirical distribution for test statistics for evaluating the significance of candidate pathways. However, evaluation of these algorithms on real GWAS datasets and real biological pathway databases needs to be addressed before we apply them widely with confidence.
Two algorithms which use summary statistics from GWAS as input were implemented in KGG, a novel and user-friendly software tool for GWAS pathway analysis. Comparisons of these two algorithms as well as the other five selected algorithms were conducted by analyzing the WTCCC Crohn's Disease dataset utilizing the MsigDB canonical pathways. As a result of using permutation to obtain empirical p-value, most of these methods could control Type I error rate well, although some are conservative. However, the methods varied greatly in terms of power and running time, with the PLINK truncated set-based test being the most powerful and KGG being the fastest.
Raw data-based algorithms, such as those implemented in PLINK, are preferable for GWAS pathway analysis as long as computational capacity is available. It may be worthwhile to apply two or more pathway analysis algorithms on the same GWAS dataset, since the methods differ greatly in their outputs and might provide complementary findings for the studied complex disease.
PMCID: PMC3199264  PMID: 21981765
17.  Hedgehog/Notch-induced premature gliogenesis represents a new disease mechanism for Hirschsprung disease in mice and humans 
The Journal of Clinical Investigation  2011;121(9):3467-3478.
Hirschsprung (HSCR) disease is a complex genetic disorder attributed to a failure of the enteric neural crest cells (ENCCs) to form ganglia in the hindgut. Hedgehog and Notch are implicated in mediating proliferation and differentiation of ENCCs. Nevertheless, how these signaling molecules may interact to mediate gut colonization by ENCCs and contribute to a primary etiology for HSCR is not known. Here, we report our pathway-based epistasis analysis of data generated by a genome-wide association study on HSCR disease, which indicates that specific genotype constellations of Patched (PTCH1) (which encodes a receptor for Hedgehog) and delta-like 3 (DLL3) (which encodes a receptor for Notch) SNPs confer higher risk to HSCR. Importantly, deletion of Ptch1 in mouse ENCCs induced robust Dll1 expression and activation of the Notch pathway, leading to premature gliogenesis and reduction of ENCC progenitors in mutant bowels. Dll1 integrated Hedgehog and Notch pathways to coordinate neuronal and glial cell differentiation during enteric nervous system development. In addition, Hedgehog-mediated gliogenesis was found to be highly conserved, such that Hedgehog was consistently able to promote gliogenesis of human neural crest–related precursors. Collectively, we defined PTCH1 and DLL3 as HSCR susceptibility genes and suggest that Hedgehog/Notch-induced premature gliogenesis may represent a new disease mechanism for HSCR.
PMCID: PMC3163945  PMID: 21841314
18.  Using Glycosylated Hemoglobin to Define the Metabolic Syndrome in United States Adults 
Diabetes Care  2010;33(8):1856-1858.
To compare the use of GHb and fasting plasma glucose (FPG) to define the metabolic syndrome (MetS).
Data from the U.S. National Health and Nutrition Examination Survey 1999–2006 were used. MetS was defined using the consensus criteria in 2009. Raised blood glucose was defined as either FPG ≥100 mg/dl (5.6 mmol/l) or GHb ≥5.7%.
In 2003–2006, there was 91.3% agreement between GHb and FPG when either was used to define MetS. The agreement was good irrespective of age, sex, race/ethnicity, BMI, and diabetes status (≥87.4%). Similar results were found in 1999–2002. Among subjects without diabetes, only the use of GHb alone, but not FPG, resulted in significant association with cardiovascular diseases (odds ratio 1.45, P = 0.005).
Using GHb instead of FPG to define MetS is feasible. It also identifies individuals with increased cardiovascular risk.
PMCID: PMC2909078  PMID: 20504895
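The two glucose criteria and the agreement figure in this abstract can be expressed as a short sketch. The cut-offs come from the abstract. The conversion factor (glucose molar mass of about 180.16 g/mol), the function names, and the definition of agreement as the share of subjects classified the same way by both criteria are assumptions.

```python
MG_DL_PER_MMOL_L = 18.016  # assumed: glucose molar mass ~180.16 g/mol


def fpg_to_mmol_l(fpg_mg_dl: float) -> float:
    """Convert fasting plasma glucose from mg/dl to mmol/l."""
    return fpg_mg_dl / MG_DL_PER_MMOL_L


def raised_glucose_by_fpg(fpg_mg_dl: float) -> bool:
    """Raised blood glucose by the FPG criterion (>= 100 mg/dl)."""
    return fpg_mg_dl >= 100.0


def raised_glucose_by_ghb(ghb_percent: float) -> bool:
    """Raised blood glucose by the GHb criterion (>= 5.7%)."""
    return ghb_percent >= 5.7


def percent_agreement(first: list[bool], second: list[bool]) -> float:
    """Percentage of subjects given the same classification by both criteria."""
    if len(first) != len(second) or not first:
        raise ValueError("inputs must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(first, second))
    return 100.0 * same / len(first)
```

With this conversion, 100 mg/dl rounds to the 5.6 mmol/l quoted in the abstract.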
19.  Genome-wide association study identifies a susceptibility locus for biliary atresia on 10q24.2 
Human Molecular Genetics  2010;19(14):2917-2925.
Biliary atresia (BA) is characterized by the progressive fibrosclerosing obliteration of the extrahepatic biliary system during the first few weeks of life. Despite early diagnosis and prompt surgical intervention, the disease progresses to cirrhosis in many patients. The current theory for the pathogenesis of BA proposes that during the perinatal period, a still unknown exogenous factor meets the innate immune system of a genetically predisposed individual and induces an uncontrollable and potentially self-limiting immune response, which becomes manifest in liver fibrosis and atresia of the extrahepatic bile ducts. Genetic factors that could account for the disease, let alone for its high incidence in Chinese, are to be investigated. To identify BA susceptibility loci, we carried out a genome-wide association study (GWAS) using the Affymetrix 5.0 and 500 K marker sets. We genotyped nearly 500 000 single-nucleotide polymorphisms (SNPs) in 200 Chinese BA patients and 481 ethnically matched control subjects. The 10 most BA-associated SNPs from the GWAS were genotyped in an independent set of 124 BA and 90 control subjects. The strongest overall association was found for rs17095355 on 10q24, downstream XPNPEP1, a gene involved in the metabolism of inflammatory mediators. Allelic chi-square test P-value for the meta-analysis of the GWAS and replication results was 6.94 × 10−9. The identification of putative BA susceptibility loci not only opens new fields of investigation into the mechanisms underlying BA but may also provide new clues for the development of preventive and curative strategies.
PMCID: PMC2893814  PMID: 20460270
20.  Fine Mapping of the NRG1 Hirschsprung's Disease Locus 
PLoS ONE  2011;6(1):e16181.
The primary pathology of Hirschsprung's disease (HSCR, colon aganglionosis) is the absence of ganglia in variable lengths of the hindgut, resulting in functional obstruction. HSCR is attributed to a failure of migration of the enteric ganglion precursors along the developing gut. RET is a key regulator of the development of the enteric nervous system (ENS) and the major HSCR-causing gene. Yet the reduced penetrance of RET DNA HSCR-associated variants together with the phenotypic variability suggest the involvement of additional genes in the disease. Through a genome-wide association study, we uncovered a ∼350 kb HSCR-associated region encompassing part of the neuregulin-1 gene (NRG1). To identify the causal NRG1 variants contributing to HSCR, we genotyped 243 SNP variants on 343 ethnic Chinese HSCR patients and 359 controls. Genotype analysis coupled with imputation narrowed down the HSCR-associated region to 21 kb, with four of the most associated SNPs (rs10088313, rs10094655, rs4624987, and rs3884552) mapping to the NRG1 promoter. We investigated whether there was correlation between the genotype at the rs10088313 locus and the amount of NRG1 expressed in human gut tissues (40 patients and 21 controls) and found differences in expression as a function of genotype. We also found significant differences in NRG1 expression levels between diseased and control individuals bearing the same rs10088313 risk genotype. This indicates that the effects of NRG1 common variants are likely to depend on other alleles or epigenetic factors present in the patients and would account for the variability in the genetic predisposition to HSCR.
PMCID: PMC3024406  PMID: 21283760
21.  A Knowledge-Based Weighting Framework to Boost the Power of Genome-Wide Association Studies 
PLoS ONE  2010;5(12):e14480.
We are moving to second-wave analysis of genome-wide association studies (GWAS), characterized by comprehensive bioinformatical and statistical evaluation of genetic associations. Existing biological knowledge is very valuable for GWAS and may help improve their detection power, particularly for disease susceptibility loci of moderate effect size. However, a challenging question is how to utilize available resources that are very heterogeneous to quantitatively evaluate statistical significance.
Methodology/Principal Findings
We present a novel knowledge-based weighting framework to boost power of the GWAS and insightfully strengthen their explorative performance for follow-up replication and deep sequencing. Built upon diverse integrated biological knowledge, this framework directly models both the prior functional information and the association significances emerging from GWAS to optimally highlight single nucleotide polymorphisms (SNPs) for subsequent replication. In the theoretical calculation and computer simulation, it shows great potential to achieve over 15% extra power to identify an association signal of moderate strength, or to approach similar power with hundreds fewer whole-genome subjects. In a case study on late-onset Alzheimer disease (LOAD) for a proof of principle, it highlighted some genes, which showed positive association with LOAD in previous independent studies, and two important LOAD related pathways. These genes and pathways could originally have been overlooked because the SNPs involved had only moderate association significance.
With user-friendly implementation in an open-source Java package, this powerful framework will provide an important complementary solution to identify more true susceptibility loci with modest or even small effect size in current GWAS for complex diseases.
PMCID: PMC3013112  PMID: 21217833
22.  European Bone Mineral Density Loci Are Also Associated with BMD in East-Asian Populations 
PLoS ONE  2010;5(10):e13217.
Most genome-wide association (GWA) studies have focused on populations of European ancestry with limited assessment of the influence of the sequence variants on populations of other ethnicities. To determine whether markers that we have recently shown to associate with Bone Mineral Density (BMD) in Europeans also associate with BMD in East-Asians we analysed 50 markers from 23 genomic loci in samples from Korea (n = 1,397) and two Chinese Hong Kong sample sets (n = 3,869 and n = 785). Through this effort we identified fourteen loci that associated with BMD in East-Asian samples using a false discovery rate (FDR) of 0.05; 1p36 (ZBTB40, P = 4.3×10−9), 1p31 (GPR177, P = 0.00012), 3p22 (CTNNB1, P = 0.00013), 4q22 (MEPE, P = 0.0026), 5q14 (MEF2C, P = 1.3×10−5), 6q25 (ESR1, P = 0.0011), 7p14 (STARD3NL, P = 0.00025), 7q21 (FLJ42280, P = 0.00017), 8q24 (TNFRSF11B, P = 3.4×10−5), 11p15 (SOX6, P = 0.00033), 11q13 (LRP5, P = 0.0033), 13q14 (TNFSF11, P = 7.5×10−5), 16q24 (FOXL1, P = 0.0010) and 17q21 (SOST, P = 0.015). Our study marks an early effort towards the challenge of cataloguing bone density variants shared by many ethnicities by testing BMD variants that have been established in Europeans, in East-Asians.
PMCID: PMC2951352  PMID: 20949110
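Selecting loci at a false discovery rate of 0.05 can be sketched as follows. The abstract does not name the FDR procedure used, so the Benjamini-Hochberg step-up method below is an assumption, and the function name is hypothetical. The p-values in the usage line are illustrative only and are not the study's data.

```python
def benjamini_hochberg(p_values: list[float], fdr: float = 0.05) -> list[bool]:
    """Return, for each p-value, whether it is declared significant
    by the Benjamini-Hochberg step-up procedure at the given FDR."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= fdr * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    # Everything ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Illustrative p-values, not taken from the study.
flags = benjamini_hochberg([0.001, 0.5, 0.04, 0.03])
```

In a step-up procedure every p-value ranked below the cutoff is declared significant, even if it individually exceeded its own rank-specific bound.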
23.  Haplotype Analysis Reveals a Possible Founder Effect of RET Mutation R114H for Hirschsprung's Disease in the Chinese Population 
PLoS ONE  2010;5(6):e10918.
Hirschsprung's disease (HSCR) is a congenital disorder associated with the lack of intramural ganglion cells in the myenteric and sub-mucosal plexuses along varying segments of the gastrointestinal tract. The RET gene is the major gene implicated in this gastrointestinal disease. A highly recurrent mutation in RET (RETR114H) has recently been identified in ∼6–7% of the Chinese HSCR patients which, to date, has not been found in Caucasian patients or controls nor in Chinese controls. Due to the high frequency of RETR114H in this population, we sought to investigate whether this mutation may be a founder HSCR mutation in the Chinese population.
Methodology and Principal Findings
To test whether all RETR114H mutations originated from a single mutational event, we estimated the approximate age of RETR114H by applying a Bayesian method to RET SNPs genotyped in 430 Chinese HSCR patients (of whom 25 had the mutation); the estimated age was between 4 and 23 generations, depending on growth rate. We reasoned that if RETR114H were a founder mutation, then those with the mutation would share a haplotype on which the mutation resides. Including SNPs spanning 509.31 kb across RET from a recently obtained 500 K genome-wide dataset for a subset of 181 patients (14 RETR114H patients), we applied haplotype estimation methods to determine whether there were any segments shared between patients with RETR114H that are not present in those without the mutation or in controls. The analysis yielded a 250.2 kb (51 SNP) shared segment over the RET gene (and downstream) only in patients with the mutation, with no similar segments found among other patients.
This suggests that RETR114H is a founder mutation for HSCR in the Chinese population.
PMCID: PMC2880000  PMID: 20532249
24.  A Genome-Wide Scan for Loci Influencing Adolescent Cannabis Dependence Symptoms: Evidence for Linkage on Chromosomes 3 and 9 
Drug and alcohol dependence  2006;89(1):34-41.
Cannabis is the most frequently abused illicit substance among adolescents and young adults. Genetic risk factors account for part of the variation in the development of Cannabis Dependence symptoms; however, no linkage studies have been performed for Cannabis Dependence symptoms. This study aimed to identify loci influencing these symptoms.
324 sibling pairs from 192 families were assessed for Cannabis Dependence symptoms. Probands (13-19 years of age) were recruited from consecutive admissions to substance abuse treatment facilities. The siblings of the probands ranged in age from 12-25 years. A community-based sample of 4843 adolescents and young adults was utilized to define an age- and sex-corrected index of Cannabis Dependence vulnerability. DSM-IV Cannabis Dependence symptoms were assessed in youth and their family members with the Composite International Diagnostic Instrument - Substance Abuse Module. Siblings and parents were genotyped for 374 microsatellite markers distributed across the 22 autosomes (average inter-marker distance = 9.2 cM). Cannabis Dependence symptoms were analyzed using Merlin-regress, a regression-based method that is robust to sample selection.
Evidence for suggestive linkage was found on chromosome 3q21 near marker D3S1267 (LOD = 2.61), and on chromosome 9q34 near marker D9S1826 (LOD = 2.57).
This is the first reported linkage study of cannabis dependence symptoms. Linkage regions for illicit substance dependence have previously been reported near 3q21, suggesting that this region may contain a quantitative trait locus influencing cannabis dependence and other substance use disorders.
PMCID: PMC1892279  PMID: 17169504
genetics; Cannabis; antisocial behavior; adolescence; linkage study
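To put the LOD scores reported in entry 24 (2.61 and 2.57) on a more familiar scale, the sketch below converts a LOD score to an approximate pointwise p-value. It assumes the usual asymptotic relationship in which 2·ln(10)·LOD follows a chi-square distribution with 1 degree of freedom and the test is one-sided; whether this approximation holds exactly for Merlin-regress output is an assumption, and the function name is hypothetical.

```python
import math


def lod_to_p_value(lod):
    """Approximate one-sided pointwise p-value for a LOD score.

    Assumption: 2 * ln(10) * LOD is asymptotically chi-square with 1 df,
    and the linkage test is one-sided (hence the factor 0.5).
    """
    chi2 = 2.0 * math.log(10.0) * lod
    # For 1 df, P(chi2_1 > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


for lod in (2.61, 2.57):
    print(f"LOD {lod:.2f} -> approximate pointwise p = {lod_to_p_value(lod):.2e}")
```

These are pointwise values and do not account for testing many markers across the genome, which is why linkage studies use stricter genome-wide thresholds.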
25.  Homozygosity Mapping on a Single Patient—Identification of Homozygous Regions of Recent Common Ancestry by Using Population Data 
Human Mutation  2011;32(3):345-353.
Homozygosity mapping has played an important role in detecting recessive mutations using families of consanguineous marriages. However, detection of regions identical and homozygous by descent (HBD) when family data are not available, or when relationships are unknown, is still a challenge. Making use of population data from high-density SNP genotyping may allow detection of regions HBD from recent common founders in singleton patients without genealogy information. We report a novel algorithm that detects such regions by estimating the population haplotype frequencies (HF) for an entire homozygous region. We also developed a simulation method to evaluate the probability of HBD and linkage to disease for a homozygous region by examining the best regions in unaffected controls from the host population. The method can be applied to diseases of Mendelian inheritance but can also be extended to complex diseases to detect rare founder mutations that affect a very small number of patients using either multiplex families or sporadic cases. Testing of the method on both real cases (singleton affected) and simulated data demonstrated its superb sensitivity and robustness under genetic heterogeneity.
PMCID: PMC3357498  PMID: 21309031
homozygosity mapping; recessive mutation; founder mutation; rare variants; population-based linkage
