Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with or without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of the large number of genomic alterations detected in cancer genomes.
Recent genome-wide association studies (GWAS) have identified several novel single nucleotide polymorphisms (SNPs) associated with type 2 diabetes (T2D). Various models using clinical and/or genetic risk factors have been developed for T2D risk prediction. However, analysis considering algorithms for genetic risk factor detection and regression methods for model construction in combination with interactions of risk factors has not been investigated. Here, using genotype data of 7,360 Japanese individuals, we investigated risk prediction models, considering the algorithms, regression methods and interactions. The best model identified was based on a Bayes factor approach and the lasso method. Using nine SNPs and clinical factors, this method achieved an area under a receiver operating characteristic curve (AUC) of 0.8057 on an independent test set. With the addition of a pair of interaction factors, the model was further improved (p-value 0.0011, AUC 0.8085). Application of our model to prospective cohort data showed significantly better outcome in disease-free survival, according to the log-rank trend test comparing Kaplan-Meier survival curves (). While the major contribution was from clinical factors rather than the genetic factors, consideration of genetic risk factors contributed to an observable, though small, increase in predictive ability. This is the first report to apply risk prediction models constructed from GWAS data to a T2D prospective cohort. Our study shows our model to be effective in prospective prediction and has the potential to contribute to practical clinical use in T2D.
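The AUC values reported above can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control. The following is a minimal sketch of that rank-based definition, not the authors' pipeline; the function name and the toy scores are hypothetical and only illustrate how such a figure is computed.

```python
def auc(case_scores, control_scores):
    """Rank-based AUC: the fraction of (case, control) pairs in which the
    case has the higher score, counting ties as half a win."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores, for illustration only (not study data).
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]
print(round(auc(cases, controls), 3))  # 11 of 12 pairs are ordered correctly
```

An AUC of 0.5 corresponds to a model that ranks cases and controls no better than chance, which is why small gains such as 0.8057 to 0.8085 are assessed with a formal test rather than by eye.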
Hepatitis C virus (HCV) establishes a chronic infection in 70-80% of infected individuals. Many researchers have examined the effect of human leukocyte antigen (HLA) on viral persistence because of its critical role in the immune response against exposure to HCV, but almost all studies have proven to be inconclusive. To identify genetic risk factors for chronic HCV infection, we analyzed 458,207 single nucleotide polymorphisms (SNPs) in 481 chronic HCV patients and 2,963 controls in a Japanese cohort. Next, we performed a replication study with an independent panel of 4,358 cases and 1,114 controls. We further confirmed the association in 1,379 cases and 25,817 controls. In the GWAS phase, we found 17 SNPs that showed suggestive association (P < 1 × 10−5). After the first replication study, we found one intronic SNP in the HLA-DQ locus associated with chronic HCV infection, and when we combined the two studies, the association reached the level of genome-wide significance. In the second replication study, we again confirmed the association (Pcombined = 3.59 × 10−16, odds ratio [OR] = 0.79). Subsequent analysis revealed another SNP, rs1130380, with a stronger association (OR = 0.72). This nucleotide substitution causes an amino acid substitution (R55P) in the HLA-DQB1 protein specific to the DQB1*03 allele, which is common worldwide. In addition, we confirmed an association with the previously reported IFNL3-IFNL4 locus and propose that the effect of DQB1*03 on HCV persistence might be affected by the IFNL4 polymorphism. Our findings suggest that a common amino acid substitution in HLA-DQB1 affects susceptibility to chronic infection with HCV in the Japanese population and may not be independent of the IFNL4 genotype.
Obesity is a disorder with complex genetic etiology, and its epidemic is a worldwide problem. Although multiple genetic loci associated with body mass index (BMI), the most common measure of obesity, have been identified in European populations, few studies have focused on Asian populations. Here, we report a genome-wide association study (GWAS) and replication studies with 62,245 East Asian subjects, which identified two novel BMI-associated loci in the CDKAL1 locus at 6p22 (rs2206734, P = 1.4 × 10−11) and the KLF9 locus at 9q21 (rs11142387, P = 1.3 × 10−9), as well as previously reported loci (the SEC16B, BDNF, FTO, MC4R, and GIPR loci; P < 5.0 × 10−8). We subsequently performed gene–gene interaction analysis and identified an interaction (P = 2.0 × 10−8) between SNPs in the KLF9 locus (rs11142387) and the GDF8 locus at 2q32 (rs13034723). These findings should provide useful insights into the etiology of obesity.
The histone methyltransferase enhancer of zeste 2 (EZH2) is known to be a polycomb protein homologous to Drosophila enhancer of zeste and catalyzes the addition of methyl groups to histone H3 at lysine 27 (H3K27). We previously reported that EZH2 was overexpressed in various types of cancer and plays a crucial role in the cell cycle regulation of cancer cells. In the present study, we demonstrated that EZH2 has the function to monomethylate lysine 120 on histone H2B (H2BK120). EZH2-dependent H2BK120 methylation in cancer cells was confirmed with an H2BK120 methylation-specific antibody. Overexpression of EZH2 significantly attenuated the ubiquitination of H2BK120, a key posttranslational modification of histones for transcriptional regulation. Concordantly, knockdown of EZH2 increased the ubiquitination level of H2BK120, suggesting that the methylation of H2BK120 by EZH2 may competitively inhibit the ubiquitination of H2BK120. Subsequent chromatin immunoprecipitation-Seq and microarray analyses identified downstream candidate genes regulated by EZH2 through the methylation of H2BK120. This is the first report to describe a novel substrate of EZH2, H2BK120, unveiling a new aspect of EZH2 functions in human carcinogenesis.
Lumbar disc degeneration (LDD) is associated with both genetic and environmental factors and affects many people worldwide. A hallmark of LDD is loss of proteoglycan and water content in the nucleus pulposus of intervertebral discs. While some genetic determinants have been reported, the etiology of LDD is largely unknown. Here we report the findings from linkage and association studies on a total of 32,642 subjects consisting of 4,043 LDD cases and 28,599 control subjects. We identified carbohydrate sulfotransferase 3 (CHST3), an enzyme that catalyzes proteoglycan sulfation, as a susceptibility gene for LDD. The strongest genome-wide linkage peak encompassed CHST3 from a Southern Chinese family–based data set, while a genome-wide association was observed at rs4148941 in the gene in a meta-analysis using multiethnic population cohorts. rs4148941 lies within a potential microRNA-513a-5p (miR-513a-5p) binding site. Interaction between miR-513a-5p and mRNA transcribed from the susceptibility allele (A allele) of rs4148941 was enhanced in vitro compared with transcripts from other alleles. Additionally, expression of CHST3 mRNA was significantly reduced in the intervertebral disc cells of human subjects carrying the A allele of rs4148941. Together, our data provide new insights into the etiology of LDD, implicating an interplay between genetic risk factors and miRNA.
Adolescent idiopathic scoliosis (AIS) is the most common spinal deformity, affecting around 2% of adolescents worldwide. Genetic factors play an important role in its etiology. Using a genome-wide association study (GWAS), we recently identified novel AIS susceptibility loci on chromosomes 10q24.31 and 6q24.1. To identify more AIS susceptibility loci relating to its severity and progression, we performed GWAS by limiting the case subjects to those with severe AIS. Through a two-stage association study using a total of ∼12,000 Japanese subjects, we identified a common variant, rs12946942, that showed a significant association with severe AIS in the recessive model (P = 4.00×10−8, odds ratio [OR] = 2.05). Its association was replicated in a Chinese population (combined P = 6.43×10−12, OR = 2.21). rs12946942 is on chromosome 17q24.3 near the genes SOX9 and KCNJ2, which when mutated cause scoliosis phenotypes. Our findings will offer new insight into the etiology and progression of AIS.
The recent development of massively parallel sequencing technology has allowed the creation of comprehensive catalogs of genetic variation. However, due to the relatively high sequencing error rate for short read sequence data, sophisticated analysis methods are required to obtain high-quality variant calls. Here, we developed a probabilistic multinomial method for the detection of single nucleotide variants (SNVs) as well as short insertions and deletions (indels) in whole genome sequencing (WGS) and whole exome sequencing (WES) data for single sample calling. Evaluation with DNA genotyping arrays revealed a concordance rate of 99.98% for WGS calls and 99.99% for WES calls. Sanger sequencing of the discordant calls determined the false positive and false negative rates for the WGS (0.0068% and 0.17%) and WES (0.0036% and 0.0084%) datasets. Furthermore, short indels were identified with high accuracy (WGS: 94.7%, WES: 97.3%). We believe our method can contribute to the greater understanding of human diseases.
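The abstract above does not spell out the multinomial model, so the following is only a generic sketch of how a multinomial likelihood can be used to call a diploid genotype from base counts at one site. It is not the authors' method: the uniform miscall rate `err`, the restriction to one reference and one alternate allele, and all function names are assumptions made for illustration.

```python
import math

BASES = "ACGT"


def base_probs(genotype, err):
    """P(observed base | diploid genotype), assuming each read samples one of
    the two alleles with probability 1/2 and is miscalled uniformly at rate err."""
    probs = {}
    for b in BASES:
        p = 0.0
        for allele in genotype:
            p += 0.5 * ((1.0 - err) if b == allele else err / 3.0)
        probs[b] = p
    return probs


def call_genotype(counts, ref, alt, err=0.01):
    """Return the genotype (ref/ref, ref/alt or alt/alt) with the highest
    multinomial log-likelihood. The multinomial coefficient is omitted because
    it is identical for all three genotypes."""
    best_genotype, best_loglik = None, -math.inf
    for genotype in (ref + ref, ref + alt, alt + alt):
        probs = base_probs(genotype, err)
        loglik = sum(counts.get(b, 0) * math.log(probs[b]) for b in BASES)
        if loglik > best_loglik:
            best_genotype, best_loglik = genotype, loglik
    return best_genotype


# Hypothetical pileup counts at a single site (not study data).
print(call_genotype({"A": 14, "G": 16}, ref="A", alt="G"))  # balanced reads
print(call_genotype({"A": 29, "G": 1}, ref="A", alt="G"))   # one likely miscall
```

A production caller would additionally use per-base quality scores, mapping quality and genotype priors; the sketch keeps only the likelihood comparison.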
Many genome-wide association studies focus on associating single loci with
target phenotypes. However, in the setting of rare variation, accumulating
sufficient samples to assess these associations can be difficult. Moreover,
multiple variations in a gene or a set of genes within a pathway may all
contribute to the phenotype, suggesting that the aggregation of variations
found over the gene or pathway may be useful for improving the power to
Here, we present a method for aggregating single nucleotide polymorphisms
(SNPs) along biologically relevant pathways in order to seek genetic
associations with phenotypes. Our method uses all available genetic variants
and does not remove those in linkage disequilibrium (LD). Instead, it uses a
novel SNP weighting scheme to down-weight the contributions of correlated
SNPs. We apply our method to three cohorts of patients taking warfarin: two
European descent cohorts and an African American cohort. Although the
clinical covariates and key pharmacogenetic loci for warfarin have been
characterized, our association metric identifies a significant association
with mutations distributed throughout the pathway of warfarin metabolism. We
improve dose prediction after using all known clinical covariates and
pharmacogenetic variants in VKORC1 and CYP2C9. In particular, we find that
at least 1% of the missing heritability in warfarin dose may be due to the
aggregated effects of variations in the warfarin metabolic pathway, even
though the SNPs do not individually show a significant association.
Our method allows researchers to study aggregative SNP effects in an unbiased
manner by not preselecting SNPs. It retains all the available information by
accounting for LD-structure through weighting, which eliminates the need for
Prostate specific antigen (PSA) is widely used as a diagnostic biomarker for prostate cancer (PC). However, due to its low predictive performance, many patients without PC suffer from the harms of unnecessary prostate needle biopsies. The present study aims to evaluate the reproducibility and performance of a genetic risk prediction model in Japanese and estimate its utility as a diagnostic biomarker in a clinical scenario. We created a logistic regression model incorporating 16 SNPs that were significantly associated with PC in a genome-wide association study of Japanese population using 689 cases and 749 male controls. The model was validated by two independent sets of Japanese samples comprising 3,294 cases and 6,281 male controls. The areas under curve (AUC) of the model were 0.679, 0.655, and 0.661 for the samples used to create the model and those used for validation. The AUCs were not significantly altered in samples with PSA 1–10 ng/ml. 24.2% and 9.7% of the patients had odds ratio <0.5 (low risk) or >2 (high risk) in the model. Assuming the overall positive rate of prostate needle biopsies to be 20%, the positive biopsy rates were 10.7% and 42.4% for the low and high genetic risk groups respectively. Our genetic risk prediction model for PC was highly reproducible, and its predictive performance was not influenced by PSA. The model could have a potential to affect clinical decision when it is applied to patients with gray-zone PSA, which should be confirmed in future clinical studies.
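The step from a genetic odds ratio to a positive biopsy rate in the abstract above can be illustrated with the usual odds-to-probability conversion. The sketch below assumes that the model's odds ratio can be applied as a multiplier on the pre-test odds of a positive biopsy; that reading, and the function names, are assumptions for illustration rather than the authors' stated procedure.

```python
def post_test_probability(pre_test_prob, odds_ratio):
    """Convert a pre-test probability to odds, scale by the odds ratio,
    and convert the result back to a probability."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * odds_ratio
    return post_odds / (1.0 + post_odds)


def implied_odds_ratio(pre_test_prob, post_test_prob):
    """Inverse of the above: the odds ratio that moves pre_test_prob to post_test_prob."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = post_test_prob / (1.0 - post_test_prob)
    return post_odds / pre_odds


# With a 20% overall positive rate, the group cut-offs (OR 0.5 and OR 2)
# correspond to biopsy-positive probabilities of about 11% and 33%.
print(round(post_test_probability(0.20, 0.5), 3))
print(round(post_test_probability(0.20, 2.0), 3))

# The reported group rates (10.7% and 42.4%) imply group-level odds ratios
# below 0.5 and above 2, consistent with how the groups were defined.
print(round(implied_odds_ratio(0.20, 0.107), 2))
print(round(implied_odds_ratio(0.20, 0.424), 2))
```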
Rheumatoid arthritis is an autoimmune disease with a complex etiology, leading to inflammation of synovial tissue and joint destruction. Through a genome-wide association study (GWAS) and two replication studies in the Japanese population (7,907 cases and 35,362 controls), we identified two gene loci associated with rheumatoid arthritis susceptibility (NFKBIE at 6p21.1, rs2233434, odds ratio (OR) = 1.20, P = 1.3×10−15; RTKN2 at 10q21.2, rs3125734, OR = 1.20, P = 4.6×10−9). In addition to two functional non-synonymous SNPs in NFKBIE, we identified candidate causal SNPs with regulatory potential in NFKBIE and RTKN2 gene regions by integrating in silico analysis using public genome databases and subsequent in vitro analysis. Both of these genes are known to regulate the NF-κB pathway, and the risk alleles of the genes were implicated in the enhancement of NF-κB activity in our analyses. These results suggest that the NF-κB pathway plays a role in pathogenesis and would be a rational target for treatment of rheumatoid arthritis.
Rheumatoid arthritis (RA) is a chronic autoimmune disease affecting approximately 1% of the general adult population. More than 30 susceptibility loci for RA have been identified through genome-wide association studies (GWAS), but the disease-causal variants at most loci remain unknown. Here, we performed replication studies of the candidate loci of our previous GWAS using Japanese cohorts and identified variants in NFKBIE and RTKN2 gene loci that were associated with RA. To search for causal variants in both gene regions, we first examined non-synonymous (ns)SNPs that alter amino-acid sequences. As NFKBIE and RTKN2 are known to be involved in the NF-κB pathway, we evaluated the effects of nsSNPs on NF-κB activity. Next, we screened in silico variants that may regulate gene transcription using publicly available epigenetic databases and subsequently evaluated their regulatory potential using in vitro assays. As a result, we identified multiple candidate causal variants in NFKBIE (2 nsSNPs and 1 regulatory SNP) and RTKN2 (2 regulatory SNPs), indicating that our integrated in silico and in vitro approach is useful for the identification of causal variants in the post–GWAS era.
Multiple genetic loci associated with obesity or body mass index (BMI) have been identified through genome-wide association studies conducted predominantly in populations of European ancestry. We conducted a meta-analysis of associations between BMI and approximately 2.4 million SNPs in 27,715 East Asians, followed by in silico and de novo replication in 37,691 and 17,642 additional East Asians, respectively. We identified ten BMI-associated loci at the genome-wide significance level (P<5.0×10−8), including seven previously identified loci (FTO, SEC16B, MC4R, GIPR/QPCTL, ADCY3/RBJ, BDNF, and MAP2K5) and three novel loci in or near the CDKAL1, PCSK1, and GP2 genes. Three additional loci nearly reached the genome-wide significance threshold, including two previously identified loci in the GNPDA2 and TFAP2B genes and a new locus near PAX6, which all had P<5.0×10−7. Findings from this study may shed light on new pathways involved in obesity and demonstrate the value of conducting genetic studies in non-European populations.
Atrial fibrillation is a highly prevalent arrhythmia and a major risk factor for stroke, heart failure and death1. We conducted a genome-wide association study (GWAS) in individuals of European ancestry, including 6,707 with and 52,426 without atrial fibrillation. Six new atrial fibrillation susceptibility loci were identified and replicated in an additional sample of individuals of European ancestry, including 5,381 subjects with and 10,030 subjects without atrial fibrillation (P < 5 × 10−8). Four of the loci identified in Europeans were further replicated in silico in a GWAS of Japanese individuals, including 843 individuals with and 3,350 individuals without atrial fibrillation. The identified loci implicate candidate genes that encode transcription factors related to cardiopulmonary development, cardiac-expressed ion channels and cell signaling molecules.
Thymic stromal lymphopoietin (TSLP) triggers dendritic cell–mediated T helper (Th) 2 inflammatory responses. A single-nucleotide polymorphism (SNP), rs3806933, in the promoter region of the TSLP gene creates a binding site for the transcription factor activating protein (AP)–1. The variant enhances AP-1 binding to the regulatory element, and increases the promoter–reporter activity of TSLP in response to polyinosinic-polycytidylic acid (poly[I:C]) stimulation in normal human bronchial epithelium (NHBE). We investigated whether polymorphisms including the SNP rs3806933 could affect the susceptibility to and clinical phenotypes of bronchial asthma. We selected three representative (i.e., Tag) SNPs and conducted association studies of the TSLP gene, using two independent populations (639 patients with childhood atopic asthma and 838 control subjects, and 641 patients with adult asthma and 376 control subjects, respectively). We further examined the effects of corticosteroids and a long-acting β2-agonist (salmeterol) on the expression levels of the TSLP gene in response to poly(I:C) in NHBE. We found that the promoter polymorphisms rs3806933 and rs2289276 were significantly associated with disease susceptibility in both childhood atopic and adult asthma. The functional SNP rs3806933 was associated with asthma (meta-analysis, P = 0.000056; odds ratio, 1.29; 95% confidence interval, 1.14–1.47). A genotype of rs2289278 was correlated with pulmonary function. Moreover, the induction of TSLP mRNA and protein expression induced by poly(I:C) in NHBE was synergistically impaired by a corticosteroid and salmeterol. TSLP variants are significantly associated with bronchial asthma and pulmonary function. Thus, TSLP may serve as a therapeutic target molecule for combination therapy.
asthma; TSLP; bronchial epithelial cells; combination therapy; genetic polymorphisms
Nephrolithiasis is a common nephrologic disorder with complex etiology. To identify the genetic factor(s) for nephrolithiasis, we conducted a three-stage genome-wide association study (GWAS) using a total of 5,892 nephrolithiasis cases and 17,809 controls of Japanese origin. Here we found three novel loci for nephrolithiasis: RGS14-SLC34A1-PFN3-F12 on 5q35.3 (rs11746443; P = 8.51×10−12, odds ratio (OR) = 1.19), INMT-FAM188B-AQP1 on 7p14.3 (rs1000597; P = 2.16×10−14, OR = 1.22), and DGKH on 13q14.1 (rs4142110; P = 4.62×10−9, OR = 1.14). Subsequent analyses in 21,842 Japanese subjects revealed the association of SNP rs11746443 with the reduction of estimated glomerular filtration rate (eGFR) (P = 6.54×10−8), suggesting a crucial role for this variation in renal function. Our findings elucidated the significance of genetic variations for the pathogenesis of nephrolithiasis.
Although nephrolithiasis is one of the most common nephro-urological disorders with high prevalence (4%–9%) and extremely high recurrence rate (60% within ten years), little is known about the role of common variations in its pathogenesis. Through a GWAS using a total of 5,892 cases and 17,809 controls, we identified three novel nephrolithiasis loci: rs11746443, rs1000597, and rs4142110 (P<1×10−8). The top two significant SNPs, rs11746443 and rs1000597, are located upstream of the SLC34A1 and the AQP1 genes that play important roles in kidney function and the urine-concentration process, respectively. We also found that SNP rs11746443 is associated with the reduction of estimated glomerular filtration rate (eGFR), indicating the role of this variation in kidney function. Although nephrolithiasis is considered as one of the lifestyle-related diseases, the results of dietary intervention studies to reduce the recurrence incidence have been unsuccessful. Our findings could contribute to a better understanding of the pathogenesis of nephrolithiasis and lead to the development of new therapeutics.
Systemic lupus erythematosus (SLE) is an autoimmune disease that causes multiple organ damage. Although recent genome-wide association studies (GWAS) have contributed to the discovery of SLE susceptibility genes, few studies have been performed in Asian populations. Here, we report a GWAS for SLE examining 891 SLE cases and 3,384 controls and multi-stage replication studies examining 1,387 SLE cases and 28,564 controls in Japanese subjects. Considering that expression quantitative trait loci (eQTLs) have been implicated in genetic risks for autoimmune diseases, we integrated an eQTL study into the results of the GWAS. We observed enrichments of cis-eQTL positive loci among the known SLE susceptibility loci (30.8%) compared to the genome-wide SNPs (6.9%). In addition, we identified a novel association of a variant in the AF4/FMR2 family, member 1 (AFF1) gene at 4q21 with SLE susceptibility (rs340630; P = 8.3×10−9, odds ratio = 1.21). The risk A allele of rs340630 demonstrated a cis-eQTL effect on the AFF1 transcript with enhanced expression levels (P<0.05). As AFF1 transcripts were prominently expressed in CD4+ and CD19+ peripheral blood lymphocytes, up-regulation of AFF1 may cause the abnormality in these lymphocytes, leading to disease onset.
Although recent genome-wide association study (GWAS) approaches have successfully contributed to disease gene discovery, many susceptibility loci are known to remain uncaptured because of the strict significance thresholds required for multiple hypothesis testing. Therefore, prioritization of GWAS results by incorporating additional information is recommended. Systemic lupus erythematosus (SLE) is an autoimmune disease that causes multiple organ damage. Considering that abnormalities in B cell activity play essential roles in SLE, prioritization based on an expression quantitative trait loci (eQTL) study for B cells would be a promising approach. In this study, we report a GWAS and multi-stage replication studies for SLE examining 2,278 SLE cases and 31,948 controls in Japanese subjects. We integrated an eQTL study into the results of the GWAS and identified AFF1 as a novel SLE susceptibility locus. We also confirmed the cis-regulatory effect of the locus on the AFF1 transcript. Our study would be one of the initial successes in detecting a novel genetic locus using an eQTL study, and it should contribute to our understanding of the genetic loci left uncaptured by standard GWAS approaches.
A number of histone methyltransferases have been identified and biochemically characterized, but the pathologic roles of their dysfunction in human diseases like cancer are not well understood. Here, we demonstrate that Wolf-Hirschhorn syndrome candidate 1 (WHSC1) plays important roles in human carcinogenesis. Transcriptional levels of this gene are significantly elevated in various types of cancer including bladder and lung cancers. Immunohistochemical analysis using a number of clinical tissues confirmed significant up-regulation of WHSC1 expression in bladder and lung cancer cells at the protein level. Treatment of cancer cell lines with small interfering RNA targeting WHSC1 significantly knocked down its expression and resulted in the suppression of proliferation. Cell cycle analysis by flow cytometry indicated that knockdown of WHSC1 decreased the cell population of cancer cells at the S phase while increasing that at the G2/M phase. WHSC1 interacts with some proteins related to the WNT pathway including β-catenin and transcriptionally regulates CCND1, the target gene of the β-catenin/Tcf-4 complex, through histone H3 at lysine 36 trimethylation. This is a novel mechanism for WNT pathway dysregulation in human carcinogenesis, mediated by the epigenetic regulation of histone H3. Because expression levels of WHSC1 are significantly low in most normal tissue types, it should be feasible to develop specific and selective inhibitors targeting the enzyme as antitumor agents that have a minimal risk of adverse reaction.
Background and Aims
Recent studies indicate that hepatitis C virus (HCV) can modulate the expression of various genes including those involved in interferon signaling, and up-regulation of interferon-stimulated genes by HCV was reported to be strongly associated with treatment outcome. To expand our understanding of the molecular mechanism underlying treatment resistance, we analyzed the direct effects of interferon and/or HCV infection under immunodeficient conditions using cDNA microarray analysis of human hepatocyte chimeric mice.
Human serum containing HCV genotype 1b was injected into human hepatocyte chimeric mice. IFN-α was administered 8 weeks after inoculation, and 6 hours later human hepatocytes in the mouse livers were collected for microarray analysis.
HCV infection induced a more than 3-fold change in the expression of 181 genes, especially genes related to Organismal Injury and Abnormalities, such as fibrosis or injury of the liver (P = 5.90E-16 ∼ 3.66E-03). IFN administration induced more than 3-fold up-regulation in the expression of 152 genes. Marked induction was observed in the anti-fibrotic chemokines such as CXCL9, suggesting that IFN treatment might lead not only to HCV eradication but also prevention and repair of liver fibrosis. HCV infection appeared to suppress interferon signaling via significant reduction in interferon-induced gene expression in several genes of the IFN signaling pathway, including Mx1, STAT1, and several members of the CXCL and IFI families (P = 6.0E-12). Genes associated with Antimicrobial Response and Inflammatory Response were also significantly repressed (P = 5.22×10−10 ∼ 1.95×10−2).
These results provide molecular insights into possible mechanisms used by HCV to evade innate immune responses, as well as novel therapeutic targets and a potential new indication for interferon therapy.
EHMT2 is a histone lysine methyltransferase localized in euchromatin regions and acting as a corepressor for specific transcription factors. Although the role of EHMT2 in transcriptional regulation has been well documented, the pathologic consequences of its dysfunction in human disease have not been well understood. Here, we describe important roles of EHMT2 in human carcinogenesis. Expression levels of EHMT2 are significantly elevated in human bladder carcinomas compared with nonneoplastic bladder tissues (P < .0001) in real-time polymerase chain reaction analysis. Complementary DNA microarray analysis also revealed its overexpression in various types of cancer. The reduction of EHMT2 expression by small interfering RNAs resulted in the suppression of the growth of cancer cells and possibly caused apoptotic cell death in cancer cells. Importantly, we show that EHMT2 can suppress transcription of the SIAH1 gene by binding to its promoter region (-293 to +51) and by methylating lysine 9 of histone H3. Furthermore, an EHMT2-specific inhibitor, BIX-01294, significantly suppressed the growth of cancer cells. Our results suggest that dysregulation of EHMT2 plays an important role in the growth regulation of cancer cells, and further functional studies may affirm the importance of EHMT2 as a promising therapeutic target for various types of cancer.
Kawasaki disease (KD; OMIM 611775) is an acute vasculitis syndrome which predominantly affects small- and medium-sized arteries of infants and children. Epidemiological data suggest that host genetics underlie the disease pathogenesis. Here we report that multiple variants in the caspase-3 gene (CASP3) that are in linkage disequilibrium confer susceptibility to KD in both Japanese and US subjects of European ancestry. We found that a G to A substitution of one commonly associated SNP located in the 5′ untranslated region of CASP3 (rs72689236; P = 4.2 × 10−8 in the Japanese and P = 3.7 × 10−3 in the European Americans) abolished binding of nuclear factor of activated T cells to the DNA sequence surrounding the SNP. Our findings suggest that altered CASP3 expression in immune effector cells influences susceptibility to KD.
White blood cells (WBCs) mediate immune systems and consist of various subtypes with distinct roles. Elucidation of the mechanism that regulates the counts of the WBC subtypes would provide useful insights into both the etiology of the immune system and disease pathogenesis. In this study, we report results of genome-wide association studies (GWAS) and a replication study for the counts of the 5 main WBC subtypes (neutrophils, lymphocytes, monocytes, basophils, and eosinophils) using 14,792 Japanese subjects enrolled in the BioBank Japan Project. We identified 12 significantly associated loci that satisfied the genome-wide significance threshold of P<5.0×10−8, of which 9 loci were novel (the CDK6 locus for the neutrophil count; the ITGA4, MLZE, STXBP6 loci, and the MHC region for the monocyte count; the SLC45A3-NUCKS1, GATA2, NAALAD2, ERG loci for the basophil count). We further evaluated associations in the identified loci using 15,600 subjects from Caucasian populations. These WBC subtype-related loci demonstrated a variety of patterns of pleiotropic associations within the WBC subtypes, or with total WBC count, platelet count, or red blood cell-related traits (n = 30,454), which suggests unique and common functional roles of these loci in the processes of hematopoiesis. This study should contribute to the understanding of the genetic backgrounds of the WBC subtypes and hematological traits.
White blood cells (WBCs) are blood cells that mediate immune systems and defend the body against foreign microorganisms. It is well known that WBCs consist of various subtypes of cells with distinct roles, although the genetic background of each of the WBC subtypes has yet to be examined. In this study, we report genome-wide association studies (GWAS) for the 5 main WBC subtypes (neutrophils, lymphocytes, monocytes, basophils, and eosinophils) using 14,792 Japanese subjects. We identified 12 significantly associated genetic loci, and 9 of them were novel. Evaluation of the associations of these identified loci in cohorts of Caucasian populations demonstrated both ethnically common and divergent genetic backgrounds of the WBC subtypes. These loci also indicated a variety of patterns of pleiotropic associations within the hematological traits, including the other WBC subtypes, total WBC count, platelet count, or red blood cell-related traits, which suggests unique and common functional roles of these loci in the processes of hematopoiesis.
White blood cell (WBC) count is a common clinical measure from complete blood count assays, and it varies widely among healthy individuals. Total WBC count and its constituent subtypes have been shown to be moderately heritable, with the heritability estimates varying across cell types. We studied 19,509 subjects from seven cohorts in a discovery analysis, and 11,823 subjects from ten cohorts for replication analyses, to determine genetic factors influencing variability within the normal hematological range for total WBC count and five WBC subtype measures. Cohort-specific data were supplied by the CHARGE, HeamGen, and INGI consortia, as well as independent collaborative studies. We identified and replicated ten associations with total WBC count and five WBC subtypes at seven different genomic loci (total WBC count: 6p21 in the HLA region, 17q21 near ORMDL3, and CSF3; neutrophil count: 17q21; basophil count: 3p21 near RPN1 and C3orf27; lymphocyte count: 6p21, 19p13 at EPS15L1; monocyte count: 2q31 at ITGA4, 3q21, 8q24 in an intergenic region, 9q31 near EDG2), including three previously reported associations and seven novel associations. To investigate functional relationships among variants contributing to variability in the six WBC traits, we utilized gene expression- and pathway-based analyses. We implemented gene-clustering algorithms to evaluate functional connectivity among implicated loci and showed functional relationships across cell types. Gene expression data from whole blood were utilized to show that significant biological consequences can be extracted from our genome-wide analyses, with effect estimates for significant loci from the meta-analyses being highly correlated with the proximal gene expression.
In addition, collaborative efforts between the groups contributing to this study and related studies conducted by the COGENT and RIKEN groups allowed for the examination of effect homogeneity for genome-wide significant associations across populations of diverse ancestral backgrounds.
WBC traits are highly variable, moderately heritable, and commonly assayed as part of clinical complete blood count (CBC) examinations. The counts of constituent cell subtypes comprising the WBC count measure are assayed as part of a standard clinical WBC differential test. In this study we employed meta-analytic techniques and identified ten associations with WBC measures at seven genomic loci in a large sample set of over 31,000 participants. Cohort-specific data were supplied by the CHARGE, HeamGen, and INGI consortia, as well as independent collaborative studies. We confirmed previous associations of WBC traits with three loci and identified seven novel loci. We also utilized a number of additional analytic methods to infer the functional relatedness of independently implicated loci across WBC phenotypes, and investigated direct functional consequences of these loci through analyses of genomic variation affecting the expression of proximal genes in samples of whole blood. In addition, subsequent collaborative efforts with studies of WBC traits in African-American and Japanese cohorts allowed for the investigation of the effects of these genomic variants across populations of diverse continental ancestries.
Accurate information on haplotypes and diplotypes (haplotype pairs) is required for population-genetic analyses; however, microarrays do not provide data on a haplotype or diplotype at a copy number variation (CNV) locus; they only provide data on the total number of copies over a diplotype or an unphased sequence genotype (e.g., AAB, unlike AB of single nucleotide polymorphism). Moreover, such copy numbers or genotypes are often incorrectly determined when microarray signal intensities derived from different copy numbers or genotypes are not clearly separated due to noise. Here we report an algorithm to infer CNV haplotypes and individuals’ diplotypes at multiple loci from noisy microarray data, utilizing the probability that a signal intensity may be derived from different underlying copy numbers or genotypes. Performing simulation studies based on known diplotypes and an error model obtained from real microarray data, we demonstrate that this probabilistic approach succeeds in accurate inference (error rate: 1–2%) from noisy data, whereas previous deterministic approaches failed (error rate: 12–18%). Applying this algorithm to real microarray data, we estimated haplotype frequencies and diplotypes in 1486 CNV regions for 100 individuals. Our algorithm will facilitate accurate population-genetic analyses and powerful disease association studies of CNVs.
copy number variation; EM algorithm; haplotype inference; phasing
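The probabilistic idea described above can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: it handles a single CNV locus, assumes Hardy-Weinberg equilibrium, and assumes that each individual's signal intensity has already been converted into a likelihood vector over total copy numbers (the function names and the input format are hypothetical). An EM loop then estimates haplotype copy-number frequencies without committing to a single hard copy-number call per individual.

```python
def em_cnv_haplotype_freqs(likelihoods, max_hap_copies=2, n_iter=200, tol=1e-10):
    """Estimate haplotype copy-number frequencies at one CNV locus by EM.

    likelihoods[i][t] is the (assumed, precomputed) probability of individual
    i's signal intensity given a total copy number t over the diplotype,
    for t = 0 .. 2 * max_hap_copies.
    """
    k = max_hap_copies + 1
    freqs = [1.0 / k] * k  # start from uniform haplotype frequencies
    for _ in range(n_iter):
        counts = [0.0] * k
        for lik in likelihoods:
            # E-step: posterior weight of every ordered haplotype pair (a, b)
            weights = {}
            total = 0.0
            for a in range(k):
                for b in range(k):
                    w = freqs[a] * freqs[b] * lik[a + b]
                    weights[(a, b)] = w
                    total += w
            if total == 0.0:
                continue  # individual is uninformative under current freqs
            for (a, b), w in weights.items():
                p = w / total
                counts[a] += p
                counts[b] += p
        norm = sum(counts)
        if norm == 0.0:
            return freqs
        # M-step: expected haplotype counts become the new frequencies
        new_freqs = [c / norm for c in counts]
        delta = max(abs(x - y) for x, y in zip(freqs, new_freqs))
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


def most_likely_diplotype(lik, freqs):
    """Return the unordered haplotype pair with the highest posterior weight."""
    k = len(freqs)
    best, best_w = None, -1.0
    for a in range(k):
        for b in range(a, k):
            # an unordered pair with a != b arises from two ordered pairs
            w = freqs[a] * freqs[b] * lik[a + b] * (1 if a == b else 2)
            if w > best_w:
                best, best_w = (a, b), w
    return best
```

A deterministic approach would instead fix each individual's total copy number to the single best call before phasing; here an ambiguous intensity contributes fractionally to every compatible diplotype.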
The analysis of contiguous homozygosity (runs of homozygous loci) in human genotyping datasets is critical in the search for causal disease variants in monogenic disorders, studies of population history and the identification of targets of natural selection. Here, we report methods for extracting homozygous segments from high-density genotyping datasets, quantifying their local genomic structure, identifying outstanding regions within the genome and visualizing results for comparative analysis between population samples.
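As a rough illustration of what extracting homozygous segments can involve, the sketch below scans one individual's genotype calls along a chromosome and reports maximal runs of consecutive homozygous loci. It is a minimal sketch under stated assumptions rather than the method reported here: the 0/1/2 genotype coding, the `min_snps` threshold, and the rule that any heterozygous or missing call ends a run are all illustrative choices.

```python
def homozygous_runs(genotypes, min_snps=3):
    """Find maximal runs of consecutive homozygous calls.

    genotypes: calls ordered by genomic position, coded (by assumption) as
      0 or 2 = homozygous, 1 = heterozygous, None = missing.
    Returns a list of (start_index, end_index) pairs, inclusive, for runs
    containing at least min_snps loci.
    """
    runs = []
    start = None  # index where the current run began, if any
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            # a heterozygous or missing call closes the current run
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # close a run that extends to the last locus
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


calls = [0, 0, 2, 1, 2, 2, None, 0, 0, 0, 0]
print(homozygous_runs(calls))  # [(0, 2), (7, 10)]
```

Analyses of real high-density data typically also tolerate a small number of heterozygous or missing calls within a run and apply physical-length thresholds; those refinements are omitted here for brevity.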