1.  Expression Quantitative Trait Loci for CARD8 Contributes to Risk of Two Infection-Related Cancers—Hepatocellular Carcinoma and Cervical Cancer 
PLoS ONE  2015;10(7):e0132352.
Caspase recruitment domain family, member 8 (CARD8) can coordinate innate and adaptive immune responses and sensitize cells to apoptosis, and may participate in tumorigenesis of virus-induced hepatocellular carcinoma (HCC) and cervical cancer. By bioinformatics analyses, we identified several single nucleotide polymorphisms (SNPs) within a newly identified long non-coding RNA (lncRNA) as expression quantitative trait loci (eQTLs) for CARD8. We therefore hypothesized that CARD8 eQTL SNPs within this lncRNA may influence the risk of HCC and cervical cancer. We performed two independent case-control studies, one of 1,300 cases with HBV-positive HCC and 1,344 normal controls and one of 1,486 cervical cancer patients and 1,536 control subjects, to test the association between an eQTL SNP for CARD8 (rs7248320) and the risk of HCC and cervical cancer. The variant genotype of rs7248320 was significantly associated with increased risk of HCC and cervical cancer [GG vs. AA/GA: adjusted odds ratio (OR) = 1.28, 95% confidence interval (CI) = 1.03–1.61, P = 0.028 for HCC; adjusted OR = 1.34, 95% CI = 1.09–1.66, P = 0.006 for cervical cancer]. Moreover, the effect of rs7248320 on cervical cancer risk was more prominent in premenopausal women. Further interaction analysis detected a significant multiplicative interaction between rs7248320 and menopausal status on cervical cancer risk (P = 0.018). These findings suggest that this CARD8 eQTL SNP may serve as a susceptibility marker for virus-related HCC and cervical cancer.
PMCID: PMC4492972  PMID: 26147888
2.  A genome-wide gene–environment interaction analysis for tobacco smoke and lung cancer susceptibility 
Carcinogenesis  2014;35(7):1528-1535.
Tobacco smoke is the major environmental risk factor underlying lung carcinogenesis. However, only approximately one-tenth of smokers develop lung cancer in their lifetime, indicating significant individual variation in susceptibility to lung cancer, and the reasons for this variation are largely unknown. In particular, the genetic variants discovered in genome-wide association studies (GWAS) account for only a small fraction of the phenotypic variation in lung cancer, and gene–environment interactions are thought to explain the missing fraction of disease heritability. The ability to identify smokers at high risk of developing cancer has substantial preventive implications. Thus, we undertook a gene–smoking interaction analysis in a GWAS of lung cancer in the Han Chinese population using a two-phase case–control design. In the discovery phase, we evaluated all pair-wise (591 370) gene–smoking interactions in 5408 subjects (2331 cases and 3077 controls) using a logistic regression model with covariate adjustment. In the replication phase, promising interactions were validated in an independent population of 3023 subjects (1534 cases and 1489 controls). We identified interactions between two single nucleotide polymorphisms and smoking. The interaction P values were 6.73 × 10−6 and 3.84 × 10−6 for rs1316298 and rs4589502, respectively, in the combined dataset from the two phases. An antagonistic interaction (rs1316298–smoking) and a synergistic interaction (rs4589502–smoking) were observed. The two interactions identified in our study may help explain some of the missing heritability in lung cancer susceptibility and present strong evidence for further study of these gene–smoking interactions, which may benefit intensive screening and smoking cessation interventions.
PMCID: PMC4076813  PMID: 24658283
3.  Genome-wide Association Study on Platinum-induced Hepatotoxicity in Non-Small Cell Lung Cancer Patients 
Scientific Reports  2015;5:11556.
Platinum-based chemotherapy has been shown to improve the survival of advanced non-small cell lung cancer (NSCLC) patients; however, platinum-induced toxicity severely impedes the success of chemotherapy. Genetic variations, such as single nucleotide polymorphisms (SNPs), may contribute to patients’ responses to platinum-based chemotherapy. To identify SNPs that modify the risk of hepatotoxicity in NSCLC patients receiving platinum-based chemotherapy, we performed a genome-wide association scan in 334 subjects followed by a replication study among 375 subjects. A consistent association with platinum-induced hepatotoxicity risk was identified for SNP rs2838566 located at 21q22.3, as the minor A allele significantly increased the risk of liver injury (OR = 3.78, 95% CI = 1.99–7.19, P = 4.90 × 10−5 for the GWAS scan; OR = 1.89, 95% CI = 1.03–3.46, P = 0.039 for replication; and OR = 2.56, 95% CI = 1.65–3.95, P = 2.55 × 10−5 for the pooled population). These results suggest that genetic variants at 21q22.3 may contribute to susceptibility to platinum-induced hepatotoxicity in NSCLC patients.
PMCID: PMC4477405  PMID: 26100964
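As a rough consistency check on the pooled estimate reported in the abstract above (OR = 2.56, 95% CI = 1.65–3.95), the standard error of the log odds ratio can be recovered from the confidence interval and turned into a two-sided p-value. This is a minimal sketch that assumes a Wald-type interval on the log scale; the abstract does not state how the interval was computed, and the function name `p_from_or_ci` is purely illustrative.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back out the SE of log(OR) from a 95% CI, then a two-sided Wald p-value.

    Assumes the CI is symmetric on the log scale (a Wald interval);
    this is an assumption, not something stated in the abstract.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Pooled-population values quoted in the abstract
se, z, p = p_from_or_ci(2.56, 1.65, 3.95)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
```

Under this assumption the implied p-value is of the same order of magnitude as the reported 2.55 × 10−5; small differences are expected from rounding of the published OR and CI.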
4.  Prediction models and risk assessment for silicosis using a retrospective cohort study among workers exposed to silica in China 
Scientific Reports  2015;5:11059.
This study aimed to develop a prognostic risk prediction model for the development of silicosis among workers exposed to silica dust in China. The prediction model was built using a retrospective cohort of 3,492 workers exposed to silica in an iron ore mine, with 33 years of follow-up. We developed a risk score system using a linear combination of the predictors weighted by the LASSO penalized Cox regression coefficients. The model’s predictive accuracy was evaluated using time-dependent ROC curves. Six predictors were selected into the final prediction model (age at entry into the cohort, mean concentration of respirable silica, net years of dust exposure, smoking, illiteracy, and number of jobs). We classified workers into three risk groups according to the quartiles (Q1, Q3) of the risk score; 203 (23.28%) incident silicosis cases were derived from the high risk group (risk score ≥ 5.91), whilst only 4 (0.46%) cases were from the low risk group (risk score < 3.97). The score system was regarded as accurate given the range of AUCs (83–96%). This study developed a unique score system with good internal validity, which provides scientific guidance for clinicians to identify high-risk workers and thus has important cost-efficiency implications.
PMCID: PMC4473532  PMID: 26090590
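The scoring scheme described in the silicosis abstract above (a linear combination of six predictors, then a three-way split at the Q1 and Q3 cutoffs) can be sketched as follows. The cutoffs 3.97 and 5.91 come from the abstract; the coefficient values, the variable coding and the label of the middle group are hypothetical placeholders, since the abstract does not report them.

```python
import math

# HYPOTHETICAL coefficients for illustration only; the published
# LASSO-penalized Cox coefficients are not given in the abstract.
COEFS = {
    "age_at_entry": 0.05,        # years
    "mean_silica_conc": 1.2,     # respirable silica concentration
    "net_years_exposed": 0.1,    # years of dust exposure
    "smoking": 0.4,              # 1 = smoker, 0 = non-smoker
    "illiteracy": 0.3,           # 1 = illiterate, 0 = literate
    "n_jobs": 0.2,               # number of jobs held
}

LOW_CUTOFF = 3.97   # Q1 of the risk score (from the abstract)
HIGH_CUTOFF = 5.91  # Q3 of the risk score (from the abstract)

def risk_score(worker):
    """Linear predictor: sum of coefficient * predictor value."""
    return sum(COEFS[name] * worker[name] for name in COEFS)

def risk_group(score):
    """Three-way classification at the quartile cutoffs."""
    if score >= HIGH_CUTOFF:
        return "high"
    if score < LOW_CUTOFF:
        return "low"
    return "medium"  # label for the middle group is an assumption

worker = {
    "age_at_entry": 25, "mean_silica_conc": 0.5, "net_years_exposed": 10,
    "smoking": 1, "illiteracy": 0, "n_jobs": 2,
}
score = risk_score(worker)
print(round(score, 2), risk_group(score))
```

With real coefficients the same two functions would reproduce the published grouping; only the `COEFS` table would change.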
5.  A Common Variant Of Ubiquinol-Cytochrome c Reductase Complex Is Associated with DDH 
PLoS ONE  2015;10(4):e0120212.
The genetic basis of developmental dysplasia of the hip (DDH) remains largely unknown. To find new susceptibility genes for DDH, we carried out a genome-wide association study (GWAS) for DDH.
We enrolled 386 radiologically confirmed DDH patients and 558 healthy controls (Set A) to conduct the GWAS. Quality control was conducted at both the sample and single nucleotide polymorphism (SNP) levels. We then conducted a subsequent case-control study to replicate the association between a promising locus, rs6060373 in the UQCC gene, and DDH in an independent set of 755 cases and 944 controls (Set B).
In the DDH GWAS discovery stage, 51 SNPs showed significance at P < 10−4, and another 577 SNPs showed significance at P < 10−3. In UQCC, all 12 genotyped SNPs appeared as promising risk loci. Genotyping of rs6060373 in Set A showed the minor allele A as a promising risk allele (P = 4.82 × 10−7), with an odds ratio of 1.77. Genotyping of rs6060373 in Set B produced another significant result (P = 0.0338) with an odds ratio of 1.18 for risk allele A. Combining Set A and Set B, we obtained an overall P value of 3.63 × 10−6 with an odds ratio of 1.35 (1.19–1.53) for allele A.
Our study demonstrates that common variants of UQCC, specifically rs6060373, are associated with DDH in the Han Chinese population.
PMCID: PMC4388640  PMID: 25848760
6.  Clinical outcome and expression of mutant P53, P16, and Smad4 in lung adenocarcinoma: a prospective study 
Whole-exome sequencing has shown that lung adenocarcinoma (LAC) can be driven by mutant genes, including TP53, P16, and Smad4. The aim of this study was to clarify protein alterations of P53, P16, and Smad4 and to explore the correlations between these protein alterations and clinical outcome.
We investigated associations among P53 mutant (P53Mut) expression, and P16 and Smad4 loss-of-expression, with clinical outcome in 120 LAC patients who underwent curative resection, using immunohistochemical (IHC) methods.
Of the 120 patients, 76 (63.3%) expressed P53Mut protein, whereas 54 (45.0%) showed loss of P16 expression and 75 (62.5%) showed loss of Smad4 expression. P53Mut expression was associated with tumor size (P = 0.041) and pathological stage (P = 0.025). Loss of P16 expression was associated with lymph node metastasis (P = 0.001) and pathological stage (P < 0.001). Loss of Smad4 expression was associated with tumor size (P = 0.033), lymph node metastasis (P = 0.014), pathological stage (P = 0.017), and tumor differentiation (P = 0.022). Kaplan-Meier survival analysis showed that tumor size (P = 0.031), lymph node metastasis (P < 0.001), pathological stage (P < 0.001), P53Mut protein expression (P = 0.038), and loss of P16 or Smad4 expression (P < 0.001) were significantly associated with shorter overall survival (OS), whereas multivariate analysis indicated that lymph node metastasis (P = 0.014) and loss of P16 or Smad4 expression (P < 0.001) were independent prognostic factors. Analysis of protein combinations showed that patients with more alterations had poorer survival (P < 0.001). Spearman correlation analysis showed that loss of Smad4 expression correlated inversely with expression of P53Mut (r = −0.196, P = 0.032) and positively with loss of P16 expression (r = 0.182, P = 0.047).
The findings indicate that IHC status of P53Mut, P16, and Smad4 may predict patient outcomes in LAC.
PMCID: PMC4415338  PMID: 25890228
Lung adenocarcinoma; Mutant P53; P16; Smad4; Immunohistochemistry; Prognosis
7.  A genome-wide gene–gene interaction analysis identifies an epistatic gene pair for lung cancer susceptibility in Han Chinese 
Carcinogenesis  2013;35(3):572-577.
Lung cancer is the leading cause of cancer-related deaths worldwide. To date, genome-wide association studies (GWAS) have identified numerous loci associated with the risk of developing lung cancer. However, these loci account for only a small fraction of the familial lung cancer risk. We hypothesized that epistasis may contribute to the missing heritability. To test this hypothesis, we systematically evaluated the association of epistasis of genetic variants with risk of lung cancer in Han Chinese cohorts. We conducted a pairwise genetic interaction analysis of 591370 variants, using BOolean Operation-based Screening and Testing (BOOST), in an ongoing GWAS of lung cancer that includes 2331 cases and 3077 controls. Pairs of epistatic loci with P BOOST ≤ 1.00×10−6 were further evaluated by a logistic regression model (LRM) with covariate adjustment. Four promising epistatic pairs identified at the screening stage (P LRM ≤ 2.86×10−13) were validated in two replication cohorts: the first from Beijing (1534 cases and 1489 controls) and the second from Shenyang and Guangzhou (2512 cases and 2449 controls). Using this combined analysis, we identified an interaction between rs2562796 and rs16832404 at 2p32.2 that was significantly associated with the risk of developing lung cancer (P LRM = 1.03×10−13 in a total of 13 392 subjects). This study is the first investigation of epistasis for lung cancer on a genome-wide scale in Han Chinese. It addresses part of the missing heritability in lung cancer risk and provides novel insight into the multifactorial etiology of lung cancer.
PMCID: PMC3941747  PMID: 24325914
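The scale of the pairwise scan in the abstract above can be made concrete with a short calculation. The sketch below counts the unordered pairs among 591370 variants and divides a family-wise error rate by that count. The choice of α = 0.05 and of a Bonferroni correction is an assumption here, not something the abstract states, although the result lines up with the screening threshold it quotes.

```python
from math import comb

n_variants = 591370             # variants tested, from the abstract
n_pairs = comb(n_variants, 2)   # unordered pairs: n * (n - 1) / 2

alpha = 0.05                    # ASSUMED family-wise error rate
threshold = alpha / n_pairs     # Bonferroni-style per-test threshold

print(f"pairs tested: {n_pairs:,}")
print(f"per-pair threshold: {threshold:.2e}")
```

Roughly 1.7 × 10^11 pairs are involved, which is why a fast screening statistic such as BOOST is applied before the covariate-adjusted logistic regression.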
8.  Joint analysis of three genome-wide association studies of esophageal squamous cell carcinoma in Chinese populations 
Wu, Chen | Wang, Zhaoming | Song, Xin | Feng, Xiao-Shan | Abnet, Christian C. | He, Jie | Hu, Nan | Zuo, Xian-Bo | Tan, Wen | Zhan, Qimin | Hu, Zhibin | He, Zhonghu | Jia, Weihua | Zhou, Yifeng | Yu, Kai | Shu, Xiao-Ou | Yuan, Jian-Min | Zheng, Wei | Zhao, Xue-Ke | Gao, She-Gan | Yuan, Zhi-Qing | Zhou, Fu-You | Fan, Zong-Min | Cui, Ji-Li | Lin, Hong-Li | Han, Xue-Na | Li, Bei | Chen, Xi | Dawsey, Sanford M. | Liao, Linda | Lee, Maxwell P. | Ding, Ti | Qiao, You-Lin | Liu, Zhihua | Liu, Yu | Yu, Dianke | Chang, Jiang | Wei, Lixuan | Gao, Yu-Tang | Koh, Woon-Puay | Xiang, Yong-Bing | Tang, Ze-Zhong | Fan, Jin-Hu | Han, Jing-Jing | Zhou, Sheng-Li | Zhang, Peng | Zhang, Dong-Yun | Yuan, Yuan | Huang, Ying | Liu, Chunling | Zhai, Kan | Qiao, Yan | Jin, Guangfu | Guo, Chuanhai | Fu, Jianhua | Miao, Xiaoping | Lu, Changdong | Yang, Haijun | Wang, Chaoyu | Wheeler, William A. | Gail, Mitchell | Yeager, Meredith | Yuenger, Jeff | Guo, Er-Tao | Li, Ai-Li | Zhang, Wei | Li, Xue-Min | Sun, Liang-Dan | Ma, Bao-Gen | Li, Yan | Tang, Sa | Peng, Xiu-Qing | Liu, Jing | Hutchinson, Amy | Jacobs, Kevin | Giffen, Carol | Burdette, Laurie | Fraumeni, Joseph F. | Shen, Hongbing | Ke, Yang | Zeng, Yixin | Wu, Tangchun | Kraft, Peter | Chung, Charles C. | Tucker, Margaret A. 
| Hou, Zhi-Chao | Liu, Ya-Li | Hu, Yan-Long | Liu, Yu | Wang, Li | Yuan, Guo | Chen, Li-Sha | Liu, Xiao | Ma, Teng | Meng, Hui | Sun, Li | Li, Xin-Min | Li, Xiu-Min | Ku, Jian-Wei | Zhou, Ying-Fa | Yang, Liu-Qin | Wang, Zhou | Li, Yin | Qige, Qirenwang | Yang, Wen-Jun | Lei, Guang-Yan | Chen, Long-Qi | Li, En-Min | Yuan, Ling | Yue, Wen-Bin | Wang, Ran | Wang, Lu-Wen | Fan, Xue-Ping | Zhu, Fang-Heng | Zhao, Wei-Xing | Mao, Yi-Min | Zhang, Mei | Xing, Guo-Lan | Li, Ji-Lin | Han, Min | Ren, Jing-Li | Liu, Bin | Ren, Shu-Wei | Kong, Qing-Peng | Li, Feng | Sheyhidin, Ilyar | Wei, Wu | Zhang, Yan-Rui | Feng, Chang-Wei | Wang, Jin | Yang, Yu-Hua | Hao, Hong-Zhang | Bao, Qi-De | Liu, Bao-Chi | Wu, Ai-Qun | Xie, Dong | Yang, Wan-Cai | Wang, Liang | Zhao, Xiao-Hang | Chen, Shu-Qing | Hong, Jun-Yan | Zhang, Xue-Jun | Freedman, Neal D | Goldstein, Alisa M. | Lin, Dongxin | Taylor, Philip R. | Wang, Li-Dong | Chanock, Stephen J.
Nature genetics  2014;46(9):1001-1006.
We conducted a joint (pooled) analysis of three genome-wide association studies (GWAS) 1-3 of esophageal squamous cell carcinoma (ESCC) in ethnic Chinese (5,337 ESCC cases and 5,787 controls) with 9,654 ESCC cases and 10,058 controls for follow-up. In a logistic regression model adjusted for age, sex, study, and two eigenvectors, two new loci achieved genome-wide significance, marked by rs7447927 at 5q31.2 (per-allele odds ratio (OR) = 0.85, 95% CI 0.82-0.88; P=7.72x10−20) and rs1642764 at 17p13.1 (per-allele OR= 0.88, 95% CI 0.85-0.91; P=3.10x10−13). rs7447927 is a synonymous single nucleotide polymorphism (SNP) in TMEM173 and rs1642764 is an intronic SNP in ATP1B2, near TP53. Furthermore, a locus in the HLA class II region at 6p21.32 (rs35597309) achieved genome-wide significance in the two populations at highest risk for ESCC (OR=1.33, 95% CI 1.22-1.46; P=1.99x10−10). Our joint analysis identified new ESCC susceptibility loci overall as well as a new locus unique to the ESCC high risk Taihang Mountain region.
PMCID: PMC4212832  PMID: 25129146
9.  Cumulative Effect and Predictive Value of Genetic Variants Associated with Type 2 Diabetes in Han Chinese: A Case-Control Study 
PLoS ONE  2015;10(1):e0116537.
Genome-wide association studies (GWAS) have identified dozens of single nucleotide polymorphisms (SNPs) associated with type 2 diabetes risk. We have previously confirmed the associations of genetic variants in HHEX, CDKAL1, VEGFA and FTO with type 2 diabetes in Han Chinese. However, the cumulative effect and predictive value of these GWAS identified SNPs on the risk of type 2 diabetes in Han Chinese are largely unknown.
Methodology/Principal Findings
We conducted a two-stage case-control study consisting of 2,925 cases and 3,281 controls to examine the association of 30 SNPs identified by GWAS with type 2 diabetes in Han Chinese. Significant associations were found for proxy SNPs at KCNQ1 [odds ratio (OR) = 1.41, P = 9.91 × 10−16 for rs2237897], CDKN2A/CDKN2B (OR = 1.30, P = 1.34 × 10−10 for rs10811661), CENTD2 (OR = 1.28, P = 9.88 × 10−4 for rs1552224) and SLC30A8 (OR = 1.19, P = 1.43 × 10−5 for rs13266634). We further evaluated the cumulative effect on type 2 diabetes of these 4 SNPs, in combination with 5 SNPs at HHEX, CDKAL1, VEGFA and FTO reported previously. Individuals carrying 12 or more risk alleles had a nearly 4-fold increased risk of developing type 2 diabetes compared with those carrying fewer than 6 risk alleles [adjusted OR = 3.68, 95% confidence interval (CI): 2.76–4.91]. Adding the genetic factors to clinical factors only slightly improved the prediction of type 2 diabetes, with the area under the receiver operating characteristic curve increasing from 0.76 to 0.78, although the difference was statistically significant (P < 0.0001).
We confirmed associations of SNPs in KCNQ1, CDKN2A/CDKN2B, CENTD2 and SLC30A8 with type 2 diabetes in Han Chinese. The utilization of genetic information may improve the accuracy of risk prediction in combination with clinical characteristics for type 2 diabetes.
PMCID: PMC4294637  PMID: 25587982
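The cumulative-effect analysis in the type 2 diabetes abstract above rests on a simple count of risk alleles across 9 SNPs (4 confirmed in the study plus 5 reported previously). A minimal sketch of that count is shown below. It assumes an unweighted sum with each genotype coded as 0, 1 or 2 risk alleles; the abstract names only the two extreme categories, so the label of the intermediate band is a placeholder.

```python
N_SNPS = 9  # 4 SNPs confirmed in the study + 5 reported previously

def risk_allele_count(genotypes):
    """Unweighted number of risk alleles across the 9 SNPs (0 to 18).

    Each genotype is coded as the number of risk alleles carried (0, 1 or 2).
    """
    if len(genotypes) != N_SNPS:
        raise ValueError(f"expected {N_SNPS} genotypes, got {len(genotypes)}")
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("each genotype must be coded 0, 1 or 2")
    return sum(genotypes)

def risk_category(count):
    """Categories compared in the abstract; the middle band is an assumption."""
    if count >= 12:
        return "12 or more"
    if count < 6:
        return "fewer than 6"
    return "6 to 11"  # intermediate band, not described in the abstract

# Hypothetical individual, for illustration only
genotypes = [2, 2, 2, 2, 2, 1, 1, 0, 0]
count = risk_allele_count(genotypes)
print(count, risk_category(count))
```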
10.  Association of GWAS-Identified Lung Cancer Susceptibility Loci with Survival Length in Patients with Small-Cell Lung Cancer Treated with Platinum-Based Chemotherapy 
PLoS ONE  2014;9(11):e113574.
Genetic variants have been shown to affect length of survival in cancer patients. This study explored the association between lung cancer susceptibility loci tagged by single-nucleotide polymorphisms (SNPs) identified in genome-wide association studies and length of survival in small-cell lung cancer (SCLC). Eighteen SNPs were genotyped among 874 SCLC patients, and Cox proportional hazards regression was used to examine the effects of genotype on survival length under an additive model with age, sex, smoking status and clinical stage as covariates. We identified 3 loci, 20q13.2 (rs4809957 G>A), 22q12.2 (rs36600 C>T) and 5p15.33 (rs401681 C>T), significantly associated with the survival time of SCLC patients. The adjusted hazard ratios (HRs) for patients with the rs4809957 GA or AA genotype were 0.80 (95% CI, 0.66–0.96; P = 0.0187) and 0.73 (95% CI, 0.55–0.96; P = 0.0263), respectively, compared with the GG genotype. Using the dominant model, the adjusted HRs for patients carrying at least one T allele at rs36600 or rs401681 were 0.78 (95% CI, 0.63–0.96; P = 0.0199) and 1.29 (95% CI, 1.08–1.55; P = 0.0047), respectively, compared with the CC genotype. Stratification analyses showed that the significant associations of these 3 loci were seen only in smokers and male patients. The rs4809957 SNP was significantly associated with length of survival only in patients with extensive-stage tumors, not in those with limited-stage tumors. These results suggest that some of the lung cancer susceptibility loci might also affect the prognosis of SCLC.
PMCID: PMC4240611  PMID: 25415319
11.  Genome-wide association analysis identifies new lung cancer susceptibility loci in never-smoking women in Asia 
Lan, Qing | Hsiung, Chao A | Matsuo, Keitaro | Hong, Yun-Chul | Seow, Adeline | Wang, Zhaoming | Hosgood, H Dean | Chen, Kexin | Wang, Jiu-Cun | Chatterjee, Nilanjan | Hu, Wei | Wong, Maria Pik | Zheng, Wei | Caporaso, Neil | Park, Jae Yong | Chen, Chien-Jen | Kim, Yeul Hong | Kim, Young Tae | Landi, Maria Teresa | Shen, Hongbing | Lawrence, Charles | Burdett, Laurie | Yeager, Meredith | Yuenger, Jeffrey | Jacobs, Kevin B | Chang, I-Shou | Mitsudomi, Tetsuya | Kim, Hee Nam | Chang, Gee-Chen | Bassig, Bryan A | Tucker, Margaret | Wei, Fusheng | Yin, Zhihua | Wu, Chen | An, She-Juan | Qian, Biyun | Lee, Victor Ho Fun | Lu, Daru | Liu, Jianjun | Jeon, Hyo-Sung | Hsiao, Chin-Fu | Sung, Jae Sook | Kim, Jin Hee | Gao, Yu-Tang | Tsai, Ying-Huang | Jung, Yoo Jin | Guo, Huan | Hu, Zhibin | Hutchinson, Amy | Wang, Wen-Chang | Klein, Robert | Chung, Charles C | Oh, In-Jae | Chen, Kuan-Yu | Berndt, Sonja I | He, Xingzhou | Wu, Wei | Chang, Jiang | Zhang, Xu-Chao | Huang, Ming-Shyan | Zheng, Hong | Wang, Junwen | Zhao, Xueying | Li, Yuqing | Choi, Jin Eun | Su, Wu-Chou | Park, Kyong Hwa | Sung, Sook Whan | Shu, Xiao-Ou | Chen, Yuh-Min | Liu, Li | Kang, Chang Hyun | Hu, Lingmin | Chen, Chung-Hsing | Pao, William | Kim, Young-Chul | Yang, Tsung-Ying | Xu, Jun | Guan, Peng | Tan, Wen | Su, Jian | Wang, Chih-Liang | Li, Haixin | Sihoe, Alan Dart Loon | Zhao, Zhenhong | Chen, Ying | Choi, Yi Young | Hung, Jen-Yu | Kim, Jun Suk | Yoon, Ho-Il | Cai, Qiuyin | Lin, Chien-Chung | Park, In Kyu | Xu, Ping | Dong, Jing | Kim, Christopher | He, Qincheng | Perng, Reury-Perng | Kohno, Takashi | Kweon, Sun-Seog | Chen, Chih-Yi | Vermeulen, Roel | Wu, Junjie | Lim, Wei-Yen | Chen, Kun-Chieh | Chow, Wong-Ho | Ji, Bu-Tian | Chan, John K C | Chu, Minjie | Li, Yao-Jen | Yokota, Jun | Li, Jihua | Chen, Hongyan | Xiang, Yong-Bing | Yu, Chong-Jen | Kunitoh, Hideo | Wu, Guoping | Jin, Li | Lo, Yen-Li | Shiraishi, Kouya | Chen, Ying-Hsiang | Lin, Hsien-Chih | Wu, Tangchun | Wu, Yi-Long | Yang, Pan-Chyr
| Zhou, Baosen | Shin, Min-Ho | Fraumeni, Joseph F | Lin, Dongxin | Chanock, Stephen J | Rothman, Nathaniel
Nature genetics  2012;44(12):1330-1335.
To identify common genetic variants that contribute to lung cancer susceptibility, we conducted a multistage genome-wide association study of lung cancer in Asian women who never smoked. We scanned 5,510 never-smoking female lung cancer cases and 4,544 controls drawn from 14 studies from mainland China, South Korea, Japan, Singapore, Taiwan, and Hong Kong. We genotyped the most promising variants (associated at P < 5 × 10−6) in an additional 1,099 cases and 2,913 controls. We identified three new susceptibility loci at 10q25.2 (rs7086803, P = 3.54 × 10−18), 6q22.2 (rs9387478, P = 4.14 × 10−10) and 6p21.32 (rs2395185, P = 9.51 × 10−9). We also confirmed associations reported for loci at 5p15.33 and 3q28 and a recently reported finding at 17q24.3. We observed no evidence of association for lung cancer at 15q25 in never-smoking women in Asia, providing strong evidence that this locus is not associated with lung cancer independent of smoking.
PMCID: PMC4169232  PMID: 23143601
12.  Replication of the 4p16 Susceptibility Locus in Congenital Heart Disease in Han Chinese Populations 
PLoS ONE  2014;9(9):e107411.
Congenital heart disease (CHD) is the most common form of congenital human birth anomalies and a leading cause of perinatal and infant mortality. Some studies, including our published genome-wide association study (GWAS) of CHD, have indicated that genetic variants may contribute to the risk of CHD. Recently, Cordell et al. published a GWAS of multiple CHD phenotypes in European Caucasians and identified 3 susceptibility loci (rs870142, rs16835979 and rs6824295) for ostium secundum atrial septal defect (ASD) at chromosome 4p16. However, whether these loci at 4p16 confer predisposition to CHD in the Chinese population is unclear. In the current study, we first analyzed the associations between these 3 single nucleotide polymorphisms (SNPs) at 4p16 and CHD risk using our existing genome-wide scan data and found that all 3 SNPs showed significant associations with ASD in the same direction as that observed in Cordell’s study, but not with the other subtypes, ventricular septal defect (VSD) and ASD combined with VSD. As these 3 SNPs were in high linkage disequilibrium (LD) in the Chinese population, we selected the SNP with the lowest P value in our GWAS scan (rs16835979) to perform a replication study with an additional 1,709 CHD cases with multiple phenotypes and 1,962 controls. The significant association was again observed only within the ASD subgroup, which was heterogeneous from other disease groups. In combined GWAS and replication samples, the minor allele of rs16835979 remained significantly associated with the risk of ASD (OR = 1.22, 95% CI = 1.08–1.38, P = 0.001). Our findings suggest that susceptibility loci for ASD identified in Cordell’s European GWAS are generalizable to the Chinese population, and such investigation may provide new insights into the roles of genetic variants in the etiology of different CHD phenotypes.
PMCID: PMC4162603  PMID: 25215500
13.  The Spatial Analysis on Hemorrhagic Fever with Renal Syndrome in Jiangsu Province, China Based on Geographic Information System 
PLoS ONE  2014;9(9):e83848.
Hemorrhagic fever with renal syndrome (HFRS) is endemic in mainland China, accounting for 90% of total reported cases worldwide, and Jiangsu is one of the most severely affected provinces. In this study, the authors conducted GIS-based spatial analyses in order to determine the spatial distribution of the HFRS cases, identify key areas and explore risk factors for public health planning and resource allocation.
Interpolation maps by inverse distance weighting were produced to detect the spatial distribution of HFRS cases in Jiangsu from 2001 to 2011. Spatio-temporal clustering was applied to identify clusters at the county level. Spatial correlation analysis was conducted to detect influencing factors of HFRS in Jiangsu.
HFRS cases in Jiangsu from 2001 to 2011 were mapped and the results suggested that cases in Jiangsu were not distributed randomly. Cases were mainly distributed in northeastern and southwestern Jiangsu, especially in Dafeng and Sihong counties. It was notable that prior to this study, Sihong county had rarely been reported as a high-risk area of HFRS. With the maximum spatial size of 50% of the total population and the maximum temporal size of 50% of the total population, spatio-temporal clustering showed that there was one most likely cluster (LLR = 624.52, P<0.0001, RR = 8.19) and one second-most likely cluster (LLR = 553.97, P<0.0001, RR = 8.25), and both of these clusters appeared from 2001 to 2004. Spatial correlation analysis showed that the incidence of HFRS in Jiangsu was influenced by distances to highways, railways, rivers and lakes.
The application of GIS together with spatial interpolation, spatio-temporal clustering and spatial correlation analysis can effectively identify high-risk areas and factors influencing HFRS incidence to lay a foundation for researching its pathogenesis.
PMCID: PMC4160164  PMID: 25207806
14.  Evaluation of functional genetic variants at 6q25.1 and risk of breast cancer in a Chinese population 
Single-nucleotide polymorphisms (SNPs) at 6q25.1 that are associated with breast cancer susceptibility have been identified in several genome-wide association studies (GWASs). However, the exact causal variants in this region have not been clarified.
In the present study, we genotyped six potentially functional SNPs within the CCDC170 and ESR1 gene regions at 6q25.1 and assessed their associations with risk of breast cancer in a study of 1,064 cases and 1,073 cancer-free controls in Chinese women. The biological function of the risk variant was further evaluated by performing laboratory experiments.
Breast cancer risk was significantly associated with three SNPs located at 6q25.1—rs9383935 in CCDC170 and rs2228480 and rs3798758 in ESR1—with variant allele attributed odds ratios (ORs) of 1.38 (95% confidence interval (CI): 1.20 to 1.57, P = 2.21 × 10−6), 0.84 (95% CI: 0.72 to 0.98, P = 0.025) and 1.19 (95% CI: 1.04 to 1.37, P = 0.013), respectively. The functional variant rs9383935 is in high linkage disequilibrium (LD) with GWAS-reported top-hit SNP (rs2046210), but only rs9383935 showed a strong independent effect in conditional regression analysis. The rs9383935 risk allele A showed decreased activity of reporter gene in both the MCF-7 and BT-474 breast cancer cell lines, which might be due to an altered binding capacity of miR-27a to the 3′ untranslated region (3′ UTR) sequence of CCDC170. Real-time quantitative reverse transcription PCR confirmed the correlation between rs9383935 genotypes and CCDC170 expression levels.
The results of this study suggest that the functional variant rs9383935, located at the 3′ UTR of CCDC170, may be one candidate of the causal variants at 6q25.1 that modulate the risk of breast cancer.
Electronic supplementary material
The online version of this article (doi:10.1186/s13058-014-0422-x) contains supplementary material, which is available to authorized users.
PMCID: PMC4303231  PMID: 25116933
15.  Genome-wide association study in Chinese men identifies two new prostate cancer risk loci at 9q31.2 and 19q13.4 
Nature genetics  2012;44(11):1231-1235.
Prostate cancer risk–associated variants have been reported in populations of European descent, African-Americans and Japanese using genome-wide association studies (GWAS). To systematically investigate prostate cancer risk–associated variants in Chinese men, we performed the first GWAS in Han Chinese. In addition to confirming several associations reported in other ancestry groups, this study identified two new risk-associated loci for prostate cancer on chromosomes 9q31.2 (rs817826, P = 5.45 × 10−14) and 19q13.4 (rs103294, P = 5.34 × 10−16) in 4,484 prostate cancer cases and 8,934 controls. The rs103294 marker at 19q13.4 is in strong linkage disequilibrium with a 6.7-kb germline deletion that removes the first six of seven exons in LILRA3, a gene regulating inflammatory response, and was significantly associated with the mRNA expression of LILRA3 in T cells (P < 1 × 10−4). These findings may advance the understanding of genetic susceptibility to prostate cancer.
PMCID: PMC4116636  PMID: 23023329
16.  Genetic variants in STAT4 and HLA-DQ genes confer risk of hepatitis B virus–related hepatocellular carcinoma 
Nature genetics  2012;45(1):72-75.
To identify genetic susceptibility loci for hepatitis B virus (HBV)-related hepatocellular carcinoma (HCC) in the Chinese population, we carried out a genome-wide association study (GWAS) in 2,514 chronic HBV carriers (1,161 HCC cases and 1,353 controls) followed by a 2-stage validation among 6 independent populations of chronic HBV carriers (4,319 cases and 4,966 controls). The joint analyses showed that HCC risk was significantly associated with two independent loci: rs7574865 at STAT4, Pmeta = 2.48 × 10−10, odds ratio (OR) = 1.21; and rs9275319 at HLA-DQ, Pmeta = 2.72 × 10−17, OR = 1.49. The risk allele G at rs7574865 was significantly associated with lower mRNA levels of STAT4 in both the HCC tissues and nontumor tissues of 155 individuals with HBV-related HCC (Ptrend = 0.0008 and 0.0002, respectively). We also found significantly lower mRNA expression of STAT4 in HCC tumor tissues compared with paired adjacent nontumor tissues (P = 2.33 × 10−14).
PMCID: PMC4105840  PMID: 23242368
17.  Evaluation of the Impact of Hepatitis B Vaccination in Adults in Jiangsu Province, China 
PLoS ONE  2014;9(6):e101501.
Hepatitis B immunization programs for newborns, children, and adolescents in China have shown remarkable results. To establish whether there would be any benefit in extending the program to cover older individuals, we examined both the epidemiology of hepatitis B virus (HBV) infection and the coverage of hepatitis B vaccinations among adults born before routine vaccinations were implemented. We then evaluated the impact of hepatitis B vaccination in adults aged 20–59 years. A large-scale cross-sectional epidemiological survey of HBV infection was performed in the province of Jiangsu, south-east China, between September 2009 and March 2010. A total of 86,732 adults aged 20–59 years were included, of whom 8,615 (9.9%, 95% CI = 9.7–10.1%) were HBsAg sero-positive. Self-reported vaccination status suggested that the coverage was approximately 23.7% (95% CI = 23.4–24.0%). Higher HBV vaccination coverage was associated with a lower rate of HBsAg seropositivity among adults. There was a negative correlation between hepatitis B vaccination coverage and HBsAg prevalence (correlation coefficient = −0.805, P = 0.016), which might reflect the combined effects of vaccination and pre-vaccination HBsAg screening. In the unvaccinated group, the HBsAg-positive rate showed a clear upward trend with increasing age among 20–39-year-olds (Trend χ2 = 22.605, P < 0.001), while the vaccinated group showed no such trend (Trend χ2 = 3.462, P = 0.063). Overall, hepatitis B vaccination in adults might reduce the rate of HBsAg positivity. Therefore, routine immunization of adults aged 20–39 years should be seriously considered.
PMCID: PMC4076282  PMID: 24979048
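As a rough check of the prevalence interval reported for the Jiangsu survey above, the following sketch recomputes the proportion and a 95% confidence interval from the published counts (8,615 of 86,732). It assumes a simple normal-approximation (Wald) interval; the abstract does not state which method the authors used, and the function name `wald_interval` is illustrative only.

```python
import math

def wald_interval(successes: int, total: int, z: float = 1.96):
    """Return (proportion, lower, upper) using the normal approximation.

    Assumption: a Wald interval; the abstract does not name the method used.
    """
    p = successes / total
    se = math.sqrt(p * (1 - p) / total)  # standard error of a proportion
    return p, p - z * se, p + z * se

# Counts taken from the abstract: HBsAg sero-positive adults / adults surveyed.
p, lo, hi = wald_interval(8615, 86732)
print(f"{100 * p:.1f}% (95% CI {100 * lo:.1f}-{100 * hi:.1f}%)")
```

Under this assumption the recomputed values round to the figures quoted in the abstract, which suggests the reported interval is consistent with the counts.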
18.  Common genetic determinants of breast-cancer risk in East Asian women: a collaborative study of 23 637 breast cancer cases and 25 579 controls 
Human Molecular Genetics  2013;22(12):2539-2550.
In a consortium including 23 637 breast cancer patients and 25 579 controls of East Asian ancestry, we investigated 70 single-nucleotide polymorphisms (SNPs) in 67 independent breast cancer susceptibility loci recently identified by genome-wide association studies (GWASs) conducted primarily in European-ancestry populations. SNPs in 31 loci showed an association with breast cancer risk at P < 0.05 in a direction consistent with that reported previously. Twenty-one of them remained statistically significant after adjusting for multiple comparisons with the Bonferroni-corrected significance level of <0.0015. Eight of the 70 SNPs showed a significantly different association with breast cancer risk by estrogen receptor (ER) status at P < 0.05. With the exception of rs2046210 at 6q25.1, the seven other SNPs showed a stronger association with ER-positive than ER-negative cancer. This study replicated all five genetic risk variants initially identified in Asians and provided evidence for associations of breast cancer risk in the East Asian population with nearly half of the genetic risk variants initially reported in GWASs conducted in European descendants. Taken together, these common genetic risk variants explain ∼10% of excess familial risk of breast cancer in Asian populations.
PMCID: PMC3658167  PMID: 23535825
19.  A genetic variant in pseudogene E2F3P1 contributes to prognosis of hepatocellular carcinoma 
Journal of Biomedical Research  2014;28(3):194-200.
Certain pseudogenes may regulate their protein-coding cousins by competing for miRNAs and play an active biological role in cancer. However, few studies have focused on the association of genetic variations in pseudogenes with cancer prognosis. We selected six potentially functional single nucleotide polymorphisms (SNPs) in cancer-related pseudogenes, and performed a case-only study to assess the association between those SNPs and the prognosis of hepatocellular carcinoma (HCC) in 331 HBV-positive HCC patients without surgical treatment. Log-rank test and Cox proportional hazard models were used for survival analysis. We found that the A allele of rs9909601 in E2F3P1 was significantly associated with a better prognosis compared with the G allele [adjusted hazard ratio (HR) = 0.69, 95% confidence interval (CI) = 0.56–0.86, P = 0.001]. Additionally, this protective effect was more pronounced in patients without chemotherapy and transcatheter hepatic arterial chemoembolization (TACE) treatment. We also detected a statistically significant multiplicative interaction between genotypes of rs9909601 and chemotherapy or TACE status on HCC survival (P for multiplicative interaction < 0.001). These findings indicate that rs9909601 in the pseudogene E2F3P1 may be a genetic marker for HCC prognosis in Chinese patients.
PMCID: PMC4085556  PMID: 25013402
pseudogene; E2F3P1; SNP; hepatocellular carcinoma (HCC); prognosis
20.  A Genetic Variant in Primary miR-378 Is Associated with Risk and Prognosis of Hepatocellular Carcinoma in a Chinese Population 
PLoS ONE  2014;9(4):e93707.
MiR-378 has been reported to be related to cell survival, tumor growth and angiogenesis and may participate in hepatocellular carcinoma (HCC) development and prognosis. Genetic variants in primary miR-378 (pri-miR-378) may impact miR-378 expression and contribute to HCC risk and survival. This study aimed to assess the associations between a genetic variant in primary miR-378 and HCC susceptibility and prognosis.
We conducted a case-control study to analyze the association of rs1076064 in pri-miR-378 with hepatocellular carcinoma risk in 1300 hepatitis B virus (HBV)-positive HCC patients and 1344 HBV carriers. Then, we evaluated the correlation between the polymorphism and hepatocellular carcinoma prognosis in 331 HCC patients at either intermediate or advanced stage without surgical treatment.
The variant genotypes of rs1076064 were associated with a decreased HCC risk in HBV carriers [adjusted odds ratio (OR) = 0.90, 95% confidence interval (CI) = 0.81–1.00, P = 0.047]. Moreover, the variant genotypes were associated with better survival in HCC patients [adjusted hazard ratio (HR) = 0.70, 95% CI = 0.59–0.83, P<0.0001 in an additive genetic model]. The reporter gene assay showed that the variant G allele of rs1076064 exerted higher promoter activity than the A allele.
These findings indicate that rs1076064 may be a biomarker for HCC susceptibility and prognosis through altering pri-miR-378 transcription.
PMCID: PMC3994025  PMID: 24751683
21.  Genetic Variants at 10p11 Confer Risk of Tetralogy of Fallot in Chinese of Nanjing 
PLoS ONE  2014;9(3):e89636.
A recent genome-wide association study (GWAS) has identified a new subset of susceptibility loci of Tetralogy of Fallot (TOF), one form of cyanotic congenital heart disease (CHD), on chromosomes 10p11, 10p14, 12q24, 13q31, 15q13 and 16q12 in Europeans. In the current study, we conducted a case-control study in a Chinese population including 1,010 CHD cases [atrial septal defect (ASD), ventricular septal defect (VSD) and TOF] and 1,962 controls to evaluate the associations of these loci with risk of CHD. We found that rs2228638 in NRP1 on 10p11 significantly increased the risk of TOF (OR = 1.52, 95% CI = 1.13–2.04, P = 0.006), but not the risk in other subgroups including ASD and VSD. In addition, no significant associations were observed between the other loci and the risk of ASD, VSD or TOF. Our results suggest that the genetic variants on 10p11 may serve as candidate markers for TOF susceptibility in the Chinese population.
PMCID: PMC3940663  PMID: 24594544
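Odds ratios in this listing are reported with a 95% CI and a P value, and the three can be checked against each other. The sketch below back-calculates an approximate two-sided P value from the TOF estimate above (OR = 1.52, 95% CI = 1.13–2.04). It assumes a Wald-type interval that is symmetric on the log scale; the abstract does not state how the interval was derived, and `p_from_ratio_ci` is a hypothetical helper name.

```python
import math

def p_from_ratio_ci(ratio: float, lower: float, upper: float,
                    z_crit: float = 1.96) -> float:
    """Approximate two-sided P value for a ratio estimate from its 95% CI.

    Assumption: the interval is symmetric on the log scale (Wald-type).
    """
    # Recover the standard error of log(ratio) from the interval width.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se
    # Two-sided standard normal tail area via the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))

p_tof = p_from_ratio_ci(1.52, 1.13, 2.04)
print(f"approximate P = {p_tof:.3f}")
```

Because the published OR and limits are rounded to two decimals, the result should only be expected to land close to the reported P = 0.006, not to match it exactly.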
22.  Stability SCAD: a powerful approach to detect interactions in large-scale genomic study 
BMC Bioinformatics  2014;15:62.
Evidence suggests that common complex diseases may be partially due to SNP-SNP interactions, but detection of such interactions is yet to be fully established in high-dimensional small-sample (small-n-large-p) studies. A number of penalized regression techniques are gaining popularity within the statistical community, and are now being applied to detect interactions. These techniques tend to over-fit, and are prone to false positives. The recently developed stability least absolute shrinkage and selection operator (SLASSO) has been used to control family-wise error rate, but often at the expense of power (and thus with more false negative results).
Here, we propose an alternative stability selection procedure known as stability smoothly clipped absolute deviation (SSCAD). Briefly, this method applies a smoothly clipped absolute deviation (SCAD) algorithm to multiple sub-samples, and then identifies a cluster ensemble of interactions across the sub-samples. The proposed method was compared with SLASSO and two kinds of traditional penalized methods by intensive simulation. The simulation revealed higher power and lower false discovery rate (FDR) with SSCAD. An analysis using the new method on the previously published GWAS of lung cancer confirmed all significant interactions identified with SLASSO, and identified two additional interactions not reported with SLASSO analysis.
Based on the results obtained in this study, SSCAD appears to be a powerful procedure for the detection of SNP-SNP interactions in large-scale genomic data.
PMCID: PMC3984751  PMID: 24580776
Genome-wide association study (GWAS); Interaction; Least absolute shrinkage and selection operator (LASSO); Penalized logistic regression; Smoothly clipped absolute deviation (SCAD); Stability selection
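The two ingredients named in the SSCAD abstract above, the SCAD penalty and repeated selection over sub-samples, can be illustrated with a minimal sketch. It is not the authors' implementation: the penalty follows the standard SCAD definition with the conventional default a = 3.7, the penalized model fit and the cluster-ensemble step are not reproduced, and `select` is a hypothetical callback standing in for whatever selector is run on each sub-sample.

```python
import random

def scad_penalty(beta: float, lam: float, a: float = 3.7) -> float:
    """Standard SCAD penalty for one coefficient (a = 3.7 is the usual default)."""
    b = abs(beta)
    if b <= lam:                      # LASSO-like region near zero
        return lam * b
    if b <= a * lam:                  # quadratic transition region
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2    # constant: large effects are not shrunk further

def selection_frequencies(n_samples, n_features, select,
                          n_subsamples=100, fraction=0.5, seed=0):
    """Fraction of sub-samples in which each feature index is selected.

    `select` (hypothetical) takes a list of row indices and returns the
    feature indices chosen on that sub-sample.
    """
    rng = random.Random(seed)
    counts = [0] * n_features
    size = max(1, int(fraction * n_samples))
    for _ in range(n_subsamples):
        rows = rng.sample(range(n_samples), size)  # sub-sample without replacement
        for j in select(rows):
            counts[j] += 1
    return [c / n_subsamples for c in counts]

# Toy selector: always picks feature 0, picks feature 1 only if row 0 is drawn.
freqs = selection_frequencies(20, 3, lambda rows: {0} | ({1} if 0 in rows else set()))
print(freqs)
```

Features that are selected in nearly every sub-sample (here feature 0) are the stable ones; thresholding these frequencies is the general idea behind stability selection.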
23.  Population Aging and Migrant Workers: Bottlenecks in Tuberculosis Control in Rural China 
PLoS ONE  2014;9(2):e88290.
Tuberculosis is a serious global health problem. Its paradigms are shifting through time, especially in rapidly developing countries such as China. Health providers in China are at the forefront of the battle against tuberculosis; however, there are few empirical studies on health providers' perspectives on the challenges they face in tuberculosis control at the county level in China. This study was conducted among health providers to explore their experiences with tuberculosis control in order to identify bottlenecks and emerging challenges in controlling tuberculosis in rural China.
A qualitative approach was used. Semi-structured, in-depth interviews were conducted with 17 health providers working in various positions within the health system of one rural county (ZJG) of China. Data were analyzed based on thematic content analysis using MAXQDA 10 qualitative data analysis software.
Health providers reported several problems in tuberculosis control in ZJG county. Migrant workers and the elderly were repeatedly documented as the main obstacles in effective tuberculosis control in the county. At a personal level, doctors showed their frustration with the lack of new drugs for treating tuberculosis patients, and their opinions varied regarding incentives for referring patients.
The results suggest that several problems still remain for controlling tuberculosis in rural China. Tuberculosis control efforts need to make reaching the most vulnerable populations a priority and encourage local health providers to adopt innovative practices in the local context based on national guidelines to achieve the best results. Considerable changes in China's National Tuberculosis Control Program are needed to tackle these emerging challenges faced by health workers at the county level.
PMCID: PMC3912209  PMID: 24498440
24.  Genetic Variations in the Flanking Regions of miR-101-2 Are Associated with Increased Risk of Breast Cancer 
PLoS ONE  2014;9(1):e86319.
Genetic variants in human microRNA (miRNA) genes may alter mature miRNA processing and/or target selection, and likely contribute to cancer susceptibility and disease progression. Previous studies have suggested that miR-101 may play important roles in the development of cancer by regulating key tumor-associated genes. However, the role of single nucleotide polymorphisms (SNPs) of miR-101 in breast cancer susceptibility remains unclear. In this study, we genotyped 11 SNPs of the miR-101 genes (including miR-101-1 and miR-101-2) in a case-control study of 1064 breast cancer cases and 1073 cancer-free controls. The results revealed that rs462480 and rs1053872 in the flanking regions of pre-miR-101-2 were significantly associated with increased risk of breast cancer (rs462480 AC/CC vs AA: adjusted OR = 1.182, 95% CI: 1.030–1.357, P = 0.017; rs1053872 CG/GG vs CC: adjusted OR = 1.179, 95% CI: 1.040–1.337, P = 0.010). However, the remaining 9 SNPs were not significantly associated with risk of breast cancer. Additionally, combined analysis of the two high-risk SNPs revealed that subjects carrying the variant genotypes of rs462480 and rs1053872 had increased risk of breast cancer in a dose-response manner (Ptrend = 0.002). Compared with individuals with “0–1” risk alleles, those carrying “2–4” risk alleles had a 1.29-fold risk of breast cancer. In conclusion, these findings suggest that the SNPs rs462480 and rs1053872 residing in the miR-101-2 gene may have a solid impact on genetic susceptibility to breast cancer, which may improve our understanding of the potential contribution of miRNA SNPs to cancer pathogenesis.
PMCID: PMC3901682  PMID: 24475105
25.  Spatio-Temporal Trends and Risk Factors for Shigella from 2001 to 2011 in Jiangsu Province, People's Republic of China 
PLoS ONE  2014;9(1):e83487.
This study aimed to describe the spatial and temporal trends of Shigella incidence rates in Jiangsu Province, People's Republic of China. It also intended to explore complex risk modes facilitating Shigella transmission.
County-level incidence rates were obtained for analysis using geographic information system (GIS) tools. Trend surface and incidence maps were established to describe geographic distributions. Spatio-temporal cluster analysis and autocorrelation analysis were used for detecting clusters. Based on the number of monthly Shigella cases, a time series model was fitted using an autoregressive integrated moving average (ARIMA) approach. A spatial correlation analysis and a case-control study were conducted to identify risk factors contributing to Shigella transmission.
The far southwestern and northwestern areas of Jiangsu were the most affected. A cluster was detected in southwestern Jiangsu (LLR = 11674.74, P<0.001). The time series model was established as ARIMA (1, 12, 0), which predicted well for cases from August to December, 2011. Highways and water sources potentially caused spatial variation in Shigella development in Jiangsu. The case-control study confirmed not washing hands before dinner (OR = 3.64) and not having access to a safe water source (OR = 2.04) as the main risk factors for Shigella infection in Jiangsu Province.
Improvement of sanitation and hygiene should be strengthened in economically developed counties, while access to a safe water supply in impoverished areas should be increased at the same time.
PMCID: PMC3885411  PMID: 24416167

