Gastric cancer (GC) and esophageal cancer cause more than 700,000 and 400,000 deaths respectively each year, and represent the 2nd and 6th leading causes of cancer death worldwide¹. For GC, infection with Helicobacter pylori is the primary etiologic factor in all populations, although the majority of infected individuals do not develop cancer. Smoking tobacco and drinking alcoholic beverages explain nearly 90% of esophageal squamous cell carcinoma (ESCC) cases in the United States and other Western countries², but these exposures represent minor factors in high-risk populations in China³ and elsewhere⁴. Risk factors for ESCC in populations with high incidence rates include family history⁵ and dietary deficiencies⁶, but a large proportion of the etiology in these populations remains unexplained. GC and ESCC occur in the Taihang Mountains of North-Central China at some of the highest rates reported for any cancer⁷; over 20% of all deaths in this area have been attributed to these cancers⁸,⁹. However, the causes of the high rates and of the geographic correlation of these two anatomically adjacent but histologically distinct tumors have not been determined. The gastric cancers in this area occur primarily in the uppermost portion of the stomach (proximal 3 cm) and are referred to as gastric cardia cancers, while those in the remainder of the stomach are referred to as gastric noncardia cancers. In most other parts of China, gastric noncardia cancers are the predominant upper gastrointestinal tract tumors¹⁰.
To investigate the genetic contribution to these highly fatal diseases in ethnic Chinese subjects, we conducted parallel genome-wide association studies (GWAS) for GC and ESCC with shared controls. Using the Illumina 660W Quad chip, we scanned 4,987 samples from the case-control and case-only components of the Shanxi Upper Gastrointestinal Cancer Genetics Project (Shanxi) and 1,389 samples from a prospective cohort, the Linxian Nutrition Intervention Trials (NIT); both studies were conducted in the Taihang Mountains (Supplementary Table 1). After quality control metrics were applied (Online Methods), 551,152 SNPs were analyzed in 1,625 cases of GC, 1,898 cases of ESCC and 2,100 controls. A set of 12,000 SNPs with minimal linkage disequilibrium (pair-wise r² < 0.004) was used to test for differences in population substructure¹¹ and showed no significant evidence of substructure within study (data not shown). In a second phase, we optimized TaqMan assays to genotype eight SNPs that were significant in the genome-wide phase for GC, ESCC, or both in an independent set of subjects (615 GC cases, 217 ESCC cases and 1,202 controls) from the Shanxi and NIT studies and three additional prospective cohorts (the Shanghai Men's Health Study, the Shanghai Women's Health Study, and the Singapore Chinese Cohort Study) (Supplementary Table 1). For these eight SNPs, we conducted a combined analysis of 2,240 GC cases, 2,115 ESCC cases, and 3,302 controls (details in Supplementary Table 1).
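The pair-wise r² criterion used to select SNPs for the substructure analysis can be illustrated with a minimal sketch. It assumes genotypes coded as 0, 1 or 2 copies of an allele and uses the squared Pearson correlation of those codes, a common approximation to r² for unphased data. The function names and the greedy selection rule are illustrative assumptions; the selection procedure actually used in the study is not described in this section.

```python
def genotype_r2(g1, g2):
    """Squared Pearson correlation between two SNPs coded 0/1/2.

    Subjects with a missing genotype (None) at either SNP are skipped.
    """
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two subjects genotyped at both SNPs")
    mean1 = sum(a for a, _ in pairs) / n
    mean2 = sum(b for _, b in pairs) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in pairs)
    var1 = sum((a - mean1) ** 2 for a, _ in pairs)
    var2 = sum((b - mean2) ** 2 for _, b in pairs)
    if var1 == 0 or var2 == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var1 * var2)


def prune_by_r2(genotypes, threshold=0.004):
    """Greedy selection (an assumption, not the study's documented method):
    keep a SNP only if its r2 with every SNP already kept is below threshold.
    `genotypes` maps a SNP name to its list of 0/1/2 codes."""
    kept = []
    for name, codes in genotypes.items():
        if all(genotype_r2(codes, genotypes[k]) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data with hypothetical SNP names: snp_b duplicates snp_a, snp_c is uncorrelated.
toy = {
    "snp_a": [0, 0, 2, 2],
    "snp_b": [0, 0, 2, 2],
    "snp_c": [0, 2, 0, 2],
}
print(prune_by_r2(toy))
```

In practice, pruning at genome scale is restricted to SNP pairs within a physical window, because comparing all pairs is quadratic in the number of SNPs.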
The results of the initial GWAS for GC and ESCC, which were analyzed independently, are presented as Manhattan plots in Supplementary Figure 1 using P-values from 1 df trend tests in logistic regression models adjusted for age, sex, and study. We found independent genome-wide significant associations at chromosome 10q23 for both GC and ESCC (Tables 1 and 2). For GC, an association initially observed at chromosome 1q22 was not supported in the combined data (Table 1); additional studies are required to determine if this locus contributes to risk for GC in ethnic Chinese.
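A 1 df trend test in logistic regression models the log odds of disease as a linear function of the number of copies of the tested allele, so that the exponentiated slope is the per-allele odds ratio. The sketch below is a simplified, unadjusted version fitted by Newton-Raphson on genotype counts. It omits the age, sex and study covariates used in the reported models, and the function name and example counts are hypothetical.

```python
from math import erfc, exp, sqrt


def trend_test(case_counts, control_counts, max_iter=50, tol=1e-10):
    """Unadjusted 1 df trend test: logit P(case) = b0 + b1 * allele_count.

    case_counts and control_counts are (n_0, n_1, n_2), the numbers of
    subjects carrying 0, 1 or 2 copies of the tested allele.
    Returns (per-allele OR, 95% CI lower, 95% CI upper, two-sided Wald P).
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for x, (cases, controls) in enumerate(zip(case_counts, control_counts)):
            n = cases + controls
            p = 1.0 / (1.0 + exp(-(b0 + b1 * x)))
            resid = cases - n * p      # contribution to the score vector
            w = n * p * (1.0 - p)      # contribution to the information matrix
            g0 += resid
            g1 += resid * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        if det <= 0:
            raise ValueError("at least two genotype groups must be observed")
        # Newton-Raphson step: solve the 2x2 system (information) * delta = score.
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) < tol and abs(d1) < tol:
            break
    # Standard error of the slope from the inverse information matrix
    # (evaluated at the last iterate, which equals the estimate at convergence).
    se = sqrt(i00 / det)
    z = b1 / se
    p_value = erfc(abs(z) / sqrt(2.0))
    return exp(b1), exp(b1 - 1.959964 * se), exp(b1 + 1.959964 * se), p_value


# Hypothetical genotype counts, chosen only to illustrate the call.
odds_ratio, ci_low, ci_high, p_value = trend_test((100, 150, 225), (100, 100, 100))
print(f"per-allele OR = {odds_ratio:.2f} ({ci_low:.2f}-{ci_high:.2f}), P = {p_value:.2e}")
```

Adjusting for covariates would add columns to the design matrix and is usually delegated to a statistics package. The two-parameter case is written out here only to make the per-allele interpretation explicit.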
| Table 1. Association between SNPs at 10q23 and 1q22 and risk for gastric cancer in all cases and by anatomic location within the stomach |
At 10q23, we analyzed a set of five correlated SNPs in both GC and ESCC, including two nonsynonymous variants. The strongest association for GC was with rs3781264 (P = 3.76 × 10⁻⁹; per allele OR = 1.36, 95% c.i. 1.23–1.50). The other four SNPs at 10q23 also showed genome-wide significance (Table 1). The associations differed when gastric cancers were divided into the two anatomic subsites. The strongest association for gastric cardia cancer was with rs2274223 (P = 4.19 × 10⁻¹⁵; OR = 1.57, 95% c.i. 1.40–1.76), but there was no association for gastric noncardia cancers (P = 0.44; OR = 1.05, 95% c.i. 0.93–1.20). rs2274223 and other SNPs at 10q23 also showed genome-wide significance with ESCC (P = 3.85 × 10⁻⁹; OR = 1.34, 95% c.i. 1.22–1.48) (Table 2). We found consistent results when comparing the two studies from the high incidence areas of the Taihang Mountains (Supplementary Table 2). The five SNPs at 10q23, which have strong pair-wise LD (r² from 0.62 to 0.98 in controls), map to the phospholipase C ε 1 gene (PLCE1) that lies adjacent to the nucleolar complex associated 3 homolog gene (NOC3L).
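As a rough consistency check on the estimates quoted above, a reported odds ratio and its 95% confidence interval can be converted back to an approximate standard error, Wald z statistic and two-sided P-value on the log-odds scale. The sketch below assumes a symmetric Wald interval on the log scale. Because the published limits are rounded to two decimals, the recovered P-values agree with the reported ones only approximately. The function name is illustrative.

```python
from math import erfc, log, sqrt


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate (SE of log OR, Wald z, two-sided P) from an OR and its 95% CI."""
    se = (log(ci_high) - log(ci_low)) / (2.0 * z_crit)
    z = log(odds_ratio) / se
    p_value = erfc(abs(z) / sqrt(2.0))
    return se, z, p_value


# Estimates quoted in the text for 10q23 (OR, lower limit, upper limit).
reported = {
    "GC, rs3781264": (1.36, 1.23, 1.50),
    "gastric cardia, rs2274223": (1.57, 1.40, 1.76),
    "gastric noncardia, rs2274223": (1.05, 0.93, 1.20),
    "ESCC, rs2274223": (1.34, 1.22, 1.48),
}
for label, (odds_ratio, lo, hi) in reported.items():
    se, z, p_value = wald_from_ci(odds_ratio, lo, hi)
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, approx. P = {p_value:.1e}")
```

The check reproduces the qualitative pattern in the text: a very small P-value for cardia tumors and ESCC, and no evidence of association for noncardia tumors.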
| Table 2. Association between SNPs at 22q12 and 10q23 and risk for esophageal squamous cell carcinoma |
The SNPs that showed significant associations for GC and ESCC at 10q23 in the PLCE1 gene included two SNPs that result in missense changes in the coding region, rs2274223 (Arg1927His) and rs3765524 (Ile1777Thr). Further work is required to determine if either of these SNPs is functionally important, but the findings suggest a single locus associated with risk for both cancers. Notably, when gastric cancers were divided into the two distinct anatomic locations, the association was restricted to tumors of the cardia (Table 1).
PLCE1 is a member of the phospholipase C family of proteins and, uniquely within this family, it interacts with the proto-oncogene ras¹² among other proteins. Variants in PLCE1 are known to cause early-onset nephrotic syndrome in humans¹³, but this gene may also be linked to carcinogenic processes. PLCE1 knockout mice are resistant to the promoting effects of 12-O-tetradecanoylphorbol-13-acetate in 7,12-dimethylbenzanthracene-induced skin carcinogenesis¹⁴ and are resistant to intestinal tumor formation when crossed with APC(min/+) mice¹⁵. In addition, the SNPs reside in an area between two recombination hot spots that also includes NOC3L, which has been linked to control of DNA replication during mitotic clonal expansion¹⁶.
For ESCC, we initially observed an independent significant association with rs738722 at chromosome 22q12 (P = 5.67 × 10⁻⁸; OR = 1.32, 95% c.i. 1.19–1.45) (Table 2) in the first phase, but the association was not statistically significant in the second phase by itself. In the combined data the association remained strong (P = 1.41 × 10⁻⁸; OR = 1.30, 95% c.i. 1.19–1.43). This SNP maps to a region within the CHK2 checkpoint homolog gene (CHEK2), but is also in LD with regions of the HscB iron-sulfur cluster co-chaperone homolog gene (HSCB) (Supplementary Fig. 2). Previous studies of Caucasian populations have suggested an association between uncommon variants in CHEK2 (rs2267130 and rs17879961) and risk of upper aerodigestive tract cancers¹⁷,¹⁸, but these SNPs were not included in our scan. Rare variants in CHEK2 have also been associated with susceptibility to breast¹⁹, colorectal, and other cancers²⁰. The 22q12 association appears promising, but in the absence of independent confirmation, further studies are needed to validate it.
We also examined loci previously reported in a GWAS²¹ for GC (Supplementary Table 3). Specifically, we examined rs2920297 and rs2294008 at 8q24; both SNPs are in proximity to the prostate stem cell antigen gene (PSCA). We found no associations for GC overall, but when we restricted our analysis to gastric noncardia tumors, both SNPs showed associations of similar magnitude to those reported in a recent meta-analysis of East Asian studies²² (e.g., rs2294008: OR = 1.35, 95% c.i. 0.94–1.94). For ESCC, we also examined SNPs marking the alcohol-metabolizing genes ADH1B (rs1159918 and rs1042026) and ALDH2 (rs3782886 and rs671) that have been reported in candidate gene studies²³ and in a GWAS²⁴. Overall and in strata defined by alcohol drinking and tobacco smoking, we found no associations with these SNPs (Supplementary Table 4), perhaps because the environmental risk factors for ESCC in our study populations differ from those in previous studies, in which alcohol- and tobacco-related risks were strong. In the Shanxi⁵ and NIT³ studies, the only two studies included in this portion of the analysis, alcoholic beverage and tobacco use are not major ESCC risk factors.
In summary, we conducted parallel genome-wide association studies for GC and ESCC in ethnic Chinese subjects. Variants at 10q23 in PLCE1 showed genome-wide significant associations for gastric cardia cancer and ESCC. These findings suggest that a common genetic mechanism may contribute to the etiology of both cancers. Fine mapping and sequencing at these loci will be required to determine the optimal genetic variants to be studied in laboratory systems to explain these association signals. Additional studies are needed to confirm these findings and to discover more loci associated with risk for GC and ESCC in populations in East Asia and elsewhere²⁵.