Infantile hypertrophic pyloric stenosis (IHPS) is a severe condition characterized by hypertrophy of the pyloric sphincter muscle. We conducted a genome-wide association study (GWAS) on 1,001 surgery-confirmed cases and 2,401 controls from Denmark. The six most strongly associated loci were tested in a replication set of 796 cases and 876 controls. Three SNPs reached genome-wide significance. The first SNP, rs11712066 (odds ratio (OR) = 1.61, P = 1.5 × 10−17) on 3q25.1, is located 150 kb upstream of MBNL1, which regulates splicing transitions occurring shortly after birth. The second SNP, rs573872 (OR = 1.41, P = 4.3 × 10−12), maps to an intergenic region on 3q25.2 some 1.3 Mb downstream of MBNL1. The third SNP, rs29784 (OR = 1.42, P = 1.5 × 10−15) on 5q35.2, is 64 kb downstream of NKX2-5, which is involved in the development of cardiac muscle tissue and in embryonic gut development.
IHPS is a severe condition of infancy in which hypertrophy of the pyloric sphincter smooth muscle leads to obstruction of the gastric outlet. Symptoms typically appear between 2 and 8 weeks after birth 1 and include non-bilious projectile vomiting that progresses to dehydration, weight loss and hypochloremic, hypokalemic metabolic alkalosis. Treatment may require correction of electrolyte disturbances and curative surgical incision of the pyloric sphincter muscle. The incidence of IHPS among whites is 1.5 to 3 per 1,000 live births 2, 3, and it is the most common condition requiring surgery in the first months of life 4. There is a pronounced male excess in the incidence of IHPS, with affected boys outnumbering affected girls by a ratio of 4 to 1 5, 6.
While environmental risk factors such as exposure to erythromycin 7 and bottle feeding 8 have been reported, IHPS also aggregates strongly in families 6. A population-based cohort study of almost 2 million children found a nearly 200-fold increased risk in monozygotic twins and a 20-fold increased risk among siblings, and estimated the heritability to be 87% 3. IHPS is generally regarded as a complex disease with multiple genetic and environmental factors contributing to disease pathogenesis 2, 9. Several linkage and candidate gene studies have been conducted (reviewed by Panteli 10), but no genetic variants have shown replicated association.
To identify IHPS susceptibility loci, we analyzed the association between the disease and 523,420 SNPs in 1,001 cases and 2,401 controls of Danish descent. All IHPS cases were surgery-confirmed, singletons, and without major congenital malformations (see Supplementary Note and Supplementary Table 1 for details). We identified six loci associated at P ≤ 1 × 10−6 (Figure 1 and Supplementary Figure 1). To confirm the observed associations, we genotyped and tested the most significant SNP at each of the six loci in a replication sample of 796 cases (also surgery-confirmed) and 876 controls, all of Danish descent (see Supplementary Table 2 for basic characteristics of the discovery and replication samples). Three loci replicated with combined P values < 1 × 10−11, two loci were close to nominal significance in the replication stage but remained suggestive, and one locus failed to replicate (Table 1). Genotyping of one additional SNP at each locus produced essentially the same results (Supplementary Table 3), and correction for possible population substructure did not influence our findings (Supplementary Table 4). We found no evidence of interaction between the top three loci (data not shown). Assuming a multifactorial liability-threshold model 11, the three confirmed SNPs collectively explain 1.8% of the variance in liability to IHPS. Considering the combined number of risk alleles across the three SNPs, the 8% of children carrying five or six risk alleles were at almost five times higher risk of IHPS than the 8% carrying zero or one risk allele (OR = 4.91, 95% confidence interval 3.59 to 6.71).
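The odds ratio and confidence interval quoted for the risk-allele comparison can be checked for internal consistency. The following is a minimal sketch, assuming the interval is a standard symmetric Wald interval on the log odds ratio scale (the text does not state how it was computed); the function names are illustrative and not taken from the study.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def se_from_ci(lower, upper, z_crit=Z_95):
    """Standard error of log(OR) implied by a symmetric Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)


def wald_ci(odds_ratio, se, z_crit=Z_95):
    """Wald confidence interval for an odds ratio, built on the log scale."""
    log_or = math.log(odds_ratio)
    return (math.exp(log_or - z_crit * se), math.exp(log_or + z_crit * se))


# Values quoted in the text for 5-6 versus 0-1 risk alleles.
reported_or, reported_lo, reported_hi = 4.91, 3.59, 6.71

se = se_from_ci(reported_lo, reported_hi)  # SE backed out of the interval
lo, hi = wald_ci(reported_or, se)          # interval rebuilt around the OR
z = math.log(reported_or) / se             # implied Wald z statistic

print(f"SE(log OR) = {se:.3f}, rebuilt CI = ({lo:.2f}, {hi:.2f}), z = {z:.1f}")
```

Under that assumption, the rebuilt interval agrees with the reported one to two decimal places, which is what one would expect if the point estimate sits at the geometric midpoint of the interval.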
The first genome-wide significant variant, rs11712066 (OR = 1.61, P = 1.5 × 10−17), is located 150 kb upstream of MBNL1 on 3q25.1, with a recombination hotspot between the SNP and the gene (Figure 2a). The closest genes on the other side, AADAC and SUCNR1, are about 250 kb centromeric. The second identified SNP, rs573872 (OR = 1.41, P = 4.3 × 10−12) on 3q25.2, lies in a gene desert some 1.3 Mb downstream of MBNL1; the closest genes are C3orf79, about 250 kb centromeric, and ARHGEF26, about 370 kb telomeric (Figure 2b). Further centromeric, between rs573872 and MBNL1, there are two small genes, P2RY1 and RAP2B. The third genome-wide significant variant, rs29784 (OR = 1.42, P = 1.5 × 10−15), is located on chromosome 5q35.2 between BNIP1 and NKX2-5 in a linkage disequilibrium (LD) block that contains both genes (Figure 2c).
To explore the association signals further, we imputed unobserved genotypes in the three confirmed regions based on the Interim Phase I release of the 1000 Genomes Project12 (see Online Methods and Supplementary Note for details of imputation). A large number of imputed SNPs showed strong association; at the 3q25.1 and 3q25.2 loci, the strongest associated imputed SNPs had P values of the same order of magnitude as the top genotyped SNP; at the 5q35.2 locus the best imputed SNP rs40264 (OR = 1.49, P = 1.2 × 10−12, r2 to rs29784 = 0.85) had a P value two orders of magnitude smaller than that of rs29784 (Supplementary Table 5, Supplementary Figures 2–4). Most associated imputed SNPs were in tight LD with the top genotyped SNP of the region, and when accounting for the genotyped SNP, no imputed SNP retained P < 10−5 (Supplementary Figures 2–4). A long-range haplotype analysis across the 3q25.1 and 3q25.2 loci did not reveal rare haplotypes with high odds ratios (Supplementary Note and Supplementary Table 6).
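For readers less familiar with the r2 measure of LD used here, a minimal sketch of the usual definition for two biallelic SNPs follows. The haplotype and allele frequencies in the example are hypothetical and chosen only for easy arithmetic; they are not estimates from this study, and the function name is illustrative.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between alleles A and B at two biallelic SNPs.

    p_ab: frequency of the haplotype carrying both allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies, for illustration only.
example_r2 = ld_r2(p_ab=0.4, p_a=0.5, p_b=0.5)
print(f"r2 = {example_r2:.2f}")
```

An r2 close to 1 means that one SNP is almost a perfect proxy for the other, which is why association signals at SNPs in tight LD with the top genotyped SNP largely disappear once that SNP is accounted for.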
To investigate the possible functional impact of the associations, we considered all 294 genotyped or imputed SNPs with P < 10−6 and searched for functional predictions for these SNPs. Nineteen of these SNPs were located in genes (C5orf41 and BNIP1), but all were intronic (Supplementary Table 5). A search of the eQTL browser (see URLs) for the 294 SNPs did not reveal any association with gene expression. Next, we explored ENCODE data13 (using the UCSC genome browser NCBI build 36; see URLs) from chromatin immunoprecipitation sequencing (ChIP-seq) experiments, which showed a 23 kb region of histone modification for H3K36me3 that directly overlies rs11712066. At the 5q35.2 locus there are wider regions of histone modification for H3K27me3 and H3K36me3, both covering rs29784. While histone modification can mark distant-acting enhancers14, speculations on specific mechanisms are premature. A search of the GWAS catalog (see URLs) showed that rs251253 on 5q35.2 (OR = 1.34, P = 6.0 × 10−7, r2 to rs29784 = 0.49) is associated with the electrocardiographic PR interval15. The functional basis for the association is, however, unknown.
In light of the pronounced excess risk of IHPS in males, we also carried out sex-specific analyses of the data (Supplementary Table 7). The three genome-wide significant SNPs showed no evidence of heterogeneity of effects between the sexes. Although rs2228671 on chromosome 19p13.2 failed to replicate, the strong effect in boys (OR = 1.54, P = 7.6 × 10−8) and the absence of an effect in girls (OR = 0.91, P = 0.53) warrant further study.
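As a rough illustration of how a difference in effect between the sexes can be assessed from summary statistics alone, the sketch below backs out approximate standard errors from the quoted odds ratios and P values and forms a z statistic for the difference in log odds ratios. It assumes the reported P values are two-sided and come from Wald-type tests, and that the boys' and girls' estimates are independent; none of this is stated in the text, and the calculation is not an analysis reported by the study. All function names are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (Python 3.8+)


def se_from_p(odds_ratio, p_two_sided):
    """Approximate SE of log(OR), assuming a two-sided Wald test produced p."""
    z = -_STD_NORMAL.inv_cdf(p_two_sided / 2.0)  # |z| implied by the P value
    return abs(math.log(odds_ratio)) / z


def heterogeneity_z(or_1, se_1, or_2, se_2):
    """z statistic for the difference between two independent log odds ratios."""
    diff = math.log(or_1) - math.log(or_2)
    return diff / math.sqrt(se_1 ** 2 + se_2 ** 2)


# Sex-specific estimates for rs2228671 as quoted in the text.
se_boys = se_from_p(1.54, 7.6e-8)
se_girls = se_from_p(0.91, 0.53)

z_het = heterogeneity_z(1.54, se_boys, 0.91, se_girls)
p_het = 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z_het)))  # two-sided P value

print(f"z = {z_het:.2f}, approximate heterogeneity P = {p_het:.4f}")
```

Because the standard errors are reconstructed from rounded summary values, the output should be read as an order-of-magnitude check rather than a formal test.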
The three confirmed SNPs map to regions where MBNL1 and NKX2-5 are the strongest functional candidates for IHPS. Members of the muscleblind protein family are important regulators of alternative splicing16 and almost all human multiexon genes undergo alternative splicing 17. In the early post-natal period, splicing transitions from fetal to adult protein isoforms are essential for the extensive remodeling of muscle tissue that occurs as a part of normal development18. In studies of mouse heart and skeletal muscle development, Mbnl1 has been shown to control a set of temporally correlated splicing transitions that occur within the first 3 weeks of post-natal life 19, 20. Additionally, expression levels of Mbnl1 show distinct temporal changes in the early post-natal period correlated with the splicing transitions 18, 19. The intriguing observation that IHPS occurs almost exclusively between 2 and 8 weeks after birth points to a possible role for misregulation of MBNL1-controlled splicing transitions in the etiology of IHPS. The importance of MBNL1 for normal development of muscle tissue is highly evident in myotonic dystrophy, where loss of function of MBNL1 (caused by mutant DMPK mRNA), and consequent aberrant alternative splicing for many different pre-mRNAs, has a pivotal role in the pathogenesis of the disease 20.
NKX2-5 at chromosome 5q35.2 encodes the homeobox transcription factor NKX2-5, which is essential for normal heart formation and development 21. In humans, a range of NKX2-5 mutations have been identified that cause different congenital heart defects, including atrial septal defects with or without atrioventricular block, isolated ventricular septal defects, and tetralogy of Fallot 22. Although NKX2-5 is not expressed in adult extracardiac tissues 23, studies of embryonic gut development have shown that NKX2-5 is crucial for the formation of pyloric sphincter muscle tissue 24-26. In both chicken and mouse, Nkx2-5 expression occurs in a sharply defined ring of mesenchyme at the junction between the foregut and midgut on specific days of embryonic development 23, 24, 26. Furthermore, repression of Nkx2-5 activity in the pyloric sphincter region results in loss of the pyloric sphincter endodermal phenotype; conversely, formation of pyloric sphincter-like epithelium in other parts of the gizzard (the equivalent of the stomach in the chicken) can be induced by ectopic expression of Nkx2-5 via a retroviral vector 25.
Further experimental studies are needed to uncover the contributions of, for example, distant-acting enhancers, alternative splicing, or other potential mechanisms to the molecular etiology of IHPS. Although biopsies of pyloric sphincter muscle tissue would be difficult to obtain, it would be interesting to compare NKX2-5 expression levels in such tissue from IHPS patients with those from age-matched controls, and to analyze alternative splicing in pyloric sphincter muscle tissue using mRNA sequencing.
In conclusion, we identified three independent, robustly associated loci for IHPS. The findings point to two candidate genes involved in the regulation of alternative splicing, cardiac muscle development, and embryonic gut development. Further functional investigation of the associations identified here will illuminate the biological mechanisms behind this enigmatic condition and may lead to new approaches for screening, prevention or treatment.