

Nat Genet. Author manuscript; available in PMC 2013 June 26.
Published in final edited form as:
PMCID: PMC3693399

Common Variants near MBNL1 and NKX2-5 are Associated with Infantile Hypertrophic Pyloric Stenosis

Infantile hypertrophic pyloric stenosis (IHPS) is a severe condition characterized by hypertrophy of the pyloric sphincter muscle. We conducted a genome-wide association study (GWAS) on 1,001 surgery-confirmed cases and 2,401 controls from Denmark. The six most strongly associated loci were tested in a replication set of 796 cases and 876 controls. Three SNPs reached genome-wide significance. The first, rs11712066 (odds ratio (OR) = 1.61, P = 1.5 × 10^−17) on 3q25.1, is located 150 kb upstream of MBNL1, which regulates splicing transitions occurring shortly after birth. The second SNP, rs573872 (OR = 1.41, P = 4.3 × 10^−12), maps to an intergenic region on 3q25.2 some 1.3 Mb downstream of MBNL1. The third SNP, rs29784 (OR = 1.42, P = 1.5 × 10^−15) on 5q35.2, is 64 kb downstream of NKX2-5, which is involved in development of cardiac muscle tissue and embryonic gut development.

IHPS is a severe condition of infancy in which hypertrophy of the pyloric sphincter smooth muscle leads to obstruction of the gastric outlet. Symptoms typically appear between 2 and 8 weeks after birth 1 and include non-bilious projectile vomiting that progresses to dehydration, weight loss and hypochloremic, hypokalemic metabolic alkalosis. Treatment may require correction of electrolyte disturbances and curative surgical incision of the pyloric sphincter muscle. The incidence of IHPS among whites is 1.5 to 3 per 1000 live births 2, 3, and it is the most common condition requiring surgery in the first months of life 4. There is a pronounced male excess in the incidence of IHPS, with affected boys outnumbering girls in a 4 to 1 ratio 5, 6.

While environmental risk factors such as exposure to erythromycin 7 and bottle feeding 8 have been reported, IHPS also aggregates strongly in families 6. In a population-based cohort study of almost 2 million children, there was a nearly 200-fold increased risk in monozygotic twins and a 20-fold increased risk among siblings, and the heritability was estimated to be 87% 3. IHPS is generally regarded as a complex disease with multiple genetic and environmental factors contributing to disease pathogenesis 2, 9. Several linkage and candidate gene studies have been conducted (reviewed by Panteli 10), but there are no genetic variants with replicated association findings.

To identify IHPS susceptibility loci, we analyzed the association between the disease and 523,420 SNPs in 1,001 cases and 2,401 controls of Danish descent. All IHPS cases were surgery-confirmed, singletons, and without major congenital malformations (see Supplementary Note and Supplementary Table 1 for details). We identified six loci associated at P ≤ 1 × 10^−6 (Figure 1 and Supplementary Figure 1). To confirm the observed associations, we genotyped and tested the most significant SNP at each of the six loci in a replication sample of 796 cases (also surgery-confirmed) and 876 controls, all of Danish descent (see Supplementary Table 2 for basic characteristics of the discovery and replication samples). Three loci replicated with combined P values < 1 × 10^−11, two loci were close to nominal significance in the replication stage but remained suggestive, and one locus failed to replicate (Table 1). Genotyping of one additional SNP at each locus produced essentially the same results (Supplementary Table 3), and correction for possible population substructure did not influence our findings (Supplementary Table 4). We found no evidence of interaction between the top three loci (data not shown). Assuming a multifactorial liability-threshold model 11, the three confirmed SNPs collectively explain 1.8% of the variance in liability to IHPS. Considering the combined number of risk alleles across the three SNPs, the 8% of children carrying five or six risk alleles were at almost five times higher risk of IHPS compared to the 8% with zero or one risk alleles (OR = 4.91, 95% confidence interval 3.59 to 6.71).
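As a worked check on the risk-allele comparison above, the following sketch (not part of the original analysis) back-calculates the standard error of the log odds ratio from the reported 95% confidence interval. It assumes the interval is a symmetric Wald interval on the log scale; the function names are illustrative only.

```python
import math


def se_from_ci(lower, upper, z_crit=1.959964):
    """Standard error of log(OR), assuming a symmetric Wald 95% CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)


def wald_summary(odds_ratio, lower, upper):
    """Return (log OR, SE, z, two-sided P) implied by an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    se = se_from_ci(lower, upper)
    z = log_or / se
    # Two-sided tail probability of a standard normal variable
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return log_or, se, z, p


# Values reported in the text: OR = 4.91, 95% CI 3.59 to 6.71
log_or, se, z, p = wald_summary(4.91, 3.59, 6.71)
print(f"log(OR) = {log_or:.3f}, SE = {se:.3f}, z = {z:.2f}, P = {p:.1e}")
```

Under that assumption the interval should be centred on the point estimate on the log scale, and indeed the geometric mean of 3.59 and 6.71 is close to 4.91.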

Figure 1
Manhattan plot of GWAS for IHPS. The genome-wide distribution of −log10 P values after correction by genomic control (λ = 1.06) is shown across the chromosomes.

Table 1
Discovery, replication and combined results of the loci associated with IHPS. One SNP per region was followed up in the replication stage.

The first genome-wide significant variant, rs11712066 (OR = 1.61, P = 1.5 × 10^−17), is located 150 kb upstream of MBNL1 on 3q25.1, with a recombination hotspot between the SNP and the gene (Figure 2a). The closest genes on the other side, AADAC and SUCNR1, are about 250 kb centromeric. The second identified SNP, rs573872 (OR = 1.41, P = 4.3 × 10^−12) on 3q25.2, lies in a gene desert some 1.3 Mb downstream of MBNL1; the closest genes are C3orf79, about 250 kb centromeric, and ARHGEF26, about 370 kb telomeric (Figure 2b). Further centromeric, between rs573872 and MBNL1, there are two small genes, P2RY1 and RAP2B. The third genome-wide significant variant, rs29784 (OR = 1.42, P = 1.5 × 10^−15), is located on chromosome 5q35.2 between BNIP1 and NKX2-5 in a linkage disequilibrium (LD) block that contains both genes (Figure 2c).

Figure 2
Regional association plots for three confirmed novel IHPS loci. Association plots for the IHPS loci on a) chromosome 3q25.1, b) chromosome 3q25.2, and c) chromosome 5q35.2. SNPs are plotted by chromosomal position (x-axis) against GWAS association with ...

To explore the association signals further, we imputed unobserved genotypes in the three confirmed regions based on the Interim Phase I release of the 1000 Genomes Project 12 (see Online Methods and Supplementary Note for details of imputation). A large number of imputed SNPs showed strong association. At the 3q25.1 and 3q25.2 loci, the most strongly associated imputed SNPs had P values of the same order of magnitude as the top genotyped SNP; at the 5q35.2 locus, the best imputed SNP, rs40264 (OR = 1.49, P = 1.2 × 10^−12, r² to rs29784 = 0.85), had a P value two orders of magnitude smaller than that of rs29784 (Supplementary Table 5, Supplementary Figures 2–4). Most associated imputed SNPs were in tight LD with the top genotyped SNP of the region, and when accounting for the genotyped SNP, we observed no P < 10^−5 (Supplementary Figures 2–4). A long-range haplotype analysis across the 3q25.1 and 3q25.2 loci did not reveal rare haplotypes with high odds ratios (Supplementary Note and Supplementary Table 6).

To investigate the possible functional impact of the associations, we considered all 294 genotyped or imputed SNPs with P < 10^−6 and searched for functional predictions for these SNPs. Nineteen of these SNPs were located in genes (C5orf41 and BNIP1), but all were intronic (Supplementary Table 5). A search of the eQTL browser (see URLs) for the 294 SNPs did not reveal any association with gene expression. Next, we explored ENCODE data 13 (using the UCSC genome browser NCBI build 36; see URLs) from chromatin immunoprecipitation sequencing (ChIP-seq) experiments, which showed a 23 kb region of histone modification for H3K36me3 that directly overlies rs11712066. At the 5q35.2 locus there are wider regions of histone modification for H3K27me3 and H3K36me3, both covering rs29784. While histone modification can mark distant-acting enhancers 14, speculations on specific mechanisms are premature. A search of the GWAS catalog (see URLs) showed that rs251253 on 5q35.2 (OR = 1.34, P = 6.0 × 10^−7, r² to rs29784 = 0.49) is associated with the electrocardiographic PR interval 15. The functional basis for the association is, however, unknown.

In light of the pronounced excess risk of IHPS in males, we also carried out sex-specific analysis of the data (Supplementary Table 7). The three genome-wide significant SNPs showed no evidence of heterogeneity of effects between sexes. Although rs2228671 on chromosome 19p13.2 failed to replicate, the strong effect in boys (OR = 1.54, P = 7.6 × 10^−8) and absence of effect in girls (OR = 0.91, P = 0.53) warrant further study.

The three confirmed SNPs map to regions where MBNL1 and NKX2-5 are the strongest functional candidates for IHPS. Members of the muscleblind protein family are important regulators of alternative splicing 16, and almost all human multiexon genes undergo alternative splicing 17. In the early post-natal period, splicing transitions from fetal to adult protein isoforms are essential for the extensive remodeling of muscle tissue that occurs as a part of normal development 18. In studies of mouse heart and skeletal muscle development, Mbnl1 has been shown to control a set of temporally correlated splicing transitions that occur within the first 3 weeks of post-natal life 19, 20. Additionally, expression levels of Mbnl1 show distinct temporal changes in the early post-natal period correlated with the splicing transitions 18, 19. The intriguing observation that IHPS occurs almost exclusively between 2 and 8 weeks after birth points to a possible role for misregulation of MBNL1-controlled splicing transitions in the etiology of IHPS. The importance of MBNL1 for normal development of muscle tissue is highly evident in myotonic dystrophy, where loss of function of MBNL1 (caused by mutant DMPK mRNA), and consequent aberrant alternative splicing of many different pre-mRNAs, has a pivotal role in the pathogenesis of the disease 20.

NKX2-5 at chromosome 5q35.2 encodes the homeobox transcription factor NKX2-5, which is essential for normal heart formation and development 21. In humans, a range of NKX2-5 mutations have been identified that cause different congenital heart defects, including atrial septal defects with or without atrioventricular block, isolated ventricular septal defects, and tetralogy of Fallot 22. Although NKX2-5 is not expressed in adult extracardiac tissues 23, studies of embryonic gut development have shown that NKX2-5 is crucial for the formation of pyloric sphincter muscle tissue 24–26. In both chicken and mouse, Nkx2-5 expression occurs in a sharply defined ring of mesenchyme at the junction between the foregut and midgut on specific days of embryonic development 23, 24, 26. Furthermore, repression of Nkx2-5 activity in the pyloric sphincter region results in loss of the pyloric sphincter endodermal phenotype; conversely, formation of pyloric sphincter-like epithelium in other parts of the gizzard (the equivalent of the stomach in the chicken) can be induced by ectopic expression of Nkx2-5 via a retroviral vector 25.

Further experimental studies are needed to uncover the contributions of, e.g., distant-acting enhancers, alternative splicing, or other potential mechanisms to the molecular etiology of IHPS. Although biopsies of pyloric sphincter muscle tissue would be difficult to obtain, comparison of NKX2-5 expression levels in such tissue from IHPS patients with that from age-matched controls would be interesting, as might be analysis of alternative splicing in pyloric sphincter muscle tissue using mRNA sequencing.

In conclusion, we identified three independent, robustly associated loci for IHPS. The findings point to two candidate genes, involved in regulation of alternative splicing, cardiac muscle development, and embryonic gut development. Further functional investigation of the associations identified here will illuminate the biological mechanisms behind this enigmatic condition and may lead to new approaches for screening, prevention or treatment.



Eligible IHPS cases were defined as children who 1) had a pyloromyotomy in their first year of life, 2) were singletons, 3) did not have any major malformations, and 4) were of Danish ancestry. In addition, we excluded children from pregnancies with severe complications (see Supplementary Note for details about selection criteria). In the discovery stage, samples from 1,001 cases were selected and successfully genotyped. The control group consisted of 2,401 non-affected Danish children. Apart from IHPS affection status, the selection criteria were the same as for the cases. For the replication stage, we used 796 cases and 876 controls drawn from the same population using the same case and control definitions as in the discovery stage. The sex distribution was similar in the discovery and replication samples, but replication cases were born an average of 6 years earlier than discovery cases (Supplementary Table 2). The study was approved by the Scientific Ethics Committee for the Capital City Region (Copenhagen) and the Danish Data Protection Agency.


All samples were drawn from the Danish Newborn Screening Biobank and the Danish National Birth Cohort biobank, both of which are part of the Danish National Biobank. Sampling and genotyping (using the Illumina Human 660W-Quadv1_A chip) were undertaken in two rounds (see Supplementary Note for details). In total, genotypes for 559,390 SNPs were released in both genotyping rounds. For the association analysis, we used 523,420 SNPs; the other SNPs were excluded based on a missing rate >5%, deviation from Hardy-Weinberg equilibrium in controls (P < 10^−3), minor allele frequency <1%, or discrepancies (P < 10^−7) in allele frequencies between the two genotyping rounds. Genotyping of replication samples was performed on two correlated SNPs at each of the six most significantly associated loci from the discovery stage at deCODE Genetics using the Centaurus platform (Nanogen). We regenotyped 147 discovery stage samples and observed 100% concordance in a total of 1604 genotypes.
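The per-SNP exclusion criteria above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names and the genotype-count inputs are hypothetical, a one-degree-of-freedom chi-square approximation stands in for whatever Hardy-Weinberg test was actually applied, and the fourth filter (allele frequency discrepancy between genotyping rounds) is omitted. Each SNP is assumed to have at least one called genotype.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    freq_b = (2 * n_bb + n_ab) / (2 * n)
    return min(freq_b, 1.0 - freq_b)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P value for deviation from Hardy-Weinberg equilibrium (chi-square, 1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no deviation can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(all_counts, control_counts, n_missing,
              max_missing=0.05, min_hwe_p=1e-3, min_maf=0.01):
    """Apply the missing-rate, HWE (controls only) and MAF thresholds from the text."""
    n_called = sum(all_counts)
    if n_missing / (n_called + n_missing) > max_missing:
        return False
    if hwe_chi2_pvalue(*control_counts) < min_hwe_p:
        return False
    if minor_allele_frequency(*all_counts) < min_maf:
        return False
    return True
```

A call such as passes_qc((100, 200, 100), (100, 200, 100), 0) returns True, whereas a SNP with no heterozygous controls, a rare minor allele, or many missing calls is rejected.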


For the SNPs passing quality control, we used PLINK 28 to test for differences in allele frequencies between cases and controls for the discovery samples overall and stratifying by sex. P values were corrected by genomic control 29 using estimated genomic inflation factors of 1.06 in the combined discovery data, 1.04 in the analysis restricted to boys and 1.02 in the analysis for girls. We carried out combined analysis of the discovery and replication data using the inverse variance method as implemented in METAL 30 and tested for heterogeneity of the discovery and replication results using the I² statistic 31. Using the combined discovery and replication data, we tested for interaction effects between the top six loci by including risk allele count at each locus in a logistic regression model together with pair-wise interaction terms. We also used the combined data to estimate the proportion of variance in the liability to IHPS explained by each of the top SNPs 32.
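The genomic control correction, the inverse variance combination and the I² heterogeneity statistic described above can be sketched as follows. This is a hand-written illustration of the standard formulas, not the PLINK or METAL code used in the study; the function names are hypothetical, and per-study effect sizes are assumed to be log odds ratios with their standard errors.

```python
import math
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 df (about 0.455)
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(chi2_stats):
    """Genomic inflation factor: observed median chi-square over the expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def gc_corrected_pvalue(chi2, lam):
    """P value (1 df) after dividing the test statistic by the inflation factor."""
    return math.erfc(math.sqrt((chi2 / lam) / 2.0))


def inverse_variance_meta(betas, ses):
    """Fixed-effects pooled estimate, its standard error and two-sided P value."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p


def i_squared(betas, ses):
    """I² (in percent): share of variation between studies beyond chance."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))  # Cochran's Q
    df = len(betas) - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0
```

With two stages (discovery and replication), Cochran's Q has one degree of freedom, so I² is zero unless the two effect estimates differ by more than their sampling error would suggest.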


We imputed unobserved genotypes in the three confirmed regions using phased haplotypes from the Interim Phase I release of the 1000 Genomes Project 12. Imputation was done in a two-step procedure. In a first pre-phasing step, we used MaCH 33 to estimate haplotypes for the IHPS study samples. In a second step, we imputed missing alleles for additional SNPs directly onto these phased haplotypes using Minimac 33. All imputed SNPs with imputation quality r² > 0.30 were tested for association with IHPS in a logistic regression of disease status on imputed allele dosage (to account for imputation uncertainty) using mach2dat 33. In addition, we carried out analyses conditioning on the genotype of the confirmed genotyped SNP in each region. (See Supplementary Note for additional information.)
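To make the dosage-based test concrete, the sketch below computes an expected allele dosage from genotype posterior probabilities and fits a single-predictor logistic regression by Newton's method. It is an illustration only and does not reproduce mach2dat; the function names are hypothetical, and the fit assumes the data are not perfectly separated and that the dosages are not all identical.

```python
import math


def expected_dosage(p_ab, p_bb):
    """Expected number of B alleles given posterior probabilities of genotypes AB and BB."""
    return p_ab + 2.0 * p_bb


def fit_logistic(dosages, status, n_iter=25):
    """Fit logit P(case) = b0 + b1 * dosage by Newton's method; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(dosages, status):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood
            g0 += y - p
            g1 += (y - p) * x
            # Entries of the (negated) Hessian, X'WX
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system (X'WX) d = g
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Toy data (hypothetical): dosage 0 has 1 case in 4, dosage 1 has 3 cases in 4
b0, b1 = fit_logistic([0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 0, 1, 0, 1, 1, 1])
print(f"per-allele OR = {math.exp(b1):.2f}")
```

The exponentiated slope is the per-allele odds ratio. Conditioning on a genotyped SNP, as described above, would add that SNP's genotype as a second predictor, which this two-parameter sketch does not cover.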



This study was supported in part by grants from the Lundbeck Foundation (R34-A3931), the Novo Nordisk Foundation and the Danish Medical Research Council (271-06-0628). The GWAS data for the control samples were generated for our study of preterm birth within the GENEVA consortium with funding provided through the NIH Genes, Environment and Health Initiative (GEI: U01HG004423). Assistance with genotype cleaning and general study coordination for the preterm birth project was provided by the GENEVA Coordinating Center (U01HG004446). Genotyping was performed at Johns Hopkins University Center for Inherited Disease Research, with support from the NIH GEI (U01HG004438).


AUTHOR CONTRIBUTIONS B.F., F.G. and M.M. wrote the first draft of the paper. B.F. and F.G. analyzed the data. M.V.H. and D.M.H. performed the experiments. C.K., S.G., J.C.M. and H.A.B. contributed by collecting phenotype data, providing genotype data, and/or giving advice on interpretation of results. B.F., F.G. and M.M. planned and supervised the work. All authors contributed to writing the final manuscript.

COMPETING FINANCIAL INTERESTS Statens Serum Institut has filed a priority patent application at the Danish Patent and Trademark Office on the use of genetic profiling to identify newborns at risk of IHPS, which contains subject matter drawn from the work also published here.

URLs 1000 Genomes Project; Chicago eQTL Browser; GWAS Catalog; HapMap; LocusZoom; MACH; PLINK; UCSC Genome Browser.


1. Ranells JD, Carver JD, Kirby RS. Infantile hypertrophic pyloric stenosis: epidemiology, genetics, and clinical update. Adv. Pediatr. 2011;58:195–206.
2. Mitchell LE, Risch N. The genetics of infantile hypertrophic pyloric stenosis. A reanalysis. Am. J. Dis. Child. 1993;147:1203–1211.
3. Krogh C, et al. Familial aggregation and heritability of pyloric stenosis. JAMA. 2010;303:2393–2399.
4. Chung E. Infantile hypertrophic pyloric stenosis: genes and environment. Arch. Dis. Child. 2008;93:1003–1004.
5. Schechter R, Torfs CP, Bateson TF. The epidemiology of infantile hypertrophic pyloric stenosis. Paediatr. Perinat. Epidemiol. 1997;11:407–427.
6. MacMahon B. The continuing enigma of pyloric stenosis of infancy: a review. Epidemiology. 2006;17:195–201.
7. Honein MA, et al. Infantile hypertrophic pyloric stenosis after pertussis prophylaxis with erythromycin: a case review and cohort study. Lancet. 1999;354:2101–2105.
8. Pisacane A, et al. Breast feeding and hypertrophic pyloric stenosis: population based case-control study. BMJ. 1996;312:745–746.
9. Chakraborty R. The inheritance of pyloric stenosis explained by a multifactorial threshold model with sex dimorphism for liability. Genet. Epidemiol. 1986;3:1–15.
10. Panteli C. New insights into the pathogenesis of infantile pyloric stenosis. Pediatr. Surg. Int. 2009;25:1043–1052.
11. Falconer DS. The inheritance of liability to certain diseases, estimated from the incidence among relatives. Ann. Hum. Genet. 1965;29:51–76.
12. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–1073.
13. Myers RM, et al. A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 2011;9:e1001046.
14. Visel A, Rubin EM, Pennacchio LA. Genomic views of distant-acting enhancers. Nature. 2009;461:199–205.
15. Pfeufer A, et al. Genome-wide association study of PR interval. Nat. Genet. 2010;42:153–159.
16. Ho TH, et al. Muscleblind proteins regulate alternative splicing. EMBO J. 2004;23:3103–3112.
17. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 2008;40:1413–1415.
18. Bland CS, et al. Global regulation of alternative splicing during myogenic differentiation. Nucleic Acids Res. 2010;38:7651–7664.
19. Kalsotra A, et al. A postnatal switch of CELF and MBNL proteins reprograms alternative splicing in the developing heart. Proc. Natl. Acad. Sci. U. S. A. 2008;105:20333–20338.
20. Lin X, et al. Failure of MBNL1-dependent post-natal splicing transitions in myotonic dystrophy. Hum. Mol. Genet. 2006;15:2087–2097.
21. Fu Y, Yan W, Mohun TJ, Evans SM. Vertebrate tinman homologues XNkx2-3 and XNkx2-5 are required for heart formation in a functionally redundant manner. Development. 1998;125:4439–4449.
22. Reamon-Buettner SM, Borlak J. NKX2-5: an update on this hypermutable homeodomain protein and its role in human congenital heart disease (CHD). Hum. Mutat. 2010;31:1185–1194.
23. Kasahara H, Bartunkova S, Schinke M, Tanaka M, Izumo S. Cardiac and extracardiac expression of Csx/Nkx2.5 homeodomain protein. Circ. Res. 1998;82:936–946.
24. Smith DM, Tabin CJ. BMP signalling specifies the pyloric sphincter. Nature. 1999;402:748–749.
25. Smith DM, Nielsen C, Tabin CJ, Roberts DJ. Roles of BMP signaling and Nkx2.5 in patterning at the chick midgut-foregut boundary. Development. 2000;127:3671–3681.
26. Self M, Geng X, Oliver G. Six2 activity is required for the formation of the mammalian pyloric sphincter. Dev. Biol. 2009;334:409–417.
27. Pruim RJ, et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics. 2010;26:2336–2337.
28. Purcell S, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007;81:559–575.
29. Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999;55:997–1004.
30. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191.
31. Higgins JP, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat. Med. 2002;21:1539–1558.
32. So HC, Gui AH, Cherny SS, Sham PC. Evaluating the heritability explained by known susceptibility variants: a survey of ten complex diseases. Genet. Epidemiol. 2011;35:310–317.
33. Li Y, Willer CJ, Ding J, Scheet P, Abecasis GR. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet. Epidemiol. 2010;34:816–834.