Isolated clubfoot is a relatively common birth defect that affects approximately 4,000 newborns in the US each year. Calf muscles in the affected leg(s) are underdeveloped and remain small even after corrective treatment. This observation suggests that variants in genes that influence muscle development are priority candidate risk factors for clubfoot. This contention is further supported by the discovery that mutations in genes that encode components of the muscle contractile complex (MYH3, TPM2, TNNT3, TNNI2, and MYH8) cause congenital contractures, including clubfoot, in distal arthrogryposis (DA) syndromes. Interrogation of fifteen genes encoding proteins that control myofiber contractility in a cohort of both nonHispanic white (NHW) and Hispanic families identified positive associations (p<0.05) with SNPs in twelve genes; only one was identified in a family-based validation dataset. Six SNPs in TNNC2 deviated from Hardy-Weinberg Equilibrium (HWE) in mothers in our NHW discovery dataset. Relative risk and likelihood ratio tests showed evidence for a maternal genotypic effect with TNNC2/rs383112 and an inherited/child genotypic effect with two SNPs, TNNC2/rs4629 and rs383112. Associations with multiple SNPs in TPM1 were identified in the NHW discovery (rs4075583, p=0.01), family-based validation (rs1972041, p=0.000074) and case-control validation (rs12148828, p=0.04) datasets. Gene interactions were identified between multiple muscle contraction genes, with many of the interactions involving at least one potential regulatory SNP. Collectively, our results suggest that variation in genes that encode contractile proteins of skeletal myofibers may play a role in the etiology of clubfoot.
Isolated clubfoot is a relatively common orthopedic birth defect characterized by forefoot adductus, hindfoot varus and ankle equinus [Bohm, 1929]. Serial casting is initiated shortly after birth and surgical intervention is still necessary in some cases that relapse [Hulme, 2005]. The calf muscles in the affected leg(s) are underdeveloped at birth and remain small even after corrective treatment [Irani and Sherman, 1972; Isaacs et al., 1977]. In 50 percent of cases, both feet are affected; in unilateral cases, the right side is more commonly affected [Lochmiller et al., 1998]. Males are affected twice as often as females [Lochmiller et al., 1998]. More than 4,000 newborns in the US and 135,000 worldwide are born with a clubfoot each year [Ponseti, 2003]. While the average birth prevalence of clubfoot worldwide is 1/1,000, prevalence varies greatly across ethnicities with the highest rate in Polynesians (1/150) and the lowest in African Americans (1/2,500) [Beals, 1978; Chung et al., 1969; Lochmiller et al., 1998; Moorthi et al., 2005].
The etiology of clubfoot is multifactorial, involving both environmental and genetic factors. The genetic effects of individual variants are likely to be small to moderate in size and to vary among families/populations. Additionally, we hypothesize that these variations occur in multiple genes within one or more pathways in a given individual and that there are multiple susceptibility variants within a single gene in the population. The higher concordance in monozygotic twins (32%) compared to dizygotic twins (2.9%) and recurrence in 10–20% of families support a role for genes in clubfoot [Barker et al., 2003; Engell et al., 2006; Idelberger, 1939; Kruse et al., 2008; Wang et al., 1988]. To date, the vast majority of the genetic liability is unknown [de Andrade et al., 1998; Morton and MacLean, 1974; Wang et al., 1988; Yang et al., 1987].
One approach for identifying candidate genes/pathways that influence risk for complex traits such as birth defects is to capitalize on what is known about the molecular causes of rare multiple malformation syndromes with an overlapping phenotype. For example, Van der Woude syndrome (VWS) (OMIM: #119300), an autosomal dominant syndrome with cleft lip or cleft palate and/or lip pits, is caused by mutations in interferon regulatory factor 6 (IRF6) [Kondo et al., 2002]. An association between variation in IRF6 and nonsyndromic cleft lip and palate has been found in numerous populations [Blanton et al., 2005; Jugessur et al., 2008; Kondo et al., 2002; Rahimov et al., 2008; Zucchero et al., 2004]. Approximately 13–20% of the genetic variation in nonsyndromic cleft lip and palate may be attributable to genetic variation in IRF6 [Zucchero et al., 2004].
This approach can be applied to clubfoot. For example, the Distal Arthrogryposis (DA) syndromes are a group of rare autosomal dominant disorders characterized by multiple congenital joint contractures, including clubfoot, and muscle hypoplasia. The feet are generally more severely affected than the upper extremities. Nine different types of DA have been delineated and clubfoot is a common characteristic of several of these, including DA1, DA2A, and DA2B [Bamshad et al., 1996]. To date, mutations that cause DA have been reported in MYH3, TNNT3, TNNI2 and TPM2 [Bamshad et al., 1996; Stevenson et al., 2006; Sung et al., 2003a; Sung et al., 2003b; Toydemir and Bamshad, 2009; Veugelers et al., 2004]. Additionally, mutations in MYH8 cause DA7 or trismus-pseudocamptodactyly, which is characterized by contractures of the feet and occasionally clubfoot [Carlos et al., 2005; Gasparini et al., 2008; Pelo et al., 2003; Vaghadia and Blackstock, 1988]. These five genes encode components of the contractile apparatus of skeletal myofibers. The calf muscles of individuals with DA and clubfoot have inconsistently been reported to show a variety of abnormalities including disorganization of muscle fibers, increased number of Type I fibers (slow-twitch) and a decrease in Type II fibers (fast-twitch) [Fukuhara et al., 1994; Handelsman and Isaacs, 1975; Isaacs et al., 1977]. Collectively, these observations suggest that genes encoding sarcomeric proteins that influence myofiber contractility are plausible candidates for clubfoot. Therefore, we undertook this study to test whether variants in fifteen of the genes that encode muscle contractile proteins influence the risk of clubfoot.
This study was approved by the Committee for the Protection of Human Subjects at the University of Texas Health Sciences Center at Houston (HSC-MS-5R01HD043342).
Multiple datasets were used in the analyses: a family-based discovery dataset, a family-based validation dataset and a case-control validation dataset. The discovery dataset comprised 224 multiplex families, including 137 nonHispanic white (NHW) and 87 Hispanic families, and 357 simplex families, including 139 NHW and 218 Hispanic families. Families were recruited as previously described from clubfoot clinics in Shriners Hospitals for Children in Houston, Los Angeles and Shreveport, Texas Scottish Rite Hospital for Children of Dallas and University of British Columbia [Ester et al., 2007; Ester et al., 2009; Heck et al., 2005]. The family-based validation dataset consisted of 142 NHW simplex families ascertained and characterized in the Orthopedic Clinic at the Department of Orthopedics at Washington University in St. Louis. In all centers, probands and family members underwent clinical and radiographic examinations to exclude syndromic causes of clubfoot. Ethnicity was based on self-report. Hispanic participants were of Mexican descent. Blood and/or saliva samples were collected from affected individuals and family members after obtaining informed consent. DNA was extracted using either the Roche DNA Isolation Kit for Mammalian Blood (Roche, Switzerland) or Oragene Purifier for saliva (DNA Genotek, Inc., Ottawa, Ontario, Canada) following the manufacturer’s protocol.
The case-control validation dataset was composed of de-identified isolated clubfoot cases and matched control newborn bloodspots ascertained from the Texas Birth Registry. The controls were matched to the cases by sex, maternal ethnicity, county of maternal residence and date of birth within ±8 weeks of the case’s date of birth. These variables were chosen for the following reasons: sex is a known risk factor, maternal ethnicity affects allele frequencies, and environmental exposures may vary geographically and temporally. The majority of the matched controls (78.5%) were born within one month of their matched cases. This validation dataset included 616 NHW (308 cases and 308 controls) and 752 Hispanic (376 cases and 376 controls) DNA samples. DNA was extracted from the dried blood spots using the Qiagen DNeasy blood and tissue kit (Qiagen, Valencia, CA) and amplified using the Qiagen REPLI-g kit (Qiagen, Valencia, CA) following the manufacturer’s protocol.
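The matching rule described above can be written as a simple eligibility check. The following is a minimal sketch only: the record layout and field names (`sex`, `maternal_ethnicity`, `county`, `dob`) are hypothetical and are not taken from the registry, and the procedure used to pick one control among several eligible candidates is not described here and is not modeled.

```python
from datetime import date, timedelta


def is_eligible_control(case, candidate, window_weeks=8):
    """Return True if `candidate` meets the matching criteria for `case`.

    Both arguments are dicts with hypothetical keys: 'sex',
    'maternal_ethnicity', 'county' and 'dob' (a datetime.date).
    """
    # Exact match on sex, maternal ethnicity and county of maternal residence.
    same_strata = (
        case["sex"] == candidate["sex"]
        and case["maternal_ethnicity"] == candidate["maternal_ethnicity"]
        and case["county"] == candidate["county"]
    )
    # Date of birth within +/- window_weeks of the case's date of birth.
    within_window = abs(case["dob"] - candidate["dob"]) <= timedelta(weeks=window_weeks)
    return same_strata and within_window
```

A candidate born 45 days after the case in the same county, with the same sex and maternal ethnicity, would be eligible under this rule; one born 92 days later would not.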
Fifteen genes were selected for evaluation based upon their expression and role in the muscle contractile apparatus. The NCBI and HapMap databases were used to identify SNPs that flank and span ACTA1, MYBPC2, MYBPH, MYH1, MYH2, MYH3, MYH4, MYH8, MYH13, MYL1, TNNC2, TNNI2, TNNT3, TPM1 and TPM2 (Table I). Seventy-four SNPs were selected based on heterozygosity in the nonHispanic white population (>0.3) (HapMap CEU dataset, www.ncbi.nlm.nih.gov/SNP/snp_viewTable.cgi?pop=1409), position in or around the gene and extent of linkage disequilibrium (LD) (Table I). Genotyping was performed using either TaqMan® Genotyping Assays (Applied Biosystems, Foster City, CA), detected on a 7900HT Sequence Detection System (Applied Biosystems, Foster City, CA), or the SNPlex™ Genotyping System (Applied Biosystems, Foster City, CA), analyzed on a 3730 DNA Analyzer using Genemapper® 4.0 (Applied Biosystems, Foster City, CA), following the manufacturer’s protocol. One SNP, rs373018, had poor clustering and was removed from further analysis. A subset of twenty-three SNPs in seven genes was genotyped in the validation datasets.
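As an illustration of the heterozygosity criterion, the sketch below computes the expected heterozygosity of a biallelic SNP from one allele frequency and applies the >0.3 threshold. It assumes that the threshold refers to expected heterozygosity under Hardy-Weinberg proportions (2pq); the text does not state whether expected or observed heterozygosity was used, and the function names are illustrative.

```python
def expected_heterozygosity(allele_freq):
    """Expected heterozygosity 2pq for a biallelic SNP with allele frequency p."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return 2.0 * allele_freq * (1.0 - allele_freq)


def passes_heterozygosity_filter(allele_freq, threshold=0.3):
    """True if the SNP's expected heterozygosity exceeds the threshold."""
    return expected_heterozygosity(allele_freq) > threshold
```

Under this assumption, a SNP with a minor allele frequency of 0.2 (2pq = 0.32) would be retained, whereas one with a frequency of 0.1 (2pq = 0.18) would not.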
Tests for Hardy-Weinberg Equilibrium (HWE) were performed using SAS (v9.1). SNPs for which the genotype distributions were significantly different from HWE (p<0.001) were excluded from the analyses. This p-value was chosen to identify only those SNPs which showed marked deviation from HWE. Chi-square analysis was performed using SAS to evaluate ethnic differences in allele frequencies. Pairwise linkage disequilibrium values (D′ and r²) were calculated using GOLD [Abecasis and Cookson, 2000].
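For readers unfamiliar with the HWE test, the following minimal sketch shows a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts for a biallelic SNP. It is an assumption that a chi-square test (rather than an exact test) was used; the SAS procedure is not specified here, and the function name and arguments are illustrative.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic SNP.

    Arguments are observed counts of the AA, AB and BB genotypes.
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p in (0.0, 1.0):
        raise ValueError("SNP is monomorphic; the HWE test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A SNP would be excluded under the criterion above when the returned p-value is below 0.001.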
For statistical analyses, the data were stratified by ethnicity alone or by family history of clubfoot and ethnicity. Linkage and/or association were evaluated using multiple analytic methods to extract the greatest amount of information from the data. Parametric and nonparametric linkage analyses were performed using Merlin [Abecasis et al., 2002]. Linkage parameters were used as described previously [Ester et al., 2009]. Association was tested using the Pedigree Disequilibrium Test (PDT), the genotype-Pedigree Disequilibrium Test (GENO-PDT) and Association in the Presence of Linkage (APL) [Chung et al., 2006; Martin et al., 2003; Martin et al., 2000]. Two-SNP intragenic haplotypes were evaluated using APL. Generalized estimating equations (GEE) as implemented in SAS were used to evaluate gene interactions at a statistical level [Hancock et al., 2007]. Gene-environment interactions were assessed using FBATI [Hoffmann et al., 2009]. Genes with a SNP association p<0.05 in the single SNP analyses or p<0.01 in the 2-SNP haplotype analyses were evaluated with APL in the family-based validation dataset and with Chi-square in the case-control validation dataset.
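The Chi-square comparison applied to the case-control dataset can be illustrated with a 2×2 table of allele counts. This is a sketch under stated assumptions: the text does not say whether allele or genotype counts were compared, so an allelic test with one degree of freedom and no continuity correction is assumed, and the function name is illustrative.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test on a 2x2 table of allele counts.

    Rows are cases and controls; columns are alternate and reference alleles.
    Assumes every row and column total is non-zero.
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 df.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```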
Log-linear regression models were used to evaluate the independent effects of maternal and inherited (child) genotypes for the TNNC2 SNPs that were out of HWE in the NHW families [van Den Oord and Vermunt, 2000; Weinberg et al., 1998; Wilcox et al., 1998]. Specifically, one triad, consisting of the affected proband and both parents, was selected per family. For each SNP, the likelihood ratio test was used to compare the full model, which included parameters for both maternal and inherited genotypes, with reduced models, which included parameters for only the maternal or the inherited genotype. In addition, genotype relative risks and their associated 95% confidence intervals were estimated. All log-linear models assumed a log-additive model of inheritance.
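The comparison of nested models can be sketched as follows. The example assumes that the maximized log-likelihoods of the full and reduced models are already available from the fitting software, and that the two models differ by a single parameter, as would be the case under a log-additive model when either the maternal or the inherited genotype term is dropped. The function name and inputs are illustrative.

```python
import math


def likelihood_ratio_test(loglik_full, loglik_reduced):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (test statistic, p-value on 1 degree of freedom).
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    # Guard against tiny negative values caused by numerical noise.
    statistic = max(statistic, 0.0)
    # Upper tail of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

A small p-value indicates that the term omitted from the reduced model (the maternal or the inherited genotype effect) improves the fit.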
In silico analyses were performed on associated SNPs located in potential regulatory regions. Three online binding site prediction programs (Alibaba2, Patch and TESS) were used to assess if the presence of the ancestral or alternate allele could alter the DNA binding site (www.ncbi.nlm.nih.gov/) [Grabe, 2002; Matys et al., 2006; Schug, 2008].
All of the TNNC2 SNPs deviated from HWE in the NHW discovery dataset and were removed from the association analyses; all remaining SNPs in the NHW dataset were in HWE. All TNNC2 SNPs were in HWE in the Hispanic dataset and were therefore included in the association analyses. Only rs2074877 in MYH13 was out of HWE in the Hispanic discovery dataset and was removed from analyses. Allele frequencies differed significantly between the NHW and Hispanic groups for SNPs in fourteen of the fifteen examined genes (Table I). Therefore, the data were stratified by ethnicity. Parametric and nonparametric linkage analyses found no evidence for linkage (data not shown).
Overall, nominal evidence for association was found for SNPs in twelve of fifteen genes in the discovery datasets (p<0.05) (Table II). For the NHW dataset, evidence for association was seen for SNPs in six genes: MYBPH, TPM2, TNNT3, TPM1, MYH13 and MYH3. Three SNPs in MYH3 had altered transmission primarily in the NHW multiplex subset. All other associations involved a single SNP in each of the five other genes. In the Hispanic dataset, there was evidence for altered transmission in eleven genes (Table IIB). Five of these genes, MYBPH, TPM2, TNNT3, TPM1 and MYH13, also had SNPs with altered transmission in the NHW dataset; only one SNP was common to both datasets (MYH13/rs17690195). In addition, several genes had multiple SNPs with altered transmission (MYL1 (3), TNNT3 (3), MYH8 (4), MYH4 (3), MYH1 (2) and MYH2 (2)).
When 2-SNP haplotypes were considered, altered transmission was found for five genes in the NHW group (p<0.01) (Table IIIA). Two of these genes, ACTA1 and MYH8, did not have individually altered transmitted SNPs. Three different MYH13 haplotypes had altered transmission; none of the haplotypes included the individual SNPs with altered transmission (Table IIIA). The two TPM2 haplotypes both contained rs1998303, which had altered transmission in the single SNP analyses. In the Hispanic discovery dataset, three MYH13 haplotypes had altered transmission (Table IIIB); only one contained rs17690195, which had altered transmission in the single SNP analysis (Table IIB). There was no overlap between the NHW MYH13 haplotypes and the Hispanic MYH13 haplotypes, and only one SNP (MYH13/rs2240579) was common to both ethnicities.
Numerous potential gene interactions were identified in both the NHW and Hispanic discovery datasets (p<0.01) (Table IV). The only gene interaction present in both datasets was TPM1 and MYH13, although the same SNPs were not involved in the two datasets. SNPs in ACTA1, MYH1, MYH13, MYH2, MYH4, MYH3, MYH8, MYL1, TNNT3, TPM1 and TPM2 were involved in interactions in both ethnic groups.
Three genes (TNNI2, MYBPC2 and TNNC2) did not have any SNPs meeting our criteria for follow-up in the validation datasets. In the family-based validation dataset, only two SNPs in the single SNP analyses demonstrated any evidence for altered transmission, TNNT3/rs2734495 (p=0.04) and TPM1/rs1972041 (p=0.000074) (data not shown). The TPM1 result is supported by the 2-SNP analyses in the validation dataset, where only TPM1 haplotypes had altered transmission (Table V). All four of the significant haplotypes contained rs1972041. In the case-control dataset, only nominal evidence for association was seen with rs12148828 in TPM1 (p=0.04) in the Hispanic subset; there were no associations in the NHW subset (data not shown).
Further examination of the NHW maternal, paternal and proband TNNC2 genotype frequencies revealed that only the maternal genotypes deviated from HWE, suggesting the presence of a maternal genetic effect. Table VI summarizes the results of log-linear models assessing maternal and inherited genotypic effects. For rs383112, significant associations were observed with both the maternal and inherited genotypes (p=0.02 and 0.03, respectively). The maternal genotype for rs383112 was associated with a 1.38-fold increased risk (CT versus CC; 95% CI: 1.13–1.72) of clubfoot in offspring, while a protective inherited genotypic effect was conferred with a relative risk of 0.77 (CT versus CC; 95% CI: 0.50–0.99). In addition, a significant protective inherited genotypic effect (p=0.02), with a relative risk of 0.74 (TG versus TT; 95% CI: 0.48–0.97), was found for rs4629.
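Relative risks and confidence intervals of this kind are typically obtained by exponentiating a coefficient estimated on the log scale. The sketch below assumes a Wald-type interval built from a log-scale estimate and its standard error; the text does not state how the intervals were computed, and the input values shown in any usage are hypothetical rather than taken from Table VI.

```python
import math


def relative_risk_ci(beta, se, z=1.96):
    """Relative risk and Wald confidence interval from a log-scale estimate.

    beta: estimated log relative risk for one genotype contrast
    se:   standard error of beta
    z:    normal quantile (1.96 gives an approximate 95% interval)
    Returns (relative risk, lower limit, upper limit).
    """
    relative_risk = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return relative_risk, lower, upper
```

An interval that excludes 1 corresponds to a nominally significant effect at the chosen level; values above 1 indicate increased risk and values below 1 a protective effect.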
We specifically targeted genes encoding components of the muscle contractile apparatus because of their role in muscle development and because mutations in several of these genes cause DA syndromes, which frequently include clubfoot as part of the phenotype. We report the first evidence for maternal and inherited genotypic effects involving two SNPs in TNNC2 (rs4629 and rs383112) in the NHW group (Table VI). A deleterious maternal effect was found for rs383112, while a protective inherited effect was found for rs4629 and rs383112. TNNC2 encodes troponin C, which plays a key role in initiating muscle contraction in fast-twitch muscle fibers by binding Ca2+. This causes a conformational change in troponin I, which releases inhibition of troponin T causing tropomyosin to allow actin-myosin interactions [Gordon et al., 2000; Schiaffino and Reggiani, 1996]. The alternate allele for rs4629, located in exon 5, is a synonymous change. Synonymous changes can alter the amino acid translation rate, resulting in changes in protein structure and function [Kimchi-Sarfaty et al., 2007; Komar, 2007]. TNNC2/rs383112 is located in a potential regulatory region approximately 1.5 kb upstream of the start site of TNNC2. The presence of the alternate allele is predicted to create a new DNA binding site (Table VII). Therefore, either variant could affect protein function and/or expression. Testing in other datasets is warranted because this finding was not confirmed in our simplex family-based validation dataset, which does not closely mimic the family-based discovery dataset, as the discovery dataset contains both simplex and multiplex families.
In the NHW group, evidence of association was found for SNPs located in TPM1 and TPM2, which encode members of the tropomyosin family; only TPM1 had altered transmission in the validation datasets [Gordon et al., 2000; Schiaffino and Reggiani, 1996]. TPM1 is expressed in fast-twitch muscle fibers, while TPM2 is mainly expressed in slow-twitch muscle fibers. Tropomyosin functions with the troponin complex to regulate muscle contraction by restricting myosin from binding to actin [Gordon et al., 2000; Schiaffino and Reggiani, 1996]. TPM2/rs1998308, an intronic SNP (p<0.003), had modest evidence for association in the discovery dataset but was not identified in the validation datasets. While no coding mutations were identified in twenty familial clubfoot patients in a separate study evaluating three skeletal muscle contractile genes (TNNT3, TPM2 and MYH3), regulatory regions of the TPM2 gene were not evaluated [Gurnett et al., 2009].
The association with TPM1 SNPs detected in the discovery dataset was validated in the family-based validation dataset, with suggestive evidence in the case-control validation dataset, albeit with different SNPs. rs4075583 is in a potential regulatory region and is predicted to alter a DNA binding site (Table VII), while rs1972041 and rs12148828 are either in an intron or downstream depending on the TPM1 isoform. Multiple TPM1 isoforms are produced through alternative splicing and expression is cell type specific [Perry, 2001]. Three TPM1 regulatory SNPs associated with Metabolic Syndrome were evaluated for their effect on the expression of the short TPM1 isoform [Savill et al., 2010]. The presence of the rs4075583 G allele (the risk allele in our NHW group) decreased gene expression in HEK293 cells. A haplotype incorporating the G allele of rs4075583 and the C allele of rs4075584 caused decreased expression in THP-1 cells [Savill et al., 2010]. Altered gene expression could affect muscle contraction and needs to be further assessed in a biologically relevant cell line, such as a muscle cell line. The association of a regulatory SNP in TPM1 in our clubfoot discovery dataset leads us to hypothesize that correct expression of tropomyosin is important for normal foot development and that alteration of the muscle contractile apparatus may be a risk factor for clubfoot [Fukuhara et al., 1994; Handelsman and Isaacs, 1975; Isaacs et al., 1977].
Muscle contraction is a well-orchestrated process involving multiple proteins [Gordon et al., 2000]. Numerous potential interactions were found among SNPs in both the NHW and Hispanic discovery datasets (Table IVA and B); these interactions could not be validated because of small sample size. Many of these interactions involve at least one SNP located in a potential regulatory region. The combination of risk variants in several genes that encode muscle contractile proteins may perturb both muscle development and function and consequently play a key role in determining susceptibility to clubfoot. Nevertheless, each of these associated variants still needs to be evaluated through functional assays to assess their effect on gene function and expression to begin to understand their potential role in clubfoot. Finally, this study suggests that focusing on genes that encode proteins for the contractile complex in fast- and slow-twitch myofibers may provide key insight into the genetic etiology of clubfoot.
This study was approved by the Committee for the Protection of Human Subjects of the University of Texas Health Science Center at Houston (HSC-MS-03-090). We thank all of the families that kindly participated in this study and made it possible. Thanks to Marie Elena Serna and Rosa Martinez for screening, enrolling and collecting patient samples and to Dr. S. Shahrukh Hashmi for database management. Shriners Hospital for Children and NICHD R01-HD043342-05 supported this work with grants to JTH.
Conflict of interest statement: None.