Our study was motivated by the unequal sex distribution observed in the two main types of clefts (CL/P and CPO) and by previous findings of a strong link between X-linked genes and orofacial clefts. X-linked genes have been identified primarily in syndromic forms of clefting and include midline 1 (MID1) on chr Xp22, T-box 22 (TBX22) on chr Xq21.1, PHD finger protein 8 (PHF8) on chr Xp11.22, and RNA binding motif protein 10 (RBM10) on chr Xp11.23. Mutations in MID1 cause the X-linked Opitz GBBB syndrome (OSX, MIM 300000), a congenital midline malformation syndrome characterized by clefting of the lip/palate and a variety of other pathologies. An association between specific haplotypes in MID1 and isolated CL/P was later reported in an Italian population. Mutations in TBX22 cause the rare X-linked syndrome ‘cleft palate with ankyloglossia’ (CPX; MIM 303400). TBX22 belongs to the T-box family of genes, which are evolutionarily highly conserved and recognized for playing key roles in early vertebrate development. Consistent with the CPX phenotype in humans, the expression of Tbx22 in mice is localized to the developing palatal shelves and the base of the tongue. Further, a genome-wide linkage analysis of families with iCL/P identified a susceptibility locus near TBX22, suggesting that the linkage signal may emanate from this gene. Mutations in TBX22 have also been identified in patients with isolated CPO. As to PHF8, mutations in this gene cause the X-linked mental retardation syndrome Siderius, which includes cleft palate as a common phenotypic feature. PHF8 is a histone lysine demethylase and transcription activator expected to have a wide range of functions. Finally, deep sequencing of exons on the X chromosome identified RBM10 as the gene causing TARP (MIM 311900), a syndromic form of cleft palate.
Given these strong links between X-linked genes and syndromic clefts, we examined whether variants in X-linked genes might also be relevant for isolated forms of clefting. To enable X-linked gene analysis, we first developed a method that can i) perform both single-marker and haplotype analyses, ii) generate relative risk estimates with confidence intervals, and iii) assess several etiological models relevant to an X-linked disease locus. The higher prevalence/penetrance of CL/P in males compared with females may be due to hemizygosity for an X-linked disease locus. Therefore, we first analyzed males and females together to account for the possibility that an X-linked disease locus might contribute to clefting risk in both sexes, followed by sex-stratified analyses to investigate whether the locus affects one sex in particular. X-chromosome inactivation in females was also taken into account in the models by treating the risk of a heterozygous female (X1X2) as the average of the risks of the two homozygotes (X1X1 and X2X2).
Overall, we found only weak associations with OFD1 in the Danish iCL/P sample, with no replication in the Norwegian iCL/P sample. As noted in our previous analyses of fetal gene-effects in the same study samples, the genotype call rates for the Norwegian sample (DNA extracted from blood) and the Danish sample (DNA extracted from buccal swabs) were 99.6% and 99.1%, respectively. Hence, the lack of replication of the OFD1 association in the Norwegian iCL/P sample cannot be ascribed to differences in DNA source. Moreover, different genotype frequencies in the two samples do not in themselves imply differences in gene effects on the phenotype.
In sex-stratified analyses, the association of OFD1 in the Danish iCL/P sample was confined to males only, suggesting a possible sex-specific effect as previously reported for several loci when only males were analyzed. Separate analyses for males and females can be potentially more powerful than a pooled analysis if the X-linked disease locus affects only one sex. An alternative explanation for the apparent sex-specific effect in our data is the potentially higher statistical power to detect an effect of OFD1 in males due to the larger number of male iCL/P cases available for analysis.
Mutations in OFD1 underlie the X-linked dominant oral-facial-digital syndrome type 1 (OFD1, MIM 311200), which is characterized by malformations of the face, oral cavity and digits, as well as lethality in the vast majority of affected males. Featuring prominently among the orofacial abnormalities are median cleft lip, clefts of alveolar ridge at the area of lateral incisors, and cleft palate. To our knowledge, however, no genetic association with this gene has previously been reported in isolated clefts.
For haplotype analysis of X-chromosome markers, the standard log-linear approach needs some modification. First, many diseases show markedly different birth prevalences in males versus females, as is the case for orofacial clefts, with a higher prevalence of CL/P in males and a higher prevalence of CPO in females. This difference may be due to causes other than the effect of the particular locus under study, such as loci differentially expressed between males and females. To avoid confounding of the genetic risk estimates by the sex effect, separate baseline risks should be assumed for males and females; i.e. in males the effect of allele A2 should be measured relative to the reference allele A1 in males, whereas in females the effects of A1A2 and A2A2 should be measured relative to A1A1 in females. Second, it is not clear a priori whether a single dose of A2 in males has an effect comparable to a single dose in females (A1A2) or to a double dose (A2A2), or is entirely different from the effect in females. This is aptly illustrated by craniofrontonasal syndrome (CFNS; MIM 304110), an X-linked developmental disorder that paradoxically affects heterozygous females more severely than hemizygous males. In addition, there is the usual question of the relationship between A1A2 and A2A2 in females; i.e. whether there is a dose-response relationship, a dominant relationship, etc. Third, the basic log-linear model in HAPLIN assumes the same allele frequencies for males and females in the background population. While this is a relatively robust assumption for autosomal markers, it is less obvious for X-linked markers. For populations that are genetically relatively homogeneous (like the Danes and Norwegians), however, this assumption seems reasonable.
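The role of sex-specific baselines can be illustrated with a small numerical sketch. The Python below is illustrative only and is not HAPLIN code: the function names, the pattern labels and the baseline values are hypothetical choices made for this example. It shows that, when each sex has its own baseline, the relative risks within a sex do not depend on how much the male and female baselines differ.

```python
def female_effects(rr, pattern):
    """Map one relative risk to (single-dose, double-dose) effects in females.

    The pattern labels are illustrative, not HAPLIN option names.
    """
    if pattern == "dose-response":
        return rr, rr ** 2      # each copy of A2 multiplies the risk by rr
    if pattern == "dominant":
        return rr, rr           # one copy of A2 already gives the full effect
    raise ValueError(f"unknown pattern: {pattern}")


def genotype_risks(b_m, b_f, rr_m, rr_f, pattern):
    """Absolute risks per genotype, with separate baselines b_m and b_f."""
    single, double = female_effects(rr_f, pattern)
    return {
        "male": {"A1": b_m, "A2": b_m * rr_m},
        "female": {"A1A1": b_f, "A1A2": b_f * single, "A2A2": b_f * double},
    }


# Hypothetical baselines: males twice as likely to be affected as females.
risks = genotype_risks(b_m=0.002, b_f=0.001, rr_m=1.5, rr_f=1.5,
                       pattern="dose-response")
for sex, table in risks.items():
    reference = table["A1"] if sex == "male" else table["A1A1"]
    for genotype, risk in table.items():
        # Relative risks are computed against the same-sex reference genotype.
        print(sex, genotype, round(risk / reference, 3))
```

Changing `b_m` or `b_f` rescales the absolute risks but leaves the printed relative risks unchanged, which is the reason for estimating the allele effect against a same-sex reference.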
The most extreme solution to the problems raised above is to run separate analyses on males and females. HAPLIN has a special option for doing this, which allows several different response patterns in females to be explored, whereas males are necessarily restricted to single-dose effects. To increase statistical power, HAPLIN also allows joint analyses of males and females, which reduce the number of parameters to be estimated. These joint analyses all assume the same allele/haplotype frequencies but different baseline risks for males and females. In addition, various response patterns can be specified.
Another important consideration is X-inactivation in females, which may produce a special relationship between male and female allele effects. In females, one X allele in each cell is inactivated (except for a very few second X chromosomes that escape inactivation). A deleterious X-linked allele would be expected to be more detrimental to males than females because males have no chance of compensation by a corresponding normal allele. Because X-inactivation in women occurs in early embryogenesis, women will tend to have a mixture of cells expressing either their mother’s or father’s X-linked genes (mosaicism). This heterogeneity can have different consequences for a female’s disease response depending on how the two X chromosomes are distributed among tissues. The normal expectation would be an equal distribution of the two cell types. However, there may be “founder” effects due to the relatively small number of cells in the embryo at the time of X-inactivation, or differential cellular reproduction rates, leading to an imbalance between the two cell types.
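The size of such a founder effect can be gauged with a simple binomial calculation. The sketch below is an illustration under stated assumptions, not part of the analysis: it assumes that each founder cell inactivates one of its two X chromosomes independently with probability 0.5, that descendant cells keep the same choice, and that cell lineages grow at equal rates. The cell numbers and the 80:20 threshold are hypothetical.

```python
from math import comb


def prob_skewed(n_cells, threshold_pct=80):
    """Probability that at least threshold_pct percent of n_cells founder
    cells keep the same X chromosome active, in either direction.

    Assumes independent random inactivation with probability 0.5 per cell.
    Integer arithmetic avoids floating-point issues at the cut-offs.
    """
    favourable = 0
    for k in range(n_cells + 1):
        high = 100 * k >= threshold_pct * n_cells
        low = 100 * k <= (100 - threshold_pct) * n_cells
        if high or low:
            favourable += comb(n_cells, k)
    return favourable / 2 ** n_cells


# Hypothetical founder-pool sizes; the true number is not given in the text.
for n in (5, 10, 20, 100):
    print(n, prob_skewed(n))
```

Under these assumptions, a skew of 80:20 or more is fairly common with a handful of founder cells and becomes very unlikely with a hundred, which is the sense in which a small founder pool can unbalance the two cell types.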
If the risk associated with allele A2 in females is $RR_F$, genotypes A1A1 and A2A2 will produce risks $B_F$ and $B_F \cdot RR_F$, respectively. Assuming a 50:50 cell type distribution, the risk associated with genotype A1A2 will be an average of the two homozygotes, i.e. $(B_F + B_F \cdot RR_F)/2$ (Model 4). Technically speaking this is not a log-linear model, so HAPLIN replaces the heterozygous risk with $B_F \sqrt{RR_F}$, the geometric mean of the two homozygous risks. This results in a log-linear model, and as long as $RR_F$ is neither very small nor very large, the approximation is reasonable. For males, the single-dose effect is then assumed equal to that of female homozygotes, i.e. $RR_M = RR_F$ (denoted simply as RR in Model 4). HAPLIN also provides an extension of this model to accommodate an unbalanced cell type distribution.
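How close the geometric mean is to the arithmetic mean can be checked numerically. The sketch below assumes homozygous female risks of $B_F$ and $B_F \cdot RR_F$, which is our reading of the model described above; the function names and the grid of relative risks are hypothetical, and the code is not taken from HAPLIN.

```python
from math import sqrt


def heterozygote_risks(b_f, rr_f):
    """Return (arithmetic, geometric) means of the two homozygous risks."""
    low, high = b_f, b_f * rr_f
    arithmetic = (low + high) / 2    # 50:50 mixture of the two cell types
    geometric = sqrt(low * high)     # log-linear substitute, b_f * sqrt(rr_f)
    return arithmetic, geometric


def relative_error(rr_f):
    """Relative shortfall of the geometric mean; independent of b_f."""
    arithmetic, geometric = heterozygote_risks(1.0, rr_f)
    return (arithmetic - geometric) / arithmetic


for rr in (0.1, 0.5, 1.0, 2.0, 4.0, 10.0):
    print(rr, round(relative_error(rr), 3))
```

The geometric mean never exceeds the arithmetic mean, and the shortfall is the same for $RR_F$ and $1/RR_F$. Under the stated assumption it is about 6% at $RR_F = 2$ and 20% at $RR_F = 4$, so the approximation is tight for the moderate effect sizes typical of complex traits and loosens for extreme ones.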
The basic likelihood models in X-LRT and HAPLIN are similar; for a single SNP, X-LRT uses zero-dose males as reference and estimates relative risks for single-dose males, and zero-, single-, and double-dose females independently. This corresponds to Model 5 in HAPLIN, and in this special case the results are identical, except that HAPLIN chooses reference levels differently. In addition, HAPLIN provides a number of other modeling options on the X-chromosome, and the software provides a full framework for autosomal and X-linked haplotype association analyses in a candidate-gene or GWAS setting.
To summarize, this is the first candidate-gene based study to investigate the role of X-linked genes in orofacial clefting. Although OFD1 is a highly plausible gene for clefts, the lack of replication in the Norwegian iCL/P sample highlights the need to confirm these preliminary findings in other datasets. The novel methods presented here address several scenarios relevant to an X-linked disease locus and can easily be adapted to explore the role of X-linked genes in other complex disorders.