Our aim was to discover and characterize the regulatory sequences responsible for the extreme allele-specific expression differences of KRT1 in human white blood cells. In all individuals expressing KRT1 and heterozygous for exonic SNP2, the same allele is always expressed at a significantly higher level than the other allele. These data suggest that the KRT1 allelic-expression differences result primarily from cis-regulatory polymorphisms in strong linkage disequilibrium with exonic SNP2. We determined that all nine KRT1 exons as well as ~22 kb of sequence upstream of the gene are contained within a single haplotype block. The high-expressing KRT1 exonic SNP2 allele maps to haplotype pattern 2, while the low-expressing SNP2 allele maps to haplotype pattern 1, suggesting that cis-regulatory variants differing between these two haplotypes are likely responsible for the majority of the allele-specific expression differences.
By examining SNPs whose alleles differ between the low- and high-expressing KRT1 haplotypes with a variety of experimental and computational methods, we identified five cis-regulatory polymorphisms. The SNP5 and SNP23 cis-regulatory intervals act as positive regulators of the KRT1 promoter in luciferase reporter assays, while the SNP11, SNP17, and SNP28 cis-regulatory intervals act as negative regulators. Consistent with these data, SNP11 and SNP17 are present in predicted binding sites for ZNF143 and ZEB, respectively, both of which are known to act as negative transcriptional regulators, and SNP23 is present in a predicted binding site for AML-1, a known positive transcriptional regulator. EMSA and chromatin immunoprecipitation assays suggest that ZEB and AML-1 bind the SNP17 and SNP23 intervals, respectively, in vivo.
Our study shows that the extreme allele-specific expression differences of KRT1 result from the haplotypic combinations of the five cis-regulatory polymorphisms that differ between the low- and high-expressing patterns. The high-expressing alleles of SNP5 and SNP23 bind more protein in the EMSA than the low-expressing alleles. In addition, the high-expressing SNP23 allele exhibited an almost 2-fold increase in KRT1 promoter activity compared with the low-expressing allele in the luciferase reporter assay. Thus, for both SNP5 and SNP23 the high-expressing alleles appear to have higher affinities for transcriptional activators than the low-expressing alleles. On the other hand, the low-expressing alleles of SNP11 and SNP17 bind more protein in the EMSA than the high-expressing alleles. In the luciferase reporter assays the low-expressing SNP17 and SNP28 constructs have approximately 2-fold less KRT1 promoter activity than the high-expressing allele constructs. Thus, our data indicate that for SNP11, SNP17, and SNP28 the low-expressing alleles have higher affinities for transcriptional repressors than the high-expressing alleles. Additionally, when combined, the SNP17 and SNP28 intervals result in a greater reduction of KRT1 promoter activity than that observed for either interval alone. Interactions of the individual SNP17 and SNP28 intervals with the KRT1 promoter(s) also suggest that there are functional polymorphisms in the promoter region that result in less activity for the low-expressing version. It is important to note that, in addition to these functional promoter variants, other cis-regulatory polymorphisms involved in the extreme differential KRT1 allelic expression may exist in the KRT1 haplotype block and/or adjacent haplotype blocks. However, the haplotypic combinations of the five cis-regulatory polymorphisms that we identified and characterized in this study can readily explain a large fraction of the observed allele-specific KRT1 expression differences.
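As a purely illustrative sketch of how several modest per-SNP effects could add up to a large haplotype-level difference, the Python snippet below multiplies per-interval fold effects of the low-expressing allele relative to the high-expressing allele. The multiplicative, independent-effects model is an assumption made here for illustration and is not the analysis performed in this study. The values of 0.5 for SNP17, SNP23, and SNP28 are rounded from the approximately 2-fold luciferase differences described above, while the values of 1.0 for SNP5 and SNP11 are placeholders because no fold change is reported for them. The names `ILLUSTRATIVE_EFFECTS` and `combined_low_vs_high` are hypothetical.

```python
from math import prod

# Hypothetical fold effect of the low-expressing allele relative to the
# high-expressing allele for each cis-regulatory interval.
# 0.5 is a rounded stand-in for the ~2-fold reporter differences in the text;
# 1.0 is a placeholder where the text reports no fold change.
ILLUSTRATIVE_EFFECTS = {
    "SNP5": 1.0,   # placeholder: magnitude not reported
    "SNP11": 1.0,  # placeholder: magnitude not reported
    "SNP17": 0.5,  # low-expressing construct ~2-fold less active
    "SNP23": 0.5,  # high-expressing allele ~2-fold more active
    "SNP28": 0.5,  # low-expressing construct ~2-fold less active
}


def combined_low_vs_high(effects):
    """Return the low/high expression ratio under an assumed multiplicative model.

    Each value in `effects` is treated as an independent fold effect;
    real intervals may interact, so this is only a toy calculation.
    """
    return prod(effects.values())


if __name__ == "__main__":
    ratio = combined_low_vs_high(ILLUSTRATIVE_EFFECTS)
    print(f"low/high expression ratio under the toy model: {ratio}")
    print(f"fold difference between haplotypes: {1 / ratio}")
```

Under these assumed values, three intervals that each halve expression would already give a severalfold difference between haplotypes, which is the qualitative point of the paragraph above. The numbers themselves should not be read as estimates.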
Previous studies examining allele-specific expression differences have focused on analyzing single SNPs or SNPs grouped in a short interval, such as a promoter [29]. Thus, our study provides important new insights into the complexities of the molecular mechanisms underlying allele-expression differences. The fact that each of the five cis-regulatory SNPs we characterized contributes to just a fraction of the observed variation indicates that allele-specific expression is itself a complex trait. Interestingly, the finding that allelic-expression differences can result from the interaction of multiple cis-regulatory SNPs may explain the difficulties researchers frequently encounter when trying to discover the “causative SNP” underlying a linkage peak or in an interval identified as associated with a trait in a genetic study.
It is well established that non-coding sequences conserved between humans and mice can represent functional regulatory elements [35]. However, in a previous study we demonstrated that functional cis-regulatory sequences in humans can be missing in other mammals, even closely related primate species [38]. Based on the functional data in that study, we proposed that this class of cis-regulatory sequences represents rapidly evolving elements that are responsible for gene expression differences between species. Since the cis-regulatory SNPs identified here are involved in intra-species gene regulatory differences, the fact that four of the intervals, SNP5, SNP11, SNP17, and SNP23, are not evolutionarily conserved between humans and mice is consistent with our previous observations and hypothesis. The fact that the SNP28 interval appears to have dual functions as a cis-regulatory sequence and an exonic sequence in the KRT1B gene raises the possibility that transcription of the KRT1 genes is linked by a novel mechanism.
KRT1 has not previously been shown to have a functional role in white blood cells, and hence we are unable to state whether or not the observed expression differences between the low-expressing and high-expressing haplotype patterns have physiological relevance. Interestingly, a recent study indicates that allele-specific expression differences observed in white blood cells can be associated with physiological relevance in other tissues [39]. The investigators of this study identified two genes with allelic-expression differences in white blood cells isolated from osteoarthritis patients and those isolated from control individuals, and they also showed that these same two genes contain 5′ SNPs with statistically significant association with osteoarthritis. KRT1 is expressed in the basal layer of the epidermis and plays a major role in the differentiation and function of keratinocytes [11]. KRT1 expression is down-regulated in keratinocytes in response to wounding [40]. This down-regulation of KRT1 expression is thought to be necessary for keratinocytes to make the morphological changes required for migration [40] into the wound site. Based on the functional role of KRT1, it is interesting to hypothesize that the allele-specific expression differences observed in human white blood cells may be associated with keratinocyte migration rates in response to wounding. In theory, if keratinocytes homozygous for the low-expressing haplotype pattern down-regulate KRT1 expression more quickly than those homozygous for the high-expressing pattern, they should migrate sooner in response to wounding.