Genetic and transcriptional variation are key factors in biological evolution and in predisposition to disease. Single nucleotide polymorphisms (SNPs) are one type of DNA sequence alteration commonly used as a marker for tracking genetic variation. The allelic frequency of a SNP at a given locus can vary between populations, and the genotype at a SNP may result in a particular phenotype, trait or disease. Within populations and under certain biological conditions, genes are coordinately regulated by transcript-regulators (TRs) such as transcription factors (TFs), cofactors, complexes of TFs and miRNAs. These co-expressed genes often share biological functions and work in concert to mediate cellular events such as biological processes and molecular pathways. Although it has been shown that TFs do not harbor trans-acting variants, treating the expression of coordinately regulated genes as a quantitative trait locus (eQTL) and coupling it with SNP genotypes in a genome-wide association study (GWAS) can presumably help to elucidate, on a genomic and systems biology scale, the variation in gene expression (TReQTLs) that underlies particular phenotypes and complex diseases.
Tailoring the GWAS eQTL analysis to genes with coordinated expression adds value by revealing master regulators of transcriptional genetic variation. We used a multivariate linear regression with the gene expression of known downstream targets (DSTs) of TRs as the response variable and individual SNPs as predictor variables to identify TReQTLs in European (CEU) and African (YRI) HapMap populations. At a nominal p-value threshold of <1×10⁻⁶ we discovered 234 SNPs in CEU and 154 in YRI as putative TReQTLs. These represent 36 and 39 independent (tag) SNPs in CEU and YRI, affecting the DSTs of 25 and 36 TRs, respectively. Two SNPs (one in each population) are cis-acting TReQTLs (within 1 kb of a gene) at a false discovery rate (FDR) of 45%. One of them, a SNP in the pecanex-like 2 (Pcnxl2) gene, was found in CEU to be highly associated with the DSTs of the cAMP responsive element modulator (CREM) transcription factor, whereas in the YRI dataset a SNP was linked to the DSTs of the miRNA hsa-miR-125a. Although the FDR may seem abnormally high, such that one would expect at least one if not both of these TReQTLs to be false positives, it can be misleading: others have demonstrated that adjusting for biases which arise from correlations in eQTL analysis is a major challenge and can lead to a substantial overestimation of the number of false positives.
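The regression described above can be illustrated with a minimal sketch. It assumes additive genotype coding (0, 1 or 2 copies of the minor allele) and uses Wilks' lambda to test one SNP against all DSTs of a TR jointly; neither choice is stated in the text, and the function name `snp_vs_targets` and the data layout are hypothetical.

```python
from typing import List, Sequence, Tuple


def _det(matrix: Sequence[Sequence[float]]) -> float:
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def _sscp(resid: Sequence[Sequence[float]]) -> List[List[float]]:
    """Sums of squares and cross-products matrix of residual columns."""
    q = len(resid[0])
    return [[sum(r[i] * r[j] for r in resid) for j in range(q)]
            for i in range(q)]


def snp_vs_targets(
    genotypes: Sequence[float],
    expression: Sequence[Sequence[float]],
) -> Tuple[List[float], float, float]:
    """Regress the expression of q target genes on one SNP.

    genotypes[i]     : assumed additive coding (0/1/2) for sample i
    expression[i][j] : expression of target gene j in sample i
    Returns (per-gene slopes, Wilks' lambda, F statistic).
    Requires n > q + 1 samples and residuals that are not collinear.
    """
    n, q = len(genotypes), len(expression[0])
    g_mean = sum(genotypes) / n
    y_mean = [sum(row[j] for row in expression) / n for j in range(q)]
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    # Per-gene least-squares slope of expression on genotype.
    slopes = [
        sum((g - g_mean) * (row[j] - y_mean[j])
            for g, row in zip(genotypes, expression)) / sxx
        for j in range(q)
    ]
    # Residuals under the intercept-only model and under the SNP model.
    null_resid = [[row[j] - y_mean[j] for j in range(q)] for row in expression]
    full_resid = [
        [row[j] - y_mean[j] - slopes[j] * (g - g_mean) for j in range(q)]
        for g, row in zip(genotypes, expression)
    ]
    wilks = _det(_sscp(full_resid)) / _det(_sscp(null_resid))
    # With a single predictor this F transform of Wilks' lambda is exact,
    # with (q, n - q - 1) degrees of freedom.
    f_stat = (1.0 - wilks) / wilks * (n - q - 1) / q
    return slopes, wilks, f_stat
```

A small Wilks' lambda (close to 0) means the SNP explains much of the joint variation in the target genes. Converting the F statistic to a p-value requires an F-distribution tail function, which is left out here to keep the sketch dependency-free.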
Interestingly, the gene expression of the DSTs of 24 TRs was associated with SNPs (albeit different ones) in both populations, but the majority differed. The overlap in the TReQTLs probably reflects the ubiquity of certain basic biological processes such as transcription regulation, cell communication, transport, kinase activity, growth and development. On the other hand, one TReQTL tag SNP (rs3790904) in the CEU population is associated with the DSTs of the X-linked breast cancer suppressor gene Foxp3 but is not significant in YRI (p = 0.89). The interaction network of the Foxp3 TReQTL in CEU revealed that tumor necrosis factor (TNF), NF-kappaB and variants in G-protein coupled receptor (GPCR) signaling may play central roles as communicators in Foxp3 functional regulation. Although the Foxp3 tumor suppressor is biologically relevant in the pathogenesis of breast cancer, some have shown that germline SNPs in the gene are not associated with the risk of the disease. Our TReQTL analysis reveals other potentially interesting loci which might be causative in the etiology of complex diseases.
Another difference between the two populations based on the TReQTLs was the connectivity of the underlying Gene Ontology (GO) biological processes that the genes of the TReQTLs represent. In CEU, several SNPs associated with the variation of expression for the DSTs of two miRNAs (hsa-mir-181b-1 (MI0000270) and hsa-mir-181b-2 (MI0000683)) map to the peptidyl-prolyl cis-trans isomerase pseudogene and yield a subtree with synaptic transmission as the most cohesive descriptive GO term. The activity of this enzyme has been suggested to be necessary for memory formation and may be involved in complex neurodegenerative disorders such as Alzheimer's disease. In YRI, a SNP (rs12258754) controlling the variation of expression for the DSTs of activating transcription factor 3 (Atf3) yielded a subtree with vascular smooth muscle cell (VSMC) contraction as the most descriptive GO term. Although little is currently known about the function of Atf3 in VSMCs, mutations in the actin, alpha 2 (Acta2) smooth muscle gene have been shown to result in a variety of vascular diseases. Transcriptional networks such as these have recently been shown to be hubs with high connectivity that are associated with the control of higher-order biological functions such as lipogenesis, lipid trafficking and surfactant homeostasis. Our approach embraces this strategy by using the SNPs within the TReQTLs as an adjudicator for the identification of master regulators of these genetic networks. Although a TR and its DSTs are expected to share a common signaling pathway, it is not certain that the SNP associated with the eQTL from the TR and DSTs will reside near or in a gene with biological functionality that forms a cohesive GO biological process subtree. Bear in mind that it is not known where the true regulating TR associated with a candidate TReQTL actually exerts its biological functionality and, to date, there is no independent data set with gene expression and genotype calls from another sample of the YRI and CEU populations with which to replicate our results. However, once the genotype data from Idaghdour et al. are made publicly available, we will be able to use them to determine whether our TReQTLs can discern between Moroccan populations according to geographical location, regional differences and ancestry. Furthermore, in-depth functional analyses of TR targets will presumably shed light on these TReQTL regulatory networks and perhaps biologically confirm our results.
McCauley et al. reported that SNPs in multi-species conserved sequences (MCS) are useful as markers linking to complex diseases. Recent evidence suggests that SNPs that influence alternative splicing are enriched within splice junctions (SJs) or disrupt splicing enhancers. Our analysis of Foxp3 TReQTLs revealed SNPs overrepresented within 5-way (human, mouse, chimp, rhesus monkey and dog) evolutionarily conserved regions (ECRs) in CEU and within SJs defined by RNA-Seq mapping in YRI. These results support the notion that genomics, genetics and transcriptomics play an intricate role in sustaining population diversity and structure. It would be interesting to determine how environmental factors, population structure and geographical differences affect transcript abundance as a quantitative trait when co-regulation of gene expression is considered.
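One common way to quantify overrepresentation of SNPs in a class of regions, such as ECRs or splice junctions, is a one-sided hypergeometric test. The text does not say which test was used, so the sketch below is an assumption, and the function name and all counts in the usage note are hypothetical.

```python
from math import comb


def enrichment_p_value(
    total_snps: int,
    snps_in_region: int,
    selected_snps: int,
    selected_in_region: int,
) -> float:
    """One-sided hypergeometric p-value for overrepresentation.

    total_snps         : all SNPs tested (the background)
    snps_in_region     : background SNPs that fall inside the region class
    selected_snps      : SNPs of interest (e.g. TReQTL SNPs)
    selected_in_region : SNPs of interest that fall inside the region class
    Returns P(X >= selected_in_region) when selected_snps are drawn from
    the background at random without replacement.
    """
    upper = min(snps_in_region, selected_snps)
    outside = total_snps - snps_in_region
    # Count the draws that place k selected SNPs inside the region class.
    tail = sum(
        comb(snps_in_region, k) * comb(outside, selected_snps - k)
        for k in range(selected_in_region, upper + 1)
    )
    return tail / comb(total_snps, selected_snps)
```

For example, `enrichment_p_value(10, 4, 3, 2)` asks how likely it is that at least 2 of 3 selected SNPs land in a region class holding 4 of 10 background SNPs. A real analysis would also need to account for linkage disequilibrium between SNPs, which this simple test ignores.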
Although the identification of TReQTLs is useful for determining genetic variants regulating gene expression, the approach has limitations and the results should be interpreted with care. First, there is a paucity of information about the genes which TRs control. We restricted our analysis to the 333 TRs with two or more DSTs known at the time to be regulated by them. This does not capture the full array of genetic variants which might contribute to the gene expression differences between the two populations. However, as advances in functional genomics lead to improved knowledge about gene regulation and biological function on a genome-wide scale, the discovery of TReQTLs should advance and become more informative. In addition, the study of the regulation of genes by miRNAs is in its infancy and only a small number of miRNAs are known to regulate genes. Furthermore, our analysis only tested the association of a single SNP with sets of coordinately expressed genes. It is very likely that the variation in expression is due to the synergistic effect of two or more SNPs. In fact, there may be mediators of complex diseases other than SNPs, acting alone or synergistically. Finally, our work relied on samples from immortalized lymphoblastoid cell lines (LCLs) and not from a disease state. Therefore, it is debatable whether the genetic associations of SNPs with gene expression in LCLs will carry over to tissue samples from organs. However, there is some indication, albeit limited, that the DNA repair capacity of LCLs from breast cancer samples is significantly lower than that of control subjects, that tumor-infiltrating Foxp3+ regulatory T cells can distinguish between high-risk breast cancer patients and those at risk of a late relapse, and that a fraction of eQTLs derived from the analysis of UK Adult Twin registry LCL gene expression and genotype data overlap with those identified in a HapMap population. Despite the caveats noted above, associating genetic markers such as SNPs with quantitative traits such as co-regulated genes is promising and of value as an additional strategy when investigating the role of genetic variants and master regulators in the etiology of complex diseases.