In the STAGE study, we profiled five CAD-relevant tissues to identify functionally associated genes with potential importance in coronary atherosclerosis. This analysis revealed 128 genes that were strongly associated with atherosclerosis severity (the A-module). The A-module was enriched with genetic risk for CAD and involved the TEML pathway. Parts of the A-module were active in both the atherosclerotic arterial wall and visceral fat. The latter may be a local source of inflammation contributing to coronary atherosclerosis. We also identified a putative high-hierarchy regulator of the A-module, LDB2, which was robustly expressed in all major lesion cell types in both lesion-free arterial wall and late atherosclerotic lesions. Interestingly, key genes in the TEML pathway were differentially regulated in the arterial wall of Ldb2-deficient mice. Our findings suggest that the A-module, including LDB2, is important in the regulation of TEML and in atherosclerosis development.
TEML is an established pathway in atherosclerosis and other inflammatory diseases. Transendothelial migration of monocytes is essential for foam-cell formation and for the early phases of atherogenesis, and transendothelial migration of T-cells may be central in later phases. Indeed, leukocyte migration has been suggested as a therapeutic target. The identified module was enriched in genes involved in TEML and thus may be causally involved in the development of clinically significant atherosclerotic lesions (as indicated by the extent of coronary stenosis and IMT). However, most of the identified A-module genes lack pathway annotations; future studies may show that they are important for leukocyte migration or its regulation.
The STAGE study was designed as a “top-down” systems biological approach to identify gene networks or groups of otherwise functionally associated genes (modules) of importance for disease severity. The term “top-down” refers to our belief that these modules must first be identified in clinical studies as the most disease-relevant and then be detailed in subsequent studies in animal and cellular models to reveal high-resolution networks. In contrast, “bottom-up” systems biology approaches first identify full biological networks in prokaryotic or yeast cells and then examine their roles in more disease-relevant systems. Systems biological approaches have advantages over traditional gene-expression profiling studies, which usually focus on identifying individual genes differentially expressed as a result of disease. Such gene-by-gene analyses generate many false positives because of a vast “multiple testing” problem. In contrast, the two-way clustering approach first focuses on identifying functionally associated genes (which in the current study reduced the number of genes from 12,621 to 3958, represented in 60 tissue clusters) and then investigates whether the generated clusters (not individual genes) are related to a given disease phenotype.
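The scale of the multiple-testing problem can be illustrated with a short arithmetic sketch. It assumes a per-test significance level of 0.05 and that every null hypothesis is true; both are illustrative assumptions rather than values taken from the study, and the function names are hypothetical.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of false positives if all n_tests null hypotheses are true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


N_GENES = 12621   # genes tested one by one in a gene-by-gene analysis
N_CLUSTERS = 60   # tissue clusters tested against the phenotype instead

gene_level_fp = expected_false_positives(N_GENES)
cluster_level_fp = expected_false_positives(N_CLUSTERS)

print(f"gene-by-gene:  ~{gene_level_fp:.0f} expected false positives")
print(f"cluster-level: ~{cluster_level_fp:.0f} expected false positives")
print(f"Bonferroni cutoff (genes):    {bonferroni_threshold(N_GENES):.2e}")
print(f"Bonferroni cutoff (clusters): {bonferroni_threshold(N_CLUSTERS):.2e}")
```

Under these assumptions, testing 60 clusters rather than 12,621 individual genes lowers the expected number of chance findings from roughly 631 to 3 and makes the corrected per-test threshold far less stringent.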
Using a multi-organ approach, we hypothesized that the liver, skeletal muscle, or fat deposits would harbour functionally related genes (e.g., clusters, modules, networks) reflecting molecular processes in those organs that affect the levels of inflammatory mediators, blood lipids, glucose, or unknown blood constituents contributing to coronary atherosclerosis development. There were no clusters relating to the extent of coronary atherosclerosis in the liver or skeletal muscle. This was surprising given the importance of these organs for CAD risk factors, such as plasma cholesterol and diabetes. However, therapies to reduce plasma lipid and glucose levels might have normalized disease-promoting activities in CAD-modules in these organs. In contrast, we identified one part of the A-module in visceral fat that segregated patients according to the degree of coronary stenosis. Although the relation of visceral fat to CAD risk factors in blood is less clear, a high waist-hip ratio, an indicator of increased visceral fat mass in the abdomen, is a strong predictor of CAD. An interesting aspect of the visceral fat in the mediastinum is its anatomic location and the possibility that it is a source of local macrophages releasing inflammatory mediators. Another possible cellular source of the TEML-enriched atherosclerosis module in visceral fat is endothelial cells, which are relatively enriched in this tissue. Although our study does not directly address the cellular origin of the A-module in visceral fat or how it contributes to atherosclerosis, visceral fat might be a local source of inflammatory mediators that increase the rate of atherosclerosis progression.
In all, 60 tissue clusters were identified, two of which (one in atherosclerotic lesions and one in visceral fat) were related to the extent of coronary atherosclerosis. This might appear to be a small fraction (2/60, ~3%). However, since the first clustering step takes no phenotypic data into consideration but is based entirely on the mRNA signals in each tissue, these 60 clusters may relate to tissue physiology or to subtraits of CABG patients. Examining the latter possibility, we found that as many as 41 of the tissue clusters (besides the two related to the extent of coronary atherosclerosis) segregated the patients into groups with significant differences in the levels of subtraits (not shown).
The gene expression clustering was done with the absolute value of the Spearman rank correlation as the distance measure. Thus, we also included inversely correlated genes, which could be implicated in the same pathway and be functionally related. Moreover, the Spearman rank correlation is a non-parametric measure that is robust to outliers and in this sense a better distance measure than the commonly used Euclidean and Manhattan distances, for which the magnitude of expression levels matters. Of note, a clustering algorithm can produce different clusters depending on the distance measure used, and the A-module could therefore have been different, or even lost, with another choice of distance metric.
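A minimal sketch of clustering on an absolute-Spearman distance is given below. It assumes the distance is defined as 1 − |ρ| and uses average-linkage hierarchical clustering from SciPy as a stand-in; the linkage method, the function names, and the toy data are illustrative assumptions, not the procedure used in the study.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import rankdata


def abs_spearman_distance(expr: np.ndarray) -> np.ndarray:
    """Return a genes x genes matrix of 1 - |Spearman rho| (expr: genes x samples)."""
    ranks = np.apply_along_axis(rankdata, 1, expr)  # rank each gene across samples
    rho = np.corrcoef(ranks)                        # Pearson on ranks = Spearman
    dist = 1.0 - np.abs(rho)
    np.fill_diagonal(dist, 0.0)
    return np.clip(dist, 0.0, 1.0)                  # guard against rounding error


def cluster_genes(expr: np.ndarray, n_clusters: int) -> np.ndarray:
    """Cut an average-linkage tree (assumed linkage) into n_clusters groups."""
    condensed = squareform(abs_spearman_distance(expr), checks=False)
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical toy data: four genes measured in twelve samples.
base = np.arange(12.0)
other = np.array([0, 11, 1, 10, 2, 9, 3, 8, 4, 7, 5, 6], dtype=float)
expr = np.vstack([
    base,                 # gene 0
    -(base ** 3),         # gene 1: inversely but monotonically related to gene 0
    other,                # gene 2: unrelated pattern
    np.exp(other / 4.0),  # gene 3: nonlinear but monotonic function of gene 2
])

labels = cluster_genes(expr, n_clusters=2)
print(labels)
```

In this toy example, gene 1 is grouped with gene 0 even though the two are inversely related, and gene 3 is grouped with gene 2 despite the nonlinear scaling. A Euclidean distance on the raw values would treat both pairs as far apart.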
We used atherosclerotic aortic wall/internal mammary artery (IMA) ratios to highlight atherosclerosis gene expression in the aortic wall because both aortic wall and IMA samples contain normal wall gene expression. Unlike the aortic wall, however, the IMA has no atherosclerosis. This notion was supported by macro- and microscopic examinations of randomly chosen sets of aortic wall and IMA samples. Moreover, two-way clustering of mRNA signals from the aortic wall samples alone did not generate any cluster that segregated patients by stenosis scores (not shown), which may be due to a relatively large portion of normal vascular wall gene expression in this tissue. However, we cannot entirely exclude the possibility that using the aortic wall/IMA ratios resulted in some false-positive genes (nonatherosclerosis genes related to normal vascular wall gene expression) that should have been excluded from the A-module, or in false-negative genes that should have been included.
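The idea behind the ratio can be sketched with a few lines of code. The log2 transformation, the gene names, and the signal values are hypothetical choices made for illustration; the text specifies only that aortic wall/IMA ratios were used.

```python
import math


def log2_ratio(aortic_signal: float, ima_signal: float) -> float:
    """Log2 of the aortic wall / IMA signal ratio for one gene in one patient."""
    if aortic_signal <= 0 or ima_signal <= 0:
        raise ValueError("signals must be positive to form a log ratio")
    return math.log2(aortic_signal / ima_signal)


# Hypothetical signals for one patient: gene -> (aortic wall, IMA).
patient_signals = {
    "gene_shared": (200.0, 200.0),  # normal-wall expression present in both tissues
    "gene_lesion": (800.0, 100.0),  # expression elevated only in the diseased wall
}

ratios = {gene: log2_ratio(aorta, ima)
          for gene, (aorta, ima) in patient_signals.items()}
print(ratios)
```

Expression shared by both vessel walls cancels to a log ratio near zero, whereas expression specific to the atherosclerotic aortic wall stands out as a positive value.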
We decided to use two different atherosclerosis cohorts: coronary for the exploratory step and carotid for the confirmatory step. In doing so, we added credibility to the confirmatory step that would have been lost had we used the same cohort for exploration and confirmation. The validation in the carotid cohort indicates a general importance of the A-module in atherosclerosis and at the same time rules out the risk that any of the tissue clusters identified in the STAGE cohort were a result of the exploratory study design (e.g., the choice of sample locations and/or the use of ratios instead of straight expression) rather than related to atherosclerosis. The extents of coronary and carotid atherosclerosis (as judged from the surrogate measurements of stenosis score and IMT) have repeatedly been shown to be highly correlated. This observation is not entirely surprising, since atherosclerosis development and the principal molecular processes underlying it have been found to be very similar in all major arteries, regardless of location.
Currently, GWAS are given much attention in leading scientific journals. However, such studies have some limitations, since they are primarily designed to identify the relatively few DNA variants that influence the risk of developing complex diseases, like CAD, independently of other risk factors. In the current study, we used a recently published GWAS to further validate the A-module genes by calculating the relative enrichment of genetic CAD risk in the module. Unlike today's GWAS, which link DNA variation directly to clinical phenotypes, future studies that also include intermediate expression phenotypes have the potential to extract much more disease-relevant information on DNA variation that contributes to the development of complex diseases. For now, this information remains hidden in the data generated by GWAS.
Genes encoding LIM domain-binding factors such as LDB2 were initially isolated in a screen for proteins that physically interact with the LIM domains of nuclear proteins. These proteins bind to a variety of TFs and are likely to function as enhancers, bringing together diverse TFs to form higher-order activation complexes. Our screen of LDB2-associated TFs identified ISL-1alpha, Lmo2, Lhx3a, Lhx3b, LHX2, LHX4, and BRCA1. ISL-1alpha enhances HNF4 activity and thus insulin signaling. Lmo2 is involved in angiogenesis. Lhx3 and Lhx4 regulate proliferation and differentiation of pituitary-specific cell lineages and are expressed in subsets of lymphocytes and thymocyte tumor cell lines. BRCA1 is associated with a selective deficiency in spontaneous and LPS-induced production of tumor necrosis factor (TNF)-α and in TNF-α-induced expression of intercellular adhesion molecule-1 (ICAM1) on peripheral blood monocytes, and with control of the life cycle of T-lymphocytes. LDB2 has not previously been related to CAD or atherosclerosis. Because of its high-hierarchy regulatory role and involvement in diverse biological processes, LDB2 is an interesting target for further evaluation in complex diseases.
As the only transcriptional regulator among the six genes related to atherosclerosis severity that were present in all three tissue clusters, LDB2 was chosen for functional validation in atherosclerosis. Although none of the other five genes is a transcriptional regulator, they may still be functionally important for atherosclerosis development; this remains to be determined. In nonatherosclerotic arterial wall and in early lesions, LDB2 was mainly expressed by the endothelium. In late lesions, LDB2 expression was more intense and was seen mainly in macrophages/foam cells but also in SMCs. The TEML pathway has been implicated in both early and late atherosclerosis. This pathway is also active in lesion SMCs, which accompany endothelial cells in recruiting monocytes from the blood to the atherosclerotic plaque
. The pattern of LDB2
expression seen in early and late lesions has been observed for other key TEML genes (Vcam1
, and -16
, and Cdc
. The notion that LDB2 is an important regulator of TEML is further supported by the fact that 13 key genes in TEML were differentially expressed in the arterial wall of Ldb2−/− mice as early as 6 weeks of age. Five of those genes have previously been shown to affect atherosclerosis in mouse model studies. In addition, a very recent study demonstrated that LDB2 regulates cell migration both in vitro and in vivo. However, whether LDB2 is an important regulator of atherosclerosis development remains to be determined.
Although it cannot be excluded that the A-module is also important in the early stages of atherosclerosis (e.g., by promoting early lesion development through activation of TEML in the atherosclerosis-free endothelium), the current study mainly supports a role for the A-module in the late stages of coronary atherosclerosis. If the activity of this cassette of genes is mirrored, at least in part, by gene expression in blood (i.e., in leukocytes) or by plasma protein levels, the A-module may be helpful, as a complement to semi-invasive investigations (e.g., angiography), as a marker of the degree of coronary and carotid stenosis.
In conclusion, by adopting a new strategy for functional analysis of expression profiles isolated from multiple CAD-relevant organs, we identified a module that is genetically enriched with CAD risk and important for TEML and atherosclerosis development. The clinical usefulness and exact role in CAD of this module and its high-hierarchy regulator LDB2 merit further investigation.