The methods illustrated here provide a means of translating large data sets that capture global gene and protein expression changes during hMSC differentiation into simplified models. The performance of the Tucker1, Tucker3, and PARAFAC models sometimes differed considerably. For example, in our proteomics data set (case study I), Tucker3 appeared to perform best in locus link mode. The K-means algorithm identified two clusters (Figure ), one of which (the outliers) included genes that participate in signal transduction, especially calcium/calmodulin-associated proteins (e.g., calmodulin-dependent protein kinase α, β, and γ), and in the control of transcription and translation (e.g., STAT1, SYNCRIP). These genes may be of special interest because they fall outside the majority of "common" genes that the model can reduce, and thus may contribute uniquely to the distinct protein profiles in this data set. Tucker1 and PARAFAC were comparatively poor at identifying meaningful clusters. Nevertheless, the outliers identified by each method shared one common feature: they all included a set of six genes, five of which participate in signal transduction pathways (see Figure ). One of these, calmodulin-dependent protein kinase II delta, was previously identified by SVD analysis as a candidate "osteogenic" gene, in that it is expressed in the hMSC populations most closely resembling osteoblasts. This bolsters our belief that calcium/calmodulin signaling is especially important during hMSC osteogenic differentiation.
When one considers the set of proteins shared by at least two of the three methods, this pattern becomes even clearer: additional isoforms of calmodulin-dependent protein kinase II and other signaling proteins (caldesmon, PDLIM7, RhoA, RhoC, and protein phosphatase 2) emphasize the importance of integrin-associated signaling pathways during ECM-induced differentiation. These interpretations are consistent with those of others who have applied similar techniques to other stem cell data sets (refs: UID# 15257023, 17541472, 17625253). These models also included a number of muscle-associated proteins (tropomyosin, two myosin isoforms, CAPZB), suggesting that bone and muscle differentiation may be closely related. This also agrees with our previous analysis of osteogenic gene focusing in response to tensile strain, in which we observed a drop in the expression of marker genes for many different lineages (nerve, fat, cartilage) but no drop in smooth muscle cell markers [28].
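The comparison across methods described above (proteins flagged by all three models, or by at least two of the three) amounts to a simple vote over the outlier lists. The following is a minimal sketch of that idea, not the procedure actually used in this study; the function name `consensus` and the gene identifiers are hypothetical placeholders.

```python
from collections import Counter


def consensus(outlier_sets, min_votes=2):
    """Return identifiers that appear in at least `min_votes` of the sets."""
    # Each set contributes at most one vote per identifier.
    votes = Counter(gene for s in outlier_sets for gene in set(s))
    return {gene for gene, n in votes.items() if n >= min_votes}


# Hypothetical outlier lists, for illustration only (not the study's results).
tucker1 = {"g1", "g2", "g3"}
tucker3 = {"g2", "g3", "g4"}
parafac = {"g3", "g4", "g5"}

shared_by_all = consensus([tucker1, tucker3, parafac], min_votes=3)
shared_by_two = consensus([tucker1, tucker3, parafac], min_votes=2)
```

Lowering `min_votes` from 3 to 2 trades stringency for coverage, which is why the two-of-three list can reveal additional members of the same pathways.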
In category mode, PARAFAC yielded the most interesting clustering results for the proteomics data set, in that it identified clusters of functionally related genes that contribute the most to the model. Furthermore, many of these genes afford a plausible biological explanation for how hMSC undergo differentiation. PARAFAC selected the greatest number of categories, yet organized them into three clearly distinct clusters. P cluster 1 contained categories primarily concerned with nucleotide binding and metabolism, and resembled the gene expression cluster (T1 cluster 2) in the Tucker1 analysis. P cluster 2 closely resembled the signal transduction cluster (T1 cluster 3) in the Tucker1 model, and added one further category, phosphorus metabolism. The third cluster contained categories not found in the other two models, which centered on the theme of extracellular matrix protein synthesis and modification. We previously identified these categories as significant during hMSC differentiation.
Tucker1 and Tucker3 identified smaller sets of outliers. The first cluster in the Tucker1 model (T1 cluster 1) contained two categories primarily associated with cell survival, and therefore sheds little light on the potential mechanisms underlying hMSC differentiation. However, T1 cluster 2 and T1 cluster 3 contained categories concerned with control of gene expression and signal transduction, respectively. Given the tight association between these activities and their clear association with cellular differentiation, selection of these categories may help identify the potential mechanisms used by hMSC during osteogenic differentiation. In particular, the signal transduction cluster (T1 cluster 3) contained categories concerned with traditional signaling pathways known to control differentiation. For example, calmodulin and calmodulin-dependent protein kinase II stimulate osteogenic differentiation of hMSC while promoting cell migration and suppressing cell growth [52]; all of these activities are contained in the signal transduction cluster. G proteins, which correspond to the purine nucleotide binding category, are well known to play an important role in osteogenic differentiation (reviewed in [53]). Again, these results agree with our previous analysis, which identified calcium-dependent signaling as an important factor in osteogenesis [32].
The plot of sample mode data from Tucker3 (Figure ) is quite informative. The wide separation of the NoStimulant and Osteoblast samples allows us to interpret the space between them as a form of "differentiation axis," and illustrates two important themes. First, populations of hMSC grown On_Vitronectin or On_Collagen lie midway between the unstimulated hMSC and osteoblasts, demonstrating the partial differentiation induced by these stimulants. Second, the observation that hMSC cultured IN_OSMedium lie beyond the intended target (Osteoblasts) suggests that OS may "over-stimulate" these cells. OS medium contains dexamethasone, a synthetic corticosteroid, and this population of cells expressed a distinct set of genes/proteins devoted to steroid metabolism. Both ECM and OS stimulants yield cells that resemble osteoblasts, yet they induce the expression of quite different genes. It is quite possible that the typical OS exposure regimen drives steroid metabolism genes beyond the level necessary for osteogenesis. It is also possible that a combination of the genes expressed in ECM-stimulated cells and the genes expressed in OS-stimulated cells would yield a phenotype closer to true osteoblasts than either set of genes alone. Curiously, the same type of plot for the PARAFAC data (Figure ) offered no clear, biologically meaningful relationship between the samples.
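The "differentiation axis" reading of the sample mode plot can be made quantitative by projecting each sample's scores onto the line from the unstimulated population to the osteoblasts. The sketch below is one possible formalization and is not part of the original analysis; the helper `axis_position` is hypothetical and the coordinates are invented for illustration, not taken from the figure.

```python
def axis_position(sample, start, target):
    """Scalar projection of `sample` onto the start -> target axis.

    Returns 0.0 at `start`, 1.0 at `target`, values between 0 and 1 for
    partial progress, and values above 1.0 for samples beyond the target.
    """
    axis = [t - s for s, t in zip(start, target)]
    offset = [x - s for s, x in zip(start, sample)]
    norm_sq = sum(a * a for a in axis)
    if norm_sq == 0:
        raise ValueError("start and target coincide; axis is undefined")
    return sum(o * a for o, a in zip(offset, axis)) / norm_sq


# Invented two-component sample scores (assumption, not data from this study).
scores = {
    "NoStimulant": (0.0, 0.0),
    "On_Collagen": (2.0, 1.0),
    "Osteoblast": (4.0, 0.0),
    "IN_OSMedium": (5.0, -1.0),
}

positions = {
    name: axis_position(xy, scores["NoStimulant"], scores["Osteoblast"])
    for name, xy in scores.items()
}
```

Under this convention, a stimulant that induces partial differentiation gives a position between 0 and 1, while a position above 1 corresponds to the "over-stimulation" pattern described for OS medium. The projection discards any displacement perpendicular to the axis, so it should be read alongside the plot rather than in place of it.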
The locus link analysis of our second (microarray) data set identified a set of genes ("outliers") that our model suggests contribute heavily to the variance between experimental groups (Table ). In other words, expression of these genes may discriminate between different states of hMSC differentiation. Consistent with our previous analysis, the majority of these genes can be organized into four subsets based on their functions. One class encodes proteins known to contribute to osteogenic differentiation and/or inhibit hMSC growth (FHL2, POSTN, LOX, LOXL1, SPARC, TMSB4X, CTHR1, FST, TGFB1), while a second contains markers for a closely related differentiation fate of hMSC, chondrogenesis (CHI3L1, COL1A1, COL3A1, COL6A3, COL8A1, COL12A1, CTGF, LUM). The fact that many of these are extracellular matrix (ECM) molecules or modifiers of the ECM underscores the importance of ECM in controlling differentiation of hMSC. Expression of at least two of these genes, CTGF and COL12A1, is controlled by mechanical strain. Consistent with our hypothesis that application of strain promotes osteogenic differentiation of hMSC by triggering ECM-associated signaling pathways, our outliers contain a third class of genes that participate in signal transduction and/or regulation of gene expression (IFITM1, CCPG1, FST, MAPKBP1, FBXL2, TRERF1, FHL2). Finally, our gene set contains markers for a range of different cell differentiation fates, including embryonic development (AMD1, SERPINH1), vasculogenesis (S100A4), hematopoiesis (B2M, IGFBP6, IKZF5), neurogenesis (BASP1, SERPINH1), and even osteoclastogenesis (CTSB, CTSD, CTSK); it is possible that these genes are downregulated in response to our osteogenic stimulus.
We intersected the 98% outliers obtained by both Tucker3 and PARAFAC analysis to compose a list of interesting genes.
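The intersection step can be sketched as follows. The text does not state how the 98% threshold was computed, so this example assumes, purely for illustration, that an outlier is a gene whose Euclidean distance from the centroid of the locus link mode loadings exceeds the 98th percentile (nearest-rank) of all such distances. The function names and the toy loadings are hypothetical.

```python
import math


def percentile_outliers(loadings, pct=98.0):
    """Return ids whose distance from the centroid exceeds the pct-th percentile.

    `loadings` maps an identifier to a tuple of component loadings.
    ASSUMPTION: the outlier definition used here is illustrative only.
    """
    ids = list(loadings)
    dim = len(loadings[ids[0]])
    centroid = [sum(loadings[i][d] for i in ids) / len(ids) for d in range(dim)]
    dist = {i: math.dist(loadings[i], centroid) for i in ids}
    ranked = sorted(dist.values())
    # Nearest-rank percentile: the k-th smallest distance is the cutoff.
    k = max(1, math.ceil(pct * len(ranked) / 100.0))
    cutoff = ranked[k - 1]
    return {i for i in ids if dist[i] > cutoff}


def toy_loadings(far):
    """Hypothetical loadings: 100 genes near the origin, except those in `far`."""
    pts = {f"g{i}": (0.01 * (i % 7), 0.01 * (i % 5)) for i in range(100)}
    pts.update(far)
    return pts


# Invented loadings for two models (not data from this study).
tucker3 = toy_loadings({"g98": (5.0, 5.0), "g99": (-5.0, 4.0)})
parafac = toy_loadings({"g50": (4.0, -6.0), "g99": (-6.0, 3.0)})

candidates = percentile_outliers(tucker3) & percentile_outliers(parafac)
```

Taking the intersection keeps only genes that are extreme under both decompositions, which guards against outliers that are artifacts of a single model's structure.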
Finally, the graph of the samples in Figure , when viewed as an axis of similarity between undifferentiated hMSC grown on tissue culture plastic (TCP) and hOST, is entirely consistent with our hypothesis, and strongly suggests that hMSC transdifferentiate towards the osteoblast phenotype under these conditions. Furthermore, at each time point tested (days 2, 4, and 5), the strained population lies closer to the osteoblasts than the unstrained population. This is consistent with our previous finding that application of tensile strain accelerates the osteogenic differentiation of hMSC [31].