This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Cartilage plays a fundamental role in the development of the human skeleton. Early in embryogenesis, mesenchymal cells condense and differentiate into chondrocytes to shape the early skeleton. Subsequently, the cartilage anlagen differentiate to form the growth plates, which are responsible for linear bone growth, and the articular chondrocytes, which facilitate joint function. However, despite the multiplicity of roles of cartilage during human fetal life, surprisingly little is known about its transcriptome. To address this, a whole genome microarray expression profile was generated using RNA isolated from 18–22 week human distal femur fetal cartilage and compared with a database of control normal human tissues aggregated at UCLA, termed Celsius.
161 cartilage-selective genes were identified, defined as genes significantly expressed in cartilage with low expression and little variation across a panel of 34 non-cartilage tissues. Among these 161 genes were cartilage-specific genes such as cartilage collagen genes and 25 genes which have been associated with skeletal phenotypes in humans and/or mice. Many of the other cartilage-selective genes do not have established roles in cartilage or are novel, unannotated genes. Quantitative RT-PCR confirmed the unique pattern of gene expression observed by microarray analysis.
Defining the gene expression pattern for cartilage has identified new genes that may contribute to human skeletogenesis as well as provided further candidate genes for skeletal dysplasias. The data suggest that fetal cartilage is a complex and transcriptionally active tissue and demonstrate that the set of genes selectively expressed in the tissue has been greatly underestimated.
Skeletogenesis begins with condensation of mesenchymal chondroprogenitor cells to form the cartilage anlagen that pattern the early skeleton. Subsequently, for bones that grow by endochondral ossification, the chondrocytes differentiate further to establish the growth plates. At the joint surfaces, development of articular cartilage facilitates and maintains joint movement during fetal life. These multi-step processes require the coordinated expression of many genes, including genes encoding extracellular matrix proteins and morphogens, as well as proliferative, angiogenic, and apoptotic signals. Most of our knowledge of the function of the genes involved has been derived from developmental studies in model systems and cell lines as well as from the identification of disease genes in skeletal disorders.
Whole genome analysis of chondrocyte gene expression has the potential to reveal novel genes and gene expression programs which define the tissue. Although the complete set of genes expressed in human cartilage has not yet been described, analysis of human cartilage cDNA libraries has provided an initial in vivo picture of the cartilage transcriptome [3-6]. These investigations have also identified expression of both known and novel genes. Comparative microarray studies in rat cartilage and several chondrocyte cell lines [8,9] have provided a larger set of genes of potential importance in chondrocytes, including genes specific to the stages of chondrocyte differentiation. Wang et al. (2004) identified 92 genes with two-fold variation in expression between hypertrophic and proliferative growth plate chondrocytes. In this in vivo study, significant gene expression changes were principally associated with cell cycle, transcription, extracellular matrix structure, receptor and transporter functions. In microarray studies of mouse micromass cultures, 212 genes exhibited at least a ten-fold difference in gene expression as the cultures differentiated. Thus global characterization of gene expression is beginning to describe the identities of key regulatory molecules and their targets in chondrocytes.
Disrupting genes involved in the organization and maturation of the growth plate and/or the stability of articular cartilage results in inherited skeletal disorders that range from perinatal lethal phenotypes to mild disorders with early-onset osteoarthropathy as their major feature [10,11]. Of the approximately 370 clinically distinguishable skeletal dysplasias, mutations in 115 genes have been associated with about 150 disorders. Many of these disease genes are expressed in a cartilage-selective pattern, and therefore identifying additional genes uniquely expressed in cartilage should yield new skeletal dysplasia candidate genes.
To identify a larger set of genes uniquely expressed in chondrocytes, this study describes a genome-scale gene expression profile for 18–22 week human fetal cartilage. There were 161 genes which appeared to be selectively expressed in fetal cartilage, comprising a variety of novel genes that may contribute to skeletal development. The data suggest a complex pattern of cartilage gene expression and indicate that the number of genes selectively expressed in cartilage has been greatly underestimated.
To define a set of genes preferentially or uniquely expressed in normal human fetal cartilage, cartilage probeset intensities were compared with probeset intensities across a variety of normal tissues. A two-step process was employed for gene identification, consisting of a training step and a validation step (see the additional data file 1, for a flow chart of an overview of the analysis). The tissue-selectivity of a representative sampling of the identified genes was confirmed by quantitative RT-PCR.
The training dataset consisted of five independent cartilage samples and 41 non-cartilage samples, all analyzed using Affymetrix U133 Plus 2.0 arrays. The average correlation coefficient among the cartilage samples (R²) was 0.96. To identify unbiased relationships within the data, and to test the robustness of the normalization and tissue-specificity, an unsupervised approach, in which the genes and tissues were grouped based only on expression patterns, was employed. Probesets with the greatest variation across all tissues and whose expression in at least two arrays differed by at least two standard deviations from their mean expression across the entire set of samples were selected. This selection yielded 9483 probesets.
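The variation filter described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function name `passes_variation_filter` is hypothetical, and the standard deviation is assumed to be the sample standard deviation computed per probeset across all arrays, which the text does not specify.

```python
from statistics import mean, stdev

def passes_variation_filter(values, n_sd=2.0, min_arrays=2):
    """Return True if at least `min_arrays` arrays lie `n_sd` or more
    standard deviations away from the probeset's mean expression.

    `values` holds one normalized (log2) intensity per array.
    Assumption: the SD is the per-probeset sample standard deviation.
    """
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        # A completely flat probeset carries no variation to select on.
        return False
    outliers = [v for v in values if abs(v - mu) >= n_sd * sd]
    return len(outliers) >= min_arrays
```

With 46 arrays, a probeset that is flat in 44 arrays and clearly elevated in two would pass, whereas one elevated in a single array would not.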
Two-way hierarchical clustering based on similarity of expression of these 9483 probesets within the samples was performed (Figure 2). Samples from the same tissues clustered together, indicating that the normalization was sufficiently robust to allow tissue-selective expression patterns to be identified. Even with these relatively non-stringent selection criteria, the results showed a surprisingly large number of genes with a fetal cartilage-selective expression pattern. At least 89 probesets representing 64 genes with coordinately higher expression in cartilage relative to non-cartilage tissues appeared to drive the clustering of the two groups (Figure 2B). These probesets formed a gene expression node in the dendrogram which shared an overall expression correlation of 0.99. The genes represented by these probesets included some well established cartilage-selective genes, including aggrecan (AGC1), type X collagen (COL10A1), and matrilin 3 (MATN3), among others. Thus, a comparative approach with microarrays can identify genes whose expression is cartilage-selective.
To define a ranked list of genes significantly expressed in cartilage, a supervised analysis, comparing cartilage versus non-cartilage gene expression, was employed. This consisted of a two-class analysis with a modified t-test (SAM) (see additional data file 2 for the complete results of this analysis). There were 2634 probesets representing 1720 genes with at least three-fold differential expression when comparing cartilage and non-cartilage tissues, with a false discovery rate of zero. Of these, 2446 of the 2634 probesets demonstrated higher expression in cartilage with respect to non-cartilage tissues, while the remaining 188 probesets were expressed at significantly higher levels in the other tissues. As observed for the hierarchical clustering, probesets representing well-known cartilage markers, including COL2A1, AGC1, COMP, COL9A3, and MMP3, were among the top genes listed. In addition, lubricin (PRG4), an articular cartilage-specific marker, was also identified, confirming the ability to identify genes specific to fetal articular cartilage. Indeed, among the top 35 probesets more highly expressed in cartilage, only four probesets, representing unannotated genes, were derived from genes not previously known to be expressed in cartilage.
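For readers unfamiliar with the modified t-test used by SAM, the per-probeset statistic can be sketched as below. This is a simplified illustration, not the SAM software itself: the function names are hypothetical, the fudge factor `s0` is passed in as a plain constant (SAM estimates it from the data), and the permutation step that yields the false discovery rate is omitted. The fold-change helper assumes log2-transformed intensities.

```python
from math import sqrt
from statistics import mean

def sam_like_statistic(group_a, group_b, s0=0.0):
    """Relative difference d = (mean_a - mean_b) / (s + s0).

    `s` is the pooled standard error of the two-class difference.
    `s0` is a small positive constant that damps probesets with very
    low variance; here it is an assumed input, not estimated.
    With s0 == 0 and zero variance in both groups this divides by zero.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = mean(group_a), mean(group_b)
    sum_sq = (sum((x - mean_a) ** 2 for x in group_a)
              + sum((x - mean_b) ** 2 for x in group_b))
    s = sqrt((1 / n_a + 1 / n_b) * sum_sq / (n_a + n_b - 2))
    return (mean_a - mean_b) / (s + s0)

def fold_change_from_log2(group_a, group_b):
    """Fold change of group_a over group_b for log2 intensities."""
    return 2 ** (mean(group_a) - mean(group_b))
```

Under these assumptions, a probeset would meet the three-fold criterion when `fold_change_from_log2(cartilage, other) >= 3`, and its rank would follow the size of the statistic.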
Three array platforms were used to validate the 2446 probesets identified in the supervised analysis and generate a robust list of cartilage-selective genes (Table 1). A majority of these probesets (2245 probesets, approximately 92%) were identified in 124 U133A and 74 U133B arrays using the Celsius database (see Materials and Methods), and represented expression from 34 normal tissues. A small proportion of the probesets (201/2446) are not found on the Affymetrix™ Human Genome U133A/B Arrays, so these probesets were identified in the analysis of 26 U133 Plus 2.0 arrays, representing eight non-cartilage tissues. A summary of the validation and the tissue distribution are available as additional files.
Of the three platforms, the U133A dataset was the most robust with regard to the number of arrays, biological replicates, diversity of tissues, probes identified, and gene annotation. From this platform, 1363 of the 2446 probesets identified in the SAM analysis as expressed at a higher level in cartilage were obtained. Two hundred seventy-four of the 1363 probesets, representing 237 genes, exhibited at least five-fold higher expression when compared to non-cartilage tissues and were ranked by cartilage-specificity using an analog of coefficient of variation (CV) (see Methods). Of these, 56 probesets, representing 49 genes, were identified with a CV < 50% in non-cartilage samples, constituting the cartilage-selective gene set from this platform (Table 1, left). Twenty of these genes have mutations that have been associated with skeletal phenotypes in humans and/or mice, representing 44% of the probes selected from this platform.
Eight hundred eighty-two of the 2446 probesets were identified from the U133B validation set. Two hundred fourteen of these probesets, representing 158 genes, were well measured in cartilage with at least five-fold higher expression in cartilage relative to non-cartilage tissues. Of these, 77 probesets had a CV less than 50% in non-cartilage samples, representing 71 cartilage-selective genes (Table 1, center), including 3 genes also identified using the U133A platform (COL11A1, EDIL3, and PDPN).
A subset of the cartilage-selective genes was represented only on the Human Genome U133 Plus 2.0 arrays and was selected from the analysis of 28 non-cartilage samples. In total, 201/2446 probesets were not represented in the U133A/B array set. Of these 201 probesets, 96 probesets, representing 85 genes, had at least five-fold higher expression in cartilage than in non-cartilage samples. By including the CV selection criterion, 52 probesets, representing 50 cartilage-selective genes, were identified and added to the complete tally (Table 1, right), including 6 genes also identified using the U133A and U133B arrays (IRAK2, NRP2, WTAP, PITPNC1, AKR1C2, and PTK2).
In summary, 480 genes demonstrating enriched or specific expression in cartilage were selected from the comparison of cartilage and non-cartilage tissues with data derived from the U133A (n = 237), U133B (n = 158) and U133 Plus 2.0 (n = 85) platforms. Of these, a non-redundant set of 161 genes (Table 2), including 11 uncharacterized genes and 16 genes represented by unannotated probesets, was classified as cartilage-selective. These data greatly expand the number of genes known to be selectively expressed in cartilage and emphasize the unique pattern of gene expression that determines its properties.
Quantitative RT-PCR was used to independently assess the tissue-selectivity of the genes identified in the microarray analysis. For each of the three microarray platforms, the probesets with a CV less than 50% were divided into 10% intervals (0–10% CV, 10–20% etc.) (Table 1), and one gene from the middle of each interval was selected for analysis by qRT-PCR.
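The sampling scheme for the qRT-PCR genes can be illustrated with a short sketch. The function name and the gene labels in the usage example are hypothetical, and "the middle of each interval" is interpreted here as the gene at the middle rank within the interval; the text does not say whether rank or CV value was used.

```python
def pick_middle_per_interval(cv_by_gene, width=10.0, upper=50.0):
    """Group genes into CV intervals [0, 10), [10, 20), ... below `upper`
    and return the gene at the middle rank of each non-empty interval.

    `cv_by_gene` maps a gene label to its CV in percent.
    """
    bins = {}
    for gene, cv in cv_by_gene.items():
        if 0 <= cv < upper:
            bins.setdefault(int(cv // width), []).append((cv, gene))
    picks = {}
    for index in sorted(bins):
        members = sorted(bins[index])  # order genes by CV within the interval
        label = f"{index * width:g}-{(index + 1) * width:g}%"
        picks[label] = members[len(members) // 2][1]
    return picks
```

For example, `pick_middle_per_interval({"geneA": 3.0, "geneB": 5.0, "geneC": 9.0})` returns `{"0-10%": "geneB"}`.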
All thirteen genes analyzed demonstrated higher expression in cartilage than in the seven non-cartilage tissues studied (Table 3). Also, with one exception (OSMR), the genes met the selection threshold imposed for the microarray analysis, namely at least five-fold higher expression in cartilage than the average expression among all non-cartilage tissues. For most of the genes studied by qRT-PCR, there was little expression in the seven non-cartilage tissues (median Ct = 33.2), indicating that including the coefficient of variation in the ranking algorithm preferentially identifies genes selectively expressed in cartilage. Also, there was an inverse correlation between the gene rank and the standard deviation in expression level among non-cartilage tissues, indicating that genes with a higher rank were more selectively expressed in cartilage. Finally, there was a trend of decreasing cartilage selectivity moving from the U133A to U133B to U133 Plus 2.0 qRT-PCR validations, likely reflecting the decreasing robustness of comparison datasets in the respective platforms. Overall, the qRT-PCR experiments replicated and validated findings derived from the comparative microarray data.
Using genome-scale microarrays, gene expression in human fetal cartilage was compared with a robust set of other normal tissues. Hierarchical clustering showed remarkable similarity among the 18–22 week fetal cartilage expression profiles and demonstrated that a subset of the cartilage transcriptome is composed of a unique gene set not generally expressed in the other tissues studied. Using SAM, 2446 probesets measured preferential expression of 1712 genes with at least three-fold higher expression in cartilage as compared with other tissues. 1028 (42%) of these probesets matched genes identified in a cartilage growth plate cDNA library, validating their expression in cartilage via an independent dataset. The identification of genes known to have restricted patterns of expression in cartilage confirmed the presence of RNA derived from the reserve (GREM1), hypertrophic (BMP6, COL10A1), and terminally differentiated (MMP13) chondrocytes, in addition to genes expressed throughout all zones of the growth plate. This analysis suggested that there is differential transcriptional regulation of many genes in fetal cartilage and that the data could be used to identify genes selectively expressed in the tissue.
Tissue-selective genes have been previously defined as genes with enriched expression in a particular tissue and characterized with algorithms dependent on the degree of differential expression relative to other tissues, including t-test, SAM, fold change [14,17,18], and enrichment scores. While these approaches successfully identify tissue-selective genes, the reliance on fold change reduces the significance of many selectively expressed genes with low fold change. To compensate for this and identify cartilage-selective genes expressed at lower levels, the approach presented here placed increased significance on the preferentially expressed genes that showed the least variation of expression in non-cartilage tissues. This was made possible by the use of publicly released reference gene expression data generated on the same platform and led to the reliable identification of genes with lower fold changes, but high cartilage selectivity. The impact of the use of coefficient of variation on the ranked gene list is apparent in Tables 1 and 2. In the U133A dataset, nine of the top 25 genes were ranked higher than 100 in significance in the SAM ranking. The average fold change of the probes for these nine genes was 10.7, while the average fold change of the probes for the other 14 of the top 25 genes was 42.2. One of these genes, COL10A1, was among the top four cartilage-selective genes using the CV algorithm but ranked at 284 by SAM (Table 1). In the U133B dataset, which contains a higher percentage of unannotated genes, 4 of the top 50 probes had a SAM ranking below 100, and the average SAM ranking was 576. Overall, to identify only the most cartilage-selective genes, a threshold of 50% coefficient of variation was used across all three platforms, yielding 161 cartilage-selective genes.
A subset of 13 of the 161 cartilage-selective genes was studied by quantitative RT-PCR in cartilage and seven non-cartilage tissues to independently assess tissue selectivity. The data confirmed the cartilage-selectivity of genes with less than 50% CV, validating the selection procedure and suggesting that the gene expression patterns determined by microarray analysis are representative.
The coefficient of variation selection approach could, in theory, equally select for three different patterns of expression: cartilage-specific genes; genes with a consistent level of baseline expression in non-cartilage tissues; and genes with significant but equal expression in all tissues (e.g. housekeeping genes). In this data analysis, however, the most highly ranked genes consistently demonstrated little or no expression in non-cartilage tissues. The data thus demonstrate that incorporating coefficient of variation preferentially selected for genes not significantly expressed in non-cartilage tissues, yielding genes likely to have important and perhaps unique roles in cartilage.
Regardless of expression level, a cartilage-selective expression pattern suggests that the product of each identified gene may have a functional role in the development of the skeleton. Concordant with this hypothesis, mutations in 25 of the 161 selected genes have been associated with skeletal phenotypes in humans and/or mice. Included among them were the products of the well characterized genes encoding aggrecan and the cartilage-specific collagens, gene products known to have a prominent role in skeletal development and endochondral ossification. By this measure, the remaining genes may be candidate genes for skeletal dysplasias in which the disease gene has yet to be identified. As new skeletal dysplasia loci are defined, coincidence between a locus and a cartilage-selective gene may promote rapid identification of the disease gene. Knockout of the orthologous genes in mice would also facilitate exploring the role of each gene in skeletal development.
Classification of the biological roles of the products of the cartilage-selective gene set reveals genes with diverse functions including structural proteins of the cartilage extracellular matrix, enzymes that modify them, and 41 gene products with unannotated function. There were 65 genes that are components of signaling pathways, and only 43% of these were identified by sequence analysis of a comparable fetal cartilage cDNA library. Among the genes were elements of the nitric oxide, VEGF, TNF/RANK, and gp130 pathways, all of which have known roles in the growth plate [19-23]. Mutations in the genes encoding some of the molecules in these pathways, including RPS6SKA3, LIFR, TNFRSF11A and IKBKG, have been associated with human skeletal dysplasias, again suggesting that the remaining genes may also serve critical roles in endochondral ossification.
Multiple genes encoding members of the LIF/gp130 signaling pathway met the definition of cartilage-selective genes. LIF is a cytokine that is expressed in terminally-differentiated growth plate chondrocytes and signals through the gp130/LIFR complex. Homozygosity for loss of function mutations in the LIF receptor produces the recessively inherited skeletal dysplasia, Stuve-Wiedemann syndrome. In addition to their skeletal features, these patients have cardiovascular, pulmonary, gastrointestinal, neurologic and metabolic abnormalities, likely attributable to the role that LIFR plays in embryonic or fetal development. Genes on the cartilage-selective gene list upstream of the receptor include RELA and RELB, NF-κB survival transcription factors that increase transcription of LIF, as well as the LIF gene itself. Through the LIFR/gp130 complex, LIF can regulate both the JAK/STAT and ERK MAP kinase pathways. Pathway components downstream of the receptor include ATF1, part of the ATF1/CREB transcription factor complex that participates in ERK MAP kinase signaling [27,28]. The ATF1/CREB complex is also regulated by phosphorylation by the product of the RPS6SKA3 gene [29,30], another gene in the MAPK/ERK pathway that is associated with a skeletal phenotype. The gene encoding RPS, a phosphorylation target of RPS6SKA3, was also cartilage-selective, but the role of this protein in growth plate differentiation has yet to be determined. Finally, the gene encoding FOSL1, a FOS-like transcription factor activated by the ERK/MAPK pathway which binds cJUN to form a transcription complex [31,32], was among the cartilage-selective genes identified. Thus comparative microarray analysis has identified multiple components of a regulatory pathway that can be explored to further evaluate their importance in growth plate differentiation and endochondral ossification.
While this study has provided a deep set of genes that exhibit a cartilage-selective expression pattern, there are some limitations to the analysis. First, the study focused on total cartilage RNA, including all types of growth plate chondrocytes, as well as articular cartilage. As a result, it cannot be determined if the selected genes are expressed in all types of chondrocytes or only a subset of cells. In this context, nine of the cartilage-selective genes have been shown to be more highly expressed in hypertrophic cells relative to proliferating chondrocytes in the rat and/or mouse [7,8]. Second, the cartilage samples were derived from a single anatomic site and a narrow window of fetal development, so it is unclear to what extent the observed gene expression pattern can be generalized. Third, neither all possible non-cartilage tissues nor each type of cell within each tissue were studied, so cartilage-selectivity could be affected if additional fetal and/or adult tissues that express the identified genes were found. This may be particularly important for other connective tissues such as bone, tendon and ligament, which contain cells known to express some of the cartilage-selective genes identified here.
Not all genes selectively expressed in developing cartilage will necessarily be identified using this approach. For instance, the COL2A1 gene fell just outside the rigorous 50% CV threshold set to define cartilage-selectivity. The underlying reasons for this are complex. Probe performance, as well as the known expression of COL2A1 in fetal liver and heart, is likely to have had an effect, as both factors could have contributed to the variation in expression in non-cartilage tissues. In addition, the approach presented here treated the three expression platforms, U133A, U133B and U133 Plus 2.0, equally from the viewpoint of the threshold for cartilage-selectivity. Because the comparative dataset of normal tissues was both broader and deeper for the U133A platform, additional genes from this platform, albeit with greater than 50% CV, could be considered to be tissue-selective (e.g. COL2A1). Thus, a platform-specific threshold would likely yield additional genes of interest within the U133A dataset. Finally, tissue-specific genes were identified using only microarrays and a single generalized algorithm. Additional genes selectively expressed in cartilage could be identified by less stringent criteria or other methods.
Genome-scale comparative expression analysis using human fetal cartilage and a broad set of normal human tissues has identified 161 cartilage-selective genes, including 27 uncharacterized genes. The data identify novel gene products that may play essential roles in normal skeletogenesis and suggest new candidates for the over 100 inherited skeletal disorders in which the disease gene has not been identified. The results demonstrate that fetal cartilage is a complex and transcriptionally active tissue, and that the set of genes selectively expressed in cartilage has been greatly underestimated.
A flow chart outlining methods and results as well as other supplemental information is provided in additional data file 1.
Seven independent 18–22 week normal human fetal cartilage samples were studied under an Institutional Review Board approved protocol. Cartilage from the distal femur was dissected to remove bone and any adherent non-cartilage tissue (Figure 1). RNA was isolated and purified as previously described, and the quality and quantity of RNA were confirmed using an Agilent 2100 bioanalyzer and a Nanodrop ND-1000 spectrophotometer, respectively. Probe labeling, microarray hybridization, washing and scanning were carried out as detailed in Affymetrix protocols. Five samples were used to probe Affymetrix™ U133 Plus 2.0 microarrays, and two samples were used to probe the Affymetrix™ Human Genome U133A/B set. Annotations were from version 11/15/06. The data are publicly available in the GEO database series [GEO:GSE6565]. An additional sample was fixed in formalin, sectioned and stained with toluidine blue.
This project made use of the Celsius database [34,35], which is a database of publicly available microarray datasets from Gene Expression Omnibus, Array Express, and individual databases. Only CEL files are entered into the database, permitting reprocessing using identical algorithms to enable experimental comparisons. Only data from Affymetrix™ Human Genome U133A/B and Plus 2.0 platforms that contained clear annotation that they were derived from normal human tissues were selected for this analysis.
Raw data were normalized using the RMA algorithm with default parameters, available as part of the Bioconductor R library [36,37]. In brief, each CEL file was processed separately with an invariant pool of 50 arrays from a matching platform. Higher signal intensities observed in a subset of U133A non-cartilage samples from one provider were additionally normalized by subtracting the median from other non-cartilage samples. After normalization, the training dataset was log2 transformed prior to analysis.
In highly expressed cartilage genes, the degree of cartilage selectivity was defined as a median derived analog of CV (average deviation/median) applied to expression of these genes in non-cartilage tissues. The analog of CV was used to allow for greater tolerance of gene expression in some cartilage containing tissues without affecting the cartilage-selective assessment. A CV of 50% was empirically determined as a mathematically acceptable threshold for cartilage selectivity for probes across all three validation datasets (U133A, U133B, and U133 Plus 2.0).
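A minimal sketch of the median-derived CV analog may help make the definition concrete. It assumes that "average deviation" means the mean absolute deviation from the median and that intensities are on a linear scale; neither detail is stated explicitly in the text, and the function name `cv_analog` is hypothetical.

```python
from statistics import median

def cv_analog(values):
    """Median-derived analog of the coefficient of variation, in percent.

    Computed as (mean absolute deviation from the median) / median * 100
    over the non-cartilage intensities of one probeset.
    Assumption: deviations are taken from the median, on a linear scale.
    """
    med = median(values)
    if med == 0:
        raise ValueError("median is zero; the CV analog is undefined")
    avg_dev = sum(abs(v - med) for v in values) / len(values)
    return 100.0 * avg_dev / med
```

For instance, `cv_analog([100, 100, 100, 100, 200])` gives 20.0: because the median is unaffected by the single elevated tissue, one outlying sample raises the score only modestly, which is consistent with the stated aim of tolerating expression in a few cartilage-containing tissues.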
For the unsupervised analysis, the median probe intensity in cartilage was subtracted from the probe intensities in all tissues, so all expression was defined relative to cartilage (i.e. the median of cartilage expression was set at zero) for each selected probe. Probesets with the greatest variation across all tissues and whose expression in at least two samples differed by two standard deviations from their mean expression across the entire set of samples were selected. Two-way hierarchical clustering was performed using Pearson's correlation to group genes and arrays based on the similarity of their expression patterns. For the supervised analysis, the significance analysis of microarrays (SAM) two class method was applied. One hundred permutations were used and at least three-fold variation between cartilage and non-cartilage expression was required.
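The two preprocessing ideas used for clustering, centering each probe on its cartilage median and comparing profiles by Pearson's correlation, can be sketched as follows. The helper names are hypothetical, the distance `1 - r` is a common choice that is assumed here rather than taken from the text, and the hierarchical clustering step itself is not shown.

```python
from math import sqrt
from statistics import mean, median

def center_on_cartilage(profile, cartilage_idx):
    """Subtract the median cartilage intensity from every sample, so the
    cartilage median of the returned profile is zero."""
    ref = median(profile[i] for i in cartilage_idx)
    return [v - ref for v in profile]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_distance(x, y):
    """Dissimilarity for clustering (assumed form): 0 for perfectly
    correlated profiles, 2 for perfectly anti-correlated ones."""
    return 1.0 - pearson(x, y)
```

A clustering routine would then merge the genes (or arrays) with the smallest pairwise `correlation_distance` first.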
The probeset expression profiles for the 2,446 probesets identified in the training set (see Results) were acquired from 224 arrays representing 34 different tissues on three different platforms: Affymetrix U133A, U133B and U133 Plus 2.0. The data from the first platform, U133A, consisted of 1363 probeset profiles from 124 arrays, representing two normal fetal cartilage and 122 normal non-cartilage (32 tissues) samples (see additional data file 3 for the tissue distribution of samples used in this analysis). Arrays represented in this dataset were mostly from two large normal tissue expression profiling projects. The second dataset, U133B, consisted of 882 expression profiles identified on 74 U133B microarrays from two normal fetal cartilage and 72 normal non-cartilage samples. These samples were primarily from the UCLA normal tissue microarray project (Chen, Day and Nelson, unpublished). The U133 Plus 2.0 dataset consisted of 201 expression profiles obtained from 26 Human Genome U133 Plus 2.0 arrays representing expression from five normal fetal cartilage and 21 non-cartilage (eight different tissue types) samples. The five cartilage arrays for the U133 Plus 2.0 platform were technical replicates of the arrays used in the training dataset. The non-cartilage arrays were a subset of the training dataset set aside for this validation only. From the 2446 probesets selected, probesets that exhibited at least a five-fold difference between the average cartilage intensity and the median signal intensity of all non-cartilage tissues were selected. Cartilage-specificity was then determined using a median-derived analog of coefficient of variation (CV) as described above. Probesets with less than 50% CV were defined as reflecting cartilage-selective expression.
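The two-stage validation rule (five-fold enrichment, then CV below 50%) can be written as a compact sketch. It is an illustration only: the function name, the dictionary layout, and the probeset identifiers in the usage example are hypothetical, intensities are assumed to be on a linear scale, "average deviation" is taken as the mean absolute deviation from the median, and ranking by ascending CV is an assumed reading of "ranked by cartilage-specificity".

```python
from statistics import mean, median

def select_cartilage_selective(probesets, min_fold=5.0, max_cv=50.0):
    """Return [(probeset_id, fold, cv)] for probesets passing both rules.

    `probesets` maps an id to (cartilage_values, non_cartilage_values).
    fold = mean(cartilage) / median(non-cartilage)
    cv   = mean absolute deviation from the non-cartilage median,
           divided by that median, in percent.
    Results are sorted by ascending cv (assumed ranking order).
    """
    selected = []
    for pid, (cartilage, other) in probesets.items():
        med = median(other)
        if med <= 0:
            continue  # fold and CV are undefined without a positive median
        fold = mean(cartilage) / med
        cv = 100.0 * sum(abs(v - med) for v in other) / len(other) / med
        if fold >= min_fold and cv < max_cv:
            selected.append((pid, fold, cv))
    return sorted(selected, key=lambda row: row[2])
```

A probeset with strong enrichment but erratic expression across non-cartilage tissues fails the CV rule, while one with uniform low background and modest enrichment passes, which is the behaviour the ranking is intended to reward.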
One microgram of RNA from seven tissues (brain, prostate, kidney, liver, heart, thyroid, and testis) in the FirstChoice® Human Total RNA survey panel (Ambion) was reverse transcribed using a high-capacity cDNA archive kit (ABI) and random primers. For cartilage, RNA from three independent cartilage samples was pooled and reverse transcribed. Amplification reactions were performed in triplicate using 100 ng of each cDNA. Thirty-five cycles of amplification were carried out in an ABI 7300 using the validated QuantiTect Gene Expression Assays and SYBR Green PCR kit (Qiagen). To assess specificity, amplification products were subjected to melting curve analysis and gel electrophoresis. The 2^(-ΔΔCt) method was employed to calculate relative amplification. This was performed using an average of endogenous references (18S, GAPDH, and HPRT1) to improve normalization across the panel of tissues used. For genes where no amplification was detected in a tissue, a Ct value of 35 was assigned, reflecting the maximum number of cycles carried out.
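The relative quantification step can be sketched as below. This is a generic 2^(-ΔΔCt) calculation under stated assumptions rather than the authors' exact procedure: the function names are hypothetical, the reference Ct values are combined by an arithmetic mean, amplification efficiency is taken as 100%, and the choice of calibrator tissue is left to the caller because the text does not name one.

```python
from statistics import mean

MAX_CT = 35.0  # cycles run; assigned when no amplification is detected

def delta_ct(target_ct, reference_cts):
    """Ct of the target gene minus the mean Ct of the reference genes."""
    return target_ct - mean(reference_cts)

def relative_expression(sample_ct, sample_refs, calib_ct, calib_refs):
    """Expression in the sample relative to the calibrator, 2^(-ddCt).

    Pass None as a target Ct when no amplification was detected; it is
    replaced by MAX_CT, as described in the text.
    """
    sample_ct = MAX_CT if sample_ct is None else sample_ct
    calib_ct = MAX_CT if calib_ct is None else calib_ct
    ddct = delta_ct(sample_ct, sample_refs) - delta_ct(calib_ct, calib_refs)
    return 2.0 ** (-ddct)
```

For example, with reference Cts of 15, 20 and 25 in both tissues, a target Ct of 22 in cartilage and 27 in the calibrator tissue gives `relative_expression(22.0, [15.0, 20.0, 25.0], 27.0, [15.0, 20.0, 25.0])` equal to 32, i.e. 32-fold higher expression in cartilage.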
CV: coefficient of variation; SAM: significance analysis of microarrays
VF designed and carried out the bioinformatics analysis and qRT-PCR, and drafted the manuscript. AD participated in data acquisition and normalization. DK obtained, dissected and isolated the cartilage RNA and critically revised the manuscript. ZAC participated in the bioinformatics analysis. ZC participated in generation of microarray data. SN participated in the design and coordination of the study and critically revised the manuscript. DC conceived the study, participated in its design and coordination, performed microarray analysis and critically revised the manuscript. All authors read and approved the final manuscript.
A flow chart illustrating a summary of the analysis. An unsupervised (black arrows) and supervised analysis (blue arrows) were performed with gene expression from 46 Affymetrix U133 Plus 2.0 arrays. An independent validation set comprising 224 Affymetrix arrays (dashed arrows) was also used to test the 1713 genes for the most robust fetal cartilage-selective genes.
Supervised analysis and summary of in silico validation. Ranked list of cartilage genes more highly expressed by at least three-fold in cartilage than non-cartilage tissues in the training dataset (five fetal cartilage samples compared to 41 normal non-cartilage samples). Ranked order is based on expression profiles obtained from 46 U133 Plus 2.0 arrays analyzed with SAM two-class analysis with 100 permutations and with a False Discovery Rate (FDR) of 0. (B) Each probeset was independently evaluated in the validation datasets for five-fold higher expression using independent samples and three independent platforms as outlined in Methods. Present indicates probe is identified in validation platform; Enriched indicates gene is expressed five-fold higher in cartilage than non-cartilage tissues; Cartilage selectivity indicates at least five-fold higher expression in cartilage than non-cartilage with a CV score of < 50% in non-cartilage samples. (C) "X" denotes gene was identified in a fetal cartilage cDNA library.
Tissue distribution of training and validation sets. (A) 31 non-cartilage tissues and 124 arrays were used for the validation of cartilage-selective genes identified on the U133A chip. Two fetal cartilage samples were compared against 122 non-cartilage samples. (B) 27 non-cartilage tissues and 74 arrays were used for the validation of cartilage-selective genes identified on the U133B chip. Two fetal cartilage samples were compared against 72 non-cartilage arrays. (C) Eight non-cartilage tissues and 26 arrays were used for the validation of cartilage-selective genes identified on the U133 Plus 2.0 chip. Five fetal cartilage samples were compared against 21 non-cartilage arrays.
The authors thank Louis Fridkis and Brian O'Connor for their contributions to the CELSIUS database. The authors also thank the UCLA microarray core for assistance with the generation and analysis of the microarray data. This work was supported in part by grants from the NIH (HD22657 and RR00425 to DHC and DK) and (HL072367 and U24NS052108 to SFN) and DK was supported by the Joseph Drown Foundation. AD was supported by a grant from the NSF UCLA-IGERT (DGE-9987641). DHC and DK are recipients of Winnick Family Clinical Scholars Awards.