The deciphering of the sequence of the human genome has raised the expectation of unravelling the specific role of each gene in physiology and pathology. High-throughput technologies for gene expression profiling provide the first practical basis for applying this information. In rheumatology, with its many diseases of unknown pathogenesis and puzzling inflammatory aspects, these technologies appear to promise a significant advance towards the identification of leading mechanisms of pathology. Expression patterns reflect the complexity of the molecular processes and are expected to provide the molecular basis for specific diagnosis, therapeutic stratification, long-term monitoring and prognostic evaluation. Identification of the molecular networks will help in the discovery of appropriate drug targets, and permit focusing on the most effective and least toxic compounds. Current limitations in screening technologies, experimental strategies and bioinformatic interpretation will shortly be overcome by the rapid development in this field. However, gene expression profiling, by its nature, will not provide biochemical information on functional activities of proteins and might only in part reflect underlying genetic dysfunction. Genomic and proteomic technologies will therefore be complementary in their scientific and clinical application.
Inflammatory rheumatic diseases are among the greatest diagnostic challenges in modern medicine. Especially in early cases there are usually no pathognomonic markers such as distinct clinical features, specific morphological changes by imaging or typical serological markers. Similarly to malignant situations, however, early diagnosis is essential to avoid destructive processes that will lead to a severely reduced quality of life, early invalidity and premature death.
In view of the limitations in clinical rheumatology, expectations of genomics are high. Gene expression profiling has opened new avenues. Instead of a single candidate or a handful of candidates, tens of thousands of different genes can be investigated at a given time. This technology is currently the most advanced and comprehensive approach to screening gene activity as well as molecular networks and has already been used in several clinical studies in rheumatic diseases. Although moving at a slower pace, proteome analyses are also rapidly improving and might provide further insight beyond the capabilities of transcriptome information. Furthermore, genome mutations predisposing to rheumatic diseases might help in both diagnosis and prognosis of the disease.
Clinical questions and expectations focus on molecular markers or profiles for initial diagnosis. Early diagnosis, as mentioned, is critical; gene expression profiles at this initial phase of the disease might provide valuable information on triggering mechanisms. Assessment of disease activity including organ involvement or destruction is currently limited to general markers of inflammation or organ function and needs profound improvement. On the basis of gene expression profiles from an initial molecular assessment of a patient, we expect to identify subclasses or different stages of the diseases with relevance to the therapeutic decision. As in only a few other diseases, our therapeutic anti-rheumatic armamentarium has been greatly enlarged by modern approaches of combination therapies, which include the use of biologics (namely, cytokine antagonists). Nevertheless, these modern strategies are effective only in a proportion of patients, potentially make the patients more prone to infections and represent an enormous economic burden to the health care system. Careful diagnostic stratification will therefore be crucial. Once therapy has been initiated, monitoring of effectiveness and responsiveness is essential and is currently dominated by scores derived from physical examination. Molecular measures are needed that define the quantity and quality of responsiveness to adjust the dosage or change the drug. Profiles might also give a clue to identifying toxic side effects and adverse events such as infectious complications. Prognostic molecular markers might arise from long-term studies by correlating initial expression profiles with the individual outcome.
From a pharmaceutical point of view, unravelling the molecular puzzle of rheumatic diseases might lead to the discovery of the dominant pathways in this network and provide novel targets for drug development. Current therapies in rheumatic diseases focus predominantly on the suppression of inflammation. However, destructive processes and loss of function, as in lupus nephritis or arthritic cartilage invasion and bone resorption, also demand the identification of targets to directly inhibit destruction and/or to induce regeneration and repair. A deeper knowledge of pathophysiological networks and gene expression profiling during drug development will facilitate the selection of the most effective and the least toxic compounds, thereby reducing costs and bringing new drugs to clinical application at an earlier stage.
To fulfil all these expectations, systematic analyses, collating of information and development of molecular network models will be essential and will provide the basis for functional interpretation.
An initial work by Heller and colleagues  introduced a customised array of 96 genes, demonstrating the usefulness of arrays in the analysis of inflammatory diseases such as rheumatoid arthritis (RA). Basing their work on a specific selection of genes, they identified in synovial tissue samples from RA the expression of the matrix metalloproteinases stromelysin 1, collagenase 1, gelatinase A and human matrix metallo-elastase, TIMP (tissue inhibitor of metalloproteinases) 1 and 3, interleukin (IL)-6, vascular cell adhesion molecule and discernible levels of monocyte chemotactic protein (MCP)-1, migration inhibitory factor and RANTES.
More advanced platform technologies with many thousands of genes up to genome-wide arrays have been applied in recent studies, aiming for new candidates, functional mechanisms and diagnostic patterns. Comparing autoimmune diseases with the response to influenza vaccination in healthy donors, Maas and colleagues investigated peripheral blood mononuclear cells (PBMCs) from patients with RA, systemic lupus erythematosus (SLE), type I diabetes and multiple sclerosis. Genes differentially expressed after vaccination were compared with the profiles of the four autoimmune groups. A panel of genes was extracted that discriminated between normal immune and autoimmune responses. However, the investigators could not identify genes that distinguished between different autoimmune diseases. Their candidates were predominantly genes involved in apoptosis, cell cycle progression, cell differentiation and cell migration, but not necessarily in the immune response. They further developed an algorithm to identify patients with these autoimmune diseases. Because this algorithm also assigned relatives of patients with autoimmune diseases to the disease group, the authors speculated that their gene selection might reflect a genetic trait rather than the disease process.
Gene expression profiling in lupus was reviewed recently in detail by Crow and Wohlgemuth . Four different groups [6-9] have independently identified an interferon signature by analysing PBMCs. One group  confirmed these findings by comparing the patients' profiles with in vitro-induced interferon (IFN)-α, IFN-β or IFN-γ signatures in PBMCs from healthy donors. This attributed 23 of 161 genes to induction by IFN. In addition to the IFN signature, Bennett and colleagues  found the differential expression of granulopoietic genes. As Ficoll separation usually excludes granulocytes, they became aware of a subpopulation of granular cells, which was co-separated only in SLE. These were identified as cells of the myeloid lineage, ranging from promyelocytes to segmented neutrophils.
Gu and colleagues  investigated PBMCs from spondyloarthropathies, RA and psoriatic arthritis on a 588-gene commercial platform. Their dominant candidates included MNDA, a myeloid nuclear differentiation antigen, two members of the S100 family of proteins, calgranulin A and B (involved in cellular processes such as cycle progression and differentiation), JAK3 and mitogen-activated protein kinase p38, tumour necrosis factor (TNF) receptors, the chemokine receptors CCR1 and CXCR4 and also IL-1β and IL-8. Because stromal cell-derived factor-1 (SDF-1), the ligand of CXCR4, was found increased in the synovial fluids of arthritides, the authors suggested an important role of this chemotactic axis in spondyloarthropathies and RA. In our studies on highly purified separated cells, these genes revealed the highest expression level in neutrophil granulocytes in comparison with cells positive for CD14, CD4 and CD8. In view of the findings by Bennett and colleagues  that granulocytes might be co-separated with PBMCs in inflammatory diseases such as SLE, these data need further confirmation.
Van der Pouw Kraan and colleagues investigated synovial tissue samples from RA and osteoarthritis (OA) [11,12]. Basing their decision on molecular profiles, they divided their RA samples into three subgroups: first, immune-related processes; second, complement-related activities with fibroblast dedifferentiation; and third, processes of tissue remodelling. Their analyses also reflect the established histological classification of RA into different subgroups, which is in part based on cellular composition . Furthermore, the STAT1 pathway was identified as being associated with immune-related processes. Our own data on synovial tissues, which were established on a different technology platform, confirm many of these findings . We also identified that some of the processes, especially those associated with tissue remodelling, are also active in OA compared with normal tissues .
A similar tissue-based approach showed various inflammatory genes to be upregulated in chronic inflammation of periprosthetic membranes of RA and OA patients in the process of prosthetic loosening.
To overcome the problem of unspecific dilution and to allow the histological association of complete profiles, Judex and colleagues  have presented an initial study on gene expression analysis of laser-microdissected areas from synovial tissues. They have been able to extract sufficient RNA from as few as 600 cells to perform subsequent array analysis.
In contrast, in vitro studies on isolated synovial fibroblasts from RA patients are well established. Pierer and colleagues  have investigated profiles of synoviocytes on a functional basis by stimulation through Toll-like receptor 2 with Staphylococcus aureus peptidoglycan. Their focus on chemokines revealed a preferential activation of granulocyte chemotactic protein (GCP)-2, RANTES, MCP-2, IL-8 and GRO2. Functional dependence on NF-κB for the induction of MCP-2, RANTES and GCP-2 was confirmed by inhibition experiments. Chemotactic importance for monocyte migration was demonstrated for RANTES and MCP-2, and for T-cell migration only for RANTES. The expression of GCP-2 and MCP-2, which have not yet been investigated in RA, was identified in both synovial tissue and synovial fluid.
Besides the application in human studies, gene expression profiling was also performed in arthritis models. Wester and colleagues investigated the effect of pristane-induced arthritis in DA rats in comparison with resistant E3 rats. The authors compared two different array platforms for a selected number of genes and also used pooled samples. They demonstrated variable cellular composition of the lymph nodes by fluorescence-activated cell sorting and identified only a relatively small number of genes that were differentially expressed, including mRNA for major histocompatibility complex class II antigen, immunoglobulins, CD28, mast cell protease 1, gelatinase B, carboxylesterase precursor, K-cadherin, cyclin G1, DNA polymerase and the tumour-associated glycoprotein E4.
By expression profiling in experimental SLE of NZB/W mice, Alexander and colleagues  identified endogenous retroviral transcripts in kidney tissue as the highest differentially expressed genes. Results were confirmed by in situ hybridisation, demonstrating retroviral transcripts in renal tubules and also in brain and lung tissue.
Azuma and colleagues used microarrays for the detection of new candidates in salivary gland tissue from the MRL/MpJ-lpr/lpr (MRL/lpr) mouse as a model of human secondary Sjögren's syndrome. Of nine genes that were confirmed by reverse transcriptase polymerase chain reaction (PCR), five had already been identified in patients with Sjögren's syndrome.
Firneisz and colleagues  used gene expression profiling in two genetically different arthritis mouse models [23,24] to identify genes involved in both models. Subsequently, they computed the spatial autocorrelation function, a statistical technique used in astrophysics, and identified critical clustering of selected genes in the two different genetic backgrounds of these mice.
Aidinis and colleagues  investigated immortalised synovial fibroblasts from human (h)TNF transgenic mice by microarray and differential display technology. Microarrays revealed 372 differentially regulated genes, whereas differential display provided many unknown sequences and a total of 49 different genes and sequences. Only 20% (n = 11) of these were represented on the mouse array. The significance of regulation was only partly confirmed, and one gene (SPARC) was identified as being regulated in both but in opposite directions. Functional clustering of all differentially regulated genes in either of the two methods revealed genes involved in stress response, energy production, transcription, RNA processing, protein synthesis and degradation, growth control, adhesion, cytoskeletal organisation, Ca2+ binding and antigen presentation.
As summarised in this short overview, gene expression profiling with microarrays has been applied in recent work to the identification of either diagnostic algorithms or new candidates and pathomechanisms, to functional studies in mouse and in vitro models, and to the calculation of potential genomic clusters associated with the disease. In a few studies different technologies or platforms were compared. In all studies, confirmation analysis was possible only for a limited number of genes. Concordance or divergence of results can therefore be estimated only from the selection of genes published in more detail.
Up to now, gene expression profiling has given only a first suggestion of candidates. It is still impossible to interpret comprehensively this overwhelming flood of data and the puzzling complexity of as yet insufficiently characterised molecular networks. Different platform technologies further complicate comparability. Nevertheless, the publication of results achieved with the current state of methodology is essential in the exchange and development of different approaches to gene expression profiling and in comparing selected candidates. This will improve our concepts to overcome the problems and limitations arising from this technology.
Array technologies and statistical algorithms, as they are established today, provide measures for signal intensities and differences on the basis of the abundance of mRNA in a given sample. In RA, the current results of array analyses  would not necessarily direct drug development towards the most favourable therapeutic targets such as TNF and IL-1. In SLE, an interferon signature was identified; however, indirect signs were detectable but not the cytokine itself . In contrast, genes of highly abundant proteins such as immunoglobulins, collagens and matrix metalloproteinases were readily identified by array analysis. Furthermore, the mRNA species of many cell surface receptors were also identified. These observations suggest that RNA abundance and detection by array techniques might be related to the functional category to which a gene belongs. This would be of special relevance to diagnostic and pathophysiological interpretation and therefore important to current limitations and perspectives.
Concerning the lack of detection of TNF, IL-1 or interferon as candidates for important regulators of pathomechanisms, the following possibilities might explain such limitations: first, the array hybridisation techniques might not be sensitive enough; second, signals derived from a defined cell population might be diluted below the threshold of significance in the complex tissues; or third, the stimulation might have occurred at a different location or time, leaving only its signature as an indirect sign of the activated pathway.
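The second possibility, dilution, can be illustrated with simple arithmetic. The following is a minimal sketch, not taken from any of the cited studies; it assumes that all cells share the same basal expression of the gene and that signals mix linearly in proportion to cell fractions. The function name and the numbers are hypothetical.

```python
def mixed_fold_change(fraction, fold_in_subset):
    """Apparent fold change in a tissue when only a cell subset responds.

    Illustrative assumptions: identical basal expression in all cells and
    linear mixing of signals in proportion to the cell fractions.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    # Responding cells contribute the induced level, all others the basal level (1.0)
    return fraction * fold_in_subset + (1.0 - fraction)


# Hypothetical numbers: a 10-fold induction restricted to 5% of the cells
apparent = mixed_fold_change(0.05, 10.0)
print(f"apparent fold change in the whole tissue: {apparent:.2f}")
```

Under these assumptions a tenfold induction in 5% of the cells appears as a change of less than 1.5-fold in the whole tissue, which could easily be missed, depending on the threshold chosen for the analysis.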
The application of purification or microdissection techniques might therefore increase sensitivity and improve our insight into the regulatory networks of important immune regulators. However, purification techniques might introduce artefacts. As an alternative approach, similar to Baechler's confirmation of the interferon signature, comparison with cytokine-induced gene expression signatures might provide an indirect measure for the activation of the TNF or IL-1 pathway.
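One simple way to express such an indirect measure is a signature score: the mean log ratio, in a patient sample, of the genes that were induced by the cytokine in vitro. The sketch below is only an illustration of this idea under the assumption that expression is available as log2 ratios against a control; it is not the method used in the cited studies, and the gene names and values are placeholders.

```python
from statistics import mean


def signature_score(sample_log_ratios, signature_genes):
    """Mean log2 ratio of the signature genes that were measured in the sample.

    sample_log_ratios: dict mapping gene name -> log2 ratio (sample vs control)
    signature_genes:   genes induced by the cytokine in a reference experiment
    """
    values = [sample_log_ratios[g] for g in signature_genes if g in sample_log_ratios]
    if not values:
        raise ValueError("no signature gene was measured in this sample")
    return mean(values)


# Placeholder data: GENE_X belongs to the signature but is not on the array
patient = {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.0, "GENE_D": -0.5}
cytokine_signature = ["GENE_A", "GENE_B", "GENE_X"]
print(signature_score(patient, cytokine_signature))
```

A score clearly above zero would suggest that the pathway has been activated even if the transcript of the cytokine itself is not detected; how large a score is meaningful would have to be established against reference samples.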
Besides the cytokines, genes of the intracellular signalling cascade are also important for the understanding of pathophysiology and might be relevant to drug targeting. Dependent on cell type and function, such proteins might be expressed from very low to relatively high basal levels. Upregulation of these genes might not exceed a certain limit of expression because protein concentration will quickly increase in the small intracellular compartment, where they act. Furthermore, the function of these factors is mostly regulated at the protein level. Therefore, in this category of molecules, detection of the quantitatively limited differences is also very difficult. Signals might be diluted and become undetectable if activation occurred in a localised manner. Differential expression between infiltrating and tissue cells might also confuse interpretation and falsely indicate regulation, especially when cellular composition is variable. This might also be crucial for separation procedures, when variable quantities of cells with different profiles remain as contaminants.
On the basis of these findings and general considerations, it is currently almost impossible for many signalling processes to become readily obvious as being truly regulated. A different cellular composition resulting from infiltration is inherent in the inflammatory processes analysed in rheumatology. Parameters that reflect this cellular composition and functional components might need to be introduced into the analysis to improve interpretation. The fact that molecular profiles enabled the identification of an unexpected subpopulation in PBMCs by Bennett and colleagues encourages one to believe in the possibility of identifying parameters for a molecular differential blood count or tissue composition. Thus, many of the currently published data will merit reevaluation when improved technologies of interpretation become available.
An extensive review of microarrays by Grant and colleagues describes the general features of spotting and photolithography array technology as well as the general tools for bioinformatic analysis of these arrays. Rapid advances in this field have brought new technologies to the market. These include PCR arrays, bead arrays and bioelectronic sensors [29-31].
Concerning the different slide or wafer-based array technologies, reproducibility and quality have undergone constant improvement for all platforms. Although photolithographic technology is currently highly efficient for genome-wide array analysis, new surfaces provided in the context of spotting technology might improve sensitivity . Gene expression profiles of only up to a few hundred genes might be determined more rapidly, with increased sensitivity and less expense, with the use of real-time PCR technology prefabricated on a card system with up to 384 different reactions.
In addition, with a relatively low investment, with less working time and with applicability to DNA  as well as protein or antibody screening, bead array systems can currently detect many hundreds of different products even from very small sample volumes. For example, Cook and colleagues  have applied this system to the detection of six different cytokines at the protein level in tears from allergic patients.
The newly evolving detection methods based on bioelectronic sensors form an electronic circuit that is mediated only by nucleic acid hybridisation. This very intriguing approach, which is currently applicable to DNA detection and mutation analysis, might soon become applicable to the quantification of cDNA. The system is currently established for only a few DNA species. With low investment and convenient application, it has the potential to be developed into a cost-effective bedside test.
Molecular profiles of previously published experiments are extremely complex. Bioinformatics has long been focusing on the technical challenges and the enormous amount of data from image analysis (millions of pixels per image) and comparisons of genes (several hundreds of thousands). Many efforts to distinguish signals from background and to identify and eliminate artefacts have now created high-quality platforms. Many algorithms to identify differential gene expression and to group similar profiles together have been established, using different types of distance measures, statistics and cluster methods. Supervised clustering, neural networks and classification algorithms might provide astonishing results [36-38].
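As one illustration of the distance measures mentioned above, the following sketch computes a Pearson correlation distance between two expression profiles, a measure often used as input to clustering. It is a generic example with made-up profiles, not the procedure of any particular study or software package.

```python
from math import sqrt


def correlation_distance(profile_a, profile_b):
    """Return 1 - Pearson r for two equally long expression profiles."""
    if len(profile_a) != len(profile_b) or len(profile_a) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_a = sum(profile_a) / len(profile_a)
    mean_b = sum(profile_b) / len(profile_b)
    centred_a = [a - mean_a for a in profile_a]
    centred_b = [b - mean_b for b in profile_b]
    covariance = sum(a * b for a, b in zip(centred_a, centred_b))
    norm_a = sqrt(sum(a * a for a in centred_a))
    norm_b = sqrt(sum(b * b for b in centred_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("a constant profile has no defined correlation")
    return 1.0 - covariance / (norm_a * norm_b)


# Made-up profiles of three genes across four samples
gene_1 = [1.0, 2.0, 3.0, 4.0]
gene_2 = [2.0, 4.0, 6.0, 8.0]   # same pattern as gene_1 at a higher level
gene_3 = [4.0, 3.0, 2.0, 1.0]   # opposite pattern
print(correlation_distance(gene_1, gene_2))
print(correlation_distance(gene_1, gene_3))
```

With this measure, genes that follow the same pattern at different absolute levels are close to each other (distance near 0), whereas genes regulated in opposite directions are maximally distant (distance near 2). A Euclidean distance would rank the same profiles differently, which is one reason why results depend on the measure chosen.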
However, these technologies are also regarded as black boxes by many clinical investigators, as leading away from understanding the principles of gene selection and disregarding established clinical experience or previous molecular knowledge. It is now becoming more than obvious that bioinformatics depends essentially on a basic knowledge of biology. 'Systems biology', 'molecular networks', 'biochemical systems theory'  and other meaningful terms have been used to express this basic need for a functional understanding of molecular mechanisms in biology. Our molecular knowledge – especially of a rheumatological background – has to be systematically collected and organised to make this information retrievable. Gene ontologies (GO) and functional networks in the KEGG or GenMAPP databases are still in their infancy. Interpretation in rheumatology is restricted to the personal knowledge and investigative capacity of the scientists and is susceptible to misinterpretation.
In the face of our limited knowledge of the role and function of most of the genes that we discover in our experiments, strategies for systematic investigation are essential. Gene expression profiling will essentially depend on valid statistical methods for estimating the reliability of gene selection. A combined analysis of molecular and clinical data will be necessary. Functional data need to be integrated into our interpretation to identify the key molecules that connect the network and define the boundaries of different phenotypes of the system . These will allow models to develop and will reduce our screening and analysis efforts to the principal components and actors.
Suggestions by Firestein and Pisetsky  underline the importance of an understandable and reproducible bioinformatics approach. Most software packages have now reached a level that provides enough statistical power for basic comparative analyses. Platform technologies are becoming increasingly available from professional suppliers and are achieving high reproducibility. Currently evolving high-throughput technologies that confirm gene expression profiling on a functional basis, such as protein, tissue  or cell arrays, are still limited to a few representative candidates. Analysis of defined cell populations will provide cornerstones to our view of systems biology but will not provide sufficient insight into the networks of functional units consisting of different interacting cells and organ systems.
Intelligent strategies will therefore be necessary, making use of the currently most advanced capabilities in gene expression profiling. Besides the principal limitations of mRNA quantification in comparison with proteomics and of functional interpretation, there are currently two general hurdles: a mixture of profiles from different cell types, and a mixture of profiles derived from different stimuli or functional processes. As in routine laboratory analysis, standards and ranges need to be defined to distinguish between different molecular phenotypes on an individual basis.
Defining signatures as specific patterns derived from singular functional or cellular entities, signatures of highly purified leucocyte cell types and precisely defined cellular stimulation (for example, stimulation by Toll-like receptor 2 in synovial fibroblasts) contribute to the establishment of such a systematic data collection for referencing. On the basis of such referenced information, algorithms have to be established that are able to identify the contribution of each cellular and functional component to the complex profile of an individual sample (Fig. 1).
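What such an algorithm might look like can be sketched under strong simplifying assumptions: that the profile of a mixed sample is a linear combination of the reference profiles of its cell types, and that the contributions cannot be negative. The example below estimates the contributions by projected gradient descent on the squared error. It is an illustrative sketch only; the function name, the reference values and the step size are hypothetical and not an established method from the literature reviewed here.

```python
def estimate_fractions(references, sample, steps=2000, rate=0.001):
    """Estimate non-negative mixing weights by projected gradient descent.

    references: dict mapping cell type -> list of expression values (one per gene)
    sample:     expression values of the mixed sample, in the same gene order
    The step size must be small relative to the scale of the reference values.
    """
    names = list(references)
    weights = {name: 1.0 / len(names) for name in names}
    n_genes = len(sample)
    for _ in range(steps):
        # Residual: modelled mixture minus observed profile, gene by gene
        residual = [
            sum(weights[n] * references[n][g] for n in names) - sample[g]
            for g in range(n_genes)
        ]
        for n in names:
            gradient = sum(residual[g] * references[n][g] for g in range(n_genes))
            # Gradient step, then projection onto non-negative values
            weights[n] = max(0.0, weights[n] - rate * gradient)
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("all estimated weights are zero")
    return {n: w / total for n, w in weights.items()}


# Hypothetical reference signatures (four genes) of three purified cell types
reference_profiles = {
    "T cell": [10.0, 1.0, 1.0, 5.0],
    "monocyte": [1.0, 10.0, 1.0, 5.0],
    "granulocyte": [1.0, 1.0, 10.0, 5.0],
}
# Synthetic sample mixed as 60% T cells, 30% monocytes, 10% granulocytes
mixed_sample = [6.4, 3.7, 1.9, 5.0]
fractions = estimate_fractions(reference_profiles, mixed_sample)
print({name: round(value, 3) for name, value in fractions.items()})
```

In this synthetic, noise-free case the estimated fractions recover the proportions used to construct the sample. Real profiles would add measurement noise, activation-dependent changes within each cell type and cell types missing from the reference set, so any such estimate would need validation against cell counts.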
To achieve such a general approach, it is indispensable to adhere to the standardisation of techniques, to intensify collaborations between different expert groups, to share array data as raw data and to define guidelines of good scientific practice to respect and honour individual contributions to such databases, which must be made publicly accessible.
Gene expression profiling provides a completely new approach to rheumatology research. As an interdisciplinary technology it has stimulated fruitful collaboration between experts in array technology, bioinformatics, immunology and rheumatology. The molecular overview by genome-wide profiles has revealed that many questions arise that demand careful standardisation and validation. The complexity of clinical samples has also initiated new experimental strategies to dissect cellular and functional signatures and to enable the interpretation of profiles from each patient individually. To accomplish this approach, data sharing and collectively developed knowledge bases in rheumatology for data mining will markedly accelerate this time-consuming process and will also open new avenues for many established models and many as yet unanalysed disease entities in rheumatology.
GCP = granulocyte chemotactic protein; IFN = interferon; IL = interleukin; MCP = monocyte chemotactic protein; OA = osteoarthritis; PBMC = peripheral blood mononuclear cell; PCR = polymerase chain reaction; RA = rheumatoid arthritis; SLE = systemic lupus erythematosus; TNF = tumour necrosis factor.
We thank Christian Kaps PhD, Oligene GmbH, Berlin, for fruitful discussion. Perspectives and strategies result from our current work, which is supported by the BMBF grants 01GS0110 and 01GS0160.