Thirteen patients, all fulfilling the American College of Rheumatology classification criteria for RA [1], were included in this study. Synovial tissues were taken from seven of these patients with erosive, end-stage disease during knee joint replacement surgery at the Department of Orthopedic Surgery, Karolinska University Hospital, Sweden. No further data on the characteristics of this subgroup of patients were available. Synovial tissue was obtained from the other six patients by rheumatic arthroscopy solely for research purposes. The clinical characteristics of these patients (five women and one man) are shown in Table . All six arthroscopic patients were recruited from the outpatient clinic of the Karolinska University Hospital Rheumatology Unit, and all except one (patient 13) had clinical arthritis with effusion in at least one knee joint at the time of the investigation. All patients except one (patient 11) were using the disease-modifying anti-rheumatic drug methotrexate, four in conjunction with low-dose corticosteroids, and all except one were using nonsteroidal anti-inflammatory drugs. Patient 13 had been taking methotrexate for two months; the others had been doing so for more than six months. Patients 10, 11 and 13 had erosive disease. The Ethical Committee at the Karolinska Institute approved the study protocol and all patients gave informed consent.
Clinical data for patients 8 to 13
Synovial tissue, sampling and handling
Orthopedic samples (patients 1 to 7)
Knee joint replacement surgery was performed in accordance with standard procedures, during which three synovial tissue specimens were obtained from random sites and immediately handled by research personnel. Each biopsy was visually inspected to minimize non-inflammatory synovial tissue contamination. Each orthopedic biopsy was then split into two parts, one of which was used for the MA experiment; the other was saved for histochemical analysis (except for biopsy 3 of patient 2, which was used only for MA). After division, the biopsies were snap frozen (within two minutes) in precooled isopentane and stored at -80°C until further use, to ensure high RNA quality. For patients 1 to 3 each half of a biopsy that was to be used in the MA experiment was further divided into three parts, hereafter referred to as sub-biopsies. In total this resulted in 39 specimens from the orthopedic patients (nine sub-biopsies from each of patients 1 to 3 and three biopsies from each of patients 4 to 7). The average weight of the biopsies from patients 1 to 3 before division into three parts was 99 mg, yielding an average sub-biopsy weight of 33 mg. The average weight of the biopsies from patients 4 to 7 was 29 mg.
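The specimen count above can be checked with a few lines of arithmetic. The following is only a bookkeeping sketch; the function and variable names are hypothetical and the numbers are those stated in the text.

```python
# Bookkeeping sketch for the orthopedic specimens (names are illustrative only).
BIOPSIES_PER_PATIENT = 3        # three biopsies per orthopedic patient
SUB_BIOPSIES_PER_BIOPSY = 3     # patients 1 to 3: each MA half split into three


def orthopedic_specimen_count(split_patients: int, unsplit_patients: int) -> int:
    """Number of MA specimens when some patients' biopsies are split further."""
    split = split_patients * BIOPSIES_PER_PATIENT * SUB_BIOPSIES_PER_BIOPSY
    unsplit = unsplit_patients * BIOPSIES_PER_PATIENT
    return split + unsplit


total_specimens = orthopedic_specimen_count(split_patients=3, unsplit_patients=4)
mean_sub_biopsy_mg = 99 / SUB_BIOPSIES_PER_BIOPSY  # average biopsy weight / 3

print(total_specimens, mean_sub_biopsy_mg)
```

With three split patients and four unsplit patients this gives 27 + 12 specimens, in agreement with the total of 39 reported above.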
Arthroscopic samples (patients 8 to 13)
Rheumatic arthroscopy was performed by a technique previously described [31], including biopsy site-scoring for signs of inflammation (vascularity and proliferation), photography and mapping according to local standards (not shown). Biopsies were taken at the site of inflammation close to cartilage or not close to cartilage, defined as less than or more than 1.5 cm away from cartilage, respectively. Multiple biopsies were taken from two sites in patients 8, 11, 12 and 13 and from four sites in patients 9 and 10. The samples were frozen, as above, within two minutes. The average weight of the arthroscopic samples was 19 mg.
Frozen biopsies were embedded in Optimal Cutting Temperature (OCT) compound (Tissue-Tek; SAKURA Finetek, Zoeterwoude, Netherlands) and cut with a cryostat into 7 μm thick sections. Sections were placed on SuperFrost®Plus slides (Menzel-Gläser, Braunschweig, Germany) and air-dried for 30 minutes, then stained with Mayer's hematoxylin and eosin to confirm the histopathology of each biopsy.
RNA was successfully extracted from all biopsies except biopsy 3 of patient 7, in which the RNA was degraded. For both types of biopsy (arthroscopic and orthopedic) a single biopsy yielded enough RNA to perform MA experiments. To extract the RNA the biopsies were placed in steel-bead matrix tubes (Lysing Matrix D; Qbiogene, Irvine, CA, USA) containing buffer (600 μl of phenol, 600 μl of RLT buffer from an RNeasy kit (Qiagen, Hilden, Germany) and 0.6 μl of 2-mercaptoethanol) and homogenized with a tabletop FastPrep homogenizer (Qbiogene). The tubes were shaken for 30 seconds at speed setting 6 and then put on ice for 30 seconds. This procedure was repeated four times to ensure thorough homogenization. The tubes were then centrifuged for 5 minutes at 12,000 r.p.m. All steps up to this point were performed at 4°C. The water phase was collected and transferred to Qiashredder (Qiagen) columns and centrifuged (13,000 r.p.m. at room temperature) for two minutes to ensure complete homogenization. Then 600 μl of 70% ethanol was added to the flow-through.
The mixture was loaded onto RNeasy spin columns (Qiagen) and centrifuged for 15 seconds at 13,000 r.p.m. An RNeasy kit from Qiagen was used to wash and elute the extracted RNA (in 30 μl of RNase-free water). Before eluting the RNA the columns were treated with 2 units of DNAse H (Omega Biotech, Victoria, Canada) for 15 minutes at room temperature to remove residual DNA contamination. For further details see 'Preparation of RNA from tissues' at the KTH microarray core facility web site [32], under 'Protocols'. The average biopsy weight used for RNA extraction was 28.7 mg and the average RNA yield was 411.4 ng/μl in 30 μl. All concentration measurements were made with the Nanodrop (Nanodrop Technologies, Wilmington, DE, USA). RNA quality was ensured with the RNA 6000 Nano LabChip kit of the Bioanalyzer system (Agilent Technologies, Palo Alto, CA, USA) where pass or fail judgments were based on an evaluation of Bioanalyzer electropherograms [33]. Two samples showed signs of partial degradation (patient 3, biopsy 3, and patient 6, biopsy 1). The average ratio of 28S to 18S rRNA among the remaining samples was 1.6.
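To put the reported averages into perspective, the concentration and elution volume can be converted to a total RNA mass and a rough yield per milligram of tissue. This is a sketch with hypothetical function names; note that dividing one average by another is only an approximation of the average per-sample yield, which is not reported in the text.

```python
# Rough yield arithmetic from the averages quoted in the text (illustrative names).
def total_rna_ng(conc_ng_per_ul: float, volume_ul: float) -> float:
    """Total RNA mass in ng from a concentration and an elution volume."""
    return conc_ng_per_ul * volume_ul


def yield_ng_per_mg(total_ng: float, tissue_mg: float) -> float:
    """Approximate RNA yield per mg of tissue."""
    return total_ng / tissue_mg


avg_total_ng = total_rna_ng(conc_ng_per_ul=411.4, volume_ul=30.0)
avg_yield = yield_ng_per_mg(avg_total_ng, tissue_mg=28.7)

print(f"about {avg_total_ng / 1000:.1f} ug total RNA, about {avg_yield:.0f} ng per mg tissue")
```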
Because of the small amounts of RNA extracted, the RNA was amplified with a RiboAmp RNA amplification kit (Arcturus, Mountain View, CA, USA). RiboAmp uses T7-based in vitro transcription to generate amplified RNA (aRNA), the bulk of which consists of sequences 250 to 1,800 base pairs long. Total RNA (300 ng to 1 μg) was used in each RNA amplification, and the average yield was 503 ng/μl in 11 μl of water.
Universal Human Reference RNA from Stratagene (La Jolla, CA, USA) was used as reference RNA and was amplified in the same manner as the sample RNA. The reference RNA was pooled before use for hybridization.
Labeling and cDNA synthesis
To prime the reaction, 1 μl of random hexamer primer (5 μg/μl; Operon, Alameda CA, USA) was added to 1 μg of amplified aRNA. The volume was adjusted to 18.4 μl with RNase-free water. The sample was mixed and incubated for 10 minutes at 70°C to denature the aRNA, then incubated for a further 5 minutes on ice and centrifuged briefly. A cDNA synthesis mixture (11.6 μl) consisting of 6 μl of 5× first-strand buffer, 3 μl of 0.1 M dithiothreitol, 2 μl of Superscript III (Invitrogen, San Diego, CA, USA) and 0.6 μl of 50× aa-dUTP+dNTP mix (Sigma-Aldrich, St. Louis, MO, USA) was added to each sample. The whole mixture was gently mixed by pipetting and incubated at 25°C. After 10 minutes at this temperature the mixture was incubated at 46°C for a further 2 hours. To terminate the reaction and to hydrolyze the RNA strand, 3 μl of 0.2 M EDTA pH 8.0 and 4.5 μl of 1 M NaOH were added. The sample was vortex-mixed briefly, incubated for 15 minutes at 70°C, cooled to room temperature and centrifuged briefly. Then 4.5 μl of 1 M HCl was added to restore the pH to neutrality. The sample was vortex-mixed and centrifuged briefly.
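The volumes given for the cDNA synthesis reaction are internally consistent, as the following sketch illustrates. The variable names are hypothetical; the only inputs are the volumes and stock concentrations stated above, and the final concentrations are derived under the assumption that nothing else is added to the reaction.

```python
# Consistency check of the cDNA synthesis volumes (illustrative names only).
PRIMED_ARNA_UL = 18.4  # aRNA + primer, adjusted with RNase-free water

SYNTHESIS_MIX_UL = {
    "5x first-strand buffer": 6.0,
    "0.1 M dithiothreitol": 3.0,
    "Superscript III": 2.0,
    "50x aa-dUTP+dNTP mix": 0.6,
}

mix_ul = sum(SYNTHESIS_MIX_UL.values())          # stated as 11.6 ul
reaction_ul = PRIMED_ARNA_UL + mix_ul            # total reaction volume

# Final concentration = stock concentration * added volume / total volume.
buffer_final_x = 5 * SYNTHESIS_MIX_UL["5x first-strand buffer"] / reaction_ul
dntp_final_x = 50 * SYNTHESIS_MIX_UL["50x aa-dUTP+dNTP mix"] / reaction_ul
dtt_final_mM = 100 * SYNTHESIS_MIX_UL["0.1 M dithiothreitol"] / reaction_ul

# Hydrolysis and neutralization: ul * mol/l = umol.
naoh_umol = 4.5 * 1.0
hcl_umol = 4.5 * 1.0

print(mix_ul, reaction_ul, buffer_final_x, dntp_final_x, dtt_final_mM)
```

Under these assumptions the reaction volume is 30 μl, the buffer and nucleotide mix both end up at 1×, and the amount of HCl added equals the amount of NaOH, which is what restores the pH to neutrality.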
The cDNA was purified and eluted by the following procedure. First, 60 μl of water and 500 μl of PB buffer (MinElute Reaction Cleanup Kit; Qiagen) were added. The mixture was thoroughly mixed and transferred to a MinElute Reaction Cleanup Kit spin column and centrifuged for 30 seconds at 13,000 r.p.m. The flow-through was reapplied to the column and the centrifugation step was repeated. The flow-through was discarded and 650 μl of 80% ethanol was added to the column. The column was centrifuged for 30 seconds at 13,000 r.p.m. and the flow-through was again discarded. The ethanol wash step was repeated and then the membrane was dried by centrifugation for 1 minute at 13,000 r.p.m. The column was transferred to a new tube, and 10 μl of 100 mM NaHCO3 pH 9.0 was added. The column was then incubated for 1 minute at room temperature; the sample was eluted by centrifugation for 30 seconds at 13,000 r.p.m. The elution step was repeated after a further addition of 10 μl of 100 mM NaHCO3 pH 9.0 to ensure high yield.
To couple fluorophores, the eluate was mixed with a dried aliquot of either Cy3 or Cy5 mono-reactive esters (Amersham-Biosciences, Little Chalfont, Bucks., UK) and incubated for 30 minutes at room temperature in a dark container, after which 70 μl of water and 500 μl of PB buffer were added. The mixture was thoroughly mixed and transferred to a MinElute Reaction Cleanup Kit spin column, which was centrifuged for 30 seconds at 13,000 r.p.m. The flow-through was reapplied to the column and the centrifugation step was repeated. The flow-through was discarded and 650 μl of PE buffer (MinElute Reaction Cleanup Kit) was added. The column was centrifuged for 30 seconds at 13,000 r.p.m. and the flow-through was discarded. The wash step was repeated and then the membrane was dried by centrifugation for 1 minute at 13,000 r.p.m. The column was transferred to a new tube, 10 μl of EB buffer (MinElute Reaction Cleanup Kit) was added, and the column was incubated for 1 minute at room temperature. The sample was eluted by centrifugation for 30 seconds at 13,000 r.p.m. The elution step was repeated after a further addition of 10 μl of EB buffer, to ensure high yield. The concentrations of the incorporated fluorophore and cDNA were measured with the Nanodrop to confirm success in the labeling reaction. The sample was then ready for hybridization. For further information about the preparation of N-hydroxysuccinimide-ester fluorophores and indirect labeling of cDNA see SOP 001 and SOP 002 at the KTH microarray core facility web site [32] under 'Protocols'.
The cDNA arrays used in this study were produced at the KTH microarray core facility. The clones on the array originate from the first 310 96-well plates of a commercial clone collection containing 46,000 sequence-verified human cDNA clones (Research Genetics; now Invitrogen). The clones have been prepared by cell culture, plasmid preparation, PCR amplification and purification with PCR filter plates (Millipore, Bedford, MA, USA). The cDNA was spotted in 30% dimethylsulphoxide onto UltraGAPS slides (Corning, NY, USA) with a QArray spotter (Genetix, Hampshire, UK) with 24 SMP2.5 pins (Telechem, Sunnyvale, CA, USA) in 48 blocks, each of which contained 25 × 25 clones, spotted with a center-to-center distance of 170 μm, non-specifically attached to the surface by UV crosslinking. According to a UniGene mapping performed in September 2004 based on GenBank accession numbers (29,717 on the whole chip), 25,087 of the 30,000 spots on the chip have a UniGene ID and 16,164 of those were unique. For more information about the chip see the KTH microarray core facility web site [32] under 'HUM 30k cDNA array'.
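The array geometry described above can be summarized numerically. The sketch below uses hypothetical variable names and only the figures given in the text; what occupies the positions not filled from the clone plates is not stated here and is left open.

```python
# Array layout arithmetic from the figures in the text (illustrative names).
BLOCKS = 48
ROWS_PER_BLOCK = 25
COLS_PER_BLOCK = 25
PINS = 24
PITCH_UM = 170            # center-to-center spot distance

spots_on_chip = BLOCKS * ROWS_PER_BLOCK * COLS_PER_BLOCK
clone_wells = 310 * 96    # first 310 plates of 96 wells each
positions_not_from_plates = spots_on_chip - clone_wells
blocks_per_pin = BLOCKS // PINS

# Distance between the first and last spot centers along one side of a block.
block_span_mm = (ROWS_PER_BLOCK - 1) * PITCH_UM / 1000

unigene_fraction = 25_087 / spots_on_chip        # spots with a UniGene ID
unique_fraction = 16_164 / 25_087                # unique IDs among those

print(spots_on_chip, clone_wells, positions_not_from_plates, blocks_per_pin)
print(block_span_mm, round(unigene_fraction, 3), round(unique_fraction, 3))
```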
After prehybridization of the slides for 30 minutes at 42°C in prehybridization buffer consisting of 1% BSA (Sigma-Aldrich), 5 × SSC (where SSC consists of 0.15 M NaCl and 0.015 M sodium citrate) and 0.1% SDS, the arrays were washed, first in a trough containing water and then in a trough containing propan-2-ol, and centrifuged dry. After that the samples (one labeled with Cy5 and the other labeled with Cy3) were pooled and dried to a volume of 13.6 μl. Hybridization buffer consisting of 5 × SSC, 50% formamide (Sigma-Aldrich), 0.1% SDS and 0.2 μg/μl Cot-1 DNA (Invitrogen) was added to the pooled samples to a final volume of 64.5 μl. The hybridization mixture was then denatured for 3 minutes at 95°C and cooled for 2 minutes on ice before being applied to the array. Lifter-slips (Erie Scientific Company, Shelton, CT, USA) were used to contain the hybridization mixture on the array during hybridization. The arrays were then placed, in hybridization chambers (Corning), in a water bath at 42°C for 14 to 18 hours. After hybridization the slides were washed once for 5 minutes at 42°C in wash buffer 1 (2 × SSC, 0.1% SDS), once for 5 minutes in wash buffer 2 (0.1 × SSC, 0.1% SDS) and five times, for 1 minute each, in wash buffer 3 (0.1 × SSC). The slides were then centrifuged dry and scanned. For further information about the hybridization see SOP 003 at the KTH microarray core facility web site [32] under 'Protocols'.
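The SSC strengths used in the prehybridization, hybridization and wash buffers translate into molar concentrations as sketched below, using the definition of 1 × SSC given above. The function name is hypothetical. Whether the quoted buffer composition refers to the buffer as added or to the final hybridization mixture is not specified in the text, so only the volume of buffer added is computed.

```python
# SSC strength to molarity, from the 1x definition in the text (illustrative names).
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold: float) -> dict:
    """Molar concentrations of the SSC components at a given fold strength."""
    return {component: molar * fold for component, molar in SSC_1X_MOLAR.items()}


hybridization_ssc = ssc_molarity(5)    # prehybridization and hybridization buffers
wash1_ssc = ssc_molarity(2)            # wash buffer 1
wash23_ssc = ssc_molarity(0.1)         # wash buffers 2 and 3

# Volume of hybridization buffer needed to bring 13.6 ul of sample to 64.5 ul.
buffer_added_ul = 64.5 - 13.6

print(hybridization_ssc, wash1_ssc, wash23_ssc, round(buffer_added_ul, 1))
```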
Scanning and image processing
An Agilent G2565BA scanner was used to scan the slides and acquire 50-megabyte TIFF images. The scanner resolution was set at 10 μm. GenePix 18.104.22.168 (Axon Instruments, Foster City, CA, USA) was used to extract the raw signals from the TIFF images and to assign each spot an ID. Spots defined as 'not found' by GenePix were flagged with a negative flag (-50) and removed downstream in the analysis. Spots with clearly abnormal morphology due to dust particles or other factors were manually flagged as bad (-100) and were also removed in downstream analysis. No further processing of the slides was performed in GenePix. The data are available at ArrayExpress, a public repository for MA data (accession number E-MEXP-367) [34].
The data were analyzed mainly with the help of packages in R [35] except for the Expression Analysis Systematic Explorer (EASE) analysis [36], which was performed in MEV [37]. EASE uses the hierarchical structure of gene ontology (GO) [38] to find biological themes among sets of differentially expressed (DE) genes. Each GO category is given an EASE score, which is a conservative adjustment of Fisher's exact probability [39] in which Fisher's exact probability is jackknifed to weigh significance in favor of GO categories supported by many genes. R is a language and environment for statistical computing and graphics. The packages that were used in R were LIMMA [40], aroma [41], the KTH package [32] and Bioconductor [42]. All operations performed on the data in R during analysis can be accomplished with these packages. After the result files (gpr) produced by GenePix had been imported into R, unreliable spots with abnormal physical properties were removed with four filters:
1. filterFlags, which removes spots flagged as not found or absent in GenePix.
2. filterSaturated, which removes spots saturated in both Cy5 and Cy3.
3. filterB2SD, which removes spots in which 70% of the pixels have an intensity below the background intensity + 2 standard deviations.
4. filterSize, which removes spots that are enlarged due to spotting artefacts.
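The logic of the four filters can be sketched as follows. This is not the KTH package implementation (which is written in R); the record fields, the size threshold and the way each criterion is encoded are assumptions made purely for illustration.

```python
# Illustrative spot filtering; field names and thresholds are hypothetical.
from dataclasses import dataclass

MAX_DIAMETER_UM = 150.0        # assumed cutoff for "enlarged" spots
B2SD_FRACTION_CUTOFF = 70.0    # percent of pixels below background + 2 SD


@dataclass
class Spot:
    flag: int                      # GenePix flag; negative values mark rejected spots
    cy5_saturated: bool
    cy3_saturated: bool
    pct_pixels_below_b2sd: float   # percent of pixels below background + 2 SD
    diameter_um: float


def keep_spot(spot: Spot) -> bool:
    """Return True if the spot passes all four filters."""
    if spot.flag < 0:                                        # filterFlags
        return False
    if spot.cy5_saturated and spot.cy3_saturated:            # filterSaturated
        return False
    if spot.pct_pixels_below_b2sd >= B2SD_FRACTION_CUTOFF:   # filterB2SD
        return False
    if spot.diameter_um > MAX_DIAMETER_UM:                   # filterSize
        return False
    return True


spots = [
    Spot(0, False, False, 10.0, 100.0),     # passes
    Spot(-50, False, False, 10.0, 100.0),   # flagged as not found
    Spot(0, True, True, 5.0, 100.0),        # saturated in both channels
    Spot(0, True, False, 5.0, 100.0),       # saturated in one channel only: kept
    Spot(0, False, False, 85.0, 100.0),     # mostly below background + 2 SD
    Spot(0, False, False, 10.0, 200.0),     # enlarged
]
kept = [s for s in spots if keep_spot(s)]
print(len(kept), "of", len(spots), "spots kept")
```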
On average 4,030 spots were removed by the filters, leaving about 25,970 spots for downstream analysis. More information about the filters can be found at the KTH package web site [32]. After filtering, the slides were normalized with print-tip loess (local regression) normalization [43]. To identify DE genes a parametric empirical Bayes approach implemented in LIMMA [40] was used. This approach assigns a score (B-score) to each gene. The B-score was used to rank the genes so that the gene with the highest score has the highest probability of being DE. When differences were being investigated, two criteria had to be fulfilled for a gene to be regarded as DE: the gene had to have a B-score of more than 0 and an |M-value| of more than 1 (an M-value is the base-2 logarithm of the fold change [43]). When the DE genes were used to cluster the biopsies a third criterion was added: a gene had to occur in at least two comparisons between biopsies (regardless of patient) to be used for clustering. This was done to remove noise, because no biological replication was possible. When we were identifying DE genes between tissues with an overrepresentation of adipose cells versus all the other biopsies, we defined a gene as DE if the gene had a B-score of more than 20. Here samples from all patients were used, allowing the approximation of the different parameters used for the test statistics, for example the standard error, to be improved, and thus the B-scores to be high. A cutoff was therefore set at a B-score of more than 20 to investigate a reasonable number of genes with the highest ranking in this comparison.
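The selection criteria described above can be expressed compactly. In the sketch below all function names are hypothetical, the B-scores are taken as given (they would come from the empirical Bayes fit), and the assignment of the sample to one channel and the reference to the other is an assumption.

```python
# Sketch of the DE selection criteria (illustrative names; B-scores assumed given).
import math
from collections import Counter


def m_value(sample_intensity: float, reference_intensity: float) -> float:
    """M-value: base-2 logarithm of the sample/reference fold change."""
    return math.log2(sample_intensity / reference_intensity)


def is_de(b_score: float, m: float, b_cutoff: float = 0.0, m_cutoff: float = 1.0) -> bool:
    """A gene is called DE if B > b_cutoff and |M| > m_cutoff."""
    return b_score > b_cutoff and abs(m) > m_cutoff


def recurrent_genes(de_sets, min_comparisons: int = 2) -> set:
    """Genes called DE in at least `min_comparisons` comparisons (clustering criterion)."""
    counts = Counter(gene for de in de_sets for gene in set(de))
    return {gene for gene, n in counts.items() if n >= min_comparisons}


print(m_value(400.0, 100.0))                  # fourfold up corresponds to M = 2
print(is_de(b_score=0.5, m=-1.2))             # passes both criteria
print(recurrent_genes([{"g1", "g2"}, {"g2", "g3"}, {"g2", "g1"}]))
```

An |M-value| cutoff of 1 therefore corresponds to at least a twofold change in either direction.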
A moderated t-test [44] was performed in parallel, with the use of a false discovery rate [45] correction for multiple testing. Technical replicates were dealt with in different ways depending on the comparison in question. When three different levels of replicates were available (for example, for patients 1 to 3 the levels were technical replicates (i), multiple samples from each biopsy (sub-biopsies) (ii) and the biopsy (iii) itself), instead of just taking the average of technical replicates, we used the duplicateCorrelation function [46] available in LIMMA [40] to acquire an approximation of gene-by-gene variance. This retains valuable information about the variance when fitting a linear model to the data so as to identify DE genes. When four levels were available, as when testing for DE genes between patients 1 and 3 (technical replicates, sub-biopsies, biopsies and patients), the first level was averaged and duplicateCorrelation was used for the sub-biopsies and biopsies. When only two levels of replicates were available (for example when testing for DE genes between biopsies in patient 4 there were technical replicates of each biopsy that was tested) the replicates were all treated in the same way.
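A false discovery rate correction of the kind mentioned above can be sketched as follows. The text does not name the procedure, so the Benjamini-Hochberg step-up adjustment is used here as an assumption; the function name is hypothetical and the input p-values are made up for illustration.

```python
# Benjamini-Hochberg adjusted p-values (assumed FDR procedure; illustrative only).
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


example_p = [0.01, 0.04, 0.03, 0.005]   # made-up p-values
adjusted_p = benjamini_hochberg(example_p)
print(adjusted_p)
```

Genes whose adjusted p-value falls below the chosen false discovery rate would then be reported as significant.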
Several hierarchical clusterings were performed [47], in which 1 minus the Pearson correlation was used as the distance measure. When creating the agglomerative dendrogram the average distance between each cluster was used. To evaluate the structure the clustering algorithm imposes on the data the cophenetic (coph) correlation coefficient [48] was determined. This measures how well the hierarchical structure from the dendrogram represents the actual distances; coph = 1 indicates perfect representation, whereas coph = 0 indicates no representation. To facilitate color representation in the hierarchical clustering the (log2) expression value for each gene in each of the biopsies was adjusted by subtracting the respective mean log2 expression value across all biopsies.
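The distance measure, the centering step and the cophenetic correlation can be illustrated with a short sketch. The function names are hypothetical and the agglomeration itself (average linkage) is not implemented; the cophenetic correlation is simply computed as the Pearson correlation between the original pairwise distances and the dendrogram-derived (cophenetic) distances, which are assumed to be supplied by whatever clustering routine is used.

```python
# Distance, centering and cophenetic correlation (illustrative names only).
import math


def pearson(x, y) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pearson_distance(x, y) -> float:
    """Distance used for clustering: 1 minus the Pearson correlation."""
    return 1.0 - pearson(x, y)


def center_genes(log2_matrix):
    """Subtract each gene's mean log2 value across biopsies (rows = genes)."""
    return [[v - sum(row) / len(row) for v in row] for row in log2_matrix]


def cophenetic_correlation(original_distances, cophenetic_distances) -> float:
    """Correlation between original and dendrogram-implied pairwise distances."""
    return pearson(original_distances, cophenetic_distances)


print(pearson_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))   # identical profiles
print(pearson_distance([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))   # opposite profiles
print(center_genes([[1.0, 2.0, 3.0]]))
```

The distance ranges from 0 for perfectly correlated profiles to 2 for perfectly anti-correlated profiles.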
Two series of hybridizations were performed in this study. The design chosen for both series was a common reference design that allows both the identification of DE genes in different contexts and unsupervised classification, such as hierarchical clustering. Each hybridization was performed with a technical replicate (the same amount of RNA taken from the same amplified RNA aliquot labeled in two separate reactions and hybridized onto two separate arrays). The average correlation between the M-values of technical replicates was 0.97. In the first series (Figure ), aRNA from the orthopedic sub-biopsies/biopsies (depending on the patient they were obtained from; see above) was labeled and hybridized in duplicate versus the reference (also amplified). In all, 76 hybridizations were performed in this series. In the second series (Figure ), aRNA from the arthroscopic biopsies was labeled and hybridized in duplicate versus the reference. A total of 32 hybridizations were performed in this series. The gene expression data were filtered and normalized as described above.
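The hybridization totals are consistent with the sample numbers given earlier under two assumptions that are not stated explicitly in the text: that the single failed RNA extraction (biopsy 3 of patient 7) accounts for the missing orthopedic specimen, and that one sample per arthroscopic biopsy site was hybridized. The variable names below are hypothetical.

```python
# Consistency check of the hybridization counts (assumptions noted in comments).
TECHNICAL_REPLICATES = 2

orthopedic_specimens = 39
failed_extractions = 1        # assumed: biopsy 3 of patient 7 (degraded RNA)
series1_hybridizations = (orthopedic_specimens - failed_extractions) * TECHNICAL_REPLICATES

# Arthroscopic biopsy sites per patient; assumed one hybridized sample per site.
sites_per_patient = {8: 2, 9: 4, 10: 4, 11: 2, 12: 2, 13: 2}
series2_hybridizations = sum(sites_per_patient.values()) * TECHNICAL_REPLICATES

print(series1_hybridizations, series2_hybridizations)
```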
Figure 1. Experimental design. (a) Orthopedic samples. Three orthopedic biopsies (b1, b2, and b3) were taken from each of seven patients at random sites of the synovium. For patients 1 to 3 each biopsy was split into three parts. RNA from each sub-biopsy was hybridized ...