Cells and cell treatment with SeM
Several LNCaP cell clones with different phenotypic characteristics were established using a limiting dilution approach, as described elsewhere (1). Parental LNCaP cells and three LNCaP cell clones (Clones 6, 17 and 21) were chosen for the present study based on their differences in anchorage-independent growth capability.
Cells of each clone and the parental line were plated in T-75 flasks. On day 1 of the experiments, the medium was changed and fresh plain medium (Control) or medium containing 5 μM SeM was added to the cell cultures. The cells were cultured for an additional three days and then harvested on day 4 for RNA isolation.
RNA extraction and cDNA synthesis
RNA was prepared using the Totally RNA™ kit (Ambion, Austin, TX). The precipitated RNA pellets were resuspended in DEPC (0.1% diethylpyrocarbonate)-treated water and stored frozen at −70 °C before use. The RNA samples were thawed on ice and vortexed, and an aliquot was diluted in 10 mM Tris, pH 7.5 for quantitation on a GeneQuant™ spectrophotometer (Amersham BioSciences, Piscataway, NJ). The quality of each RNA preparation was assessed by the ratio of absorbance readings at wavelengths of 260 and 280 nm, followed by visualization of ribosomal 28S and 18S bands either on a 1% agarose gel with ethidium bromide staining or by using an Agilent 2100 BioAnalyzer™ (Agilent, Palo Alto, CA). RNA samples having a 260/280 ratio less than 1.8 were processed with RNeasy™ columns (Qiagen, Valencia, CA) and an aliquot was re-assessed for quality prior to being used for cDNA synthesis reactions.
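The quantitation and quality gate described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the conversion factor of 40 μg/ml per A260 unit is the conventional value for RNA at a 1 cm path length and is an assumption not stated in the text, and the function names are hypothetical. Only the 1.8 threshold comes from the text.

```python
# Assumption (not from the text): 1 A260 unit of RNA ~ 40 ug/ml at 1 cm path length.
RNA_UG_PER_ML_PER_A260 = 40.0
# Threshold stated in the text: samples below this ratio go through column cleanup.
MIN_260_280_RATIO = 1.8


def rna_concentration_ug_per_ml(a260, dilution_factor):
    """Estimate the stock RNA concentration from the A260 of a diluted aliquot."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def needs_cleanup(a260, a280):
    """Return True when the 260/280 ratio falls below the 1.8 threshold."""
    return (a260 / a280) < MIN_260_280_RATIO


if __name__ == "__main__":
    print(rna_concentration_ug_per_ml(0.5, 50))  # stock concentration in ug/ml
    print(needs_cleanup(0.5, 0.3))               # ratio ~1.67, below threshold
```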
The target cDNA was prepared using 10 μg of total RNA. First strand cDNA was synthesized using Superscript II™ Reverse Transcriptase (Life Technologies, Bethesda, MD) and a T7-(dT)24 primer (IDT, Coralville, IA) to incorporate the T7 priming site into the cDNA. Following RNA degradation with RNase H and second strand cDNA synthesis with DNA polymerase I, the resulting double-stranded cDNA was extracted with phenol, chloroform and isoamyl alcohol (at a volume ratio of 25:24:1). For some samples, cDNA cleanup was performed using the Affymetrix Sample Cleanup Module.
cRNA labeling and hybridization
Approximately 1 μg of cDNA was used as a template for each in vitro transcription reaction (Enzo Biochem, New York, NY), which incorporated biotin into the resulting cRNA. cRNA was purified using either Qiagen RNeasy™ columns or the Affymetrix Sample Cleanup Module. The cRNA was fragmented to a size range of 35–200 bases by incubation at 94 °C for 35 minutes in fragmentation buffer (40 mM Tris-acetate, pH 8.1, 125 mM KOAc, 30 mM MgOAc) prior to use in hybridization. Fifteen μg of fragmented probe was mixed with the Gene Microarray Eukaryotic Hybridization Controls, herring sperm DNA, and acetylated bovine serum albumin (BSA) in hybridization buffer (100 mM MES; 1 M [Na+]; 20 mM EDTA; 0.01% Tween 20). The hybridization mixture was heated at 99 °C for five minutes, incubated at 45 °C for five minutes and then centrifuged at 13,000 × g for five minutes. Test microarrays were prehybridized with 200 μl of 1X hybridization buffer for 10 minutes at 45 °C and 60 RPM in the hybridization oven. Following the removal of prehybridization buffer, the microarrays were filled with 200 μl of the hybridization mixture and incubated at 45 °C and 60 RPM for 16 hours.
For each cell clone–treatment combination, a total of three Affymetrix HG U133A microarrays were used. Two of these were technical replicates, in which the same target preparation was split between the two microarrays for hybridization. The third microarray was an independent biological experimental replicate, in which the RNA sample from a separate cell culture experiment was extracted, labeled and hybridized onto the microarray.
Washing, staining and scanning
After the hybridization reaction, the hybridization mixture was removed and saved, and the microarray was filled with 250 μl of non-stringent wash buffer [6X SSPE (1X SSPE contains 0.18 M NaCl, 10 mM NaH2PO4, 1 mM EDTA at pH 7.7) and 0.01% Tween 20]. Further washing and staining of the microarrays was conducted on the fluidics station using sequentially the non-stringent washing buffer, stringent washing buffer (100 mM MES, 0.1 M [Na+] and 0.01% Tween 20), and stain buffer (100 mM MES, 1 M [Na+] and 0.05% Tween 20) containing 2 mg/ml acetylated BSA and 10 μg/ml of streptavidin phycoerythrin (SAPE). The signal was amplified by additional treatment with stain buffer (100 mM MES; 1 M [Na+] and 0.05% Tween 20) containing acetylated BSA (2 mg/ml), goat IgG (0.1 mg/ml), biotinylated antibody (3 μg/ml) and a second staining with SAPE.
Each HG U133A microarray was scanned twice at 570 nm using an Agilent confocal scanner. The output fluorescence was obtained using the Affymetrix Microarray Analysis Suite (MAS) 5.0 software and the average of the two scans was used to produce an image file for further data analysis.
Signal processing and quality control
The Affymetrix human HG U133A image (DAT) files were processed using the VMAxS® microarray analysis service from ViaLogy, Altadena, CA. VMAxS processing entails determination of presence as well as an absolute expression value for each probe’s feature cell on the HG U133A microarray using the QRI active signal processing algorithm. The detected and quantitated individual feature-level expression values are then combined to quantitate a gene-level expression for each probe set on the HG U133A microarray. If a critical mass of features is not detected, the gene is called absent and an expression value of zero is generated. Quality control was performed initially using global and quantile normalizations across all microarrays. Global normalizations were tested for scaling factors to ensure microarray-to-microarray comparability of overall signal strength. Quantile normalized data were used to generate M (log intensity ratios) versus A (average log intensities) plots to test for the presence of any significant technical problems or anomalies on any of the microarrays. In the case of the individual comparisons, quantile normalizations were performed only across the data being compared in individual small groups. The M versus A plots showed that the difference between the results of the third microarray, which represents an independent biological experimental replicate, and those of the first and second microarrays, which are technical hybridization replicates using the same target preparation, is larger than the difference between the results of the first and second microarrays (data not shown). This is typical of microarray experiments on high density expression arrays due to variation in labeling and hybridization efficacy. Some of this variability can be attributed to variations in starting RNA pooled from independent runs of cell culture experiments.
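The two generic operations named above, quantile normalization and the computation of M and A values, can be sketched as follows. This is an illustrative sketch of the standard techniques, not the VMAxS® implementation, whose details are not given here. The function names are hypothetical, ties are broken by position rather than averaged, and log base 2 is an assumption.

```python
import math


def quantile_normalize(arrays):
    """Quantile-normalize equal-length intensity lists (one list per microarray).

    Each value is replaced by the mean, across arrays, of the values sharing
    its rank. Ties are broken by position (a simplification).
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def ma_values(x, y):
    """Return (M, A) pairs: M = log2(x) - log2(y), A = mean of the two logs.

    Pairs with a non-positive value (e.g. absent calls coded as zero) are
    skipped because their logarithm is undefined.
    """
    points = []
    for xi, yi in zip(x, y):
        if xi > 0 and yi > 0:
            lx, ly = math.log2(xi), math.log2(yi)
            points.append((lx - ly, (lx + ly) / 2))
    return points
```

Plotting M against A for each pair of arrays then shows whether the points scatter evenly around M = 0 across the intensity range.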
Pre-processing
For comparison of SeM treatment with the Control treatment, quantile normalization was performed across the six microarrays (three for the SeM treatment and three for the Control treatment), and M versus A plots were generated to ensure even distribution of gene intensities. If a gene is determined by VMAxS® to be absent, it is assigned an expression value of ‘0’. As many post-analysis algorithms using fold change do not work well with values of zero, nor with the zero values modified to a finite low number, genes with one or more zero values across the compared arrays were analyzed using an alternative method, as described below.
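The split between genes with complete data and genes with at least one zero value can be sketched as follows. The data layout (a dictionary mapping a gene identifier to its values across the compared arrays) and the function name are assumptions made for illustration.

```python
def split_by_absent_calls(expression):
    """Separate genes with no zero values from genes with one or more zeros.

    expression: dict mapping gene id -> list of expression values across the
    compared arrays (hypothetical layout). Returns (complete, with_zeros).
    """
    complete, with_zeros = {}, {}
    for gene, values in expression.items():
        if any(v == 0 for v in values):
            with_zeros[gene] = values   # handled by the alternative method
        else:
            complete[gene] = values     # eligible for fold-change algorithms
    return complete, with_zeros
```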
Determination of differentially expressed genes
The data were analyzed using three algorithms: 1) SAM as described in Tusher et al (4), 2) Cyber-T as described in Long et al (5), and 3) J-score, an algorithm which ranks genes based on the following statistic:
This formula is loosely based on the SAM d-score. For each gene, the natural logarithm of the ratio of gene expression was computed for each of the three untreated samples versus each of the three treated samples. In the above expression, n_i is the number of log2 ratios computed for gene i, r_i is the absolute value of the mean of the log2 ratios, and s_i denotes the standard deviation of the n_i log2 ratios.
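The three quantities defined above can be computed as in the sketch below. It covers only the ingredients n_i, r_i and s_i and deliberately does not combine them into a score, because the J-score expression itself is not reproduced here. Base-2 logarithms, the treated/control orientation of the ratio, and the sample (n - 1) standard deviation are assumptions; the function name is hypothetical.

```python
import math
from statistics import mean, stdev


def log_ratio_summary(control, treated):
    """Return (n, r, s) for one gene.

    control, treated: positive expression values for the untreated and
    treated replicates. Every control value is paired with every treated
    value, giving len(control) * len(treated) log ratios.
    """
    ratios = [math.log2(t / c) for c in control for t in treated]
    n = len(ratios)          # number of log ratios (9 for 3 x 3 replicates)
    r = abs(mean(ratios))    # absolute value of the mean log ratio
    s = stdev(ratios)        # sample standard deviation (assumed n - 1 form)
    return n, r, s
```

Because r is an absolute value and s is symmetric, reversing the orientation of the ratio would not change either quantity.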
Further processing was conducted to minimize significance bias due to any false positives and false negatives. Results from the three above methods were compiled separately in three lists. Subsequently, the three independent scores were combined to enforce a more stringent significance control during the downstream analysis. The presumptive differentially expressed genes were compared across the three algorithms using a simple ranking scheme. The three individual rankings generated by the algorithms were averaged to generate a single rank, with one exception. When a gene's rank number exceeded one thousand in any of the individual rankings (that is, the gene was less significant), the rank number used to calculate the overall average was capped at one thousand. This was implemented to avoid unfairly penalizing a gene that for some reason had a poor ranking in one system.
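The capped rank-averaging scheme described above can be sketched as follows. The function names and the dictionary layout are hypothetical, and rank 1 is assumed to be the most significant.

```python
RANK_CAP = 1000  # ranks beyond this are capped, as stated in the text


def combined_rank(sam_rank, cybert_rank, j_rank, cap=RANK_CAP):
    """Average the three per-algorithm ranks after capping each at `cap`."""
    capped = [min(rank, cap) for rank in (sam_rank, cybert_rank, j_rank)]
    return sum(capped) / len(capped)


def order_genes(ranks):
    """Sort genes from most to least significant by combined rank.

    ranks: dict mapping gene id -> (sam_rank, cybert_rank, j_rank).
    """
    return sorted(ranks, key=lambda gene: combined_rank(*ranks[gene]))
```

For example, a gene ranked 10, 20 and 5000 is averaged as (10 + 20 + 1000) / 3, so a single poor ranking cannot push it arbitrarily far down the combined list.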
For genes that had one or more absent calls (i.e., they had the intensity value of zero), different rules were applied relating to the number of zero values in each set of replicates, in connection with the fold change observed for the non-zero values. This procedure generated a second list of potential differentially expressed genes.
The differentially expressed genes after SeM treatment, as determined by their final ranking values, were compared across the four cell lines to identify genes affected by the same treatment, and also to identify genes that were affected in a particular cell line, but not in the others.
Mapping to biological processes
Lists of differentially expressed genes were generated and analyzed for the preponderance of genes associated with certain biological processes or molecular functions. The lists, which included both up- and down-regulated genes, were examined using EASE, the Expression Analysis Systematic Explorer from the National Institute of Allergy and Infectious Diseases at the National Institutes of Health (6). EASE Online is available at http://apps1.niaid.nih.gov/david/upload.asp. Ranked lists according to EASE score were generated, and the resulting biological processes were compared across treatments and cell lines, as above, to determine both shared and differentially affected processes.
Analysis of impact of SeM treatment on gene expression
The impact of SeM treatment on gene expression was determined by regression analysis using the level of gene expression in the control cells as the independent variable and the level of gene expression in the SeM-treated cells as the dependent variable. A separate regression analysis was performed for each of the three LNCaP cell clones and the parental line. Absent genes (genes with absent calls) were not included in the regression analysis since the inclusion of a large number of data pairs with a numeric value of zero would have forced the regression line through zero, thereby skewing the regression line for genes that were expressed.
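A minimal sketch of this regression, assuming ordinary least squares on the expression values and assuming that a gene is dropped when it is absent (zero) in either condition, could look like this. The function names are hypothetical, and the text does not specify whether values were transformed before fitting.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regress_treated_on_control(control, treated):
    """Regress SeM-treated expression on control expression.

    control, treated: per-gene expression values in the same gene order.
    Genes with a zero (absent call) in either list are excluded (assumption).
    """
    pairs = [(c, t) for c, t in zip(control, treated) if c > 0 and t > 0]
    xs = [c for c, _ in pairs]
    ys = [t for _, t in pairs]
    return fit_line(xs, ys)
```

The same function would be called once per cell line, giving four independent fits.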
Comparison of gene expression and SeM response among three LNCaP cell clones and the parental line
The gene expression profile was compared between the three LNCaP cell clones and the parental line by regression analysis using the level of gene expression in one LNCaP cell clone or the parental line as the independent variable and the level of gene expression in another LNCaP cell clone or the parental line as the dependent variable. A separate regression analysis was performed for each pair of LNCaP cell clones and the parental line. Absent genes were again excluded from the regression analysis to avoid skewing the regression line for genes that were expressed.
The SeM response for each gene was defined as the difference in the gene expression levels between the control cells and the SeM-treated cells. The SeM responses were also compared between the three LNCaP cell clones and the parental line by regression analysis as described above, and a separate regression analysis was performed for each pair of LNCaP cell clones and the parental line.
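The response definition and the pairwise comparison can be sketched as follows. The sign convention (treated minus control) is an assumption, since the text defines the response only as a difference. The function names are hypothetical, and the inputs are assumed to share one gene order with absent genes already removed.

```python
from itertools import combinations


def sem_response(control, treated):
    """Per-gene SeM response, taken here as treated minus control (assumed sign)."""
    return [t - c for c, t in zip(control, treated)]


def slope_intercept(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pairwise_response_regressions(responses):
    """Fit one regression per pair of cell lines.

    responses: dict mapping cell line name -> list of per-gene responses.
    Returns a dict keyed by (line_a, line_b) with line_a as the independent
    variable; four cell lines give six pairs.
    """
    return {
        (a, b): slope_intercept(responses[a], responses[b])
        for a, b in combinations(responses, 2)
    }
```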