4.1 Chemicals
Chemicals used in assay development/testing were obtained from the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) via G. Bittner (Plastipure, Austin, TX). They were received in solid form, and 1μM stock solutions were prepared in 100% ethanol. Other chemicals were obtained from Sigma-Aldrich (St. Louis, MO) unless stated otherwise.
4.2 Cell line generation and maintenance
The ER:PRL-HeLa cell line was generated by stable introduction of GFP-tagged ERα into the previously described PRL-HeLa cell line (
Sharp et al., 2006). Briefly, HeLa cells were cotransfected with p52X-PRL-DS-Red2-SKL and pTKHygro (Clontech), and selection was carried out in 400μg/ml hygromycin. Transient cotransfection with GFP-ER and GFP-Pit-1 resulted in the presence of GFP signal in the nucleus with either 1 or 2 bright intranuclear foci of fluorescence that indicated the integration of the p52X-PRL-dsRED2-SKL plasmid (confirmed by DNA FISH (
Sharp et al., 2006)). The clone HeLa/52X-DM66-Red2-PTS-23(#19) was used to generate the ER:PRL-HeLa cells. In contrast to the previously described HeLa/52X-DM66-Red2-PTS-23 (
Sharp et al., 2006), clone #19 exhibits a single focus in the presence of GFP-ER and has lower basal activity in the absence of hormone (
Amazit et al., 2007). GFP-hERα was amplified by PCR from the hERα-C1-EGFP plasmid (
Berno et al., 2008) using the following forward and reverse primers: 3′-C ACC ATG GTG AGC AAG GGC-5′ and 3′-CTA GAC TGT GGC AGG GAA ACC CTC-5′. The resulting PCR product was cloned into pLenti6/V5 using the directional TOPO cloning system (Invitrogen). The full insert was sequenced in pLenti6/V5 and shown to encode EGFP in frame with full-length hERα. Virus production was performed according to the manufacturer’s guidelines (Invitrogen) using 293FT cells. Transduced clones were selected using 0.8μg/ml blasticidin and single-cell cloned using flow-assisted cell sorting based on GFP fluorescence. As initial GFP-ER-positive clones were growth-limited, single-cell clones were expanded in phenol red-free DMEM with 10% FBS, 0.8μg/ml blasticidin and 1nM 4-hydroxytamoxifen (4HT). ER:PRL-HeLa cells were subsequently maintained in phenol red-free DMEM with L-glutamine and Na+ pyruvate supplemented with 10% FBS (Gemini Bio-Products), 200μg/ml hygromycin, 0.8μg/ml blasticidin, and 1nM 4HT (Sigma).
MCF-7 breast cancer cells used for co-culture with ER:PRL-HeLa cells were routinely maintained in IMEM supplemented with 10% FBS and penicillin/streptomycin. GFP ER:PRL-HeLa cells were co-cultured with MCF-7 cells on poly-D-lysine coated glass coverslips in phenol red-free DMEM supplemented with 5% charcoal-stripped and dialyzed FBS for 24 hours before fixation and immunolabeling with antibody against ERα (rabbit monoclonal ER60C, Millipore 04–820). Cells were plated directly onto poly-D-lysine coated glass coverslips for high-resolution deconvolution imaging. Multiwell plate preparation is described below.
4.3 Multi-well plate preparation
ER:PRL-HeLa cells (passage 6–12) grown in T75 cell culture flasks were trypsinized, spun down and resuspended in phenol red-free DMEM with L-glutamine and Na+ pyruvate supplemented with 5% charcoal-stripped, dialyzed (SD)-FBS (Experiment Media) to a density of 1.2×10^5 cells per ml. A TiterTek Multidrop 384 fluid dispensing unit was used to dispense ~3600 cells per well into 384-well plates (384 IQ-EB black/clear, Aurora Biotechnologies). Cells were grown in the absence of 4HT for 48 hours prior to compound treatments. Serial dilution of compounds and their addition to the multi-well plates were performed using a Beckman Biomek NX robotic platform. Antibody labeling was performed as described previously (
Stenoien et al., 2000) using 4% formaldehyde fixation (20 minutes, room temperature) and indirect labeling with anti-mouse/rabbit Alexa-546 or anti-mouse/rabbit Alexa-647 conjugated secondary antibodies (Molecular Probes). Standard protocol details can be found in section 4.4. All liquid handling steps were carried out using a Beckman Biomek NX robotic platform.
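As a quick check of the plating arithmetic, the short sketch below derives the dispense volume implied by the stated density and the approximate cell number per well. The volume is not given in the text and is only inferred here; the helper name and the optional dead-volume argument are hypothetical.

```python
# Values stated in the text.
CELLS_PER_ML = 1.2e5     # density of the cell suspension
CELLS_PER_WELL = 3600    # approximate number of cells dispensed per well

# Derived here (not stated in the text): implied dispense volume per well.
VOLUME_PER_WELL_UL = CELLS_PER_WELL / CELLS_PER_ML * 1000.0


def suspension_needed_ml(n_wells, dead_volume_ml=0.0):
    """Hypothetical helper: suspension volume for n_wells plus dispenser dead volume."""
    return n_wells * VOLUME_PER_WELL_UL / 1000.0 + dead_volume_ml


if __name__ == "__main__":
    print(f"Implied volume per well: {VOLUME_PER_WELL_UL:.1f} ul")
    print(f"One 384-well plate: {suspension_needed_ml(384):.2f} ml")
```

Under these assumptions one full 384-well plate needs roughly 11.5 ml of suspension before any dead volume of the dispenser is accounted for.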
4.4 Immunofluorescence Protocol
Cells were washed once in PBS (with Ca2+, Mg2+) and fixed for 20 min at room temperature in 4% formaldehyde prepared in PEM buffer (80mM potassium PIPES, pH 6.8, 5mM EGTA, 2mM MgCl2). This was followed by a 10 min quench in 0.1M ammonium chloride and a 30 min permeabilization using 0.5% Triton-X. Blocking was performed in Blotto (5% milk prepared in 1× TBS-Tween-20) for 15 min, and the cells were incubated overnight at 4°C in primary antibody (mouse anti-RNA polymerase II, AbCam ab5408) diluted in Blotto. The following day the cells were washed three times in Blotto for 10 min each and incubated with secondary antibody for 1 hour at room temperature. The cells were then washed an additional three times in PEM and incubated for 10 min in a solution of 1μg/ml DAPI and 1μg/ml CellMask-FarRED (Molecular Probes) in PEM. In multiwell plates this solution was replaced with PBS + 0.02% sodium azide for imaging. Coverslips were mounted in SlowFade Gold (Molecular Probes).
4.5 Image Collection
Automated imaging was carried out using either the Cell Lab IC-100 Image Cytometer (IC100; Beckman Coulter) or the DeltaVision core system (Applied Precision) with automated stage (DV Live). The IC-100 system consisted of a Nikon Eclipse TE2000-U Inverted Microscope (Nikon; Melville, NY) with a Chroma 82000 triple band filter set (Chroma; Brattleboro, VT), a Hamamatsu ORCA-ER Digital CCD camera (Hamamatsu; Bridgewater, NJ) and a Photonics COHU Progressive scan focusing camera (Photonics; Oxford, MA). It was equipped with a Nikon S Fluor 40×/0.90NA objective, and the imaging camera was set to capture 8-bit images at 1×1 binning (1344×1024 pixels; 6.5 μm pixel size). The DV Live system consisted of an Olympus IX71 microscope with a 250W xenon light source, a Photometrics CoolSNAP HQ2 camera, and a standard filter set with multiband dichroic beamsplitter and individual excitation and emission filters (DAPI: ex 360/40nm, em 457/40nm; FITC: ex 490/20nm, em 526/38nm; TRITC: ex 555/28nm, em 617/63nm). It was equipped with a 40× Plan Apo/0.95 air gap objective with correction collar. In either case we collected 8–12 fields per well in 4 channels: blue, green, red and far red. For GFP-ER and Pol II, a stack of 6 focal planes was collected at 1 μm intervals. High-resolution fluorescence deconvolution microscopy was performed with a DeltaVision Restoration Microscopy System (Applied Precision Inc.). Cells were imaged using a 60× objective lens (1.42 NA). A Z-series of focal planes (~30 at 0.2 μm intervals) was digitally imaged and deconvolved with the DeltaVision constrained iterative algorithm to generate high-resolution images.
4.6 Automated image analysis and feature extraction
4.6.1 Pipeline Pilot Unless stated otherwise, images were analyzed using Pipeline Pilot Version 7.5 (Accelrys) analysis software. Maximum intensity projections were created for Ch01 (GFP) and Ch02 (antibody), and all images were corrected to remove background. Nuclei were identified using Ch00 (DAPI) images to create masks by applying adaptive thresholding followed by marker-based watershed. Total cell area was determined using the nucleus mask regions as markers to apply a watershed on Ch03 (CellMask). Cell cytoplasm was determined by subtracting the nuclear masked region from the whole cell mask.
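Two of the operations named above, the maximum intensity projection and the derivation of the cytoplasm mask, can be illustrated with a minimal sketch. It is not the Pipeline Pilot implementation; images are represented as plain nested lists and both function names are hypothetical.

```python
def max_intensity_projection(stack):
    """Collapse a z-stack (list of 2D planes) into one 2D image.

    Each output pixel holds the brightest value found at that position
    across all focal planes.
    """
    rows, cols = len(stack[0]), len(stack[0][0])
    return [[max(plane[r][c] for plane in stack) for c in range(cols)]
            for r in range(rows)]


def cytoplasm_mask(cell_mask, nucleus_mask):
    """Cytoplasm = whole-cell mask minus the nuclear mask (per pixel)."""
    return [[bool(cell) and not bool(nuc) for cell, nuc in zip(cell_row, nuc_row)]
            for cell_row, nuc_row in zip(cell_mask, nucleus_mask)]


if __name__ == "__main__":
    stack = [[[1, 5], [0, 2]],
             [[4, 3], [7, 1]]]
    print(max_intensity_projection(stack))
    print(cytoplasm_mask([[1, 1], [1, 0]], [[0, 1], [0, 0]]))
```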
To segment the PRL-array accurately, a linear filter and a Top Hat operator were applied to the Ch01 image to enhance only the dim arrays and all arrays, respectively. Images exhibiting only dim arrays or only bright arrays were used to train 3 k-means classifiers. The first classifier (DimFilt) was trained on the linear filtered Ch01 images of only dim arrays (E2-treated). The second and third classifiers were trained on the Top Hat processed images of dim and bright arrays (E2 and 4HT-treated), respectively (DimTH and BrightTH). Once training is complete, all three classifiers are applied to every image. Small regions of less than 8 pixels in area are then removed. The DimFilt classifier is sensitive and accurately estimates the area of an array but is prone to false positives. The DimTH classifier has a low false positive rate but can underestimate the area of the array. Therefore, these two classifiers are combined using Morphological Reconstruction, whereby arrays detected by DimTH are used as markers for reconstruction of arrays detected by DimFilt. In this way, the area of detected arrays is accurately determined, while the false positive arrays are removed. BrightTH accurately detects bright arrays while missing most of the dim arrays; however, the halo of out-of-focus light often found around bright arrays is frequently picked up by the DimFilt classifier. Therefore, a final step is applied to remove this halo using Morphological Reconstruction of dim arrays using bright arrays as markers and removing those reconstructed dim arrays from the final array segmentation mask.
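The way the three classifier outputs are combined can be sketched with a simple binary morphological reconstruction. This is an illustration only, not the Pipeline Pilot implementation: masks are nested lists of 0/1, 8-connectivity is assumed, and the final union of retained dim arrays with bright arrays is an assumption about how the segmentation mask is assembled. The function names are hypothetical.

```python
from collections import deque


def reconstruct(marker, mask):
    """Binary morphological reconstruction (8-connectivity).

    Keeps every connected region of `mask` that contains at least one
    pixel that is also set in `marker`.
    """
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    queue = deque((r, c) for r in range(rows) for c in range(cols)
                  if marker[r][c] and mask[r][c])
    for r, c in queue:
        out[r][c] = True
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and mask[rr][cc] and not out[rr][cc]):
                    out[rr][cc] = True
                    queue.append((rr, cc))
    return out


def combine_array_masks(dim_filt, dim_th, bright_th):
    """Combine DimFilt, DimTH and BrightTH detections into one mask."""
    # DimTH detections seed the DimFilt regions: accurate area, fewer false positives.
    dim = reconstruct(dim_th, dim_filt)
    # Dim regions seeded by a bright array are treated as halo and removed.
    halo = reconstruct(bright_th, dim)
    # Assumption: the final mask is the retained dim arrays plus the bright arrays.
    return [[(d and not h) or bool(b) for d, h, b in zip(d_row, h_row, b_row)]
            for d_row, h_row, b_row in zip(dim, halo, bright_th)]
```

In a one-row toy image, a DimFilt region without any DimTH pixel is dropped as a false positive, and a DimFilt region that overlaps a BrightTH detection is reduced to the bright pixels only.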
Cell populations were filtered to achieve a uniform population of cells without cell aggregates, mitotic cells, apoptotic cells, and cellular debris. Applied gates were based upon nuclear area, nuclear circularity and the ratio of nuclear to cell area (cell size ratio). Outlier filtering (99% acceptance based on a Gaussian distribution) was also performed based on the mean nuclear Ch01 (GFP) and Ch02 (antibody staining) signal per cell. Typically, 20–40 cells per field were kept for analysis after filtering. Cell-level features were then averaged across wells, producing well-level features that were used in subsequent high content analysis.
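The outlier filter and the aggregation to well-level features can be sketched as follows. This is a minimal illustration under stated assumptions: a 99% acceptance region is taken to mean the central 99% of a normal distribution fitted to the observed values, cells are represented as dictionaries, and the keys `well`, `gfp` and `antibody` are hypothetical names.

```python
from statistics import NormalDist, mean, stdev


def gaussian_keep_indices(values, acceptance=0.99):
    """Indices of values inside the central `acceptance` mass of a fitted normal."""
    mu, sd = mean(values), stdev(values)
    z_cut = NormalDist().inv_cdf(0.5 + acceptance / 2.0)  # about 2.576 for 99%
    return {i for i, v in enumerate(values) if abs(v - mu) <= z_cut * sd}


def filter_cells(cells, features=("gfp", "antibody")):
    """Keep only cells that pass the outlier filter for every listed feature."""
    keep = set(range(len(cells)))
    for name in features:
        keep &= gaussian_keep_indices([cell[name] for cell in cells])
    return [cells[i] for i in sorted(keep)]


def well_level_means(cells, feature):
    """Average a cell-level feature over all retained cells of each well."""
    by_well = {}
    for cell in cells:
        by_well.setdefault(cell["well"], []).append(cell[feature])
    return {well: mean(vals) for well, vals in by_well.items()}
```

The gates on nuclear area, circularity and cell size ratio described above would be applied before this step; they are omitted here because their thresholds are not given.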
4.6.2 Cyteseer Where indicated, the mean fluorescence intensity per nucleus was obtained using Cyteseer automated cell image analysis software (Vala Sciences, San Diego), exploiting algorithms for automated analysis of protein expression.
4.7 Statistics
Assay quality was established using the Z′ factor, a dimensionless measurement determined using the following equation:
$$Z' = 1 - \frac{3(\sigma_p + \sigma_n)}{|\mu_p - \mu_n|}$$
where σ_p and σ_n represent the standard deviations of the positive and negative controls and μ_p and μ_n represent the means of the two populations (
Zhang et al., 1999). One-way ANOVA followed by post-hoc Dunnett’s comparison to the positive control (E2) was used to ascertain significant differences between compound responses. EC50 calculations were carried out in GraphPad Prism using the variable slope model, where Y=Bottom + (Top-Bottom)/(1+10^((LogEC50-X)*HillSlope)). Constraints were set to >0 and <1 for the bottom and top values, respectively.
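For readers who wish to reproduce these quantities outside GraphPad Prism, the sketch below evaluates the Z′ statistic in its standard form from Zhang et al. (1999), Z′ = 1 − 3(σp + σn)/|μp − μn|, and the variable slope model quoted above. It is illustrative only: the function names are hypothetical, sample standard deviations are assumed, and curve fitting itself is not shown.

```python
from statistics import mean, stdev


def z_prime(positive, negative):
    """Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    spread = stdev(positive) + stdev(negative)
    window = abs(mean(positive) - mean(negative))
    return 1.0 - 3.0 * spread / window


def variable_slope(x, bottom, top, log_ec50, hill_slope):
    """Y = Bottom + (Top - Bottom) / (1 + 10^((LogEC50 - X) * HillSlope)).

    x and log_ec50 are log10 concentrations.
    """
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - x) * hill_slope))


if __name__ == "__main__":
    print(z_prime([0.9, 1.0, 1.1], [-0.1, 0.0, 0.1]))
    # At X = LogEC50 the response is halfway between Bottom and Top.
    print(variable_slope(-8.0, 0.0, 1.0, -8.0, 1.0))
```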
4.8 HCA classification platform
A glossary of mathematical terms used in this section is provided in section 4.8.4.
4.8.1 Feature selection We used N-fold cross-validation (glossary term 1) on the control data to identify a set of features useful for distinguishing between low, medium and high dose E2 and 4HT treatments. Each plate in the control set was assigned to a fold. In cross-validation, a classifier is trained on (N – 1) plates (training set) and evaluated on the remaining plate (testing set). This is repeated until each of the N plates has been tested once. For each round of cross-validation we scaled the features by subtracting the mean of the training set features and then dividing by the standard deviation of the training set features, and normalized each sample by dividing it by its L2-norm. We then performed stepwise discriminant analysis (SDA; glossary term 2) on the training set to remove less informative features (Jennrich, 1977). Using the N sets of SDA-selected features (obtained from the cross-validation runs), we selected features that appeared in a majority of the runs. This was implemented in Python 2.6 using a port of the SLIC toolbox (http://pslid.cbi.cmu.edu/release/).
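The scaling, normalization and majority-vote steps can be sketched in a few lines. This is not the SLIC port used for the analysis; it assumes the sample standard deviation, leaves constant features unscaled, and uses hypothetical function names. The SDA step itself is not shown.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev


def fit_scaler(train):
    """Per-feature mean and standard deviation estimated on the training set only."""
    columns = list(zip(*train))
    means = [mean(col) for col in columns]
    # Assumption: a constant feature (sd of 0) is left unscaled to avoid dividing by zero.
    sds = [stdev(col) or 1.0 for col in columns]
    return means, sds


def scale_and_normalize(samples, means, sds):
    """Z-score each feature with training statistics, then divide each sample by its L2-norm."""
    out = []
    for sample in samples:
        z = [(v - m) / sd for v, m, sd in zip(sample, means, sds)]
        norm = sqrt(sum(v * v for v in z))
        out.append([v / norm for v in z] if norm > 0 else z)
    return out


def majority_features(selected_per_fold):
    """Features selected in more than half of the cross-validation runs."""
    n_runs = len(selected_per_fold)
    counts = Counter(f for fold in selected_per_fold for f in set(fold))
    return sorted(f for f, k in counts.items() if k > n_runs / 2)
```

Because the scaler is fitted on the training plates and then applied unchanged to the held-out plate, no information from the testing set leaks into the feature scaling.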
4.8.2 Classification A radial basis function (RBF)-kernel (glossary term 3) support vector machine (SVM) classifier (glossary term 4) was trained on control data using the selected features (Cortes and Vapnik, 1995). Its parameters, C (slack penalty) and g (kernel parameter), were tuned with a grid search. Once a classifier was trained, it was applied to the testing data, yielding probabilities of a sample belonging to one of the classes the classifier was trained to recognize (these probabilities are determined by the distance between the sample and the classifier’s decision boundaries) (Wu et al., 2004). After all samples were tested, these probabilities were used to assess classification accuracy. This was implemented in Python 2.6 using the LIBSVM 2.9 toolbox (http://www.csie.ntu.edu.tw/cjlin/libsvm/).
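To make the roles of the two tuned parameters concrete, the sketch below shows the RBF kernel in the form used by LIBSVM, exp(−g·||x − y||²), and a generic exhaustive grid search over (C, g) pairs. It does not call LIBSVM; the scoring function is a hypothetical stand-in for whatever accuracy estimate is used, and the exponentially spaced grids are an assumption, since the grid actually searched is not stated.

```python
from itertools import product
from math import exp


def rbf_kernel(x, y, g):
    """RBF kernel value exp(-g * ||x - y||^2) for two feature vectors."""
    return exp(-g * sum((a - b) ** 2 for a, b in zip(x, y)))


def grid_search(score, c_values, g_values):
    """Return the (C, g) pair with the highest score.

    `score` is a caller-supplied function score(C, g) -> float, for example
    a cross-validated accuracy (hypothetical; not defined here).
    """
    return max(product(c_values, g_values), key=lambda pair: score(*pair))


# Assumed grids: powers of two, a commonly used coarse search range.
C_GRID = [2.0 ** e for e in range(-5, 16, 2)]
G_GRID = [2.0 ** e for e in range(-15, 4, 2)]
```

A larger g makes the kernel more local, so that only very similar samples influence each other, while a larger C penalizes training misclassifications more heavily.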
4.8.3 Clustering Features were scaled and samples normalized using the method described above. Principal component analysis (PCA; glossary term 5) was applied to these features, and the number of resulting components was determined by the percent of dataset variance they captured. Data were clustered using the Euclidean distance (glossary term 6) with various linkage algorithms (centroid, complete, median, single, and Ward’s methods). A resampling approach was employed to find the variance threshold and linkage algorithm that produced the best clustering. Since each treatment was run in quadruplicate, we randomly sampled (without replacement) three wells per treatment. From this subset we defined treatments by the median of these triplicates, and then performed feature standardization and normalization, PCA, and clustering. This was done 2000 times to produce an ensemble of trees. From this ensemble we computed conditional probability tables that describe the probability of two groups of compounds (including single compounds) clustering together given all other existing groups.
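The resampling step can be sketched as follows: for each treatment, three of the four replicate wells are drawn without replacement and the treatment profile is defined as the per-feature median of that triplicate. The function name and data layout are hypothetical, and the subsequent standardization, PCA and hierarchical clustering of each resampled data set are not shown.

```python
import random
from statistics import median


def resample_treatment_profiles(wells_by_treatment, n_keep=3, rng=random):
    """Draw n_keep wells per treatment (without replacement) and take per-feature medians.

    wells_by_treatment maps a treatment name to a list of well-level
    feature vectors (one vector per replicate well).
    """
    profiles = {}
    for treatment, wells in wells_by_treatment.items():
        subset = rng.sample(wells, n_keep)
        profiles[treatment] = [median(column) for column in zip(*subset)]
    return profiles


def resampled_ensemble(wells_by_treatment, n_rounds=2000, seed=0):
    """Generate one resampled set of treatment profiles per round."""
    rng = random.Random(seed)
    return [resample_treatment_profiles(wells_by_treatment, rng=rng)
            for _ in range(n_rounds)]
```

Each element of the returned list would then be clustered independently, giving the ensemble of trees from which co-clustering probabilities are estimated.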
4.8.4 Glossary of terms
1. Cross-validation A technique used to evaluate classifier performance and tune classifier parameters. Validation data is split into N folds. (N-1) of these folds are used to define a training set, while the Nth fold is used for testing. The process is run a total of N times such that each of the folds has been used in testing.
2. Stepwise discriminant analysis (SDA) An iterative approach using feature removal and replacement to select features that are most informative in discriminating between given classes. For more, see (Jennrich, 1977; Huang et al., 2003).
3. Radial basis function (RBF) A transformation to project features from a non-linear to a linear space so that these features are suitable for support vector machine classification.
4. Support vector machine (SVM) classification A supervised learning approach used to define a decision boundary between classes of data. For two classes of data, with samples represented by N features, an (N-1)-dimensional hyperplane is defined such that it maximizes the margin (distance) between the plane and the nearest samples from both classes. For M classes of data, an ensemble of M(M−1)/2 pair-wise classifiers can be produced, and some voting method can be applied across this ensemble to produce a classification label for a test sample. One property of SVM classification is that it allows the classifier to handle noisy data by allowing misclassifications during training. The parameter controlling this is the slack penalty. Another property of SVMs is that they define linear hyperplanes. To deal with non-linear data, features can be projected into different spaces (to linearize them) using various transformations, one of which is a radial basis function (Cortes and Vapnik, 1995).
5. Principal component analysis (PCA) A feature reduction method in which features are projected into orthogonal components. The first component contains the highest variance, and subsequent components capture less. Many of the lower ranked components can be discarded under the presumption that they contain little useful information.
6. Euclidean distance The geometric distance between two samples. For two samples, S1 and S2, with N features, this is
$$d(S_1, S_2) = \sqrt{\sum_{i=1}^{N} \left(S_{1,i} - S_{2,i}\right)^2}$$
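Two of the glossary entries lend themselves to a short numerical illustration: the Euclidean distance between two samples (the square root of the summed squared feature differences) and the pair-wise voting scheme for multi-class SVMs. The sketch is illustrative only; the function names are hypothetical, and simple majority voting is just one of the possible voting methods.

```python
from collections import Counter
from math import sqrt


def euclidean_distance(s1, s2):
    """Square root of the summed squared differences over N features."""
    if len(s1) != len(s2):
        raise ValueError("samples must have the same number of features")
    return sqrt(sum((a - b) ** 2 for a, b in zip(s1, s2)))


def pairwise_classifier_count(m):
    """Number of pair-wise classifiers needed for m classes: m * (m - 1) / 2."""
    return m * (m - 1) // 2


def majority_vote(pairwise_labels):
    """Label predicted by the most pair-wise classifiers.

    Ties are resolved in favor of the label encountered first.
    """
    return Counter(pairwise_labels).most_common(1)[0][0]


if __name__ == "__main__":
    print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
    print(pairwise_classifier_count(3))
    print(majority_vote(["E2", "4HT", "E2"]))
```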
4.9 Western Blotting
ER:PRL-HeLa cells were lysed in Cell Extraction Buffer (Biosource, Invitrogen) + complete protease inhibitor cocktail (Roche) for 10 min, and the debris was cleared by centrifugation at 13,400 × g for 15 min at 4°C. The samples were resolved by SDS-PAGE and transferred to nitrocellulose membranes (Bio-Rad). Primary antibodies (ER, Millipore 04–820 and actin, Affinity MA1-744) were diluted in TBS-T buffer (5% non-fat dry milk, 50mM Tris-HCl, 150mM NaCl [pH 7.5], 0.1% Tween 20) and added to the membranes overnight at 4°C, followed by incubation with the appropriate horseradish peroxidase-conjugated secondary antibody for 1 hour at room temperature. All proteins were detected with ECL Plus Detection Reagents (Amersham) and visualized by chemiluminescence.
4.10 Time-Lapse Video Microscopy
ER:PRL-HeLa cells were plated onto 35-mm Delta T dishes (Bioptechs) pre-coated with poly-D-lysine for live cell imaging. Imaging was performed with a Zeiss LSM 510 confocal microscope using a 63× objective (NA=1.4). HEPES-buffered media previously gassed in a 5% CO2 incubator was used to replace the existing growth media. Delta T dishes were secured to a stage adapter for temperature control at 37°C (±0.1°C). A Bioptechs objective-heating collar was also used (also at 37°C). Hormone was applied to the cells, and the Delta T dish was covered with a black plastic lid to minimize evaporation.