

HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Gene. Author manuscript; available in PMC 2012 May 15.
Published in final edited form as:
PMCID: PMC3086628

High content imaging-based assay to classify estrogen receptor-α ligands based on defined mechanistic outcomes

Abstract


Estrogen receptor-α (ER) is an important target both for therapeutic compounds and endocrine disrupting chemicals (EDCs); however, the mechanisms involved in chemical modulation of ER transcriptional activity are inadequately understood. Here, we report the development of a high content analysis-based assay to describe ER activity that uniquely exploits a microscopically visible multicopy integration of an ER-regulated promoter. Through automated single-cell analyses we simultaneously quantified promoter occupancy, recruitment of transcriptional cofactors and large-scale chromatin changes in response to a panel of ER ligands and EDCs. Image-derived multi-parametric data were used to classify a panel of ligand responses at high resolution. We propose this system as a novel technology providing new mechanistic insights into EDC activities in a manner useful for both basic mechanistic studies and drug testing.

1. Introduction


Estrogen receptor-α (ER) is a member of the superfamily of nuclear receptor (NR) transcription factors (Nilsson and Gustafsson, 2000). ER elicits its effects via direct binding to specific sequences within DNA, termed estrogen response elements (EREs) (Klein-Hitpass et al., 1988), and also indirectly via interaction with certain transcription factors (e.g., AP1) (Kushner et al., 2000). Found in both reproductive and non-reproductive tissues, ER participates in the gene regulatory programs of several important physiological processes, including sexual maturation, maintenance of bone density and CNS functions, and is known to be a critical regulatory factor in certain diseases, namely breast cancer and osteoporosis (reviewed in Woolley, 1999; Nilsson and Gustafsson, 2000; Sommer and Fuqua, 2001). For these reasons, ER has become a target for the development of many therapeutic compounds, including estrogen mimetics (e.g., diethylstilbestrol (DES)) and selective ER modulators (SERMs; e.g., 4-hydroxy-tamoxifen (4HT) and raloxifene (RAL)). Similar to the cognate ligand 17β-estradiol (E2), these compounds bind ER within the binding pocket of the ligand binding domain (LBD) (Brzozowski et al., 1997; Shiau et al., 1998). X-ray crystallography has revealed that either an E2- or DES-bound LBD results in presentation of a coregulator binding surface on helix-12 of the LBD, whereas binding of either 4HT or RAL does not. These configurations are proposed to regulate the agonist or antagonistic properties of the receptor (Brzozowski et al., 1997; Shiau et al., 1998). In addition to therapeutic compounds, it is becoming increasingly apparent that there are numerous chemicals, both natural and synthetic, that can impinge upon ER (and other NR) signaling through similar mechanisms.
These endocrine-disrupting chemicals (EDCs), including the widely used plasticizer bisphenol-A (BPA), can modulate ER activity at varying exposure levels with potentially wide-reaching medical and environmental consequences (Bromer et al.; Richter et al., 2007). A key challenge for high-throughput drug screening and EDC testing has been to access the complex mechanisms associated with NR functions that are usually studied with a wide range of biochemical assays on large populations of cells. Historically this has involved the use of numerous and diverse assays, each providing single point readouts representing population-based response averages. Although informative, these assays have proved limiting in the quest for time- and cost-efficient assessment of drug or EDC effects. To address these issues, we have developed a high-throughput, single cell-based assay specifically designed to simultaneously quantify multiple biological features of ER activity using a novel, high content analysis (HCA) approach. This automated systems-level inquiry was used to profile dose-dependent effects of a panel of ER ligands, including SERMs and EDCs. Compounds were assigned activities based on simultaneously collected quantitative measurements of their ability to affect transcriptional regulation of a microscopically visible, integrated reporter gene locus (Sharp et al., 2006; Berno et al., 2008). The primary measurements include: 1) ER protein levels and localization, 2) ER targeting to the ERE-rich reporter gene locus, 3) large-scale chromatin remodeling, and 4) ER-based recruitment of RNA polymerase II. Quantitative automated image analyses provided the means to classify compounds based upon their collective effects upon ER functions.

2. Results


2.1 Image-based analysis of ERα genomic activity

Several distinct events occur during ligand-activation of ER-dependent gene transcription. These include nuclear translocation, binding to DNA response elements within chromatin, recruitment or loss of various cofactors associated with transcriptional activation or repression, and receptor degradation. Each of these events can be quantified by individual biochemical assays based on cell population averages but the ability to simultaneously perform analyses within a cellular context is currently only available through the use of multi-copy promoter array systems combined with high resolution microscopy. We previously described the generation of a mammalian promoter chromosomal array system based on the stable, multi-copy integration of the ER-responsive prolactin promoter-enhancer (PRL-HeLa (Sharp et al., 2006)). Low-throughput semi-automated image analysis of transiently-expressed ER demonstrated the physiological responsiveness of the array to ER ligands E2 and 4HT (Berno et al., 2008). However, transient expression systems are not ideally suited to high-throughput imaging technologies due to inconsistent transfection efficiencies and extremely variable protein expression levels that require extensive post-acquisition filtering to focus analysis on physiologically-relevant cells. It was therefore critical to generate a clonal derivative of the PRL-HeLa cell line with constitutive expression of relatively low levels of GFP-tagged ER. To achieve this goal we used a lentiviral expression system and clonal expansion in the presence of antibiotic selection and 4HT [1nM], which was necessary due to negative effects of ER expression on HeLa cell growth. The clone used in this study (ER:PRL-HeLa) was typically greater than 95% positive for GFP-ER, which was predominantly localized to the nucleus and expressed at 4.3 fold higher levels when compared to endogenous ER in MCF-7 breast cancer cells, as determined by immunofluorescence (Fig. 1a). 
Similar to our previous studies with transiently transfected ER in PRL-HeLa (Sharp et al., 2006; Berno et al., 2008), we tested the response of the ER:PRL-HeLa cells to compounds known to regulate ER levels and activity. In agreement with published studies (Wijayaratne et al., 1999; Preisler-Mashek et al., 2002) treatment with either E2 or ICI 182780 (ICI) substantially decreased ER expression while 4HT increased ER levels (Fig. 1a). Treatment with 10nM E2 produced rapid recruitment of the constitutively expressed GFP-ER to the PRL-array followed by large-scale chromatin decondensation that is known to be associated with transcriptional activity (Muller et al., 2001; Janicki et al., 2004; Berno et al., 2008) (Supplementary Video 1). During vehicle treatment, the PRL-array did not accumulate ER above nucleoplasmic levels (not shown). Immunolabeling with antibody against RNA polymerase II (Pol II) demonstrated marked recruitment of Pol II to the PRL-array after treatment with E2 [10nM; 30min]; conversely treatment with 4HT [10nM; 30 min] produced condensed arrays associated with transcriptional repression (Berno et al., 2008) that did not recruit Pol II (Fig. 1b).

Figure 1
GFP-ER expressed in PRL-HeLa cells exhibits normal localization and responds to treatment with known ER agonist and antagonist compounds

Automated multiwell plate handling was used to perform cell plating, compound dilution and transfer, and immunofluorescence protocols, and was combined with automated image acquisition. These procedures are described in detail within sections 4.3 and 4.4. Figure 2a shows images collected in 4 fluorescent channels and their associated image masks that were created using customized routines in Pipeline Pilot (Basic and Advanced Imaging Collection, Accelrys, San Diego, CA). Image features pertaining to the mean, sum and distribution of the pixel intensities under each of the defined masks (nucleus, cytoplasm and PRL-array) were selected to describe the cellular response to potential estrogenic/anti-estrogenic compounds. A full list of the extracted features is shown in Supplementary Table I. Intra-plate Z′ values were calculated for some of the features envisaged to be important in determining mechanisms of compound activity, and in discerning agonist from antagonist-like responses (Fig. 2b). Z′ values >0.5 are considered preferable for cell-based assays (Zhang et al., 1999). Indicative of the data quality from the PRL-array model system, intra-plate Z′ values for 4 of these selected features were >0.6. Array Occupancy (proportion of cells with a detectable GFP-ER loaded array) and GFP-ER CV (coefficient of variation for GFP-ER pixel intensities in the nucleus) were both considered effective measures of ER-DNA binding, giving Z′ values of 0.91 and 0.71, respectively, when comparing vehicle control against E2 or 4HT. Using customized array segmentation routines, described in detail in section 4.6.1, we were able to measure the area of the ER-occupied arrays with sufficient accuracy and consistency to reliably distinguish the larger decondensed arrays resulting from agonist (E2) treatment from the smaller, brighter, condensed array structures resulting from antagonist (4HT) treatment (Z′ value = 0.61).
Pol II loading at the array was similarly effective at discriminating between agonist (E2) and antagonist (4HT) controls with a Z′ value of 0.78. The feature ‘percent nuclear GFP-ER’ failed to give an acceptable Z′ value due to the small magnitude of change in ER nuclear localization following a 30 min exposure to either E2 or 4HT.
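For readers unfamiliar with the Z′ statistic, the following is a minimal sketch of the standard definition from Zhang et al. (1999), Z′ = 1 − 3(σp + σn)/|μp − μn|, computed from the well-level values of two control groups. The function name and the example well values are hypothetical and are not taken from this study; the sample standard deviation is assumed.

```python
from statistics import mean, stdev

def z_prime(positive, negative):
    """Z' = 1 - 3*(sd_pos + sd_neg) / |mean_pos - mean_neg| for two control groups."""
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation

# Hypothetical well-level array-occupancy fractions (illustrative only).
e2_wells = [0.92, 0.95, 0.90, 0.93]
vehicle_wells = [0.05, 0.04, 0.06, 0.05]
print(round(z_prime(e2_wells, vehicle_wells), 2))
```

A feature whose control groups are widely separated relative to their spread gives a value approaching 1, which is why values above 0.5 are treated as a sign of a robust assay window.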

Figure 2
Development of the ER:PRL-HeLa HCA assay

2.2 Kinetic analysis of ligand-induced PRL array occupancy by ER

Previous single cell-based studies using both the PRL and MMTV array systems have reported time-dependent changes in array recruitment and chromatin condensation state in response to ligand treatment (Muller et al., 2001; Berno et al., 2008; Stavreva et al., 2009). Also, in previous biochemical studies using bulk cell populations and chromatin immunoprecipitation (ChIP), averaged cellular results from E2-treated cells indicated cyclical recruitment and loss of both ER and Pol II at the endogenous pS2 promoter (Metivier et al., 2003). Taken together, these data suggest complex and highly dynamic ER-promoter interactions (Sharp et al., 2006). Thus, we hypothesized that ligand-dependent effects on ER interaction with the PRL-array would be time-sensitive. To test this hypothesis, we employed the high-throughput automated assay described above to study the kinetics of ER and Pol II recruitment to the PRL array in response to treatment with low (0.1nM), intermediate (1nM) and high (10nM) doses of E2 and 4HT (Fig. 3). Cells were treated for 5 – 30 min at 5 min intervals and for 60 – 360 min at 60 min intervals. Maximal array occupancy was not reached at low dose for either ligand (Fig. 3a). Treatment with an intermediate dose of E2 resulted in maximal array occupancy within 30 min of treatment; however, the same concentration of 4HT produced only ~10% occupancy at 30 min. High dose E2 gave maximal occupancy within ~10 min, whereas the 4HT response was again delayed, reaching maximal occupancy at 30–60 min post-treatment. Once a maximal level of array occupancy was established for either ligand, it was maintained for up to 6 hours.

Figure 3
Kinetics and dose-dependency of ligand-induced promoter occupancy and recruitment of Pol II

We next investigated the effects of either ligand on the kinetics of chromatin decondensation (Fig. 3b) and Pol II loading at the PRL-array (Fig. 3c). Due to the low array occupancy in cells treated with low doses of ligand, we analyzed only intermediate and high doses for these features. As predicted from the live imaging studies (Supplementary Video), quantitative measurement of fixed cells showed that E2-treatment produced a gradual decondensation of arrays reaching a peak at ~30 min. From 30 to 60 min, the arrays underwent a partial recondensation resulting in a plateau of ~60% of the maximum response 2 hours post-treatment (Fig. 3b). In cells treated with 4HT, arrays remained small and condensed up to 6 hours after treatment, with no peak at 30 min. Consistent with early studies using the glucocorticoid receptor (GR)-activated MMTV promoter array system, which indicated that maintenance of decondensed chromatin required the presence of an elongating transcription factor (Muller et al., 2001), we observed that recruitment of RNA polymerase II to the PRL array followed kinetics and ligand dependence similar to those of chromatin condensation (Fig. 3c). With intermediate and high doses of E2, peak recruitment of Pol II occurred at ~30 min post-treatment. As expected, neither intermediate nor high dose 4HT could induce loading of Pol II to the PRL array, consistent with suppressed reporter gene mRNA accumulation (Berno et al., 2008) and the inability to induce chromatin decondensation. We were able to conclude from these experiments that the magnitude of a response, as opposed to the kinetic profile, was most affected by compound dose and, most importantly for this study, that a critical time window exists at ~30 min post-treatment for retrieval of the most informative ligand-specific activities. This time window coincides with the previous assignment of maximal transcriptional activity observed in response to E2 treatment (Berno et al., 2008).

2.3 Determining compound activities using an EC50 for array occupancy by ER and selected image based features

Dose-response data were generated for a panel of compounds, received from the Interagency Coordinating Committee on the Validation of Alternative Methods, that were considered potential ER ligands. The identities of some compounds were known and others (potential EDCs) remained blinded during testing. Of the 15 compounds tested, 8 induced array occupancy (Table I). From this group of 8, 4 were known: E2, BPA, 4HT, and raloxifene (RAL); and 4 were blinded, indicated by a * in all figures and later revealed to be 17α-estradiol (17α-E2), BPA, Bisphenol-B (BPB) and diethylstilbestrol (DES). We next calculated EC50 values for the 8 compounds based on non-linear curve-fitting of a 10 point dose-response series of array occupancy measurements, n=3 independent experiments (Fig. 4a&b, and Table I). The average calculated EC50 for E2-dependent PRL-array occupancy was 7.7×10^-10 M. BPA and BPB had the lowest affinity to induce ER occupancy of the PRL-array, DES had an intermediate affinity and the affinities of 17α-E2, RAL and 4HT were not statistically different from E2 (Fig. 4b).
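The text does not state which model or software was used for the non-linear curve fitting. As an illustration only, the sketch below assumes a four-parameter logistic (Hill) model and estimates the EC50 from a 10-point series by a simple grid search over log10(EC50) and slope, with bottom and top obtained by linear least squares at each grid point. All function names and the synthetic data are hypothetical; a dedicated optimizer would normally replace the grid search.

```python
import math

def hill(conc, ec50, slope):
    """Fraction of maximal response for a Hill curve at one non-zero concentration."""
    return 1.0 / (1.0 + (ec50 / conc) ** slope)

def fit_ec50(concs, responses, steps=450, slopes=(0.5, 0.75, 1.0, 1.5, 2.0, 3.0)):
    """Grid-search log10(EC50) and slope; bottom and top follow from linear least squares."""
    n = len(concs)
    y_mean = sum(responses) / n
    log_lo, log_hi = math.log10(min(concs)), math.log10(max(concs))
    best = None
    for i in range(steps + 1):
        ec50 = 10.0 ** (log_lo + (log_hi - log_lo) * i / steps)
        for slope in slopes:
            f = [hill(c, ec50, slope) for c in concs]
            f_mean = sum(f) / n
            sxx = sum((fi - f_mean) ** 2 for fi in f)
            if sxx == 0.0:
                continue
            # Response is linear in f: y = bottom + span * f
            span = sum((fi - f_mean) * (yi - y_mean) for fi, yi in zip(f, responses)) / sxx
            bottom = y_mean - span * f_mean
            sse = sum((bottom + span * fi - yi) ** 2 for fi, yi in zip(f, responses))
            if best is None or sse < best[0]:
                best = (sse, ec50, slope, bottom, bottom + span)
    _, ec50, slope, bottom, top = best
    return {"ec50": ec50, "slope": slope, "bottom": bottom, "top": top}

# Hypothetical noiseless 10-point half-log series (molar); not data from this study.
concs = [10.0 ** (-12 + 0.5 * k) for k in range(10)]
responses = [0.05 + 0.85 * hill(c, 1e-9, 1.0) for c in concs]
fit = fit_ec50(concs, responses)
print(f"EC50 ~ {fit['ec50']:.2e} M, slope {fit['slope']}")
```

Fitting on a logarithmic concentration axis keeps the search evenly spaced across the several orders of magnitude spanned by a typical dose series.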

Figure 4
Determination of ER ligand activities and mechanism of action from image-derived measurements
Table I
Compound panel, EC50 for array occupancy

We next investigated the effects of the active compounds on selected image-derived features. The means ± standard error (SE) were calculated from quadruplicate wells treated with the maximum dose used of each compound for the following parameters: i) array occupancy, ii) percent nuclear GFP-ER, iii) GFP-ER array loading, iv) array area and v) Pol II array loading (Fig. 4c–g). This collection of mechanism-oriented results enabled us to compare the effects of blinded EDC compounds to known ER ligands. All compounds induced a similar level of maximal array occupancy (Fig. 4c) and a similar increase in nuclear localization of ER versus vehicle (Fig. 4d). RAL, BPA and BPB, like 4HT, produced small, bright arrays with less Pol II recruitment than E2, indicated by significantly higher GFP-ER loading (Fig. 4e), smaller array area (Fig. 4f) and significantly lower Pol II recruitment (Fig. 4g). Responses to 17α-E2 and DES were not significantly different from E2.

2.4 A compound response classification framework for high content screening

We applied statistical learning methods to develop an automated analysis platform for high content screens. We sought to classify compounds as having varying levels of agonist or antagonist responses by training a classifier to recognize responses in a control set. The control set consisted of E2 and 4HT responses to non-saturating (low), saturating (medium), and supersaturating (high) doses across 14 plates. We extracted cell-level features from the control data and averaged these across wells. We then applied stepwise discriminant analysis (SDA) (Jennrich, 1977) to identify features useful in distinguishing between different control groups. To make the feature selection process more robust across plates, we performed 14-fold cross-validation (Kohavi, 1995) around the SDA, splitting data by plate. We then selected the SDA features that appeared in a majority of the folds, indicating their cross-plate robustness. These four features were: 1) Array_GFP-ER_PI-CV, 2) Array_GFP-ER_PI-Variance, 3) Nucleus_GFP-ER_PI-Maximum, and 4) Array_area. Using these four features, we trained a classifier on the control set and applied this classifier to categorize the 8 active compounds at their maximum dose. For a given plate (each with a set of controls), a support vector machine classifier (Cortes and Vapnik, 1995) was trained. RAL, BPA and BPB were classified into either the 4HT-high or -med class, while DES and 17α-E2 were classified into the E2-high or -med class (Table II). The classifier additionally outputs a probability that its decision is correct (Wu et al., 2004), providing a confidence level for each compound classification. Using 14-fold cross-validation on the 14-plate control dataset, we assessed the performance of this framework (Table III).
While there was excellent discrimination between E2 and 4HT compound classes, there was some confusion between medium and high doses, indicating that saturating and supersaturating doses produced the greatest similarity in cellular responses. Importantly, the four significant features were selected in each round of cross-validation, indicating that our feature selection method is robust to plate variability.
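The plate-wise selection scheme described above can be sketched as follows: hold out one plate at a time, run feature selection on the remaining plates, and keep the features chosen in a majority of folds. This is an illustration under stated assumptions, not the authors' implementation. In particular, `select_features` is a hypothetical stand-in that scores class separation and is NOT stepwise discriminant analysis, and the plate data are synthetic.

```python
import random
from collections import Counter
from statistics import mean, pstdev

def select_features(rows, threshold=1.0):
    """Stand-in selector (not SDA): keep features whose class means differ by > threshold pooled SDs."""
    selected = set()
    for name in rows[0][1]:
        a = [values[name] for label, values in rows if label == "E2"]
        b = [values[name] for label, values in rows if label == "4HT"]
        pooled = ((pstdev(a) ** 2 + pstdev(b) ** 2) / 2) ** 0.5
        if pooled > 0 and abs(mean(a) - mean(b)) / pooled > threshold:
            selected.add(name)
    return selected

def robust_features(plates):
    """Leave-one-plate-out selection; keep features chosen in a majority of folds."""
    votes = Counter()
    for held_out in plates:
        training = [row for plate, rows in plates.items() if plate != held_out for row in rows]
        votes.update(select_features(training))
    return sorted(name for name, count in votes.items() if count > len(plates) / 2)

# Synthetic well-level data for 14 plates (hypothetical values, fixed seed).
rng = random.Random(0)
plates = {}
for plate in range(14):
    rows = []
    for label, area in (("E2", 10.0), ("4HT", 5.0)):
        for _ in range(4):
            rows.append((label, {"array_area": rng.gauss(area, 0.5),
                                 "noise": rng.gauss(0.0, 1.0)}))
    plates[plate] = rows

print(robust_features(plates))
```

Splitting by plate rather than by well means that a feature survives only if it separates the controls regardless of which plate is left out, which is the sense of cross-plate robustness used in the text.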

Table II
SVM classification of active compounds
Table III
Cross-validation of classifier performance.

2.5 Hierarchical clustering to define a response fingerprint

In order to show that the features listed in Supplementary Table I can define meaningful and unique fingerprints of ligand responses, we performed principal component analysis and hierarchical clustering on maximum-dose compound responses from a single plate (Fig. 5). After defining various treatment response datasets using sampling without replacement, we produced an ensemble of cluster trees that were used in the optimization and evaluation of our clustering approach. We defined a simple measure of stability, m, for a reference tree (generated using the complete set of wells to define treatments) as the ratio of the number of ensemble trees that have the same linkage as the reference to the total number of ensemble trees. Trees with higher m are more stable than trees with lower m. We found that selecting PCA features that capture the top 80% of variance, coupled with Ward’s method (compared to centroid, complete, median, and single linking approaches), produced the reference tree (Fig. 5, green lines) with the highest robustness (m = 0.46). We also used the ensemble of trees to determine the conditional probability of pairings of different compound groups (Fig. 5, blue text). Hierarchical clustering robustly grouped the treatments into two major response families: one containing the known antagonist (4HT) and the other containing the known agonist (E2). BPA fell into the grouping with 4HT while exhibiting a signature distinct from RAL and/or 4HT. BPB typically clustered with RAL and 4HT, but also grouped notably with BPA (not shown). DES and 17α-E2 clustered together within the agonist response group.
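The stability measure m can be illustrated with a short sketch. It assumes that "same linkage" means identical tree topology, i.e. the same set of leaf groupings irrespective of the order in which branches are drawn; that interpretation, the nested-tuple tree representation, and the example trees are assumptions made for illustration.

```python
def clades(tree):
    """Return the set of leaf groups defined by the internal nodes of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found

def stability(reference, ensemble):
    """m = (ensemble trees with the same topology as the reference) / (ensemble size)."""
    ref = clades(reference)
    same = sum(1 for tree in ensemble if clades(tree) == ref)
    return same / len(ensemble)

# Hypothetical reference tree and resampled ensemble (illustrative only).
reference = ((("E2", "DES"), "17aE2"), (("4HT", "RAL"), "BPA"))
ensemble = [
    ((("E2", "DES"), "17aE2"), (("4HT", "RAL"), "BPA")),   # identical
    (("BPA", ("RAL", "4HT")), ("17aE2", ("DES", "E2"))),   # same topology, branches reordered
    ((("E2", "17aE2"), "DES"), (("4HT", "RAL"), "BPA")),   # different agonist pairing
    ((("E2", "DES"), "17aE2"), (("4HT", "BPA"), "RAL")),   # different antagonist pairing
]
print(stability(reference, ensemble))
```

Comparing sets of leaf groups rather than the nested tuples themselves avoids counting two trees as different merely because their branches are listed in a different order.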

Figure 5
Hierarchical clustering of cellular response fingerprints

3. Discussion

The overall goal of creating a single cell-based model to investigate ER function at a systems level has led to the development of the current in vitro screening system that generates mechanism-rich data using a platform compatible with large-scale screening applications. In this study, we describe the generation and characterization of the ER:PRL-HeLa cell line and the development of associated high content imaging and analysis capabilities. Optimal conditions for comparison of ligand responses were determined to be 30 min post-treatment, based on agonist (E2) and antagonist (4HT) control responses. This time point is consistent with maximal transcriptional activity in response to E2 as determined in a previous, less automated study (Berno et al., 2008). From a semi-blinded ER- and EDC-specific compound panel, we identified 8 compounds that were able to induce rapid ER binding to the PRL-HeLa array (within 30 min): E2, 17α-E2, 4HT, RAL, DES, BPA (tested both as a known and as a blinded sample) and BPB. We determined their EC50 for inducing binding of ER to the PRL-array and characterized their effects on ER nuclear localization, ER-induced chromatin remodeling, ER promoter loading and recruitment of Pol II to the promoter array. Further, we used approximately 100 image-based features in both a robust SDA-based classification schema and a PCA-based hierarchical clustering approach to group similar responses, creating a classification framework for high content screening. Using compound classification based on HCA we were able to clearly distinguish agonist (E2) from antagonist (4HT) responses. The classifier assigned 4HT and RAL to the 4HT-like antagonist group with high confidence and these compounds also robustly clustered together. E2, 17α-E2 and DES were classified as having E2-like responses and these compounds consistently clustered together and separately from RAL and 4HT.
The classifier assigned both BPA and BPB to the antagonist class with high confidence; however, hierarchical clustering carried out on the PCA-derived data was able to clearly distinguish the BPA and BPB responses from 4HT and RAL. Data presented in this study are consistent with the previously proposed hypothesis that xenoestrogens can affect ER function via multiple mechanisms (Safe et al., 2001) and that BPA can behave antagonistically in certain cell systems (Gould et al., 1998; Yoon et al., 2000). Lack of induction of ER promoter occupancy in response to either vinclozolin or flavone is consistent with previous studies indicating that vinclozolin does not activate ERα-dependent gene transcription in some systems (Sonneveld et al., 2005; Kojima et al., 2004) and that the estrogenic/anti-estrogenic effect of flavone can be attributed to indirect mechanisms (Collins-Burow et al., 2000; Frigo et al., 2002).

A screening system for ER ligands and EDCs that offers important and early insights into potential mechanisms of action has obvious potential for scientific and logistical/economic benefits. This is particularly critical in the case of ER because it is a common EDC target and also an important druggable target for the treatment of breast cancer and conditions prevalent in post-menopausal women, including osteoporosis (Dix and Jordan, 1979; Delmas et al., 1997; Tice, 1978). Significant advances in ER ligand-screening technologies would therefore have potentially far-reaching health benefits. While we (Szafran et al., 2008; Szafran et al., 2009; Hartig et al., 2010) and others (Perlman et al., 2004; Young et al., 2008; reviewed in Feng et al., 2009) have applied HCA to compound testing of NR and other biologies, our unique exploitation of the ER-dependent integrated promoter array contributes considerable new functional data to this field. Utilization of a large panel of antibodies to nuclear receptor coregulators and other transcription-associated factors is currently in progress to improve the mechanistic profiling of ER functions at the single cell level.

4. Materials and methods


4.1 Chemicals

Chemicals used in assay development/testing were obtained from the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) via G. Bittner (Plastipure, Austin, TX). Compounds were received in solid form and 1μM stock solutions were prepared in 100% ethanol. Other chemicals were obtained from Sigma-Aldrich (St. Louis, MO) unless stated otherwise.

4.2 Cell line generation and maintenance

The ER:PRL-HeLa cell line was generated by stable introduction of GFP-tagged ERα into the previously described PRL-HeLa cell line (Sharp et al., 2006). Briefly, HeLa cells were cotransfected with p52X-PRL-DS-Red2-SKL and pTKHygro (Clontech), and selection was carried out in 400μg/ml hygromycin. Transient cotransfection with GFP-ER and GFP-Pit-1 resulted in the presence of GFP signal in the nucleus with either 1 or 2 bright intranuclear foci of fluorescence that indicated the integration of the p52X-PRL-dsRED2-SKL plasmid (confirmed by DNA FISH (Sharp et al., 2006)). The clone HeLa/52X-DM66-Red2-PTS-23(#19) was used to generate the ER:PRL-HeLa cells. In contrast to the previously described HeLa/52X-DM66-Red2-PTS-23 (Sharp et al., 2006), clone #19 exhibits a single focus in the presence of GFP-ER and has lower basal activity in the absence of hormone (Amazit et al., 2007). GFP-hERα was amplified by PCR from the hERα-C1-EGFP plasmid (Berno et al., 2008) using the following forward and reverse primers: 5′-C ACC ATG GTG AGC AAG GGC-3′ and 5′-CTA GAC TGT GGC AGG GAA ACC CTC-3′. The resulting PCR product was cloned into pLenti6/V5 using the directional TOPO cloning system (Invitrogen). The full insert was sequenced in pLenti6/V5 and shown to encode EGFP in frame with full-length hERα. Virus production was performed according to the manufacturer’s guidelines (Invitrogen) using 293FT cells. Transduced clones were selected using 0.8μg/ml blasticidin and single cell cloned using flow-assisted cell sorting based on GFP fluorescence. As initial GFP-ER-positive clones were growth-limited, single cell clones were expanded in phenol red-free DMEM with 10% FBS, 0.8μg/ml blasticidin and 1nM 4HT. ER:PRL-HeLa cells were subsequently maintained in phenol red-free DMEM + L-glutamine and Na+ pyruvate supplemented with 10% FBS (Gemini Bio-Products), 200μg/ml hygromycin, 0.8μg/ml blasticidin, and 1nM 4-hydroxytamoxifen (Sigma).

MCF-7 breast cancer cells, used for co-culture with ER:PRL-HeLa cells, were routinely maintained in IMEM supplemented with 10% FBS and penicillin/streptomycin. GFP ER:PRL-HeLa cells were co-cultured with MCF-7 cells on poly-D-lysine coated glass coverslips in phenol red-free DMEM supplemented with 5% charcoal-stripped and dialyzed FBS for 24 hours before fixation and immunolabeling with antibody against ERα (rabbit monoclonal ER60C, Millipore 04–820). Cells were plated directly onto poly-D-lysine coated glass coverslips for high-resolution deconvolution imaging. Multiwell plate preparation is described below.

4.3 Multi-well plate preparation

ER:PRL-HeLa cells (passage 6–12) grown in T75 cell culture flasks were trypsinized, spun down and resuspended in phenol red-free DMEM with L-glutamine and Na+ pyruvate supplemented with 5% charcoal stripped, dialyzed (SD)-FBS (Experiment Media) to a density of 1.2×10^5 cells per ml. A TiterTek Multidrop 384 fluid dispensing unit was used to dispense ~3600 cells per well into 384-well plates (384 IQ-EB black/clear, Aurora Biotechnologies). Cells were grown in the absence of 4HT for 48 hours prior to compound treatments. Serial dilution of compounds and addition to the multi-well plates was performed using a Beckman Biomek NX robotic platform. Antibody labeling was performed as described previously (Stenoien et al., 2000) using 4% formaldehyde fixation (20 minutes, room temperature) and indirect labeling with anti-mouse/rabbit Alexa-546 or anti-mouse/rabbit Alexa-647 conjugated secondary antibodies (Molecular Probes). Standard protocol details can be found in section 4.4. All liquid handling steps were carried out using a Beckman Biomek NX robotic platform.
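As a quick check on the plating numbers, the stated density and cell number per well imply a dispense volume per well. The sketch below works through that arithmetic; the dispense volume itself is not stated in the text and is derived here, and the function name is hypothetical.

```python
def dispense_volume_ul(cells_per_well, cells_per_ml):
    """Volume in microliters that delivers the requested number of cells."""
    return cells_per_well / cells_per_ml * 1000.0

volume = dispense_volume_ul(3600, 1.2e5)   # cell number and density from the text
plate_volume_ml = volume * 384 / 1000.0    # one full 384-well plate, ignoring dead volume
print(f"{volume:.0f} ul per well, {plate_volume_ml:.2f} ml per plate")
```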

4.4 Immunofluorescence Protocol

Cells were washed once in PBS (with Ca2+, Mg2+) and fixed for 20 min at room temperature in 4% formaldehyde prepared in PEM buffer (80mM potassium PIPES, pH 6.8, 5mM EGTA, 2mM MgCl2). This was followed by a 10 min quench in 0.1M ammonium chloride and a 30 min permeabilization using 0.5% Triton-X. Blocking was performed in Blotto (5% milk prepared in 1X TBS-Tween-20) for 15 min and the cells were incubated overnight at 4°C in primary antibody (mouse anti-RNA polymerase II, Abcam ab5408) diluted in Blotto. The following day the cells were washed three times in Blotto for 10 min each and incubated with secondary antibody for 1 hour at room temperature. The cells were then washed an additional 3 times in PEM and incubated for 10 min in a solution of 1μg/ml DAPI and 1μg/ml of CellMask-FarRED (Molecular Probes) in PEM. In multiwell plates this solution was replaced with PBS + 0.02% sodium azide for imaging. Coverslips were mounted in SlowFade Gold (Molecular Probes).

4.5 Image Collection

Automated imaging was carried out using either the Cell Lab IC-100 Image Cytometer (IC100; Beckman Coulter) or the DeltaVision core system (Applied Precision) with automated stage (DV Live). The IC-100 system consisted of a Nikon Eclipse TE2000-U Inverted Microscope (Nikon; Melville, NY) with Chroma 82000 triple band filter set (Chroma; Brattleboro, VT), a Hamamatsu ORCA-ER Digital CCD camera (Hamamatsu; Bridgewater, NJ) and a Photonics COHU Progressive scan focusing camera (Photonics; Oxford, MA). This was equipped with a Nikon S Fluor 40×/0.90NA objective and the imaging camera was set to capture 8 bit images at 1×1 binning (1344×1024 pixels; 6.5 μm pixel size). The DV Live system consisted of an Olympus IX71 microscope with a 250W xenon light, Photometrics CoolSNAP HQ2 camera, standard filter set with multiband dichroic beamsplitter and individual excitation and emission filters (DAPI ex 360/40nm, em 457/40nm; FITC ex 490/20nm, em 526/38nm; TRITC ex 555/28nm, em 617/63nm). This was equipped with a 40X Plan Apo/0.95 air gap objective with correction collar. In either case we collected 8–12 fields per well in 4 channels: blue, green, red and far red. For GFP-ER and Pol II, a stack of 6 focal planes was collected at 1 μm intervals. High-resolution fluorescence deconvolution microscopy was performed with a DeltaVision Restoration Microscopy System (Applied Precision Inc.). Cells were imaged using a 60X objective lens (1.42 NA). A Z-series of focal planes (~30 at 0.2 μm) was digitally imaged and deconvolved with the DeltaVision constrained iterative algorithm to generate high-resolution images.

4.6 Automated image analysis and feature extraction

4.6.1 Pipeline Pilot

Unless stated otherwise images were analyzed using Pipeline Pilot Version 7.5 (Accelrys) analysis software. Maximum intensity projections were created for Ch01 (GFP) and Ch02 (antibody) and all images were corrected to remove background. Nuclei were identified using Ch00 (DAPI) images to create masks by applying adaptive thresholding followed by marker-based watershed. Total cell area was determined using the nucleus mask regions as markers to apply a watershed on Ch03 (CellMask). Cell cytoplasm was determined by subtracting the nuclear masked region from the whole cell mask.

To segment the PRL-array accurately, a linear filter and a Top Hat operator were applied to the Ch01 image to enhance only the dim arrays and all arrays, respectively. Images exhibiting only dim arrays or only bright arrays were used to train three k-means classifiers. The first classifier (DimFilt) was trained on the linear-filtered Ch01 images of only dim arrays (E2-treated). The second and third classifiers (DimTH and BrightTH) were trained on the Top Hat-processed images of dim and bright arrays (E2- and 4HT-treated), respectively. Once training was complete, all three classifiers were applied to every image, and small regions of less than 8 pixels in area were then removed. The DimFilt classifier is sensitive and accurately estimates the area of an array but is prone to false positives. The DimTH classifier has a low false-positive rate but can underestimate the area of the array. Therefore, these two classifiers were combined using morphological reconstruction, whereby arrays detected by DimTH were used as markers for reconstruction of arrays detected by DimFilt. In this way, the area of detected arrays is accurately determined while false-positive arrays are removed. BrightTH accurately detects bright arrays but misses most of the dim arrays; however, the halo of out-of-focus light often found around bright arrays is frequently picked up by the DimFilt classifier. Therefore, a final step removed this halo by morphological reconstruction of dim arrays using bright arrays as markers, and the reconstructed dim arrays were removed from the final array segmentation mask.
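The way the three classifier outputs are combined can be illustrated with a minimal sketch of binary morphological reconstruction. This is not the Pipeline Pilot implementation: the function names `reconstruct` and `combine_array_masks` are hypothetical, masks are assumed to be nested lists of booleans (or 0/1), 8-connectivity is assumed, and the way the final mask is assembled (bright arrays plus non-halo dim arrays) is our reading of the description rather than a documented detail. The removal of regions smaller than 8 pixels is omitted.

```python
from collections import deque


def reconstruct(marker, mask):
    """Binary reconstruction: keep each 8-connected region of `mask`
    that contains at least one `marker` pixel."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the flood fill with marker pixels that lie inside the mask.
    for r in range(rows):
        for c in range(cols):
            if marker[r][c] and mask[r][c]:
                out[r][c] = True
                queue.append((r, c))
    # Grow the seeds through the mask.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and mask[rr][cc] and not out[rr][cc]):
                    out[rr][cc] = True
                    queue.append((rr, cc))
    return out


def combine_array_masks(dim_filt, dim_th, bright_th):
    """Combine the DimFilt, DimTH and BrightTH masks (hypothetical helper)."""
    rows, cols = len(dim_filt), len(dim_filt[0])
    # Step 1: DimTH detections select which DimFilt regions are kept,
    # so the area comes from DimFilt and the specificity from DimTH.
    dim = reconstruct(dim_th, dim_filt)
    # Step 2 (assumption): grow bright arrays through the dim mask;
    # whatever is reached is treated as out-of-focus halo.
    dim_or_bright = [[dim[r][c] or bool(bright_th[r][c]) for c in range(cols)]
                     for r in range(rows)]
    halo = reconstruct(bright_th, dim_or_bright)
    # Final mask (assumption): bright arrays plus dim arrays that are not halo.
    return [[bool(bright_th[r][c]) or (dim[r][c] and not halo[r][c])
             for c in range(cols)]
            for r in range(rows)]
```

In this sketch a DimFilt region with no DimTH detection inside it is discarded as a false positive, and a dim region connected to a bright array is discarded as halo while the bright array itself is retained.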

Cell populations were filtered to achieve a uniform population of cells without cell aggregates, mitotic cells, apoptotic cells, and cellular debris. Applied gates were based upon nuclear area, nuclear circularity and the ratio of nuclear to cell area (cell size ratio). Outlier filtering (99% acceptance based on a Gaussian distribution) was also performed based on the mean nuclear Ch01 (GFP) and Ch02 (antibody staining) signal per cell. Typically, 20–40 cells per field were kept for analysis after filtering. Cell-level features were then averaged across wells, producing well-level features that were used in subsequent high content analysis.
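The outlier filtering and well-level averaging can be sketched as below. This is an illustration under stated assumptions, not the authors' workflow: cells are assumed to be dictionaries with a `well` key and numeric feature keys, the 99% acceptance region is assumed to be two-sided (mean ± 2.576 standard deviations), and the statistics are computed over whatever list of cells is passed in. The function names are hypothetical.

```python
from statistics import mean, stdev

# Half-width, in standard deviations, of a two-sided 99% interval
# of a normal distribution (assumed interpretation of "99% acceptance").
Z_99 = 2.576


def gaussian_outlier_filter(cells, key, z=Z_99):
    """Keep cells whose `key` value lies within mean +/- z * SD."""
    values = [cell[key] for cell in cells]
    mu, sigma = mean(values), stdev(values)
    return [cell for cell in cells if abs(cell[key] - mu) <= z * sigma]


def well_level_features(cells, feature_names):
    """Average each cell-level feature over the cells of every well."""
    by_well = {}
    for cell in cells:
        by_well.setdefault(cell["well"], []).append(cell)
    return {
        well: {name: mean(c[name] for c in members) for name in feature_names}
        for well, members in by_well.items()
    }
```

The filter would be applied once per gated signal (for example, mean nuclear Ch01 and then mean nuclear Ch02) before averaging.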

4.6.2 Cyteseer

Where indicated, the mean fluorescence intensity per nucleus was obtained using Cyteseer automated cell image analysis software (Vala Sciences, San Diego), which provides algorithms for automated analysis of protein expression.

4.7 Statistics

Assay quality was established using the Z′ factor, a dimensionless measurement determined using the following equation:

$$Z' = 1 - \frac{3(\sigma_{p} + \sigma_{n})}{|\mu_{p} - \mu_{n}|}$$

where σ and μ represent the standard deviation and the mean, respectively, of the positive (p) and negative (n) control populations (Zhang et al., 1999). One-way ANOVA followed by post-hoc Dunnett's comparison to the positive control (E2) was used to ascertain significant differences between compound responses. EC50 calculations were carried out in GraphPad Prism using the variable slope model, where Y=Bottom + (Top-Bottom)/(1+10^((LogEC50-X)*HillSlope)). Constraints were set to >0 and <1 for the bottom and top values, respectively.
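As a small worked illustration, the Z′ statistic of Zhang et al. (1999) and the variable slope dose-response model quoted above can be written as two short functions. The function names are hypothetical, and the sketch only evaluates the formulas; it does not reproduce the curve fitting performed in GraphPad Prism.

```python
from statistics import mean, stdev


def z_prime(positive, negative):
    """Z' = 1 - 3 * (SD_pos + SD_neg) / |mean_pos - mean_neg|."""
    spread = stdev(positive) + stdev(negative)
    window = abs(mean(positive) - mean(negative))
    return 1.0 - 3.0 * spread / window


def variable_slope(x, bottom, top, log_ec50, hill_slope):
    """Y = Bottom + (Top - Bottom) / (1 + 10^((LogEC50 - X) * HillSlope)),
    where x is the log10 of the ligand concentration."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - x) * hill_slope))
```

At x equal to LogEC50 the model returns the midpoint between Bottom and Top, which is a quick sanity check on any fitted parameters.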

4.8 HCA classification platform

A glossary of mathematical terms used in this section is provided in section 4.8.4.

4.8.1 Feature selection

We used N-fold cross-validation (glossary term 1) on the control data to identify a set of features useful for distinguishing between low-, medium- and high-dose E2 and 4HT treatments. Each plate in the control set was assigned to a fold. In cross-validation, a classifier is trained on (N – 1) plates (training set) and evaluated on the remaining plate (testing set). This is repeated until each of the N plates has been tested once. For each round of cross-validation, we scaled the features by subtracting the mean of the training set features and dividing by the standard deviation of the training set features, and then normalized each sample by dividing it by its L2-norm. We then performed stepwise discriminant analysis (SDA; glossary term 2) on the training set to remove less informative features (Jennrich, 1977). From the N sets of SDA-selected features (one per cross-validation run), we selected the features that appeared in a majority of the runs. This was implemented in Python 2.6 using a port of the SLIC toolbox.
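The plate-wise folds, the training-set scaling and the majority vote over folds can be sketched as follows. This is a minimal illustration in modern Python rather than the original Python 2.6 code; the function names and the sample layout (dictionaries with a `plate` key) are assumptions, SDA itself is not implemented here, and features with zero variance in the training set are assumed to have been removed beforehand.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev


def plate_folds(samples):
    """Yield (held_out_plate, train, test), holding out each plate once."""
    plates = sorted({s["plate"] for s in samples})
    for held_out in plates:
        train = [s for s in samples if s["plate"] != held_out]
        test = [s for s in samples if s["plate"] == held_out]
        yield held_out, train, test


def fit_scaler(train_rows):
    """Per-feature mean and standard deviation of the training set only."""
    columns = list(zip(*train_rows))
    return [mean(col) for col in columns], [stdev(col) for col in columns]


def scale_and_normalize(row, means, stds):
    """Z-score a sample with training statistics, then divide by its L2-norm."""
    z = [(v - m) / s for v, m, s in zip(row, means, stds)]
    norm = sqrt(sum(v * v for v in z))
    return [v / norm for v in z] if norm > 0 else z


def majority_features(selected_per_fold):
    """Features selected in more than half of the cross-validation runs."""
    counts = Counter(f for fold in selected_per_fold for f in set(fold))
    n_folds = len(selected_per_fold)
    return sorted(f for f, n in counts.items() if n > n_folds / 2)
```

Fitting the scaler on the training plates only, and reusing those statistics for the held-out plate, keeps information from the test plate out of the feature selection step.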

4.8.2 Classification

A radial basis function (RBF) kernel (glossary term 3) support vector machine (SVM) classifier (glossary term 4) was trained on control data using the selected features (Cortes and Vapnik, 1995). Its parameters, C (slack penalty) and g (kernel parameter), were tuned with a grid search. Once trained, the classifier was applied to the testing data, yielding for each sample the probability of belonging to each of the classes the classifier was trained to recognize (these probabilities are determined by the distance between the sample and the classifier’s decision boundaries) (Wu et al., 2004). After all samples were tested, these probabilities were used to assess classification accuracy. This was implemented in Python 2.6 using the LIBSVM 2.9 toolbox.
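To make the two tuned parameters concrete, the sketch below shows the RBF kernel that g controls and a generic grid search over (C, g). It does not call LIBSVM: `score` is a hypothetical callback that would, in practice, train an SVM with the given parameters and return its cross-validated accuracy. The powers-of-two grid is an assumption borrowed from common LIBSVM practice, since the ranges actually searched are not stated.

```python
from itertools import product
from math import exp


def rbf_kernel(x, y, g):
    """K(x, y) = exp(-g * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-g * squared_distance)


def grid_search(score, c_values, g_values):
    """Return (best_score, C, g) for the pair that maximizes score(C, g)."""
    best = None
    for c, g in product(c_values, g_values):
        s = score(c, g)
        if best is None or s > best[0]:
            best = (s, c, g)
    return best


# Assumed search grid: exponentially spaced values of C and g.
C_GRID = [2.0 ** e for e in range(-5, 16, 2)]
G_GRID = [2.0 ** e for e in range(-15, 4, 2)]
```

A larger g makes the kernel more local (similarity decays faster with distance), while a larger C penalizes training misclassifications more heavily, so the two are usually tuned jointly.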

4.8.3 Clustering

Features were scaled and samples normalized using the method described above. Principal component analysis (PCA; glossary term 5) was applied to these features, and the number of resulting components was determined by the percent of dataset variance they captured. Data were clustered using the Euclidean distance (glossary term 6) with various linkage algorithms (centroid, complete, median, single and Ward’s methods). A resampling approach was employed to find the captured variance and linkage algorithm that produced the best clustering. Since each treatment was run in quadruplicate, we randomly sampled (without replacement) three wells per treatment. From this subset we defined treatments by the median of these triplicates, and then performed feature standardization and normalization, PCA, and clustering. This was done 2000 times to produce an ensemble of trees. From this ensemble we computed conditional probability tables that describe the probability of two groups of compounds (including single compounds) clustering together given all other existing groups.
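Two pieces of this procedure are simple enough to sketch directly: choosing the number of principal components from the fraction of variance captured, and drawing one resampled set of treatment profiles (three of the four replicate wells, summarized by the per-feature median). The function names and data layout are hypothetical, and the PCA, hierarchical clustering and conditional probability tables themselves (done by the authors with MDP and SciPy) are not reproduced here.

```python
import random
from statistics import median


def n_components_for_variance(eigenvalues, target_fraction):
    """Smallest number of leading components whose eigenvalues
    capture at least `target_fraction` of the total variance."""
    total = sum(eigenvalues)
    running = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if running / total >= target_fraction:
            return k
    return len(eigenvalues)


def resample_treatment_profiles(wells_by_treatment, rng, n_keep=3):
    """Sample n_keep wells per treatment without replacement and
    summarize each treatment by the per-feature median."""
    profiles = {}
    for treatment, wells in wells_by_treatment.items():
        subset = rng.sample(wells, n_keep)
        profiles[treatment] = [median(col) for col in zip(*subset)]
    return profiles
```

Repeating `resample_treatment_profiles` with a seeded `random.Random` instance (2000 draws in the text) and clustering each draw would give the ensemble of trees from which co-clustering probabilities are estimated.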

PCA and clustering were done in Python 2.6 using the MDP 2.6 and SciPy 0.8 toolboxes.

4.8.4 Glossary of terms

1. Cross-validation

A technique used to evaluate classifier performance and tune classifier parameters. Validation data are split into N folds. (N-1) of these folds are used to define a training set, while the remaining fold is used for testing. The process is run a total of N times so that each of the folds is used once for testing.

2. Stepwise discriminant analysis (SDA)

An iterative approach using feature removal and replacement to select features that are most informative in discriminating between given classes. For more, see (Jennrich, 1977; Huang et al., 2003).

3. Radial basis function (RBF)

A transformation to project features from a non-linear to a linear space so that these features are suitable for support vector machine classification.

4. Support vector machine (SVM) classification

A supervised learning approach used to define a decision boundary between classes of data. For two classes of data, with samples represented by N features, an (N-1)-dimensional hyperplane is defined such that it maximizes the margin (distance) between the plane and the nearest samples from both classes. For M classes of data, an ensemble of M(M-1)/2 pair-wise classifiers can be produced, and a voting method can be applied across this ensemble to produce a classification label for a test sample. One property of SVM classification is that it can handle noisy data by allowing misclassifications during training. The parameter controlling this is the slack penalty. Another property of SVMs is that they define linear hyperplanes. To deal with non-linear data, features can be projected into different spaces (to linearize them) using various transformations, one of which is a radial basis function (Cortes and Vapnik, 1995).

5. Principal component analysis (PCA)

A feature reduction method in which features are projected into orthogonal components. The first component contains the highest variance, and subsequent components capture less. Many of the lower ranked components can be discarded under the presumption that they contain little useful information.

6. Euclidean distance

The geometric distance between two samples. For two samples, S1 and S2, with N features, this is

$$d(S_1, S_2) = \sqrt{\sum_{i=1}^{N} \left(S_{1,i} - S_{2,i}\right)^2}$$

4.9 Western Blotting

ER:PRL-HeLa cells were lysed in Cell Extraction Buffer (Biosource, Invitrogen) + complete protease inhibitor cocktail (Roche) for 10 min and the debris was cleared by centrifugation at 13,400 × g for 15 min at 4°C. The samples were resolved by SDS-PAGE and transferred to nitrocellulose membranes (Bio-Rad). Primary antibodies (ER, Millipore 04–820 and actin, Affinity MA1-744) were diluted in TBS-T buffer (5% non-fat dry milk, 50 mM Tris-HCl, 150 mM NaCl [pH 7.5], 0.1% Tween 20) and added to the membranes overnight at 4°C, followed by incubation with the appropriate horseradish peroxidase-conjugated secondary antibody for 1 hour at room temperature. All proteins were detected with ECL Plus Detection Reagents (Amersham) and visualized by chemiluminescence.

4.10 Time-Lapse Video Microscopy

ER:PRL-HeLa cells were plated onto 35-mm Delta T dishes (Bioptechs) pre-coated with poly-D-lysine for live cell imaging. Imaging was performed with a Zeiss LSM 510 confocal microscope using a 63× objective (NA=1.4). HEPES-buffered media previously gassed in a 5% CO2 incubator was used to replace the existing growth media. Delta T dishes were secured to a stage adapter for temperature control at 37°C (±0.1 degree). A Bioptechs objective-heating collar was also used (also at 37°C). Hormone was applied to the cells and the Delta T dish was covered with a black plastic lid to minimize evaporation.

Supplementary Material


Excellent technical support was provided by MG Mancini, and generous image analysis and data workflow support was provided by TJ Moran (Accelrys, Inc). This work was funded by NIH 5R01DK055622 (MAM), The Susan Komen Foundation KG091198 (FJA), Department of Defense (MAM), NIH K12-DK0830140-02, DJL P.I. (JYN), Keck Foundation (EDJ) and pilot grant and equipment support from the John S. Dunn Gulf Coast Consortium for Chemical Genomics (MAM). The authors' imaging resources were supported by SCCPR U54 HD-007495 (BW O’Malley), P30 DK-56338 (MK Estes), P30 CA-125123 (CK Osborne), and the Dan L. Duncan Cancer Center of Baylor College of Medicine.




  • Amazit L, Pasini L, Szafran AT, Berno V, Wu RC, Mielke M, Jones ED, Mancini MG, Hinojos CA, O’Malley BW, Mancini MA. Regulation of SRC-3 intercompartmental dynamics by estrogen receptor and phosphorylation. Mol Cell Biol. 2007;27:6913–32.
  • Berno V, Amazit L, Hinojos C, Zhong J, Mancini MG, Sharp ZD, Mancini MA. Activation of estrogen receptor-alpha by E2 or EGF induces temporally distinct patterns of large-scale chromatin modification and mRNA transcription. PLoS One. 2008;3:e2286.
  • Bromer JG, Zhou Y, Taylor MB, Doherty L, Taylor HS. Bisphenol-A exposure in utero leads to epigenetic alterations in the developmental programming of uterine estrogen response. FASEB J.
  • Brzozowski AM, Pike AC, Dauter Z, Hubbard RE, Bonn T, Engstrom O, Ohman L, Greene GL, Gustafsson JA, Carlquist M. Molecular basis of agonism and antagonism in the oestrogen receptor. Nature. 1997;389:753–8.
  • Collins-Burow BM, Burow ME, Duong BN, McLachlan JA. Estrogenic and antiestrogenic activities of flavonoid phytochemicals through estrogen receptor binding-dependent and -independent mechanisms. Nutr Cancer. 2000;38:229–44.
  • Cortes C, Vapnik V. Support-Vector Networks. Mach Learn. 1995;20:273–297.
  • Delmas PD, Bjarnason NH, Mitlak BH, Ravoux AC, Shah AS, Huster WJ, Draper M, Christiansen C. Effects of raloxifene on bone mineral density, serum cholesterol concentrations, and uterine endometrium in postmenopausal women. N Engl J Med. 1997;337:1641–7.
  • Dix CJ, Jordan VC. Control of experimental breast cancer by antioestrogenic therapies [proceedings] Br J Clin Pharmacol. 1979;7:431P.
  • Feng Y, Mitchison TJ, Bender A, Young DW, Tallarico JA. Multi-parameter phenotypic profiling: using cellular effects to characterize small-molecule compounds. Nat Rev Drug Discov. 2009;8:567–78.
  • Frigo DE, Duong BN, Melnik LI, Schief LS, Collins-Burow BM, Pace DK, McLachlan JA, Burow ME. Flavonoid phytochemicals regulate activator protein-1 signal transduction pathways in endometrial and kidney stable cell lines. J Nutr. 2002;132:1848–53.
  • Gould JC, Leonard LS, Maness SC, Wagner BL, Conner K, Zacharewski T, Safe S, McDonnell DP, Gaido KW. Bisphenol A interacts with the estrogen receptor alpha in a distinct manner from estradiol. Mol Cell Endocrinol. 1998;142:203–14.
  • Hartig SM, He B, Bruerer B, Loose L, Mancini MA. hSRC-3 promotes early human adipogenesis. J Cell Biology. 2010 In Press.
  • Huang K, Velliste M, Murphy RF. Feature reduction for improved recognition of subcellular location patterns in fluorescence microscope images. Proc SPIE. 2003;4962:307–318.
  • Janicki SM, Tsukamoto T, Salghetti SE, Tansey WP, Sachidanandam R, Prasanth KV, Ried T, Shav-Tal Y, Bertrand E, Singer RH, Spector DL. From silencing to gene expression: real-time analysis in single cells. Cell. 2004;116:683–98.
  • Jennrich RI. Stepwise Discriminant Analysis. Statistical Methods for Digital Computers. 1977;3:76–96.
  • Klein-Hitpass L, Ryffel GU, Heitlinger E, Cato AC. A 13 bp palindrome is a functional estrogen responsive element and interacts specifically with estrogen receptor. Nucleic Acids Res. 1988;16:647–63.
  • Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. IJCAI’95: Proceedings of the 14th international joint conference on Artificial intelligence; Morgan Kaufmann Publishers Inc; San Francisco, CA, USA. 1995. pp. 1137–1143.
  • Kojima H, Katsura E, Takeuchi S, Niiyama K, Kobayashi K. Screening for estrogen and androgen receptor activities in 200 pesticides by in vitro reporter gene assays using Chinese hamster ovary cells. Environ Health Perspect. 2004;112:524–31.
  • Kushner PJ, Agard DA, Greene GL, Scanlan TS, Shiau AK, Uht RM, Webb P. Estrogen receptor pathways to AP-1. J Steroid Biochem Mol Biol. 2000;74:311–7.
  • Metivier R, Penot G, Hubner MR, Reid G, Brand H, Kos M, Gannon F. Estrogen receptor-alpha directs ordered, cyclical, and combinatorial recruitment of cofactors on a natural target promoter. Cell. 2003;115:751–63.
  • Muller WG, Walker D, Hager GL, McNally JG. Large-scale chromatin decondensation and recondensation regulated by transcription from a natural promoter. J Cell Biol. 2001;154:33–48.
  • Nilsson S, Gustafsson JA. Estrogen receptor transcription and transactivation: Basic aspects of estrogen action. Breast Cancer Res. 2000;2:360–6.
  • Perlman ZE, Slack MD, Feng Y, Mitchison TJ, Wu LF, Altschuler SJ. Multidimensional drug profiling by automated microscopy. Science. 2004;306:1194–8.
  • Preisler-Mashek MT, Solodin N, Stark BL, Tyriver MK, Alarid ET. Ligand-specific regulation of proteasome-mediated proteolysis of estrogen receptor-alpha. Am J Physiol Endocrinol Metab. 2002;282:E891–8.
  • Richter CA, Birnbaum LS, Farabollini F, Newbold RR, Rubin BS, Talsness CE, Vandenbergh JG, Walser-Kuntz DR, vom Saal FS. In vivo effects of bisphenol A in laboratory rodent studies. Reprod Toxicol. 2007;24:199–224.
  • Safe SH, Pallaroni L, Yoon K, Gaido K, Ross S, Saville B, McDonnell D. Toxicology of environmental estrogens. Reprod Fertil Dev. 2001;13:307–15.
  • Sharp ZD, Mancini MG, Hinojos CA, Dai F, Berno V, Szafran AT, Smith KP, Lele TP, Ingber DE, Mancini MA. Estrogen-receptor-alpha exchange and chromatin dynamics are ligand- and domain-dependent. J Cell Sci. 2006;119:4101–16.
  • Shiau AK, Barstad D, Loria PM, Cheng L, Kushner PJ, Agard DA, Greene GL. The structural basis of estrogen receptor/coactivator recognition and the antagonism of this interaction by tamoxifen. Cell. 1998;95:927–37.
  • Sommer S, Fuqua SA. Estrogen receptor and breast cancer. Semin Cancer Biol. 2001;11:339–52.
  • Sonneveld E, Jansen HJ, Riteco JA, Brouwer A, van der Burg B. Development of androgen- and estrogen-responsive bioassays, members of a panel of human cell line-based highly selective steroid-responsive bioassays. Toxicol Sci. 2005;83:136–48.
  • Stavreva DA, Wiench M, John S, Conway-Campbell BL, McKenna MA, Pooley JR, Johnson TA, Voss TC, Lightman SL, Hager GL. Ultradian hormone stimulation induces glucocorticoid receptor-mediated pulses of gene transcription. Nat Cell Biol. 2009;11:1093–102.
  • Stenoien DL, Mancini MG, Patel K, Allegretto EA, Smith CL, Mancini MA. Subnuclear trafficking of estrogen receptor-alpha and steroid receptor coactivator-1. Mol Endocrinol. 2000;14:518–34.
  • Szafran AT, Hartig S, Sun H, Uray IP, Szwarc M, Shen Y, Mediwala SN, Bell J, McPhaul MJ, Mancini MA, Marcelli M. Androgen receptor mutations associated with androgen insensitivity syndrome: a high content analysis approach leading to personalized medicine. PLoS One. 2009;4:e8179.
  • Szafran AT, Szwarc M, Marcelli M, Mancini MA. Androgen receptor functional analyses by high throughput imaging: determination of ligand, cell cycle, and mutation-specific effects. PLoS One. 2008;3:e3605.
  • Tice LF. Estrogens: their function, uses and hazards. Part 1. Am Pharm. 1978;18:26–31.
  • Wijayaratne AL, Nagel SC, Paige LA, Christensen DJ, Norris JD, Fowlkes DM, McDonnell DP. Comparative analyses of mechanistic differences among antiestrogens. Endocrinology. 1999;140:5828–40.
  • Woolley CS. Effects of estrogen in the CNS. Curr Opin Neurobiol. 1999;9:349–54.
  • Wu T-F, Lin C-J, Weng RC. Probability Estimates for Multi-class Classification by Pairwise Coupling. J Mach Learn Res. 2004;5:975–1005.
  • Yoon K, Pellaroni L, Ramamoorthy K, Gaido K, Safe S. Ligand structure-dependent differences in activation of estrogen receptor alpha in human HepG2 liver and U2 osteogenic cancer cell lines. Mol Cell Endocrinol. 2000;162:211–20.
  • Young DW, Bender A, Hoyt J, McWhinnie E, Chirn GW, Tao CY, Tallarico JA, Labow M, Jenkins JL, Mitchison TJ, Feng Y. Integrating high-content screening and ligand-target prediction to identify mechanism of action. Nat Chem Biol. 2008;4:59–68.
  • Zhang JH, Chung TD, Oldenburg KR. A Simple Statistical Parameter for Use in Evaluation and Validation of High Throughput Screening Assays. J Biomol Screen. 1999;4:67–73.