Classically, the estrogen signaling system has two core components: cytochrome P450 aromatase (CYP19), the enzyme complex that catalyzes the rate limiting step in estrogen biosynthesis; and estrogen receptors (ERs), ligand activated transcription factors that interact with the regulatory region of target genes to mediate the biological effects of estrogen. While the importance of estrogens for regulation of reproduction, development and physiology has been well-documented in gnathostome vertebrates, the evolutionary origins of estrogen as a hormone are still unclear. As invertebrates within the phylum Chordata, cephalochordates (e.g. the amphioxus of the genus Branchiostoma) are among the closest invertebrate relatives of the vertebrates and can provide critical insight into the evolution of vertebrate-specific molecules and pathways. To address this question, this paper briefly reviews relevant earlier studies that help to illuminate the history of the aromatase and ER genes, with a particular emphasis on insights from amphioxus and other invertebrates. We then present new analyses of amphioxus aromatase and ER sequence and function, including an in silico model of the amphioxus aromatase protein, and CYP19 gene analysis. CYP19 shares a conserved gene structure with vertebrates (9 coding exons) and moderate sequence conservation (40% amino acid identity with human CYP19). Modeling of the amphioxus aromatase substrate binding site and simulated docking of androstenedione in comparison to the human aromatase shows that the substrate binding site is conserved and predicts that androstenedione could be a substrate for amphioxus CYP19. The amphioxus ER is structurally similar to vertebrate ERs, but differs in sequence and key residues of the ligand binding domain. 
Consistent with results from other laboratories, amphioxus ER did not bind radiolabeled estradiol, nor did it modulate gene expression on an estrogen-responsive element (ERE) in the presence of estradiol, 4-hydroxytamoxifen, diethylstilbestrol, bisphenol A or genistein. Interestingly, it has been shown that a related gene, the amphioxus “steroid receptor” (SR), can be activated by estrogens and that amphioxus ER can repress this activation. CYP19, ER and SR are all primarily expressed in gonadal tissue, suggesting an ancient paracrine/autocrine signaling role, but it is not yet known how their expression is regulated and, if estrogen is actually synthesized in amphioxus, whether it has a role in mediating any biological effects. Functional studies are clearly needed to link emerging bioinformatics and in vitro molecular biology results with organismal physiology to develop an understanding of the evolution of estrogen signaling.
Based primarily on evidence from humans and laboratory mammals, it is well established that estrogens play a critical regulatory role in many different life processes beginning in early stages of embryogenesis. The term “estrogen” derives from its first perceived function as a female reproductive hormone, specifically associated with the period of sexual receptivity in female mammals (estrus = Latin oestrus meaning frenzy or gadfly). Although early investigators used the urine of pregnant women to isolate estrone, the first steroid found to have hormonal activity, subsequent studies soon reported the presence of estrogens and the biosynthesis of estradiol, estrone and estriol from small acyclic precursors in both males and females of a wide range of vertebrates from fish to mammals. It is now generally accepted that estrogen not only is required for the normal growth, development and functioning of the reproductive system but also has a critical role in diverse other tissue types and organ systems, including brain, bone, skin, fat, cardiovascular and metabolic systems. Excesses or deficiencies of estrogen are associated with various pathological states, such as breast and prostate cancer and osteoporosis. Environmental chemicals that are estrogen-like in their bioactivity have been implicated in developmental abnormalities and endocrine-disrupting effects in humans and animals. Not surprisingly, factors and mechanisms regulating estrogen production and signal transduction continue to be a matter of intense research interest (reviewed by [2,3]).
Classically, the estrogen signaling system has two core components: cytochrome P450 aromatase, the enzyme complex that catalyzes the rate limiting step in estrogen biosynthesis; and estrogen receptors (ERs), ligand activated transcription factors that interact with the regulatory region of target genes to mediate the biological effects of estrogen. While this viewpoint continues to serve as a valuable template for basic and clinical studies, advances in molecular endocrinology reveal that the complexity and diversity of estrogen physiology is accomplished by multiple signaling modes (endocrine, paracrine, autocrine/intracrine), as defined by the nature, proximity and topographical relationship of aromatase and ER expressing cells; two or more genetically distinct ER subtypes and multiple ER splice variants; diverse other classes of membrane- and nuclear-localized receptors; and an array of different cellular signal transduction pathways (genomic, nuclear-mediated; non-genomic/membrane-mediated)(see section 1.2.1, below).
Fundamental questions remain regarding the evolution of the estrogen-mediated signaling system. What are the evolutionary origins and molecular nature of the core components (aromatase and ER)? Which receptor signal transduction pathway is most ancient? Is the original messenger molecule the endogenously synthesized estrogen we know in vertebrates (estradiol, estrone)? Or did estrogen-like environmental molecules have the earliest signaling role? The basic anatomy, physiology and biochemistry of estrogen signaling have been extensively studied in representatives of all major groups of jawed vertebrates, signifying an ancient and evolutionarily conserved regulatory role. More recently, the structures and phylogenetic distribution of genes encoding aromatase (Figure 1a, [4,5]) and ER (Figure 1b, [6–10]) have been documented, reinforcing the earlier work, but the mechanistic details of estrogen-mediated signaling in organisms that predate the gnathostomes are not entirely clear. One approach to addressing the question is to study the closest invertebrate relatives of vertebrates and to determine precursors of vertebrate-specific molecules and pathways in these organisms. In addition to vertebrates, the phylum Chordata includes two invertebrate groups: urochordates (e.g. the ascidian Ciona intestinalis) and cephalochordates (e.g. the amphioxus of the genus Branchiostoma). In this paper, we briefly review the evolutionary history of the aromatase and ER genes, with a particular emphasis on insights from amphioxus and other invertebrates, and then present new analyses of aromatase and ER in amphioxus.
The critical enzyme for estrogen synthesis is aromatase, a member of the cytochrome P450 (CYP) superfamily of monooxygenase enzymes . The membrane-associated aromatase complex catalyzes the transformation of androgens (androstenedione and testosterone) to estrogens (estradiol and estrone) and is the product of a single CYP19A1 gene in humans. Although most highly expressed in estrogen secreting glandular tissues, such as placenta and gonads, aromatase is expressed in a wide array of other tissue types: brain, fat, bone, pituitary in humans; brain, pituitary, retina in teleost fish. Of these, certain cell/tissue types are competent to transform acyclic precursors stepwise through cholesterol all the way to estrogen (ovary), whereas others are competent in the final aromatization step but are lacking one or more of the earlier enzymes in the steroidogenic pathway. Human placenta, for example, lacks C17,20 lyase (CYP17) and relies on androgen precursors supplied by the fetal adrenal for estrogen production.
The aromatase protein is monomeric and is anchored within the endoplasmic reticulum by a membrane-spanning region of the amino terminus [12,13]. The crystal structure of the human aromatase protein has recently been determined . The 503-residue polypeptide chain folds into 12 major α-helices and 10 β-strands and forms a heme group and adjacent steroid binding site near the geometric center of the protein . This overall folding pattern is similar to other membrane-bound P450s, and several regions show strong sequence conservation including helices H-K, the aromatic region and especially the heme-binding region. Of the conserved helices, the “I-helix” is particularly important because it contains several hydrophobic residues that help to form the catalytic cleft and incorporates a key bend at Pro308 that provides additional space to accommodate a steroid substrate [15,16].
Aromatase activity (for review ) and the CYP19 gene(s) have been well-documented in all major classes of gnathostome (jawed) vertebrates. The CYP19 gene has undergone independent duplications in several lineages, most notably the teleost fish [18,19] and suiform mammals [20,21]. Whereas the teleostean gene duplicates are thought to reflect a whole genome duplication event , the three CYP19 genes of pigs are the result of much more recent tandem duplication events. Duplicate aromatases retain the ability to synthesize estrogens but also exhibit functional differences. Within the teleost fish, duplicated CYP19 genes differ dramatically in their tissue expression patterns [19,23] as well as in their relative affinity for different androgen and inhibitor substrates [24,25] and inducibility by estrogens and xenoestrogens [18,23,26,27]. Similarly, in suiform mammals, duplicated aromatase genes differ in expression patterns, substrate affinity and product formation [20,21]. While humans possess only a single CYP19 gene, expression is regulated by 11 promoters and alternative first exons, which are used in a tissue specific manner [28,29]. Along with the diverse roles played by estrogens, this complexity of aromatase regulation indicates the importance and richness of the estrogen signaling pathway.
Phylogenetic analyses of the CYP superfamily have not revealed close relationships of CYP19 with any other family members [4,30]; thus, it is not currently possible to trace the origin of aromatase activity from ancestral CYPs that served other metabolic functions. CYP19 orthologs have recently been identified within amphioxus [4,5]. However, CYP19 has not been identified within the sequenced genomes of urochordates, echinoderms, or protostomes, nor has it been identified outside of the bilaterian animals [31,32]. Although we cannot rule out the possibility that a recognizable ancestral CYP19-like gene or CYP19 itself was secondarily lost in these groups, the cephalochordate lineage represents the earliest known occurrence of CYP19 to date. In addition to CYP19, amphioxus contains orthologs of other enzymes in the steroidogenic sequence leading to estrogen biosynthesis: CYP17 and 17β-hydroxysteroid dehydrogenase [5,33]. In addition, amphioxus contains CYP11-like genes that, along with some uncharacterized cnidarian and placozoan CYPs, are positioned as an outgroup to the vertebrate CYP11 clade [5,31]. CYP11A catalyzes cleavage of the side chain from the sterol D-ring; side chain cleavage by CYP11A (or a functional equivalent) is necessary for de novo synthesis of steroids. Because the catalytic activities of the amphioxus CYP11-like genes have not been determined and side-chain cleavage has not been documented, it remains unclear whether amphioxus can synthesize steroids from sterol precursors.
Measurements of steroidogenic activity using radiolabeled precursors and steroid-like immunoreactivity in amphioxus are consistent with the molecular studies described above. Aromatase activity in amphioxus was first demonstrated through the conversion of tritiated 19-hydroxyandrostenedione to estrone and estradiol by homogenates of body segments containing gonads . Interestingly, activity was not detected in homogenates of brain or tail segments. Mizuta and colleagues  similarly measured estrogen synthesis by amphioxus ovarian homogenates and documented a suite of steroidogenic conversions. Estrogen synthesis primarily occurred in mature ovarian tissues prior to spawning. Estradiol-like, as well as progesterone- and testosterone-like molecules, have been quantified in amphioxus gonads using radioimmunoassay . Similar to the patterns in aromatase activity, immunoactive estrogen was present in both ovaries and testes, but not in non-gonadal extracts, and concentrations in the ovary were greatest prior to spawning .
In vertebrates, the classical mechanism of estrogen signaling occurs through specific binding of estradiol to ERs, which are encoded by Esr genes. Within the nuclear receptor superfamily, the ERs form a family with two other receptor groups: the estrogen-related receptors (ERRs), and other vertebrate-type steroid receptors (SRs, which include androgen receptors, progesterone receptors, and corticoid receptors). The human genome contains two ERs, ERα (NR3A1, Esr1) and ERβ (NR3A2, Esr2), due to a duplication of the Esr gene early in the vertebrate lineage. Unique among the vertebrates, however, teleost fish have one ERα but two ERβs (ERβa and ERβb).
Like other nuclear receptors, ERs have a modular structure divided into key functional domains (A-F). At the amino terminus, the A/B domains contain the ligand-independent AF-1 activation function. The DNA-binding domain (DBD, C domain) is the most highly conserved region and contains two zinc fingers that enable binding of the ER to specific estrogen responsive elements (EREs) on the DNA. The hinge region (D domain) has a more variable sequence, contains a nuclear localization signal, and enables synergism between the activation functions (AF-1 and AF-2) for full transcriptional activity. At the carboxyl terminus, the ligand binding domain (LBD, E/F domains) is highly conserved and serves to bind ligands, enable dimerization, recruit co-factors and stimulate transcription through the ligand-dependent AF-2 region.
In the absence of ligand, ERs generally occur in complexes with chaperones, such as Hsp90. Upon binding of estradiol or another agonist, ERs dissociate from the chaperones, form homo- or heterodimers, recruit cofactors, bind to DNA and modulate transcription of target genes. Utilization of multiple promoters and alternative splicing creates additional complexity in ER signaling. Eight promoters have been identified for human ERα and two for ERβ, which function in tissue-specific expression [44–47]. Alternative splicing generates an exceptional number of ER isoforms lacking one or more functionally important domains; these variants differ in their expression patterns and functional properties. For example, a human ERβ isoform (ERβcx) truncated at the C-terminus has been reported to heterodimerize with wild-type ERβ and function as a dominant negative [47–49].
In addition to modulating the activity of nuclear receptors, steroids can also stimulate rapid cellular responses which are mediated through membrane-bound receptors [50,51]. With respect to estrogen signaling, rapid effects have been attributed to interactions with classical nuclear ERs that are localized within the cell membrane [52–54] as well as with GPR30, a G-protein coupled receptor . To date, membrane-bound ERs have only been rigorously characterized in mammals and fish [56,57]. Estrogens have been shown to exert similar rapid effects on cell signaling in molluscs ; however, the genes encoding membrane-bound ERs have not yet been identified in invertebrates, and it has not yet been demonstrated that estradiol is the endogenous activator of this receptor.
ERs have been identified and shown to be activated by steroidal estrogens in all classes of vertebrates, including the agnathan sea lamprey . Among invertebrates, homologs to the ERs have been identified in amphioxus [7,33] as well as in molluscs [9,59] and annelids . Previous phylogenetic analyses conducted using a variety of methods (parsimony, likelihood, Bayesian) have shown that chordate ERs (vertebrate and amphioxus) form a clade [7,10] and that the protostome ERs (mollusc and annelid) comprise a sister group [9,10]. In addition, Keay and Thornton  found that this bilaterian ER clade was supported as a sister group to the SRs. In their study, the position of the protostomes ERs was only moderately supported, but much of the observed uncertainty could be attributed to the effects of a long branch associated with the amphioxus SR.
As demonstrated by reporter assays in mammalian cell lines, ERs from amphioxus [6,8,60] and from molluscs [9,59] are not activated by steroidal estrogens. In contrast, ERs from two annelid species bind estrogens with high affinity and activate transcription in response to low doses (EC50 < 10 nM estradiol) of estrogens, although it remains to be determined whether steroidal estrogens are physiological ligands for these annelid receptors. Based on phylogenetic patterns and reconstructions of predicted ancestral receptors, it has been hypothesized that the ancestral ER originated early in the bilaterian lineage and was activated by estrogens ([10,61], but see also [6,31]). One interpretation is that ER activation by estrogens was a property that was lost within the lineage leading to the cephalochordates and that the ER gene per se was lost from echinoderms, urochordates and several protostome lineages.
Within the large nuclear receptor superfamily (48 genes in human, 33 in amphioxus ), the ERs form a family (NR3A) with two other receptor groups: the estrogen-related receptors (ERRs, NR3B), and other steroid receptors (SRs, NR3C, which include androgen receptors, progesterone receptors, and corticoid receptors). Amphioxus has one representative gene in each of these three groups [7,33]. As mentioned above, cell-based reporter assays indicate the amphioxus ER ortholog does not stimulate transcription of ERE-driven reporters or interact with the coactivator SRC-1 in response to estradiol. Somewhat surprisingly (but as hypothesized by Paris and colleagues ), reporter assays indicate that the amphioxus SR stimulates transcription through EREs and AREs (androgen-responsive elements) in response to estradiol and estrone [8,60]. Amphioxus ER and SR share overlapping affinities for DNA binding sites, and reporter assays indicate that ER can competitively repress estradiol-induced signaling by SR  as well as by human ERα and ERβ . Binding of ligands to amphioxus ER was not directly measured in these studies, but limited proteolysis assays suggested that the amphioxus ER is unlikely to bind estradiol or several other ligands for vertebrate ERs . Cell-based reporter assays have been used to screen a variety of ligands (e.g., 3β-androstenediol, resveratrol, enterolactone, diethylstilbestrol ) for their ability to modulate signaling by amphioxus ER, but no functional ligands have been identified. Interestingly, although limited proteolysis assays suggested that the plasticizer bisphenol A can bind amphioxus ER, this ligand did not affect transactivation .
Bridgham and colleagues noted that 11 of the 18 residues that line the ligand-binding pocket of human ERα are altered in amphioxus ER, but only 4 of 18 in amphioxus SR. Through comparison with the human ERα crystal structure, they identified two key substitutions likely to disrupt hydrogen bonding and packing interactions that would normally stabilize the ligand within the binding pocket in a transcriptionally active conformation. They then conducted site-directed mutagenesis, and experimentally demonstrated that the two substitutions (corresponding to amino acids 394 and 404 in the LBD of human ERα) are indeed sufficient to confer repressive activity on the SR.
As part of a long term program of research in this laboratory that focuses on the origin and evolution of estrogen signaling in vertebrates, we sought to obtain insights by studying aromatase and ER in amphioxus. Here we confirm and extend studies cited above, and present new information on CYP19 gene organization, including an in silico model of the aromatase protein.
Amphioxus (Branchiostoma floridae) were purchased from Gulf Specimen Marine Lab (Panacea, FL). Animals were obtained in May, when adults were reproductively active and readily sexed by visualizing the gonads through the transparent body wall. Immediately upon receipt, animals were chilled to 4° C on ice, sexed, and divided into cephalic (anterior to the gonads), caudal (posterior to the gonads), and central (gonad-containing) regions under a dissecting microscope as previously described .
Tissues were used to prepare RNA (as in [18,62]) for cloning and semi-quantitative PCR analysis. For analysis of genomic sequence, DNA was extracted from tail segments of individual amphioxus. Briefly, tissue (250 mg) was incubated overnight at 56°C in 500 µl of lysis buffer (50 mM Tris-HCl [pH 8.0], 5 mM EDTA [pH 8.0], 200 mM NaCl, 1% [w/v] sodium dodecyl sulfate containing proteinase K to a final concentration of 0.1 mg/ml). After addition of 500 µl isopropanol, the sample was centrifuged for 5 min at 3500 rpm at 4°C. The resulting DNA pellet was washed once with 100% ethanol (1 ml) and once with 75% ethanol (1 ml), air dried for 10 min, and resuspended in 30 µl TE buffer (10 mM Tris-HCl/1 mM EDTA).
Using total RNA from ovarian segments and methods previously described in detail for teleostean cDNAs [63,64], amphioxus aromatase and ER cDNAs were amplified stepwise by RT-PCR and 5’- and 3’-RACE. Oligonucleotide primers are shown in Supplementary Table 1. In the case of aromatase, initial primers were designed to target sequences in an in silico P450 aromatase predicted by Nelson. For cloning of ER, initial primer sequences were designed to amplify a portion of the ER detected by bioinformatics queries of the amphioxus whole genome database using the discontinuous megaBLAST algorithm with human ERα (NM_000125) and ERβ (X99101), Aplysia ER (AY327135) and lamprey ER (AY028456). The sequence identified as a putative amphioxus DBD was extended in the 3’ and 5’ directions using an in silico DNA-walking approach in combination with 5’- and 3’-RACE.
For both aromatase and ER, full coding sequences were then amplified as single products, confirming assembly of the cDNA fragments. Deduced aromatase and ER sequences were aligned using Clustal W with sequences previously reported from representative vertebrate taxa (Accession numbers shown in Fig 1 caption). To confirm the phylogenetic relationship of the cloned amphioxus sequences, trees were constructed using Neighbor-Joining and/or maximum likelihood criteria. For Cyp19, the tree was rooted using the human Cyp17 and Cyp21 sequences, which are both members of the Cyp2 clan [30,65]. A maximum likelihood tree was constructed using RAXML  with a WAG matrix (selected by AIC using ProtTest version 2.4 ) and 100 bootstrap replicates. For ER, a Neighbor-Joining tree was constructed in Phylip 3.6  with 1000 bootstrap replicates and a PAM Dayhoff matrix.
Intronic sequence was obtained for the CYP19 gene by PCR amplification of genomic DNA using primers which were specific for sequences in adjacent exons or spanning exon-intron junctions (Supplementary Table 1). 5’-flanking sequence was amplified from genomic DNA using a forward primer targeting genomic sequence and a reverse primer targeting a sequence downstream of putative translational start site in the second exon. Putative cis regulatory elements were identified within the 5’-flanking sequence by comparison with the TRANSFAC database using MATCH with default parameters .
The crystal structure of the human aromatase protein has recently been determined and is available in the Protein Data Bank (PDB code 3EQM). We used the homologous extension program MODELLER [71,72] to generate a model of amphioxus aromatase. After specifying the target sequence (GenBank ID DQ165086.1), the template sequence and structure (PDB code 3EQM), and an alignment of the two sequences, MODELLER was used to automatically build a 3-dimensional protein model containing all non-hydrogen atoms. The model was refined using energy minimization within MODELLER.
The main goal of constructing a model of the amphioxus aromatase was to compare the binding sites of the human and amphioxus proteins. The comparison uses a very sensitive tool called computational solvent mapping [73,74], originally developed for the identification of “hot spots”, i.e., pockets of a protein that bind a variety of small organic molecules. An established experimental approach to finding such hot spots is screening for the binding of fragment-sized organic compounds [75,76]. Since the binding is very weak, it is usually detected by nuclear magnetic resonance (SAR by NMR ) or by X-ray crystallography  methods. The FTMAP solvent mapping algorithm used here is a computational analog of the screening experiments, and has been described previously . FTMAP places molecular probes, small organic molecules containing various functional groups, around the protein surface on a dense grid, finds favorable positions by further search using empirical free energy functions, clusters the low energy conformations, and ranks the clusters on the basis of the average free energy. We used 16 small molecules as probes (ethanol, isopropanol, tert-butanol, acetone, acetaldehyde, dimethyl ether, cyclohexane, ethane, acetonitrile, urea, methylamine, phenol, benzaldehyde, benzene, acetamide, and N,N dimethylformamide). The low energy clusters of different probes are further clustered to identify consensus sites, and the importance of such sites is measured in terms of the probe clusters contained. The sites with the largest number of probe clusters are considered as predictions of binding hot spots. Applications to a variety of proteins show that the probes always cluster in important subsites of the binding site and the amino acid residues that interact with many probes also bind the specific ligands of the protein. 
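The consensus-site step described above can be illustrated with a minimal sketch. This is not the FTMAP implementation: the greedy grouping rule, the 4 Å radius, the function name `consensus_sites`, and the coordinates are all assumptions chosen only to show how probe clusters might be pooled into sites and ranked by the number of probe clusters they contain.

```python
from math import dist


def consensus_sites(cluster_centers, radius=4.0):
    """Group low-energy probe clusters into consensus sites and rank them.

    cluster_centers: list of (probe_name, (x, y, z)) tuples, one per
    low-energy probe cluster (coordinates in angstroms).
    radius: hypothetical grouping distance; not a documented FTMAP parameter.
    """
    sites = []  # each site: {"seed": xyz of first member, "probes": [names]}
    for probe, xyz in cluster_centers:
        for site in sites:
            # Join the first existing site whose seed lies within the radius.
            if dist(site["seed"], xyz) <= radius:
                site["probes"].append(probe)
                break
        else:
            # No nearby site: this cluster seeds a new one.
            sites.append({"seed": xyz, "probes": [probe]})
    # Sites holding the most probe clusters are the predicted hot spots.
    return sorted(sites, key=lambda s: len(s["probes"]), reverse=True)


if __name__ == "__main__":
    # Hypothetical cluster centers, for illustration only.
    demo = [
        ("ethanol", (0.0, 0.0, 0.0)),
        ("acetone", (1.0, 1.0, 0.0)),
        ("phenol", (0.0, 2.0, 1.0)),
        ("urea", (20.0, 0.0, 0.0)),
    ]
    for rank, site in enumerate(consensus_sites(demo), start=1):
        print(rank, len(site["probes"]), site["probes"])
```

Ranking by the count of probe clusters mirrors the criterion stated in the text; the published algorithm ranks and clusters using empirical free energies, which this sketch omits.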
Since the differences in the number of probe clusters that bind to a particular site highlight even very small conformational changes if those affect the size or surface properties of the pocket, mapping is very useful for comparing homologous proteins or different structures of a protein [77–80]. The comparison is based on residue contact fingerprints. To obtain such fingerprints, the non-bonded interactions and hydrogen bonds between all atoms of the computational probes and the individual protein residues are counted using the HBPLUS program .
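A residue contact fingerprint of this kind can be sketched as follows. The sketch assumes that the contacts have already been tabulated (for example, from HBPLUS output) as a list of residue labels, one entry per probe-residue contact, and that residues of the two proteins have been mapped to shared alignment positions. The function names, the overlap score, and the labels are hypothetical and are not taken from the original analysis.

```python
from collections import Counter


def fingerprint(contacts):
    """Turn a list of contacted residue labels into per-residue contact fractions."""
    counts = Counter(contacts)
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


def overlap(fp_a, fp_b):
    """Shared fraction of contacts between two fingerprints.

    Sums the per-residue minima: 1.0 means identical contact profiles,
    0.0 means no contacted position in common.
    """
    residues = set(fp_a) | set(fp_b)
    return sum(min(fp_a.get(r, 0.0), fp_b.get(r, 0.0)) for r in residues)


if __name__ == "__main__":
    # Hypothetical contact lists keyed by alignment position, for illustration.
    human = ["pos10", "pos10", "pos42", "pos57"]
    amphioxus = ["pos10", "pos42", "pos42", "pos99"]
    print(overlap(fingerprint(human), fingerprint(amphioxus)))
```

Normalizing to fractions makes the comparison insensitive to the total number of probe contacts, so two structures mapped with different numbers of retained clusters can still be compared.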
After the identification of the important residues in the binding site, we docked androstenedione to both the human aromatase structure and the homology model of the amphioxus aromatase using version 4.0 of the AutoDock program . AutoDock is a suite of automated docking tools. It is designed to predict how small molecules, such as substrates or drug candidates, bind to a receptor of known 3D structure. The docking is restricted to a 40 Å × 40 Å × 40 Å box, centered at the center of the protein. The box is large enough to enclose the entire ligand binding site. Other parameters are assigned the default values given by the AutoDock program. The protein structure is kept fixed during docking. AutoDock employs a genetic algorithm (GA) for conformational sampling, each GA run resulting in a single docked conformation. We performed 100 individual GA runs, thus generating 100 docked conformations for each complex.
The full length Amphioxus ER was subcloned into a v5-tagged expression vector (pcDNA3.1/nV5-DEST, Invitrogen). A similar expression vector was obtained for the human ERα (pcDNA3.1nv5-hERalpha, ). To assess the ability of amphioxus ER to bind estradiol, amphioxus ER and human ERα proteins were synthesized using the TnT Quick Coupled Reticulocyte Lysate System (Promega). The specific binding of tritiated estradiol ([6,7-3H] estradiol, 45.0 Ci/mmol, Amersham Biosciences) to in vitro expressed ERs was measured using charcoal-based binding assays [84,85]. Briefly, in vitro synthesized proteins were diluted in MEEDGM buffer (25 mM MOPS, 1 mM EDTA, 5 mM EGTA, 0.02% NaN3, 20 mM Na2MoO4, 10% (v:v) glycerol, 1 mM DTT, pH 7.5) containing a mixture of protease inhibitors . To correct for variation in expression efficiency, amphioxus ER was diluted 1:10 and human ERα was diluted 1:20. Aliquots (100 µl) of the diluted proteins were incubated overnight at 4°C with tritiated estradiol in 2.5 µl DMSO. The activity of tritiated estradiol was directly measured in 10 µl from each tube. At the end of the incubation, 30 µl was transferred from each tube in duplicate aliquots to 1.5 ml polypropylene microcentrifuge tubes containing 30 µl of 4 mg/ml dextran-coated charcoal in MEEDGM. Tubes were incubated on ice for 10 min with periodic vortex mixing. The tubes were centrifuged for 2 min at 2000 × g, and activity was quantified in 40 µl of the supernatant by liquid scintillation counting. Nonspecific binding was directly measured using TnT lysate incubated with an empty expression vector . Specific binding of tritiated estradiol to the ERs was calculated by subtracting non-specific binding from total binding. Binding curves were fitted using a one-site binding equation with PRISM software (GraphPad).
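The binding calculation can be sketched in a few lines. Specific binding is total minus non-specific binding, and the one-site equation is B = Bmax·[L] / (Kd + [L]). The fitting routine below is an assumption for illustration and is not the PRISM algorithm: for each trial Kd the best Bmax has a closed form, so the sketch scans Kd on a logarithmic grid and refines around the best value. All numbers in the demonstration are synthetic.

```python
def specific_binding(total, nonspecific):
    """Subtract non-specific binding from total binding, point by point."""
    return [t - n for t, n in zip(total, nonspecific)]


def one_site(ligand, bmax, kd):
    """One-site binding equation: bound = Bmax * [L] / (Kd + [L])."""
    return bmax * ligand / (kd + ligand)


def fit_one_site(ligand, bound, kd_lo=1e-3, kd_hi=1e3, steps=200, rounds=4):
    """Least-squares estimate of (Bmax, Kd) by a refined log-grid scan over Kd.

    kd_lo and kd_hi bound the search and share the units of `ligand`.
    """
    best = None  # (sum of squared errors, bmax, kd)
    for _ in range(rounds):
        ratio = (kd_hi / kd_lo) ** (1.0 / steps)
        for i in range(steps + 1):
            kd = kd_lo * ratio ** i
            frac = [x / (kd + x) for x in ligand]
            # For fixed Kd the model is linear in Bmax, so solve it directly.
            bmax = sum(y * f for y, f in zip(bound, frac)) / sum(f * f for f in frac)
            sse = sum((y - bmax * f) ** 2 for y, f in zip(bound, frac))
            if best is None or sse < best[0]:
                best = (sse, bmax, kd)
        # Narrow the search window to one grid step either side of the best Kd.
        kd_lo, kd_hi = best[2] / ratio, best[2] * ratio
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic example: concentrations in nM, binding in arbitrary units.
    conc = [0.1, 0.3, 1.0, 3.0, 10.0]
    nonspecific = [2.0 * c for c in conc]
    total = [one_site(c, 100.0, 0.5) + n for c, n in zip(conc, nonspecific)]
    bmax, kd = fit_one_site(conc, specific_binding(total, nonspecific))
    print(f"Bmax = {bmax:.2f}, Kd = {kd:.3f} nM")
```

A receptor that does not bind the ligand would give specific binding near zero at every concentration, in which case a fitted Kd is not meaningful.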
Transactivation by amphioxus ER was assessed using a cell-based reporter assay with methods similar to those described by Karchner et al. COS-7 cells (ATCC) were plated (3 × 10⁴ cells/well) in triplicate wells of 48-well plates in phenol red-free MEM (Invitrogen), supplemented with non-essential amino acids, 1 mM sodium pyruvate, 2 mM L-glutamine and 10% charcoal-stripped fetal bovine serum. After 24 hours, cells were transiently transfected using 1 µl Lipofectamine 2000 (Invitrogen) in fresh media along with expression plasmids for an ER (human or amphioxus, 100 ng), a luciferase reporter (3xERE-TATA-LUC, Addgene plasmid 11354, 100 ng) and transfection control (pRL-TK, Promega, 3 ng). The total amount of DNA per well was adjusted to 300 ng through addition of an empty expression vector (pcDNA3.1). Five hours after transfection, cells were treated with vehicle control (0.5% DMSO final concentration), estradiol (1–100 nM), or other potential ligands. Twenty-four hours after transfection, the cells were lysed with passive lysis buffer (Promega), and luminescence was measured using the Dual Luciferase Assay kit (Promega) in a TD 20/20 luminometer (Turner Designs, Sunnyvale, CA). Transactivation in the presence of DMSO and estradiol was measured in three independent experiments. The other compounds were tested in two independent experiments.
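The per-well arithmetic of this assay can be sketched as follows. The filler calculation follows directly from the amounts given above (300 ng total minus 100 ng ER plasmid, 100 ng reporter and 3 ng control). The normalization shown, firefly signal divided by Renilla signal and expressed as fold change over vehicle, is the conventional way to analyze dual-luciferase data and is assumed here rather than stated in the text. Function names and luminescence readings are hypothetical.

```python
from statistics import mean


def filler_dna_ng(total_ng, *plasmids_ng):
    """Amount of empty vector needed to bring the DNA per well up to total_ng."""
    return total_ng - sum(plasmids_ng)


def relative_luciferase(firefly, renilla):
    """Normalize each well's firefly reporter signal to its Renilla control signal."""
    return [f / r for f, r in zip(firefly, renilla)]


def fold_induction(treated_ratios, vehicle_ratios):
    """Mean normalized signal of treated wells relative to vehicle wells."""
    return mean(treated_ratios) / mean(vehicle_ratios)


if __name__ == "__main__":
    # Empty pcDNA3.1 required per well, from the plasmid amounts in the text.
    print(filler_dna_ng(300, 100, 100, 3), "ng empty vector")

    # Hypothetical triplicate readings (arbitrary luminescence units).
    vehicle = relative_luciferase([100.0, 110.0, 90.0], [10.0, 10.0, 10.0])
    estradiol = relative_luciferase([200.0, 220.0, 180.0], [10.0, 10.0, 10.0])
    print(fold_induction(estradiol, vehicle), "fold over vehicle")
```

Dividing by the co-transfected control corrects for well-to-well differences in transfection efficiency and cell number before treatments are compared.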
Semi-quantitative RT-PCR was performed using cDNAs from head, gonadal and tail segments from individual amphioxus. Primer sequences are given in Supplementary Table 1. The PCR reactions utilized Platinum Taq polymerase (Invitrogen) according to the manufacturer’s instructions. PCR conditions were set to approximate the linear range by optimizing the quantity of input template and cycle number. PCR conditions for aromatase were 94°C/5 min, 30 cycles of (94°C/30 s, 50°C/45 s, and 72°C/2 min), followed by 72°C/10 min. PCR conditions for ER were 94°C/5 min, 5 cycles of (94°C/30 s, 43°C/45 s, and 72°C/90 s), then 20 cycles of (94°C/30 s, 50°C/45 s, and 72°C/2 min), followed by 72°C/10 min.
The assembled amphioxus CYP19 cDNA consensus sequence (GenBank Accession number DQ165086) consisted of a single translation initiation site, a 1581 bp open reading frame (ORF) that encoded a predicted protein sequence of 527 aa, and 5′ and 3’ UTR of 5 and 1194 bp, respectively. The 3’-UTR terminated in a polyA tail. Compared with the in silico sequence initially reported by Nelson , our cloned sequence showed 13 overall residue substitutions and a 5 amino acid insertion at the boundary of exons 4 and 5 (amino acid 173, not shown). Two of the differences were within the conserved I-helix domain. Compared with the partial cDNA sequence reported by Castro and colleagues , our sequence contained 3 residue substitutions and a single amino acid insertion (amino acid 373). Our sequence was 88% identical to the B. belcheri sequence (433/492 residues). The amino terminus of the B. floridae CYP19 aromatase is elongated relative to the human and killifish aromatase B sequences and is similar in length to the dogfish and killifish aromatase A sequences. While the B. belcheri sequence is not elongated, the predicted start codon aligns with the second methionine in our B. floridae sequence. Because no 5’-UTR sequence has been reported for B. belcheri, we consider it likely that a portion of the amino terminus has been truncated.
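The sequence statistics quoted above can be checked with simple arithmetic. The short sketch below uses only the numbers given in the text, with hypothetical function names. It confirms that a 1581 bp ORF corresponds to 527 codons, that the UTRs and ORF sum to 2780 bp (excluding the polyA tail), and that 433 identical positions out of 492 round to 88% identity.

```python
def codons(orf_bp):
    """Number of codons in an open reading frame of the given length."""
    if orf_bp % 3:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_bp // 3


def percent_identity(matches, aligned):
    """Percentage of aligned positions that are identical."""
    return 100.0 * matches / aligned


if __name__ == "__main__":
    utr5_bp, orf_bp, utr3_bp = 5, 1581, 1194  # values reported in the text
    print(codons(orf_bp))                     # 527
    print(utr5_bp + orf_bp + utr3_bp)         # 2780
    print(round(percent_identity(433, 492)))  # 88
```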
Phylogenetic analysis confirmed that the amphioxus sequence identified in this study is orthologous to the vertebrate aromatases (Fig 1A), consistent with previously published analyses of amphioxus aromatase conducted using neighbor-joining [4,5] and maximum likelihood methods. The tree topology corresponded with the evolutionary relationship between amphioxus and vertebrates.
The assembled cloned amphioxus ER cDNA (GenBank Accession Number EF554313.1) contained an ORF of 1383 bp, a 5’-UTR of 684 bp, and two 3’-UTR sequences (988 bp and 633 bp). The long and short UTRs overlapped and were essentially identical in sequence at their 5’ ends. Both had polyA tails, suggesting they are products of a single mRNA with alternate polyA addition sites. The ORF of the assembled mRNA encoded a polypeptide of 460 aa, and was amplified, cloned and sequenced. The cloned cDNA had >99% identity when compared to the in silico derived ER cDNA; however, the protein predicted from the genomic sequence (JGI_210589) is missing the entire A/B domain 5’ of residue 83 of our cloned sequence and contains several indels due to incorrectly predicted exon boundaries. Our cloned sequence differed by two amino acids from the sequence reported and characterized by Paris et al.: one in the A/B domain (histidine at residue 33 in our sequence replaced by arginine) and one in the hinge domain (arginine at residue 164 replaced by lysine); both of these differences are conservative substitutions. A phylogenetic tree constructed using our ER sequence was consistent with previously published trees and the evolutionary relationships among taxa (Fig 1B, [6,8,88]).
Through interrogation of the amphioxus genome assembly and cloning of all the B. floridae CYP19 exons and introns, we determined the complete sequence of the gene (GenBank Accession Number HQ115077). Like all other CYP19 genes, the amphioxus CYP19 has nine coding exons, and these are well conserved in size (Figure 2, Table 1). As previously reported for CYP19A1a, the predominant ovarian aromatase in goldfish and zebrafish, the amphioxus CYP19 gene most closely resembles the situation of the human gene in which the PII (ovarian) promoter and untranslated first exon are contiguous with and immediately upstream of the ATG site in exon II [90,91]. In contrast, CYP19A1b, the predominant brain aromatase of teleostean fish, has an untranslated first exon farther upstream and, in this respect, resembles the human ortholog, in which multiple promoters/untranslated first exons located as far as −93 kb from the translation initiation site are alternatively spliced in a tissue-specific manner to a common site in exon II such that the aromatase protein synthesized is identical in all tissues [90,91]. This arrangement suggests that tissue-specific promoters were acquired sequentially during the course of evolution (ovary>brain>placenta). From the ATG in exon II, the amphioxus CYP19 is approximately 7 kb, much smaller than the human CYP19 (30 kb) or either zebrafish CYP19A1a (15 kb) or CYP19A1b (12 kb), due primarily to shorter introns (Figure 2; Table 1). Notably, our experimentally determined intronic sequences, when aligned with the amphioxus whole genome database, had a number of indels and mismatches, most notably a 1300 bp insert in intron III and a 216 bp insert in intron IV at the junction with exon V.
Regulation of CYP19 expression and promoter structure varies considerably among taxa. In contrast to teleost fish, in which aromatase expression in brain and ovary is controlled by two distinct genes and promoters, the CYP19 of humans, other mammals and birds is a single gene with multiple promoters (also see section 3.3.1 above, and legend to Fig. 2). From genomic DNA, we amplified, cloned and sequenced 5’-flanking sequence 1184 bp upstream of the ATG in exon II (GenBank Accession Number HQ010363), which includes a TATA box at −187. Although overall sequence identity in the 5’-flanking region of the different CYP19 genes was low, statistically over-represented motifs corresponding to known cis elements were identifiable. TRANSFAC analysis of the B. floridae 5’-flanking sequence revealed at least six potential transcription factor binding sites, each of which has been identified within the aromatase promoter from other taxa (Table 2). Notably, some forms of aromatase from other taxa (e.g., CYP19a1b expressed predominantly in teleostean brain [19,92] and several human tissue-specific CYP19 promoters [91,93]) can be induced by estradiol exposure through direct ER interactions with estrogen-responsive elements (ERE) or indirect ER interactions with other transcription factors and binding sites. A typical ERE consists of two hexameric half-sites (AGGTCA) in opposite orientation (inverted repeats), separated by three nucleotides. In addition, several nuclear receptors, including ERα, ERRα and SF-1, can bind to ERE half-sites or extended half-sites (TCAAGGTCA, also called ERREs or SFREs). While we did not identify EREs upstream of the B. floridae CYP19, three largely conserved putative ERE half-sites were found within the amphioxus CYP19 promoter (designated by MATCH as ERR and SF-1 binding sites, Table 2). Availability of a putative promoter of the amphioxus CYP19 provides an entry point for studying transcriptional regulation at this key phyletic level.
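The motif definitions given above (a palindromic ERE of two AGGTCA half-sites separated by three nucleotides, and the extended half-site TCAAGGTCA) lend themselves to a simple exact-match scan. The sketch below is illustrative only: it is not the TRANSFAC/MATCH procedure used here, which scores sites against weight matrices and therefore also reports imperfect ("largely conserved") sites, and the function names and test sequence are hypothetical.

```python
import re
from typing import Dict, List

# Consensus ERE: inverted AGGTCA repeats with a 3-nt spacer. The motif is its own
# reverse complement, so scanning the forward strand alone is sufficient.
ERE_FULL = re.compile(r"(?=(AGGTCA[ACGT]{3}TGACCT))")
HALF_SITE = "AGGTCA"
EXTENDED_HALF_SITE = "TCAAGGTCA"  # ERRE/SFRE-type extended half-site

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(motif: str, seq: str) -> List[int]:
    """0-based start positions of every exact match, including overlapping ones."""
    positions, start = [], seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def both_strands(motif: str, seq: str) -> List[tuple]:
    """Matches on either strand, reported in forward-strand coordinates."""
    n = len(seq)
    forward = [(p, "+") for p in find_all(motif, seq)]
    reverse = [(n - p - len(motif), "-")
               for p in find_all(motif, reverse_complement(seq))]
    return sorted(forward + reverse)


def scan(seq: str) -> Dict[str, list]:
    seq = seq.upper()
    return {
        "full_ERE": [m.start() for m in ERE_FULL.finditer(seq)],
        "half_site": both_strands(HALF_SITE, seq),
        "extended_half_site": both_strands(EXTENDED_HALF_SITE, seq),
    }


if __name__ == "__main__":
    # Made-up test sequence, not the amphioxus 5'-flanking region.
    print(scan("GGAGGTCACTGTGACCTAATCAAGGTCAT"))
```

An exact-match scan of this kind can only confirm the absence of a perfect consensus ERE; detecting degenerate half-sites requires a matrix-based tool such as MATCH.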
Key functional domains of our deduced amino acid sequence were aligned with reported CYP19 sequences from the congener B. belcheri and representative vertebrates (Fig. 3). Boundaries of conserved functional domains are as described by Simpson et al. and correspond to the following residues: human (I-helix 294–324, aromatic region 376–398, heme-binding 424–443), B. floridae (I-helix 327–357, aromatic region 399–430, heme-binding 460–475). Comparison among taxa revealed moderate conservation between amphioxus and vertebrate sequences. Relative to B. belcheri, our sequence contained one difference in the aromatase-specific conserved region, and one in the heme-binding region. Compared with the human sequence, the B. floridae sequence exhibited 55% identity (17/31 residues) in the I-helical domain, 56% (18/32 residues) in the aromatic region, and 56% (10/18 residues) in the heme-binding domain. Within these regions, the four residues shown to contact the substrate in the human aromatase (A306, D309, T310, F427; Fig. 4b, 4d) are also predicted to contact the substrate in the amphioxus aromatase (A339, D342, T343, F463; Fig. 4a, 4c, Section 3.5). These four residues are perfectly conserved among all taxa shown.
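Percent identity values of the kind reported for each domain can be recomputed from any pairwise alignment. The following is a minimal sketch that assumes identity is counted over aligned columns in which neither sequence has a gap; other conventions for the denominator exist, and the function name and example strings are hypothetical rather than taken from the alignment in Fig. 3.

```python
from typing import Tuple


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> Tuple[int, int, float]:
    """Return (matches, compared columns, percent identity) for two aligned sequences.

    Columns where either sequence has a gap are excluded from the denominator
    (an assumption; some tools divide by the full alignment length instead).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return matches, len(columns), 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy aligned fragments, for illustration only.
    matches, compared, pct = percent_identity("ACDE-F", "ACDQGF")
    print(f"{matches}/{compared} identical ({pct:.1f}%)")
```

Reporting the match count together with the number of compared columns, as done in the text, makes the denominator explicit and the percentages easy to verify.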
In the alignment generated with MODELLER, the human and amphioxus aromatase sequences are 40% identical overall and, when conservative substitutions are counted, similar at 60% of the amino acid residues. In addition, the identical and similar residues are distributed evenly along the sequence, and there are only 14 residues in gap regions for the sequence of 452 amino acids. Based on this level of sequence conservation, it is expected that a useful model of the amphioxus aromatase can be constructed based on the structure of the human protein.
Figure 4A shows the amino acid residues in the binding site of the resulting amphioxus aromatase model and the position and orientation of androstenedione (shown in grey) obtained by docking. Two observations suggest that the docking is reliable. First, the 100 independent docking runs yielded androstenedione poses that all fall within a single cluster with a mean root mean square deviation (RMSD) of less than 0.8 Å, and all docked structures have very similar interactions with the surrounding residues. Second, to further test the docking algorithm, we also docked androstenedione to the known structure of the human aromatase (Figure 4B). The docked poses from the 100 docking runs formed a cluster with an RMSD of less than 1.2 Å, and the lowest energy docked pose (grey) had an RMSD of less than 1 Å from the androstenedione pose in the X-ray structure (shown in violet). In addition to the similar binding modes, the binding energies obtained in the two docking experiments (−10.7 and −11.3 kcal/mol for human and amphioxus aromatase, respectively) suggest that androstenedione is likely to bind to the human and amphioxus proteins with similar affinity.
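The RMSD values quoted above compare ligand poses atom by atom. A minimal sketch of that calculation is given below, assuming both poses list the same atoms in the same order and are expressed in the same receptor coordinate frame, so that no superposition is applied. The function name and coordinates are hypothetical, and the docking software may additionally account for symmetry-equivalent atoms.

```python
import math
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]


def rmsd(pose_a: Sequence[Coordinate], pose_b: Sequence[Coordinate]) -> float:
    """Root mean square deviation (same units as input) between two poses.

    Atoms are paired by index; the poses are compared in place, without fitting.
    """
    if len(pose_a) != len(pose_b) or len(pose_a) == 0:
        raise ValueError("poses must contain the same non-zero number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


if __name__ == "__main__":
    # Two made-up three-atom poses, the second shifted by 1 Å along z.
    docked = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    reference = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
    print(f"RMSD = {rmsd(docked, reference):.2f} Å")
```

Comparing poses without refitting is the usual choice for docking, because the question is whether the ligand occupies the same position in the binding site, not merely whether it adopts the same conformation.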
Figures 4C and 4D show the percentage of nonbonded interactions between the small molecular probes from the computational solvent mapping and the amino acid residues in the human and amphioxus aromatase, respectively. We consider only the binding site residues within 6 Å of any androstenedione atom. The two fingerprints confirm the conservative character of the mutations in the binding site, and explain why the binding modes of androstenedione are so similar in the two proteins. The site includes 21 amino acid residues that have more than 1% of the nonbonded interaction contacts in one or both structures, but only one of these residues is mutated (from L372 to F404). In addition, as shown for F404 in Figure 4A and for L372 in Figure 4B, these residues interact with the bound androstenedione using backbone atoms rather than their side chains, and hence the substitution does not affect the binding features. Thus, all residues that are critical for the binding of small molecules have been highly conserved during the course of evolution. The conservation is not as strong for the less important residues: among the five positions in the binding site that have less than 1% of the nonbonded interaction contacts, two have been mutated during the course of evolution (I305 to V338 and A307 to G340).
Based on the results described above, there is a remarkable degree of conservation in the predicted structure of the amphioxus and human aromatase proteins despite the approximately 500 million years of divergence between the cephalochordate and vertebrate lineages. While the overall amino acid identity is moderate (40%), binding site residues are highly conserved, and docking results indicate that androstenedione is likely to react within the catalytic site of the amphioxus protein as it does with human aromatase. In this regard, it would be of interest to compare the substrate affinity and catalytic activity of the two aromatase enzymes in the same membrane context.
Paris et al. inferred from limited proteolysis assays that bisphenol A binds the amphioxus ER but other classic ER ligands (estradiol, 3β-androstane-diol, 4-hydroxytamoxifen, diethylstilbestrol, enterolactone, ICI-182780) do not. The limited proteolysis assay indicates the ability of a compound to induce a conformational change in a protein that protects it from trypsin digestion, as is generally observed upon binding of estrogens to the vertebrate ER LBD [6,96]. In this report, we quantified specific binding of radiolabeled estradiol to the human and amphioxus estrogen receptors as a more direct measurement of binding. When expressed in vitro, human ERα specifically bound tritiated estradiol in a saturable manner with high affinity (Fig. 5, Kd = 0.23 ± 0.046 nM). In contrast, no specific binding of estradiol to the amphioxus ER was detected in this assay (Fig. 5A).
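To illustrate what a saturable, high-affinity interaction with Kd = 0.23 nM implies, the sketch below evaluates the standard one-site binding model, B = Bmax·[L]/(Kd + [L]). This is a textbook relationship used here as an assumption; the text does not state which model or software was used to fit the binding data, and the function name is hypothetical.

```python
def specific_binding(ligand_nM: float, kd_nM: float, bmax: float = 1.0) -> float:
    """Specific binding under a one-site model: Bmax * [L] / (Kd + [L]).

    With bmax = 1.0 the result is the fraction of receptor occupied.
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand concentration must be >= 0 and Kd must be > 0")
    return bmax * ligand_nM / (kd_nM + ligand_nM)


if __name__ == "__main__":
    KD_HUMAN_ERA_NM = 0.23  # value reported in the text for human ERα
    for conc in (0.023, 0.23, 2.3, 23.0):
        occupancy = specific_binding(conc, KD_HUMAN_ERA_NM)
        print(f"{conc:>6} nM estradiol -> {occupancy:.0%} of receptor occupied")
```

Under this model half of the receptor is occupied when the free ligand concentration equals Kd, and occupancy approaches saturation at concentrations well above it, which is the behavior described for human ERα and absent for the amphioxus ER.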
When human ERα and amphioxus ER were transiently transfected into COS-7 cells, they produced proteins of the expected size (59 kDa amphioxus, 66 kDa human) with a similar efficiency. As expected, estradiol, bisphenol A, diethylstilbestrol and genistein activated human ERα, and 4-hydroxytamoxifen (an ER antagonist) did not. As shown in Fig. 5B, activation of human ERα by the weak estrogens bisphenol A and genistein was more variable (larger error bars), although this variability was not consistently observed. The amphioxus ER showed no constitutive activity beyond that of an empty expression vector. Transactivation by the amphioxus ER was not increased in the presence of estradiol or the other estrogenic compounds tested (Fig. 5B). These results are consistent with previous studies showing that amphioxus ER is not activated by ligands for the vertebrate ER [6,8,60]. Indeed, a ligand for amphioxus ER has not been identified, although it has been demonstrated that amphioxus ER can serve as a competitive repressor for the hormone-activated SR [8,60].
Semi-quantitative RT-PCR was conducted to examine the expression of aromatase and ER transcripts in different amphioxus body segments (Fig 6A). As previously reported for aromatase enzyme activity, aromatase mRNA expression was limited to central (gonad-containing) segments, and expression was somewhat higher in females. Although ER mRNA was detectable in all three regions, the relative band intensity was tissue-related: expression was highest in gonad-containing segments (ovary > testis) and lower but approximately equal in cephalic and caudal segments (Fig 6B). Overall, these expression patterns are consistent with results from Bridgham et al., who used in situ hybridization to demonstrate that ER and SR are primarily expressed in gonads: ER and SR were co-expressed in oocytes, but in testes SR was broadly expressed and ER expression was more restricted.
The basic requirements of a functional chemical signaling system are (a) a messenger molecule; (b) a cellular receptor for recognition and signal transduction; and (c) a biological response. Results presented here reinforce the view that the cephalochordate amphioxus has the ability to synthesize estrogen, and also has the core molecular elements of a classical vertebrate ER-mediated signal transduction pathway. While modeling and docking studies predict that amphioxus aromatase will bind androgen, the substrate affinity, catalytic activity and other reaction properties of this enzyme remain to be evaluated. In addition, functional differences between vertebrate and amphioxus ERs and SRs indicate that mechanistic differences in estrogen signaling must exist between the two groups. Indeed, evidence that aromatizable substrate is available and that estrogen is actually recognized as a chemical messenger that activates a cellular response in a biologically relevant context remains to be established.
What is clear from our new analyses of the amphioxus CYP19 gene and aromatase protein is the remarkable degree of structural and functional conservation from amphioxus to humans. To place this in an evolutionary timeframe, the ancestral chordate represented by the common ancestor of contemporary vertebrates, amphioxus, and tunicates is estimated to have emerged 500 million years ago (Cambrian period). In view of this ancient history, it is surprising that a recognizable ancestral CYP19 has not yet been found among the CYP genes in invertebrates. Although the possibility that CYP19 was secondarily lost in invertebrates cannot be ruled out, a renewed search using the larval forms of invertebrates and a wider range of species could be productive in illuminating the evolution of this important member of the CYP family of genes.
In itself, conservation of a character, such as the ability to synthesize estrogen, signifies an important adaptive value. Moreover, coexpression of aromatase and ER in the gonads suggests a functional interaction, perhaps a paracrine/autocrine signaling role in regulating seasonal or cyclical gonadal growth as occurs in vertebrates. How can this be accomplished if, as we show here, amphioxus ER does not bind estradiol? One explanation is that estradiol is not a surrogate for the actual amphioxus estrogen. Certainly, many natural steroidal chemicals (estrone, estriol, catechol estrogens) have estrogenic or antiestrogenic bioactivity but differ substantially in their binding properties and spectrum of bioactivities when compared to estradiol, even when tested with mammalian ER. It is worth noting here that aromatization of androgen to estrogen occurs in three hydroxylation steps and accumulation of intermediates such as 19-nortestosterone is substantial with some aromatases (e.g., porcine blastocyst isoform). To our knowledge, these steroids have not been tested with amphioxus ER, although 19-nortestosterone is reported to bind to the mammalian ERβ. Additionally, estrone and estradiol can be further metabolized to a variety of hydroxylated forms (e.g., at C2, C4). Although these estrogens generally do not interact to any extent with mammalian ER, they cannot be ruled out as ligands of the amphioxus ER.
Another way to explain discordance between estrogen synthesis and estrogen action is that the early estrogen signaling system involved ER indirectly, for example, through heterodimerization with another estrogen-activated nuclear receptor (ERR, SR), or through binding with a different class of membrane-associated receptors (GPR30). These, in turn, could activate ER through phosphorylation or other post-translational modification. Additionally, ERs partner in protein-protein interactions with other nuclear factors by which they are tethered to DNA binding motifs including Sp-1 and AP-1 recognition elements. Without testing a variety of reporter constructs, it would be premature to conclude that the amphioxus estrogen/ER complex lacks transactivational activity.
If it can be proven that the role of ER in estrogen signaling in amphioxus is indirect, then it is reasonable to postulate that direct estrogen binding/transactivation of ER is a feature that was acquired secondarily during the course of evolution, concomitant with the ever-increasing complexity of vertebrate organisms. This theory could explain the remarkable diversity and complexity of estrogen signaling pathways in contemporary mammals: genomic/transcriptional; rapid non-genomic/membrane-mediated; ligand- and ERE-dependent and independent (see Introduction). The value of an evolutionary perspective is that it provides a conceptual framework for organizing and analyzing information, thereby revealing common themes, unanswered questions and new hypotheses for testing. At this point we cannot rule out the possibility that endogenously synthesized estrogen is just a metabolic byproduct, or that the ER of adult amphioxus is preadaptive or degenerate. The information presented here offers an entry point for new molecular analysis. A key remaining challenge, however, is to demonstrate that estrogen has biologically relevant effects at this phyletic level.
Supported by grants from the NIEHS P42 ES07381 (GVC, SV) and EPA (STAR-RD831301) (GVC), a Ruth L Kirschstein National Research Service Award (AT, F32 ES013092-01), an NIH traineeship (SS, SG), a NATO Fellowship (AN) and the Boston University Undergraduate Research Program (LC). The human ERα and 3xERE-TATA-luc reporter plasmids were generously provided by Dr. Donald McDonnell.