Chromosome position effects combined with transgene silencing of multi-copy plasmid insertions lead to highly variable and usually quite low expression levels of mini-genes integrated into mammalian chromosomes. Together, these effects greatly complicate obtaining high-level expression of therapeutic proteins in mammalian cells or reproducible expression of individual or multiple transgenes. Here, we report a simple, one-step procedure for obtaining high-level, reproducible mini-gene expression in mammalian cells. By inserting mini-genes at different locations within a BAC containing the DHFR housekeeping gene locus, we obtain copy-number-dependent, position-independent expression with chromosomal insertions of one to several hundred BAC copies. These multi-copy DHFR BAC insertions adopt similar large-scale chromatin conformations independent of their chromosome integration site, including insertions within centromeric heterochromatin. Prevention of chromosome position effects, therefore, may be the result of embedding the mini-gene within the BAC-specific large-scale chromatin structure. The expression of reporter mini-genes can be stably maintained during continuous, long-term culture in the presence of drug selection. Finally, we show that this method is extendable to reproducible, high-level expression of multiple mini-genes, providing improved expression of both single and multiple transgenes.
Plasmid-based expression cassettes using cDNA mini-genes driven by viral or cloned eukaryotic promoters represent the most common method for expression of transgenes. Stable expression is usually achieved by integration of these cassettes into the host eukaryotic genome. However, expression levels of these mini-genes are typically greatly influenced by the chromatin structure surrounding the integration site, producing chromosome position effects, sometimes accompanied by variegation of expression (1). Transgenes integrated into repressive chromatin regions are expressed at low levels and tend to be silenced over time. This effect is particularly pronounced in mammalian cells. A second phenomenon contributing to low levels of transgene expression is multi-copy transgene silencing, observed for most plasmid mini-genes (2,3). In transgene silencing, expression per gene copy tends to decrease with increasing transgene copy number, such that transgene expression levels do not increase proportionally with copy number and very high copy number insertions may express at levels comparable to or even lower than those of single copy insertions.
The combined impact of chromosome position effects and transgene silencing makes typical transgene expression in mammalian cells both unpredictable and unstable, hindering both industrial and clinical applications as well as biomedical research applications. These problems are compounded when cell lines expressing multiple transgenes are required.
As just one example, a number of recombinant proteins are important therapeutic reagents with enormous market value. Mammalian cell culture has been the dominant expression system for therapeutic protein production as it facilitates both proper protein folding and posttranslational modifications (4,5). In the absence, however, of a robust, single-step method for reliable, high-level, multi-copy transgene expression, gene amplification remains the method of choice for obtaining high expressing cell clones (6). This process of gene amplification, in which cell mutants carrying hundreds of copies of an inserted mini-gene are gradually selected, requires repeated rounds of cell selection, subcloning and clone characterization over a period of many months. Even then, selection of amplified cell clones with high-level, stable expression can be difficult and unpredictable, in many cases requiring a year or more for clone development and stabilization.
To improve the efficiency and minimize the unpredictability of transgene expression, various cis-regulatory elements have been used to flank transgenes to maintain accessible chromatin structure and counteract chromosome position effects. Locus control regions (LCRs) (7), insulators (8,9), ubiquitous chromatin opening elements (UCOEs) (10,11), Scaffold/Matrix associated regions (SAR/MARs) (12,13) and antirepressor elements (STAR) (14) have all been shown to improve transgene expression to some degree (15).
Locus control regions were identified as DNA sequences conferring copy-number-dependent, position-independent expression at levels close to those of the endogenous genes in transgenic mice (16). Specifically, this property of copy-number-dependent expression is used as an operational definition distinguishing LCR elements from enhancers. Known LCRs generally function in a tissue-specific manner, restricting their use to certain cell types. In addition, as demonstrated for the β-globin LCR, they are thought to act by interacting with only one promoter at any one moment in time, limiting their usefulness for promoting the expression of multiple transgenes simultaneously (17). A ‘mini-locus’ cosmid construct piecing together the LCR, 3′ β-globin DNase I hypersensitivity site, and other β-globin regulatory sequences showed highly linear, copy-number-dependent β-globin expression over a 1–94 range in cosmid copy number (7). Copy-number-dependent expression has also been reproduced after random insertion of 244 kb or 155 kb yeast artificial chromosomes containing the intact β-globin locus (18). In contrast, a 100-kb genomic DNA region containing the entire β-globin locus and LCR on a bacterial artificial chromosome (BAC) produced variable position effects in transgenic mice, with BAC transgenes expressing anywhere from 0% to 105% the level of the endogenous genes, with most lines expressing at ~1/3 the level of the endogenous genes (19). To reduce the original cosmid mini-LCR to a size allowing the use of more traditional cloning vectors, including viruses, ‘micro-LCRs’ have been constructed using various combinations of the small DNA regions corresponding to the DNase I hypersensitivity sites of the LCR.
These micro-LCRs are necessary but not sufficient for conferring chromosome position-independent expression, with additional and different regulatory regions present in the introns and intergenic regions of specific globin genes, possibly arranged in a specific spatial pattern, required to confer copy-number-dependent expression (20–24). For all of these reasons, use of LCRs in cloning vectors as a general tool for conferring high-level protein expression or multi-gene transgenesis may be limited.
Insulators are another important class of cis-regulatory elements used to shield transgenes from chromosome position effects created by spreading of activating or repressive chromatin marks from regions flanking the transgene insertion site (25). Most published studies using insulator sequences have not included careful analysis of the degree to which these sequences confer copy-number-dependent expression (25,26). In at least one case where such analysis was done, the chicken HS4 insulator was not sufficient to confer copy-number-dependent transgene expression in B cells (27). Copy-number-dependent expression in this system required additional, LCR-like cis-regulatory sequences in the transgene together with the insulator sequence. Most of the other cis-regulatory elements described above also are not sufficient to confer both position-independent and copy-number-dependent transgene expression.
Here we describe a new transgene expression system based on embedding mini-gene constructs within large, cloned mammalian genomic DNA regions. When these DNA constructs are integrated into the mammalian genome they appear to create a reproducible, favorable chromatin environment for transcription independent of their chromosome integration site. We have implemented this system using bacterial artificial chromosomes (BACs) as the cloning vector. Using GFP and mRFP reporter mini-gene constructs, we show that by inserting these reporter genes into a BAC containing the mouse DHFR locus we obtain position-independent, copy-number-dependent expression of reporters in cell clones carrying one to several hundred copies of the BAC. This contrasts with the typical position-dependent, copy-number-independent, and lower expression values observed after directly transfecting cells with the very same mini-genes, with or without flanking chicken HS4 core insulator sequences.
Therefore, in a single stable transformation we can isolate cell clones expressing mini-genes amplified up to several hundred fold. These cell clones show stable chromosome karyotype, transgene copy number, and expression over many passages in the presence of selection. Moreover, we also demonstrate position-independent, copy-number-dependent, simultaneous expression of both RFP and GFP reporter genes inserted within a single BAC construct, showing the ability to use this method for multi-gene transgenesis. This BAC TG-EMBED (BAC TransGene EMBEDded) method provides a new methodology for single and multi-transgene expression capable of facilitating a wide range of applications.
Plasmid p[CMV-mRFP-Zeo] (pCRZ) was created from the EZ-Tn5™ pMOD-2<MCS> transposon construction vector (Epicentre Technologies). The SV40-zeocin fragment was cut from pSV40/Zeo2 (Invitrogen) and cloned into the SacI and HindIII sites of pMOD-2 vector to generate an intermediate construct p[Zeo]. The CMV-mRFP DNA fragment was obtained by polymerase chain reaction (PCR) amplification (forward primer, 5′-CGA GCT CTG AGC TAT GAG AAG CGC CA-3′; reverse primer, 5′-CCC TCG AGT GCC GAT TTC GGC CTA TTG GTT-3′) from vector pmRFP (28), digested with SacI and XhoI, and cloned into p[Zeo] to generate pCRZ.
Plasmid pBSKS-cHS4 (29), a gift from Dr Peter Jones (University of Illinois, Urbana-Champaign), contains two copies of a 2 copy cHS4 core element (25) cloned into BamHI–NotI digested pBluescript KS (+) (Stratagene). The CMV-mRFP-Zeo expression cassette was cut from pCRZ by PshAI digestion and cloned into the AflII site of pBSKS-cHS4 to generate pcHS4-CRZ or into the AflIII site of vector pEGFP-N1 (Clontech) to generate plasmid pGNRZ.
A 256-mer lac operator direct repeat and kanamycin selectable marker was inserted into mouse DHFR BAC 057L22 (CITB mouse library) using a Tn5 transposon as described previously (30). BAC Clone 057-k8.32-C29 containing this lac operator transposon inserted 75 kb downstream of the Msh3 transcription initiation site was selected for subsequent engineering. The [CMV-mRFP-Zeo] transposon generated by PshAI digestion of pCRZ was inserted into 057-k8.32-C29 using Tn5 transposase (Epicentre Technologies) following the manufacturer’s directions and bacterial clones were selected using 25 μg/ml zeocin. Primer pMOD-Seq-For (5′-GCC AAC GAC TAC GCA CTA GCC AAC-3′) was used to sequence the [CMV-mRFP-Zeo] transposon insertion site; BAC clones C4 and C27 with the [CMV-mRFP-Zeo] transposon inserted into nucleotides 117 695 and 23 426 of Msh3 gene, respectively, were selected for use in these experiments (Supplementary Figure S1).
DHFR BAC 057-GN-RZ carrying both EGFP and mRFP reporters was constructed by λ Red-mediated homologous recombination (31). An EGFP-Kan/Neo DNA fragment flanked by homology arms was created by PCR from vector pEGFP-C1 (Clontech Laboratories, Inc.) using 60-bp primers (forward, 5′-ATG AAT GCA CAT CTG TAC ATG CAT TAT TCA TTG TTC TAT GTT TTT GTG ATG CTC GTC AGG-3′; reverse, 5′-ATT ATA CCA AGA GCA ACT TCA GAA TAA GTT TCC TAG AAT TGG TGG GGA AAA GGA AGA AAC-3′). Each primer contained a 17-bp template homology sequence and a 43-bp BAC target site homology sequence. Recombination procedures followed standard protocols (31) and recombinants were selected with 20 μg/ml kanamycin. BAC clone 057-GN carrying the EGFP-Kan/Neo DNA fragment was chosen for further recombineering. To insert RFP, a 1.3-kb homology region flanking the BAC target site was PCR amplified (forward primer, 5′-GGA ATT CGT TTA AAC CAT GGG TAC TTG GGA GCA CT-3′; reverse primer, 5′-GGA ATT CGT TTA AAC AAA ACA CAT CTG CCC AGG TC-3′) and cloned into EcoRI digested pMOD-2<MCS>. The mRFP expression cassette was cut from pCRZ using PshAI and inserted into the homology region, within pMOD-2<MCS>, at the EcoRV site. The mRFP-Zeo plus homology region fragment generated by PmeI digestion was used for a second recombination reaction followed by 20 μg/ml kanamycin and 25 μg/ml zeocin selection.
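The primer layout described above (43 bp of BAC homology plus 17 bp of template homology in a 60-bp oligonucleotide) can be checked with a short sketch. It assumes the usual recombineering arrangement in which the BAC homology arm sits at the 5′ end and the template-priming sequence at the 3′ end; the text gives only the two lengths, so this ordering is an assumption, and the function name `split_primer` is hypothetical.

```python
# Forward recombineering primer as quoted in the text (5' to 3').
FORWARD = (
    "ATG AAT GCA CAT CTG TAC ATG CAT TAT TCA TTG TTC TAT GTT TTT GTG "
    "ATG CTC GTC AGG"
)


def split_primer(primer, arm_len=43, priming_len=17):
    """Split a primer into (BAC homology arm, template-priming part).

    Assumption (not stated in the text): the arm occupies the 5' end and
    the priming sequence the 3' end of the oligonucleotide.
    """
    seq = primer.replace(" ", "").upper()
    if set(seq) - set("ACGT"):
        raise ValueError("primer contains non-ACGT characters")
    if len(seq) != arm_len + priming_len:
        raise ValueError(
            f"expected {arm_len + priming_len} nt, got {len(seq)} nt"
        )
    return seq[:arm_len], seq[arm_len:]


arm, priming = split_primer(FORWARD)
print(len(arm), len(priming))
```

Running the same check on the reverse primer is a quick way to catch transcription errors in an oligonucleotide order before synthesis.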
To construct BACs containing reporter genes EGFP at a fixed position and mRFP at different locations, the [CMV-mRFP-Zeo] transposon was used to randomly transpose mRFP into BAC 057-GN as described above, generating BAC 057-GN-RZ clones 1–9. Transposon insertion sites and directions for clones were mapped by DNA sequencing. BAC clones 057-GN-RZ-1, 2, 3, 4, 6 and 9 with the transposon insertions into nucleotides 42769, 30688, 27130, 73522, 104954 and 28045 of the Msh3 gene, respectively, were selected for transfection into NIH 3T3 cells.
After each BAC modification, the integrity of the BAC constructs and the length of the lac operator repeat were checked by a HindIII and XhoI digest. Observed gel band sizes were compared with the predicted digestion patterns generated by the Gene Construction Kit software (Textco BioSoftware). Only BAC clones showing the correct digestion pattern and a full-length lac operator repeat were chosen for future modifications. BAC DNA for transfection into mammalian cells was prepared with the Qiagen Large Construct Kit.
NIH 3T3 cells (CRL-1658, ATCC) were grown in Dulbecco’s modified Eagle medium (Invitrogen) plus 10% Bovine Growth Serum (HyClone). pCRZ was digested with PshAI and pcHS4-CRZ digested with EcoRV and NotI. DNA fragments containing the CMV-RFP-Zeo expression cassette were gel purified prior to transfection. EcoRV linearized pGNRZ was used for transfection. BAC clones C4 and C27 were linearized with SrfI and BAC 057-GN-RZ was linearized with BsiWI prior to transfection. The linear integrity of all BACs after restriction digestion was confirmed by PFGE. All DNA fragments were ethanol precipitated and transfected into NIH 3T3 cells using Lipofectamine 2000 (Invitrogen) according to the manufacturer’s directions. Mixed clonal populations of stable transformants were obtained after 3 weeks of zeocin (75 μg/ml) selection; individual subclones were obtained by serial dilution.
A lentivirus expression construct (Paul Sinclair, Andrew Belmont) was used to transiently express EGFP-LacI. EGFP-LacI binding to the lac operator repeats localized on each BAC copy provided an outline of the large-scale chromatin conformation of the BAC transgene array in well-preserved nuclei. Three days after lentivirus transduction, cells were plated on coverslips, fixed with 1.6% paraformaldehyde (Polysciences) in calcium, magnesium free PBS buffer and stained with 0.2 μg/ml DAPI (Sigma-Aldrich). BAC transgene array large-scale chromatin conformation was also visualized by two-color DNA fluorescence in situ hybridization (FISH; described below). Deconvolution wide field light microscopy was carried out as described previously (28).
Reporter gene expression levels were measured on a MoFlo flow cytometer (Cytomation) using 584 and 488 nm laser excitation for mRFP and EGFP, respectively. Emission filters centered at 607 and 507 nm were used for mRFP and EGFP, respectively. Rainbow fluorescent beads RFP-30-5A (Spherotech, Inc.) were used for calibration of both mRFP and EGFP fluorescence. Untransfected NIH 3T3 cells were used to establish background fluorescence levels. The linear fitting of mean RFP expression level versus transgene copy number for each group of clones was performed using Microsoft Excel, fixing the y-intercept, a, to the fluorescence background level of non-transfected cells. The correlation coefficient R² when the y-intercept is fixed is defined as R² = bb′, where b is the least-squares slope of expression level on copy number with the intercept fixed at a, and b′ is the corresponding slope of copy number on background-subtracted expression level.
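The fixed-intercept fit can be illustrated with a minimal sketch. The text states only that R² = bb′; the explicit sums used here for b and b′ (the through-origin least-squares slopes of background-subtracted expression on copy number, and of copy number on background-subtracted expression) are an assumption, as is the function name `fixed_intercept_fit`.

```python
def fixed_intercept_fit(copy_number, expression, a):
    """Fit expression = a + b * copy_number with the intercept a fixed.

    Returns (b, r_squared), where r_squared = b * b_prime.
    Assumed definitions (not given explicitly in the text):
        b       = sum(x * (y - a)) / sum(x ** 2)
        b_prime = sum(x * (y - a)) / sum((y - a) ** 2)
    """
    if len(copy_number) != len(expression) or not copy_number:
        raise ValueError("inputs must be non-empty and of equal length")
    dy = [y - a for y in expression]          # background-subtracted signal
    sxy = sum(x * d for x, d in zip(copy_number, dy))
    sxx = sum(x * x for x in copy_number)
    syy = sum(d * d for d in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("degenerate data: all x or all (y - a) are zero")
    b = sxy / sxx
    b_prime = sxy / syy
    return b, b * b_prime


# Hypothetical values for illustration only (not data from this study).
slope, r2 = fixed_intercept_fit([1, 2, 3], [12.0, 14.0, 16.0], a=10.0)
print(slope, r2)
```

Under these definitions R² reduces to (Σx(y − a))² / (Σx² · Σ(y − a)²), which equals 1 only when the background-subtracted expression is exactly proportional to copy number.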
Genomic DNA from individual clones was isolated using the Purelink Genomic DNA mini-kit (Invitrogen) and DNA concentration was measured with the Fluorescent DNA Quantitation kit (Bio-Rad). The BAC or plasmid transgene copy number was determined by real-time quantitative PCR using a SYBR-Green PCR master mix (Applied Biosystems) and the iCycler machine (Bio-Rad). Primers used to measure BAC copy number (5′-GAA CTG CCT CCG ACT ATC CA-3′ and 5′-CGA GGA GCT CAT TTT CTT GC-3′) amplify a 106-bp region within DHFR exon 4. Primers used to measure plasmid copy number (5′-ATG AGG CTG AAG CTG AAG GA-3′ and 5′-GTC CAG CTT GAT GTC GGT CT-3′) amplify a 114-bp region within the mRFP coding region. Standard curves were measured using serial dilutions of a plasmid containing both mRFP and DHFR (pSV2-DHFR-CRZ).
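A standard-curve calculation of this kind can be sketched as follows. This is a generic illustration of absolute quantification from a plasmid dilution series, not the authors' analysis: the threshold-cycle (Ct) values are hypothetical, the function names are invented for the example, and the conversion from template copies to copies per cell (which depends on the amount of genomic DNA loaded and on the endogenous DHFR contribution) is deliberately left out because the text does not describe it.

```python
import math


def fit_standard_curve(template_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    logs = [math.log10(c) for c in template_copies]
    n = len(logs)
    if n < 2 or n != len(ct_values):
        raise ValueError("need at least two matched dilution points")
    mean_x = sum(logs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in logs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(logs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template copies in a sample."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical dilution series of the standard plasmid (illustration only).
std_copies = [1e3, 1e4, 1e5, 1e6]
std_ct = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(std_copies, std_ct)
estimate = copies_from_ct(26.7, slope, intercept)
print(slope, intercept, estimate)
```

Because the standard plasmid carries both the mRFP and DHFR amplicons, the same dilution series can serve both primer pairs, with one curve fitted per amplicon.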
Metaphase spreads were prepared and FISH was performed according to standard protocols (28). For mapping BAC transgenes, a biotin labeled probe was prepared from BAC 057L22 DNA using the BioNick Labeling Kit (Invitrogen). For simultaneous detection of the BAC transgene and centromeric sequences in NIH 3T3 clone C4-10, the digoxigenin BAC probe was hybridized to metaphase chromosomes together with a biotin labeled pan-centromeric chromosome paint (Cambio). For two-color 3D FISH, a plasmid pSV2-DHFR-8.32 containing the 256-mer lac operator repeat (32) was used to generate a biotin labeled lacO probe and the 057L22 DNA was used to generate a digoxigenin labeled BAC probe using the BioNick Labeling Kit and digoxigenin conjugated dUTP (Roche Applied Science). Biotin and digoxigenin labeled probes were detected with Alexa-594 tagged streptavidin (Invitrogen) and fluorescein conjugated anti-digoxigenin antibody (Roche Applied Science), respectively.
BACs can carry 100–300 kb of eukaryotic genomic DNA insertions (33) and are therefore large enough to contain not only the coding regions of genes but also many of their distal regulatory elements. In transgenic animals, genes contained within large genomic BAC inserts usually express within several fold of the endogenous gene expression level and show similar tissue expression patterns (19,34–36). We hypothesized that a BAC containing a housekeeping gene locus would assemble into a large-scale chromatin structure permissive for transcription when integrated into the chromosome of any mammalian cell line. Furthermore, we predicted that insertion of typical mini-gene constructs anywhere within this open large-scale chromatin environment would allow high level, reproducible mini-gene expression, independent of the BAC chromosomal insertion site.
To test this idea we used Tn5-mediated transposition to insert a reporter gene cassette into a BAC with a 175-kb mouse genomic DNA insert containing the full-length DHFR gene and most of the Msh3 gene. The DHFR gene is expressed in proliferating cells with a significant upregulation seen at the late G1/early S-phase boundary (37). Expression decreases in quiescent fibroblasts and after differentiation of myoblasts into myotubes (38,39). The DHFR and Msh3 genes are transcribed from a divergent promoter. In NIH 3T3 cells both the DHFR and Msh3 genes were shown to have similar, growth-regulated expression patterns (40), whereas in rat embryonic fibroblasts and human fibroblasts the expression of Msh3 was reported to be constitutive (41).
To help visualize the large-scale chromatin structure of the BAC transgene, we inserted a Tn5 transposon containing a 256-mer lac operator direct repeat plus a kanamycin selectable marker (30). We then used Tn5 transposition again to insert a reporter gene cassette containing an mRFP mini-gene driven by a CMV promoter plus a prokaryotic/eukaryotic zeocin selectable marker (Figure 1a, Supplementary Figure S1).
A BAC with the lac operator transposon inserted within the Msh3 gene was chosen as the template for transposition with the second reporter gene transposon. Two BAC clones with insertions of the reporter gene transposon within different Msh3 gene introns were selected for further analysis (Figure 1b, Supplementary Figure S1). In clone C27, the reporter gene is transcribed in the same direction as the Msh3 gene, while in clone C4 the reporter gene is transcribed in the opposite direction. Reporter gene insertion sites of both clones were more than 50 kb away from the lac operator repeat to minimize possible silencing effects from repetitive DNA. Linearized BAC DNA was transfected into mouse NIH 3T3 cells and stable transformants were selected using zeocin. As controls, we transfected just the reporter gene cassette alone, with and without flanking insulator sequences [two copies of the chicken β-globin core HS4 insulator (25); Figure 1a]. Flanking mini-genes with this core insulator sequence previously was shown to reduce chromosome position effects (42,43).
After 3 weeks of selection, reporter gene expression in mixed clonal populations of stable transformants was measured for each DNA construct using flow cytometry (Figure 1c). Flow cytometry of nontransfected cells established the fluorescence background levels. A large fraction of cells transfected with the reporter cassette fragment alone showed no detectable fluorescence above background levels. A broad fluorescent intensity distribution biased toward low expression levels with a long tail extending to higher fluorescence values was observed. Cells transfected with the reporter gene cassette flanked by insulator sequences showed a somewhat decreased fraction of non-fluorescent cells. The distribution was more symmetric with an increased fraction of fluorescent cells.
Strikingly, the reporter gene cassette embedded in either location within the DHFR BAC resulted in essentially 100% of cells with fluorescence values above background levels. Moreover, the intensity modes for the fluorescence distributions corresponding to the reporter cassette embedded within the BACs were 1–2 orders of magnitude higher than that observed for the reporter cassette alone or with flanking insulators.
We next examined variability of reporter gene expression within individual cell clones isolated from these mixed clonal populations (Figure 1d–f). Flow cytometry was performed on two clones with comparable mean reporter gene expression levels from each cell population. Both cell clones with the reporter gene embedded within the BAC transgene showed noticeably more homogeneous expression (Figure 1f), as compared to cell clones containing just the reporter gene cassettes with (Figure 1e) or without (Figure 1d) flanking insulators. In particular, the plasmid-based reporter genes produced clones with very broad intensity distributions, with long tails skewed towards lower fluorescence values. Notably, this tail of low-expressing cells, indicative of variegated expression within individual cell clones, was absent in cell clones carrying the BAC transgenes.
A key test of transgene protection from chromosome position effects is to verify copy-number-dependent, position-independent expression. We used qPCR to determine transgene copy number in 7–12 individual cell clones for each transfected DNA construct. We then plotted mean expression levels, measured by flow cytometry, versus mini-gene copy number for each clone (Figure 2a–c). As expected, the reporter gene cassette alone shows poor linearity of reporter gene expression with copy number, indicative of copy-number-independent expression (Figure 2a, correlation coefficient R² = 0.153, fixing the y-intercept to the fluorescence background level of non-transfected cells). Improved linearity is observed for the reporter gene cassette construct flanked by insulators (Figure 2b, R² = 0.657).
However, there is a strikingly increased linearity for the reporter gene cassette embedded at either location within the BAC. In fact, cell clones derived from BACs containing the reporter gene inserted at either of two locations within the BAC showed nearly identical expression levels per BAC copy (Figure 2c). The net correlation coefficient calculated using data from all 15 clones (Figure 2c, R² = 0.930) demonstrates a markedly increased copy number dependence produced by embedding the reporter gene within the BAC.
Position-independent expression of the transgenes embedded in the BAC is implied by the observed copy number dependence, assuming each clone has an independent chromosome insertion site. We verified different chromosome integration sites for a handful of clones using mitotic chromosome spreads and FISH, revealing single insertion sites for multi-copy BAC integrations, including clones with hundreds of BAC copies (Figure 2d). In contrast to the high degree of chromosome instability observed with gene amplification, all clones showed a single chromosome integration site and uniform size of the multi-copy BAC insertions. Insertions were mapped cytologically both to the middle and ends of chromosomes, including insertions cytologically close to the telomere (Figure 2d, top left) and even one which inserted within the centromeric heterochromatin as confirmed by two-color FISH using a DHFR BAC probe together with a pan-centromeric probe (Figure 2d, top right, and Figure 2e), yet all showed position-independent expression proportional to BAC copy number.
The design of the new transgene expression system was motivated by our hypothesis that large mammalian genomic inserts cloned within BACs would create a BAC specific, large-scale chromatin conformation independent of chromosome insertion site. Previous work from our laboratory characterizing two different CHO (Chinese Hamster Ovary) cell lines containing multiple integrated copies of the same DHFR BAC used in this study revealed a large-scale chromatin fiber-like conformation similar to surrounding euchromatic chromosome regions (28).
To examine the large-scale chromatin structure of the multi-copy insertions of the DHFR BAC containing the mRFP reporter gene in a number of independently derived NIH 3T3 cell clones, we performed two-color 3D FISH with both lac operator and DHFR BAC probes (Supplementary Figure S2). Lac operator FISH signals appeared as arrays of dot-like structures while the BAC FISH signals produced more continuous staining patterns. Particularly for cell clones with smaller copy number BAC insertions, these lac operator FISH signals frequently showed linear configurations, with the BAC FISH signal forming a more continuous, fiber-like conformation. All NIH 3T3 cell clones examined showed similar FISH patterns.
To avoid possible changes in structure induced by DNA denaturation during the FISH procedure, in the cell clones characterized in Figure 2c and d we transiently expressed an EGFP-lac repressor (dimer)-NLS fusion protein to visualize the lac operator repeats in these cells (32). Interphase nuclei showed string-like chains of GFP stained dots similar in conformation for each clone (Figure 3), including in cells from a clone with the BAC copies inserted within centromeric heterochromatin (Figure 2). Further characterization of these cell clones has revealed a similar spacing between GFP spots (0.33–0.4 μm) and a similar ratio of number of BAC copies to GFP spots (Paul Sinclair, Qian Bian, Andrew Belmont, unpublished data), suggesting similar large-scale chromatin compaction levels in all cell lines independent of the chromosome integration site.
To test the long-term stability of transgene arrays formed by chromosomal integration of multiple BAC copies, we continuously passaged three NIH 3T3 cell clones (C4-2, C27-6 and C27-13) over a 63-day period, replating cells every three days in the presence of drug selection (75 μg/ml Zeocin). Expression levels of mRFP were monitored by flow cytometry after 18, 43 and 63 days of passaging. Expression levels remained at ~80–100% of the starting values for all three clones over this time period, demonstrating stable reporter gene expression with continuous, long-term culture (Figure 4a). After 63 days, we also examined mitotic chromosome spreads by FISH to assay the stability of the chromosomal BAC transgene arrays. For all three clones examined, the BAC transgene arrays showed no detectable chromosomal rearrangements, with the chromosomal location and size of the BAC array unchanged, as assayed by the distinctive appearance of the chromosome carrying the BAC insertion and the ratio of the chromosomal length of the BAC transgene array to the total chromosome arm length (Figure 4b).
In parallel, the same cell clones were passaged for 60 days in the absence of drug selection. In this case, mRFP expression levels dropped by 30–80% from the starting levels for the three clones examined (Supplementary Figure S3a). Silencing was not due to loss of BAC transgene copies as mitotic spreads demonstrated that the chromosome location and size of the BAC transgene array remained constant over this same time period (Supplementary Figure S3b). Silencing also was not accompanied by global changes in large-scale chromatin compaction of the BAC transgene array. Staining of the BAC transgene array by transient expression of EGFP-lac repressor revealed similar numbers of GFP spots and separation between GFP spots in nuclei from silenced clones to that observed in nuclei from parental cells, prior to long-term passaging without drug selection (data not shown). The observed time course for silencing of the reporter gene in the absence of selection is similar to previously observed reporter gene silencing in plasmids without insulator sequences (43,44), which has been attributed, at least in part, to promoter DNA methylation (9,45).
As described previously, we obtained strikingly similar values of reporter gene expression per transgene copy for NIH 3T3 clones carrying BACs with the reporter cassette inserted at two different locations within the BAC. qRT-PCR data revealed copy number dependent expression both for the kan/neo selectable marker linked to the reporter gene as well as the DHFR gene itself (data not shown). This is consistent with our hypothesis that the BAC DNA creates a global large-scale chromatin conformation permissive for reporter gene expression, while also suggesting that the same BAC embedding approach could be extended to reliable, copy-number-dependent expression of multiple transgenes.
To test this idea explicitly, we used BAC recombineering (31) to insert two different reporter gene cassettes into the parent DHFR BAC. We inserted a CMV promoter-driven EGFP reporter gene cassette containing the kanamycin/neomycin selectable marker, into Msh3 intron 8, the same intron into which the original reporter gene transposon inserted in BAC clone C27. We then inserted the original CMV promoter-driven mRFP reporter gene/zeocin selectable marker cassette into Msh3 intron 19, the same intron into which the original reporter gene transposon inserted in BAC clone C4. We transfected NIH 3T3 cells with the linearized two-reporter BAC and selected for stable transformants with zeocin. As a control, we also transfected cells with a plasmid construct containing both the mRFP and EGFP expression cassettes.
As observed for the single reporter gene constructs, transfection with the dual reporter gene fragments alone resulted in most stable transformants showing only background fluorescence levels (Figure 5a). Examining this mixed population of cell clones, cells with higher than background fluorescence levels showed poor correlation of mRFP and EGFP reporter gene expression, with most cells showing higher EGFP versus mRFP expression. Only a small fraction of cells showed a linear relationship between EGFP and mRFP expression.
In contrast, in the mixed population of stable colonies derived from transfection of the two-reporter BAC, nearly all cells expressed both mRFP and EGFP at higher than background levels (Figure 5b). The average BAC expression levels were roughly 2 orders of magnitude higher than the average plasmid expression levels. Moreover, flow cytometry revealed a striking linear correlation (R² = 0.8262) of mRFP versus EGFP fluorescence intensities; 5/5 individual clones containing the BAC transgene showed a similar ratio of mRFP/EGFP expression to the mixed clonal population (Figure 5c, 4 clones shown). These results demonstrate the potential of this BAC transgene embedding method for simultaneous expression of multiple transgenes at reproducible relative expression levels independent of the chromosome integration site, with expression levels proportional to BAC copy number.
Previously, we showed very similar expression levels for two different locations of the mRFP reporter gene within the DHFR BAC (Figure 2c). The two-reporter approach described in the previous section provided a rapid way to evaluate the potential of multiple locations within the DHFR BAC to support reporter gene expression and to explore the variability of reporter gene expression when placed at different locations within the BAC.
We randomly inserted the CMV-mRFP-Zeocin transposon into the DHFR BAC already containing the CMV-EGFP reporter, thereby generating a number of BAC clones containing EGFP at the same fixed position but mRFP at different locations (Figure 5d). Six randomly chosen BAC clones (057-GN-RZ-1, 2, 3, 4, 6 and 9) showed CMV-mRFP-Zeocin transposon insertions within different introns of the Msh3 gene, mapping ~3–80 kb away from the EGFP reporter gene (Figure 5e). Each of these BACs was then transfected into independent cultures of NIH 3T3 cells, and mixed populations of stable colonies (dozens to hundreds per flask) from each transfection were selected and propagated together.
After selection, the expression levels of both reporters in these mixed clonal cell populations were measured by flow cytometry (Figure 5e) and compared to those of stably transfected cells carrying the original two-color reporter BAC construct (057-GN-RZ) (Figure 5b). Nearly 100% of cells in all mixed clonal populations showed high expression levels of both EGFP and mRFP reporters and excellent linear correlation of EGFP and mRFP expression (Figure 5e). The ratio of mRFP to EGFP fluorescence from each mixed population showed no more than a 2.4-fold variation, with no correlation observed between the mRFP/EGFP ratio and the DNA distance between reporters. This small variation in reporter fluorescence ratio not only confirms the previously described chromosome position-independent expression of the BAC TG-EMBED system, but also suggests that mini-gene expression is relatively independent of location within the BAC, at least throughout the Msh3 gene region.
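The fold variation in reporter ratio described above can be made concrete with a short sketch. It assumes, purely for illustration, that each population's ratio is the mean mRFP intensity divided by the mean EGFP intensity, and that the fold variation is the largest ratio divided by the smallest; the text does not state how the ratios were actually computed. The population names and intensities are hypothetical.

```python
def population_ratio(mrfp, egfp):
    """Mean mRFP intensity divided by mean EGFP intensity for one population."""
    if not mrfp or not egfp:
        raise ValueError("each population needs at least one measurement")
    mean_mrfp = sum(mrfp) / len(mrfp)
    mean_egfp = sum(egfp) / len(egfp)
    if mean_egfp <= 0:
        raise ValueError("mean EGFP intensity must be positive")
    return mean_mrfp / mean_egfp


def fold_variation(ratios):
    """Largest ratio divided by smallest ratio across populations."""
    ratios = list(ratios)
    if not ratios or min(ratios) <= 0:
        raise ValueError("ratios must be non-empty and positive")
    return max(ratios) / min(ratios)


# Hypothetical populations: name -> (mRFP intensities, EGFP intensities).
# Values are invented for illustration; NOT measurements from the paper.
populations = {
    "population_A": ([50.0, 100.0, 150.0], [100.0, 200.0, 300.0]),
    "population_B": ([90.0, 180.0], [100.0, 200.0]),
    "population_C": ([60.0, 120.0], [100.0, 200.0]),
}

ratios = {name: population_ratio(m, e) for name, (m, e) in populations.items()}
spread = fold_variation(ratios.values())
print(f"fold variation in mRFP/EGFP ratio: {spread:.2f}")
```

Under these assumptions, a spread close to 1 would mean that the relative expression of the two reporters barely depends on where the second reporter sits within the BAC.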
Here we have described a novel transgene expression scheme that essentially eliminates chromosome position effects in mouse NIH 3T3 cells by embedding mini-transgene constructs within large genomic fragments cloned within BACs. Both the BAC array and expression of the mini-transgene constructs embedded within the BAC remain stable during continuous growth in the presence of drug selection. We suggest this effect is related to the ability of BACs containing housekeeping gene loci to maintain a euchromatin-like large-scale chromatin conformation conducive to transgene expression. Similar levels of expression were observed for multiple locations of the mini-transgene within the BAC. Moreover, we showed that by embedding two mini-reporter genes at different locations within the same BAC, our method conferred reproducible, chromosome position-independent expression for both reporter genes simultaneously, suggesting the usefulness of this method for multi-gene transgenesis. Although so far only the DHFR BAC has been tested, we speculate that BACs with genomic inserts containing other active gene loci may behave similarly. Experiments are now in progress to better define the cis elements within these genomic loci that confer this behavior.
Our BAC TG-EMBED method is both simple and fast: a transposon reaction typically requires just 2 days to insert an expression cassette into a BAC, and a single transfection and selection yields mammalian cell clones stably expressing transgenes at levels up to several hundred-fold higher than a single transgene copy. Therefore, by selecting individual cell clones with different BAC copy numbers, a wide range of expression levels should be obtainable from a single stable transfection experiment.
Using our method, we have demonstrated chromosome position-independent, copy-number-dependent expression, with linearly proportional increases in expression over more than a 100-fold variation in transgene copy number. While insulator sequences have generally been used successfully to protect transgenes against chromosome position effects, in a direct comparison the BAC TG-EMBED method produced a significantly improved linear correlation of reporter gene expression versus copy number (R² = 0.930 versus 0.657) and a more than 2-fold higher level of reporter gene expression per copy than that observed when flanking the same reporter gene with two copies of the chicken HS4 core insulator sequence.
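To show how the two quantities compared above relate, the sketch below fits expression against copy number by ordinary least squares and reports the slope (expression per copy) together with the coefficient of determination of the fit. The function `linear_fit`, the two series, and all numbers are hypothetical illustrations, not the fitting procedure or data used in this study.

```python
def linear_fit(copies, expression):
    """Ordinary least-squares fit; returns (slope, intercept, r2)."""
    if len(copies) != len(expression) or len(copies) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(copies)
    mean_x = sum(copies) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(copies, expression))
    ss_tot = sum((y - mean_y) ** 2 for y in expression)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("fit is undefined when one variable is constant")
    slope = sxy / sxx                      # expression per additional copy
    intercept = mean_y - slope * mean_x
    # Residual sum of squares of the fitted line
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(copies, expression))
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical clones: copy numbers and expression (arbitrary units).
# Invented for illustration; NOT measurements from the paper.
copies = [1, 10, 50, 100]
series_a = [2.0, 21.0, 98.0, 205.0]   # close to proportional to copy number
series_b = [1.0, 4.0, 60.0, 45.0]     # scattered, lower per-copy expression

slope_a, _, r2_a = linear_fit(copies, series_a)
slope_b, _, r2_b = linear_fit(copies, series_b)
print(f"series A: slope = {slope_a:.2f} per copy, R^2 = {r2_a:.3f}")
print(f"series B: slope = {slope_b:.2f} per copy, R^2 = {r2_b:.3f}")
```

In this framing, a higher R² reflects tighter copy-number dependence, and a larger slope reflects higher expression per copy; the comparison in the text concerns both.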
We anticipate that the BAC TG-EMBED method can be combined with additional cis elements adjacent to the mini-transgene, chosen to maximize or stabilize transgene expression (15), to provide the benefits of both approaches. Indeed, whereas the chicken HS4 insulator sequence is capable of protecting against reporter gene silencing during long-term cell growth in the absence of selection (9), we observed a significant reduction in reporter gene expression using the BAC TG-EMBED method in some clones after 60 days of continuous cell passaging without drug selection. This suggests that the mechanism by which the BAC TG-EMBED method shields reporter genes from chromosome position effects is different from the mechanism underlying insulator action. Experiments are now in progress to determine the mechanism underlying reporter gene silencing over time in the absence of selection and to test whether flanking the mini-reporter genes with insulator sequences can overcome the observed silencing during long-term cell growth in the absence of selection. An obvious alternative approach would be to test whether the use of eukaryotic rather than viral promoters to drive transgene expression could overcome this effect.
More broadly, experiments are also in progress to better understand the basic molecular mechanism by which the BAC TG-EMBED method protects mini-gene constructs from chromosome position effects. Future experiments will be aimed at testing our hypothesis that this protection is related to the ability of the BAC transgene arrays to adopt an open large-scale chromatin conformation and/or to position near specific nuclear compartments. We note that the methodology described here to compare the expression levels of multiple reporter genes contained within BAC transgene arrays can be adapted to dissect the molecular origin of chromosome position effects produced by specific cis sequences contained within the BAC.
Although we used the CMV promoter for our reporter gene constructs, the DHFR gene contained within the BAC, driven by its natural promoter, also appeared to show copy-number-dependent expression. We speculate that the BAC TG-EMBED method may work with many other promoters, therefore allowing considerable flexibility for transgene expression. By choosing a BAC containing a cloned genomic region that will assume an open chromatin conformation in the desired target cells, this method may be extendable to expression of transgenes in a wide range of cell types.
With regard to production of therapeutic proteins in mammalian cells, the BAC TG-EMBED method offers several additional key advantages over the current prevailing method, gene amplification. Using a single transfection step, the equivalent of several hundred-fold gene amplification can be obtained. Whereas gene amplification requires the use of transformed cell lines capable of gene amplification, the BAC TG-EMBED method should be applicable to a wide range of cell types. Finally, our results indicate that in contrast to the genome instability inherent in gene amplification, even large BAC transgene arrays show genome stability, as assayed by the stability of the BAC transgene array chromosome integration site and by the size of the transgene array after long-term cell passaging. Similar stability of multi-copy BAC transgene arrays has also been observed in CHO and murine embryonic stem cells (data not shown).
We note that an alternative BAC-based expression system for increased expression of a gene present in the genome of the host cell would be simply to increase the copy number of this gene by transfecting an unmodified BAC containing the actual genomic locus coding for the protein to be expressed. We expect that, at least for low copy number insertions, this approach would also yield copy-number-dependent, position-independent expression when transfected into a cell type normally expressing this protein. However, the transcription factors regulating the genomic locus might be present at low concentrations, preventing copy-number-dependent expression at high BAC copy numbers. The endogenous promoter might be weak, preventing high-level expression even at high copy number. Finally, the cell type normally expressing the target transgene might not be amenable to large-scale protein production methods or even in vitro culture. The BAC TG-EMBED method has the advantage of allowing choice of the promoter and cell type to use for expression.
In principle, BAC recombineering could be used to replace the coding region of a housekeeping gene, in a BAC containing the housekeeping genomic locus, with the cDNA for the target transgene. Again, however, one would still be restricted in the choice of promoters to optimize expression level and control over gene induction, and the required BAC recombineering would be more complicated than transposon-mediated insertion of a mini-gene using the BAC TG-EMBED approach. Moreover, this BAC recombineering approach would not work as well as the BAC TG-EMBED method for expression of multiple transgenes.
In conclusion, the BAC TG-EMBED expression method provides a novel, complementary approach to current transgene expression methodologies with several key advantages for specific applications. Future extensions of this methodology should prove useful for industrial production of therapeutic proteins, production of recombinant proteins for biochemical studies, improved transgene expression for gene therapy, and multi-transgene expression for cell and tissue engineering.
Supplementary Data are available at NAR Online.
The National Institute of General Medical Sciences (grant numbers GM58460 and GM42516 to A.S.B.). Funding for open access charge: National Institutes of Health grant.
Conflict of interest statement. None declared.
We thank Edith Heard (Curie Institute) for providing DHFR BAC (clone 057L22 from CITB mouse library). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the National Institutes of Health.