Bacterial artificial chromosome (BAC)-based transgenes can be expressed bicistronically with the target gene or fused to its translation start codon. To compare the transgene expression efficiencies of these two methods, mice were created that expressed green fluorescent protein (GFP) from the genomic locus of nucleostemin using the bicistronic (NSiGFP) or the ATG-fusion approach (NSmGFP). Three lines with 1, 2, and 4 copies of the NSiGFP transgene, and two lines with one copy of the NSmGFP transgene were generated. Of the three NSiGFP lines, only the 4-copy line matches the NSmGFP lines in GFP protein expression level. Analyses of the GFP and nucleostemin RNA transcript levels exclude IRES inefficiency and suggest premature termination of the bicistronic message as the cause of low GFP expression in the NSiGFP mice. This work provides important information for designing BAC transgenics when the transgene expression level is crucial.
Transgenic mouse technology is a powerful way to genetically label or ablate a selective cell type, or to alter the expression level of the target gene in vivo. The success of this approach depends on whether a promoter sequence is available for directing the expression of the transgene in the same way as the target gene. Choosing the right promoter sequence to drive the transgene expression often requires a tremendous effort in characterizing the promoter structures and determining their expression capability in vivo. This difficulty is best overcome by the knock-in strategy, which, however, is time-consuming and costly and requires expertise in handling mouse ES cells. The BAC transgenic technique offers a reasonable solution that circumvents the drawbacks of both the conventional and the knock-in approaches (Yang et al., 1997).
Because BAC clones contain large genomic fragments, all regulatory elements necessary for controlling the expression of the target gene are likely to be contained within a single BAC, and the transgene expression is less influenced by the insertion site than the conventional transgene is (Giraldo and Montoliu, 2001). Unlike the knock-in approach, the BAC transgenic approach does not require the lengthy process of germline transmission and cross-breeding, and may even produce a stronger than knock-in expression of the transgene in mice with multiple copies of the transgene. Because of these benefits, the BAC transgenic technology has been increasingly applied to characterize the transcriptional regulation of gene expression (Reizis and Leder, 2001; Yu et al., 1999), label specific cell types (Chi et al., 2003), confirm the functionality of genetic mutations by in vivo complementation (Antoch et al., 1997; Probst et al., 1998), and reveal novel genetic functions (Heintz, 2000; Yang et al., 1999). Despite its expression fidelity and convenience, several issues are associated with this approach. The BAC transgene usually includes a few hundred kilobases and genes other than the target gene. Consequently, the interpretation of the phenotypes may be confounded by overexpression phenotypes of those functionally unrelated neighboring genes. Finally, it is difficult to rule out small deletion or rearrangement in the entire BAC transgene.
A BAC-based transgene can be expressed with the target gene as a bicistronic message or fused to its translation start codon. While both methods preserve the original exon-intron structure of the target gene, the bicistronic method permits the target gene to be expressed from the transgene, whereas the ATG-fusion approach abolishes its expression from the transgene. Other than the target gene overexpression effect, there are no available data comparing these two methods regarding their transgene expression efficiencies. Here, we chose the GTP-binding nucleolar protein nucleostemin as the target molecule to address this question. Nucleostemin is preferentially expressed in stem cells, cancer cells, and adult testis (Baddoo et al., 2003; Tsai and McKay, 2002). It is essential for early embryogenesis and capable of binding telomeric repeat-binding factor 1 (Zhu et al., 2006) through a nucleolus-nucleoplasmic shuttling mechanism (Meng et al., 2006; Tsai and McKay, 2005).
Mouse nucleostemin is localized on chromosome 14 (25,551,396-25,558,021). BAC clones of C57BL/6J origin were identified from the UCSC database. The RP23-102M6 clone was chosen to build the transgenic construct because of its long genomic sequences 5’ and 3’ to the nucleostemin locus (Fig. 1a). In the NSiGFP model, a non-attenuated internal ribosomal entry site-2 (IRES2) of the encephalomyocarditis virus was inserted after the stop codon of nucleostemin, followed by the second transgene GFP (Fig. 1b). To avoid perturbing the splicing of the transgene RNA, the loxP-flanked kanamycin selection cassette (Kan) was placed at a non-conserved site within the 13th intron based on homology comparison between the rodent and human genes. In the NSmGFP model, GFP was fused in-frame to the start codon of nucleostemin, followed by the Kan cassette and the remaining sequence of the 1st exon (Fig. 1c). Modifications of BAC clones were carried out in EL350 cells using the recombineering approach (Lee et al., 2001; Liu et al., 2003). BAC clones were screened by PCR assays, and their Kan cassettes were removed by Cre recombination prior to pronucleus injection. The transgene copy numbers of three established NSiGFP lines were determined to be 2 (NSiGFP#1), 1 (NSiGFP#5), and 4 (NSiGFP#17) based on the ratio of the Southern blot intensities of the transgenic fragment (TG, 1-copy per genome) to the nucleostemin-null fragment (null, 1-copy per genome) in the (NSiGFP+/−, NS+/−) mice (Fig. 2a). Two NSmGFP lines were established, both of which harbored one copy of the transgene, as judged by the intensity ratios between the transgenic fragment (1-copy per genome) and the endogenous nucleostemin fragment (2-copy per genome) (Fig. 2b). The integrity of the BAC transgene was assessed by quantitative PCR (qPCR) that measured the ends of the BAC (end-1 and -2) and the nucleostemin gene (NS) (Fig. 2c). The amount of genomic DNA in each sample was normalized to its RNA polymerase II signal.
Compared to the wild-type sample, the copy numbers of transgenic end-1 (or end-2) were 2 (1.2), 0.4 (0.4), 5 (6.2), 0 (0), and 0.6 (0.6) in the NSiGFP#1, NSiGFP#5, NSiGFP#17, NSmGFP#1, and NSmGFP#14 lines (Fig. 2c). The transgene copy numbers in the NS coding region determined by qPCR correlate with the Southern blot measurement, except for NSiGFP#17 and NSmGFP#14, which appear to have 6 and 3 copies of the transgene per genome. These results indicate that the BAC transgene is intact in the multi-copy NSiGFP#1 and NSiGFP#17 lines, but is partially eroded at the end in the one-copy NSiGFP#5, NSmGFP#1, and NSmGFP#14 lines.
The GFP protein expression levels of the NSiGFP and NSmGFP mice were measured by direct visualization of the GFP signal in live embryos (Fig. 3a). The NSiGFP#17 and NSmGFP#14 embryos at day 10.5 displayed the strongest GFP signal, followed by the NSmGFP#1 embryo. The NSiGFP#1 and NSiGFP#5 embryos had weak GFP signals. The relative amounts of the GFP protein in these five transgenic lines were confirmed quantitatively by western blots (Fig. 3b). We demonstrated that the NSiGFP#17, NSmGFP#1, and NSmGFP#14 embryos expressed the most GFP proteins, followed by the NSiGFP#1 and NSiGFP#5 embryos. Notably, western blot detected more GFP proteins in the NSiGFP#1 embryo than in the NSiGFP#5 embryo, which is consistent with their transgene copy numbers.
The low GFP protein level in the NSiGFP mice compared to the NSmGFP mice may be caused by inefficient translation of the second transgene following IRES2. To test this idea, we measured the amount of the GFP RNAs in the adult testis. Quantitative reverse-transcription-PCR (qRT-PCR) showed that the relative levels of the GFP RNAs were 1.3 (NSiGFP#1), 1.0 (NSiGFP#5), 4.4 (NSiGFP#17), 4.4 (NSmGFP#1), and 5.9 (NSmGFP#14). These numbers correlate with their GFP protein levels and indicate that the differences in the GFP protein between the NSiGFP and NSmGFP mice cannot be explained by the efficiency of IRES2. Next, we examined the possibility that the lower GFP RNA level in the NSiGFP mice compared to the NSmGFP mice may be caused by transcript instability, aberrant splicing, or premature termination. We reasoned that if the bicistronic RNA were unstable, the amount of nucleostemin RNA transcribed from one copy of the transgene would be less than that transcribed from one copy of the endogenous allele. On the contrary, our data showed that the nucleostemin RNA levels in the transgenic testis were 3.2 (NSiGFP#1), 2.4 (NSiGFP#5), 6.0 (NSiGFP#17), 1.3 (NSmGFP#1), and 1.0 (NSmGFP#14) times that in the wild-type testis (Fig. 4b1). As a control, we showed that the amount of nucleostemin RNA in the NS+/− testis is one half of that in the wild-type testis (Fig. 4b2). These results indicate that the amount of nucleostemin RNA from the bicistronic message per transgene copy is as much as or higher than that per endogenous nucleostemin allele, and that nucleostemin expression from the NSmGFP transgene is abolished. To test the possibility of aberrant splicing, qRT-PCRs were conducted across the 13th intron, where a loxP sequence was inserted (Fig. 4d). Our results showed that the expression level of the 3’ end of nucleostemin RNA per transgene copy in the NSiGFP testis is higher than that per endogenous nucleostemin allele in the wild-type testis (Fig. 4c), indicating that there is no aberrant splicing across the 13th intron.
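The per-copy comparison of nucleostemin RNA can be made explicit with a little arithmetic. The following Python sketch is an illustration, not the authors' analysis. It assumes that the wild-type level (set to 1.0) comes equally from two endogenous alleles, that the endogenous contribution is unchanged in the transgenic testis, and that the Southern blot copy numbers apply. The names `NS_RNA` and `ns_rna_per_copy` are hypothetical.

```python
# Nucleostemin RNA in transgenic testis as fold of wild-type (values quoted in
# the text), paired with transgene copy numbers from the Southern blots.
NS_RNA = {
    "NSiGFP#1": (3.2, 2),
    "NSiGFP#5": (2.4, 1),
    "NSiGFP#17": (6.0, 4),
    "NSmGFP#1": (1.3, 1),
    "NSmGFP#14": (1.0, 1),
}

ENDOGENOUS_ALLELES = 2  # assumption: wild-type signal = two equal alleles


def ns_rna_per_copy(fold_vs_wt, copies, alleles=ENDOGENOUS_ALLELES):
    """RNA made per transgene copy, in units of one endogenous allele."""
    # Subtract the endogenous contribution (wild-type = 1.0 by definition).
    transgene_part = fold_vs_wt - 1.0
    # Convert "fold of wild-type" into allele-equivalents, then divide by copies.
    return transgene_part * alleles / copies


for line, (fold, copies) in NS_RNA.items():
    print(f"{line}: {ns_rna_per_copy(fold, copies):.2f} allele-equivalents per copy")
```

Under these assumptions the three NSiGFP lines come out at roughly 2.2 to 2.8 allele-equivalents per transgene copy, whereas the two NSmGFP lines come out at 0.6 and 0.0.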
In this report, we examined the expression efficiencies of two BAC-based transgenic designs. Our data showed that the GFP expression level is significantly lower in the bicistronic lines than in the ATG-fusion lines on a per transgene copy basis. Notably, the differences in the GFP protein level between these two models correlate with their GFP RNA levels, indicating that low GFP expression in the bicistronic mice cannot be attributed to IRES2 inefficiency. Further qRT-PCR analyses demonstrated that the amounts of the nucleostemin RNA transcribed from one copy of the transgene in the NSiGFP mice are more than that transcribed from a single endogenous allele, detected either in the middle or at the 3’ end of nucleostemin. Considering these results, we conclude that the reduced GFP expression in the bicistronic transgenics is caused by premature termination of the bicistronic transcript between nucleostemin and GFP. Due to the difficulty in detecting the GFP signal on sections, this study analyzed the transgene expression levels from the whole embryo. Given that the GFP signals in the NSiGFP and NSmGFP live embryos appear grossly identical and relatively wide-spread, it is unlikely that a restricted expression of the transgene in the NSiGFP mice compared to the NSmGFP mice is responsible for the difference in transgene expression between these two models. In summary, our work demonstrates that the ATG-fusion strategy has a clear advantage over the bicistronic design in avoiding target gene overexpression and in directing a high-level transgene expression.
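The per-copy difference in GFP RNA between the two designs can be illustrated with the qRT-PCR values reported in the results. The Python sketch below is a back-of-the-envelope calculation under stated assumptions, not part of the original analysis: it divides each relative GFP RNA level by the Southern blot copy number and compares group means. The names `GFP_RNA`, `gfp_per_copy`, and `group_mean` are hypothetical.

```python
# Relative GFP RNA levels in adult testis (qRT-PCR values quoted in the text),
# paired with transgene copy numbers from the Southern blots.
GFP_RNA = {
    "NSiGFP#1": (1.3, 2),
    "NSiGFP#5": (1.0, 1),
    "NSiGFP#17": (4.4, 4),
    "NSmGFP#1": (4.4, 1),
    "NSmGFP#14": (5.9, 1),
}


def gfp_per_copy(level, copies):
    """GFP RNA per transgene copy (arbitrary units, NSiGFP#5 = 1.0)."""
    return level / copies


def group_mean(prefix):
    """Mean per-copy GFP RNA over all lines whose name starts with prefix."""
    values = [
        gfp_per_copy(level, copies)
        for name, (level, copies) in GFP_RNA.items()
        if name.startswith(prefix)
    ]
    return sum(values) / len(values)


bicistronic = group_mean("NSiGFP")
fusion = group_mean("NSmGFP")
print(f"bicistronic: {bicistronic:.2f} per copy")
print(f"ATG-fusion:  {fusion:.2f} per copy")
print(f"fold difference: {fusion / bicistronic:.1f}")
```

The result depends on the copy numbers used. The genomic qPCR suggested that NSmGFP#14 may carry three copies rather than one, which would lower its per-copy value to about 2.0.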
To obtain BAC clones containing the nucleostemin locus, the genomic position of nucleostemin was determined by the Ensembl program (www.ensembl.org). BAC clones containing this region were identified using the UCSC Genome Browser (genome.ucsc.edu), and obtained from the BACPAC Resources (bacpac.chori.org).
As the first step in constructing the targeting vector, the loxP site in the pBACe3.6 vector was replaced by a DNA fragment that contains an ampicillin selection cassette and two flanking recombineering arms of 49 bp (RT553, underlined) and 47 bp (RT554, underlined) using the BAC recombineering approach. This DNA fragment was generated by PCR using the primer pair RT553 and RT554, with the pTamp vector as the template (Lee et al., 2001). Sequence information for primers is as follows: RT553: 5’-ATC CAC AGG ACG GGT GTG GTC GCC ATG ATC GCG TAG TCG ATA GTG GCT CTT AGA CGT CAG GTG GCA C-3’; RT554: 5’-CGG CAC GTT AAC CGG GCT GCA TCC GAT GCA AGT GTG TCG CTG TCG ACC TCA CGT TAA GGG ATT TTG GTC-3’.
A BAC modification strategy was used as described in previous studies with slight changes (Lee et al., 2001; Liu et al., 2003). Host cells (EL350) were grown to mid-log phase (OD600=0.4–0.5) and induced to express Red recombinase by incubation at 42°C for 15 minutes. The linearized recombination cassette, free of vector backbone sequences, was introduced into the recombination-ready EL350 cells by electroporation. After incubation for 1 hour at 30°C, cells were plated onto LB plates containing kanamycin (12.5ng/ml) and ampicillin (25ng/ml). The correctly recombined BAC clones were identified by PCR assays, and their drug selection cassettes were excised in EL350 cells by arabinose-induced Cre recombination.
BAC DNA for microinjection was prepared using the Marligen High-Purity column and linearized by the PI-SceI enzyme, which cuts specifically in the pBACe3.6 vector. Digested DNAs were purified by phenol-chloroform extraction and ethanol precipitation, and injected at a concentration of 2 ng/µl into fertilized eggs derived from FVB mice in the transgenic core of the Center for Environmental and Rural Health. PCR analyses were used to screen for transgene-positive offspring. The transgenic lines were maintained in the FVB genetic background.
Embryos were dissected in cold phosphate-buffered saline and viewed immediately under a Zeiss stereoscope Discovery V12. Green fluorescent images were captured using the multidimensional acquisition function in the Axiovision Rel 4.5 software with a 10-second exposure time. Six images were collected from the top to the bottom of the embryo and stacked into one final image using the CZfocus MFC application. For western quantification of GFP protein, whole embryo lysates (E10.5) were extracted in sample buffer, fractionated by 10% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), and transferred to Immobilon-P membrane (Millipore). Specific signals were detected by rabbit anti-GFP antibody at a 1:4000 dilution (Molecular Probes) and horseradish peroxidase-conjugated secondary antibody.
For genomic qPCR assay, BAC ends were amplified from 10 ng of genomic DNAs with 0.5 µM of primers and 100 µM of dNTPs. The ΔC(t) values between the target fragment and RNA polymerase II were determined using the MyiQ single-color real-time PCR detection system and supermix SYBR green reagent. The ΔΔC(t) values between transgenic and wild-type mice were measured from three technical replicates and two biological replicates, and used to calculate their relative levels. For qRT-PCR analysis, DNaseI-treated total RNAs (5 µg) isolated from adult testis were reverse transcribed into first-strand cDNAs using random hexamers and M-MLV reverse transcriptase. Target cDNAs were amplified from 5 µl of the diluted cDNA samples (100X for GFP and nucleostemin, and 500X for RNA polymerase II) with 0.5 µM of gene-specific primers and 100 µM of dNTPs. For qPCR, the ΔC(t) values between the GFP (or nucleostemin) and RNA polymerase II and the ΔΔC(t) values between different transgenic lines were determined as described above. Primer sequences are listed as follows: RP23-102M6 end-1 (59°C): 5’-GCT CAC TCA TGG ACC C-3’ and 5’-GGG CAT ACA AGA TGC TC-3’; RP23-102M6 end-2 (59°C): 5’-GCT GCC TAA GAT GAA GG-3’ and 5’-CAT TGG ACA GAC AGC AAC-3’; GFP (59°C): 5’-CAA GCT GAC CCT GAA GTT CAT C-3’ and 5’-GTT GTG GCG GAT CTT GAA GTT C-3’; nucleostemin-1 (59°C, Fig. 4b): 5’-CAA GCA TTG AGG AAC TAA GAC-3’ and 5’-GCA ATA GTA ACC TAA TGA GGC-3’; nucleostemin-2 (59°C, Fig. 4c): 5’-CTG ACA AAT GGA ATA CTA GAC G-3’ and 5’-TTA TAT ATA ATC TGT GGT GAA GTC-3’; RNA polymerase II (59°C): 5’-GCC ATG CAG AAG TCT GGC CGT CCC CTC AAG-3’ and 5’-CTT ATA GCC AGT CTG CAG ATG AAG GTC AC-3’.
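The ΔC(t) and ΔΔC(t) steps described above can be sketched in a few lines of Python. This is a minimal illustration of the commonly used 2^(−ΔΔC(t)) calculation, assuming near-perfect amplification efficiency (a doubling per cycle), which the text does not state. The function names and the C(t) values in the usage example are hypothetical and are not data from this study.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Delta C(t): mean target C(t) minus mean reference C(t) (e.g. RNA polymerase II)."""
    return mean(target_cts) - mean(reference_cts)


def relative_level(sample_target, sample_ref, control_target, control_ref,
                   efficiency=2.0):
    """Relative level of the target in a sample versus a control.

    efficiency=2.0 assumes the product doubles every cycle; this is an
    assumption of the sketch, not a value reported in the text.
    """
    ddct = delta_ct(sample_target, sample_ref) - delta_ct(control_target, control_ref)
    return efficiency ** (-ddct)


# Hypothetical technical triplicates (not measurements from this study).
transgenic_target = [24.0, 24.5, 23.5]
transgenic_ref = [20.0, 20.0, 20.0]
wild_type_target = [25.0, 25.5, 24.5]
wild_type_ref = [20.0, 20.0, 20.0]

# The target crosses threshold one cycle earlier in the transgenic sample,
# which corresponds to a two-fold higher relative level.
print(relative_level(transgenic_target, transgenic_ref,
                     wild_type_target, wild_type_ref))
```

Biological replicates could be handled by computing the relative level for each animal and averaging the results; how the replicates were combined in the original work is not specified.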
We thank Dr. E-Chiang Lee, Neal Copeland, and Jim Martin for providing the EL350 host cells and pTamp plasmid. We also thank Paul Swinton for pronucleus injection and Antonio Baldini for sharing the imaging system. This work is supported by R01 CA113750-01 to R.Y.T.