The efficiency of frameshifting at the PA-X shift site was previously estimated by translating reporter constructs in rabbit reticulocyte lysates and found to be around 1.3 per cent [30]. When the frameshift cassette was fused into a dual luciferase reporter construct and expressed in tissue culture cells (see §4), a comparably low frameshifting efficiency (0.74 ± 0.13%) was measured. Owing to the low levels involved and the lack of a suitably sensitive antibody to the common N-terminal domain of PA and PA-X, we have not been able to directly measure the frameshifting efficiency in the context of viral infection. Because PA-X is expressed at very low levels during virus infection, we were also unable to isolate sufficient quantities from virus-infected cells for mass spectrometric analysis, despite multiple attempts. Thus, to determine the precise site and direction of frameshifting, we used a construct in which an ORF encoding green fluorescent protein (GFP) was fused in-frame to the 3′ end of the X ORF (
b). Frameshift expression of the construct would result in the transframe fusion PA-X-GFP, which could be affinity-purified on GFP-TRAP beads, while non-frameshift expression would result in a product that does not contain GFP. The construct was expressed in 293T cells, and PA-X-GFP was affinity-purified from cell lysates and resolved by SDS-PAGE. An in-frame control, in which the predicted shift site UCC_UUU_CGU_C was mutated to UCC_UUU_GUC to force expression of PA-X-GFP, was also prepared to show the approximate size at which the frameshift protein should migrate in gels. The wild-type construct produced a specific band migrating at the expected size for PA-X-GFP. A gel slice containing this protein was excised and digested with trypsin, and the resulting peptides were analysed by nano-liquid chromatography tandem mass spectrometry (nano-LC/MS/MS).
Eight separate PA-X-GFP tryptic peptides were identified, including peptides encoded both upstream and downstream of the shift site (
c; two of the peptides have overlapping sequence). Importantly, a peptide spanning the shift site itself was identified (
d). This peptide, GLWDSFVSPR, defines the shift site (UCC_UUU_CGU) and direction (+1) of frameshifting (
e). Molecular ions for GLWDSFVSPR were identified both with and without oxidation at the tryptophan, providing further support for the sequence assignment. No peptide compatible with –2 frameshifting was detected. Formally, the peptide GLWDSFVSPR is compatible with three different models for frameshifting: (i) +1 slippage with UUU in the P-site and an empty A-site; (ii) +1 slippage with UCC in the P-site and an empty A-site; and (iii) tandem +1 slippage with UCC in the P-site and UUU in the A-site. However, consideration of the potential for codon : anticodon re-pairings favours model (i). Both UUU and UUC are translated by a single tRNA isoacceptor whose anticodon, 3′-AAG-5′, has a higher affinity for UUC in the +1 frame than for the zero-frame UUU [32]. By contrast, UCC is expected to be generally decoded by the serine tRNA with anticodon 3′-AGI-5′ (I, inosine); regardless of whether it is decoded by 3′-AGI-5′ or by a different serine tRNA when frameshifting occurs, re-pairing to CCU in the +1 frame would involve a mismatch at the first nucleotide position. Moreover, previous experiments showed that mutating UCC to AGC, GGG, CCC or AAA reduced but did not abolish frameshifting, while mutating UUU_CGU to UUC_AGA (with an appropriately positioned 3′ stop codon to prevent non-specific frameshifting elsewhere within the overlap region) abolished frameshifting [30]. These results are consistent with P-site slippage on UUU_C but argue against P-site slippage on UCC_U, although a low level of slippage on UCC_U cannot be ruled out. Interestingly, a UCC_U tetranucleotide is the site of +1 frameshifting in antizyme expression, although here frameshifting is stimulated, in part, by the presence of a stop codon in the A-site (a role that is unlikely to be substituted by a UUU codon in the A-site) [3].
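The point that the peptide sequence alone cannot distinguish between the slippage models can be illustrated with a short sketch. The code below is not part of the original analysis; it is a minimal illustration that translates the 10-nt motif UCC_UUU_CGU_C with a single +1 slip placed after either the UCC or the UUU codon. The function name `translate`, its `slip_after` parameter and the reduced codon table are hypothetical conveniences; the codon assignments themselves are from the standard genetic code. Model (iii) is not treated separately because the sketch tracks only the ribosome position, which for tandem slippage is the same as in model (ii).

```python
# Reduced codon table: only the codons that occur around the shift site
# (assignments from the standard genetic code).
CODON_TABLE = {
    "UCC": "S", "UUU": "F", "UUC": "F",
    "CGU": "R", "GUC": "V", "CCU": "P",
}


def translate(rna, slip_after=None):
    """Translate rna from position 0.

    If slip_after is given, one nucleotide is skipped (+1 shift) after
    the codon with that index has been decoded. This is an illustrative
    model of P-site slippage, not a simulation of the ribosome.
    """
    peptide = []
    pos = 0
    codon_index = 0
    while pos + 3 <= len(rna):
        peptide.append(CODON_TABLE[rna[pos:pos + 3]])
        pos += 3
        if slip_after is not None and codon_index == slip_after:
            pos += 1  # +1 slippage: the next codon starts one nt downstream
        codon_index += 1
    return "".join(peptide)


SHIFT_SITE = "UCCUUUCGUC"         # wild-type motif UCC_UUU_CGU_C
IN_FRAME_CONTROL = "UCCUUUGUC"    # mutated motif UCC_UUU_GUC

zero_frame = translate(SHIFT_SITE)               # no frameshift (PA frame)
model_i = translate(SHIFT_SITE, slip_after=1)    # slip with UUU in the P-site
model_ii = translate(SHIFT_SITE, slip_after=0)   # slip with UCC in the P-site
control = translate(IN_FRAME_CONTROL)            # forced transframe product

for label, pep in [("zero frame", zero_frame), ("model (i)", model_i),
                   ("model (ii)", model_ii), ("in-frame control", control)]:
    print(f"{label}: {pep}")
```

In this sketch both slip positions, and the in-frame control, give the tripeptide SFV found within GLWDSFVSPR, whereas the unshifted frame gives SFR. The models therefore have to be discriminated by the re-pairing and mutational arguments above rather than by the peptide sequence.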
In other cases of +1 frameshifting, such as in bacteria and yeast, frameshifting is stimulated in part by a slowly decoded A-site codon such as a stop codon or a codon whose cognate tRNA is limiting [1,33,34]. At the influenza PA-X shift site, P-site slippage on the UUU_C tetranucleotide may be stimulated by the rare CGU codon in the A-site (CGU is one of the most seldom-used codons in the genomes of mammals and birds, the host species of influenza A virus [35]). In support of this, mutating the CGU to the more commonly used arginine codon, CGG, reduced frameshifting by 50 per cent [30]. However, CGU and the more abundantly used codon CGC are expected to be decoded by the same tRNA isoacceptor with anticodon 3′-GCI-5′, and this tRNA species is not obviously limiting in mammals and birds [36,37]. Thus, the role and mode of action of the A-site codon remain uncertain, and conservation of CGU may in part be driven by constraints on the encoded amino acid sequence in the overlapping +1 reading frame.
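The codon : anticodon arguments used for the P-site and A-site codons can be checked position by position with a small sketch. This is an illustration under simplifying assumptions, not a thermodynamic model: pairs are classified only as Watson-Crick, wobble (allowed at the third codon position only, with inosine pairing to U, C or A) or mismatch, and tRNA modifications other than inosine are ignored. The function name `classify_pairing` and the label strings are hypothetical.

```python
# Base pairs written as (codon base, anticodon base).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
# Simplified wobble rules, applied at the third codon position only:
# G:U pairs in either orientation, and inosine (I) pairing with U, C or A.
WOBBLE_THIRD = {("U", "G"), ("G", "U"), ("U", "I"), ("C", "I"), ("A", "I")}


def classify_pairing(codon, anticodon_3to5):
    """Label each codon position (written 5'->3') against an anticodon
    written 3'->5', so that equal indices face each other."""
    if len(codon) != 3 or len(anticodon_3to5) != 3:
        raise ValueError("codon and anticodon must both be 3 nt long")
    labels = []
    for position, pair in enumerate(zip(codon, anticodon_3to5)):
        if pair in WATSON_CRICK:
            labels.append("WC")
        elif position == 2 and pair in WOBBLE_THIRD:
            labels.append("wobble")
        else:
            labels.append("mismatch")
    return labels


# Codon / anticodon combinations discussed in the text.
CASES = [
    ("UUU", "AAG"),  # zero-frame P-site codon, Phe tRNA
    ("UUC", "AAG"),  # +1 frame codon after slippage on UUU_C
    ("UCC", "AGI"),  # zero-frame E-site codon, Ser tRNA
    ("CCU", "AGI"),  # +1 frame codon after hypothetical slippage on UCC_U
    ("CGU", "GCI"),  # A-site codon, Arg tRNA
    ("CGC", "GCI"),  # more abundantly used Arg codon, same tRNA
]

for codon, anticodon in CASES:
    print(codon, anticodon, classify_pairing(codon, anticodon))
```

Under these simplified rules, UUC pairs with 3′-AAG-5′ through three Watson-Crick pairs whereas UUU needs a U:G wobble at the third position, CCU against 3′-AGI-5′ is flagged as a first-position mismatch, and CGU and CGC pair identically with 3′-GCI-5′.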
Table 1. Arginine codon usage frequencies (per 1000 codons) in selected organisms.
The role of UCC in the E-site also remains uncertain. In analyses of codon usage in PA, it was observed that the motif UCC_UUU_CGU is very highly conserved at the 5′ end of the influenza A virus X ORF, despite the fact that five other codons could potentially be used to encode the serine [30,38]. Moreover, mutating the UCC codon to AGC (serine) or to GGG, CCC or AAA resulted in a 40 to 70 per cent reduction in the frameshifting efficiency [30]. This suggests that UCC plays an important stimulatory role in the E-site. Earlier in vivo work on E-site influence (independent of amino acid identity) on stop codon readthrough implies that interactions at that site influence competition for A-site acceptance, but whether this influence acts via the P-site merits investigation [39,40]. Notwithstanding complications due to an interaction with rRNA during bacterial release factor 2 +1 frameshifting, there is evidence in that case that the identity of the E-site codon affects +1 frameshifting. This has been proposed to relate to the speed at which the E-site tRNA is released, with weaker codon : anticodon duplexes being associated with higher levels of frameshifting [41–44]. In an E. coli cell-free system, even partially mismatched P-site codon : anticodon interactions, which can be augmented by E-site mismatches, trigger retrospective editing and so influence events in the A-site [45]. A counterpart post-peptide bond effect has not been detected in S. cerevisiae, but may exist and involve currently unidentified factors [46,47]. An E-site effect on +1 frameshifting could potentially be influenced by the E-site tRNAs in a proportion of translating ribosomes being near-cognate rather than the standard cognate tRNA. The proposal of an allosteric relationship in which release of deacylated tRNA from the E-site is coupled to aminoacyl-tRNA acceptance in the A-site [44] has drawn much criticism [48–51]. On its own, the observed E-site influence on +1 frameshifting could be interpreted as acting via an effect on the length of the A-site pause, which in turn affects the probability of P-site realignment, but a direct effect on P-site codon : anticodon interaction, or rather on the translocating complex, seems more likely.
More generally, one might predict a class of +1 frameshift stimulators that comprise a UUU_C P-site slippery sequence and a restricted choice of A- and E-site codons. In eukaryote-infecting viruses, frameshifting by +1 nt has been predicted as the expression mechanism for non-5′-proximal ORFs in the closteroviruses (RdRp), leishmania RNA virus 1 (RdRp), chronic bee paralysis virus and the related Lake Sinai viruses 1 and 2 (RdRp), plant-infecting fijiviruses (family Reoviridae; P5-2) and members of the proposed family Amalgamaviridae of plant viruses (RdRp) (reviewed in [52]). However, in most of these species, the site of frameshifting has remained elusive. Characterization of the influenza virus frameshift site now suggests the site of +1 frameshifting in several of these viruses. Several of these shift sites are also well supported by comparative genomic analysis [53]. Interestingly, these putative shift sites all seem to show a preference for A-site CGN codons, as opposed to other CNN codons. As in PA-X expression, it is likely that the efficiency of frameshifting at such sites is low. However, these levels may be fully compatible with the expression level requirements of some viruses (cf. –1 frameshifting for polymerase expression in S. cerevisiae totivirus L-A, where the ratio of Gag-Pol to Gag in the virion is of order 1–2% and, correspondingly, the frameshifting efficiency is around 1.8%) [54]. Whether similar motifs are functionally used for cellular gene expression remains to be seen.
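The correspondence between a frameshifting efficiency and the ratio of transframe to zero-frame product can be made explicit with a short calculation. The sketch below rests on assumptions that are not tested here: every ribosome reaching the shift site either shifts or continues in the zero frame, and the two products are equally stable and equally incorporated. The function name `transframe_ratio` is hypothetical.

```python
def transframe_ratio(efficiency):
    """Expected ratio of frameshift product to zero-frame product.

    efficiency is the fraction of ribosomes that shift frame (0 <= e < 1).
    Assumes that synthesis rate is the only determinant of abundance.
    """
    if not 0.0 <= efficiency < 1.0:
        raise ValueError("efficiency must be in the interval [0, 1)")
    return efficiency / (1.0 - efficiency)


# Efficiencies quoted in the text, as fractions.
EFFICIENCIES = {
    "L-A totivirus (-1 shift)": 0.018,
    "PA-X, reticulocyte lysate": 0.013,
    "PA-X, dual luciferase in cells": 0.0074,
}

for label, eff in EFFICIENCIES.items():
    print(f"{label}: product ratio = {transframe_ratio(eff):.4f}")
```

At efficiencies of a few per cent, the ratio is close to the efficiency itself, which is why a frameshifting efficiency of about 1.8% sits within the 1–2% Gag-Pol to Gag range under these assumptions.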