Loop-specific recognition is used by several other proteins of the first group to read out particular nucleotide sequences. Though some proteins contact the helical RNA segments that close RNA loop regions, the majority of specific interactions are observed with non-paired nucleotides within the loops. These RNA-protein interactions resemble the sequence-specific recognition patterns observed in complexes between proteins and single-stranded RNA (discussed below), which often involve small canonical RNA-binding domains, such as zinc-finger domains, the K-homology (KH) domain and the RNA recognition motif (RRM) [2]. Not surprisingly, some proteins described here utilize canonical RNA-binding domains and motifs for RNA binding.
The structure of the translational repressor RsmE bound to the Shine-Dalgarno sequence of hcnA mRNA shows how a protein dimer specifically recognizes the consensus sequence 5′-A
]. The loop contains six unpaired nucleotides, A8-C9-G10-G11-A12-U13, with U13 and the C9-G10-G11 segment bulged out. The protein specifically recognizes the Watson-Crick edges of A8, G10 and G11, the Hoogsteen edge of A12, and the major-groove side of the C7-G14 and U6-A15 base pairs. In contrast to small canonical RNA-binding domains, the sequence-specific recognition of unpaired nucleotides is mediated primarily by β-strand backbone residues, implying that the protein fold itself is responsible for RNA-binding specificity.
The sequence-specific recognition of bulged-out nucleotides in apical loops is a recurrent theme in complexes of aptamer RNAs with the KH1 domain of the NOVA-1 KH1/2 protein (PDB code: 2ANR) and the RNA recognition motif (RRM) domain of the human RBMY protein [35]. The NOVA (neuro-oncological ventral antigen) family of proteins is expressed in neurons, where it plays a crucial role in the regulation of alternative splicing [36]. The NOVA-1 protein contains three KH domains. An earlier structure [37] revealed details of the recognition between the KH3 domain and a UCAC tetranucleotide embedded within the hairpin loop of an in vitro-selected stem-loop RNA scaffold. However, it did not address the question of how multiple KH domains can target RNA. The structure of the first two KH domains (KH1/2) bound to tandem UCAN repeats of an in vitro-selected stem-loop RNA attempted to answer this question. It revealed that the KH2 domain does not participate in RNA binding and that only the KH1 domain interacts with a 5′-UCAG-UCAC-C loop closed by three non-canonical base pairs. This domain primarily binds the second UCAN repeat in the cleft usually used by KH domains for ssDNA and ssRNA recognition. Although the Watson-Crick edges of all four nucleotides interact with the protein, only cytosine and adenine form sequence-specific hydrogen bonds, thereby validating the YCAN sequence consensus found using the SELEX approach [36
Testis-specific RBMY (RNA-binding motif gene on Y chromosome) protein, encoded by the human Y chromosome, is important for sperm development. The protein is possibly involved in pre-mRNA processing and recognizes an in vitro-selected RNA hairpin with a 5′-CA
CAA loop and a 5′-GUC-loop-GAY consensus element in the loop-closing part of the stem [35]. In the structure, the CAA nucleotides protrude from the CACAA pentaloop and are spread on the β-sheet surface of the RRM, as in other proteins that use the RRM-RNA mode of recognition. All three nucleotides form base-specific contacts with main-chain and side-chain atoms; however, only the adenines provide base-specific discrimination. Unexpectedly, the protein makes additional contacts with the major groove of the stem using its β-hairpin, thereby demonstrating dual sequence- and shape-specific RNA recognition, a duality that is generally unusual for RRMs.
Two proteins, elongation factor SelB and mRNA-binding factor Vts1p, interact with stem-loop structures whose loop regions, though composed of different sequences, are conformationally similar to the UNCG tetraloop fold [38]. SelB is essential for incorporation of selenocysteine, the 21st amino acid, into bacterial polypeptides. The factor binds the selenocysteine insertion sequence (SECIS) in mRNA with extremely high selectivity, and this binding serves as a signal for delivery of selenocysteyl-tRNA at a UGA stop codon upstream of the SECIS hairpin. The high binding specificity is achieved through base-specific interactions of a DNA- and RNA-binding winged-helix (WH) motif with the consecutive bulged-out guanine and unpaired uridine of the 5′-GGUC-U loop, and through interactions with the RNA backbone, which are determined by shape complementarity and electrostatic properties of the protein surface [39
Yeast Vts1p has been implicated in vesicular transport and sporulation; however, its precise role remains unknown. The protein is a homolog of the Drosophila protein Smaug, a translational repressor that mediates body patterning during embryogenesis by binding to an mRNA hairpin termed the Smaug recognition element (SRE) [41]. The SRE hairpin exhibits the consensus sequences 5′-UNGA-N and 5′-GNGC-N, which are targeted by the α-helical sterile alpha motif (SAM) domain of Vts1p, a domain also implicated in protein-protein and DNA-protein interactions [42]. Three structures of the Vts1p-SAM domain bound to two SRE variants show parallels with SelB-SECIS recognition, such as shape recognition of the loop region and base-specific binding to an unpaired nucleotide, guanosine in this case [42]. In contrast to the SelB-SECIS complex, the bulged-out nucleotide does not play a significant role in recognition by Vts1p-SAM.
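
The consensus elements quoted in this section (UCAN, YCAN and similar) are written with degenerate nucleotide codes, where N stands for any nucleotide and Y for a pyrimidine. As an illustration only, the following minimal Python sketch shows how such a consensus can be located in a loop sequence. The function names `motif_to_regex` and `find_motif` are hypothetical helpers introduced here, the code table is restricted to the few IUPAC codes needed, and the sketch matches sequence only; it says nothing about the structural context (bulged-out or unpaired nucleotides, loop shape) that the proteins described above also recognize.

```python
import re

# Subset of IUPAC degenerate codes for RNA (assumption: only the codes
# needed for the motifs quoted in the text are listed here).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG",    # purine
    "Y": "CU",    # pyrimidine
    "N": "ACGU",  # any nucleotide
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a degenerate RNA motif such as 'YCAN' into a compiled regex."""
    parts = []
    for code in motif.upper():
        bases = IUPAC_RNA[code]  # raises KeyError for unsupported codes
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    # A lookahead is used so that overlapping occurrences are all reported.
    return re.compile("(?=(" + "".join(parts) + "))")


def find_motif(seq: str, motif: str) -> list:
    """Return (0-based start, matched text) for every occurrence of motif."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # The 5'-UCAG-UCAC-C loop quoted in the text, written without hyphens.
    loop = "UCAGUCACC"
    print(find_motif(loop, "UCAN"))  # [(0, 'UCAG'), (4, 'UCAC')]
    print(find_motif(loop, "YCAN"))  # [(0, 'UCAG'), (4, 'UCAC')]
```

Both tandem repeats of the loop match the UCAN and YCAN patterns, although, as noted above, the KH1 domain primarily binds the second repeat; a sequence scan of this kind can nominate candidate sites but cannot distinguish between them.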