Guanine-rich DNA sequences can form G-quadruplexes stabilized by stacked G–G–G–G tetrads in monovalent cation-containing solution. The length and number of individual G-tracts and the length and sequence context of linker residues define the diverse topologies adopted by G-quadruplexes. The review highlights recent solution NMR-based G-quadruplex structures formed by the four-repeat human telomere in K+ solution and the guanine-rich strands of c-myc, c-kit and variant bcl-2 oncogenic promoters, as well as a bimolecular G-quadruplex that targets HIV-1 integrase. Such structure determinations have helped to identify unanticipated scaffolds such as interlocked G-quadruplexes, as well as novel topologies represented by double-chain-reversal and V-shaped loops, triads, mixed tetrads, adenine-mediated pentads and hexads and snap-back G-tetrad alignments. The review also highlights the recent identification of guanine-rich sequences positioned adjacent to translation start sites in 5′-untranslated regions (5′-UTRs) of RNA oncogenic sequences. The activity of the enzyme telomerase, which maintains telomere length, can be negatively regulated through G-quadruplex formation at telomeric ends. The review evaluates progress related to ongoing efforts to identify small molecule drugs that bind and stabilize distinct G-quadruplex scaffolds associated with telomeric and oncogenic sequences, and outlines progress towards identifying recognition principles based on several X-ray-based structures of ligand–G-quadruplex complexes.
DNA can adopt structures other than the Watson–Crick duplex when actively participating in replication, transcription, recombination and damage repair. Of particular interest are guanine-rich regions, which can adopt a non-canonical four-stranded topology called the G-quadruplex. Such architectures are adopted in several key biological contexts, including DNA telomere ends, the purine-rich DNA strands of oncogenic promoter elements, and within RNA 5′-untranslated regions (UTR) in close proximity to translation start sites. Therefore, elucidation of the sequence-based diversity of G-quadruplex scaffolds could provide insights into the distinct biology of guanine-rich sequences within the genome.
G-quadruplexes are built from the stacking of successive G–G–G–G tetrads (G-tetrads) and stabilized by bound monovalent Na+ and K+ cations (1). The G-tetrad is a cyclic hydrogen-bonded square planar alignment of four guanines (Figure 1a), with the guanines adopting either anti or syn alignments about glycosidic bonds (Figure 1b and c, respectively). G-quadruplexes are very stable, with their large diameter and four grooves defining a unique architecture (2) that is distinct from duplex DNA.
The backbone strands (or columns) that constitute the stacked G-tetrad core of the G-quadruplex can adopt different directionalities. Furthermore, the relative strand directionalities are geometrically related with the glycosidic conformation of the guanines. There are four possibilities: (i) Four strands are oriented in the same direction; the glycosidic angles around the G-tetrad are anti–anti–anti–anti (3–5), and occasionally syn–syn–syn–syn (6). (ii) Three strands are oriented in one direction and the fourth is oriented in the opposite direction; the glycosidic angles are syn–anti–anti–anti or anti–syn–syn–syn (7). (iii) Two neighboring strands are oriented in one direction and the two remaining strands oriented in the opposite direction (as a result of which each strand has both parallel and anti-parallel adjacent neighbors); the glycosidic angles are syn–syn–anti–anti (8–10). (iv) Each strand has adjacent anti-parallel neighbors; the glycosidic angles are syn–anti–syn–anti (11–14).
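The four strand-directionality cases listed above can be sketched as a small lookup. This is a minimal illustration rather than an established tool: the function name, the 'u'/'d' encoding of strand direction and the convention of listing strands in cyclic order around the G-tetrad are assumptions made for this example; the glycosidic patterns are copied from the text.

```python
# Glycosidic patterns quoted in the text for each strand-directionality case.
GLYCOSIDIC_PATTERNS = {
    "i": ["anti-anti-anti-anti", "syn-syn-syn-syn"],
    "ii": ["syn-anti-anti-anti", "anti-syn-syn-syn"],
    "iii": ["syn-syn-anti-anti"],
    "iv": ["syn-anti-syn-anti"],
}


def classify_core(directions):
    """Return the case (i-iv) for four strand orientations.

    `directions` holds four 'u'/'d' characters (hypothetical encoding),
    listed in cyclic order around the G-tetrad.
    """
    directions = list(directions)
    if len(directions) != 4 or any(d not in {"u", "d"} for d in directions):
        raise ValueError("expected four orientations, each 'u' or 'd'")
    ups = directions.count("u")
    if ups in (0, 4):
        return "i"    # all four strands parallel
    if ups in (1, 3):
        return "ii"   # three strands one way, the fourth opposite: (3+1)
    # Two up and two down: distinguish neighbouring pairs from alternation.
    alternating = all(directions[i] != directions[(i + 1) % 4] for i in range(4))
    return "iv" if alternating else "iii"
```

For example, `classify_core("uudd")` returns `"iii"`, and `GLYCOSIDIC_PATTERNS["iii"]` then gives the syn–syn–anti–anti pattern associated with that case.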
Loops in G-quadruplexes are linkers connecting G-rich tracts that support the stacked G-tetrad core. The loops can be classified into four major families that depend in part on the size and sequence of the linkers: (i) Edge-wise or lateral loops connect two adjacent anti-parallel strands (Figure 2a), and are generally composed of two or more residues (9,15). (ii) Diagonal loops connect two opposing anti-parallel strands (Figure 2b) (8–10), and are generally composed of three or more residues. (iii) Double-chain-reversal or propeller loops connect adjacent parallel strands (Figure 2c) (7,16,17), and can be as small as one and as large as six or more residues. The adenine in single-residue double-chain-reversal loops that bridge two G-tetrad planes can form hydrogen bonds with one edge of the G-tetrad resulting in A–(G–G–G–G) pentad formation (18) or two opposing edges of the G-tetrad resulting in A–(G–G–G–G)–A hexad formation (16). (iv) V-shaped loops connecting two corners of a G-tetrad core in which a support column is missing (Figure 2d) (18).
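The geometric definitions of the first three loop families can be expressed as a short decision rule. The sketch below is illustrative only: the function name and the numbering of the four core corners as 0 to 3 in cyclic order are assumptions, and V-shaped loops are deliberately left out because they involve a missing support column rather than a simple pairing of two strands.

```python
def classify_loop(corner_a, corner_b, same_direction):
    """Name the loop family linking two strands of the G-tetrad core.

    corner_a, corner_b: corner positions 0-3 in cyclic order (assumed convention).
    same_direction: True if the two linked strands run parallel to each other.
    """
    if corner_a == corner_b or not {corner_a, corner_b} <= {0, 1, 2, 3}:
        raise ValueError("corners must be two different values from 0 to 3")
    adjacent = (corner_a - corner_b) % 4 in (1, 3)
    if adjacent and same_direction:
        return "double-chain-reversal (propeller)"
    if adjacent and not same_direction:
        return "edge-wise (lateral)"
    if not adjacent and not same_direction:
        return "diagonal"
    # Opposing parallel strands are not among the families described in the text.
    return "not covered by the families in the text"
```

Under these assumptions, `classify_loop(0, 2, False)` returns `"diagonal"`, matching the description of a loop that connects two opposing anti-parallel strands.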
Furthermore, loop residues can form base-pairing alignments, which in turn stack with the terminal G-tetrads, further stabilizing G-quadruplex structures. These include three bases in a plane, which can be classified either as base triples, where all three bases are non-contiguous in the sequence, or as base triads (19), where two adjacent bases from one strand are involved in the pairing alignment with a base from a second strand (20). Loop conformations can adopt diverse topologies (21,22) making them attractive targets for small molecule-based ligand recognition.
The G-quadruplex topology is defined by four grooves whose dimensions (depth and width) and accessibility vary based on both the overall topology and whether the loops are edge-wise or diagonal on one hand, and double-chain-reversal on the other. G-quadruplex formation requires monovalent cations, which are positioned within the central channel of stacked G-tetrads, thereby neutralizing the strong electrostatic potential associated with the inwardly pointing guanine O6 oxygens (23). The dehydrated cations are positioned either in a tetragonal bipyramidal coordination between G-tetrad planes (K+) (Figure 1d) (10), or in a range of geometries that span positioning within G-tetrad planes to out of plane alignments (Na+) (24). It has been shown that in general G-quadruplexes prefer K+ over Na+, and that this reflects in part the much greater energetic penalty for Na+ dehydration (25). Finally, the same sequence can adopt different G-quadruplex conformations in Na+ (14) and K+ (26) solution as determined by NMR, and also as monitored by fluorescently labeled oligonucleotides (27).
The subject of G-quadruplexes has been extensively reviewed in the literature (28–38). Despite a wealth of crystal and solution structures, it has proved difficult to define a comprehensive set of rules that specify the folding propensity of G-quadruplexes. Therefore, each new guanine-rich telomeric and oncogenic promoter sequence has to be individually structurally characterized as a function of monovalent cation type and, in addition, checked for conformational heterogeneity between two or more topologies in solution.
This review presents a structural biology perspective of recent advances in structures of G-quadruplexes formed by human telomeric and oncogenic promoter G-rich tracts, as well as the potential of small molecules to target specific G-quadruplex folds, thereby setting the stage for structure-based design of new classes of cancer therapeutics. The review also highlights the increasing attention being focused on G-quadruplexes formed by G-rich RNA sequences and their role in mRNA regulation and processing.
Guanine-rich tracts are observed in critical segments of eukaryotic and prokaryotic genomes, promoter regions, both short microsatellite and longer minisatellite repeats, ribosomal DNAs, as well as telomeres in eukaryotes and immunoglobulin heavy chain switch regions of higher vertebrates. These guanine-rich tracts have the potential to form G-quadruplexes following transient destabilization of the duplex, a process that accompanies transcription, replication and recombination. Systematic algorithmic searches of bacterial and human genomes for guanine-rich tracts (restricted to a minimum of four GGG segments separated by short linkers) (39–41) have noted that such putative G-quadruplex-forming sequences are prevalent in proto-oncogenes (which promote cell proliferation) and essentially lacking in tumor-suppressor genes (which maintain genomic stability) (42).
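The kind of pattern search described above (at least four GGG segments separated by short linkers) can be sketched with a regular expression. This is a minimal illustration and not the algorithm used in the cited studies: the function name is hypothetical, and the linker limit of 1 to 7 nucleotides is an assumption, since the text specifies only "short linkers". Only the strand supplied is scanned, and overlapping candidates are not enumerated.

```python
import re


def find_putative_quadruplexes(sequence, max_linker=7):
    """Return (start, end, match) for each run of four G-tracts of >= 3 G.

    max_linker is an assumed upper bound on linker length (the text says
    only "short linkers"); coordinates are 0-based, end-exclusive.
    """
    pattern = re.compile(r"(?:G{3,}[ACGT]{1,%d}){3}G{3,}" % max_linker)
    return [(m.start(), m.end(), m.group())
            for m in pattern.finditer(sequence.upper())]


# Four human telomeric repeats contain four GGG tracts with TTA linkers.
hits = find_putative_quadruplexes("AGGGTTAGGGTTAGGGTTAGGG")
print(hits)
```

A sequence with only three G-tracts, such as `"AGGGTTAGGGTTAGGG"`, yields an empty list under this pattern.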
An increasing number of proteins have been identified that bind, promote or non-catalytically disrupt G-quadruplex formation (43–46). Both the β-subunit of the Oxytricha telomere end-binding protein (βTBP) (47) and repressor activator protein 1 (RAP1) in Saccharomyces cerevisiae (48) promote intermolecular G-quadruplex formation. In addition, the MutSα protein, involved in mismatch repair, targets G-quadruplex DNA in G-loop segments and promotes synapsis of transcriptionally activated immunoglobulin switch regions (49). Activation-induced cytosine deaminase (AID) also targets G-quadruplex DNA and plays a role in immunoglobulin class switch recombination (50). On the other hand, binding of POT1, a protein conserved from fission yeast to humans (51), disrupts G-quadruplex formation at telomeric G-rich overhangs (52), thereby promoting telomere extension by telomerase (53).
In addition, helicases catalytically unwind and nucleases cleave G-quadruplexes (43–46). RecQ DNA helicase family members are associated with genomic instability and predisposition to malignancies. The Bloom and Werner syndrome RecQ helicases bind to (54) and unwind intermolecular G-quadruplex scaffolds with a 3′ to 5′ polarity in the presence of ATP and Mg2+ cations (55,56). Furthermore, G-quadruplex-specific nucleases cut within single-stranded DNA several nucleotides upstream of the G-quadruplex using a structure-specific mode of action (57–60). Gene disruption of such nucleases can lead to cellular senescence and telomere shortening (61). Such cleavage may also be required for DNA recombination and suggests that DNA quadruplexes may play a role in the formation of interchromosomal synapsis.
Strong evidence supporting G-quadruplex formation in vivo comes from the demonstration that in vitro generated single-chain antibody fragments specific for intermolecular telomeric G-quadruplex DNA react with ciliated protozoan Stylonychia lemnae macronuclei but not corresponding micronuclei (62). Additional evidence in support of G-quadruplex formation in vivo comes from the observation that telomere end-binding proteins control the formation of G-quadruplex DNA structures in vivo (63) and that intracellular transcription of G-rich DNAs induces formation of G-loops, novel structures containing G-quadruplex DNA on the non-template G-rich strand, as verified from nucleolin binding and sensitivity to G-quadruplex-specific nucleases (64). In addition, attempts have been made to monitor G-quadruplex formation at telomere proximal regions of chromosomal DNA using G-quadruplex-specific fluorescent 3,6-bis(1-methyl-4-vinylpyridinium) carbazole diiodide (BMVC) (65). Furthermore, it has been proposed that gene function correlates with potential for G-quadruplex formation in the human genome (42). Both inter- and intramolecular G-quadruplex formation has also been demonstrated for the diabetes susceptibility locus in the promoter region of the human insulin gene (66). Finally, guanine-rich tracts containing sequences capable of G-quadruplex formation have been shown to induce apoptosis in tumor cells (67–69).
In addition, guanine-rich RNA sequences capable of G-quadruplex formation have been identified in the vicinity of polyadenylation regions (70) involved in regulating 3′-end processing of mammalian pre-mRNAs (71). Such guanine-rich motifs can interact with hnRNP H protein subfamily members, thereby potentially mediating alternative, tissue-specific splicing events. There also appears to be a combinatorial code for splicing silencing which includes a combination of RNA UAGG and GGGG motifs (72). There are several examples of RNA G-quadruplex complexes that impact on pathways ranging from RNA processing, as in the case of the exoribonuclease mXRN1p (73), to translational repression, as in the case of the fd gene 5 protein (74). Guanine-rich tracts have also been observed within neuronal RNAs that bind the RGG-rich domain of the fragile X mental retardation (FXMR) protein (75,76).
Telomeres, nucleoprotein complexes located at the ends of eukaryotic chromosomes, are composed of tandem DNA repeats of guanine-rich sequences (77). Telomeres are essential for chromosomal stability and genomic integrity, provide sites for recombination events and transcriptional silencing, and appear to play a critical role in cellular aging and cancer (43,78–81). Telomeric DNA ends are composed of both duplex and guanine-rich 3′-overhang segments, with the former progressively decreasing in length after each round of cell division in somatic cells (82). By contrast, telomeric overhangs can be elongated by the enzyme telomerase, a ribonucleoprotein complex with reverse transcriptase activity (83), which is expressed in the majority of cancer cells, thereby helping to maintain telomere length (84).
The pairing of homologous chromatids at their telomere ends can be mediated through bimolecular quadruplex formation (11). Such quadruplex structures may also play a role in chromosome synapsis and recombination during meiosis (85).
The guanine-rich 3′-overhangs of telomeres, such as TTAGGG repeats in humans can equilibrate between single-stranded and monovalent cation-mediated G-quadruplex folds, with the latter inhibiting the activity of telomerase. The telomeric ends in a single-stranded form are maintained by hPOT1 (86), while disruption of this interaction leads to quadruplex formation. Thus, ligand-induced stabilization of telomeric G-quadruplex scaffolds in humans constitutes a promising strategy for anti-cancer drug development (87–91). Therefore, much effort has been devoted to the structural characterization of G-quadruplex topologies formed by one, two, three and four human telomeric TTAGGG repeats as a function of monovalent cation, so as to define the scaffolds for anti-cancer drug discovery.
Though extensive studies have been undertaken on both ciliate (Tetrahymena and Oxytricha) and eukaryotic (yeast and human) telomeres, the emphasis in this review will be primarily on human telomeres. Single-molecule fluorescence resonance energy transfer (FRET) studies of structure and unfolding kinetics of the intramolecular human telomere G-quadruplex revealed two stable folded conformations in both K+ and Na+ buffers (92). Both folded conformations can be opened by addition of complementary oligonucleotide, with temperature-dependent studies indicating that unfolding is entropically driven in K+ buffers (ΔH = 6.4 kcal mol−1 and ΔS = −52.3 cal mol−1 K−1), while unfolding in Na+ buffers exhibits a more significant enthalpic barrier (ΔH = 14.9 kcal mol−1 and ΔS = −23.0 cal mol−1 K−1). Single-molecule FRET spectroscopy has also been used to probe the dynamics of human telomeric DNA containing four guanine-tracts in K+ solution. Interconversion was detected between three FRET values, interpreted in terms of an unfolded and two folded G-quadruplex states, each of which was further subdivided into long- and short-lived species (93). The short-lived species were shown to determine the overall dynamics, apparently because they bridge transitions between the long-lived G-quadruplex states.
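As a rough worked example, the enthalpy and entropy values quoted for the K+ and Na+ buffers can be combined into a free energy term using ΔG = ΔH − TΔS. The sketch below rests on several assumptions that are not stated in the text: that both entropy values are in cal mol−1 K−1, that the two parameters may be combined in this simple way, and that 37°C (310.15 K) is a relevant temperature. The function name is hypothetical.

```python
def gibbs_energy_kcal(delta_h_kcal, delta_s_cal_per_k, temperature_k):
    """Return delta_H - T * delta_S in kcal/mol.

    delta_h_kcal: enthalpy term in kcal/mol.
    delta_s_cal_per_k: entropy term in cal/(mol K) (assumed unit).
    temperature_k: absolute temperature in kelvin.
    """
    return delta_h_kcal - temperature_k * delta_s_cal_per_k / 1000.0


T = 310.15  # 37 degrees C, an assumed temperature chosen for illustration

dg_potassium = gibbs_energy_kcal(6.4, -52.3, T)   # values quoted for K+ buffer
dg_sodium = gibbs_energy_kcal(14.9, -23.0, T)     # values quoted for Na+ buffer

print(f"K+ : {dg_potassium:.1f} kcal/mol")
print(f"Na+: {dg_sodium:.1f} kcal/mol")
```

Under these assumptions the two totals come out close to one another (about 22.6 and 22.0 kcal mol−1), with the entropic term −TΔS supplying most of the total in K+ and the enthalpic term contributing far more in Na+, in line with the qualitative description above.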
The earliest structural information of the human telomere focused on NMR studies of the single-repeat d(TTAGGGT) human telomere sequence in K+ cation solution (4). The NMR data established that the single-repeat human telomere sequence tetramerizes to form an all-parallel-stranded G-quadruplex composed of three stacked G-tetrads with all anti guanine glycosidic torsion angles.
The X-ray structure of d(TAGGGTTAGGGT) crystals grown from K+-containing solution defined the architecture of the G-quadruplex formed by the two-repeat human telomere sequence (17). The structure contained an unanticipated all-parallel-stranded G-quadruplex following bimolecular association of the two-repeat human telomere sequences, with the TTA segments forming double-chain-reversal (or propeller) loops (Figure 3a). In addition, the end segments also participate in formation of a T–A–T–A tetrad, through pairing of the major groove edges of Watson–Crick A–T pairs (17).
NMR studies on the two-repeat human telomere sequence d(TAGGGTTAGGGT) demonstrate interconversion between two dimeric G-quadruplex conformers consisting of three stacked G-tetrads in K+ solution (94). One of these conformers adopts a symmetric all-parallel-stranded G-quadruplex with double-chain-reversal loops and all anti guanines (Figure 3b), similar to that observed in the crystal structure (17). This conformer predominates for an analog containing a specific dU (in bold) for T substitution (designated U6)
The other conformer adopts an asymmetric anti-parallel G-quadruplex with edge-wise loops composed of six syn guanines and six anti guanines (Figure 3c). This conformer predominates for an analog (designated U1,brU7) containing specific dU and dbrU (in bold) for T substitutions
NMR-based complementary-strand trap, concentration-jump and temperature-jump methods have been used to monitor the kinetics of interconversion and activation barriers between the parallel and anti-parallel G-quadruplex conformers (94). The equilibrium shifts towards the anti-parallel G-quadruplex (Figure 3c) at low temperature and towards the parallel G-quadruplex (Figure 3b) at high temperature for the U1,brU7 sequence, with the corresponding enthalpy being 18.5 kcal mol−1. Furthermore, the anti-parallel G-quadruplex folds faster, but unfolds slower than the parallel quadruplex at temperatures below 40°C.
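The temperature dependence of this equilibrium can be illustrated with the van't Hoff relation, ln[K(T2)/K(T1)] = −(ΔH/R)(1/T2 − 1/T1). The sketch below makes assumptions beyond the text: that the quoted 18.5 kcal mol−1 is the enthalpy of converting the anti-parallel to the parallel conformer (positive, since the parallel form is favored at high temperature), that it is independent of temperature, and that 20°C and 40°C are useful comparison points. The function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def equilibrium_ratio(delta_h_kcal, t1_k, t2_k):
    """Return K(T2)/K(T1) from the van't Hoff relation.

    Assumes delta_h_kcal (kcal/mol) does not change between t1_k and t2_k.
    """
    return math.exp(-delta_h_kcal / R_KCAL * (1.0 / t2_k - 1.0 / t1_k))


# K is taken here as [parallel]/[anti-parallel] (assumed sign convention).
ratio = equilibrium_ratio(18.5, 293.15, 313.15)
print(f"K(40 C) / K(20 C) = {ratio:.1f}")
```

With these assumptions the equilibrium constant grows several-fold between 20°C and 40°C, consistent with the shift toward the parallel G-quadruplex at higher temperature.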
A related conformational equilibrium has also been observed between a pair of bimolecular G-quadruplexes formed by the d(TGGGGTTGGGGT) two-repeat Tetrahymena sequence in Na+-containing solution (95).
in Na+ solution (96). This sequence forms a unique asymmetric bimolecular quadruplex, in which the core composed of three stacked G-tetrads, involves all three G-tracts from one strand and only the last G-tract of the second strand. In this (3+1) G-quadruplex assembly, there is one syn–syn–syn–anti and two anti–anti–anti–syn G-tetrads, two edge-wise loops, three G-tracts oriented in one direction and the fourth oriented in the opposite direction (Figure 4a).
The (3+1) G-quadruplex topology adopted by the three-repeat human telomere sequence establishes how a segment containing three G-tracts can bind to the 3′-end G-tract of another segment. Such quadruplex formation could occur within the 3′-end overhang of human telomeres or when the 3′-end invades the adjacent double-stranded segment of the telomere to form the so-called t-loop (see schematic in Figure 4c) (97).
was solved in Na+ cation solution (9). The intramolecular fold contained three stacked G-tetrads connected by successive edge-wise, diagonal and edge-wise TTA loops. Each guanine-tract had both parallel and anti-parallel aligned neighboring strands around the G-quadruplex, with guanines adopting syn–syn–anti–anti glycosidic torsion alignments around each G-tetrad. The grooves were accessible for further recognition within this topology, while the connecting loops restricted access to the outward-directed faces of the terminal G-tetrads at both ends. Finally, the 5′- and 3′-termini project toward the same ends of the G-quadruplex (Figure 5a).
The X-ray structure of d[AG3(T2AG3)3] crystals grown from K+ cation solution exhibited a completely different and unanticipated fold (Figure 5c) and structure (Figure 5d) for the intramolecular G-quadruplex (17). The G-quadruplex was composed of three stacked G-tetrads, such that all strands are parallel, all guanines adopt anti conformations and all three loops are of the double-chain-reversal (or propeller) type. The double-chain-reversal loops restrict access to three of the grooves, while access is available to the outward-directed faces of the terminal G-tetrads at both ends. Finally, the 5′- and 3′-termini project toward opposite ends of the G-quadruplex (Figure 5c), thereby facilitating potential end-to-end alignments of successive G-quadruplexes.
These very different conformers reported for the four-repeat human telomeric sequence in Na+-containing aqueous solution (9) and in K+-containing crystals (17) appear to highlight the polymorphic character of G-quadruplex scaffolds (93) as a function of medium and/or monovalent cation type. Nevertheless, accumulating evidence, including biophysical measurements (98), implies that the intramolecular parallel-stranded G-quadruplex structure of the human telomere observed in K+-containing crystals is unlikely to be the major form in K+-containing aqueous solution. To this end, three groups have recently systematically investigated the solution structure(s) of four guanine-repeat human telomeric sequences in K+ cation solution, while keeping in mind that the more crowded environment of the crystal may more closely reflect the crowded situation in the cell nucleus.
The imino proton NMR spectrum of d[AG3(T2AG3)3] in K+ cation solution is indicative of multiple conformations in equilibrium and hence this sequence context is not readily amenable to structural characterization. Three research groups (those of Hiroshi Sugiyama, Danzhou Yang and our group) have taken somewhat different approaches to overcome this limitation and recently contributed to determination of the solution structure(s) of four-repeat human telomeres in K+ solution. Our group's approach is outlined in detail below and these results are placed in the context of independent contributions from the other two groups.
The imino proton NMR spectra corresponding to distinct predominant conformers together with one or more minor conformers were observed for the d[TAG3(T2AG3)3] sequence, where a T was added at the 5′-end (99), and for the d[TAG3(T2AG3)3TT] sequence, where a T was added at the 5′-end and a TT was added at the 3′-end (100), both in K+ cation solution, with both cases maintaining the sequence context of the TTAGGG human telomere repeat.
The NMR-based folding topology was determined for the predominant conformer of the d[TAG3(T2AG3)3] sequence in K+ cation solution (Figure 6a), and the solution structure determined for an analog containing terminal modifications (underlined) of this sequence, namely d[TTG3(T2AG3)3A], with the latter yielding exceptional NMR spectra reflecting a single conformer, together with the same 2D spectral characteristics of the unmodified sequence (99). Similarly, insertion of a single 8-bromoguanine at position G16 in the d[TAG3(T2AG3)3] sequence to enforce a syn glycosidic bond at this position also resulted in NMR spectra corresponding to a single conformer with all the spectral characteristics of the unmodified sequence (101). The solution structure has been determined for the d[TAG3(T2AG3)3] G-quadruplex (designated human telomere G-quadruplex form-1) (Figure 6b) (101), whose (3+1) topology differs from folds reported previously in Na+ solution (Figure 5a) (9) and K+-containing crystal (Figure 5c) (17). Instead, this G-quadruplex contains three G-tracts oriented in one direction and the fourth in the opposite direction, one anti–syn–syn–syn and two syn–anti–anti–anti G-tetrads, and a double-chain-reversal loop followed by two edge-wise loops (99).
The same G-quadruplex folding topology (Figure 6a) has been independently reported for the four-repeat human telomere sequences in K+-containing solution by two other laboratories, one of which used NMR (102,103), while the other used both CD (104) and NMR (105). The NMR investigation by the former group focused on the sequence d[AAAG3(T2AG3)3AA], with the resulting (3+1) topology (102) stabilized by a stacked A–A–A triple (103), associated with introduction of terminal adenine modifications (underlined) at either end of the sequence. The latter group's research avoided terminal modifications and was based on judicious positioning of between four and five 8-bromoguanine substitutions, which enforce a syn guanine alignment at the corresponding guanines in the sequence (104,105).
The NMR-based folding topology has also been determined for the predominant conformer of the d[TAG3(T2AG3)3TT] sequence in K+ cation solution (100). This sequence adopts the same (3+1) G-quadruplex core topology adopted by the predominant conformer of the d[TAG3(T2AG3)3] in K+ cation solution (99) outlined in the previous paragraph, except that the first two linkers are of the edge-wise type and the last linker adopts a double-chain-reversal loop (designated human telomere G-quadruplex form-2) (Figure 6c). Insertion of a single 8-bromoguanine at position G15 in the sequence to enforce a syn glycosidic bond at this position resulted in NMR spectra corresponding to a single conformer with all the spectral characteristics of the unmodified sequence (101). The solution structure of the d[TAG3(T2AG3)3TT] G-quadruplex form-2 is shown in Figure 6d (101). An independent NMR-based study (106) has reached the same conclusions reported above regarding the folding topology (100) and solution structure (101) of form-2.
The demonstration of G-quadruplex forms 1 (Figure 6a) and 2 (Figure 6c) for the four-repeat human telomere in K+, together with the all-parallel-stranded, propeller-groove-linked G-quadruplex observed in crystals grown from K+ solution (Figure 5c) (17), supports the view that multiple human telomeric G-quadruplex conformers can coexist in K+-containing solution, a conclusion also reached from single-molecule FRET studies of the four-repeat human telomere sequence (92). Furthermore, these studies establish that even small changes to flanking sequences perturb the equilibrium between different coexisting (3+1) G-quadruplex forms. More recent research has attempted to monitor G-quadruplex formation by the four-repeat human telomere in K+ solution under polyethylene glycol-induced crowding conditions (107) that perhaps mimic crystallization conditions.
The (3 + 1) G-quadruplex scaffold is unique in that three strands are oriented in one direction and the fourth is oriented in the opposite direction. Furthermore, two of the three G-tetrads adopt anti–anti–anti–syn alignments while the remaining G-tetrad adopts a syn–syn–syn–anti alignment. This topology was first reported in 1994 for the four-repeat Tetrahymena telomere sequence, d(T2G4)4, in Na+ solution (7) and observed a decade later for a four guanine-repeat variant bcl-2 promoter in K+ solution in which two guanines were replaced by thymines (108) (see bcl-2 sequence section).
The adoption of the (3 + 1) core G-quadruplex by the three-repeat human telomere dimeric G-quadruplex in Na+ solution (Figure 4a) (96), as well as by the four-repeat human telomere G-quadruplexes form-1 (Figure 6a) and form-2 (Figure 6c) in K+ solution, establishes it as a robust folding topology, thereby highlighting its candidacy as an important platform for structure-based drug design.
Bioinformatics sequence analysis indicates that guanine-rich tracts capable of G-quadruplex formation are prevalent in the human genome (39–41). In addition, it has recently been shown that promoter regions spanning 1 kb upstream of transcription start sites of genes are significantly enriched in putative G-quadruplex-forming motifs and that these putative promoter G-quadruplex-forming regions strongly associate with nuclease hypersensitivity sites (109). It has been suggested that such promoter-based G-quadruplexes may be directly involved in gene regulation at the level of transcription (110). This has led to extensive investigations of the role of promoter-mediated G-quadruplex formation in transcriptional regulation of the oncogenic promoters of c-myc (111), VEGF (112), HIF-1α (113), bcl-2 (114) and c-kit (115,116).
Since promoter regions are part of DNA duplexes, they would be unwound during replication, prior to G-quadruplex formation. Support for this concept has emerged from single-molecule FRET studies on the c-kit promoter (117). This process could be facilitated by formation of single-stranded tracts during transcription and further stabilized through addition of G-quadruplex-stabilizing ligands (118).
Human c-myc is a transcription factor that is central to regulation of cell growth, proliferation, differentiation and apoptosis (119–121). The c-myc gene that encodes this protein is tightly regulated in normal cells and its aberrant overexpression is associated with the progression of many cancers (122). c-myc can be deregulated as a result of translocation, mutation and/or amplification. An important element in the c-myc promoter region, termed the nuclease hypersensitivity element III1 (NHE III1), controls up to 90% of total c-myc transcription (123). The 27-nt purine-rich strand of this element, which contains six guanine-tracts (underlined)
has the capacity for forming alternate G-quadruplex folds depending on which tracts participate in scaffold formation (111,124,125). Guanine to adenine mutants within the 27-nt c-myc segment that destabilize G-quadruplex formation, result in increased c-myc transcription, while ligands like the porphyrin TMPyP4 that stabilize G-quadruplex formation, result in decreased c-myc transcription (111).
The imino proton NMR spectrum of the 27-nt c-myc NHE III1 segment containing six guanine-tracts exhibited characteristics of multiple G-quadruplex folds in equilibrium, including a broad envelope characteristic of aggregated species, precluding structural characterization. Therefore, systematic NMR studies have been restricted to four and five guanine-tract sequences as part of an effort towards understanding the underlying principles contributing to c-myc G-quadruplex formation.
Initial efforts have focused on G-quadruplexes that can be generated through involvement of four of the six guanine-tracts associated with the 27-mer c-myc NHE III1 element. Over 50 sequence variants were checked prior to the identification of two that gave imino proton spectral quality reflective of distinct single conformers that justified further structural characterization (126). One of these involved the second, third, fourth and fifth guanine-tracts (designated c-myc-2345) as reflected in the sequence
while the other involved the first, second, fourth and fifth guanine-tracts (designated variant c-myc-1245), with the guanines of the third tract replaced by thymines (in bold, below), as reflected in the sequence
The resulting NMR-based intramolecular G-quadruplex folding topologies in K+ solution for both c-myc-2345 and thymine for guanine-containing variant c-myc-1245 sequences contain a core of three stacked G-tetrads formed by four parallel G-tracts with all anti guanines and three double-chain-reversal loops bridging G-tetrad layers (126). The c-myc-2345 fold is shown in Figure 7a, while that for variant c-myc-1245 is shown in Figure 7b. These studies establish that single-residue (A or T) double-chain-reversal loops can bridge three G-tetrad layers. Indeed, systematic studies of DNA quadruplexes with different arrangements of short and long loops confirm that single-residue loops favor parallel-stranded topologies (127). Of the two G-quadruplex folds, c-myc-2345, which has a two-residue central loop (Figure 7a), is more stable by 15°C than variant c-myc-1245, which has a six-residue central loop (Figure 7b), in K+ solution. This is also reflected in the imino proton exchange lifetimes of the central G-tetrads, which are longer for c-myc-2345 than for variant c-myc-1245, suggesting slower unfolding kinetics for the former G-quadruplex (126).
An NMR-based solution structure has been reported for a variant c-myc-2345 sequence in which guanines G14 and G23 have been replaced by thymines (in bold, below)
The NMR-based G-quadruplex topologies for c-myc-2345 (Figure 7a) and variant c-myc-1245 (Figure 7b) (126), as well as the related study of the solution structure of the variant c-myc-2345 (Figure 7c) (128) G-quadruplexes correct earlier conclusions regarding proposed c-myc folding topologies based solely on interpretation of footprinting data (111), in an otherwise highly cited contribution.
The variant c-myc-1245 (126) and c-myc-2345 (128) sequences replace guanines by thymines within G-rich tracts. Thymine, unlike inosine, has nothing in common with guanine, and thymine for guanine substitutions represent a significant perturbation of the wild-type c-myc sequence. Therefore, structural studies were next extended to the c-myc sequence containing five of the six guanine-tracts associated with the 27-mer c-myc NHE III1 element, while avoiding any thymine for guanine substitutions. This sequence (designated c-myc-23456)
is composed of the second, third, fourth, fifth and sixth guanine-tracts. The NMR-based folding topology (Figure 8a) and solution structure (Figure 8b) of the c-myc-23456 G-quadruplex in K+ solution are composed of three stacked guanine tetrads formed by four parallel guanine-tracts with all anti guanines and a snap-back 3′-end syn guanine (129). The guanines involved in G-tetrad formation are highlighted in bold below
and involve guanines from each of the five tracts. This snap-back configuration is facilitated by a stable diagonal loop, which contains a G–(A-G) triad, which stacks on and caps the G-tetrad core at one end of the G-quadruplex. The 5′- and 3′-ends of the sequences are at opposite ends of the snap-back c-myc-23456 G-quadruplex (Figure 8a) (129), as they are for the c-myc-2345 (Figure 7a) and variant c-myc-1245 (Figure 7b) G-quadruplexes (126).
The c-kit proto-oncogene encodes a tyrosine kinase receptor that regulates signal transduction cascades controlling cell growth and proliferation (130). Oncogenic cellular transformations in c-kit are associated with mutations in structurally important regions, with human gastrointestinal stromal tumors (GIST) associated with mutations around the two main autophosphorylation sites in the juxtamembrane region (131), while myeloid leukemias and human germ cell tumors are associated with kinase domain mutants (132). The drug Gleevec (imatinib) is an effective in vitro and in vivo inhibitor of c-kit kinase activity and is widely used clinically against GIST (133). As with other small molecule drugs targeted against kinases, new patterns of resistance mutations within the active site result in diminished binding and clinical effectiveness of the drug (134).
Selective gene regulation at the transcription level provides an alternate approach to c-kit inhibition. This can be achieved by induction of G-quadruplex structures within G-rich tracts of the c-kit promoter and their potential stabilization by bound ligands. Recently, imino proton NMR spectral studies established that the c-kit1 22-mer sequence
positioned between −87 and −109 nt upstream of the transcription start site of the human c-kit gene, forms a single G-quadruplex scaffold in K+ solution (115). Expectations that this sequence, which contains four GGG tracts (underlined, above), forms a conventional G-quadruplex, appeared unlikely when it was found that mutations within the linker segments were detrimental to G-quadruplex formation (115). It should be mentioned that a second highly conserved guanine-rich sequence has been recently identified in the c-kit gene, at a site critical for core promoter activity (116).
The NMR-based solution structure has been determined for the 22-mer c-kit1 sequence in K+ cation solution (135). The c-kit1 sequence, which exhibits an exceptionally well-resolved NMR spectrum (115), adopts a G-quadruplex topology (Figure 8c) and solution structure (Figure 8d) composed of three stacked G-tetrads and four connecting loops. The guanines involved in G-tetrad formation (in bold, below) include isolated guanine G10, but exclude G20 of the last G-tract.
Two single-residue linkers (A5 and C9) form two double-chain-reversal loops that bridge three G-tetrad layers, the two-residue linker connects two adjacent corners (G10 and G13), while the five-residue linker allows the terminal G21–G22 step to be inserted back into the G-quadruplex core. The loops are stabilized through formation of a Watson–Crick A–T pair that stacks over the top of the G-quadruplex and two non-canonical G–A pairs that stack over the bottom of the G-quadruplex.
This structure establishes a new folding principle that an isolated guanine (G10 in the present case) within a non-G-tract segment can participate in the formation of the structured G-quadruplex core (135). This result raises an element of caution regarding the use of programs that predict G-quadruplex folding topologies from sequence data, where they rely solely on the participation of guanines within G-tracts. Another notable feature is associated with formation of a snap-back parallel-stranded G-quadruplex core, where the last two guanines insert back into the core to complete adjacent G-tetrad alignments (Figure 8c). The 5′- and 3′-ends of the sequences are at opposite ends of the snap-back c-kit1 G-quadruplex, thereby allowing continuation of the DNA sequence in both directions without significant steric hindrance.
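The caution raised above can be illustrated with a minimal sketch of a pattern-based search. The regular expression used here (four runs of three or more guanines separated by loops of one to seven residues) is an assumed, commonly used motif definition and is not taken from any specific program cited in this review. The function names and the toy sequence containing an isolated loop guanine are hypothetical and are not the c-kit1 sequence.

```python
import re

# Assumed motif: four G-tracts (>= 3 G each) separated by loops of 1-7 residues.
# The lookahead allows overlapping candidates to be reported.
G4_MOTIF = re.compile(
    r"(?=(G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}))"
)

def find_putative_quadruplexes(seq):
    """Return (start, matched_sequence) for every motif hit in a DNA sequence."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in G4_MOTIF.finditer(seq)]

def guanines_in_tracts(hit):
    """Return 0-based positions (within hit) of guanines located in runs of >= 3 G."""
    positions = []
    for m in re.finditer(r"G{3,}", hit):
        positions.extend(range(m.start(), m.end()))
    return positions

# Four-repeat human telomere sequence, (TTAGGG)4.
telomere_hits = find_putative_quadruplexes("TTAGGG" * 4)

# Hypothetical toy sequence with one isolated guanine inside a loop.
toy = "TGGGTGGGTAGTTGGGTGGGT"
toy_start, toy_hit = find_putative_quadruplexes(toy)[0]
tract_positions = guanines_in_tracts(toy_hit)
isolated = [i for i, base in enumerate(toy_hit)
            if base == "G" and i not in tract_positions]
print(telomere_hits)
print(toy_hit, isolated)
```

A tract-only model of this kind treats every guanine listed in `isolated` as a loop residue, so it cannot anticipate a core of the type observed for c-kit1, where an isolated guanine completes a G-tetrad.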
Both the c-myc 23456 (Figure 8a) (129) and c-kit1 (Figure 8c) (135) scaffolds contain distinct pronounced clefts, with their unique surface topologies making them attractive site-selective targets for drugs.
The bcl-2 gene mediates the t(14;18) chromosomal translocation associated with the onset of lymphomas (136,137). The bcl-2 gene is overexpressed in several human cancers, with the gene product functioning as an apoptosis inhibitor, thereby impacting adversely on the therapeutic action of cancer treatment regimes in the clinic (138). Thus, both the bcl-2 gene and its gene product constitute rational targets for anti-cancer therapy.
Transcriptional initiation of bcl-2 is controlled by a major promoter P1, containing a guanine-rich strand upstream of the initiation site and proximal to a nuclease hypersensitivity region (114). This bcl-2 promoter region contains six guanine-tracts containing three or more contiguous guanines (underlined)
with non-denaturing gel, footprinting and CD data interpreted in terms of a mixture of at least three G-quadruplex conformers in K+ solution. The second to fifth G-tracts (designated bcl-2 2345) form the most stable G-quadruplex (114), and an attempt has been made to structurally investigate this sequence composed of the four central guanine-tracts. The NMR studies were undertaken on a variant in which guanines G15 and G16 were replaced by thymines (in bold, below) (108).
The same (3 + 1) G-quadruplex scaffold was first reported over a decade ago for the four-repeat Tetrahymena telomere sequence, d(T2G4)4,
Replacement of single guanines by inosines, where the exocyclic amino groups are replaced by protons, has been used previously in NMR-based studies of G-quadruplex formation in efforts to improve spectral quality (96). By contrast, replacement of two guanines by thymines in variant bcl-2 2345 constitutes a much more serious perturbation, especially for an internal guanine-tract, preventing these two guanines from potential participation in G-tetrad formation. Thus, opportunities exist for structurally investigating unperturbed bcl-2 oncogenic promoter sequences, perhaps involving five of the six guanine-tracts, as was accomplished previously for c-myc-23456 (129).
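The tract-numbering designations used for these promoter sequences (for example, bcl-2 2345 and c-myc-23456) can be enumerated with a short sketch. It only counts the possible choices of guanine-tracts for a promoter containing six tracts; it makes no claim about which combinations actually fold. The function names are hypothetical, and the single-digit labeling assumes no more than nine tracts.

```python
from itertools import combinations

def tract_designations(n_tracts, n_used):
    """List designations such as '2345' for each choice of n_used tracts out of n_tracts."""
    return ["".join(str(i) for i in combo)
            for combo in combinations(range(1, n_tracts + 1), n_used)]

def is_contiguous(designation):
    """True when the designated tracts are consecutive along the sequence."""
    nums = [int(c) for c in designation]
    return all(b - a == 1 for a, b in zip(nums, nums[1:]))

four_of_six = tract_designations(6, 4)   # candidates for four-tract constructs
five_of_six = tract_designations(6, 5)   # candidates for five-tract constructs
contiguous_four = [d for d in four_of_six if is_contiguous(d)]

print(len(four_of_six), contiguous_four)
print(len(five_of_six), five_of_six)
```

Non-contiguous choices such as 1245 correspond to constructs in which an internal tract is excluded from the core, which in the studies described above was enforced by thymine-for-guanine substitution.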
Vascular endothelial growth factor (VEGF) stimulates the formation of new blood vessels, providing oxygen and nutrients to primary tumor sites, thereby facilitating the proliferation of cancer cells. VEGF-mediated tumor angiogenesis has stimulated interest in the VEGF gene and its potential as a target for cancer therapy (140). Elevation of VEGF expression in cancer is primarily regulated at the transcription level, with the VEGF promoter containing a purine-rich strand composed of five guanine-tracts of at least three guanines each (underlined)
that also serves as binding sites for Sp1 and Egr-1 transcription factors. The guanine-rich VEGF sequence forms G-quadruplex structures in monovalent cation solution (as monitored by CD and footprinting measurements), which are stabilized by G-quadruplex-interacting agents TMPyP4 and telomestatin (112). In addition, a DNase I and S1 nuclease hypersensitivity site was identified to the 3′-side of the G-quadruplex forming region, but not for mutant sequences that inhibit quadruplex formation. Finally, the CD spectrum of the guanine-rich VEGF sequence in K+ is consistent with formation of a parallel-stranded G-quadruplex. Overall, the results are suggestive of the importance of structural transitions in enhancing open promoter complex formation, thereby facilitating transcriptional regulation (112).
Hypoxia inducible factor-1α (HIF-1α) is activated in many common human tumors and is associated with local invasion and metastasis (141). The HIF-1α promoter contains five guanine-rich tracts of at least three guanines each (underlined)
capable of all-parallel-stranded G-quadruplex formation in K+ solution, as indicated by chemical probing, CD and DNA polymerase arrest assays (113). Considerable effort has gone towards targeting HIF-1α in cancer therapy (142).
To date, no systematic structural investigations have been undertaken to determine the G-quadruplex structures adopted by the guanine-rich tracts of either the VEGF or HIF-1α promoters.
A series of nucleotide or repeat expansion disorders caused by the dynamic intergenerational expansion of triplet repeat d(CGG)n–d(CCG)n, d(CAG)n–d(CTG)n and d(GAA)n–d(TTC)n sequences are associated with neurological, neuromuscular and neurodegenerative disorders (143,144). These diseases exhibit genetic anticipation, whereby the symptoms and penetrance are manifested in subsequent generations at a decreased age of onset and increased severity. The expandable repeats are found in diverse settings ranging from coding segments, to 5′- and 3′-UTRs, promoter regions and introns. It is likely that the pathogenesis of these debilitating diseases, and their disruption of cellular replication, repair and recombination machineries, reflects unusual DNA conformations generated for long repeats, for which several secondary structural models have been proposed in the literature (145–149). These guanine-containing repeats within complementary repetitive strands of the duplex can form slip-out hairpin-like folds (150), which in turn could form higher order architectures, including quadruplex formation following bimolecular association. One of these repeat expansion models proposes that the higher order structures stall the replication fork, giving time for addition of extra repeats, prior to replication fork restart (151).
Though the early emphasis on triplet expansion diseases was focused on the DNA template, more recent analysis has brought RNA repeats to the forefront, with the emphasis on gain-of-function contributions at the RNA level (152). Thus, structural studies need to be undertaken on both triplet repeat-containing DNAs and RNAs.
There has been considerable interest in the molecular basis for expansion of d(CGG)n–d(CCG)n tracts in genomic DNA that results in the onset of the FXMR syndrome (153,154), the single most common inherited cause of mental retardation (155). The d(CGG)n triplet repeat (which can be designated a CGG, GGC or GCG repeat depending on the phase of the readout) is observed within the first exon of the FMR-1 gene with n < 30 in normal individuals. This number increases up to ~200 in premutation carriers and further expands up to 2000 in individuals afflicted with fragile X syndrome. The genetic instability associated with the expansion of d(CGG)n repeats to the diseased state is facilitated by hypermethylation of cytosine residues (156) and results in suppression of FMR-1 gene transcription (154) and delay in replication in patients with the FMR-1 syndrome (157). It was initially shown that the fragile X syndrome d(CGG)n repeat forms a stable G-quadruplex in the presence of monovalent cations when n = 7, and also when n = 5, for its methylated cytosine counterpart (158). In addition, d(CGG)n repeats form structures that block DNA synthesis in vitro (159), with the block overcome by the Werner syndrome (WRN) helicase (160). Interestingly, the cationic porphyrin TMPyP4 (161) and the hnRNP-related protein CBF-A (162,163) both destabilize quadruplex formation, in contrast to their structural stabilization of the human telomere G-quadruplex.
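The statement that the same triplet repeat can be designated CGG, GGC or GCG depending on the phase of the readout can be checked with a small sketch. The function names are hypothetical and introduced only for illustration; the example treats a repeat unit as a plain string and lists its cyclic rotations.

```python
def repeat_phases(unit):
    """All cyclic rotations of a repeat unit, one for each reading phase."""
    return [unit[i:] + unit[:i] for i in range(len(unit))]

def same_repeat(unit_a, unit_b):
    """True when two units describe the same repeat read in different phases."""
    return len(unit_a) == len(unit_b) and unit_b in repeat_phases(unit_a)

print(repeat_phases("CGG"))        # the three phase designations of the repeat
print(same_repeat("CGG", "GCG"))   # same strand, shifted phase
print(same_repeat("CGG", "CCG"))   # CCG belongs to the complementary strand
```

The same reasoning explains why a sequence built from the repeat necessarily embeds every phase: a tract such as `"CGG" * 7` contains CGG, GGC and GCG steps.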
Very high-quality NMR spectra were observed for d(GCGGT3GCGG), a sequence that embeds CGG and GCG steps, in Na+ solution, thereby defining a distinct folding topology (Figure 10a) and solution structure (Figure 10b) (164).
The sequence forms a bimolecular quadruplex containing G–C–G–C tetrads (Figure 10c) flanked by G–G–G–G tetrads in solution. The loops adopt edge-wise conformations and are aligned at opposite ends of the bimolecular quadruplex, while the strand directionalities alternate around the G-quadruplex and the G-tetrads adopt anti–syn–anti–syn alignments (Figure 10a). These studies establish the pairing alignments that can be potentially utilized by sequences containing the fragile X syndrome d(CGG)n triplet repeat to form quadruplex structures. Such quadruplex structures, stabilized by a mixture of G–C–G–C and G–G–G–G tetrads [see also, (165), for an alternate, but not structurally characterized quadruplex model], could serve as potential blockage sites for the progress of replication forks and account for the blockage at the fragile X locus observed experimentally (157).
The d(GAA)n-repeat is of considerable biological interest since expansion of d(GAA)n–d(TTC)n triplet repeats located within the first intron of the frataxin gene contributes to Friedreich's ataxia, an autosomal recessive neurodegenerative disease (166). The non-G–C rich nature of the sequence, together with the intronic localization and the requirement of both alleles, makes Friedreich's ataxia unique amongst the triplet-repeat disease sequences. Expansion of the d(GAA)n triplet repeat leads to reduced levels of frataxin mRNA transcripts, and it has been shown to reflect impediment in transcription elongation, in a length and supercoil dependent manner (167). This impediment could reflect formation of a stable nucleic acid architecture (168), and several models have been proposed ranging from triplexes (169) to parallel-stranded duplexes (170). The parallel-stranded duplex model for d(GAA)n triplet repeats is intriguing, since further bimolecular pairing could result in quadruplex formation.
G-quadruplexes can contain pairing alignments beyond the G-tetrad and considerable effort has gone into defining these alignments. These include other homo- and mixed-tetrad pairing alignments, triads, pentads, hexads and heptads. Triads and triples are generally observed within edge-wise and diagonal loop regions, where they stack on terminal G-tetrads. By contrast, mixed tetrads, pentads and hexads are observed at both the ends and within G-quadruplexes.
Early structural studies identified edge-wise (9,15), and diagonal (8,10) loops that bridged anti-parallel-aligned columns around the G-quadruplex. An unanticipated development was the identification of double-chain-reversal loops that bridged adjacent parallel-aligned columns within the four-repeat Tetrahymena G-quadruplex (7). In this case, two thymine residues span three stacked G-tetrad planes. Next it was demonstrated that single residue double-chain-reversal loops can span two G-tetrad planes in an all parallel-stranded G-quadruplex (16). The importance of double-chain-reversal loops emerged center-stage following the structure determination of the four-repeat human telomere from crystals grown in K+ solution (17), where all three TTA loops were of the double-chain-reversal (or hairpin) type and each spanned three stacked G-tetrad planes (Figure 5c). The next discovery was that single-residue double-chain-reversal loops could span both two (16) and three (126) stacked G-tetrads. The latter result was most unexpected but was confirmed in subsequent studies on additional G-quadruplex folds (128,129,171,172).
The standard view of G-quadruplex formation involves a scaffold stabilized by stacked G–G–G–G tetrads. Nevertheless, mixed tetrads can also stabilize G-quadruplex formation and these include major groove-aligned G–C–G–C tetrads of the direct (Figure 10c) (14,164,173,174) and slipped (Figure 10d) (26) type and major groove-aligned A–T–A–T tetrads of the direct (17) and slipped (173) type. Minor groove-aligned mixed G–C–G–C and A–T–A–T tetrads have also been structurally characterized, but the bases deviate significantly from the tetrad plane (175,176).
NMR studies on d(AGGGT) in K+ solution are consistent with formation of a parallel-stranded G-quadruplex (177). Somewhat unexpectedly, nuclear Overhauser enhancement (NOE) cross peaks were observed between the adenine amino protons and the non-exchangeable H8 and H2 protons. This has led to the proposal of A–A–A–A tetrad formation, with rapid interconversion between N6H•••N7 and N6H•••N3 hydrogen-bonding alignments. Furthermore, the terminal adenine residues appear to adopt syn glycosidic torsion angles based on the strong H8 to H1′ NOEs observed at short mixing times, suggestive of an A(syn)–A(syn)–A(syn)–A(syn) alignment (177). A more definitive approach would have been to use 15N isotopic labeling to directly monitor scalar coupling to define hydrogen-bonding alignments (174,178,179), thereby validating the proposed A–A–A–A tetrad formation. The NMR-based conclusions contrast with crystallographic studies of RNA sequences (discussed in more detail in topologies and tetrad alignments section), where A–A–A–A tetrads have been definitively identified, but shown to adopt A(anti)–A(anti)–A(anti)–A(anti) alignments (180,181).
The concept of an anti-parallel DNA duplex stabilized by base triads was proposed more than a decade ago (19). A base triad involves alignment of three bases in a plane, where a base from one strand interacts through hydrogen bonding with two adjacent co-planar bases from the partner strand. The coplanar-aligned adjacent bases essentially form a platform, a feature identified initially in RNA (182). A triad differs from a triple, where the three bases come from three distinct strands. There are now several examples of base triads stacked over the terminal G-tetrads of G-quadruplexes. These include A–(T-A) (Figure 11a) (101,183), G–(C-A) (Figure 11b) (20), T–(A-A) (184), T–(A-T) (101), G–(A-G) (129), and G–(T-T) (185) triads, where in each case, the co-planar adjacent bases that constitute the platform, are indicated in brackets.
NMR-based investigations of G-quadruplexes have also identified formation of A–(G–G–G–G) pentads (Figure 11c) (18,171), A–(G–G–G–G)–A hexads (Figure 11d) (16) and heptads (186). Such alignments essentially are composed of G–(A-G) triads, where one or more A residue(s) align(s) along one or more minor groove edge(s) of a G-tetrad.
A systematic and penetrating study has reported on the impact of guanine modifications on formation of parallel four-stranded G-quadruplexes (187). These authors measured G-quadruplex association and dissociation kinetics to estimate the energetic penalty associated with single-site modifications of 12 different substitutions. Modifications involving the hydrogen-bonding positions on the guanine ring (O6, N1, N2 and N7) were detrimental to G-quadruplex stability as reflected in decreased association rate constants and reduced quadruplex lifetimes. The most deleterious effects were observed for central guanine substitutions, suggestive of an important role for this position in the nucleation process. By contrast, modifications that perturb neither the central carbonyl group alignment nor the cyclic hydrogen-bonding pattern are tolerated, as are other planar bicyclic ring systems that retain such constraints. Thus, substitution of guanine by either 8-bromoguanine or 6-methyl-isoxanthopterin accelerates quadruplex formation, especially when substituted at the 5′-end of the G-tract. It is conceivable that the bromo and methyl groups in these substitutions favor hydrophobic collapse during the process of strand association. These modifications also favor a syn glycosidic torsion angle, which correlates with the observation of syn glycosidic torsion angles at the 5′-guanine positions in (3 + 1) G-quadruplex scaffolds (7,101). Finally, non-guanine tetrads are destabilizing when positioned internally within a G-quadruplex, but can be accommodated when positioned over terminal G-tetrads due to stabilizing stacking interactions (187). A systematic study has also been undertaken on the effect of G-tract length on the topology and stability of intramolecular G-quadruplexes (188).
Two G-quadruplexes can interact through end-to-end stacking (16) or alternately through an interlocked configuration (18,171), where a guanine from one monomer completes the G-tetrad through interaction with three guanines from the other monomer. Such quadruplex–quadruplex interactions, especially those of the interlocked type, result in very stable topologies, and can involve the participation of junctional G–C–G–C tetrads (Figure 10c) (174), A–(G–G–G–G) pentads (Figure 11c) (18) and A–(G–G–G–G)–A hexads (Figure 11d) (16).
Previous NMR-based studies had demonstrated that the d(GGAGGAT) sequence formed a two-stranded arrowhead motif aligned solely through non-canonical pair formation under low (10 mM) Na+ counterion conditions (189). Further NMR-based studies of this sequence and related d(GGAGGAG) indicated a pronounced conformational change on proceeding to moderate (150 mM) Na+ counterion conditions (16). Structural characterization of the moderate salt conformer demonstrated formation of end-to-end stacked G-quadruplexes involving four strands with a unique folding topology (Figure 11e) and solution structure (Figure 11f) for the d(GGAGGAG) sequence in 150 mM Na+ solution (16). Each G-quadruplex monomer, formed by alignment of two d(GGAGGAG) strands, is composed of a junctional A–(G–G–G–G)–A hexad (Figure 11d), a G–G–G–G tetrad and an A–A non-canonical pair. The A3 residue, involved in double-chain-reversal loop formation, also participates in A–(G–G–G–G)–A hexad formation. The end-to-end stacking of G-quadruplex monomers is mediated through stacking of their junctional A–(G–G–G–G)–A hexads (Figure 11e). A combination of Brownian dynamics and molecular dynamics simulations identified several stable monovalent cation-binding sites within the end-to-end stacked G-quadruplex scaffold (16).
The guanine-rich d(G3AG2T3G3AT) sequence
contains one GG and two GGG segments and therefore was not expected to form a monomeric intramolecularly folded G-quadruplex. Despite this limitation of lacking four guanine-tracts, the sequence gave exceptional NMR spectra associated with a single conformation in Na+ solution (18). The stoichiometry of two, coupled with the number of resonances, established that d(G3AG2T3G3AT) folds by interaction between symmetry-related G-quadruplexes. A uniformly 13C,15N-labeled sample was prepared to facilitate resonance assignments and identify hydrogen-bonding alignments (174,178,179). NMR-based NOE and hydrogen-bonding constraints defined the folding topology (Figure 12a) and solution structure (Figure 12b) associated with a pair of interacting G-quadruplexes (18). Each symmetry-related G-quadruplex monomer undergoes three sharp turns, with all purines involved in pairing alignments. The first turn is of the double-chain-reversal type, the second turn is of the edge-wise type and the last involves a new alignment, the V-shaped turn (Figure 2d). Each monomer contains two stacked G(anti)–G(anti)–G(anti)–G(syn) tetrads, one of which forms an A–(G–G–G–G) pentad. There is a break in one of the four G-G columns that link adjacent G-tetrads within each monomer, resulting in a V-shaped scaffold. The A–(G–G–G–G) pentads from each monomer mutually stack on each other, with each pentad containing four bases from one monomer and a syn G1 from the partner monomer, thereby resulting in interlocked G-quadruplex formation (18).
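The segment count quoted above for d(G3AG2T3G3AT) can be verified by expanding the compact notation and listing the guanine runs. The sketch below is illustrative only; the function names and the choice of a two-guanine minimum run length are assumptions introduced here.

```python
import re

def expand_shorthand(shorthand):
    """Expand compact notation, e.g. 'G3AG2T3G3AT' becomes 'GGGAGGTTTGGGAT'."""
    return "".join(base * int(count or 1)
                   for base, count in re.findall(r"([ACGTU])(\d*)", shorthand))

def guanine_runs(seq, min_len=2):
    """Return (start, length) for each run of at least min_len consecutive guanines."""
    pattern = r"G{%d,}" % min_len
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, seq)]

seq = expand_shorthand("G3AG2T3G3AT")
runs = guanine_runs(seq)
lengths = sorted(length for _, length in runs)
print(seq, runs, lengths)
```

With only three guanine runs, of lengths 2, 3 and 3, a single strand cannot supply the four columns of a conventional intramolecular G-quadruplex, consistent with the bimolecular interlocked architecture described above.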
HIV-1 integrase catalyzes the integration of proviral DNA into the host-cell genome, a reaction critical for efficient viral replication. NMR-based studies have solved the solution structure of an in vitro selected guanine-repeat-containing 93del DNA sequence
a potent nanomolar inhibitor of both processing and strand transfer functions of HIV-1 integrase (190). This sequence forms an unusually stable interlocked G-quadruplex architecture in K+ solution with a fold shown schematically in Figure 12c and solution structure shown in Figure 12d (171). Within each monomer subunit, one A–(G–G–G–G) pentad is sandwiched between two G–G–G–G tetrads, all G-stretches are parallel and linked by three double-chain-reversal loops, and all guanines are anti, except for G1, which is syn. Interlocked G-quadruplex formation is achieved through mutual pairing of G1 of one monomer with three other guanines of the other monomer, to complete junctional G-tetrad formation.
The interlocked G-quadruplex scaffold with its distinct surface architecture could be shape-specifically targeted by ligands, or in turn serve as a ligand that targets interfacial channels on multimeric proteins. Indeed, molecular docking approaches suggest that the 93del interlocked DNA G-quadruplex could potentially be positioned within a basic canyon formed between subunits of the tetrameric HIV-1 integrase (171).
The snap-back parallel-stranded G-tetrad core scaffolds have been observed for both the c-myc-23456 (Figure 8a) (129) and the c-kit1 (Figure 8c) (135) G-quadruplexes. In both cases, there is an interruption of the G-tetrad core, with base-pairing alignments in the last connecting loop important in stabilizing the snap-back scaffold. The c-myc-23456 (Figure 8a) and c-kit1 (Figure 8c) G-quadruplexes differ in that the former involves insertion of a single syn guanine (129), while the latter involves insertion of two anti guanines (135). The snap-back feature allows for continuation of the DNA sequence in both directions without significant steric hindrance.
The majority of attention has focused on DNA quadruplexes, their diversity of scaffolds and their potential role in biology. By contrast, RNA quadruplexes have received less attention, despite implications of their involvement at sites of RNA packaging (191), endonucleolytic cleavage activity (192), translational control (193) and mRNA turnover (73).
RNA quadruplex formation was initially established from NMR studies on r(UGGGGU) (194) and subsequently by a 0.61 Å crystal structure of the same sequence for crystals grown in Sr2+-containing solution (195). This sequence forms a parallel four-stranded RNA quadruplex with all anti guanines in both solution and the crystalline state, with Sr2+ sandwiched between every other G-tetrad plane in the crystal. These studies also identified formation of U–U–U–U tetrads, as well as G- and U-containing octads in the crystal, where uracils pair with the minor groove edges of guanines of the G-tetrad. Additional aspects of quadruplex structure have emerged from the 1.4 Å crystal structure of d(brU)-r(GAGGU) (181) and 1.5 Å crystal structure of r(U)-d(brG)-r(AGGU) (180). These studies unequivocally demonstrate formation of Na+ cation-coordinated all anti A–A–A–A tetrads involving either N6H•••N7 (former quadruplex) or N6H•••N3 (latter quadruplex) hydrogen-bonding alignments. More recently, a crystal structure of r(UGGUGU) established an even higher order architecture associated with a dimer of quadruplexes scaffold (6).
Bioinformatic searches for guanine-rich sequences in 5′-UTRs of the human genome have recently identified up to 3000 putative G-quadruplex-forming elements (196). One of these sequences, an 18-mer containing four guanine-tracts
is associated with the 5′-UTR of the oncogenic N-ras sequence, located 14 nucleotides downstream of the 5′-cap and 222 nucleotides upstream of the translation start site. This sequence, which contains four guanine-tracts, is highly conserved across species, both within the guanine-rich segments and its position relative to the translation start site, and forms a G-quadruplex (as monitored by CD) as a function of monovalent cation. The measured Tm was 63°C in 1 mM K+ cation, and the stabilization decreased in the order K+ > Na+ > Li+ (196). The RNA G-quadruplex was very stable since unfolding was not observed even at 95°C in K+ solution. The CD spectrum exhibited characteristics of a parallel-stranded G-quadruplex with a positive peak at 263 nm and a negative peak at 241 nm. The authors used a cell-free translation system coupled to a reporter gene assay to demonstrate that the N-ras G-quadruplex inhibits gene expression at the translational level (196). This seminal result opens opportunities for the identification of small molecule therapeutic agents with the potential for stabilizing 5′-UTR RNA G-quadruplex formation, thereby inhibiting translation of oncogenes.
There remain many structural challenges associated with G-quadruplex architecture, as well as the impact of structure on function. Several of these are outlined below.
To date there is no evidence for formation of all-purine major groove-aligned G–A–G–A tetrads within a quadruplex scaffold. The design and identification of G–A–G–A tetrads would expand the repertoire of sequences that can form quadruplex scaffolds. There are several possibilities for non-canonical G–A pairing alignments stabilized by two hydrogen bonds. These include G(anti)–A(anti) pairing along their Watson–Crick edges (197–199), sheared G(anti)–A(anti) pairing using the minor groove edge of G and major groove edge of A (200,201), G(anti)–A(syn) pairing using the major groove edge of A (202), and G+(syn)–A(anti) pairing using the major groove edge of protonated G (203,204). Such G–A non-canonical pairs could potentially align along their major groove edges to form G–A–G–A tetrads. It remains to be demonstrated whether G–A–G–A tetrad alignments can be accommodated in an otherwise G–G–G–G tetrad-containing G-quadruplex. The G–A–G–A tetrad has only two inwardly pointing carbonyls in contrast to four inwardly pointing carbonyls in the G–G–G–G tetrad available for monovalent cation coordination. Thus, the role of monovalent cations in stabilization of G–A–G–A tetrads is less clear at this time. It should be noted that ethanol can substitute for monovalent cations in facilitating quadruplex formation (205).
It is conceivable that the Friedreich's ataxia d(GAA)n triplet-repeat sequence (166) could adopt a quadruplex scaffold stabilized by G–A–G–A and A–A–A–A tetrads. It remains to be established whether such a quadruplex forms in solution, and if so, what the strand directionalities are and which tetrad alignments define the topology.
Published structural efforts to date have focused on human telomeres containing one (4), two (17,94), three (96) and four repeats (9,17,99–106). However, there are interesting questions concerning (TTAGGG)n repeats, where n is >4, especially regarding the issue of whether adjacent human telomeric G-quadruplexes stack on each other (17), thereby adopting a beads-on-a-string architecture (206). In addition, in a particularly innovatively designed experiment, it has been shown that chiral cyclic-helicene molecules of a particular size, which are capable of wedge formation, can target quadruplex–quadruplex junctions and stabilize higher order human telomere structures (207). Both the architecture of quadruplex–quadruplex junctions and their complexes with ligands such as chiral cyclic-helicene constitute a significant challenge for the future.
In contrast to significant progress on the structures of the G-quadruplex folds of the c-myc (126,128,129) and c-kit (135) sequences in K+ solution, and the human telomere sequence in Na+ (9) and K+ (96,99–106) solution, little is known about the architecture and stability of quadruplex–duplex junctions, as would occur in a natural context, where telomeric G-quadruplexes could either cap the ends of the telomere or where telomeric and oncogenic promoter G-quadruplexes extrude out of a duplex segment. Studies of G-quadruplex–duplex junctions and their stability represent a significant structural challenge that will require a systematic investigation involving variations in the length and sequence of junctional residues.
Electron microscopy studies demonstrate that the telomeric G-overhang segment of chromosomal termini may adopt a non-linear configuration through formation of a lariat-shaped t-loop structure, where the overhang segment invades an adjacent duplex region through formation of a displacement D-loop (Figure 4c) (97). It should be noted that such a t-loop architecture has the potential for sequestering and protecting the 3′-termini of telomere overhangs. Structural studies have the opportunity to discriminate between proposed alternate models of t-loop formation.
Both the VEGF and HIF-1α promoter sequences (see earlier VEGF and HIF-1α sequences section) contain five guanine-tracts and hence may adopt more than one conformation in solution. Perhaps, through judicious choice of single inosine for guanine substitutions, high-quality NMR spectra corresponding to a single G-quadruplex conformation may be achievable for each of these oncogenic promoter sequences, as was successfully achieved previously for the five guanine-repeat c-myc-23456 G-quadruplex (129).
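The reason five guanine-tracts can give rise to more than one conformation can be illustrated with simple bookkeeping: if a G-quadruplex core uses four tracts, there are several ways to choose them. The short Python sketch below enumerates these choices. The tract labels 1 to 5 and the assumption that exactly four tracts participate are illustrative only; the sketch says nothing about which of these folds are actually populated in solution.

```python
from itertools import combinations

# Label the five guanine-tracts 1..5 in 5'-to-3' order (labeling assumed here).
tracts = [1, 2, 3, 4, 5]

# Every way of choosing four tracts while preserving their 5'-to-3' order.
four_tract_sets = list(combinations(tracts, 4))

for chosen in four_tract_sets:
    # The single tract not used by this candidate core.
    skipped = (set(tracts) - set(chosen)).pop()
    label = "".join(str(t) for t in chosen)
    print(f"tracts {label}: tract {skipped} left out")
```

Five candidate four-tract combinations result, including 1234 and 2345, the combinations that appear in the names VEGF-1234 and HIF-1α-2345.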
It should be noted that there are two examples of single bases separating G-tracts in both the VEGF and HIF-1α promoters, and these are potential sites for double-chain-reversal loops (7,16,126). Indeed, there are some similarities in the sequence between VEGF-1234 and HIF-1α–2345 in that they have an approximate consensus element
To date, no structures have been reported for oncogenic 5′-UTR RNA sequences. Potential candidates that contain four guanine-tracts and impact on oncogenic events (196) include the 5′-UTR of N-ras (see earlier Oncogenic 5′-UTR regions section for sequence), as well as the 5′-UTR of Friend leukemia integration protein 1 (fli1), which exhibits the sequence
the 5′-UTR of apoptosis regulator (bcl-2), which exhibits the sequence
and the 5′-UTR of transcription factor AP1 (jun), which exhibits the sequence
Each of these sequences contains single-residue linkers with the potential for formation of double-chain-reversal loops. One can anticipate that several unanticipated topologies are likely to emerge for RNA quadruplexes, as reported previously for DNA quadruplexes (35). Furthermore, unlike oncogenic promoter DNA quadruplexes whose formation requires prior melting of the duplex segment, no such constraint exists for the primarily unstructured 5′-UTR RNA sequences.
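Locating guanine-tracts and single-residue linkers in a sequence is straightforward to automate. The following Python sketch is one minimal way to do it, under two assumptions that are not taken from the text: a G-tract is defined as a run of at least three consecutive guanines, and the function names are hypothetical. The second test string is a made-up toy sequence, not one of the 5′-UTR sequences discussed above.

```python
import re

def g_tracts_and_linkers(seq, min_run=3):
    """Return (tracts, linkers) for runs of at least min_run consecutive G.

    Linkers are the residues lying between successive G-tracts.
    """
    seq = seq.upper()
    matches = list(re.finditer(r"G{%d,}" % min_run, seq))
    tracts = [m.group() for m in matches]
    linkers = [seq[a.end():b.start()] for a, b in zip(matches, matches[1:])]
    return tracts, linkers

def single_residue_linkers(seq, min_run=3):
    """Return indices of linkers that are exactly one residue long."""
    _, linkers = g_tracts_and_linkers(seq, min_run)
    return [i for i, linker in enumerate(linkers) if len(linker) == 1]

# Four-repeat human telomere d[G3(T2AG3)3]: all three linkers are TTA.
telomere = "GGG" + "TTAGGG" * 3
telomere_tracts, telomere_linkers = g_tracts_and_linkers(telomere)

# Toy sequence (illustrative only) with single-residue first and third linkers.
toy = "GGGAGGGCGCGGGUGGG"
toy_tracts, toy_linkers = g_tracts_and_linkers(toy)
toy_single = single_residue_linkers(toy)
```

A single-residue linker flagged this way is only a candidate site for a double-chain-reversal loop; whether such a loop forms has to be established experimentally.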
The ultimate challenge would be to develop methods for probing structures and conformational transitions involving G-quadruplexes in living cells. Given the rich diversity of G-quadruplex scaffolds and their propensity to interconvert, it will be a challenge to identify small molecules that exhibit recognition selectivity for distinct scaffolds at the cellular level. Clearly, one anticipates further developments, for instance of fluorescent dyes, along the lines of the carbazole BMVC (65), to further address this problem.
The ribonucleoprotein enzyme telomerase, composed of an endogenous RNA template and a reverse transcriptase, can maintain telomere length by adding TTAGGG repeats to the 3′-ends of chromosomes (79,208). Telomerase is active in the majority of human tumor cells and requires single-stranded telomere ends as a primer for its activity (84,209). In this regard, telomerase levels correlate with cancer progression and the metastatic state. Telomerase activity can be negatively regulated in vivo through monovalent cation-mediated G-quadruplex formation at telomeric ends (210), and hence small molecules that bind and stabilize G-quadruplex structures constitute potent telomerase inhibitors. This concept was first validated when it was demonstrated that 2,6-diamidodianthraquinone inhibits the activity of telomerase by interacting with and stabilizing G-quadruplex structures (211). Some ligands could also act as molecular chaperones by increasing the association constant for G-quadruplex formation (212). G-quadruplex formation denies access of telomerase and telomeric DNA-binding proteins to telomere overhangs, thereby selectively interfering with telomere maintenance in tumor cells (87,88,213–218). Overall, telomerase inhibition leads to telomere-length reduction, tumor-cell senescence and ultimately apoptosis.
Diverse families of compounds have been identified that exhibit selectivity for G-quadruplexes over their duplex counterparts and inhibit telomerase action in human tumor cell lines with IC50 values in the sub-μM range. Many of these compounds contain polyaromatic heterocyclic ring systems, including anthraquinones, acridines, perylenes and porphyrins (215,216,219,220), capable of extensive π-stacking interactions with terminal G-tetrads (221,222), and in some cases containing at least two side chains directed towards the G-quadruplex grooves. Some insights into the principles of ligand-G-quadruplex recognition have emerged from the few published NMR and X-ray structures of complexes.
NMR studies of complexes formed between the single-repeat human telomere sequence d(TTAGGGT) and dicationic perylene tetracarboxylic diimide (223) and fluorinated pentacyclic quino[4,3,2-kl]acridinium cation (224) ligands, establish that these polycyclic ring systems interact with the all-parallel tetramolecular G-quadruplex through end-stacking over terminal G-tetrads. To date, despite occasional claims, there is no definitive spectroscopic and structural evidence that supports intercalation of polycyclic ring system-containing chromophores between G-tetrads of G-quadruplexes.
An X-ray structure has been reported for the antitumor drug daunomycin (Figure 13a) bound to the all-parallel-stranded tetramolecular G-quadruplex formed by d(TGGGGT) (Figure 14a) (222). Daunomycin aligns in a trimeric arrangement, with its anthracycline chromophores optimally end-stacked on the terminal G-tetrad (Figure 14b). In addition, the daunosamine sugar rings are positioned in the grooves and anchored through intermolecular hydrogen bonds.
One of the best-characterized telomerase inhibitors belongs to the family of 3,6,9-trisubstituted acridine molecules. The first-generation compound, BRACO-19, induces cell growth arrest and chromosomal end-to-end fusions (225) and shows antitumor activity in tumor xenografts (226), all within a short exposure time. BRACO-19 also induces G-quadruplex formation by competing with hPOT1 for binding to single-stranded telomeric overhangs. Such uncapping of telomerase from telomere ends induces a rapid DNA damage response and selective cell death. More recently, the key anilino substituent in BRACO-19 has been replaced by a benzylamino substituent (Figure 13b), resulting in enhanced quadruplex interaction and superior telomerase inhibitor activity (227).
The X-ray structure of a disubstituted aminoalkylamido acridine bound to the two-repeat Oxytricha d(GGGGTTTTGGGG) sequence establishes that the acridine ring system end-stacks with the terminal G-tetrad of the bimolecular G-quadruplex (Figure 14c) (221). The acridine ring threads through the diagonal loop of the bimolecular G-quadruplex, with stacking and intermolecular hydrogen-bonding interactions stabilizing complex formation (Figure 14d).
Malignant glioblastomas are very aggressive and invasive tumors of the central nervous system that are highly refractory to surgery, radiotherapy and chemotherapy. A family of bisquinolinium-substituted 2,6-pyridine-dicarboxamide derivatives was shown to inhibit cell proliferation at low doses and induce massive apoptosis in cultures of glioma cell lines (228). The apoptosis was preceded by multiple cell cycle alterations associated with telomere end fusion and anaphase bridge formation, suggesting that these pyridine-based G-quadruplex ligands could serve as promising agents against malignant gliomas. Furthermore, recent improvements within this family of G-quadruplex-binding ligands have resulted from replacement of the central pyridine-based core by a phenanthroline core (229). These bisquinolinium-substituted phenanthroline compounds (Figure 13c), which also adopt a planar crescent-shaped alignment due to an internally organized hydrogen-bonded syn–syn conformation, exhibit high affinity and excellent selectivity for the four-repeat human telomere G-quadruplex. The current model of complex formation involves stacking of the planar ligands on terminal G-tetrads of G-quadruplexes, given that the crescent-shaped ligand exhibits excellent geometric complementarity with the dimensions of the G-tetrad.
Porphyrins have been used successfully as ligands for targeting G-quadruplexes (214,230,231). The most extensively studied cationic porphyrin has been 5,10,15,20-tetrakis-(N-methyl-4-pyridyl)porphyrin (TMPyP4) (Figure 13d), which induces telomerase inhibition upon targeting telomeric G-quadruplexes (232) and down-regulates the expression of the c-myc oncogene (111).
NMR-based approaches have been used to investigate the complex of TMPyP4 with the five guanine-tract-containing c-myc-23456 G-quadruplex (129). Large upfield shifts are observed for a subset of imino proton resonances on complex formation, with slow exchange between free and bound forms. Exchange cross peaks observed in the NOESY spectrum of a sample containing equal amounts of free and bound forms allowed assignment of the imino protons of the complex based on the known assignments in the free form. The TMPyP4 porphyrin ring stacks towards one end of the G-quadruplex in the solution structure of the complex (129).
Recently, an X-ray structure has been solved of TMPyP4 bound to the all-parallel-stranded bimolecular G-quadruplex formed by the two-repeat human telomere d(TAGGGTTAGGG) sequence in K+ solution (Figure 14e) (233). Somewhat unexpectedly, the porphyrin rings do not stack on the terminal G-tetrads, but rather stack on the TTA nucleotides, both on base pairs formed at the 5′-ends of the G-quadruplex (Figure 14f) and on the double-chain-reversal or propeller loops that span the grooves of the structure. In addition, the propeller loops undergo a conformational transition on complex formation.
A limitation of first-generation cationic porphyrins such as TMPyP4 is that they exhibit poor selectivity between G-quadruplex and duplex DNA. To overcome this limitation, studies were extended to a Mn(III)-coordinated porphyrin containing a central aromatic core and four relatively flexible arms carrying cationic end groups (Figure 13e). Binding studies of this Mn(III) porphyrin with four-repeat human telomere DNA established that it discriminates in favor of the human telomere G-quadruplex over duplex DNA by four orders of magnitude (234). Furthermore, telomerase inhibition occurred with IC50 = 580 nM. A working model has been put forward for this remarkable selectivity, where the porphyrin is proposed to stack on terminal tetrads and the flexible cationic arms are likely to be positioned in the grooves. Since the substituted Mn(III) porphyrin has two axial ligands, one of these would have to be replaced by a Mn-bound water molecule that in turn could potentially insert into the central channel in the complex (234).
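To put a selectivity of four orders of magnitude into energetic terms, the ratio can be converted into a difference in binding free energy. The Python sketch below does this under assumptions not stated in the text: that the selectivity corresponds to a ratio of equilibrium binding constants, that the temperature is 25 °C, and that the relation ΔΔG = RT ln(ratio) applies. The function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def selectivity_to_ddg(fold_selectivity, temperature_k=298.15):
    """Free-energy difference (kcal/mol) equivalent to a ratio of binding constants."""
    return R_KCAL * temperature_k * math.log(fold_selectivity)

# Four orders of magnitude, taken here as a ratio of binding constants (assumption).
ddg = selectivity_to_ddg(1e4)
print(f"ddG = {ddg:.1f} kcal/mol")
```

Under these assumptions, the selectivity corresponds to roughly 5.5 kcal/mol of additional binding free energy for the quadruplex over the duplex.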
One of the most promising telomerase inhibitor candidates is telomestatin (Figure 13f), a macrocyclic torand natural product isolated from Streptomyces anulatus, consisting of seven oxazole rings and one thiazole ring that targets G-quadruplexes with high specificity (IC50 = 5 nM) (235), causing growth arrest, apoptosis and telomere dysfunction (236). Furthermore, telomestatin shows selectivity for cancer cell lines over normal cells, activates key components of the DNA damage-response pathway and sensitizes tumor cells to chemotherapeutic agents (237). Telomestatin may also exhibit selectivity, since it has been suggested that telomestatin and related macrocycle Se2SAP bind preferentially to different folds of the human telomere G-quadruplex (238). These results suggest that the equilibrium between conformational states of the human telomere G-quadruplex can potentially be shifted on complex formation with specific macrocyclic ligands.
Oxazole-based peptide macrocycles represent a new class of chemically synthesized G-quadruplex-binding ligands (239,240) designed as analogs of telomestatin. One of these, an oxazole-containing 24-membered macrocycle consisting of a hexazole designated HXDV (Figure 13g), inhibits the growth of human lymphoblastoma cells with an IC50 of 0.4 μM (240). HXDV binds to and thermally stabilizes the structure of the four-repeat human telomere in K+ solution, but does not bind to duplex or triplex DNA (241). The binding stoichiometry is two HXDV molecules per G-quadruplex, presumably consistent with stacking of the cyclic hexazoles on the terminal G-tetrads at either end of the G-quadruplex. Thermodynamic and mobility studies demonstrate that the binding of HXDV is entropically driven, with the entropic driving force reflecting contributions from favorable drug-induced alteration in the configurational entropy of the DNA (241). A challenge in all these studies is linking quadruplex binding to biological effects at the cellular level, as has been documented to date for BRACO-19 and telomestatin.
Many of the drug–G-quadruplex complexes solved to date emphasize recognition principles highlighting the contributions of intermolecular stacking, hydrogen-bonding and hydrophobic interactions at the expense of shape-complementarity to recognition. Nevertheless, shape-selective recognition represents a promising area for future growth, with some very elegant demonstrations attesting to its potential for G-quadruplex recognition.
The non-planar and non-aromatic steroid diamines have long been of interest as potential nucleic acid-binding ligands since they have been postulated to bind to and stabilize kink sites in DNA (242). Experimental support for this hypothesis emerged following NMR demonstration of partial insertion of the steroid diamine, dipyrandium, between unstacked base pairs of poly (dA–dT) (243). Most importantly, a temperature melting fluorescence-based screen of natural and synthetic molecules identified two steroid diamines, malouetine and funtumine, that induce G-quadruplex stabilization (244). Of the two, funtumine substituted by a guanylhydrazone moiety (Figure 13h) is more promising, since it interacted selectively in vitro with the human telomeric G-quadruplex. Funtumine induced senescence and telomere shortening, as well as rapid telomeric G-overhang degradation and anaphase bridge formation, associated with uncapping of telomeric ends. These new results on first-generation steroid diamines hold promise for the future, given that they can be easily synthesized and modifications readily incorporated, in efforts to increase the selectivity and potency for human telomere G-quadruplex targets.
Recently, chiral cyclic-helicene molecules have been shown to exhibit chiral and selective binding to higher order structures by wedging between two adjacent four-repeat intramolecular human telomere G-quadruplexes connected by a TTA linker (207). A left-handed chiral cyclic-helicene with a short linker (Figure 13i) appears to be sandwiched within a chiral cleft formed by two human telomere G-quadruplexes stacked 3′-to-5′ with a connecting TTA loop in d[AGGG(TTAGGG)7], as monitored by CD and fluorescence (helicenes are strongly fluorescent) studies.
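The shorthand d[AGGG(TTAGGG)7] can be expanded to check that it holds enough guanine-tracts for two four-repeat G-quadruplexes joined by a TTA linker. The Python sketch below is bookkeeping only. It assumes the simplest partition, in which the first four GGG tracts form one quadruplex and the last four form the other; the variable names are illustrative.

```python
import re

# Expand the shorthand d[AGGG(TTAGGG)7] into an explicit 5'-to-3' string.
seq = "AGGG" + "TTAGGG" * 7

# Start and end positions of each GGG tract.
tracts = [(m.start(), m.end()) for m in re.finditer(r"G{3}", seq)]

# Assumed partition: tracts 1-4 form one block, tracts 5-8 the other.
block1 = seq[tracts[0][0]:tracts[3][1]]
block2 = seq[tracts[4][0]:tracts[7][1]]

# Residues between the two blocks.
junction = seq[tracts[3][1]:tracts[4][0]]

print(len(tracts), "G-tracts; junction:", junction)
```

With this partition, each block has the sequence of the four-repeat d[G3(T2AG3)3] segment, and the residues between the blocks are TTA.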
The current literature on drugs targeted to G-quadruplexes is primarily restricted to planar aromatic chromophores involved in end-stacking on terminal G-tetrads in G-quadruplexes (215,216). There remain many structural challenges associated with G-quadruplex recognition, as well as their impact on function. Several of these are outlined below.
The Watson–Crick and major groove edges of guanines are involved in hydrogen bonding within the G-tetrad alignment (Figure 1a), leaving the minor groove edge available for further recognition. Indeed, it has been shown that adenines can pair with G-tetrads to form A–(G–G–G–G) pentads (Figure 11c) (18) and A–(G–G–G–G)–A hexads (Figure 11d) (16), as a result of non-canonical G–A pair formation. It is thus conceivable that successive G-tetrad base edges can be targeted by (A)n-containing sequence segments. An alternate strategy has been to prepare conjugates containing quadruplex-stabilizing acridines linked to oligonucleotides that are complementary to the human telomere sequence (245).
The four grooves can adopt distinct dimensions based on the strand directionalities around the G-quadruplex, thereby offering the possibility of discriminating between distinct quadruplex types. The grooves are accessible for edge-wise and diagonal loops but occluded for double-chain-reversal loops. At this time, the current understanding of quadruplex groove-specific recognition is restricted to the interaction between the daunosamine sugar ring and grooves in the daunomycin–G-quadruplex complex (222). Other promising developments include identification of a group of structurally related compounds that selectively target quadruplex grooves as monitored by circular dichroism binding measurements (246).
The loops linking G-rich strands in G-quadruplexes range from edge-wise to diagonal and double-chain-reversal types. These loops can also vary in length and sequence (21,22,247,248) and can adopt distinct conformations stabilized by non-canonical pairs, triples, triads and mixed tetrads. Thus, loops projecting from G-quadruplexes serve as promising distinct targets, as yet unexploited by drug design approaches.
To date, there has been less emphasis on the contribution of shape-complementarity between ligand and G-quadruplex to molecular recognition. It has been demonstrated that shape-specific complementarity between ligand and three-helical junction targets is key to both ribozyme-based catalysis (249) and metallosupramolecular helicate recognition (250). We anticipate that this will represent a challenging area for future investigation.
A library-based approach is eventually needed for unbiased and selectivity-driven identification of ligands that target unique G-quadruplex topologies and discriminate against closely related counterparts. To this end, the earliest combinatorial selection approaches generated carbocyanine–peptide conjugate libraries (251). Furthermore, ribosomal display has yielded antibody fragment libraries (62) that exhibit specificity for different G-quadruplex fold families. Recently, click chemistry has been used to generate bistriazole ligands as pharmacophores capable of π-stacking interactions with G-tetrads (252).
Library-based approaches can also be applied to identify proteins that target G-quadruplexes in a sequence and structure-specific manner. To this end, selection approaches have been used to engineer tandem zinc finger proteins that bind G-quadruplex scaffolds and effectively inhibit the activity of telomerase (253,254).
Further development of such library-based approaches should provide opportunities for modulating processes from DNA recombination to maintenance of telomere length and integrity.
Despite over a decade of structural research on G-quadruplexes, there is still no structure for a protein–G-quadruplex complex. This is unfortunate since there is an extensive literature on proteins that bind G-quadruplexes (44–46), including proteins that either facilitate or non-catalytically disrupt G-quadruplex formation, as well as helicases that catalytically unwind G-quadruplexes in an ATP-dependent manner and nucleases that cleave at or adjacent to G-quadruplex scaffolds. Many of these proteins bind specific G-quadruplex scaffolds. Therefore, there is a pressing need to devote efforts to structurally characterizing complexes of proteins that target DNA and/or RNA families of G-quadruplexes and G-quadruplex–duplex junctions.
G-quadruplex forming sequences have also been identified in alternatively spliced pre-mRNA sequences (71). One of the most important challenges in the future centers on determination of the structures of RNA G-quadruplex scaffolds and RNA G-quadruplex–duplex junctions adopted by alternatively spliced pre-mRNA sequences on complex formation with bound proteins.
A recent paper monitored the conformational transition of the four-repeat human telomere sequence d[G3(T2AG3)3] in 150 mM K+ solution on addition of PEG 200, a mediator of molecular crowding conditions (255). The CD spectrum changed from one typical of a (3+1) G-quadruplex to one typical of an all parallel-stranded G-quadruplex at 40% (w/v) of added PEG 200. The human telomere G-quadruplex in K+ solution containing 40% PEG 200 exhibited unusual stability and negatively impacted on polymerase processivity. These data provide strong support for formation of an all parallel-stranded G-quadruplex for the four-repeat human telomere in K+ solution under molecular crowding conditions (255).
Research in the Patel laboratory on the structure and recognition of G-quadruplexes is funded by NIH grant GM034504-22. The earlier contributions of Serge Bouaziz, Natalya Chernichenko, Andrey Gorin, Abdelali Kettani, Kim Ngoc Luu, Ananya Majumdar, Eugene Skripkin, Yong Wang and Na Zhang, former members of the Patel laboratory, are gratefully acknowledged. Funding to pay the Open Access publication charges for this article was provided by GM034504-22.
Conflict of interest statement. None declared.