The ends of eukaryotic chromosomes are called telomeres. This article provides a short history of telomere and telomerase research starting with the pioneering work of Muller and McClintock through the molecular era of telomere biology. These studies culminated in the 2009 Nobel Prize in Medicine. Critical findings that moved the field forward and that suggest directions for future research are emphasized.
The ends of eukaryotic chromosomes are called telomeres. They were discovered in the late 1930s, first in flies by Hermann Muller and then in corn by Barbara McClintock (reviewed in ). These investigators inferred from the behavior of broken chromosomes that the natural ends of chromosomes were distinguishable from induced DNA breaks and must therefore be a special structure, dubbed the telomere. Muller surmised that this structure is essential for chromosome stability as he was unable to generate a chromosome lacking one after pulverizing chromosomes with X-rays. McClintock deduced that one of the essential functions of telomeres was to protect chromosome ends from fusing both with each other and with induced double strand breaks.
The Muller and McClintock findings were made before DNA was known to be the genetic material. The Watson-Crick double helical structure of DNA, reported in 1953, immediately suggested a mechanism for its replication. That is, each strand in the duplex acts as the template to guide synthesis of its complement. Thus, DNA is duplicated by “semi-conservative” replication and each daughter helix contains both a parental and a newly synthesized “daughter” strand. The end result is that the two newly made “daughter” chromosomes are expected to be exact copies of the parent chromosome, both in terms of nucleotide sequence and length.
The isolation and characterization of DNA polymerases made it clear that a conventional replication mechanism would not suffice for the ends of linear DNA molecules. DNA polymerases synthesize DNA only in the 5’ to 3’ direction and cannot start replication de novo. Chromosomal replication in bacteria and eukaryotes typically is primed by a short stretch of RNA that is then elongated by DNA polymerase. Since retention of RNA primers within DNA would compromise genome integrity, these RNA primers are removed during the process of replication. The small gaps generated by removal of internal RNA primers can be repaired easily by DNA polymerase, after which DNA ligase seals the resulting nicks. However, the gap at the 5’ ends of newly replicated strands cannot be filled in by this process because there is no upstream 3’ end for DNA polymerase to extend. Thus, without a special replication mechanism, the 5’ ends of newly replicated strands would shorten with each round of DNA replication. One key function of telomeres is therefore to provide a substrate that supports an unconventional mechanism of replication that solves this “end replication problem”.
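The shortening described above can be illustrated with a toy calculation. The following Python sketch is only a bookkeeping model, not a description of any experiment in the work reviewed here: it assumes that every newly made strand is a full copy of its template except for a terminal RNA primer that is removed from its 5’ end and never replaced. The names `Duplex`, `replicate` and `shortest_strand_after`, the 1000 nt starting length and the 10 nt primer are all hypothetical choices made for illustration.

```python
from typing import NamedTuple, Tuple

Span = Tuple[int, int]  # (start, end) coordinates covered by one strand


class Duplex(NamedTuple):
    top: Span     # runs 5'->3' from left to right
    bottom: Span  # runs 5'->3' from right to left


def replicate(parent: Duplex, primer_len: int) -> Tuple[Duplex, Duplex]:
    """One round of semi-conservative replication without telomerase.

    Each daughter keeps one parental strand. The new strand copies its
    template completely except for the terminal primer at its own 5' end.
    """
    # New bottom strand copied from the parental top strand; its 5' end is
    # on the right, so the right-hand primer_len positions are lost.
    new_bottom = (parent.top[0], parent.top[1] - primer_len)
    # New top strand copied from the parental bottom strand; its 5' end is
    # on the left, so the left-hand primer_len positions are lost.
    new_top = (parent.bottom[0] + primer_len, parent.bottom[1])
    return Duplex(parent.top, new_bottom), Duplex(new_top, parent.bottom)


def shortest_strand_after(length: int, primer_len: int, rounds: int) -> int:
    """Length of the shortest strand in the population after some rounds.

    The population doubles every round, so keep `rounds` small.
    """
    population = [Duplex((0, length), (0, length))]
    for _ in range(rounds):
        population = [d for p in population for d in replicate(p, primer_len)]
    return min(end - start for d in population for (start, end) in d)
```

In this model each daughter molecule carries a 3’ overhang at the end where the primer was removed, and the shortest strand in the population loses one primer length per round, which is the sense in which unassisted replication erodes chromosome ends.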
The molecular era of telomere biology began when newly developed DNA sequencing methods were applied to the natural (i.e., uncloned) ends of sub-chromosomal DNA molecules from ciliate macronuclei. Ciliates are unusual in that each cell has two types of nuclei. The micronucleus has conventional chromosomes but is transcriptionally inert, participating only in meiosis. After mating and meiosis, the transcriptionally active macronucleus is derived from the micronucleus by fragmentation of the chromosomes followed by deletion, amplification and telomere addition, which generates a polyploid bag of acentric DNA fragments. As a result, ciliate macronuclei have an amazingly high concentration of DNA termini and their associated proteins (reviewed in ).
Elizabeth Blackburn, then a post-doctoral fellow in Joseph Gall’s lab, sequenced the native ends of Tetrahymena macronuclear ribosomal DNA molecules and found that they consist of a variable number of non-protein coding 5’-T2G4-3’ repeats, about 50 repeats per end, although the exact number varies from molecule to molecule . Soon thereafter David Prescott’s lab determined the sequence and structure of DNA ends in the macronuclei of another group of ciliates. Although this second ciliate class is evolutionarily divergent from Tetrahymena, its telomeres also consist of non-coding repeats of a related but non-identical sequence, 5’-T4G4-3’ . Moreover, because telomeres in this group of ciliates have the unusual property of having a precise size, it was possible to show that the G-rich strand is longer than the C-rich strand. Thus, both ends of these macronuclear DNA molecules bear 16 nucleotide (nt) T4G4 single-strand “G-tails”.
As data accumulated on telomere sequences and structure in diverse organisms, it became clear that the conclusions reached from research on ciliate telomeres apply to most eukaryotes. That is, in the vast majority of organisms, telomeric DNA consists of simple repeats. The number of telomeric repeats per chromosome end varies enormously, from only 20 bp to several tens of kb depending on the organism. Moreover, even in the same organism, the number of telomeric repeats per end varies from cell to cell. While there are numerous sequences that can provide telomere function, most of these sequences have the common feature that the strand that forms the 3’ end of the chromosome is G-rich and longer than its complement, thus forming a 3’ single-strand tail. As discussed in more detail below, the 3’ single-strand G-tails serve as landing pads for sequence-specific DNA binding proteins that protect ends from degradation and fusions, essential functions of telomeres. Thus, regenerating the structure of telomeres presents another problem for DNA replication as the conventional replication apparatus is expected to generate blunt ends, not G-tails, at ends replicated by the leading strand DNA polymerase. Both the duplex repeats and the single-strand G-tails, along with the proteins that bind them, are necessary and sufficient for telomere function. Thus, regenerating the G-tail, as well as maintaining the duplex repeats, is essential.
The fact that telomeres from the same organism are often of different lengths provided an early hint that telomeric DNA is not always templated by the parent chromosome. This idea was strengthened by experiments studying the behavior of native termini from ciliates after their introduction into the budding yeast S. cerevisiae. When ciliate (T2G4)n or (T4G4)n telomeric ends are ligated to both ends of a linear vector and introduced by transformation into yeast, the ciliate ends allow the plasmids to be maintained as linear molecules [7, 8]. However, in both cases, these linear plasmids do not end in the ciliate telomeric sequence but rather are lengthened by the addition of yeast 5’-TG1–3-3’ telomeric DNA. Clearly the yeast 5’-TG1–3-3’ repeats are not templated by the ciliate telomeres to which they are added.
These and similarly provocative results led Elizabeth Blackburn and her graduate student Carol Greider to search for an activity that can elongate telomeric DNA in the absence of a DNA template. Wisely they started with protein extracts from post-mating Tetrahymena cells, which undergo massive telomere formation during the development of the macronucleus and are thus a rich source of telomere replication proteins . These efforts were rewarded by the identification of what is now called telomerase, a telomere-dedicated reverse transcriptase that uses an integral RNA subunit to template the addition of G-rich telomeric repeats to 3’ single-strand ends [10, 11]. These telomerase RNAs can be short, ~160 nts in ciliates , or long, (>1000 nts) in yeasts , but they always contain a short stretch that is complementary in sequence to the G-rich strand of telomeric DNA.
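The idea of an RNA-templated addition of G-rich repeats can be made concrete with a short Python sketch. It is a deliberate simplification and not the enzyme's actual mechanism: the template is assumed to be exactly one repeat long (here the hypothetical string `CCCCAA`, chosen so that its complement is the Tetrahymena T2G4 repeat), whereas the templating region of a real telomerase RNA is somewhat longer than one repeat so that the DNA end can realign on it. The function names `templated_repeat` and `extend_g_tail` are likewise invented for illustration.

```python
RNA_TO_DNA = {"A": "T", "C": "G", "G": "C", "U": "A"}


def templated_repeat(template_rna: str) -> str:
    """DNA repeat (written 5'->3') specified by an RNA template (5'->3')."""
    # Reverse transcription reads the template 3'->5', so reverse it first.
    return "".join(RNA_TO_DNA[base] for base in reversed(template_rna))


def extend_g_tail(g_strand: str, template_rna: str = "CCCCAA",
                  n_nt: int = 6) -> str:
    """Add n_nt nucleotides to the 3' end of a G-rich strand (5'->3').

    The 3' end is first aligned to the repeat so that the added
    nucleotides continue the existing repeat register; if the end does
    not match the repeat at all, addition starts a fresh repeat.
    """
    repeat = templated_repeat(template_rna)
    phase = 0
    for k in range(len(repeat), 0, -1):
        if g_strand.endswith(repeat[:k]):
            phase = k % len(repeat)
            break
    added = "".join(repeat[(phase + i) % len(repeat)] for i in range(n_nt))
    return g_strand + added
```

For example, `extend_g_tail("TTGGGGTTG")` continues the partial repeat at the 3’ end, while a strand ending in an unrelated sequence simply acquires new TTGGGG repeats. The latter case is loosely analogous to the addition of yeast repeats to ciliate ends: the sequence added depends on the RNA template, not on the DNA to which it is added.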
Over the years, it became clear that most eukaryotes use a telomerase-based mechanism to replicate the very ends of chromosomes. This activity solves the first end replication problem because telomerase can elongate the 3’ single-strand G-tails in the absence of a DNA template. After this lengthening, conventional RNA-primed DNA replication can fill in the C-strand. In organisms like Tetrahymena, telomerase is highly processive such that many telomeric repeats can be added in a single round of replication. Thus, telomerase need not act on a given telomere in every cell cycle. In fact, telomerase is highly regulated. In both mammals and yeast, telomerase preferentially lengthens the shortest telomeres in the cell. In some organisms, including humans, telomerase is not expressed in most normal somatic cells (stem cells being a notable exception), resulting in telomere shortening with every cell division. In contrast, the vast majority of human cancers and immortalized cells in culture have robust telomerase activity.
The second end replication problem, regenerating G-tails at both ends of newly replicated chromosomes, is achieved by cell cycle regulated C-strand degradation [19, 20]. Although this event has been studied in most detail in S. cerevisiae, it is likely a universal step in telomere metabolism . In S. cerevisiae, C-strand degradation is tightly coupled to semi-conservative replication of telomeric DNA . Although both occur late in S phase, only replicated molecules acquire G-tails. G-tails are also produced by telomerase, but the cell cycle regulated appearance of G-tails occurs even in telomerase deficient cells . Remarkably the enzyme activities that process double strand breaks to generate the 3’ single strand tails that initiate homologous recombination are also involved in telomeric C-strand degradation . However, C-strand degradation is prevented from moving past telomeric repeats into single-copy sequences by telomere binding proteins that somehow limit this degradation .
Even though telomeres and double strand breaks are processed in a remarkably similar manner, cells clearly distinguish telomeres from double strand breaks. For example, in S. cerevisiae, a single double strand break is sufficient to trigger a full DNA damage checkpoint-mediated cell cycle arrest, yet telomeres are not perceived as DNA damage until they become critically short [26–28]. The ability of telomeres to shield chromosome ends from checkpoints, nucleases, and fusions is collectively known as the capping function of telomeres. This capping activity requires both G-tails and duplex telomeric DNA as well as the proteins that bind these structures. The core of this capping activity resides in sequence-specific DNA binding proteins. The prototype of G-strand binding proteins, now called Pot1, was first isolated, not surprisingly, in ciliates and later found in diverse organisms from fission yeast to humans. In vitro, the heterodimeric ciliate G-tail binding complex is sufficient to protect otherwise naked telomeric DNA from nucleolytic degradation, a key part of the capping function. Pot1 is also essential for capping in vivo in organisms with conventional chromosomes [30, 31].
A heterotrimeric complex called CST (Cdc13-Stn1-Ten1), which is functionally similar to Pot1, was first discovered in S. cerevisiae and more recently in fission yeast, vertebrates and plants (reviewed in ). Unlike S. cerevisiae, which contains only CST, these other organisms have both Pot1 and CST. CST is essential for end protection not only in S. cerevisiae but also in organisms that have Pot1 as well. However, in organisms with both, it is not clear how the various capping functions are divided between the two. One hypothesis is that CST differs from Pot1 in being a telomere-specific version of RPA (replication protein A), the sequence non-specific single-strand DNA binding complex that is essential for chromosomal DNA replication, recombination, and repair, suggesting that CST has more general roles in telomere metabolism.
Although Pot1 and Cdc13, the DNA binding subunit of CST, are not related in primary sequence, both contact DNA via oligonucleotide/oligosaccharide-binding folds (OB-folds). Likewise, the telomere-specific DNA duplex binding proteins, such as Rap1 in S. cerevisiae and TRF1/2 in vertebrates, can have divergent amino acid sequences yet contact DNA via conserved Myb binding motifs [34, 35]. Like G-strand binding proteins, duplex telomere binding proteins, such as mammalian TRF2 and S. cerevisiae Rap1, are also critical for end protection. In addition to sequence-specific telomere binding proteins, there are multiple proteins that come to the telomere via protein-protein interactions. The six-member telomere complex that protects vertebrate and fission yeast telomeres has been dubbed shelterin. In the shelterin complex, several bridging proteins connect the G-strand binding complex to duplex binding proteins.
Most studies on telomere replication focus on the “end replication problem” and its solution by telomerase. However, all but the very tip of the chromosome is replicated via standard semi-conservative DNA replication. Unexpectedly, the replisome has a harder time moving through telomeric DNA than through most other genomic regions. In wild type S. cerevisiae, replication forks slow as they move through telomeric DNA, even if the telomeric repeats are at a non-telomeric location . This slowing is highly exacerbated in the absence of the Rrm3 DNA helicase, which has a general role in promoting fork progression through stable protein-DNA complexes . Telomere binding proteins also have roles in regulating replication fork progression. In both S. pombe  and mammals , fork movement through telomeres is impeded when their respective duplex telomere binding proteins (Taz1, S. pombe; TRF1, mammals) are depleted. These replication problems are proposed to result from the high GC content of telomeric DNA, which increases its thermal stability and allows formation of stable secondary structures, such as G-quadruplexes. Whether or not this model is correct, clearly semi-conservative replication of telomeres is challenging.
Another function of telomeres is to regulate transcription, a phenomenon called telomere position effect (TPE). Genes residing normally at telomeres or positioned there by molecular manipulation are transcribed at reduced levels (reviewed in ). Consistent with this finding, subtelomeric chromatin has many of the histone modifications typical of heterochromatin. This epigenetic transcriptional repression is semi-stable, with genes able to switch from a repressed to a transcribed state. In S. cerevisiae, TPE regulates genes involved in stress or needed for growth on alternative carbon sources. In certain human parasites, such as Trypanosoma brucei and Plasmodium falciparum, which cause, respectively, sleeping sickness and malaria, TPE regulates the expression of surface antigen genes, thereby helping these parasites avoid elimination by the immune system. Although TPE is detected in human cultured cells, its role in gene regulation during developmentally programmed telomere shortening has not yet been demonstrated. Given the widespread occurrence of TPE, the recent discovery that telomeric DNA is normally transcribed, at least in yeasts and mammals, came as a surprise (reviewed in ). These telomeric transcripts, called TERRA (TElomeric Repeat containing RNA), are transcribed from the C-rich strand and hence are G-rich in sequence. They are often associated with telomeric chromatin and are thought to inhibit telomerase in cis.
Telomere structure and function is a fascinating area of study from the standpoint of basic science. However, it has become increasingly clear that telomere function and telomere replication are intimately linked with aging and cancer in human cells. Even in human stem cells that express telomerase, telomeres shorten progressively with age. The inherited forms of certain diseases, such as dyskeratosis congenita and idiopathic pulmonary fibrosis, which can result in early death due to stem cell failure, are now known to be caused by mutations in telomerase subunits or in proteins that bind telomeres or telomerase RNA (reviewed in ). Both the absence and the presence of telomerase affect tumorigenesis. The telomere shortening that occurs normally in human somatic cells due to lack of telomerase can result in telomere dysfunction if it proceeds beyond a critical point. Especially in checkpoint-deficient cells, this dysfunction and its resulting genome instability can accelerate the accumulation of genetic changes associated with tumor formation. Paradoxically, once established, about 90% of human tumors express telomerase, and this expression contributes to the loss of growth control that is typical of many cancers.
In large part due to the connections between telomerase and human health, the 2009 Nobel Prize in Medicine was awarded to Elizabeth Blackburn, Carol Greider, and Jack Szostak who pioneered the study of telomerase in model organisms. While much has been learned in the past 70 plus years about telomere biology, the future will almost surely provide more unexpected findings and increased connections of telomeres to human health.
I thank the National Institutes of Health for its support of research in my laboratory. I apologize to those colleagues whose work was not cited due to length constraints.