1.1. Aims of This Review
Single-molecule measurements provide unique information on heterogeneous populations of molecules: They give access to the complete distribution of observables (rather than only their first moments), allow discrimination between static and dynamic heterogeneity of their properties, and enable the detection of rare events or of a succession of events that would otherwise be hidden by ensemble averaging and by the impossibility of synchronizing molecules.1–8 Single-molecule methods have now pervaded several disciplines, in particular chemistry, evolving from a stage of proof-of-principle experiments to decisive research and discovery tools. A literature database search with the keyword “single molecule” returned over 5000 references at the time of this writing. This renders the prospect of an exhaustive discussion of current single-molecule work rather daunting. It proves, however, that single-molecule methods have gained the status of established techniques in various scientific fields and continue to propagate to new ones. Therefore, it seemed appropriate for a review of single-molecule methods in chemistry to provide a rapid description of the main technical approaches and focus on a few illustrative examples of their elucidative power. Even such an endeavor would have resulted in a disparate description of research on topics as diverse as quantum electrodynamics, low temperature and room temperature experiments on nanoparticles and organic or biological molecules, micromechanical manipulation, or fluorescence spectroscopy, among many others. Such an accumulation would have been of little interest, once the basic principles underlying each technique had been explained. It seemed, therefore, more appropriate to describe applications of a single set of methods (fluorescence spectroscopy) to biochemical questions and, more specifically, the elucidation of protein structure, dynamics, and function.
Protein structure and function are intimately related, and a large number of single-molecule studies have been performed to elucidate the nature and role of conformational changes in protein or enzyme functions. These questions are best studied when methods have been validated on model systems, and we will delve into some simple examples of such systems to illustrate concepts that are used in more sophisticated and ambitious studies.
Another important aspect of protein science is the mechanism of protein structure formation (and loss thereof), i.e., protein folding and unfolding. Single-molecule methods have begun to yield very interesting results in this domain, and undoubtedly, more will follow. We have thus made this promising field one of the focuses of our review.
The review is organized as follows. We will first define the questions that have been studied so far at the ensemble level and that are now being addressed with single-molecule fluorescence spectroscopy. The next section presents a summary of basic concepts of fluorescence spectroscopy and briefly reviews recent developments in single-molecule analysis, to serve as a glossary for all experimental approaches described throughout the remainder of this review. We then turn to applications of single-molecule fluorescence resonance energy transfer (FRET) to study polypeptide chain collapse in small single-domain proteins under equilibrium conditions. We provide some examples of how to extract dynamic information from single molecules, namely, distance distributions within conformational subpopulations of proteins in the framework of protein folding and in enzymes. These aspects are divided into two parts: studies based on FRET and studies relying on fluorescence quenching. The last part of this review addresses recent studies of protein folding dynamics under nonequilibrium conditions. We conclude with general remarks and an overview of future prospects of these methods.
1.2. Questions in Protein Structure and Function
Proteins are heterobiopolymers that consist of a particular linear sequence of the 20 naturally occurring amino acids spontaneously forming three-dimensional (3D) structures in physiological media. The original conceptions of protein structure and protein function were shaped by the first static pictures revealed by X-ray crystallography.9,10 This view has changed dramatically with (among other evidence) the use of low-temperature flash photolysis11–13 and hydrogen-exchange techniques,14–17 which revealed the existence of conformational substates and considerable dynamic fluctuations in native proteins. Motion within a protein is often necessary to guarantee its biological function and is an important component of binding specificity, as highlighted by the discovery of an increasing number of natively unfolded proteins (proteins that adopt an irregular structure in isolation but undergo a folding transition upon binding of a ligand).18–20
Interest in the otherwise biologically functionless denatured state stems from the fact that it represents the starting point of the protein-folding process; thus, a detailed understanding of both structure and dynamics of this thermodynamic macrostate is essential for a complete description of folding. Minimalist lattice models of proteins and Monte Carlo simulations predict a contraction of the denatured polypeptide chain in good solvent (e.g., high concentrations of denaturant) upon transfer into poor solvents (e.g., aqueous solutions) in cases where the overall attraction between residues dominates.21 Folding in a crowded cellular milieu most likely is initiated from such a collapsed coil state.22,23 On the other hand, high-resolution NMR experiments on denatured proteins show that residual nativelike structure (and even nativelike chain topology) may persist under even the most denaturing solution conditions.20,24–27 Whether the presence of residual structure accelerates folding by facilitating the formation of a folding nucleus or actually slows down folding because of the possibility of forming non-native contacts is still debated. Unfortunately, a direct visualization and structural characterization of the denatured ensemble under more biologically relevant, mildly denaturing (or even native) conditions is difficult with ensemble methods due to the coexistence of the folded state, population averaging, and the low fractional population of the denatured state.
Since the seminal experiments of Anfinsen in the 1960s and 1970s,28 it has become clear that most proteins fold spontaneously into their native structure. Generally, protein folding occurs thermodynamically as a first-order transition,29 although the recent experimental realization of downhill folding (see below) indicates that this does not always need to be the case.30,31 Because only a limited set of model proteins was available at the time, it was believed that each protein possesses a unique folding pathway from the unfolded to the folded protein, involving a discrete number of intermediate structures (the “old view” of protein folding).32–35 With the discovery of so-called two-state folders (proteins that fold without intermediates) more than a decade ago,36–38 this deterministic view of the folding process has changed radically. In this “new view” of protein folding, deterministic folding is replaced by a stochastic search of the many conformations available to a polypeptide,39–41 pointing out the possibility that protein folding can be a highly heterogeneous process. In two-state folding, which seems to be predominant in small single-domain proteins of less than 80 residues in length,38 only the native and denatured conformers, separated by a high-energy transition state, are detectable. Protein engineering experiments,42,43 minimalist lattice simulations,44,45 and analytical theory46 suggest that the formation of the folding transition state is reminiscent of nucleation. There are cases where additional intermediate structures are involved and phase diagrams can be constructed that delineate their existence as a function of external conditions.47–50 In some proteins, these intermediates are populated transiently in kinetic experiments under nonequilibrium conditions,51–54 but it is not always clear whether these intermediates are productive on-pathway intermediates or simply off-pathway traps,55 as suggested by the energy landscape theory.
In the modern statistical mechanical picture of protein folding, folding is described by a “funneled” energy landscape, which puts compact molecules (with small configurational entropy, Sc) near the center of the coordinate system.39–41,56 The funnel picture posits that, on average, the more compact a protein is, the lower its contact energy (Ec) is, because of favorable contacts. The funnel is not perfectly smooth but exhibits roughness due to unavoidable energetic frustration (e.g., steric hindrance, non-native contacts, or functional evolutionary constraints). Low-dimensional free energy surfaces can be obtained from the multidimensional energy landscape by averaging over all but a few reaction coordinates. Further averaging to one global coordinate, say the radius of gyration, yields the familiar one-dimensional (1D) free energy plots. Three different folding scenarios have been predicted using the statistical mechanical funnel picture of folding. Under conditions that stabilize the native state (F) only marginally, two-state or three-state folding (population of transient intermediates) scenarios may happen. With increasing thermodynamic bias toward F (e.g., upon stabilizing mutations), Sc nearly compensates Ec during folding and the folding process becomes downhill (type 0 folding).39
Techniques for the fast initiation of folding include ultra-rapid mixing using continuous-flow57 or laminar-flow devices,58–60 (laser) temperature jump,61,62 pressure-jump relaxation,63 and optical triggering.64,65 These techniques have revealed deviations from simple exponential kinetics in the folding of several small proteins and peptides.66–69 Nonexponential relaxation may point to the involvement of multiple folding pathways or motion on the energy landscape that does not strictly involve a crossing of the folding barrier. A major limitation of such ensemble studies stems from the fact that protein folding is a stochastic process. Even when folding is initiated simultaneously in all molecules, the population rapidly loses synchrony, which can hide rare folding events or scarcely populated folding intermediates.
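The loss of synchrony described above can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: it treats folding as a single-step, two-state stochastic process with one rate constant, and the rate value, molecule count, and function names are hypothetical choices made for illustration, not taken from any of the cited studies.

```python
import math
import random

def simulate_folding_times(n_molecules, rate, seed=0):
    """Draw one random folding time per molecule.

    Assumption: two-state folding with a single rate constant, so that
    individual folding times are exponentially distributed.
    """
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n_molecules)]

def unfolded_fraction(folding_times, t):
    """Ensemble observable: fraction of molecules still unfolded at time t."""
    still_unfolded = sum(1 for ft in folding_times if ft > t)
    return still_unfolded / len(folding_times)

rate = 1.0e3  # hypothetical folding rate in s^-1, chosen for illustration
folding_times = simulate_folding_times(10000, rate)

# All molecules start unfolded at t = 0, yet each one folds at its own
# random instant: the population is desynchronized almost immediately.
earliest = min(folding_times)
latest = max(folding_times)

# The ensemble signal averages over these individual jumps and shows
# only a smooth decay, close to exp(-rate * t).
t_probe = 1.0 / rate
ensemble_signal = unfolded_fraction(folding_times, t_probe)
expected_signal = math.exp(-rate * t_probe)

print(f"earliest fold: {earliest:.2e} s, latest fold: {latest:.2e} s")
print(f"unfolded fraction at t = 1/k: {ensemble_signal:.3f} "
      f"(exponential law gives {expected_signal:.3f})")
```

In this toy picture, each molecule contributes an abrupt step at a different time, whereas the ensemble curve is smooth; any short-lived intermediate visited during an individual step would be spread over the whole decay and contribute only weakly to the averaged signal.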
Of particular interest would be to have access to the diffusive motion of the protein chain and to understand the factors determining this motion, which is mainly governed by the local roughness of the energy landscape. Diffusive motion enters the rate of folding (kobs) through the preexponential (k0) in the transition state theory expression for the rate, kobs = k0 exp(−ΔGact/kT), in which ΔGact is the folding barrier. In the classical transition state theory describing gas phase small molecule reactions, 1/k0 is in the femtosecond range. Protein folding reactions, however, differ substantially from this situation in that many noncovalent interactions, whose individual magnitudes barely exceed a few kilojoules per mole, must be broken and formed simultaneously and large entropy contributions due to nontrivial protein–solvent interactions must be taken into account. The preexponential k0 for protein-folding reactions can be modeled more realistically using Kramers' theory of barrier crossing and is related to the rate of intrachain diffusion.70 This rate has been experimentally inferred from ensemble FRET experiments on short peptides71 and more recently in experiments using triplet quenching72,73 or triplet–triplet energy transfer74,75 to study kinetics of contact formation between two sites on a polypeptide. These studies and microscopic folding models indicate that a realistic value of k0 may be as small as 10⁷ s⁻¹, setting a “speed limit” for protein folding of around 100 ns (k0 reflects the folding rate in the absence of an energy barrier).76 It is possible, though, that in larger proteins, where the energy landscape is significantly rougher, the rate of intrachain diffusion may have a substantially smaller value.65
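As a rough numerical check of the rate expression and the speed limit quoted above, the following sketch evaluates kobs = k0 exp(−ΔGact/kT) for a preexponential of 10⁷ s⁻¹, the value consistent with a barrier-free folding time of about 100 ns. The barrier heights, the temperature of 298 K, and the function name are assumptions made for illustration only; barriers are entered per mole, so the gas constant R = N_A·k is used in place of k.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K
AVOGADRO = 6.02214076e23   # 1/mol
GAS_CONSTANT = BOLTZMANN * AVOGADRO  # J/(mol K)

def folding_rate(k0, barrier_kj_per_mol, temperature_k=298.0):
    """Return kobs = k0 * exp(-dG_act / RT) in s^-1.

    k0 is the preexponential in s^-1; the barrier is given in kJ/mol
    (hypothetical input values, not measured ones).
    """
    barrier_j_per_mol = barrier_kj_per_mol * 1.0e3
    return k0 * math.exp(-barrier_j_per_mol / (GAS_CONSTANT * temperature_k))

k0 = 1.0e7  # s^-1, preexponential consistent with a ~100 ns speed limit

# Without a barrier, kobs equals k0 and the folding time is 1/k0.
speed_limit_s = 1.0 / folding_rate(k0, 0.0)
print(f"barrier-free folding time: {speed_limit_s * 1e9:.0f} ns")

# Illustrative barriers (assumed values) show the exponential slowdown.
for barrier in (5.0, 10.0, 20.0):
    k_obs = folding_rate(k0, barrier)
    print(f"barrier {barrier:4.1f} kJ/mol -> kobs = {k_obs:.2e} s^-1, "
          f"folding time = {1.0 / k_obs:.2e} s")
```

Because the barrier enters exponentially, a change of only a few kilojoules per mole (the scale of a single noncovalent interaction) shifts the folding time by roughly an order of magnitude, whereas k0 sets the absolute upper limit on the rate.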
Single-molecule detection (SMD) experiments are well-suited for analyzing heterogeneous events such as protein-folding reactions.77 SMD allows real-time observations of a single molecule, thus removing ensemble and time averaging that are present in ensemble methods. SMD allows the study of asynchronous or nonsynchronizable reactions, the discovery of short-lived (nanosecond to millisecond lifetimes) transient intermediates, and the observation of full-time trajectories of pathways. Single-molecule experiments also allow the visualization of folding subpopulations and their direct quantification. Dynamics and extent of structure can thus be studied within such subpopulations, even under conditions of their coexistence.