De novo protein design has historically been used to test the principles governing protein folding and assembly (1–3). These principles have also been extended to the design of structures capable of binding metal ions (4, 5), peptides (6–8), DNA (9, 10), inorganic materials (11), and proteins that catalyze reactions similar to those found in nature (12–15). However, protein design might have greater impact when applied to the engineering of controllable, structurally-defined molecular assemblies (16). A solution to this problem would enable the manipulation and organization of objects on the molecular and atomic levels – a major challenge of modern nanoscience.
We describe a general approach for designing molecules that assemble along geometrically-specific surfaces into a pre-defined superstructure. Earlier studies focused on amphiphilic peptides that encourage binding and assembly at soft interfaces (17–19), but without explicit consideration of interpeptide packing geometry that defines the nano- to macrostructure of the overall complex. A good design strategy for encoding a specific mode of assembly is to engineer a protein structural unit that presents a functional group compatible with the targeted surface and associates into a periodic superstructure with a geometric repeat matching that of the targeted substrate (). However, an infinite continuum of such symmetry-matching arrangements can be generated out of common protein structural units. Thus, the most challenging aspect of designing such a surface-organizing assembly is the identification of a reasonable superstructure geometry, a problem we address in this study. Here, we apply our approach to design peptides that wrap SWNTs in a structurally-specific manner, creating a richly textured molecular surface. Previously studied biomolecules that interact with SWNTs include single-stranded DNA molecules (20, 21), nanotube-binding peptides selected by phage-display (22), and synthetic peptides with chemical features that favor SWNT binding (23, 24). Beyond interacting with and solubilizing SWNTs, a unique and relatively unexplored potential offered by biomolecules is the ability to program structurally-specific modes of surface assembly, enabling nucleation of further superstructure, functionalization, and manipulation (25).
The design process consists of three selection rules, which successively restrict the space of possible peptide-surface assemblies, and ultimately dictate peptide sequence (see ). Selection rule 1 identifies groups compatible with the target surface, as well as a protein structural unit capable of displaying such groups in a productive manner (see ). Selection rule 2 defines the intersubunit packing of these units on the target surface. Symmetry operations are used to create an elementary unit cell, which is then replicated to match the geometric repeat of the surface (see ). A continuum of assemblies remains possible at this point, each creating new protein-protein interfaces, within the unit cell and between neighboring unit cells. The key insight is provided by selection rule 3, which ensures that these interfaces are designable – that is, they can be accommodated in a stable and specific manner (see ). Designable protein structural motifs occur frequently in nature, such that a structural database search can be used to assess the feasibility of specific intersubunit packing in addition to revealing sequence features that encode it (26). In summary, the three selection rules define the intrinsic recognition motif, and its packing into a higher-order assembly in accord with the long-range order of the underlying surface.
These selection rules emerged from our efforts to engineer peptides targeting common species of SWNTs. In picking a functional group for contacting the SWNT (selection rule 1), we avoided strong hydrophobic recognition motifs employed in earlier studies (23), instead relying on weaker protein-SWNT interactions to encourage the cooperative formation of the intended higher-order assembly (see ). We therefore chose the Cα methylene of Gly or the Cβ methyl of Ala, presented in a repeating manner on an α-helix as the elementary structural unit.
Selection rule 2 stipulates that the arrangement of protein structural units should match the symmetry of the underlying surface. The cylindrical shape of a SWNT suggested an assembly with rotational or rotational screw symmetry, so we considered α-helical coiled coils forming a supercoil along the SWNT axis (). Common SWNTs have relatively hydrophobic surfaces, and radii in the range of ~3.75–4.1 Å (for the (5,6), (5,7), and (3,8) chiralities). This, together with the choice of a small sidechain for surface recognition, defined the radius of the coiled coil to be around 9 Å, restricting the stoichiometry of the bundle to between 5 and 7 units (26). We chose an antiparallel hexamer over a parallel α-helical bundle to exploit the additional degree of freedom (axial shift) available to antiparallel interfaces (26). Although SWNTs are relatively smooth, their electronic surface is not entirely homogeneous, and we considered that it may be advantageous in design to match the pitch angle of the helices formed by overlapping benzenoid rings down SWNT surfaces (see ) (27).
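To give a feel for the geometry implied by matching a pitch angle at a fixed coiled-coil radius, the sketch below converts an angle and a radius into a superhelical pitch. It assumes the pitch angle is measured between the helix tangent and the tube axis, so that tan(angle) = 2πR / pitch; this convention and the function name `superhelical_pitch` are illustrative assumptions, not taken from the text. The 9 Å radius and the three pitch angles are the values quoted in the text for the targeted tubes.

```python
import math

def superhelical_pitch(radius_A, pitch_angle_deg):
    """Axial rise per full turn of a helix wound on a cylinder.

    Assumes (illustratively) that the pitch angle is measured between the
    helix tangent and the cylinder axis, so tan(angle) = 2*pi*radius / pitch.
    The sign of the angle encodes handedness and does not affect magnitude.
    """
    angle = math.radians(abs(pitch_angle_deg))
    if angle == 0:
        return math.inf  # helices parallel to the axis never complete a turn
    return 2 * math.pi * radius_A / math.tan(angle)

COIL_RADIUS_A = 9.0  # approximate coiled-coil radius quoted in the text

# Pitch angles reported in the text for the three targeted tubes.
for tube, angle in [("(3,8)", -14.7), ("(5,7)", -5.5), ("(5,6)", -3.0)]:
    pitch = superhelical_pitch(COIL_RADIUS_A, angle)
    print(f"{tube}: angle {angle} deg -> pitch {pitch:.0f} A per turn")
```

Under these assumptions the −14.7° angle corresponds to a pitch of roughly 216 Å, and shallower angles give much longer pitches, that is, nearly straight helices.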
Although the first two selection rules identified a specific topology, a large number of possible bundles with reasonable interfaces could be generated based on the four remaining parameters: the inter-helical separation, starting helical phase, superhelical pitch, and helical axial shift. Allowing fifty discrete values for each parameter within geometrically feasible ranges results in 6,250,000 possible design templates. We had previously found that no more than 1 in 100 α-helical coiled coils constructed using geometrically feasible parameter values are in fact designable with natural amino acids (26). Therefore, in selection rule 3, we searched for assembly parameters that optimized the designability of the modeled interfaces, leading to a single most designable template for each targeted SWNT.
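The size of this search space follows directly from four parameters sampled at fifty values each. A minimal sketch of such a grid is given below; the parameter names mirror the text, but the numeric ranges and the helper names `linspace` and `template_grid` are hypothetical placeholders, since the sampled ranges are not listed here.

```python
from itertools import product

# Hypothetical ranges for illustration only; the text does not state the
# ranges that were actually sampled.
RANGES = {
    "interhelical_separation_A": (8.0, 11.0),
    "starting_helical_phase_deg": (0.0, 352.8),  # 50 steps of 7.2 degrees
    "superhelical_pitch_A": (100.0, 400.0),
    "axial_shift_A": (-3.0, 3.0),
}
STEPS = 50  # discrete values per parameter, as stated in the text

def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi, endpoints included."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def template_grid(ranges=RANGES, steps=STEPS):
    """Lazily yield one dict of parameter values per candidate template."""
    names = list(ranges)
    axes = [linspace(lo, hi, steps) for lo, hi in ranges.values()]
    for values in product(*axes):
        yield dict(zip(names, values))

n_templates = STEPS ** len(RANGES)
print(n_templates)         # 6250000 candidate templates
print(n_templates // 100)  # 62500: upper bound if at most 1 in 100 is designable
```

Generating the grid lazily avoids holding millions of templates in memory; each one can be scored for designability and discarded.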
To assess designability, we used a rapid distance-matrix based method for searching tertiary motifs in the Protein Data Bank (PDB) that are geometrically similar to the query interface (). The number of matches within a given cutoff of the query interface amounts to a metric of its designability, and sequences of the matches help define features encoding intersubunit packing. Since this information is gathered from a wide range of structural contexts, sequences of the matches should be highly divergent at all positions except those that are particularly critical to the stability and structural specificity of the motif. The conserved positions are held constant in design, while the variable positions provide handles for encoding additional features, such as interaction with SWNTs, modulation of solubility, stability and specificity, or recruitment of additional functionality.
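The idea of counting database motifs that resemble a query interface can be sketched as follows. This is not the search method used in the study: it is a brute-force illustration that compares internal distance matrices of equal-length Cα coordinate sets, and all function names and the toy coordinates are assumptions. Note also that a distance-matrix deviation is not the same quantity as a superposition-based Cα RMSD, and that it cannot distinguish a motif from its mirror image.

```python
import math

def distance_matrix(coords):
    """All pairwise distances between 3D points (e.g. C-alpha atoms)."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def distance_rmsd(dm_a, dm_b):
    """Root-mean-square difference over the upper triangles of two matrices."""
    n = len(dm_a)
    if n != len(dm_b) or n < 2:
        raise ValueError("matrices must have equal size of at least 2")
    sq = [(dm_a[i][j] - dm_b[i][j]) ** 2
          for i in range(n) for j in range(i + 1, n)]
    return math.sqrt(sum(sq) / len(sq))

def designability(query, candidates, cutoff):
    """Count candidates whose distance matrix lies within cutoff of the query."""
    q = distance_matrix(query)
    hits = [c for c in candidates
            if len(c) == len(query)
            and distance_rmsd(q, distance_matrix(c)) <= cutoff]
    return len(hits), hits

# Toy data: a translated copy matches, a stretched copy does not.
query = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in query]
stretched = [(2 * x, 2 * y, 2 * z) for x, y, z in query]
count, _ = designability(query, [shifted, stretched], cutoff=0.1)
print(count)
```

Because internal distances are unchanged by rotation and translation, no superposition step is needed before comparing a candidate to the query.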
The selection rules were implemented into an automated procedure and applied to design of assemblies on the surfaces of SWNTs (3,8), (5,7) and (5,6), matching both size and pitch angles to each SWNT (corresponding pitch angles were −14.7°, −5.5°, and −3°, respectively (27)). An antiparallel hexamer has two geometrically distinct helix-helix interfaces ( inset). The designability of these interfaces in the optimal template was starkly different among the three pitch angles (). For example, the optimal −14.7° template identified 119 and 89 natural motifs that were within 0.6 Å Cα RMSD of the two helix-helix interfaces comprising this assembly. The corresponding values for the best −5.5° structure were 4 and 7, and none were found within this cutoff for the −3° structure. Thus, the −14.7° template would be considered a much more designable target using common, genetically encoded amino acids.
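The comparison above can be restated as a small ranking exercise. The match counts are those reported in the text; scoring each template by its less designable interface (the minimum of the two counts) is an assumption made here for illustration, and the text does not say how the two counts were combined.

```python
# Match counts reported in the text for the two helix-helix interfaces of
# each optimal template (0.6 A C-alpha RMSD cutoff); "none" is recorded as 0.
MATCHES = {
    "(3,8)": {"pitch_angle_deg": -14.7, "interface_counts": (119, 89)},
    "(5,7)": {"pitch_angle_deg": -5.5, "interface_counts": (4, 7)},
    "(5,6)": {"pitch_angle_deg": -3.0, "interface_counts": (0, 0)},
}

def limiting_count(entry):
    """Score a template by its less designable interface (an assumption)."""
    return min(entry["interface_counts"])

def rank_templates(matches):
    """Return tube labels ordered from most to least designable."""
    return sorted(matches,
                  key=lambda name: limiting_count(matches[name]),
                  reverse=True)

ranking = rank_templates(MATCHES)
print(ranking)  # ['(3,8)', '(5,7)', '(5,6)']
```

Using the minimum reflects the idea that an assembly is only as designable as its weakest interface; other combinations, such as a product or sum, would give the same ordering for these particular numbers.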
Profiles of residue propensities in aligned sequences () show that optimal designability is reached when the two unique interfaces of the hexamer are quite different – one should be a “tight” Alacoil-like interface, while the other should resemble an antiparallel leucine zipper-like motif. Note that this information is obtained automatically, without resorting to extensive sidechain repacking calculations on candidate backbone structures.
Having chosen the −14.7° structure as the target, we followed two paths to complete the design process. In the first, a sequence was computationally optimized to adopt this hexameric antiparallel bundle around the (3,8) SWNT, constraining the strongly conserved positions from propensity profiles (positions d and e; ). Standard computational design techniques were applied to select the remaining variable positions (section 1.2 of Supporting Material (27)) producing two sequences, HexCoil-Gly and HexCoil-Ala (see ), differing only in the identity of the SWNT-contacting position (Gly or Ala, respectively).
In a second approach we searched the PDB for a more complex scaffold that embedded the full −14.7° hexameric bundle within it and would be amenable to further design. A structural-similarity search identified a remarkably similar bundle (0.9 Å Cα RMSD over 156 residues) in the inner ring of helices of a domain-swapped helical bundle (called DSD; PDB code 1G6U; , S4-S5) (28). Additionally, the strong sequence features discovered for the (3,8)-optimal template () were also present in DSD. Therefore, the central pore-lining Glu and Lys residues of DSD were converted to Gly or Ala to accommodate a SWNT in peptides designated DSD-Gly and DSD-Ala.
The hierarchic principles of our design approach suggest that a large portion of the driving force for assembly should originate from modestly favorable helix-helix interactions, which should stabilize the basic antiparallel dimeric unit, even in the absence of SWNTs. Without the underlying solid substrate, the hexameric bundle structure might not be the most stable one formed, but we expected to see assembly into related bundles in which the dimeric interface was preserved. Indeed, sedimentation equilibrium analytical ultracentrifugation (AUC) showed DSD-Gly and DSD-Ala to exist in a dimer-hexamer equilibrium between 10 µM and 100 µM peptide concentration (Fig. S7). HexCoil-Ala associated into tetramers (Fig. S8), whose structure was solved by X-ray crystallography using diffraction data extending to 2.44 Å resolution (see ). The asymmetric unit consists of an antiparallel dimer, whose structure is within 1.2 Å of the designed model (calculated over the backbone of 20 central residues per monomer). The designed Ala-rich face is well-situated to interact with the surface of the SWNT (). Finally, far UV circular dichroism spectroscopy (CD) of these peptides confirmed their helical content in solution and when bound to SWNT (Fig. S9). Interestingly, HexCoil-Gly, which contains multiple helix-destabilizing Gly residues, assembled only in the presence of SWNTs (Fig. S9), similar to previously designed surface-binding peptides (29, 30).
The peptides formed water-soluble assemblies of SWNTs, producing aqueous suspensions that were stable for months. Two-dimensional photoluminescence (2D-PL) spectra were used to identify individual SWNT chiralities through their characteristic resonances (31), and to rule out aggregation of SWNTs, which induces energy transfer between different species (32). Designed peptides produce SWNT suspensions with 2D-PL peaks corresponding to (5,6), (5,7), and (3,8) chiralities (). The de novo designed peptides HexCoil-Ala and HexCoil-Gly sequester significantly more SWNTs into solution, compared to DSD variants (). Interestingly, though the (3,8) species is a minor product in the mixture of SWNTs used in our experiments, HexCoil-Ala and HexCoil-Gly show a dominant peak corresponding to this chirality (). This is of particular significance given that the target substrate for these designs was indeed the (3,8) species SWNT.
A number of control peptides were prepared to evaluate the structural mode of SWNT/peptide assembly. To probe the role of the small Ala and Gly residues contacting the SWNT, native DSD and an analog of DSD-Gly with two of its Gly residues changed to His were studied. Furthermore, to test the role of helix-helix packing in the HexCoil-Gly and HexCoil-Ala, the apolar residues at the “d” and “e” positions that pack at the two distinct helix-packing interfaces, and the SWNT-contacting “a” position, were interchanged (Fig. S12). The resulting peptides, cHexCoil-Gly and cHexCoil-Ala (), have identical amino-acid compositions, hydrophobicity, and helical faces, and nearly identical hydrophobic moments (a measure of amphiphilicity) as their parents, but differ in their abilities to engage in the detailed packing interactions intended to stabilize surface assemblies. These negative control peptides (DSD, DSD-His, cHexCoil-Gly, or cHexCoil-Ala) were very inefficient at solubilizing SWNTs (, S12), verifying the intended mode of SWNT contact and suggesting that the success of our designs rests upon the ability to form favorable inter-subunit interactions and a higher-order assembly.
Once SWNTs are wrapped by peptides in a structurally determined way, their solvent-exposed surfaces can be further elaborated to direct the assembly – or even the synthesis – of a third biological or non-biological layer. To illustrate this, we used the peptide/SWNT assembly to direct nucleation and assembly of gold nanoclusters in a geometrically-defined manner. The DSD-Gly peptide appeared advantageous for these studies, as its peripheral helices packing against the central hexameric ring allow for the construction of independent outward-facing binding sites along a larger-radius superhelix, facilitating microscopic imaging. A single Cys was introduced near the N-terminus of DSD-Gly, such that pairs of symmetry-related helices created convergent gold-binding sites ( and S11). Addition of Au(III) under reducing conditions led to the appearance of 2 to 4 nm gold clusters visible by TEM (, S3). Consistent with the design model, the pattern of spots is linear and systematically in-phase, and the observed inter-particle spacing of 47 Å is in very good agreement with the model’s prediction of 52 Å (Figs. S1–3) (27).
The selection rules described here provide an objective, reproducible method to design surface-binding peptides. Their aim is to ensure that all effects are favorable for the formation of the intended assembly. Optimal interaction geometry between protein units, physicochemical compatibility between the surface and the protein, and matching between the geometry of the assembly and the symmetry of the substrate are all encoded at the same time in a “minimally frustrated” design. In applying this strategy to SWNT surfaces, we expected that the dominant surface features would be the radius and the water-repellent nature of the tube; thus, the driving force for assembly would originate primarily from matching the size and hydrophobicity of the SWNT, as well as from inter-subunit packing. Indeed, this strategy worked. The intended SWNTs were bound, thereby converting the very short-scale periodicity of a SWNT surface to the long-scale periodicity of a SWNT/protein assembly, as illustrated by using the complex to further direct the nucleation of an additional layer of gold nanoparticles.
SWNTs present a challenging case for organizing structurally-specific assemblies due to their relatively featureless surfaces. Other molecular surfaces, such as ionic structures or boron nitride nanotubes (33), are likely to have much higher heterogeneity in presented atomic groups, leading to better potential for anisotropy with respect to surface interactions. In such cases, we would expect the orientation of the coating assembly relative to the crystal lattice to be a very important discriminator and director of order. It is encouraging that even with the rather simple and smooth surfaces of SWNTs we have already achieved a significant level of success. The DSD and HexCoil series of peptides illustrate different endpoints of the design process. Whereas the DSD scaffold was serendipitously discovered to approximately match the assembly geometry optimized via our approach, HexCoil-Ala and HexCoil-Gly were designed de novo to bind the (3,8) SWNT. Thus, it is encouraging that the latter peptides are more efficient and significantly more selective agents for solubilizing the desired target, showing a strong preference for solubilizing this tube type despite it being a minor component in a mixture of SWNTs. It is possible that the interfaces in the HexCoil peptides, which are unencumbered by the presence of a more involved tertiary packing, are sufficiently preorganized to allow selective binding, but not so rigid as to require a perfect fit for selective recognition to take place.
In summary, biological systems specialize in assembly, and hybrid nano-bio structures provide a powerful way to direct the assembly and tune the properties of nanomaterials. Computational protein design provides the means to do so in a highly directed and functionally relevant manner (34).