Protein domains and peptide sequences are a powerful tool for conferring specific functions to engineered biomaterials. Protein sequences with a wide variety of functionalities, including structure, bioactivity, protein-protein interactions, and stimuli responsiveness, have been identified, and advances in molecular biology continue to pinpoint new sequences. Protein domains can be combined to make recombinant proteins with multiple functionalities. The high fidelity of the protein translation machinery results in exquisite control over the sequence of recombinant proteins and the resulting properties of protein-based materials. In this review, we discuss protein domains and peptide sequences in the context of functional protein-based materials, composite materials, and their biological applications.
Studies that require biomaterials with well-controlled physical and biological properties have increased dramatically since the concept of tissue engineering was first proposed. Cellular activities are often composite responses to multiple environmental stimuli [1,2]. To study how individual stimuli affect cellular response, precise control over the microenvironment is critical. However, this control is often very difficult to achieve in vivo. On the other hand, a defined in vitro microenvironment could be easily created from biomaterials with controlled properties. Efforts have been made to create such biomaterials with synthetic polymers, including poly(lactic-co-glycolic acid) and poly(ethylene glycol) (PEG), or with natural polymers, such as polysaccharides. In this review, we focus on biomaterials utilizing protein domains or peptide sequences in recombinant proteins or composite materials.
Progress in molecular biology has enabled the facile production of recombinant proteins. Furthermore, maturation in large-scale production techniques makes protein-based materials more economically feasible. New techniques such as incorporation of non-canonical amino acids further expand the possibilities of protein-based biomaterials.
A recombinant protein is designed at the DNA level, and DNA sequences encoding different protein domains can be assembled in the desired order with a specified number of repeats. This modularity in recombinant protein design enables the production of recombinant protein-based materials with diverse properties. The translation machinery also has high fidelity so that the desired recombinant protein will have the specified amino acid sequence. On the other hand, synthetic polymers or proteins harvested from nature often have dispersity in chain length or composition. Thus, the high fidelity in recombinant protein production promises precise control over material properties.
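The modular assembly described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Domain` class, the `assemble` function, and the example design are hypothetical, and a real construct would also need a start codon, linkers, purification tags, and codon-level design at the DNA level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """A protein domain in one-letter amino acid code (hypothetical helper)."""
    name: str
    sequence: str
    repeats: int = 1


def assemble(domains):
    """Concatenate domains in the given order, expanding each repeat count."""
    return "".join(d.sequence * d.repeats for d in domains)


# Illustrative design only: an RGD cell-binding motif flanked by
# elastin-like VPGVG blocks (valine chosen arbitrarily as the guest residue).
design = [
    Domain("structural block", "VPGVG", repeats=3),
    Domain("cell-binding motif", "RGD"),
    Domain("structural block", "VPGVG", repeats=3),
]

protein = assemble(design)
print(protein)
print(len(protein), "residues")
```

Because the order and repeat count of each domain are explicit inputs, changing the design list is the in silico analogue of reordering the DNA segments that encode each domain.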
In general, a recombinant protein designed for tissue engineering applications is composed of modular domains that confer specific functions. For example, structural domains provide mechanical properties, and biological domains facilitate the interactions between cells and the materials. In addition, domains can be used that respond to environmental stimuli or enable spatiotemporal control over material properties. In this review, we focus on domains that are being widely used in protein-based materials or are being incorporated into composite materials for added functionality.
Recombinant proteins used as scaffold materials in tissue engineering serve as a temporary matrix before the desired tissue is regenerated. Thus, recombinant proteins should have appropriate structural domains that provide mechanical support and a microenvironment that supports cell proliferation and differentiation. An ideal structural domain should provide appropriate mechanical properties that match those of the surrounding tissues and should not trigger inflammation or adverse immune responses.
Elastin-like polypeptides (ELPs) are based on sequences derived from native elastin and are being actively studied. Elastin is the major component that provides elasticity to the extracellular matrix (ECM). ELPs are able to mimic the mechanical properties of native elastin and are mostly composed of a repeating amino acid sequence (VPGXG)n, where X is a guest residue consisting of any amino acid except proline.
The flexibility in choosing the guest residue expands the functionality of ELPs as a structural domain. One example is the use of lysine as a guest residue. This choice allows crosslinking through the primary amine side chain, and a range of mechanical moduli can be achieved with different degrees of crosslinking. Another example is the use of cysteine as a guest residue to facilitate surface immobilization of ELPs or crosslinking of free ELPs.
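As a minimal sketch of the guest-residue idea, the following Python helper builds a (VPGXG)n sequence by cycling through a user-supplied pattern of guest residues and rejecting proline. The function name `elp_sequence` and the example pattern are assumptions made for illustration; they are not taken from any of the cited studies.

```python
from itertools import cycle, islice

# The 20 canonical amino acids minus proline, which is excluded as a guest.
ALLOWED_GUESTS = set("ACDEFGHIKLMNQRSTVWY")


def elp_sequence(guests, n_repeats):
    """Return (VPGXG)n, cycling through `guests` to fill position X."""
    guests = guests.upper()
    if not guests or any(g not in ALLOWED_GUESTS for g in guests):
        raise ValueError("guest residues must be canonical amino acids other than proline")
    return "".join(f"VPG{x}G" for x in islice(cycle(guests), n_repeats))


# Hypothetical pattern: one lysine (a crosslinking handle) per two pentapeptides.
print(elp_sequence("VK", 4))
```

Changing the ratio of lysine to other guests in the pattern is one simple way to vary the density of potential crosslinking sites along the chain.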
ELPs are also thermo-responsive, and this feature can be modulated using the guest residues. ELPs exhibit lower critical solution temperature (LCST) behavior; they are soluble below the LCST and form a coacervate, or a dense polymer-rich liquid phase, above the LCST. The Rodríguez-Cabello group has utilized the LCST behavior of ELPs to facilitate hydrogel formation of ELP-fusion proteins [9,10]. In addition, the thermo-responsive behavior has been used to trigger the formation of spherical ELP structures. Structures formed by ELPs include hollow spheres or spheres with a dense core, and their size distributions can be controlled by salt concentration or by the number of repeating units. ELP spheres have been used as templates for nanoparticle synthesis and in other applications, such as drug delivery vehicles.
A recent study expanded the versatility of ELP-based materials by mixing them with peptide amphiphiles (PAs) to form self-assembled structures. ELPs mixed with PAs formed multilayer membrane architectures with each layer containing both components. The structure changed when the membrane interface was in contact with other surfaces. The authors demonstrated that a complex multi-way tube structure could be formed from a relatively simple one-way tube (Figure 1a).
Resilin, which is found in insect cuticle, has drawn great interest with its high elasticity, high resilience, and heat stability. The Elvin group first developed resilin-like polypeptides (RLPs) based on sequences from Drosophila melanogaster (Dros16, (GGRPSDSYGAPGGGN)n) and Anopheles gambiae (An16, (AQTPSSQYGAP)n), and both RLPs possess mechanical properties and heat stability that are similar to those of native resilin. Unlike ELPs, there are no guest residues in RLP sequences; however, RLP amino acid sequences in which lysine residues have been inserted to serve as a crosslinking site still retain resilin-like characteristics [18,19].
Photochemical crosslinking through tyrosine residues in RLPs can be mediated through Ru(II), and crosslinked RLPs have been explored for various applications. For example, Lv et al. reported a recombinant protein based on Dros16 and an RGD-containing domain from tenascin-C. Crosslinked hydrogels of this material had tunable mechanical properties controlled by protein concentration and facilitated spreading of human lung fibroblasts. Another example is the use of photocrosslinked RLPs to modify tissue culture polystyrene surfaces. High coating concentrations prevented fibroblast attachment and spreading, but cell attachment could be restored by incorporating an RGD peptide in the coating.
Outside of the field of tissue engineering, RLPs have been applied to green synthesis of fluorescent gold nanoclusters. Specifically, RLPs served as reducing agents during synthesis and as stabilizers after particle formation. Another example is the development of biocomposite adhesives with enhanced mechanical properties. In particular, RLPs were used to directly introduce nano-crystalline cellulose into epoxy resins.
A variety of crosslinking strategies have been developed for crosslinking structural domains. A straightforward approach is to include amino acids with reactive side chains in the structural domains. Crosslinking can thus be achieved by reagents that react with those side chains. Examples include crosslinking lysine with N-hydroxysuccinimide (NHS) esters or tris(hydroxymethyl)phosphine (THP) and cysteine with maleimide. Tyrosine is photochemically reactive in the presence of a Ru(II) catalyst. Enzyme-facilitated bond formation has also been utilized for crosslinking. Examples of these enzymes include lysyl oxidase, peroxidase, and transglutaminase.
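The side-chain chemistries listed above lend themselves to a simple lookup. The sketch below counts the lysine, cysteine, and tyrosine residues in a sequence and pairs each with the reagent class named in the text. The dictionary, the function name `crosslink_sites`, and the example sequence are illustrative assumptions; the count says nothing about whether a given residue is actually accessible or reactive in a folded or assembled material.

```python
from collections import Counter

# Residue -> crosslinking chemistry, as listed in the paragraph above.
CROSSLINK_CHEMISTRY = {
    "K": "NHS ester or THP",
    "C": "maleimide",
    "Y": "Ru(II)-mediated photocrosslinking",
}


def crosslink_sites(sequence):
    """Map each reactive residue present to (count, chemistry)."""
    counts = Counter(sequence.upper())
    return {
        residue: (counts[residue], chemistry)
        for residue, chemistry in CROSSLINK_CHEMISTRY.items()
        if counts[residue] > 0
    }


# Hypothetical chimera: three ELP pentapeptides (two with lysine guests)
# followed by one An16 resilin-like repeat.
example = "VPGKGVPGVGVPGKG" + "AQTPSSQYGAP"
for residue, (count, chemistry) in crosslink_sites(example).items():
    print(residue, count, chemistry)
```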
The previous strategies require additional reagents for crosslinking; however, protein domains that form covalent bonds between two protein chains have been developed. The N-terminal and C-terminal domains of a self-splicing intein from Nostoc punctiforme have been utilized to facilitate protein hydrogel formation. Also, the SpyTag and SpyCatcher pair, which is derived from Gram-positive bacterial adhesins, forms an isopeptide bond between specific aspartic acid and lysine residues. The SpyTag and SpyCatcher domains have been used to form crosslinks in ELP hydrogels (Figure 1b), and the resulting crosslinked network can be controlled by the location of the SpyTag and SpyCatcher domains within ELP backbones [26,27].
Domains with strong protein-protein interactions can serve as physical crosslinking sites. For example, the Heilshorn group has developed a two-component system that forms a gel due to the physical crosslinking of WW and proline-rich peptide domains [28,29]. By using WW domains with varying dissociation constants (Kd), different gelation behaviors and mechanical properties can be achieved. Another example of physical crosslinking is the use of leucine zippers, which are α-helices that form coiled-coil domains. In work by Huang et al., physical crosslinking of leucine zippers was further stabilized by disulfide bond formation through incorporated cysteine residues.
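To see why the dissociation constant matters for physical crosslinking, consider the textbook 1:1 binding relation, in which the fraction of occupied binding sites is [L] / (Kd + [L]) when the free partner concentration [L] is in excess. The Python sketch below evaluates that relation; it is a simplified illustration, not the model used in the cited WW-domain studies, and the concentrations and Kd values are hypothetical.

```python
def fraction_bound(partner_conc, kd):
    """Fraction of sites occupied for simple 1:1 binding.

    Both arguments must be in the same concentration units.
    Assumes the binding partner is in excess (free ~ total).
    """
    if partner_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return partner_conc / (kd + partner_conc)


# Hypothetical comparison at a fixed partner concentration (arbitrary units):
# a lower Kd gives a larger fraction of engaged crosslinks.
for kd in (1.0, 10.0, 100.0):
    print(f"Kd = {kd:>5}: fraction bound = {fraction_bound(10.0, kd):.2f}")
```

When the partner concentration equals Kd, half of the sites are occupied, which gives a quick reference point when comparing domains of different affinity.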
One advantage of modular recombinant proteins is that bioactive cues can be directly incorporated among the structural domains at the desired location and density. In tissue engineering, the following issues have been addressed by the strategic fusion of bioactive domains: cell-material interactions, cell fate determination, and material response to cellular activities.
Cell attachment to the material is often the first consideration when designing biomaterials. Many cell-adhesive domains are derived from ECM proteins such as fibronectin, collagen, and laminin. The RGD sequence is a classic cell-binding motif that is derived from the 10th type III domain of fibronectin (FNIII). The RGD sequence is often presented with the PHSRN synergy site, which is derived from the 9th FNIII domain, to increase cell adhesion and target the α5β1 integrin. The CS5 sequence, with the minimum sequence of REDV, is another cell-binding domain derived from fibronectin. Hsueh and coworkers reported that ELP proteins containing the CS5 domain supported murine Schwann cell proliferation.
Another set of cell-binding domains is derived from laminins, which are heterotrimeric proteins found in the basal lamina. The PPFLMLLKGSTR sequence is derived from the laminin-5 α3 chain globular domain 3 (LG3). Including this domain within an ELP facilitated human keratinocyte attachment. This work also examined keratinocyte attachment with ELPs containing two other binding domains: fibronectin domains containing the RGD and PHSRN sequences and the GEFYFYDLRLKGDK sequence derived from the α1 chain of collagen type IV. Keratinocytes attached more quickly to proteins with the laminin or fibronectin domains compared to the collagen domain. The authors found that keratinocyte attachment to the laminin and fibronectin domains could be reduced with antibodies against the α3 and α5 integrin subunits, respectively, and thus concluded that the keratinocytes were likely utilizing the α3β1 and α5β1 integrins, respectively, to adhere to those proteins.
Many other cell-binding domains have been reported (e.g., DGEA from collagen type I and IKVAV and YIGSR from laminin) and are described in a review. However, new cell-binding domains are still being identified. Lee and coworkers recently showed that the C-terminal RKRK sequence cannot completely account for cell attachment to human tropoelastin and suggested that there is a new binding domain in domains 17 and 18 of tropoelastin.
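When designing or checking a recombinant sequence, it can be useful to locate the short cell-binding motifs mentioned in this section. The following Python sketch scans a sequence for those motifs by exact string matching. The motif table is limited to the short sequences named above, and the function name `find_motifs` and the test sequence are hypothetical; a sequence match alone does not establish that the motif is exposed or bioactive in the final material.

```python
# Short cell-binding motifs named in the text and their source proteins.
CELL_BINDING_MOTIFS = {
    "RGD": "fibronectin",
    "PHSRN": "fibronectin (synergy site)",
    "REDV": "fibronectin (CS5)",
    "DGEA": "collagen type I",
    "IKVAV": "laminin",
    "YIGSR": "laminin",
}


def find_motifs(sequence):
    """Return sorted (start_index, motif, source) tuples for every match."""
    seq = sequence.upper()
    hits = []
    for motif, source in CELL_BINDING_MOTIFS.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif, source))
            start = seq.find(motif, start + 1)  # allow overlapping matches
    return sorted(hits)


# Hypothetical design: ELP blocks carrying one RGD and one YIGSR motif.
for start, motif, source in find_motifs("VPGVGRGDVPGVGYIGSRVPGVG"):
    print(start, motif, source)
```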
Promoting stem cell differentiation into the desired cell lineage and preventing committed cells from de-differentiating are critical to the success of tissue engineering. Because growth factors play an important role in morphogenesis, it is logical that many of the bioactive domains that are capable of determining cell commitment are derived from the corresponding growth factors.
Bone morphogenetic proteins (BMPs) are a family of growth factors that are involved in the development of different tissues, but they are best known for their roles in bone morphogenesis. Recombinant human BMP-2 and -7 are approved by the Food and Drug Administration (FDA) for clinical applications. The KIPKASSVPTELSAISTLYL peptide is derived from the knuckle epitope of human BMP-2 and has been widely used as a bioactive domain to promote osteogenesis. Kim et al. recently showed that the BMP-2 peptide retained its bioactivity when fused in an RLP backbone and that the fusion protein enhanced osteogenic differentiation. The peptide has also been reported to promote chondrogenesis of human mesenchymal stem cells (hMSCs) in pellet culture.
Three peptides have been derived from human BMP-7: SNVILKKYRN, KPSSAPTQLN, and KAISVLYFDDS. In a recent work by Tao and coworkers, these three peptides were incorporated into self-assembling peptides and promoted ECM secretion by human degenerated nucleus pulposus cells. These results demonstrate the potential that these peptides have for intervertebral disc regeneration, which is a major clinical application of recombinant human BMP-7. It is expected that these BMP-7 derived peptides could be easily incorporated into recombinant protein-based materials through the modular design approach.
Growth factors such as vascular endothelial growth factor (VEGF) and platelet-derived growth factor (PDGF) play an important role in blood vessel formation in regenerating tissue. The KLTWQELYQLKYKGI sequence (named QK) is based on VEGF. The QK peptide retained its bioactivity and promoted endothelial cell behavior when crosslinked to a protein backbone or presented as a fusion protein. Besides its well-known effects on angiogenesis, VEGF also has protective effects on neuronal cells. Verheyen and coworkers showed that the QK peptide also displayed this function and protected neurons from paclitaxel toxicity and hyperglycemic stress.
The previous sections focus on the use of shorter bioactive peptides; however, entire growth factors can also be integrated into fusion proteins without loss of their biological functions. Sun et al. used the SpyTag-SpyCatcher system to fuse chimeric leukemia inhibitory factor (MH35-LIF), which suppresses embryonic stem cell differentiation, to a crosslinked protein hydrogel. Mouse embryonic stem cells cultured in these hydrogels remained pluripotent without any additional LIF supplements. It is noteworthy that, although LIF is expressed as a glycoprotein in mammals, the recombinant version expressed by bacteria remains bioactive. As progress in molecular biology and protein engineering continues, it is anticipated that more growth factor sequences will be identified that can be harnessed as bioactive domains in recombinant protein design.
Biomaterial degradation is an important factor in tissue engineering. Ideally, degradation should synchronize with cellular regeneration so that there will be room for newly formed tissue. Peptide sequences that are sensitive to proteases can be used as degradation domains in recombinant protein-based materials. For example, ELPs are usually used as structural domains; however, they can also serve as degradation domains due to their sensitivity to elastase.
Degradation domains can also be explicitly incorporated into modular designs. Popular choices are matrix metalloproteinase (MMP)-sensitive sequences. MMPs are a family of endopeptidases that can degrade ECM proteins including collagen, elastin, fibronectin, and laminin. MMP-sensitive sequences have been widely used in applications requiring material degradation. For example, Price and coworkers incorporated an MMP-sensitive sequence into a silk-elastin-like protein hydrogel for viral-mediated gene delivery for cancer treatment. When implanted into mice, MMP-sensitive hydrogels had higher cell invasion compared to MMP-insensitive hydrogels (Figure 1c). Tumor-bearing mice treated with MMP-sensitive hydrogels had the highest survival rate.
MMPs also play important roles in many physiological events such as morphogenesis and tissue remodeling. For example, Fonseca and coworkers profiled the in vitro gene expression of hMSCs grown in basal or osteogenic medium and found that in osteogenic medium there was an increase in MMP-14 gene expression levels and alkaline phosphatase (ALP) activity at one week. Thus, it is anticipated that by selecting specific MMP-sensitive sequences it is possible to design materials that are selectively degraded by desired cells at specific stages of differentiation. For example, Sridhar et al. used an MMP-sensitive sequence, KCGPQGIWGQCK, as a degradable crosslinker for PEG hydrogels. Cells encapsulated in the degradable gels showed higher glycosaminoglycan (GAG) and collagen deposition compared to those in non-degradable hydrogels (Figure 1d). In general, MMPs share common features in their cleavage sites. A recent study by Kukreja and coworkers analyzed target sequences of 18 MMPs and identified information for predicting and designing MMP-cleavable sequences. Overall, understanding the specificities of MMPs and identifying their cleavage sequences will expand the possibilities of using degradation domains for targeted applications.
In an ideal functional biomaterial, the material properties are precisely controlled, and the material is responsive to dynamic cellular activities and changes in environmental conditions. Therefore, domains that enable material responsiveness and spatial or temporal control over material properties have been explored. The Davies group utilized two MMP sequences with different enzyme specificities (one sequence is recognized by many MMPs whereas the other sequence is recognized by a specific MMP) and successfully modulated the in vitro invasion rates of fibroblasts and vascular smooth muscle cells into PEG hydrogels. They also demonstrated control over in vivo cellular invasion by making hydrogels with a mixture of MMP-sensitive sequences.
Leucine zippers have been utilized to control ligand accessibility and density. The stability and oligomerization states of leucine zippers can be easily tuned by changing their amino acid sequences. A heterodimeric leucine zipper pair was used to reversibly enable access to an RGD cell-binding domain. Exposure of gold nanorods to near-infrared (NIR) light resulted in a photothermal effect, which effectively denatured the leucine zippers and enabled access to the RGD sequence (Figure 1e). Leucine zippers have also been utilized to increase ligand avidity. For example, a trimeric leucine zipper, cartilage matrix protein (CMP), was recently used to present a ligand for epidermal growth factor receptor (EGFR). The EGFR ligand was recombinantly fused to a CMP domain, and the fusion proteins oligomerized to form trimers through the CMP domain. A monomeric control was constructed by fusing the EGFR ligand to a mutated CMP domain, which was unable to oligomerize to form a trimer. When the ligand was presented as a CMP-facilitated trimer, it demonstrated enhanced binding strength compared to the monomeric ligand.
Conditional-splicing inteins are an intriguing class of domains in the recombinant protein toolbox. Unlike normal inteins, conditional-splicing inteins only splice after an environmental stimulus. Informative reviews of recent progress on inteins are available [50,51]. Recently, photoactivatable inteins have been achieved by incorporating non-canonical, photoactive amino acids. For example, both cysteine and serine residues in inteins have been modified with a photocage, and these photoactivatable inteins are promising protein-labeling tools with exquisite spatiotemporal control that can be directly used in live mammalian cells [52,53]. Inteins with photoactive tyrosines have also been used as a tool for making cyclic peptides.
Biomaterials with sophisticated control over properties are increasingly needed to address questions regarding cellular behavior for tissue engineering applications. These materials need to meet demanding requirements for specific mechanical properties, macromolecular structure, cell-material interactions, and responsiveness towards environmental changes. Protein domains and peptide sequences provide the desired functionality to recombinant proteins or to composite materials. In particular, recombinant proteins are promising materials because their modularity enables the facile combination of domains that confer the desired structure, bioactivity, and functionality. As we continue to unlock the sequence-structure-function relationship of natural proteins, we can expand the number of domains available as part of our recombinant protein toolbox. Thus, protein domains can be used to design materials that can address a larger variety of questions not only in the field of tissue engineering but also in stem cell and developmental biology, pharmaceutical engineering, and clinical practice.
This work was supported by NIH (NIDCR R03DE021755) and the American Heart Association Scientist Development Grant (12SDG8980014).
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.