Actin assembly beneath enterohemorrhagic E. coli (EHEC) attached to its host cell is triggered by the intracellular interaction of its translocated effector proteins Tir and EspFU with human IRSp53 family proteins and N-WASP. Here, we report the structure of the N-terminal I-BAR domain of IRSp53 in complex with a Tir-derived peptide, in which the homodimeric I-BAR domain binds two Tir molecules aligned in parallel. This arrangement provides a protein scaffold linking the bacterium to the host cell's actin polymerization machinery. The structure uncovers a specific peptide-binding site on the I-BAR surface, conserved between IRSp53 and IRTKS. The Tir Asn-Pro-Tyr (NPY) motif, essential for pedestal formation, is specifically recognized by this binding site. The site was confirmed by mutagenesis and in vivo-binding assays. It is possible that IRSp53 utilizes the NPY-binding site for additional interactions with as yet unknown partners within the host cell.
The demand for monospecific high-affinity binding reagents, particularly monoclonal antibodies, has increased steadily in recent years. Enhanced throughput of antibody generation has been addressed by optimizing in vitro selection using phage display, which moved the major bottleneck to the production and purification of recombinant antibodies in an end-user-friendly format. Single chain (sc)Fv antibody fragments require additional tags for detection and are not as suitable as immunoglobulins (Ig)G in many immunoassays. In contrast, the bivalent scFv-Fc antibody format shares many properties with IgG and is compatible with a wide range of applications.
In this study, transient expression of scFv-Fc antibodies in human embryonic kidney (HEK) 293 cells was optimized. Production levels of 10-20 mg/L scFv-Fc antibody were achieved in adherent HEK293T cells. Use of HEK293-6E suspension cells expressing a truncated variant of the Epstein Barr virus (EBV) nuclear antigen (EBNA) 1, in combination with production under serum-free conditions, increased the volumetric yield up to 10-fold to more than 140 mg/L scFv-Fc antibody. After vector optimization and process optimization, the yield of an scFv-Fc antibody and a cytotoxic antibody-RNase fusion protein further increased 3- to 4-fold to more than 450 mg/L. Finally, an entirely new mammalian expression vector was constructed for single-step in-frame cloning of scFv genes from antibody phage display libraries. Transient expression of more than 20 different scFv-Fc antibodies resulted in volumetric yields of up to 600 mg/L and 400 mg/L on average.
Transient production of recombinant scFv-Fc antibodies in HEK293-6E in combination with optimized vectors and fed-batch shake flask cultivation is efficient and robust, and integrates well into a high-throughput recombinant antibody generation pipeline.
Recombinant Antibodies; Single Chain Fv; scFv-Fc; ImmunoRNase; Transient Mammalian Protein Production; Serum-free medium
The family of lysosome-associated membrane proteins (LAMP) comprises the multifunctional, ubiquitous LAMP-1 and LAMP-2, and the cell type-specific proteins DC-LAMP (LAMP-3), BAD-LAMP (UNC-46, C20orf103) and macrosialin (CD68). LAMPs have been implicated in a multitude of cellular processes, including phagocytosis, autophagy, lipid transport and aging. LAMP-2 isoform A acts as a receptor in chaperone-mediated autophagy. LAMP-2 deficiency causes the fatal Danon disease. The abundant proteins LAMP-1 and LAMP-2 are major constituents of the glycoconjugate coat present on the inside of the lysosomal membrane, the 'lysosomal glycocalyx'. The LAMP family is characterized by a conserved domain of 150 to 200 amino acids with two disulfide bonds.
The crystal structure of the conserved domain of human DC-LAMP was solved. It is the first high-resolution structure of a heavily glycosylated lysosomal membrane protein. The structure represents a novel β-prism fold formed by two β-sheets bent by β-bulges and connected by a disulfide bond. Flexible loops and a hydrophobic pocket represent possible sites of molecular interaction. Computational models of the glycosylated luminal regions of LAMP-1 and LAMP-2 indicate that the proteins adopt a compact conformation in close proximity to the lysosomal membrane. The models correspond to the thickness of the lysosomal glycoprotein coat of only 5 to 12 nm, according to electron microscopy.
The conserved luminal domain of lysosome-associated membrane proteins forms a previously unknown β-prism fold. Insights into the structure of the lysosomal glycoprotein coat were obtained by computational models of the LAMP-1 and LAMP-2 luminal regions.
A broad working definition of structural proteomics (SP) is that it is the process of the high-throughput characterization of the three-dimensional structures of biological macromolecules. Recently, the process for protein structure determination has become highly automated and SP platforms have been established around the globe, utilizing X-ray crystallography as a tool. Although protein structures often provide clues about the biological function of a target, once the three-dimensional structures have been determined, bioinformatics and proteomics-driven strategies can be employed to derive their biological activities and physiological roles. This article reviews the current status of SP methods for the structure determination pipeline, including target selection, isolation, expression, purification, crystallization, diffraction data collection, structure solution, refinement and functional annotation.
Protein structure analysis; X-ray crystallography; Bioinformatics; Structural proteomics
Studying the biophysical characteristics of glycosylated proteins and solving their three-dimensional structures requires homogeneous recombinant protein of high quality. We introduce here a new approach to produce glycoproteins in homogeneous form with the well-established glycosylation mutant CHO Lec3.2.8.1 cells. Using preparative cell sorting, stable, high-expressing GFP ‘master’ cell lines were generated that can be converted quickly and reliably by targeted integration via Flp recombinase-mediated cassette exchange (RMCE) to produce any glycoprotein. Small-scale transient transfection of HEK293 cells was used to identify genetically engineered constructs suitable for constructing stable cell lines. Stable cell lines expressing 10 different proteins were established. The system was validated by expression, purification, deglycosylation and crystallization of the heavily glycosylated luminal domains of lysosome-associated membrane proteins (LAMP).
Genetic factors and a dysregulated immune response towards commensal bacteria contribute to the pathogenesis of Inflammatory Bowel Disease (IBD). Animal models demonstrated that the normal intestinal flora is crucial for the development of intestinal inflammation. However, due to the complexity of the intestinal flora, it has been difficult to design experiments for detection of proinflammatory bacterial antigen(s) involved in the pathogenesis of the disease. Several studies indicated a potential association of E. coli with IBD. In addition, T cell clones of IBD patients were shown to cross react towards antigens from different enteric bacterial species and thus likely responded to conserved bacterial antigens. We therefore chose highly conserved E. coli proteins as candidate antigens for abnormal T cell responses in IBD and used high-throughput techniques for cloning, expression and purification under native conditions of a set of 271 conserved E. coli proteins for downstream immunologic studies.
As a standardized procedure, genes were PCR amplified and cloned into the expression vector pQTEV2 in order to express proteins N-terminally fused to a seven-histidine tag. Initial small-scale expression and purification under native conditions by metal chelate affinity chromatography indicated that the vast majority of target proteins were purified in high yields. Targets that gave low yields after purification, probably due to poor solubility, were shuttled into Gateway (Invitrogen) destination vectors in order to enhance solubility by N-terminal fusion of maltose binding protein (MBP), N-utilizing substance A (NusA), or glutathione S-transferase (GST) to the target protein. In addition, recombinant proteins were treated with polymyxin B coated magnetic beads in order to remove lipopolysaccharide (LPS). Thus, 73% of the targeted proteins could be expressed and purified at large scale to give soluble proteins in the range of 500 μg.
Here, we report a cost-efficient procedure to produce around 200 soluble recombinant E. coli proteins at large scale, including removal of LPS by polymyxin B coated beads for subsequent use of the proteins in downstream immunological studies.
A vector system is presented that allows generation of E. coli co-expression clones by a standardized, robust cloning procedure. The number of co-expressed proteins is not limited. Five ‘pQLink’ vectors for expression of His-tag and GST-tag fusion proteins as well as untagged proteins and for cloning by restriction enzymes or Gateway cloning were generated. The vectors allow proteins to be expressed individually; to achieve co-expression, two pQLink plasmids are combined by ligation-independent cloning. pQLink co-expression plasmids can accept an unrestricted number of genes. As an example, the co-expression of a heterotetrameric human transport protein particle (TRAPP) complex from a single plasmid, its isolation and analysis of its stoichiometry are shown. pQLink clones can be used directly for pull-down experiments if the proteins are expressed with different tags. We demonstrate pull-down experiments of human valosin-containing protein (VCP) with fragments of the autocrine motility factor receptor (AMFR). The cloning method avoids PCR or gel isolation of restriction fragments, and a single resistance marker and origin of replication are used, allowing over-expression of rare tRNAs from a second plasmid. It is expected that applications are not restricted to bacteria, but could include co-expression in other hosts such as Baculovirus/insect cells.
Human Aortic Preferentially Expressed Protein-1 (APEG-1) is a novel specific smooth muscle differentiation marker thought to play a role in the growth and differentiation of arterial smooth muscle cells (SMCs).
Good quality crystals that were suitable for X-ray crystallographic studies were obtained following the truncation of the 14 N-terminal amino acids of APEG-1, a region predicted to be disordered. The truncated protein (termed ΔAPEG-1) consists of a single immunoglobulin (Ig) like domain which includes an Arg-Gly-Asp (RGD) adhesion recognition motif. The RGD motif is crucial for the interaction of extracellular proteins and plays a role in cell adhesion. The X-ray structure of ΔAPEG-1 was determined and was refined to sub-atomic resolution (0.96 Å). This is the best resolution for an immunoglobulin domain structure so far. The structure adopts a Greek-key β-sandwich fold and belongs to the I (intermediate) set of the immunoglobulin superfamily. The residues lying between the β-sheets form a hydrophobic core. The RGD motif folds into a 3₁₀ helix that is involved in the formation of a homodimer in the crystal which is mainly stabilized by salt bridges. Analytical ultracentrifugation studies revealed a moderate dissociation constant of 20 μM at physiological ionic strength, suggesting that APEG-1 dimerisation is only transient in the cell. The binding constant is strongly dependent on ionic strength.
Our data suggest that the RGD motif might play a role not only in the adhesion of extracellular proteins but also in intracellular protein-protein interactions. However, it remains to be established whether the rather weak dimerisation of APEG-1 involving this motif is physiologically relevant.
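To give a feel for what a dissociation constant of 20 μM means for dimerisation, the following minimal sketch computes the fraction of protein chains found in dimers for a simple monomer-dimer equilibrium. It assumes the convention Kd = [M]²/[D] and a total chain concentration Ct = [M] + 2[D]; the function name and the example concentrations are hypothetical and are not taken from the study, which does not state the cellular concentration of APEG-1.

```python
import math


def fraction_in_dimer(total_conc, kd):
    """Fraction of chains in dimers for 2 M <-> D.

    Assumes Kd = [M]^2 / [D] and total_conc = [M] + 2 [D],
    with both arguments in the same units (e.g. mol/L).
    """
    if total_conc <= 0 or kd <= 0:
        raise ValueError("concentration and Kd must be positive")
    # Mass balance gives 2[M]^2/Kd + [M] - Ct = 0; take the positive root.
    monomer = (kd / 4.0) * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0)
    return 1.0 - monomer / total_conc


if __name__ == "__main__":
    kd = 20e-6  # 20 uM, the value reported at physiological ionic strength
    for ct in (1e-6, 20e-6, 200e-6):  # hypothetical total concentrations
        print(f"Ct = {ct * 1e6:6.1f} uM -> {fraction_in_dimer(ct, kd):.2f} in dimer")
```

Under these assumptions, half of the chains are dimeric when the total concentration equals Kd, and the dimer fraction falls off quickly at lower concentrations, which is consistent with the description of the dimerisation as weak and transient.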
The availability of suitable recombinant protein is still a major bottleneck in protein structure analysis. The Protein Structure Factory, part of the international structural genomics initiative, targets human proteins for structure determination. It has implemented high throughput procedures for all steps from cloning to structure calculation. This article describes the selection of human target proteins for structure analysis, our high throughput cloning strategy, and the expression of human proteins in Escherichia coli host cells.
Results and Conclusion
Protein expression and sequence data of 1414 E. coli expression clones representing 537 different proteins are presented. 139 human proteins (18%) could be expressed and purified in soluble form and with the expected size. All E. coli expression clones are publicly available to facilitate further functional characterisation of this set of human proteins.
MS is a chronic inflammatory and demyelinating disease of the CNS with as yet unknown etiology. A hallmark of this disease is the occurrence of oligoclonal IgG antibodies in the cerebrospinal fluid (CSF). To assess the specificity of these antibodies, we screened protein expression arrays containing 37,000 tagged proteins. The 2 most frequent MS-specific reactivities were further mapped to identify the underlying high-affinity epitopes. In both cases, we identified peptide sequences derived from EBV proteins expressed in latently infected cells. Immunoreactivities to these EBV proteins, BRRF2 and EBNA-1, were significantly higher in the serum and CSF of MS patients than in those of control donors. Oligoclonal CSF IgG from MS patients specifically bound both EBV proteins. Also, CD8+ T cell responses to latent EBV proteins were higher in MS patients than in controls. In summary, these findings demonstrate an increased immune response to EBV in MS patients, which suggests that the virus plays an important role in the pathogenesis of disease.
A systematic approach for identifying human proteins and protein fragments that can be expressed as soluble proteins in Escherichia coli is described.
We describe here a systematic approach to the identification of human proteins and protein fragments that can be expressed as soluble proteins in Escherichia coli. A cDNA expression library of 10,825 clones was screened by small-scale expression and purification and 2,746 clones were identified. Sequence and protein-expression data were entered into a public database. A set of 163 clones was selected for structural analysis and 17 proteins were prepared for crystallization, leading to three new structures.
High-throughput protein structure analysis of individual protein domains requires analysis of large numbers of expression clones to identify suitable constructs for structure determination. For this purpose, methods need to be implemented for fast and reliable screening of the expressed proteins as early as possible in the overall process from cloning to structure determination.
88 different E. coli expression constructs for 17 human protein domains were analysed using high-throughput cloning, purification and folding analysis to obtain candidates suitable for structural analysis. After 96 deep-well microplate expression and automated protein purification, protein domains were directly analysed using 1D 1H-NMR spectroscopy. In addition, analytical hydrophobic interaction chromatography (HIC) was used to detect natively folded protein. With these two analytical methods, six constructs (representing two domains) were quickly identified as being well folded and suitable for structural analysis.
The described approach facilitates high-throughput structural analysis. Clones expressing natively folded proteins suitable for NMR structure determination were quickly identified upon small scale expression screening using 1D 1H-NMR and/or analytical HIC. This procedure is especially effective as a fast and inexpensive screen for the 'low hanging fruits' in structural genomics.
structural genomics; hydrophobic interaction chromatography; homonuclear NMR; protein domains; high-throughput expression
Functional genomics, the systematic characterisation of the functions of an organism's genes, includes the study of the gene products, the proteins. Such studies require methods to express and purify these proteins in a parallel, time- and cost-effective manner.
We developed a method for parallel expression and purification of recombinant proteins with a hexahistidine tag (His-tag) or glutathione S-transferase (GST)-tag from bacterial expression systems. Proteins are expressed in 96-well microplates and are purified by a fully automated procedure on a pipetting robot. Up to 90 μg of purified protein can be obtained from 1 ml microplate cultures. The procedure is readily reproducible and 96 proteins can be purified in approximately three hours. It avoids clearing of crude cellular lysates and the use of magnetic affinity beads and is therefore less expensive than comparable commercial systems.
We have used this method to compare purification of a set of human proteins via His-tag or GST-tag. Proteins were expressed as fusions to an N-terminal tandem His- and GST-tag and were purified by metal chelating or glutathione affinity chromatography. The purity of the obtained protein samples was similar, yet His-tag purification resulted in higher yields for some proteins.
A fully automated, robust and cost effective method was developed for the purification of proteins that can be used to quickly characterise expression clones in high throughput and to produce large numbers of proteins for functional studies.
His-tag affinity purification was found to be more efficient than purification via GST-tag for some proteins.
Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins.
A Java program, ORFer, was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can also be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single GenBank GI identifiers or accession numbers, or lists of them. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in FASTA or tab-delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer.
The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
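The translation check mentioned above can be illustrated with a short sketch. This is not ORFer's implementation (ORFer is a Java program whose internals are not described here); it is a minimal Python example that assumes the standard genetic code only, and the function names `translate` and `orf_matches_protein` are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence; unknown codons become 'X', stops '*'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


def orf_matches_protein(cds, protein):
    """True if the ORF translates to the annotated protein sequence.

    One trailing stop codon is tolerated; internal stops count as a mismatch.
    """
    translated = translate(cds)
    if translated.endswith("*"):
        translated = translated[:-1]
    return "*" not in translated and translated == protein.upper()


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
    print(orf_matches_protein("ATGGCCTAA", "MA"))
```

A real tool would also need to handle alternative translation tables and partial coding sequences, which this sketch deliberately leaves out.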
We have developed a system to identify highly specific antibody–antigen interactions by protein array screening. This removes the need for selection using animal immunisation or in vitro techniques such as phage or ribosome display. We screened an array of 27 648 human foetal brain proteins with 12 well-expressed antibody fragments that had not previously been exposed to any antigen. Four highly specific antibody–antigen pairs were identified, including three antibodies that bind proteins of unknown function. The target proteins were expressed at a very low copy number on the array, emphasising the unbiased nature of the screen. The specificity and sensitivity of binding demonstrates that this ‘naive’ screening approach could be applied to the high throughput isolation of specific antibodies against many different targets in the human proteome.