Intimins and invasins are virulence factors produced by pathogenic Gram-negative bacteria. They contain C-terminal extracellular passenger domains that are involved in adhesion to host cells and N-terminal β-domains that are embedded in the outer membrane. Here, we identify the domain boundaries of an E. coli intimin β-domain and use this information to solve its structure and the β-domain structure of a Y. pseudotuberculosis invasin. Both β-domain structures crystallized as monomers and reveal that the previous range of residues assigned to the β-domain also includes a protease resistant domain that is part of the passenger. Additionally, we identify 146 non-redundant representative members of the intimin/invasin family based on the boundaries of the highly conserved intimin and invasin β-domains. We then use this set of sequences along with our structural data to find and map the evolutionarily constrained residues within the β-domain.
The intimins and invasins (Int/Inv) constitute a family of outer membrane proteins (OMPs) found in Gram-negative bacteria that act as adhesins. Intimins are produced by “attaching and effacing” (A/E) pathogens such as enterohemorrhagic Escherichia coli (EHEC) and enteropathogenic E. coli (EPEC). A/E pathogens intimately adhere to epithelial cells that line the intestinal wall and cause the formation of actin-rich lesions at the site of interaction. These lesions are promoted by binding of intimin to the translocated intimin receptor (Tir), a protein that is produced by the bacteria, injected into the host cell, and then integrated into the host cell membrane (Jerse et al., 1990; Kenny et al., 1997). Invasins are homologous to intimins and are produced by enteropathogenic species of the genus Yersinia (Y. enterocolitica and Y. pseudotuberculosis) (Isberg et al., 1987). Invasins bind to the β1 integrin superfamily of proteins on the surface of eukaryotic host cells and trigger a rearrangement of the host cell cytoskeleton that in turn leads to internalization of the bacteria (Isberg and Leong, 1990).
Int/Inv share a similar domain structure (Figure 1A–D): (1) an N-terminal signal peptide (SP) that targets them for secretion through the Sec complex, (2) an internal transmembrane “β-domain” that spans the outer membrane (OM), and (3) a C-terminal passenger domain that mediates interactions with host cells. Some Int/Inv also have an additional domain (LysM) between their signal peptide and β-domain that is located in the periplasm and predicted to bind peptidoglycan. The β-domain is conserved across family members, predicted to contain a β-barrel, and is necessary for the passenger domain to cross the OM (Touze et al., 2004). Structures for portions of the passenger domain have been solved (Batchelor et al., 2000; Hamburger et al., 1999; Luo et al., 2000). These structures form similar elongated rods composed of repeated bacterial immunoglobulin-like (BIG) domains and are capped at the C-terminus by C-type lectin-like domains.
Previous studies have indicated that Autotransporters (ATs) and Int/Inv share common characteristics that suggest they are structurally related (Newman and Stathopoulos, 2004; Touze et al., 2004): (1) both contain passenger domains that are secreted into the extracellular space, and (2) both contain a β-domain that is essential for the translocation of the passenger domain across the OM. However, there are some key differences: (1) Int/Inv have N-terminal β-domains and C-terminal passengers while ATs have these domains inverted (the passenger is N-terminal and the β-domain is C-terminal) and (2) Int/Inv passenger domains are composed of BIG and C-type lectin-like domains while monomeric AT passenger domains are predicted to fold as β-helices in >97% of cases (Junker et al., 2006).
For all solved monomeric AT β-domain structures, a 12-stranded β-barrel is connected to a passenger domain by an α-helical linker that spans the barrel pore (Barnard et al., 2012; Oomen et al., 2004; Tajima et al., 2010; van den Berg, 2010; Zhai et al., 2011). The Int/Inv β-domains were also thought to form β-barrels in the OM although the explicit domain boundaries and final number of β-strands were unknown (Newman and Stathopoulos, 2004; Touze et al., 2004; Tsai et al., 2010). Here we define the precise boundaries of intimin’s β-domain and identify a protease resistant extracellular domain that is not recognized as a BIG or C-type lectin-like domain. We then use this information to solve the lipidic cubic phase (LCP) X-ray crystal structures of the β-domains of intimin from EHEC strain O157:H7 and invasin from Y. pseudotuberculosis strain IP 32953. We will refer to these subtypes simply as intimin or invasin, respectively, from this point forward. Similar to AT β-domains, intimin and invasin are 12-stranded anti-parallel β-barrels containing linkers that span the barrel pore. However, their structures also have significant differences compared to the AT β-domains. The linker that spans the barrel pore adopts an extended conformation for intimin and invasin rather than α-helical as seen for ATs. The extended linker creates a large cavity on one side of the barrel pore. There is also a periplasmic α-helix between the barrel domain and linker for intimin and invasin that is not present in ATs. To conclude, we used the precise boundaries of the highly conserved β-domain to define the Int/Inv family and found groups of evolutionarily constrained residues that cluster together in the β-domain structure.
Previously, residues 189–550 of EPEC intimin were observed to form a protease-resistant domain that showed heat-modifiable mobility when subjected to SDS-PAGE. β-barrel outer membrane proteins are referred to as being heat-modifiable if they remain folded in SDS sample buffer at room temperature but then unfold upon heating. The compact, folded barrel migrates faster during SDS-PAGE than the unfolded/elongated protein. Additionally, residues 189–550 of EPEC intimin were predicted to be rich in β-sheet structure (Touze et al., 2004) and were thus proposed to contain the transmembrane β-barrel. To determine the minimal domain of intimin that contains the β-barrel, we made N-terminal deletions in a construct expressing residues 189–550 (HA-Int189–550) (Figure 2A) or C-terminal truncations of a construct expressing residues 1–550 (HA-Int1–550) (Fig 2C). We then tested to see if these truncation mutants showed heat-modifiable mobility when subjected to SDS-PAGE. As expected, HA-Int189–550 was found to be heat-modifiable (Figure 2B, lanes 1 and 2). The N-terminal deletion mutants HA-Int200–550 and HA-Int210–550 also showed typical heat-modifiable mobility whereas HA-Int220–550 did not (Figure 2B, lanes 3–8). Since residues 189–550 were previously found to be protease resistant (Touze et al., 2004), these results suggest that residues 189–210 form a periplasmic, protease resistant region upstream from the intimin β-barrel and that the N-terminal boundary of the β-barrel is located between residues 210 and 220. For the C-terminal truncation mutants, we found that deletions up to residue 450 (HA-Int1–450) did not affect intimin’s heat-modifiable mobility (Figure 2D, lanes 1–8). However, the constructs (HA-Int1–400, HA-Int1–411, and HA-Int1–430) with further C-terminal truncations were no longer heat-modifiable (Figure 2D, lanes 9–14). 
These results suggested that the C-terminus of the β-barrel lies between residues 430 and 450; however, this conclusion proved to be incorrect, as discussed below.
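The bracketing logic used in the truncation analysis above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not part of the experimental workflow: the function name `bracket_boundary` is hypothetical, and the dictionaries simply restate the heat-modifiability outcomes reported in the text for the constructs that are named explicitly.

```python
def bracket_boundary(results):
    """Return the pair of adjacent tested positions between which the
    phenotype (heat-modifiable or not) changes.

    `results` maps a tested residue position to True (heat-modifiable)
    or False (not heat-modifiable). Exactly one transition is expected.
    """
    positions = sorted(results)
    flips = [
        (a, b)
        for a, b in zip(positions, positions[1:])
        if results[a] != results[b]
    ]
    if len(flips) != 1:
        raise ValueError("expected exactly one phenotype transition")
    return flips[0]


# N-terminal deletions: first residue of the construct -> heat-modifiable?
n_terminal = {189: True, 200: True, 210: True, 220: False}

# C-terminal truncations: last residue of the construct -> heat-modifiable?
c_terminal = {400: False, 411: False, 430: False, 450: True, 550: True}

n_bracket = bracket_boundary(n_terminal)
c_bracket = bracket_boundary(c_terminal)
print("N-terminal transition between residues", n_bracket)
print("C-terminal transition between residues", c_bracket)
```

The bracket only reports where folding is lost. As the text goes on to explain, the C-terminal transition reflects the need for the linker downstream of the last β-strand rather than the end of the barrel itself.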
Intimin requires the β-barrel assembly machinery (BAM) complex for proper insertion into the OM (Bodelon et al., 2009). The BAM complex recognizes unfolded β-barrels in the periplasm by interacting with a “signature sequence” located in the final β-strand of the barrel (Robert et al., 2006). This signature sequence is comprised of hydrophobic or aromatic amino acids and includes the C-terminal residue of the final β-strand. A putative BAM signature sequence (L402YSMQFRYQF411) was identified in the intimin sequence between residues 402 and 411. We hypothesized that this stretch of residues could represent the last strand of the intimin β-barrel. However, the construct HA-Int1–411 did not show heat-modifiable mobility as would be expected for a fully folded β-barrel (Figure 2D, lanes 11 and 12). One explanation for this result could be that, similar to AT β-domains, intimin has an α-helical linker that spans the central pore of the β-barrel and stabilizes it (Barnard et al., 2007; Ieva et al., 2008). Indeed, PSIPRED calculations (Buchan et al., 2010) predict that residues 412–449 of intimin have α-helical secondary structure. Thus, if a similar α-helical linker region is located downstream of the last β-strand of intimin’s β-barrel, this could explain why HA-Int1–411 and HA-Int1–430 were not observed to be heat-modifiable. Using known AT structures as guides, we predicted that intimin’s linker should protrude from the pore of the β-barrel at approximately residue 450, and that the region encompassing residues 450–550 should be located in the extracellular space. To test this hypothesis, we assessed the proteinase K sensitivity of our C-terminal truncation mutants in whole cells. It should be noted that the region encompassing residues 450–550 of intimin is PK-resistant (Touze et al., 2004). However, we reasoned that if the PK-resistance of this region is due to tight folding, then deleting portions of this domain could render it PK-sensitive. 
Indeed, we found that in contrast to HA-Int1–550, the constructs HA-Int1–530 and HA-Int1–500 were sensitive to extracellularly added PK and were converted to PK-resistant fragments, similar in size to HA-Int1–450 (Figure 2E, lanes 1–8). This suggests that residues 1–450 are protected because they include the periplasmic domain, β-barrel, and α-helical linker of intimin and would therefore be shielded by the OM.
From these preliminary studies we concluded that the intimin β-domain likely contains a β-barrel composed of residues 210–411 and an α-helical linker that passes through the barrel pore, connecting the barrel to the extracellular passenger domain. The linker likely exits the barrel pore at the cell surface near residue 450. The PK-resistant domain formed by residues 450–550, which we will refer to as domain “D00” from this point forward (Figure 1B), is not recognized as a BIG or C-type lectin-like domain by the Pfam database, suggesting that its structure and function might differ from those of the rest of the passenger domain.
Having determined that the intimin β-barrel and linker are located between residues 210 and 450, we cloned amino acids 208–449 from intimin into a pET9 vector. A PelB signal sequence was used to target intimin for secretion and, for the purposes of purification, a 10xHis tag and a TEV cleavage site were inserted between the PelB signal sequence and the intimin coding region (Supplementary Figure S1A). This construct (10xHis-TEV-Int208–449) was used to over-express the intimin β-domain and we were able to solubilize the recombinant protein from isolated membranes with detergent. Purified intimin β-domain showed heat-modifiable mobility when subjected to SDS-PAGE, indicating that it is properly folded in the detergent micelle environment (Supplementary Figure S1C). We crystallized it from lipidic mesophases in the C2221 space group and collected 1.85 Å resolution data (Table 1). Exhaustive attempts at structure solution via molecular replacement using β-barrel outer membrane proteins containing 10, 12, 14, or 16 β-strands proved unsuccessful. Thus, selenomethionine (SeMet) derivatized intimin was expressed, purified, and crystallized using the 10xHis-TEV-Int208–449 construct (Table 1). Although analysis of the data with the program PHENIX (Adams et al., 2010) indicated that the anomalous signal was weak with a resolution cut-off of 4.5 Å, the program SHARP (Bricogne et al., 2003) was able to identify five selenium sites, allowing us to solve the structure. An initial model was manually built into the experimentally determined electron density using the program Coot (Emsley and Cowtan, 2004). This model was then used to solve the high-resolution native intimin β-domain structure via molecular replacement using the program PHASER (McCoy et al., 2007). Representative electron density for the refined model is shown in Figure 3C. Intimin is the first lipidic mesophase membrane protein crystal structure solved using 3-wavelength SeMet MAD data. 
Although de novo phasing of lipidic mesophase crystals has been attempted in the past using SeMet-derivatized proteins, all attempts have proven unsuccessful thus far (Caffrey, 2011). This has been attributed to low measurable anomalous signal, since mesophase crystals are generally small and weakly diffracting and often require averaging of several crystals to obtain complete datasets. In our case, a single mesophase crystal was sufficient to collect three complete anomalous datasets with a measurable anomalous signal due to SeMet incorporation.
Previously, Touze et al. showed that intimin could dimerize and attributed this property to the β-domain. Here our intimin construct crystallizes as a monomer (Figure 3A), suggesting that either the dimeric form cannot be captured under our purification and crystallization conditions or that our construct does not possess the region or regions necessary for dimerization. Indeed, dimerization was previously observed for intimin residues 189–550 (Touze et al., 2004) and here we only crystallized the β-domain (β-barrel plus linker, residues 208–449). Thus, it is possible that residues 189–207 and/or residues in the D00 domain (residues 450–550) mediate dimerization. There are also a few minor differences between our construct and the protein used by Touze et al. including an uncleaved purification tag at the N-terminus of our construct and three naturally occurring amino acid differences between residues 208–449.
The intimin β-domain contains a 12-stranded β-barrel and central linker that passes through the barrel pore (Figures 3A and 3B). Visible electron density begins at residue Gln 208 near the N-terminus of the β-barrel and is continuous to the C-terminal residue, Lys 449. The final strand of the β-barrel includes the identified BAM signature sequence (L402YSMQFRYQF411) and is immediately followed by a short α-helical region that would face the periplasm. The protein chain then passes through the central pore of the barrel in an extended conformation to form a linker that would connect the barrel to the D00 domain of the passenger. This is in contrast to the PSIPRED prediction of an α-helical linker and to the known monomeric AT structures which all contain α-helical linkers. The C-terminal residue, Lys 449, is located at the end of a β-strand that faces the extracellular space. Extracellular loops 4 and 5 interact with this β-strand to form a small 3-stranded β-sheet above the barrel. As model building progressed, electron density became evident for thirteen monoolein molecules surrounding the β-barrel, some fully ordered and some partially ordered (Supplementary Figure S2). Some are present in what appear to be individual ‘lipid channels’ on the barrel’s surface.
The barrel is ellipsoid in shape with a long axis of ~26 Å and a short axis of ~21 Å when measured between Cα atoms. These dimensions are similar to those of the AT EspP and, furthermore, the EspP and intimin barrels superpose closely (Supplementary Figures S3A and S3B). When the linker is removed from the intimin barrel pore, the equivalent pore diameter ranges from ~6 to ~10 Å when measured by the program HOLE using the Connolly option (Supplementary Figures S3A and S3C) (Smart et al., 1996). Similarly, the equivalent pore diameter of pre-cleavage EspP with its linker removed ranges from ~ 5 to ~11 Å. Since the side chains within the barrel pore can sample different conformations in vivo but are fixed during analysis, the equivalent pore diameter measured by HOLE likely underestimates the maximum size the pore can attain. Nevertheless, a pore diameter of ~10 Å would preclude the passage of protein domains with tertiary structure. Since ATs and the Int/Inv family likely share a common translocation mechanism and elements that would not fit inside a ~10 Å pore have been seen to be efficiently secreted for ATs (Jong et al., 2007; Leyton et al., 2011; Sauri et al., 2012; Skillman et al., 2005), the barrel pore seen in the intimin structure and the AT structures probably does not accurately depict the active translocation channel. It should be noted that several proteins containing disulfide bonds have been fused to intimin (Adams et al., 2005; Wentzel et al., 2001) and tested for passenger translocation to the cell surface. Similar to the AT studies, some of these proteins were translocated efficiently while others were not. However, none of the fusions that were transported efficiently by intimin were tested to see if their cysteines were more accessible on the cell surface in a dsbA minus strain compared to a wild-type strain. 
Increased accessibility of surface exposed cysteines in a dsbA minus strain would suggest that the disulfide bonds were formed in the periplasm and that the passenger fusions were at least partially folded during translocation. For a recent review discussing AT translocation models, please refer to Leyton et al. (Leyton et al., 2012).
The linker primarily contacts one side of the barrel pore, forming a large network of hydrogen bonds and salt-bridge interactions with residues from the barrel wall (Figure 4A). Due to this asymmetry, a large cavity (1901.9 Å³) (Dundas et al., 2006) is created on the opposite side of the barrel pore (Figures 4B and 4C). This cavity is almost completely enclosed by extracellular loop 5 and the periplasmic α-helix. However, small channels (~3 Å diameter) that connect to the large cavity create pathways that traverse the length of the barrel when visualized with the program CAVER (Bene et al., 2010). To see if the large cavity and these small channels could form a conductance pathway, we reconstituted purified intimin (10xHis-TEV-Int208–449) into a lipid bilayer and applied various voltages. However, no conductance increases were observed (Supplementary Figure S3D), suggesting that intimin cannot form an open channel. These results agree with those of Touze et al., who likewise observed no changes in conductance when they tested an intimin construct containing residues 1–550. It is possible that this empty space could serve as a translocation pathway for the passenger (assuming loop 5 is flexible and the periplasmic α-helix is in a different conformation during translocation). However, this possibility is unlikely (unless the pore seen in the crystal structure can expand) because there is evidence, as mentioned above, that ATs and Int/Inv can transport passengers containing elements that would not fit inside a 10 Å pore (Adams et al., 2005; Jong et al., 2007; Leyton et al., 2011; Sauri et al., 2012; Skillman et al., 2005; Wentzel et al., 2001).
For the AT EspP, previous studies have shown that the portion of its α-helical linker near the periplasmic side of the barrel pore is essential for proper folding of the barrel and passenger translocation (Barnard et al., 2007; Ieva et al., 2008). Intimin’s extended linker is also important for stability of its barrel, as indicated by the lack of heat-modifiability observed for the HA-Int1–430 construct (Figure 2D, lanes 9–10). Thus, we decided to determine the minimum length of intimin’s linker necessary for formation of a heat-modifiable β-domain. Using the HA-Int1–450 construct, we constructed deletion mutants where the C-terminus was truncated two residues at a time. The heat-modifiable mobility of each of these mutants was then assayed and typical shifts in migration were seen until residues Leu 437 and Val 438 were deleted (Figure 5A, lanes 1–14 and Supplementary Figure S4A). At this point, further truncations in the linker led to the production of non-heat-modifiable products (Figure 5A, lanes 15–18). Both Leu 437 and Val 438 form hydrophobic interactions with the barrel wall and the solvation energy changes for these residues upon formation of the interface between the linker and barrel wall are −1.98 and −1.56 kcal/mol, respectively, as estimated using the PISA web server (Krissinel and Henrick, 2007).
We also investigated the contribution to β-domain stability of three salt bridges formed between the linker and the barrel wall. Linker residues Arg 434, Asp 436, and Arg 440 form charge-charge interactions with barrel wall residues Asp 236, Lys 301, and Asp 279, respectively (Supplementary Figure S4A), and are highly conserved in the Int/Inv family (Supplementary Figure S6). We disrupted these interactions with alanine substitutions in the HA-Int1–530 construct and then tested the mutants for heat-modifiability of their β-domains and PK-accessibility of their passenger domains (Supplementary Figures S4B and S4C). We found that disrupting any one of these interactions had no effect compared to the HA-Int1–530 wild-type construct, suggesting that multiple interactions between the barrel and linker need to be perturbed to significantly alter the stability of the β-domain or translocation of the passenger.
A novel structural feature found in intimin, but not in ATs thus far, is the periplasmic α-helix that lines the bottom side of the barrel. To test whether this region is important for forming a stable β-barrel or for passenger translocation, we replaced it (residues 414–433) with a short linker consisting of two glycine residues to create the HA-Int1–530Δ414–433 construct. Unexpectedly, this construct showed wild-type heat-modifiability and PK-sensitivity, demonstrating that its β-barrel was properly folded and that the truncated passenger domain was secreted (Figures 5B and 5C).
To solve the β-domain of another Int/Inv family member, we used the intimin structure as a guide to solve Y. pseudotuberculosis invasin. First, we aligned the residues included in the structure with full-length invasin and found that residues 147–390 of invasin aligned with ~50% identity. Similar to intimin, these residues were then cloned behind a PelB signal sequence followed by a 10XHis tag and TEV protease cleavage site to create the construct, 10XHis-TEV-Inv147–390 (Supplementary Figure S1B). However, initial expression trials using this construct led to the production of invasin inclusion bodies and protein that was associated with the OM, but not extractable with detergents. One explanation for this result could be that invasin does not contain a typical E. coli BAM signature sequence. For invasin, this sequence (Q343WNLQMNYRL352) lacks the conserved Phe/Trp residue normally found at the last position. We hypothesized that changing this sequence for invasin so that it more closely matched the E. coli BAM signature sequence would result in insertion into the OM. Indeed, after mutating Leu 352 to a Phe residue, we were able to purify milligram quantities of properly folded and inserted invasin (Supplementary Figure S1D). The invasin β-domain was crystallized in monoolein lipidic mesophases and its structure was solved to 2.3 Å resolution by molecular replacement, using the high-resolution LCP structure of the intimin β-domain (Table 1). As expected, the invasin β-domain is monomeric and composed of a 12-stranded β-barrel and linker (Figure 6A). The β-domains of invasin and intimin are structurally very similar with an RMSD for their Cα carbons of 0.70 Å (Figure 6B) while the barrel domains (linker and portions of some extracellular loops removed) of invasin and EspP are less similar with an RMSD of 2.8 Å and sequence identity of 9% based on the structural alignment. One difference between the intimin and invasin structures lies in the first extracellular loop. 
For invasin, this loop is in a “flipped-out” conformation compared to intimin due to a two amino acid insertion in this region (Figure 6B and Supplementary Figure S6).
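The reasoning behind the L352F substitution described above can be illustrated with a short Python sketch. It checks only the single criterion named in the text (a Phe or Trp at the last position of the putative BAM signature sequence) and is not a general signature-sequence predictor. The function names are hypothetical; the two sequences and their residue numbering are taken from the text.

```python
# Residues treated here as a canonical final position, following the
# statement in the text that a Phe/Trp is normally found there.
CANONICAL_LAST_RESIDUES = {"F", "W"}


def has_canonical_last_residue(strand):
    """True if the final residue of the strand is Phe or Trp."""
    return strand[-1] in CANONICAL_LAST_RESIDUES


def point_mutant(strand, first_residue_number, position, new_residue):
    """Return `strand` with the residue at `position` (numbered as in the
    full-length protein) replaced by `new_residue`."""
    index = position - first_residue_number
    if not 0 <= index < len(strand):
        raise ValueError("position lies outside the strand")
    return strand[:index] + new_residue + strand[index + 1:]


intimin_strand = "LYSMQFRYQF"   # intimin residues 402-411
invasin_strand = "QWNLQMNYRL"   # invasin residues 343-352

# L352F: replace the last residue of the invasin strand with Phe.
invasin_l352f = point_mutant(invasin_strand, 343, 352, "F")

for name, seq in [("intimin", intimin_strand),
                  ("invasin", invasin_strand),
                  ("invasin L352F", invasin_l352f)]:
    print(name, seq, has_canonical_last_residue(seq))
```

Under this single criterion, the wild-type invasin strand fails the check while the intimin strand and the L352F variant pass, which matches the rationale given for the mutation.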
The β-barrel and linker regions of intimin and invasin share 50% sequence identity and 65% sequence similarity when their sequences are aligned via the NCBI BLAST server. On the other hand, their extracellular passenger domains share only 25–30% sequence identity with 37–50% sequence similarity. Additionally, a recent study by Tsai et al. on homologues of intimin and invasin confirmed that the β-domain is the most conserved part of the protein. Since we identified the exact boundaries for the β-barrel and linker portions of intimin and invasin, we searched for new Int/Inv family members using only residues 208–449 of intimin and 147–390 of invasin rather than their entire protein sequences. BLAST queries against the non-redundant database using these regions identified 769 protein sequences with an E-value cut-off of 10⁻². After removing sequence redundancy and additional filtering (see Supplementary Methods), we identified 146 representative members of the Int/Inv family, significantly more than the 69 bacterial proteins identified by Tsai et al. Analysis of these 146 sequences by the NCBI Taxonomy Browser indicated that they belonged to 83 Gram-negative bacterial species and one uncultured bacterium. Of the 83 Gram-negative bacterial species identified, 76 were γ-proteobacterial, 5 were β-proteobacterial, 1 was α-proteobacterial, and 1 was from Chlamydia (Supplementary Table S2 and Supplementary Figure S5). Multiple sequence alignment (MSA) of these 146 proteins was then performed using the programs COBALT (Papadopoulos and Agarwala, 2007) and MUSCLE (Edgar, 2004) (Supplementary Figure S6). A large number of highly conserved residues were observed in the MSA of the β-domain. Residues showing 100% identity across all 146 representative members are located in an interior section of the barrel wall that is close to the linker and in the linker itself (Supplementary Figures S6 and S7). 
Moreover, the MSA shows that β-strands are structurally conserved in the Int/Inv family and sporadic insertions and deletions are almost exclusively located in the loop regions of the β-domain (Supplementary Figure S6).
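As an illustration of how positions with 100% identity can be read off a multiple sequence alignment, the following Python sketch scans the columns of an alignment and reports those in which every sequence carries the same residue. The function name and the three-sequence toy alignment are hypothetical and serve only to show the idea; they are not derived from the 146-member alignment.

```python
def invariant_columns(alignment):
    """Return the 0-based indices of alignment columns in which every
    sequence has the same residue and no sequence has a gap ('-')."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    invariant = []
    for index, column in enumerate(zip(*alignment)):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            invariant.append(index)
    return invariant


# Hypothetical toy alignment (not real Int/Inv sequences).
toy_alignment = [
    "RDLK-G",
    "RELKAG",
    "RDMK-G",
]

conserved = invariant_columns(toy_alignment)
print("Invariant columns:", conserved)
```

In the toy alignment, column 4 contains an insertion present in only one sequence and is therefore never counted as conserved, mirroring the way insertions and deletions are treated separately from substitutions.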
In addition to identifying conserved residues in the Int/Inv family, we also searched for residues that co-evolve. Residues undergo co-evolution when a mutation in one residue triggers a mutation in another residue. These correlated changes allow a protein to maintain its conformational and functional stability during evolution. For example, a mutation that has a negative impact on the function of the protein will be accommodated by a compensatory mutation at a nearby site or sites such that the function of the protein is preserved. Complementing conservation analysis, co-evolution analysis allows for detecting functionally and structurally important sites that are not necessarily conserved. In particular, although such an analysis is based on sequence only, it often reveals relations between residues that are close in space rather than in sequence. Co-evolving pairs or groups of residues can be identified by searching for pairs of residues whose mutations are correlated more highly than expected from the common evolutionary history of the whole proteins/domains. After co-evolution analysis with a slightly modified version of the MIp method (Dunn et al., 2008), we found that the β-domain of the Int/Inv family contains numerous pairs of residues with significant co-evolution (Supplementary Table S3). A large group of co-evolving residues forms a patch of spatially close residues that cluster on one side of the β-domain (Figures 7A and 7C). The majority of these residues point towards the barrel pore from the barrel wall or are in extracellular loops near the barrel pore while two residues (Ile 444 and Lys 449) are in the linker. Additionally, this co-evolving patch is located near the cluster of residues that are 100% identical across the 146 Int/Inv representative family members (Supplementary Figure S7). Similar to the co-evolving patch, these identical residues all point towards the barrel pore while one residue (Arg 440) is part of the linker. 
Since the residues in the co-evolving patch and the identical residues are all located on the side of the β-domain that is near the linker or in the linker itself, this suggests that these regions of the β-domain are critical for Int/Inv biogenesis. However, we made deletions and point mutations (R440A, D279A) in this region and saw no effect on passenger translocation or heat-modifiable mobility (Figure 5A and Supplementary Figure S4). Together, these data suggest that several residues, whether they are part of the co-evolving patch or are highly conserved, need to be mutated to obtain a significant stability or translocation phenotype.
Co-evolving pairs of residues that were not part of the large co-evolving group (Supplementary Table S3) were also found throughout the β-domain (Figures 7B and 7D). The majority of the pairs of co-evolving residues are close to one another in the structure while some are also separated by large distances in the sequence. For example, Glu 324 and Arg 254 are separated by 70 amino acids in sequence, yet they were predicted to co-evolve in our analysis and were then found to form a salt-bridge in the intimin structure (Figures 7A and 7C).
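To make the idea of co-evolution scoring concrete, here is a minimal Python sketch of mutual information between alignment columns with the average product correction (the MIp score of Dunn et al., 2008). It implements the plain, published form of the score, not the slightly modified version used in this study, and it omits gap handling, sequence weighting, and any significance testing. The toy alignment is hypothetical: columns 0 and 1 carry a compensatory charge swap (E/K versus K/E), while columns 2 and 3 vary independently of everything else.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    return sum(
        (c / n) * log((c * n) / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


def mip_scores(alignment):
    """Return {(i, j): MIp} for every column pair i < j, where
    MIp(i, j) = MI(i, j) - mean_MI(i) * mean_MI(j) / overall_mean_MI."""
    columns = list(zip(*alignment))
    m = len(columns)
    mi = {
        (i, j): mutual_information(columns[i], columns[j])
        for i, j in combinations(range(m), 2)
    }
    overall_mean = sum(mi.values()) / len(mi)
    if overall_mean == 0:
        return {pair: 0.0 for pair in mi}
    # Mean MI of each column with all other columns.
    column_mean = [
        sum(v for pair, v in mi.items() if i in pair) / (m - 1)
        for i in range(m)
    ]
    return {
        (i, j): v - column_mean[i] * column_mean[j] / overall_mean
        for (i, j), v in mi.items()
    }


# Hypothetical toy alignment (not real Int/Inv sequences).
toy_alignment = [
    "EKAL", "EKGL", "EKAV", "EKGV",
    "KEAL", "KEGL", "KEAV", "KEGV",
]

scores = mip_scores(toy_alignment)
top_pair = max(scores, key=scores.get)
print("Highest-scoring column pair:", top_pair)
```

In this toy case only the charge-swapped pair of columns receives a positive score. With a real alignment, the correction term is what reduces the background signal shared by all columns, so that pairs such as the Glu 324/Arg 254 salt bridge can stand out despite being far apart in sequence.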
Here, we present the first β-domain structures of two archetypal members of the Int/Inv family of adhesins. Previous work identified the β-domain of intimin as including residues 189–550; however, we show that residues 450–550 form a protease-resistant domain (D00) that is part of the extracellular passenger. The D00 domain is not recognized as a BIG or C-type lectin-like domain and its function is unknown. This domain could be involved in dimerization since a proteolytic fragment of intimin containing residues 189–550 has been shown to dimerize while our shorter construct containing residues 208–449 cannot. Another possible function for the D00 domain that could be explored in future studies is its role in passenger translocation. For ATs and intimin, this region of the passenger is translocated to the cell surface first while the majority of the passenger is still in the periplasm. Furthermore, for ATs, it has been shown that proper folding of this region is then necessary for translocation of the full-length passenger to the extracellular space (Adams et al., 2005; Peterson et al., 2010). The intimin and invasin β-domains also include novel features not seen in ATs, including an extended linker, a large cavity inside the barrel, and a periplasmic α-helix. Similar to ATs, the portion of the linker closest to the periplasm is important for stability of the β-barrel. However, the periplasmic α-helix can be replaced by two glycines with little effect. One possible function for this α-helix could be to seal the large cavity inside the barrel from the periplasmic side. Finally, we used the precise boundaries of the highly conserved intimin and invasin β-domains to identify 146 non-redundant representative members of the Int/Inv family. We then used this set of sequences along with our structural data to find and map the evolutionarily constrained residues within the β-domain. 
Evolutionary analysis of these sequences pointed to many conserved residues and co-evolving groups of residues. In particular, perfect conservation of many residues in the interior of the barrel wall closest to the linker together with co-evolution of the linker and wall residues in the same portion of the barrel suggests the functional and structural importance of this region.
Details on plasmid construction are available in the supplement. To assay protein heat-modifiable mobility, E. coli AD202 cells transformed with the appropriate plasmids were grown in M9 minimal medium containing all the amino acids except Met and Cys and supplemented with 100 μg/mL ampicillin or 50 μg/mL kanamycin. At an OD600nm=0.2, synthesis of the protein was induced by adding 100 μM IPTG. After 30 min of induction, cells were harvested by centrifugation (3,000 rpm, 10 min, 4°C), resuspended in 1x PBS containing 1 mM PMSF (phenylmethanesulfonylfluoride) to an OD600nm=10, and sonicated until clear. N-octyl β-D-glucopyranoside was then added to the lysate to a final concentration of 1% (wt/vol) and samples were incubated for 15 min at room temperature. Unbroken cells and insoluble materials were then removed by centrifugation (5,000 × g, 10 min, 4°C). The supernatant was collected, and 15 μL was mixed with an equal volume of 2X SDS-PAGE loading buffer. These samples were incubated at the indicated temperatures for 10 min before separation by SDS-PAGE. Results were visualized via Western blot with an anti-HA antibody.
To assay passenger domain accessibility to proteinase K in whole cells, E. coli AD202 cells transformed with the appropriate plasmids were grown and induced as described for the β-domain stability assay. After 30 min of induction, two 1 mL aliquots were taken and the cells were harvested by centrifugation (3,000 rpm, 10 min, 4°C) and then resuspended in 1 mL of PBS. One sample was left untreated (−PK) while the other was treated (+PK) with 10 μL of 20 mg/mL proteinase K (Calbiochem). After 30 min of incubation at 4°C, 10 μL of 100 mM PMSF was added and the samples were TCA (trichloroacetic acid) precipitated and centrifuged (13,000 rpm, 4°C, 10 min). The pellets were resuspended in Trix buffer (15% glycerol, 200 mM TRIS, 15 mM EDTA, 4% SDS, 10 mM DTT), mixed with SDS-PAGE loading buffer, and resolved by SDS-PAGE. Results were visualized via Western blot using an anti-HA antibody.
Detailed expression and purification protocols are available in the Supplement. Purified native or selenomethionine-labeled 10xHis-TEV-Int208–449 and 10xHis-TEV-Inv147–390 in the size exclusion chromatography buffer were concentrated to 40 mg/mL and 28 mg/mL, respectively, and then diluted with dH2O to 20 mg/mL. Monoolein (Nu-Chek Prep) was melted at 42°C and then 60 μL of molten monoolein was mixed with 40 μL of protein at 20 mg/mL in a coupled syringe apparatus as described previously (Caffrey and Cherezov, 2009). The final concentration of protein in the lipidic mesophase was 8 mg/mL. Lipidic mesophase crystallization trials were set up using 96-well Laminex bases (Molecular Dimensions) and a Mosquito LCP robot (TTP Labtech) and then sealed with Laminex film covers (Molecular Dimensions). Each well contained 100 nL of lipidic mesophase and 750 nL of well solution. The plates were incubated at 21°C and crystals were visible within 1–2 days. The final optimized well solution for native 10xHis-TEV-Int208–449 contained 0.1 M sodium citrate pH 4.5–5.5, 0.05–0.1 M NaCl, 0.1–0.15 M MgCl2, and 30–34% PEG-400. The optimized well solution for SeMet-10xHis-TEV-Int208–449 contained 0.1 M phosphate-citrate pH ~4.2, 30–40% ethanol, and 1–3% PEG-1000. The optimized well solution for 10xHis-TEV-Inv147–390 contained 0.05 M sodium citrate pH 3.8–4.4, 0.2 M Li2SO4, and 23–35% PEG-400.
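The concentrations quoted above follow from simple dilution arithmetic, which can be checked with a short Python sketch. The function names are illustrative and not part of any protocol. The sketch assumes that volumes are additive and that the protein concentration is expressed per total volume of mesophase.

```python
def water_to_add(sample_volume_ul, start_mg_per_ml, target_mg_per_ml):
    """Volume of dH2O (uL) needed to dilute a sample to the target concentration."""
    return sample_volume_ul * (start_mg_per_ml / target_mg_per_ml - 1.0)


def mixed_concentration(protein_mg_per_ml, protein_volume_ul, lipid_volume_ul):
    """Protein concentration (mg/mL) after mixing with lipid, assuming additive volumes."""
    total_volume_ul = protein_volume_ul + lipid_volume_ul
    return protein_mg_per_ml * protein_volume_ul / total_volume_ul


# Illustrative 10 uL aliquots: intimin at 40 mg/mL and invasin at 28 mg/mL,
# both diluted to 20 mg/mL.
intimin_water_ul = water_to_add(10.0, 40.0, 20.0)
invasin_water_ul = water_to_add(10.0, 28.0, 20.0)

# 40 uL of protein at 20 mg/mL mixed with 60 uL of molten monoolein.
lcp_mg_per_ml = mixed_concentration(20.0, 40.0, 60.0)

print(f"dH2O per 10 uL intimin: {intimin_water_ul:.1f} uL")
print(f"dH2O per 10 uL invasin: {invasin_water_ul:.1f} uL")
print(f"Protein in mesophase:   {lcp_mg_per_ml:.1f} mg/mL")
```

Under these assumptions, the 40:60 protein-to-lipid ratio gives the 8 mg/mL stated in the text.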
Crystals were directly harvested from LCP using LithoLoops from Molecular Dimensions and flash-frozen in liquid nitrogen until data collection. All data were collected at the GM/CA-CAT 23ID-B and 23ID-D beamlines at the Advanced Photon Source within the Argonne National Laboratory. All data were processed using HKL2000 (Otwinowski and Minor, 1997). MAD phasing with the SeMet datasets (Peak, Inflection, and H.E. Remote) was performed using SHARP (Bricogne et al., 2003), which found 5 selenium sites and resulted in interpretable electron density maps after density modification using SOLOMON (Abrahams and Leslie, 1996). An initial model was built into the electron density maps and then used to solve the high-resolution native 10xHis-TEV-Int208–449 dataset via molecular replacement using the program PHASER (McCoy et al., 2007). The 10xHis-TEV-Inv147–390 structure was solved via molecular replacement using PHASER with the high-resolution native 10xHis-TEV-Int208–449 structure as the search model. The structures were refined using the programs PHENIX (Adams et al., 2010) and REFMAC (Murshudov et al., 1997) interspersed with rounds of model building using Coot (Emsley and Cowtan, 2004). Data collection and refinement statistics are available in Table 1. Each data set was obtained from a single crystal. All figures containing molecular graphics were prepared using the program PyMOL (Schrödinger).
BLAST (Altschul et al., 1990) searches using an E-value cut-off of 0.01 were performed against the non-redundant protein database with the intimin (gi: 118201527) and invasin (gi: 51596004) β-barrel sequences. The presence of a β-barrel in each protein from the search was confirmed using the Conserved Domain Database (Marchler-Bauer et al., 2011). Redundant sequences with >95% sequence identity were removed using the program CD-HIT (Li and Godzik, 2006). The remaining sequences were compiled into an MSA using the programs COBALT (Papadopoulos and Agarwala, 2007) and MUSCLE (Edgar, 2004), and the MSA was further improved by manual inspection. The final MSA contained 146 non-redundant representative members of the Int/Inv family. Co-evolving residues were identified using the modified MIp method (Dunn et al., 2008). Detailed methods are available in the supplement.
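To illustrate the kind of calculation that underlies co-evolution analysis, the following Python sketch computes the basic MIp statistic of Dunn et al. (2008): the mutual information (MI) between two alignment columns minus the average product correction (APC). It is not the modified procedure used in this work, which is described in the supplement. The toy alignment, the function names, and the treatment of every character (including gaps) as an ordinary symbol are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(col_a, col_b):
    """Mutual information (bits) between two aligned columns of equal length."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


def mip_scores(msa):
    """Return {(i, j): MIp} for all column pairs of an MSA given as equal-length strings."""
    length = len(msa[0])
    columns = [[seq[i] for seq in msa] for i in range(length)]
    mi = {(i, j): mutual_information(columns[i], columns[j])
          for i, j in combinations(range(length), 2)}

    # Mean MI of each column with every other column.
    column_mean = [0.0] * length
    for (i, j), value in mi.items():
        column_mean[i] += value
        column_mean[j] += value
    column_mean = [total / (length - 1) for total in column_mean]

    overall_mean = sum(mi.values()) / len(mi)
    if overall_mean == 0.0:
        return {pair: 0.0 for pair in mi}

    # MIp = MI - APC, where APC(i, j) = mean_i * mean_j / overall mean.
    return {(i, j): value - column_mean[i] * column_mean[j] / overall_mean
            for (i, j), value in mi.items()}


# Hypothetical toy alignment: columns 0 and 1 covary perfectly,
# column 2 varies independently, column 3 is invariant.
toy_msa = ["ACDA", "ACEA", "GTDA", "GTEA"]
scores = mip_scores(toy_msa)
top_pair = max(scores, key=scores.get)
print(top_pair, round(scores[top_pair], 3))
```

In this toy case the perfectly covarying pair of columns receives the highest score, while the invariant column scores zero with every partner: perfectly conserved positions carry no covariation signal, so they have to be identified by conservation rather than by MI.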
We thank the members of the user support staff at the GM/CA-CAT beamline at the Advanced Photon Source, which is supported by National Cancer Institute grant Y1-CO-1020 and National Institute of General Medical Sciences grant Y1-GM-1104, for their assistance during data collection. This work was supported by the Intramural Research Program of the National Institute of Diabetes & Digestive & Kidney Diseases (JWF, NN, TJB, and SKB) and the National Library of Medicine (DW and TMP) of the National Institutes of Health (NIH); by the NIH Common Fund in Structural Biology grant P50 GM073197 (WL and VC); by NIH training grant GM008572 (EU); and by Polish National Science Center grant 2011/01/B/ST6/02777 (DW).
The coordinates and structure factors have been deposited in the Protein Data Bank as entries: 4E1S for intimin and 4E1T for invasin.