Dm and SG cDNA Library Construction
Dm were reared in an insectary room in the Tropical Medicine Department, University of Brasilia. They were kept at 27°C ± 1.0°C with a relative humidity of 70% to 75% and a 16 h:8 h light:dark photoperiod. SGs were dissected from Vth instar nymphs and adults 7 days after a blood meal and transferred to RNA-Later (Ambion, Austin, TX, USA) solution in 1.5-ml polypropylene vials. SGs were kept at −20°C for isolating polyA+ RNA.
Dm SG mRNA was isolated from 15 SG pairs from Vth instar nymphs and adults using the Micro-FastTrack mRNA isolation kit (Invitrogen, San Diego, CA, USA). The PCR-based cDNA library was made following the instructions for the SMART (switching mechanism at 5′ end of RNA transcript) cDNA library construction kit (Clontech, Palo Alto, CA, USA). This kit provides a method for producing high-quality, full-length cDNA libraries from nanogram quantities of polyA+ or total RNA. It uses a specially designed oligonucleotide, SMART IV™, in the first-strand synthesis to generate high yields of full-length, double-stranded cDNA. Dm SG polyA+ RNA was reverse transcribed to cDNA using Moloney murine leukemia virus reverse transcriptase (Clontech), the SMART IV oligonucleotide, and the CDS III/3′ primer (Clontech). Trehalose was added to the reaction, which was carried out at 60°C for 1 h and then at 42°C for 40 min. Second-strand synthesis was performed according to a long-distance PCR-based protocol using the 5′ PCR primer and the CDS III/3′ primer as sense and antisense primers, respectively. These two primers also create SfiI A and B restriction enzyme sites at the ends of the cDNA. Advantage™ Taq polymerase mix (Clontech) was used to carry out the long-distance PCR on a GeneAmp® PCR system 9700 (Perkin Elmer Corp., Foster City, CA, USA). The PCR conditions were: 95°C for 1 min; 14 cycles of 95°C for 10 s, 68°C for 6 min. An aliquot of the cDNA was analyzed on a 1.1% agarose/EtBr (0.1 μg/ml) gel to check the quality and size range of the synthesized cDNA. Double-stranded cDNA was immediately treated with proteinase K (0.8 μg/ml) at 45°C for 20 min and washed three times with water using Amicon filters with a 100-kDa cutoff (Millipore, Bedford, MA, USA). The clean double-stranded cDNA was then digested with SfiI restriction enzyme at 50°C for 2 h, followed by size fractionation on a ChromaSpin-400 drip column (Clontech).
The profiles of the fractions were checked on a 1.1% agarose/EtBr (0.1 μg/ml) gel, and fractions containing cDNA were pooled into three groups according to size: large, medium, or small sequences. Each group was concentrated and washed three times with water using an Amicon filter with a 100-kDa cutoff. Concentrated cDNA was then ligated into a λ TriplEx2 vector (Clontech), and the resulting ligation mixture was packaged using GigaPack® Gold III Plus packaging extract (Stratagene, La Jolla, CA, USA) according to the manufacturer’s instructions. The packaged library was plated by infecting log-phase XL1-Blue Escherichia coli cells (Clontech). The percentage of recombinant clones was determined by blue-white selection screening on LB/MgSO4 plates containing X-gal/IPTG.
Sequencing of the Dm cDNA Library
The Dm SG cDNA library was plated on LB/MgSO4 plates containing X-gal/IPTG to an average of 250 plaques per 150-mm Petri plate. Recombinant (white) plaques were randomly picked and transferred to 96-well MicroTest™ U-bottom plates (BD BioScience, San José, CA, USA) containing 75 μl of H2O per well. The phage suspension was either used immediately for PCR or stored at 4°C until use.
To amplify the cDNA by PCR, 4 μl of the phage sample was used as a template. The primers were sequences from the λ TriplEx2 vector, named PT2F1 (5′-AAGTACTCTAGCAATTGTGAGC-3′) and PT2R1 (5′-CTCTTCGCTATTACGCCAGCTG-3′), positioned at the 5′ and 3′ ends of the cDNA insert, respectively. The reaction was carried out in MicroAmp 96-well PCR plates (Applied Biosystems, Inc., Fullerton, CA, USA) using FastStart PCR Master (Roche Molecular Biochemicals, Indianapolis, IN, USA) on a GeneAmp® PCR system 9700 (Perkin Elmer Corp.). The PCR conditions were: 1 hold of 75°C for 3 min, 1 hold of 94°C for 4 min, and 33 cycles of 94°C for 1 min, 49°C for 1 min, and 72°C for 1 min 20 s. The amplified products were analyzed on a 1.2% agarose/EtBr gel. cDNA library clones (2880 clones) were PCR amplified, and those showing a single band were selected for sequencing. The PCR products were used as a template for a cycle-sequencing reaction using a DTCS labeling kit (Beckman Coulter, Fullerton, CA, USA). The primer used for sequencing (PT2F3) is upstream of the inserted cDNA and downstream of the PT2F1 primer. The sequencing reaction was performed on a Perkin Elmer 9700 thermocycler. Conditions were 1 hold of 75°C for 2 min, 1 hold of 94°C for 4 min, and 30 cycles of 96°C for 20 s, 50°C for 20 s, and 60°C for 4 min. After cycle sequencing, the samples were cleaned using the multiscreen 96-well plate cleaning system (Millipore). The 96-well multiscreen plate was prepared by adding a fixed amount (manufacturer’s specification) of Sephadex-50 (Amersham Pharmacia Biotech, Piscataway, NJ, USA) and 300 μl of deionized water. After partially drying the Sephadex in the multiscreen plate, the entire cycle-sequencing reaction was added to the center of each well and centrifuged at 2,500 rpm for 5 min, and the clean sample was collected on a sequencing microtiter plate (Beckman Coulter).
The plate was then dried in a Speed-Vac SC110 with a microtiter plate holder (Savant Instruments Inc., Holbrook, NY, USA). Dried samples were immediately resuspended in 25 μl of formamide, and one drop of mineral oil was added to the top of each sample. Samples were either sequenced immediately on a CEQ 2000 DNA sequencing instrument (Beckman Coulter) or stored at −30°C. A total of 2728 cDNA library clones were sequenced.
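As a quick arithmetic check on the clone counts reported above, the following Python sketch computes how many 96-well plates the 2880 amplified clones fill and what fraction of them was sequenced. It assumes that all 2728 sequenced clones came from the 2880 amplified ones; the function names are illustrative only and are not part of the original workflow.

```python
WELLS_PER_PLATE = 96        # 96-well plate format used for picking and PCR
CLONES_AMPLIFIED = 2880     # clones PCR amplified (from the text)
CLONES_SEQUENCED = 2728     # clones sequenced (from the text)


def plates_needed(n_clones, wells=WELLS_PER_PLATE):
    """Number of plates required to hold n_clones (ceiling division)."""
    return -(-n_clones // wells)


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


plates = plates_needed(CLONES_AMPLIFIED)
sequenced_pct = percent(CLONES_SEQUENCED, CLONES_AMPLIFIED)
print(f"{plates} plates; {sequenced_pct:.1f}% of amplified clones sequenced")
```

Under that assumption, the amplified clones correspond to exactly 30 full plates, and roughly 95% of them went on to sequencing.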
Proteomic Characterization Using One-Dimensional Gel Electrophoresis (1DE) and Tandem Mass Spectrometry (MS)
Both the soluble and insoluble (pellet following centrifugation at 16,000 × g) protein fractions of Dm SG homogenates corresponding to three pairs of SGs (approximately 500 μg of protein per gland) were taken up in reducing Laemmli gel-loading buffer. The samples were boiled for 10 min, and approximately 100 μg of protein was resolved on a NuPAGE 4–12% Bis-Tris precast gel. Separated proteins were visualized by staining with SimplyBlue (Invitrogen). Gels were sliced into 20 (soluble) or 24 (pellet) individual sections (Figure 1) that were destained and digested overnight with trypsin at 37°C. Peptides were extracted and desalted using ZipTips (Millipore) and resuspended in 0.1% TFA prior to MS analysis.
Figure 1. One-dimensional polyacrylamide gel electrophoresis separation of the soluble (A) and insoluble (B) fractions of Dipetalogaster maxima salivary proteins. The grids indicate how the duplicated gel bands containing the salivary proteins (SN1, SN2, PT1, and …
Nanoflow reverse-phase liquid chromatography tandem MS was performed using an Agilent 1100 nanoflow LC system (Agilent Technologies, Palo Alto, CA, USA) coupled online with a linear ion-trap (LIT) mass spectrometer (LTQ, ThermoElectron, San José, CA, USA). NanoRPLC columns were slurry packed in-house with 5 μm, 300-Å pore size C-18 phase (Jupiter, Phenomenex, CA) in a 75-μm i.d. × 10-cm fused silica capillary (Polymicro Technologies, Phoenix, AZ) with a flame-pulled tip. After sample injection, the column was washed for 30 min with 98% mobile phase A (0.1% formic acid in water) at 0.5 μl/min, and peptides were eluted using a linear gradient of 2% mobile phase B (0.1% formic acid in acetonitrile [ACN]) to 42% mobile phase B in 40 min at 0.25 μl/min, then to 98% B for an additional 10 min. The LIT-MS was operated in a data-dependent MS/MS mode in which each full MS scan was followed by seven MS/MS scans where the seven most abundant molecular ions were dynamically selected for collision-induced dissociation (CID) using a normalized collision energy of 35%. Dynamic exclusion was applied to minimize repeated selection of peptides previously selected for CID.
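The elution program described above can be written as a piecewise-linear function of time. The Python sketch below is illustrative only: the function name is hypothetical, time is counted from the start of the gradient (after the 30-min wash), and the final step is assumed to be a linear ramp from 42% to 98% B over the additional 10 min, which the text does not state explicitly.

```python
def percent_b(t_min):
    """Percent mobile phase B at t_min minutes after the gradient starts.

    0-40 min : linear ramp from 2% to 42% B (from the text)
    40-50 min: ASSUMED linear ramp from 42% to 98% B
    > 50 min : held at 98% B (assumption)
    """
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= 40:
        return 2.0 + (42.0 - 2.0) * t_min / 40.0
    if t_min <= 50:
        return 42.0 + (98.0 - 42.0) * (t_min - 40.0) / 10.0
    return 98.0


# Volume delivered during the 40-min gradient at 0.25 ul/min
gradient_volume_ul = 40 * 0.25

for t in (0, 20, 40, 50):
    print(f"t = {t:2d} min: {percent_b(t):.0f}% B")
```

The main gradient therefore rises by 1% B per minute and delivers 10 μl of mobile phase at the stated flow rate.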
Tandem mass spectra were searched using SEQUEST on a 20-node Beowulf cluster against a Dm proteome database with methionine oxidation included as a dynamic modification. Only tryptic peptides with up to two missed cleavage sites that met specific SEQUEST scoring criteria [delta correlation (ΔCn) ≥ 0.08 and charge state-dependent cross correlation (Xcorr) ≥ 1.9 for [M+H]¹⁺, ≥ 2.2 for [M+2H]²⁺, and ≥ 3.5 for [M+3H]³⁺] were considered legitimate identifications. The peptides identified by MS were converted to Prosite block format 22 by a program written in Visual Basic. This database was used to search for matches in the Fasta-formatted database of salivary proteins, using the Seedtop program, which is part of the BLAST package. The result of the Seedtop search was piped into the hyperlinked spreadsheet to produce a hyperlinked text file with details of the match. Note that the ID lines indicate, for example, sDGM12_80, which means that one match was found for fragment number 80 from gel band 12. Because the same tryptic fragment can be found in many gel bands, another program was written to count the number of fragments for each gel band, displaying a summarized result in an Excel table, for example in cell AX4 of Supplemental Table S2. A summary of the form sDGM12 → 5|sDGM13 → 4 indicates that five fragments were found in band 12 and four in band 13. Furthermore, this summary included a protein identification only when two or more peptide matches to the protein were obtained from the same gel slice. The summary program also produces additional spreadsheet cells with the largest number of peptides found in a single gel band and the percent amino acid (aa) sequence coverage of the sum of the peptide matches, thus facilitating data analysis.
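The scoring filter and the per-band summary described above can be sketched in Python as follows. This is not the Visual Basic program used in the study; the function names, the input layout (pairs of protein name and match ID), and the choices to reject charge states other than 1+ to 3+ and to omit bands with fewer than two fragments from the summary string are assumptions made for illustration.

```python
import re
from collections import defaultdict

MIN_DELTA_CN = 0.08
MIN_XCORR = {1: 1.9, 2: 2.2, 3: 3.5}   # charge state -> minimum Xcorr

# IDs such as "sDGM12_80": gel band "sDGM12", fragment number 80
ID_PATTERN = re.compile(r"^(?P<band>[A-Za-z]+\d+)_(?P<fragment>\d+)$")


def passes_filter(charge, xcorr, delta_cn):
    """Apply the delta-Cn and charge-dependent Xcorr thresholds."""
    threshold = MIN_XCORR.get(charge)
    if threshold is None:
        return False  # assumption: unlisted charge states are rejected
    return delta_cn >= MIN_DELTA_CN and xcorr >= threshold


def summarize(matches, min_peptides=2):
    """Count distinct fragments per gel band for each protein.

    matches: iterable of (protein_name, match_id) pairs.
    Returns {protein: "band → n|band → n"}; bands with fewer than
    min_peptides fragments are dropped (assumed reading of the rule).
    """
    fragments = defaultdict(lambda: defaultdict(set))
    for protein, match_id in matches:
        m = ID_PATTERN.match(match_id)
        if m is None:
            raise ValueError(f"unexpected match ID: {match_id!r}")
        fragments[protein][m.group("band")].add(int(m.group("fragment")))

    summary = {}
    for protein, bands in fragments.items():
        counts = {b: len(f) for b, f in bands.items() if len(f) >= min_peptides}
        if counts:
            ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
            summary[protein] = "|".join(f"{b} → {n}" for b, n in ordered)
    return summary


# Hypothetical example: 5 fragments in band 12, 4 in band 13, 1 in band 14
example = ([("protA", f"sDGM12_{i}") for i in range(80, 85)]
           + [("protA", f"sDGM13_{i}") for i in range(10, 14)]
           + [("protA", "sDGM14_7")])
print(summarize(example))
```

With the hypothetical input shown, the protein is reported for bands 12 and 13 only, because band 14 contributes a single fragment.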
Proteomic Characterization Using Two-Dimensional Gel Electrophoresis (2DE) and Peptide Mass Fingerprinting (PMF)
SGs obtained from several adult bugs dissected 7 to 9 days after the blood meal were punctured, and intraluminal fluids were harvested by centrifugation (2,000 × g, 5 min, 5°C). Salivary proteins were quantified using the Plus One 2D Quant Kit (GE Healthcare, Uppsala, Sweden) according to the manufacturer’s instructions. Saliva proteins (80 μg) resuspended in 350 μl of 2DE sample buffer (7 M urea, 2 M thiourea, 2.5% v/v Triton X-100, 85 mM DTT, 0.5% v/v Pharmalyte 3–10, 10% isopropanol) were applied to 18-cm pH 3–10 IPG gel strips and subjected to isoelectric focusing using an Ettan IPGphor3 Unit (GE-Healthcare, Piscataway, NJ, USA) with the following program: no current for 6 h, 30 V for 6 h, 500 V for 1 h, 1000 V for 1 h, and 8000 V for 4 h, with a maximum current of 50 μA per strip, as optimized 19. Before SDS-PAGE, IPG strips were reduced in equilibration buffer (50 mM Tris, pH 8.8, 6 M urea, 30% glycerol, 4% SDS) supplemented with 125 mM DTT for 40 min at room temperature and then alkylated in equilibration buffer supplemented with 300 mM acrylamide for a further 20 min. Subsequently, the proteins were separated by 12% SDS-PAGE on a Protean II system (BioRad, Richmond, CA, USA) and silver-stained as adapted from Blum et al.23 After washing with ultrapure H2O, the gels were scanned with the Image Scanner (PowerLook 1120; Amersham Biosciences) at 300-dpi resolution and stored in 1% acetic acid. Digitized images were analyzed with Image Master 2D Platinum 5.0 software (GE Healthcare) to count protein spots.
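For orientation, the focusing program given above can be converted into total run time and nominal volt-hours. The Python sketch below assumes that each step holds its stated voltage for the full duration (step-and-hold) and treats the initial "no current" period as 0 V; in practice the 50 μA per strip limit or voltage ramping could lower the delivered volt-hours, so the figure is an upper estimate under these assumptions.

```python
# (voltage in V, duration in h); the first step is the "no current" period,
# represented here as 0 V (assumption)
IEF_STEPS = [(0, 6), (30, 6), (500, 1), (1000, 1), (8000, 4)]


def total_hours(steps):
    """Total duration of the focusing program in hours."""
    return sum(hours for _, hours in steps)


def total_volt_hours(steps):
    """Nominal volt-hours, assuming each step holds its set voltage."""
    return sum(volts * hours for volts, hours in steps)


run_time_h = total_hours(IEF_STEPS)
volt_hours = total_volt_hours(IEF_STEPS)
print(f"{run_time_h} h total, {volt_hours} Vh nominal")
```

Under these assumptions, nearly all of the nominal volt-hours are accumulated during the final 8000 V step.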
Protein identification was achieved by tryptic peptide fingerprinting using matrix-assisted laser desorption/ionization (MALDI)-time-of-flight (TOF)/TOF MS. Protein spots were cut from 2DE gels and processed as adapted from Garaguso and Borlak 24. Briefly, spots were destained for 10 min in a fresh solution of 15 mM potassium ferrocyanide and 50 mM sodium thiosulphate. After destaining, gel pieces were washed with three cycles of 200 μl ultrapure H2O followed by 50% ACN, then with another two cycles of 50 mM NH4HCO3 for 5 min followed by ACN for 5 min. During the final wash with ACN, gel fragments were macerated with a pestle and then vacuum dried in a Speed Vac for 20 min. Dried gel pieces were rehydrated with 5–10 μl of 25 mM NH4HCO3, 2.5 mM CaCl2, 12.5 ng/μl trypsin (Promega, Madison, WI, USA) and incubated on ice for 45 min. Immediately after incubation, the remaining solution was removed and replaced with 5–10 μl (depending on the size of the gel piece) of the same digestion buffer without enzyme, and the gel pieces were incubated overnight at 37°C. The following day, 1 μl of 1% TFA was added to the solution covering the gel pieces, which contains the digestion products, and 1 μl of this acidified sample was loaded onto a 600-nm AnchorChip™ target plate (Bruker Daltonics, Karlsruhe, Germany) and allowed to dry completely. Afterward, 0.5 μl of 5-μg/μl DHB (2,5-dihydroxybenzoic acid) matrix in 30% ACN and 0.1% TFA was applied and allowed to dry completely, and the sample was subjected to PMF analysis. PMF analysis was performed using a MALDI-TOF/TOF mass spectrometer (Autoflex II; Bruker Daltonics). Mass spectra were processed using Flex Analysis 2.2 and Biotools 2.2 software (Bruker Daltonics).
Protein identification was performed using MASCOT software against the database of proteins predicted from the Dm cDNA library on our server. Spectra containing trypsin autodigestion peptides were internally calibrated. The following parameters were used for database searches: monoisotopic mass accuracy up to 100 ppm for internally calibrated spectra and up to 200 ppm for uncalibrated spectra; up to one missed cleavage site; propionamide of cysteine as fixed modification; and oxidation of methionine and protein N-acetylation as variable modifications.
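To make the mass-accuracy windows concrete, the following Python sketch computes the error of an observed peptide mass in parts per million and checks it against the 100-ppm (internally calibrated) or 200-ppm (uncalibrated) tolerance. It only illustrates the arithmetic and does not reproduce MASCOT's internal scoring; the function names and the example masses are hypothetical.

```python
PPM_TOLERANCE_CALIBRATED = 100.0     # internally calibrated spectra
PPM_TOLERANCE_UNCALIBRATED = 200.0   # uncalibrated spectra


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed, theoretical, internally_calibrated):
    """True if the absolute ppm error is inside the applicable window."""
    limit = (PPM_TOLERANCE_CALIBRATED if internally_calibrated
             else PPM_TOLERANCE_UNCALIBRATED)
    return abs(ppm_error(observed, theoretical)) <= limit


# Hypothetical peptide: theoretical mass 1500.0, observed 1500.2
error = ppm_error(1500.2, 1500.0)
print(f"error = {error:.1f} ppm")
print("accepted if calibrated:  ", within_tolerance(1500.2, 1500.0, True))
print("accepted if uncalibrated:", within_tolerance(1500.2, 1500.0, False))
```

In this example a 0.2-unit deviation at mass 1500 corresponds to about 133 ppm, which falls inside the uncalibrated window but outside the calibrated one.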
Bioinformatic Tools and Procedures
Expressed sequence tags (ESTs) were trimmed of primer and vector sequences, clustered, and compared with other databases as previously described.25 The BLAST tool,26 and TreeView software29 were used to compare, assemble, and align sequences and to visualize alignments. For functional annotation of the transcripts, we used blastx30 to compare the nucleotide sequences with the non-redundant (NR) protein database of the National Center for Biotechnology Information (NCBI) and with the Gene Ontology (GO) database.31 Conserved protein domains were searched for in the Pfam,33 and conserved domains (CDD) databases.36 We also compared the transcripts with other subsets of mitochondrial and rRNA nucleotide sequences.
Segments of the three-frame translations of the ESTs (because the libraries were unidirectional, we did not use six-frame translations) starting with a methionine in the first 100 predicted aa, or the predicted protein translation in the case of complete coding sequences, were submitted to the SignalP server37 to help identify translation products that could be secreted. O-glycosylation sites on the proteins were predicted with the program NetOGlyc (http://www.cbs.dtu.dk/services/NetOGlyc/).
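The selection of translation segments for signal-peptide prediction can be illustrated with a short Python sketch. It is not the program used in the study: the function names are hypothetical, the standard genetic code is assumed, and "starting with a methionine in the first 100 predicted aa" is interpreted here as taking, in each forward frame, the segment from the first methionine found within the first 100 translated residues up to the next stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq, frame):
    """Translate seq in forward frame 0, 1, or 2; unknown codons become X."""
    seq = seq.upper().replace("U", "T")
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(frame, len(seq) - 2, 3))


def candidate_segments(est, window=100):
    """Return (frame, segment) pairs for the three forward frames.

    A segment starts at the first M within the first `window` residues
    of the frame's translation and ends before the next stop codon.
    Frames without such an M are skipped.
    """
    segments = []
    for frame in range(3):
        protein = translate(est, frame)
        start = protein.find("M", 0, window)
        if start == -1:
            continue
        segments.append((frame, protein[start:].split("*", 1)[0]))
    return segments


# Hypothetical toy EST with a start codon in the third frame
toy_est = "CCATGGCTTGCTAAGG"
print(candidate_segments(toy_est))
```

Only the forward frames are translated, mirroring the statement that six-frame translation was unnecessary for a unidirectional library.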
Functional annotation of the transcripts was based on all the comparisons above. Following inspection of all results, transcripts were classified as either secretory (S), housekeeping (H), or of unknown (U) function, with further subdivisions based on function and/or protein families. Sequence alignments were done with the ClustalX software package.39 Phylogenetic analysis and statistical neighbor-joining bootstrap tests (10,000 iterations) of the phylogenies were performed with the Mega package.40 Hyperlinked Excel spreadsheets of the assembled ESTs and of the salivary protein database are supplied as Supplemental spreadsheets S1 at the journal site.