The OBP gene family in the silkworm
We identified a total of 44 candidate OBP genes in the newly assembled silkworm genome (Table ). These include the four previously identified OBPs: PBP1 (pheromone binding protein), GOBP1 and GOBP2 (general odorant binding proteins), and ABPX (antennal binding protein) [32]. Additionally, the PBP2 and PBP3 genes identified herein have been deposited in GenBank (accession numbers AM403100). Thirty-five OBP-like genes were predicted by the GLEAN algorithm, 17 of which had their intron-exon boundaries corrected manually. Another eight OBP genes were predicted in genomic regions using FGENESH+ and exon/intron splice site prediction software. ESTs supported the predictions for 24 of the OBP genes. Because the signal peptide sequences are poorly conserved, the first exons were difficult to predict for seven of the OBPs. The OBP24 prediction was incomplete, lacking the N-terminal region; this gene may be intact in the genome or may be a pseudogene. OBP9 was identified from EST data, but the entire gene is missing from the current genome assembly, which may not cover the entire genome. The corresponding annotations in the Silkworm Genome Database (SilkDB; http://silkworm.swu.edu.cn/silkdb/) have been updated.
The silkworm OBP gene family
OBP genes are clustered in the genome
Forty-two OBP genes are distributed across 10 chromosomes. BmOBP28 is located on scaffold896 but cannot be mapped to a chromosome based on the current genome assembly (Figure , ). More than three-quarters of the silkworm OBP genes are located within clusters, as also seen in D. melanogaster, A. gambiae and A. mellifera. Twenty OBP genes are organized into five clusters on five chromosomes. Of these gene clusters, only Cluster 5 includes intervening genes. The largest cluster (Cluster 4) contains 12 OBP genes, which occur in both orientations within a 90 kb region on chromosome 18. Cluster 5 (OBP1–6), which includes the previously reported PBP1, GOBP1 and GOBP2, is located on chromosome 19, with all genes in the same orientation. Three non-OBP genes are located between OBP1 and OBP2. Cluster 1 (OBP22–27) is located on chromosome 5; OBP26 is in the reverse orientation relative to the other members of the cluster. Three smaller clusters (Cluster 2, Cluster 3 and Cluster 6), containing two or four genes, are present on chromosomes 7, 14 and 23. This genome organization, and especially the presence of several large clusters, indicates a relatively recent expansion of the silkworm OBP family.
Figure 1 Genomic locations of silkworm odorant-binding protein genes. Thirty-eight OBP genes are distributed across six chromosomes. Another four OBP genes (OBP20, OBP38, OBP39 and OBP44) are located on chromosomes 26, 16, 22 and 12, respectively. The four genes have been ...
Figure 2 Four representative OBP gene clusters present in silkworm. Four gene clusters are located on scaffold2529, scaffold2902, scaffold2943 and scaffold3052, respectively. Each gene is depicted by an arrowhead indicating the orientation of transcription in the ...
Characteristics of the silkworm OBP family
Insect OBPs are generally quite divergent, and the overall pairwise sequence identity is modest [19]. The alignment of the predicted silkworm OBP-like proteins (Figure ) shows low average pairwise sequence identity between OBP family members. The predicted proteins have low molecular masses (14–22 kDa), and signal peptides are predicted at their hydrophobic N-termini.
Figure 3 Alignment of the silkworm OBP-like family members. The signal peptides are boxed. Conserved residues are shown with a green or light green background. Highly conserved cysteine residues are marked by dark arrowheads. The rectangles under the alignment ...
A six-cysteine signature is the most typical feature of classical insect OBPs [33]. The spacing pattern of conserved cysteines in the silkworm OBP family is similar to that in Drosophila. Following the naming system proposed by Hekmat-Scafe et al., we refer to OBPs missing C2 and C5 as Minus-C OBPs, and to those carrying more than six conserved cysteine residues as Plus-C OBPs. All six cysteine residues are completely conserved among the twenty-nine typical silkworm OBPs. The spacing pattern of conserved cysteines in these typical OBPs is C1-X25–68-...-C6 (where X is any amino acid). There are ten Minus-C OBPs (OBP22–31), which are missing the second and the fifth cysteines. Five Plus-C OBP members (OBP40–44) carry additional conserved cysteines located between C1 and C2 and after C6.
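The naming rule above can be illustrated with a minimal sketch. It assumes a multiple alignment is already available and that the alignment columns of the six conserved cysteines (and of any additional conserved cysteines) have been identified by hand; the function name, its arguments and the toy sequences are hypothetical and are not taken from the analysis described here.

```python
def classify_obp(aligned_seq, cys_columns, extra_columns=()):
    """Classify one aligned OBP sequence by its conserved cysteines.

    aligned_seq   -- one row of a multiple alignment (gaps as '-')
    cys_columns   -- 0-based alignment columns of conserved C1..C6 (assumed known)
    extra_columns -- 0-based columns of additional conserved cysteines (assumed known)
    """
    if len(cys_columns) != 6:
        raise ValueError("exactly six conserved cysteine columns are expected")
    present = [aligned_seq[c] == "C" for c in cys_columns]
    extra = sum(aligned_seq[c] == "C" for c in extra_columns)
    if all(present):
        # All six conserved cysteines, plus at least one additional conserved one
        return "Plus-C" if extra > 0 else "Classic"
    if present == [True, False, True, True, False, True]:
        # Exactly C2 and C5 are missing
        return "Minus-C"
    return "Unclassified"


# Toy alignment rows: C1..C6 at columns 1, 3, 5, 7, 9, 11; an extra site at column 13
CYS = [1, 3, 5, 7, 9, 11]
EXTRA = [13]
print(classify_obp("ACACACACACACAA", CYS, EXTRA))
print(classify_obp("ACAAACACAAACAA", CYS, EXTRA))
print(classify_obp("ACACACACACACAC", CYS, EXTRA))
```

Working from alignment columns rather than raw cysteine counts keeps non-conserved cysteines from changing the class, which matches the emphasis on conserved positions in the text.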
The majority of silkworm OBP genes carry 0–4 introns that are located in conserved positions (Figure ). Most introns are inserted in phase 0 or 1. The first intron is generally in phase 0, near the cleavage site of the predicted signal peptide. Six classes of conserved splice sites have been identified in the honey bee, D. melanogaster, A. gambiae, and T. castaneum (Figure ) [23]. The splice sites in most silkworm OBPs belong to one of these six classes. However, several genes, such as those in Cluster 5, appear to have introns inserted in nonconserved positions or phases (Figure ).
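For readers unfamiliar with intron phase, the following sketch shows how it can be derived from the lengths of the coding portions of consecutive exons: the phase is the number of coding bases preceding the intron, modulo 3. The function name and the example exon lengths are illustrative assumptions, not values from the silkworm annotation.

```python
def intron_phases(coding_exon_lengths):
    """Return the phase of each intron, in 5'->3' order.

    coding_exon_lengths -- lengths (bp) of the coding part of each exon.
    Phase 0: the intron lies between two codons.
    Phase 1: the intron follows the first base of a codon.
    Phase 2: the intron follows the second base of a codon.
    """
    phases = []
    coding_bases = 0
    # There is no intron after the last exon, so it is skipped
    for length in coding_exon_lengths[:-1]:
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


# Hypothetical three-exon gene: the first intron is in phase 0, the second in phase 1
print(intron_phases([60, 100, 80]))
```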
Phylogenetic analysis of the silkworm OBP family
The phylogenetic tree of the silkworm OBPs, constructed using the neighbor-joining method (Figure ), indicates six possible protein subfamilies. Following the description by Hekmat-Scafe, we named these subfamilies PBP/GOBP, CRLBP, ABPI and ABPII, together with the two atypical subfamilies Plus-C and Minus-C. High bootstrap values support many terminal relationships and three of the subfamilies: PBP/GOBP, Minus-C and ABPI. However, support was weak for the other three subfamilies (ABPII, Plus-C and CRLBP) and for the overall tree architecture. The groupings are nevertheless supported by a number of additional features.
Figure 4 Phylogenetic comparison of the OBP protein family members in the silkworm. An unrooted distance (neighbor-joining) tree was constructed using an alignment of the silkworm OBP-like family members after removing the highly divergent signal peptide sequences ...
First, the spacing pattern of conserved cysteines is similar within each subfamily. The spacings of C1–C2 and C4–C5 in the PBP/GOBP and ABPI subfamilies are larger than in other subfamilies. By contrast, the spacing between C3 and C4 of Minus-C and ABPII is smaller than in the others. In all of the members of the Plus-C subfamily, C2 and C3 are separated by four residues, while C5 and C6 are separated by seven residues.
Second, the pairwise identity within each subfamily is higher than that between members of different subfamilies. The PBP/GOBP subfamily has the highest average pairwise sequence identity (36%), with a range from 22% to 55%. The average identities for the ABPII and Minus-C subfamilies are 35% and 29%, respectively. The other three subfamilies have lower internal sequence identities. Genes within the ABPI subfamily have lower average identity values than those in the ABPII subfamily.
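As an illustration of how within-subfamily identity values of this kind can be computed, the sketch below averages percent identity over all pairs of aligned sequences. It assumes identity is counted over columns where both sequences have a residue; other denominators are in common use and the definition applied in this study is not stated here. The function names and toy sequences are hypothetical.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def mean_pairwise_identity(aligned):
    """Average percent identity over all pairs in a dict of name -> aligned sequence."""
    if len(aligned) < 2:
        raise ValueError("at least two sequences are required")
    values = [pairwise_identity(aligned[p], aligned[q])
              for p, q in combinations(aligned, 2)]
    return sum(values) / len(values)


# Toy subfamily of three aligned sequences (not real OBP data)
toy = {"a": "ACDE", "b": "ACDF", "c": "ACGF"}
print(round(mean_pairwise_identity(toy), 1))
```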
Third, the subfamilies are supported by the chromosomal clustering of OBP genes. The PBP/GOBP subfamily comprises the six members of Cluster 5. The Minus-C subfamily comprises nine members, of which seven occur in Cluster 1 and Cluster 2. OBP28, which is located on a small scaffold, shares high identity (78%) with OBP22. The Plus-C subfamily comprises five members, four of which are located on chromosome 23. The CRLBP subfamily comprises eight members; six of these are on chromosome 14, and OBP32–35 form Cluster 3. Gene Cluster 4 is divided into two subfamilies: the ABPI subfamily, comprising seven members, and the ABPII subfamily, containing five genes. Two additional ABP family members are located on another scaffold. OBP20 has a single exon. Its 3' UTR contains a transcription termination signal (AAACAAAA), and two direct repeat sequences (TAATGAAATAAAATTA) are present in the 5' UTR and the 3' UTR. These features suggest that OBP20 moved to a new genomic position by retroposition.
Finally, all members within a subfamily share certain common intron insertion sites, which differ among subfamilies. The PBP/GOBP subfamily contains two intron insertion sites that were not found in D. melanogaster, A. gambiae, A. mellifera or T. castaneum. The ABPI and ABPII subfamilies have lost the conserved splice sites at S6 and S2, respectively. In the Minus-C subfamily, Cluster 1 and OBP28 have only one intron at the N-terminus, whereas Cluster 2 and OBP29 have additional introns at non-conserved sites. In the Plus-C subfamily, three common intron insertion sites are located at S1, S3 and S6. In the CRLBP subfamily, OBP32, OBP35 and OBP36 each have a single exon, whereas OBP37 and OBP38 have conserved splice sites.
It is notable that the relatedness of the ABPII and Minus-C subfamilies is supported by a higher bootstrap value than that of the ABPII and ABPI subfamilies. This feature has also been found in bee and Drosophila. It suggests that the Minus-C subfamily may be derived from an ancestor with six conserved cysteines. To better understand the high degree of OBP sequence divergence, we analyzed the evolutionary constraints acting on this gene family. The average pairwise ratio of nonsynonymous to synonymous substitutions (dN/dS) for sequences in each subfamily was < 1. This indicates that silkworm OBP genes are under strong negative selection. However, pairwise dN/dS values for several members of the Minus-C subfamily are > 1 (see Additional file 1). This suggests that these members of the Minus-C subfamily are undergoing positive selection.
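The conventional reading of the ratio of nonsynonymous to synonymous substitution rates can be summarized as follows. The symbol \omega is introduced here only for illustration and does not appear in the original analysis; the block assumes the amsmath package.

```latex
\omega = \frac{d_N}{d_S},
\qquad
\begin{cases}
\omega < 1 & \text{negative (purifying) selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive selection}
\end{cases}
```

An average value below 1 for a subfamily is therefore compatible with individual gene pairs exceeding 1, which is the situation reported for several Minus-C members.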
We also built a phylogenetic tree based on the alignment of OBP sequences from five species, B. mori, D. melanogaster, A. gambiae, T. castaneum and A. mellifera, representing four orders (Figure ). The six silkworm subfamilies defined above also form clades in this tree. Although bootstrap support for the clade is low, the Minus-C and ABP subfamilies group together with OBPs from A. mellifera and T. castaneum. Only a few orthologous relationships could be found among these five species (e.g., BmorOBP38 has two fly orthologues). The other subfamilies show no obvious relationship across species. This suggests that significant lineage-specific expansion and divergence have occurred in these insects.
Figure 5 An unrooted neighbor-joining tree of annotated OBP proteins from B. mori, D. melanogaster, A. gambiae, T. castaneum and A. mellifera. Bootstrap support is based on 1000 resampled data sets. The tree is condensed to show only branches with 15% bootstrap ...
Expression patterns of the silkworm OBP genes
A number of EST libraries have been constructed for the silkworm, and more than 238,000 ESTs are available in GenBank. ESTs corresponding to 24 BmorOBP genes were identified using tBLASTn with BmorOBP protein sequences as queries. The coding regions of 16 genes were covered completely by ESTs (Table ). ESTs for 17 of the OBP genes were recovered broadly from chemosensory libraries, including larval maxillary galea, epidermis, brain and adult antennae. Interestingly, ESTs for seven of the OBP genes were found only in maxillary galea. OBP23 is present in silk gland and wing disk as well as in maxillary galea. It is noteworthy that OBP23 and OBP11 are highly expressed in maxillary galea, with 37 and 100 ESTs, respectively. Three OBP genes (OBP1–3), together with OBP14, were observed only in the antennal library. OBP27 is present in multiple tissues (silk gland, brain, Malpighian tubule, fat body, midgut, wing disk and testis) as well as in an antennal library. Three genes (OBP27, OBP30 and OBP31) were found in brain, with OBP31 also represented by two ESTs in the compound eye. In larval epidermis, only one EST was found, for OBP39. Meanwhile, ten BmorOBPs were recovered from non-chemosensory tissue libraries, such as silk gland, Malpighian tubule, fat body, midgut, testis, ovary, compound eye, hemocyte and wing disk. Most of these ESTs came from fat body, ovary, testis, silk gland and wing disk.
We designed and constructed a genome-wide microarray using 70-mer oligonucleotides based on the draft silkworm genome sequence database. Four of the probes each contain a 20-base stretch that mismatches their target genes. Probe sw11831 targets both OBP7 and OBP8, and probe sw20121 targets OBP25–27. The expression patterns of OBP genes were surveyed in multiple silkworm tissues on day 3 of the fifth instar, and also in whole insects at 15 different time points from day 3 of the fifth instar larva through to the adult moth. A list of the thirty-one silkworm OBP probes used in this microarray is provided in Table . The expression data are visualized in Figure .
Figure 6 Expression patterns of silkworm OBPs in multiple tissues of larvae on day 3 of the fifth instar. The levels of expression are illustrated by a five-grade color scale representing relative expression levels of < 500, 500–1000, 1000–2000, ...
OBP gene expression in multiple tissues on day 3 of the fifth instar is consistent with EST representation in the database. We found fifteen genes with significant levels of expression (Figure ). The expression profiles of OBP genes differ markedly, even among members of the same gene cluster. The majority of OBPs are expressed in testis, ovary, brain, epidermis and fat body. Three OBPs (OBP23, OBP25 and OBP31) gave stronger signals in brain and epidermis than in other tissues. Five genes (OBP1, OBP2, OBP13, OBP19 and OBP42) are restricted to brain and have low expression levels. OBP40 and OBP41 share a similar expression pattern in six tissues, although OBP41 is expressed at higher levels overall than OBP40. OBP43 is expressed at low levels in testis, ovary, epidermis and fat body, a pattern that differs from that of OBP42. Sex-biased expression was assessed on the basis of two-fold differences in expression level between the sexes. The most interesting case was OBP29, which is expressed at the highest level in testis and at low levels in ovary and in the fat body of males. However, the expression of the majority of OBPs does not show an obvious sex bias on day 3 of the fifth instar.
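The two-fold criterion for sex-biased expression can be written as a simple ratio test. This is a minimal sketch: the function name, the inclusive threshold and the example signal values are assumptions made for illustration and do not reproduce the array analysis pipeline.

```python
def sex_bias(male_signal, female_signal, fold=2.0):
    """Label a gene by comparing male and female signals against a fold threshold.

    Whether the threshold is inclusive is an assumption of this sketch.
    """
    if male_signal <= 0 or female_signal <= 0:
        raise ValueError("signals must be positive")
    ratio = male_signal / female_signal
    if ratio >= fold:
        return "male-biased"
    if ratio <= 1.0 / fold:
        return "female-biased"
    return "unbiased"


# Hypothetical signal intensities, not measured values
print(sex_bias(2400, 1000))
print(sex_bias(500, 1500))
print(sex_bias(1000, 1500))
```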
Our whole-organism array data failed to detect significant expression of 11 of the OBP genes at any time point in either sex. It is possible that some of these genes are expressed during these life stages but at levels below the detection limit. For example, OBP13, OBP19 and OBP27, although not detected at the whole-organism level in fifth instars, are detected in specific tissues at that life stage. Three classes of developmental expression profile for 18 members of the silkworm OBP-like family are shown in Figure . Expression of the first class of OBPs was restricted to adults and to pupae near eclosion. A very faint signal was observed in male moths for OBP14. Three members of the same gene cluster (OBP1, OBP2 and OBP3) and OBP13 were detected at moderate levels in adults. OBP3 was observed only in males. OBP42 was more abundant in males than in females nine days after spinning. OBP30, OBP32 and OBP36 gave weak signals prior to emergence and reached their highest levels in male moths.
Figure 7 The developmental expression patterns of silkworm OBPs. The columns represent fifteen different sample time points: day three of the fifth instar and fourteen different times after spinning: 0 hours, 12 hours, 24 hours, 36 hours, 48 hours, 60 hours, 3 ...
The second class of OBPs, with three members (OBP31, OBP40 and OBP41), is strongly expressed at all stages. Expression of OBP31 rises gradually and reaches its highest level in late pupae. OBP40 and OBP41 show three obvious expression peaks: in the larva, 60 h after spinning, and in the adult. The expression of OBP29 in males, which is high at all life stages and highest in adults, also follows this pattern.
Members of the third class of OBPs were expressed in several distinct phases. OBP23 is a good example, with expression peaks in the larva and four days after spinning, and a weak peak in the adult moth. The expression peaks of OBP25 are at 0 h and 60 h after spinning and in the adult. Furthermore, its transcripts were more abundant in males than in females at the late pupal stages. The expression of OBP43 was highest at 12 h and gradually weakened until 6 d after spinning; only a weak signal was detected in the adult female. OBP8 shows a weak signal at several time points. OBP29 expression in females also followed this general pattern, with high values 6 d and 9 d after spinning. This contrasts with its expression pattern in males, as described above.
In addition, we determined the expression levels of OBP3–6 in moth antennae by QRT-PCR (Figure ). Consistent with previous studies [32], BmorOBP3 was predominantly expressed in the antennae of male moths. BmorOBP4 was expressed equally in the antennae of both sexes. In contrast, BmorOBP5 showed substantially higher expression in female antennae. BmorOBP6 was marginally more highly expressed in female antennae.
Expression levels of four candidate PBP genes in the antennae of moths. Expression levels relative to the control gene BmActin3 were quantified by QRT-PCR. Error bars on each column represent the SD of three independent experiments.