Insect cuticles are formed during molting under hormonal regulation and are complex composite materials, consisting mainly of chitinous filaments embedded in proteinaceous layers (see [1] for review). The cuticle acts as both skin and exoskeleton, and its mechanical properties have diversified among insects to suit different biological functions. The differences in the mechanical properties of the exoskeleton probably depend on the respective features of the various cuticular proteins and of chitin itself, on the precise combination of different cuticular proteins, and on their secondary stabilization, called "sclerotization". However, it is not yet known how many types of cuticular proteins the cuticle contains, nor how their coordinated expression in epidermal cells and their secretion into the cuticle are precisely controlled during molting. These issues are central to understanding the mechanisms underlying the formation of the highly ordered and layered structure of the cuticle.
The amino acid sequences of cuticular proteins have been reported for a wide variety of insects, either by directly sequencing the purified cuticular proteins or by deducing them from the corresponding cDNA sequences [1]. However, most secreted cuticle components are cross-linked, making them unextractable [2]. We therefore infer that many cuticular proteins remain to be identified. Previously reported cuticular protein sequences have revealed several conserved motifs, such as the R&R consensus [1], Tweedle [4], and a 44-amino-acid motif [5]. The R&R consensus sequence is the most prevalent motif, and was first reported by Rebers and Riddiford [6]. An extended version of this consensus sequence was subsequently described and is generally referred to as the R&R consensus, which is known to bind chitin [1]. Three distinct forms of this consensus are recognized: RR-1, RR-2, and RR-3 [1]. In general, RR-1 is characteristic of proteins in soft and flexible cuticles, and RR-2 proteins are associated with stiff and hard cuticles, although this classification is tentative [1]. Many other cuticular proteins lacking these motifs contain other repeated sequences, such as GGX or AAP(A/V) [1].
The comprehensive identification of cuticular proteins with an R&R motif has been attempted in Drosophila melanogaster, Apis mellifera, and Anopheles gambiae, based on their genome sequences. Recently, we identified 220 putative cuticular protein genes (56 RR-1, 89 RR-2, 3 RR-3, 4 Tweedle, 1 CPF, 4 CPFL, 29 glycine-rich, and 34 other genes) in the silkworm Bombyx mori. A key question is whether each cuticular protein gene is expressed in a tissue- or stage-specific manner. Togawa et al. [16] investigated the developmental expression patterns of all Anopheles cuticular protein genes with R&R consensus motifs, using real-time quantitative reverse transcription-polymerase chain reaction (qRT-PCR). They found that all the genes are expressed and can be grouped into 21 clusters with different developmental expression profiles, although their tissue specificities were not examined.
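As a quick consistency check, the per-category counts quoted above for B. mori can be summed to confirm that they account for the reported total of 220 putative cuticular protein genes. The following is only an illustrative tally; the dictionary and variable names are assumptions introduced here and are not part of the original analysis.

```python
# Category counts as quoted in the text for putative cuticular protein
# genes in Bombyx mori. Names below are illustrative, not from the study.
cuticular_gene_counts = {
    "RR-1": 56,
    "RR-2": 89,
    "RR-3": 3,
    "Tweedle": 4,
    "CPF": 1,
    "CPFL": 4,
    "glycine-rich": 29,
    "other": 34,
}

# Sum over all categories; this should match the reported total of 220.
total_genes = sum(cuticular_gene_counts.values())

# Genes carrying any form of the R&R consensus (RR-1, RR-2, RR-3).
rr_genes = sum(
    count for name, count in cuticular_gene_counts.items()
    if name.startswith("RR-")
)

print(f"total putative cuticular protein genes: {total_genes}")
print(f"genes with an R&R consensus: {rr_genes}")
```

On these figures, genes with an R&R consensus make up about two-thirds of the putative cuticular protein genes.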
Bombyx mori has an advantage in the study of tissue specificity and tissue-level expression profiles, because its tissues are relatively large and tissue-specific cDNA libraries are easy to construct. More than 40 expressed sequence tag (EST) databases for various tissues at different developmental stages of Bombyx are available [17]. Exhaustive sequencing of the full-length cDNAs expressed in epidermal cells is an effective way to gain an overview of cuticular proteins, because it avoids directly sequencing the barely extractable cuticular proteins. Therefore, we constructed a library of full-length cDNAs, using a G-capping method [20], from mRNAs expressed in the larval epidermal cells of the silkworm, B. mori, during the fourth larval molt, when the cuticle of the fifth instar is formed.
We sequenced over 10,000 clones randomly selected from the library described above, and isolated 6,653 ESTs belonging to 1,451 nonredundant gene clusters. Seventy-one clusters were considered to be isoforms or premature forms of other clusters, leaving 1,380 putative genes. About half of these ESTs encoded cuticular proteins, representing 92 genes. In addition to the cuticular proteins, we identified 290 genes whose deduced amino acid sequences contain putative signal peptides, suggesting that they play some role in cuticle formation or other molting events. Here, we list these cuticular protein genes, secreted protein genes, and other epidermal protein genes, including those encoding transcription factors and many enzymes. These data should be useful in understanding cuticle formation and the insect molt.
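The arithmetic behind these summary figures can be laid out explicitly. The sketch below simply recomputes the number of putative genes from the cluster counts given in the text and derives two ratios that the text does not itself report; all variable names are illustrative assumptions.

```python
# Figures quoted in the text; variable names are illustrative only.
total_ests = 6653            # ESTs isolated from the epidermal library
gene_clusters = 1451         # nonredundant gene clusters
redundant_clusters = 71      # isoforms or premature forms of other clusters
cuticular_genes = 92         # genes encoding cuticular proteins
signal_peptide_genes = 290   # other genes with putative signal peptides

# Putative genes remaining after discounting the redundant clusters.
putative_genes = gene_clusters - redundant_clusters

# Derived ratios (not reported in the text, shown for orientation only).
mean_ests_per_cluster = total_ests / gene_clusters
cuticular_gene_fraction = cuticular_genes / putative_genes

print(f"putative genes: {putative_genes}")
print(f"mean ESTs per cluster: {mean_ests_per_cluster:.2f}")
print(f"cuticular genes as a fraction of putative genes: {cuticular_gene_fraction:.1%}")
```

The contrast between the two ratios reflects the point made above: cuticular protein genes are a small fraction of the putative genes, yet they account for about half of the ESTs.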