The agricultural pest Ceratitis capitata, also known as the Mediterranean fruit fly or Medfly, belongs to the Tephritidae family, which includes a large number of other damaging pest species. The Medfly was the first non-drosophilid fly species to be genetically transformed, paving the way for the design of genetics-based pest control strategies. Furthermore, it is an experimentally tractable model in which transient and transgene-mediated RNAi have been used successfully. We applied Illumina sequencing to total RNA preparations of 8–10-hour-old embryos of C. capitata. This developmental window corresponds to the blastoderm cellularization stage. In summary, we assembled 42,614 transcripts, which cluster into 26,319 unique transcripts, of which 11,045 correspond to protein-coding genes; we identified several hundred long ncRNAs; and we found an enrichment of transcripts encoding RNA-binding proteins among the highly expressed transcripts, such as CcTRA-2, known to be necessary to establish and, most likely, to maintain female sex in C. capitata. Our study is the first de novo assembly for Ceratitis capitata based on Illumina NGS technology during embryogenesis, and it adds novel data to the previously published C. capitata EST databases. We expect that it will be useful for a variety of applications such as gene cloning and phylogenetic analyses, as well as to advance genetic research and biotechnological applications in the Medfly and other related Tephritidae.
Eicosapenta peptide repeats (EPRs) occur exclusively in flowering plant genomes and exhibit very high amino acid residue
conservation across occurrence. DNA and amino acid sequence searches yielded no indications about the function due to absence
of similarity to known sequences. Tertiary structure of an EPR protein coded by rice (Oryza sativa japonica) cDNA (GI: 32984786)
was determined using ab initio methodology in order to obtain clues to the functional significance of EPRs. The resultant structure
comprised seven α-helices and thirteen anti-parallel β-sheets. Surface-mapping of conserved residues onto the structure suggested
that (i) regions equivalent to β α4-
the primary function of EPR protein could be Ca2+ binding, and (iii) the putative EPR Ca2+ binding domain is structurally similar to
calcium-binding domains of plant lectins. Additionally, the phylogenetic analysis showed an evolving taxa-specific distribution of
EPR proteins observed in some GNA-like lectins.
ab initio structure prediction; function prediction; repeat proteins; surface mapping; taxa-specific
The establishment of a complete genomic sequence of silkworm, the model species of Lepidoptera, laid a foundation for its functional genomics. A more complete annotation of the genome will benefit functional and comparative studies and accelerate extensive industrial applications for this insect. To realize these goals, we embarked upon a large-scale full-length cDNA collection from 21 full-length cDNA libraries derived from 14 tissues of the domesticated silkworm and performed full sequencing by primer walking for 11,104 full-length cDNAs. The large average intron size was 1904 bp, resulting from a high accumulation of transposons. Using gene models predicted by GLEAN and published mRNAs, we identified 16,823 gene loci on the silkworm genome assembly. Orthology analysis of 153 species, including 11 insects, revealed that among three Lepidoptera including Monarch and Heliconius butterflies, the 403 largest silkworm-specific genes were composed mainly of protective immunity, hormone-related, and characteristic structural proteins. Analysis of testis-/ovary-specific genes revealed distinctive features of sexual dimorphism, including depletion of ovary-specific genes on the Z chromosome in contrast to an enrichment of testis-specific genes. More than 40% of genes expressed in specific tissues mapped in tissue-specific chromosomal clusters. The newly obtained FL-cDNA sequences enabled us to annotate the genome of this lepidopteran model insect more accurately, enhancing genomic and functional studies of Lepidoptera and comparative analyses with other insect orders, and yielding new insights into the evolution and organization of lepidopteran-specific genes.
Bombyx mori; large-scale full-length cDNA collection; tissue-specific genes; sexual dimorphism; gene cluster; silkworm
A. assamensis is a phytophagous lepidopteran from Northeast India reared on host trees of the Lauraceae family for its characteristic cocoon silk. The sources of these cocoons are domesticated farm stocks, which crash frequently, and/or wild insect populations, which provide new cultures. The need to reduce dependence on wild populations for cocoons necessitates assessment of genetic diversity in cultivated and wild populations. Molecular markers based on PCR of inter-simple sequence repeats (ISSR) and simple sequence repeats (SSR) were used with four populations of wild insects and eleven populations of cultivated insects. Wild populations had high genetic diversity estimates (Hi = 0.25; HS = 0.28; HE = 0.42) and at least one population contained private alleles. Both marker systems indicated that genetic variability within the populations examined was significantly high. Among cultivated populations, insects of the Upper Assam region (Hi = 0.19; HS = 0.18; HE = 0) were genetically distinct (FST = 0.38 with both marker systems) from insects of Lower Assam (Hi = 0.24; HS = 0.25; HE = 0.3). Sequencing of polymorphic amplicons suggested transposition as a mechanism for maintaining genomic diversity. Implications for conservation of native populations in the wild and preserving in-farm diversity are discussed.
The Indian golden saturniid silkmoth (Antheraea assama), popularly known as the muga silkmoth, is a semi-domesticated silk-producing insect confined to a narrow habitat range in the northeastern region of India. Owing to the prevailing socio-political problems, the muga silkworm habitats in the northeastern region have not been accessible, hampering phylogeographic studies of this rare silkmoth. Recently, we succeeded in collecting muga cocoon samples, although to a limited extent, from their natural habitats. Out of 87 microsatellite markers developed previously for A. assama, 13 informative markers were employed to genotype 97 individuals from six populations and to analyze their population structure and genetic variation.
We observed highly significant genetic diversity in one of the populations (WWS-1, a population derived from the West Garo Hills region of Meghalaya state). Further analysis with and without the WWS-1 population revealed that the dramatic genetic differentiation (global FST = 0.301) was due to the high genetic diversity contributed by the WWS-1 population. Analysis of the remaining five populations (excluding WWS-1) showed a marked reduction in the number of alleles at all the employed loci. Structure analysis showed the presence of only two clusters: one formed by the WWS-1 population and the other comprising the remaining five populations, implying that there is no significant genetic diversity within and between these five populations and suggesting that they are probably derived from a single population. Patterns of recent population bottlenecks were not evident in any of the six populations studied.
A. assama from the WWS-1 region showed very high genetic diversity and was genetically divergent from the other five populations studied. Efforts should continue to identify and study such populations from this region as well as from other muga silkworm habitats. The information generated will be very useful for conservation of the dwindling muga culture in Northeast India.
The Asian rice gall midge (Orseolia oryzae) is a major pest responsible for immense loss in rice productivity. Currently, very little knowledge exists with regard to this insect at the molecular level. The present study was initiated with the aim of developing molecular resources as well as identifying alterations at the transcriptome level in the gall midge maggots that are in a compatible (SH) or in an incompatible interaction (RH) with their rice host. Roche 454 pyrosequencing strategy was used to develop both transcriptomics and genomics resources that led to the identification of 79,028 and 85,395 EST sequences from gall midge biotype 4 (GMB4) maggots feeding on a susceptible and resistant rice variety, TN1 (SH) and Suraksha (RH), respectively. Comparative transcriptome analysis of the maggots in SH and RH revealed over-representation of transcripts from proteolysis and protein phosphorylation in maggots from RH. In contrast, over-representation of transcripts for translation, regulation of transcription and transcripts involved in electron transport chain were observed in maggots from SH. This investigation, besides unveiling various mechanisms underlying insect-plant interactions, will also lead to a better understanding of strategies adopted by insects in general, and the Asian rice gall midge in particular, to overcome host defense.
Orseolia oryzae; susceptible host; resistant host; next generation sequencing (NGS); real time PCR; insect biotypes; insect-plant interaction
Many proteins of the Rel family can act as both transcriptional activators and repressors. However, the mechanism that discerns the ‘activator/repressor’ functions of Rel proteins such as Dorsal (the Drosophila homologue of mammalian NFκB) is not understood. Using genomic, biophysical and biochemical approaches, we demonstrate that the underlying principle of this functional specificity lies in the ‘sequence-encoded structure’ of the κB-DNA. We show that Dorsal-binding motifs exist in distinct activator and repressor conformations. Molecular dynamics of DNA-Dorsal complexes revealed that repressor κB-motifs typically have an A-tract and a flexible conformation that facilitates interaction with co-repressors. The deformable structure of repressor motifs is due to changes in the hydrogen bonding of the A:T pairs in the ‘A-tract’ core. The sixth nucleotide in the nonameric κB-motif, ‘A’ (A6) in the repressor motifs and ‘T’ (T6) in the activator motifs, is critical for conferring this functional specificity, as an A6 → T6 mutation transformed the flexible repressor conformation into a rigid activator conformation. These results highlight that ‘sequence-encoded κB DNA-geometry’ regulates gene expression by exerting an allosteric effect on the binding of Rel proteins, which in turn regulates interaction with co-regulators. Further, we identified and characterized putative repressor motifs in Dl-target genes, which can potentially aid in functional annotation of the Dorsal gene regulatory network.
Microsatellite loci were isolated from the genomic DNA of the Asian rice gall midge, Orseolia oryzae (Wood-Mason), using a hybridization capture approach. A total of 90 non-redundant primer pairs, representing unique loci, were designed. These simple sequence repeat (SSR) markers represented di- (72%), tri- (15.3%), and complex repeats (12.7%). Three biotypes of gall midge (20 individuals for each biotype) were screened using these SSRs. The results revealed that 15 loci were hypervariable and showed polymorphism among different biotypes of this pest. The number of alleles ranged from two to 11 and expected heterozygosity was above 0.5. Inheritance studies with three markers (observed to be polymorphic between sexes) revealed sex-linked inheritance of two SSRs (Oosat55 and Oosat59) and autosomal inheritance of one marker (Oosat43). These markers will prove to be a useful tool for devising integrated pest management strategies and for studying biotype evolution in this important rice pest.
rice; biotypes; virulence; Oryza sativa; SSR markers; pest of rice
The silkworm, Bombyx mori, is one of the most economically important insects in many developing countries owing to its large-scale cultivation for silk production. With the development of genomic and biotechnological tools, B. mori has also become an important bioreactor for production of various recombinant proteins of biomedical interest. In 2004, two genome sequencing projects for B. mori were reported independently by Chinese and Japanese teams; however, the datasets were insufficient for building long genomic scaffolds which are essential for unambiguous annotation of the genome. Now, both the datasets have been merged and assembled through a joint collaboration between the two groups.
Integration of the two data sets of silkworm whole-genome-shotgun sequencing by the Japanese and Chinese groups together with newly obtained fosmid- and BAC-end sequences produced the best continuity (~3.7 Mb in N50 scaffold size) among the sequenced insect genomes and provided a high degree of nucleotide coverage (88%) of all 28 chromosomes. In addition, a physical map of BAC contigs constructed by fingerprinting BAC clones and a SNP linkage map constructed using BAC-end sequences were available. In parallel, proteomic data from two-dimensional polyacrylamide gel electrophoresis in various tissues and developmental stages were compiled into a silkworm proteome database. Finally, a Bombyx trap database was constructed for documenting insertion positions and expression data of transposon insertion lines.
For efficient usage of genome information for functional studies, genomic sequences, physical and genetic map information and EST data were compiled into KAIKObase, an integrated silkworm genome database which consists of 4 map viewers, a gene viewer, and sequence, keyword and position search systems to display results and data at the level of nucleotide sequence, gene, scaffold and chromosome. Integration of the silkworm proteome database and the Bombyx trap database with KAIKObase led to a high-grade, user-friendly, and comprehensive silkworm genome database which is now available from URL: .
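The N50 scaffold size quoted for the merged assembly can be illustrated with a minimal sketch, assuming the usual definition: the length of the scaffold at which the running total, taken from the longest scaffold downwards, first reaches half of the assembly. The function name and the toy lengths are hypothetical and are not taken from the silkworm assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (usual definition)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First scaffold at which the cumulative length covers half the assembly
        if running >= half_total:
            return length


# Toy scaffold lengths in bp (hypothetical values for illustration only)
toy_scaffolds = [100, 200, 300, 400, 500]
print(n50(toy_scaffolds))
```

A larger N50 means that more of the assembly sits in long scaffolds, which is why the abstract uses it as a measure of continuity.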
Palindromes are known to be involved in a variety of biological processes. In the present investigation we carried out a comprehensive analysis of palindromes in the mitochondrial control regions (CRs) of several animal groups to study their frequency, distribution and architecture to gain insights into the origin of replication of mtDNA.
Many species of Arthropoda, Nematoda, Mollusca and Annelida harbor palindromes and inverted repeats (IRs) in their CRs. Lower animals like cnidarians and higher animal groups like chordates are almost devoid of palindromes and IRs. The study revealed that palindrome occurrence is positively correlated with the AT content of CRs, and that IRs are likely to give rise to longer palindromes.
The present study attempts to explain possible reasons, and gives in silico evidence, for the absence of palindromes and IRs from the CR of vertebrate mtDNA and for their acquisition and retention in insects. Study of the CRs of different animal phyla uncovered the unique architecture of this locus, be it the high abundance of long palindromes and IRs in CRs of Insecta and Nematoda, short IRs of 10–20 nucleotides with a spacer region of 12–14 bases in subphylum Chelicerata, or the nearly complete absence of any long palindromes and IRs in Vertebrata, Cnidaria and Echinodermata.
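The kind of scan described above can be sketched as follows, assuming that a palindrome means a stretch of DNA identical to its own reverse complement and that AT content is the fraction of A and T bases. The abstract does not state the software or thresholds used, so the function names, the `min_len` parameter and the toy sequence are all hypothetical.

```python
# Watson-Crick pairing used to test reverse-complement symmetry
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def at_content(seq):
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("A") + seq.count("T")) / len(seq)


def find_palindromes(seq, min_len=6):
    """Return (start, palindrome) for each maximal reverse-complement
    palindrome of at least min_len bases.

    Such palindromes always have even length, so it is enough to expand
    outwards from every position between two adjacent bases.
    """
    seq = seq.upper()
    hits = []
    for centre in range(1, len(seq)):
        left, right = centre - 1, centre
        while left >= 0 and right < len(seq) and PAIR.get(seq[left]) == seq[right]:
            left -= 1
            right += 1
        length = right - left - 1
        if length >= min_len:
            hits.append((left + 1, seq[left + 1:right]))
    return hits


toy = "CCGAATTCAC"  # hypothetical sequence containing GAATTC
print(find_palindromes(toy), at_content(toy))
```

Inverted repeats separated by a spacer would need a second pass that allows a gap at the centre; that is left out here to keep the sketch short.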
Microsatellites are tandem repeats of nucleotide motifs 1–6 bp in size, observed in all known genomes. These repeats show length polymorphism characterized by insertion or deletion (indels) of the repeat units, which, in and around coding regions, affects transcription and translation of genes.
Systematic comparison of all the equivalent microsatellites in the coding regions of the three mycobacterial genomes, viz. Mycobacterium tuberculosis H37Rv, Mycobacterium tuberculosis CDC1551 and Mycobacterium bovis, revealed for the first time the presence of several polymorphic microsatellites. The coding regions affected by frame-shifts owing to microsatellite indels have undergone changes indicative of gene fission/fusion, premature termination and length variation. Interestingly, the genes affected by frame-shift mutations code for membrane proteins, transporters, PPE, PE_PGRS, cell-wall synthesis proteins and hypothetical proteins.
This study has revealed the role of microsatellite indel mutations in imparting novel functions and a certain degree of plasticity to the mycobacterial genomes. There seems to be some correlation between microsatellite polymorphism and the variations in virulence, host-pathogen interactions mediated by surface antigen variations, and adaptation of the pathogens. Several of the polymorphic microsatellites reported in this study can be tested for their polymorphic nature by screening clinical isolates and various mycobacterial strains, for establishing correlations between microsatellite polymorphism and the phenotypic variations among these pathogens.
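As an illustration of the definitions used above, the following minimal sketch finds perfect tandem repeats of 1–6 bp motifs with a regular expression and checks whether gaining or losing repeat units would shift the reading frame. It is not the method used in the study; the function names, the `min_repeats` threshold and the toy sequence are hypothetical.

```python
import re


def find_microsatellites(seq, min_repeats=3):
    """Return (start, motif, copies) for perfect tandem repeats of 1-6 bp motifs.

    The lazy quantifier makes the regex report the shortest repeating unit,
    so ATATATAT is reported as (AT)4 rather than (ATAT)2.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


def indel_shifts_frame(motif, copies_changed):
    """True if inserting or deleting this many repeat units changes the
    coding length by a number of bases that is not a multiple of three."""
    return (len(motif) * abs(copies_changed)) % 3 != 0


# Hypothetical toy sequence: (AT)4 and (CAG)3 embedded in flanking bases
toy = "GGC" + "AT" * 4 + "CCT" + "CAG" * 3 + "TT"
print(find_microsatellites(toy))
print(indel_shifts_frame("AT", 1), indel_shifts_frame("CAG", 1))
```

Under this simple rule, single-unit indels in mono- and di-nucleotide tracts shift the frame, whereas indels in tri- and hexa-nucleotide tracts only change the protein length.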
Molecular characterization of cattle breeds is important for the prevention of germplasm erosion by cross breeding. Indian zebu cattle have played a significant role in the evolution of present-day cattle breeds and in the development of some exotic breeds. Microsatellites are the best available molecular tools for characterization of cattle breeds. The present study was carried out to characterize two Indian cattle breeds, Ongole and Deoni, using microsatellite markers.
Using 5 di- and 5 tri-nucleotide repeat loci, 17 Ongole and 13 Deoni unrelated individuals were studied. Of the ten loci, eight revealed polymorphism in both the breeds. The di-nucleotide repeat loci were found to be more polymorphic (100%) than tri-nucleotide repeat loci (60%). A total of 39 polymorphic alleles were obtained at 4.5 alleles per locus in Ongole and 4.1 in Deoni. The average expected heterozygosity was 0.46 (±0.1) and 0.50 (±0.1) in Ongole and Deoni breeds, respectively. The PIC values of the polymorphic loci ranged from 0.15 to 0.79 in Ongole and 0.13 to 0.80 in Deoni breeds. Six Ongole specific and three Deoni specific alleles were identified. The two breeds showed a moderate genetic relationship between themselves with a FST value of 0.117 (P = 0.01).
This preliminary study shows that microsatellite markers are useful in distinguishing the two zebu breeds namely, Ongole and Deoni. Further studies of other zebu breeds using many microsatellite loci with larger sample sizes can reveal the genetic relationships of Indian breeds.
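The expected heterozygosity and PIC values reported for the two breeds are functions of the allele frequencies at each locus. The abstract does not spell out the formulas, so the sketch below assumes the commonly used ones: He = 1 − Σp² and the PIC definition usually attributed to Botstein et al. (1980), PIC = 1 − Σp² − ΣΣ 2p²q² over allele pairs. Function names and the example frequencies are hypothetical.

```python
from itertools import combinations


def _check_freqs(freqs):
    # Allele frequencies at one locus should sum to 1
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for one locus (no small-sample correction)."""
    _check_freqs(freqs)
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content for one locus (assumed formula)."""
    _check_freqs(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical locus with two equally frequent alleles
print(expected_heterozygosity([0.5, 0.5]), pic([0.5, 0.5]))
```

PIC is never larger than He for the same locus, which is consistent with the ranges quoted above having upper bounds below 1.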
Two Indian cattle breeds, Ongole and Deoni, were characterized using microsatellite markers, revealing six Ongole and three Deoni specific alleles.
Molecular characterization of cattle breeds is important for the prevention of germplasm erosion by cross breeding. The present study was carried out to characterize two Indian cattle breeds, Ongole and Deoni, using microsatellite markers. Using 5 di- and 5 tri-nucleotide repeat loci, 17 Ongole and 13 Deoni unrelated individuals were studied. Of the ten loci, eight revealed polymorphism in both the breeds. The di-nucleotide repeat loci were found to be more polymorphic (100%) than tri-nucleotide repeat loci (60%). A total of 39 polymorphic alleles were obtained at 4.5 alleles per locus in Ongole and 4.1 in Deoni. The average expected heterozygosity was 0.46 (±0.1) and 0.50 (±0.1) in Ongole and Deoni breeds, respectively. The PIC values of the polymorphic loci ranged from 0.15 to 0.79 in Ongole and 0.13 to 0.80 in Deoni breeds. Six Ongole specific and three Deoni specific alleles were identified. The two breeds showed a moderate genetic relationship between themselves with a FST value of 0.10.
The genus Morus, known as mulberry, is a dioecious and cross-pollinating plant that is the sole food for the domesticated silkworm, Bombyx mori. Traditional methods using morphological traits for classification are largely unsuccessful in establishing the diversity and relationships among different mulberry species because of environmental influence on traits of interest. As a more robust alternative, PCR based marker assays including RAPD and ISSR were employed to study the genetic diversity and interrelationships among twelve domesticated and three wild mulberry species.
RAPD analysis using 19 random primers generated 128 discrete markers ranging from 500 to 3000 bp in size. One hundred and nineteen of these were polymorphic (92%), with an average of 6.26 markers per primer. Among these were a few putative species-specific amplification products which could be useful for germplasm classification and introgression studies. The ISSR analysis employed six anchored primers, 4 of which generated 93 polymorphic markers with an average of 23.25 markers per primer. Cluster analysis of RAPD and ISSR data using the WINBOOT package to calculate the Dice coefficient resulted in two clusters, one comprising polyploid wild species and the other domesticated (mostly diploid) species.
These results suggest that RAPD and ISSR markers are useful for mulberry genetic diversity analysis and germplasm characterization, and that putative species-specific markers may be obtained which can be converted to SCARs after further studies.
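The Dice coefficient used in the cluster analysis can be illustrated with a minimal sketch for presence/absence band profiles, assuming the standard form 2a / (2a + b + c), where a is the number of bands shared by two accessions and b and c are the bands found in only one of them. This is not the WINBOOT implementation; the function name and the band profiles are hypothetical.

```python
def dice_similarity(bands_x, bands_y):
    """Dice coefficient between two 0/1 band profiles of equal length."""
    if len(bands_x) != len(bands_y):
        raise ValueError("profiles must score the same set of markers")
    shared = sum(1 for x, y in zip(bands_x, bands_y) if x and y)
    total = sum(1 for x in bands_x if x) + sum(1 for y in bands_y if y)
    if total == 0:
        raise ValueError("neither profile contains any band")
    # 2a / (2a + b + c), written as 2a over the total band count of both profiles
    return 2 * shared / total


# Hypothetical profiles for two accessions scored at four marker positions
print(dice_similarity([1, 1, 0, 1], [1, 0, 0, 1]))
```

Shared absences do not enter the Dice coefficient, which suits dominant markers such as RAPD and ISSR where a missing band can have several causes.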
The MICdb (Microsatellites Database) (http://www.cdfd.org.in/micas) is a comprehensive relational database of non-redundant microsatellites extracted from fully sequenced prokaryotic genomes. The current version (1.0) of the database has been compiled from 83 genomes belonging to different phylogenetic groups. This database has been linked to MICAS, the web-based Microsatellite Analysis Server. MICAS provides a user-friendly front-end to systematically extract data on microsatellite tracts from genomes. The database contains the following information pertaining to the microsatellites: the regions containing microsatellite tracts (coding/non-coding; if coding, their GenBank annotations); the frequencies of their occurrences; the size and the number of repeating motifs; and the sequences of the tracts. MICAS also provides an interface to Autoprimer, a primer design program to automatically design primers for selected microsatellite loci.
Relevant for various areas of human genetics, Y-chromosomal short tandem repeats (Y-STRs) are commonly used for testing close paternal relationships among individuals and populations, and for male lineage identification. However, even the widely used 17-loci Yfiler set cannot resolve individuals and populations completely. Here, 52 centers generated quality-controlled data of 13 rapidly mutating (RM) Y-STRs in 14,644 related and unrelated males from 111 worldwide populations. Strikingly, >99% of the 12,272 unrelated males were completely individualized. Haplotype diversity was extremely high (global: 0.9999985, regional: 0.99836–0.9999988). Haplotype sharing between populations was almost absent except for six (0.05%) of the 12,156 haplotypes. Haplotype sharing within populations was generally rare (0.8% nonunique haplotypes), significantly lower in urban (0.9%) than rural (2.1%) and highest in endogamous groups (14.3%). Analysis of molecular variance revealed 99.98% of variation within populations, 0.018% among populations within groups, and 0.002% among groups. Of the 2,372 newly and 156 previously typed male relative pairs, 29% were differentiated including 27% of the 2,378 father–son pairs. Relative to Yfiler, haplotype diversity was increased in 86% of the populations tested and overall male relative differentiation was raised by 23.5%. Our study demonstrates the value of RM Y-STRs in identifying and separating unrelated and related males and provides a reference database.
Y-chromosome; Y-STRs; haplotypes; RM Y-STRs; paternal lineage; forensic
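The haplotype diversity figures in the Y-STR abstract above can be illustrated with a minimal sketch, assuming Nei's estimator HD = n/(n − 1) × (1 − Σp²), where p is the frequency of each distinct haplotype among n males. The abstract does not state which estimator was used, and the function name and the toy haplotypes (tuples of repeat counts) are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity for a list of hashable haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical haplotypes: repeat counts at three loci for four males
toy = [(12, 30, 18), (12, 30, 18), (13, 29, 18), (12, 31, 17)]
print(haplotype_diversity(toy))
```

The value reaches 1 only when every male carries a different haplotype, so values very close to 1 correspond to almost complete individualization.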