Search tips
Search criteria 


Logo of bioinformLink to Publisher's site
Bioinformation. 2009; 3(7): 314–320.
Published online 2009 February 28.
PMCID: PMC2655052

Use of sequence motifs as barcodes and secondary structures of Internal Transcribed spacer 2 (ITS2, rDNA) for identification of the Indian liver fluke, Fasciola (Trematoda: Fasciolidae)


Most phylogenetic studies using current methods have focused on primary DNA sequence information. However, RNA secondary structures are particularly useful in systematics because they include characteristics that give “morphological” information which is not found in the primary sequence. Also DNA sequence motifs from the internal transcribed spacer (ITS) of the nuclear rRNA repeat are useful for identification of trematodes. The species of liver flukes of the genus Fasciola (Platyhelminthes: Digenea: Fasciolidae) are obligate parasitic trematodes residing in the large biliary ducts of herbivorous mammals. While Fasciola hepatica has a cosmopolitan distribution, the other major species, i.e., F. gigantica is reportedly prevalent in the tropical and subtropical regions of Africa and Asia. To determine the Fasciola sp. of Assam (India) origin based on rDNA molecular data, ribosomal ITS2 region was sequenced (EF027103) and analysed. NCBI databases were used for sequence homology analysis and the phylogenetic trees were constructed based upon the ITS2 using MEGA and a Bayesian analysis of the combined data. The latter approach allowed us to include both primary sequence and RNA molecular morphometrics and revealed a close relationship with isolates of F. gigantica from China, Indonesia and Japan, the isolate from China with significant bootstrap values being the closest. ITS2 sequence motifs allowed an accurate in silico distinction of liver flukes. The data indicate that ITS2 motifs (≤ 50 bp in size) can be considered promising tool for trematode species identification. Using the novel approach of molecular morphometrics that is based on ITS2 secondary structure homologies, phylogenetic relationships of the various isolates of fasciolid species have been discussed.

Keywords: Fasciola hepatica, Fasciola gigantica, secondary structure, internal transcribed spacer, bar coding, motifs


The identification of closely related species based on morphological characters can be difficult. This is particularly the case for soft-bodied animals such as digenean trematodes. PCR‐based techniques utilizing the rDNA ITS2 sequences, which occur between the 5.8S and 28S coding regions, have proven to be a reliable tool to identify the helminth species and their phylogenetic relationships [1]. The nuclear ribosomal DNA second internal transcribed spacer (ITS2) sequences, which occur between the 5.8S and 28S coding regions, have proven useful for diagnostic purposes at the level of species. Fasciola spp and isolates of Fascioloides magna, another member of the same family, from different geographical regions have been discriminated on the basis of ITS sequences [2]. The ITS2 sequences have also been used to characterize the liver flukes from mainland China, which include F. hepatica, F. gigantica and an intermediate genotype, including polymorphism among ITS2 copies within the same fluke individual [3]. The emergence of sequence‐based identification with a BLAST similarity search connected to public databases has resolved several experimental and taxonomic constraints. In addition, BLAST outcomes give no information about species delimitation for closely related species. The identification of unknown ITS sequences based on these approaches therefore needs to be supported by phylogenetic analysis [4].

The trematode flukes of the genus Fasciola (the sheep liver fluke) are parasites of herbivores and infect humans accidentally causing fascioliasis worldwide. The parasite is very cosmopolitan in distribution being found throughout all regions of the world, both temperate and tropical. F. hepatica is the causative agent of fascioliasis or ’liver rot‘ in ruminants, where it may be an important pathogen. Human infections with F. hepatica are found in areas where sheep and cattle are raised, and where humans consume raw watercress, including Europe, the Middle East and Asia [5]. Infection with F.gigantica on the other hand is found more commonly in tropical regions of theworld, areas affected including Africa, Asia, many Pacific islands including Hawaii (where it has been reported inman), the Middle East and Southern Europe and the south of the USA [6]. Both fasciolids follow a similar two‐host life cycle, in which freshwater snails of the familyLymnaeidae act as intermediate or vector hosts and a broad spectrum of mammals including humans, mainly herbivorous large‐sized species, act as definitive hosts [7].

Most phylogenetic studies using current methods have focused on primary DNA sequence information. However, RNA secondary structures are particularly useful insystematics because they include characteristics, not found in the primary sequence, that give ’morphological‘information [8]. The novel approach of molecular morphometrics that relies both on traditionalmorphological comparison and on molecular sequence comparison by measuring the structural parameters of the ITS2 secondary structure homologies (geometrical features, bond energies, base composition etc.) is recently being used to study the phylogenetic relationships of various species [9]. This method allows one to take into account the regions where multiple alignments are barely reliable because of a large number of insertions and deletions. This method is based on the assumption that secondary structure can be phylogenetically as significant as primary sequence. It is well known that rRNA is highly conserved throughout evolution. Thus, the secondarystructure elements of the RNA molecule, i.e., the helices, loops, bulges, and separating single‐stranded portions, can be considered phylogenetic characters [10,11, 12].

At the same time, the ITS offers sequence motifs that are useful for the development of DNA barcoding. This is a technique that uses short DNA sequences from a standardized region of the genome as a diagonistic “biomarker” for species identification [13,14] as testified to by the creation of the consortium for the Barcode of life ( Different species have different barcodes, which allow them to be used to (i) identify specimens, (ii) discover possible new species, and (iii) make taxonomy more effective for science and society. In the present study, our main objective was to identify the Indian liver fluke (the Assam isolates) collected from ruminant hosts by the design and testing of ITS sequence motifs for oligonucleotide bar codes. We also aimed at determining the species prevalent in the northeastern region of the country, by comparing these sequences by primary sequence analysis and molecular morphometrics data.


Parasite material and DNA isolation

Adult Fasciola were obtained in live form from hepatic biliary ducts of freshly slaughtered cow, Bos indicus. The worms recovered from these hosts represented the geographical isolates from Assam, Northeast India and morphologically resembled Fasciola gigantica (deposition number of paratypes at Zoological Survey of India, Kolkata = W7787/1). Eggs were obtained from mature adult flukes by squeezing between two glass slides. For the purpose of DNA extraction, adult flukes were processed singly; eggs recovered from each of these specimens were also processed separately. The adult flukes were first immersed in digestion extraction buffer (containing 1% SDS, 25 mg Proteinase K) at 37° C overnight. DNA was then extracted from lysed individual worms by standard ethanol precipitation technique [15] and also extracted on FTA cards using Whatman's FTA Purification Reagent as described elsewhere [16]. DNA from the eggs was extracted only with the FTA card technique.

DNA amplification, Sequencing and its Analysis

The rDNA region spanning the ITS2 region was amplified from DNA obtained from the fluke by PCR. We used the universal primers, considered to be the general primers for trematodes and are designed based on conserved ITS sequences of Schistosoma species following [17] as given n Figure 1. The PCR amplification was performed following the standard protocol [18] with minor modifications. The resultant PCR products were separated by electrophoresis through 1.6% (w/v) agarose gels in TAE buffer, stained with ethidium bromide, transilluminated under ultraviolet light and then photographed. The known size fragments of Phi X 174 DNA/ Hae III Digest in agarose gel were used as marker. For DNA sequencing, the PCR products were purified using Genei Quick PCR purification Kit, and sequenced in both directions using PCR primers on an automated sequencer by DNA sequencing services of TCGA, New Delhi and Bangalore Genei Ltd. The DNA sequences were put to further analysis by using various Bioinformatics tools including similarity search BLAST (URL and phylogenetic prediction by CLUSTALW (URL for each query DNA sequence.

Figure 1
Primer design for ITS2

Molecular Phylogenetic Analysis

Initially, the sequences were aligned using ClustalW multiple alignment [19] with the default gap and extension penalties used by this program. Phylogenetic tree‐building methods presume particular evolutionary models. Therefore, while interpreting the results obtained, we considered different tree building models to entertain possible explanations. Only unique sequences were used in tree construction. ITS sequences were entered in the MEGA for construction of the phylogenetic trees using Maximum Parsimony and distance methods namely the Neighbor‐Joining, UPGMA and Minimum Evolution. Branch support was given using 1000 bootstrap replicates in MEGA [20].

Bayesian phylogenetic analysis

DNA sequences were aligned using ClustalX 2.0.7 [21]. The interleaved NEXUS file was edited manually in order for it to be recognized by Mr. Bayes V3.1.2 programme. Phylogenetic analysis was carried out using the Bayesian approach with combined datasets using Mrbayes 3.1 [22], wherein each data partition is allowed to have a different evolution rate. The model of evolution and prior settings for individual loci was set to the GTR model with gammadistributed rate variation across sites and a proportion of invariable sites. Metropolis‐coupled Markov chain Monte Carlo (MCMCMC) sampling was performed with four incrementally heated chains that were combinatorial run for 40,000 generations. The convergence of MCMCMC was then monitored by examining the value of the marginal likelihood through generations. Coalescence of substitution rate and rate model parameters were also examined. Average standard deviation of split frequencies was checked and the generations were kept on adding until the standard deviation value was below 0.01. The values slightly differed because of stochastic effects. The sample of substitution model parameters and samples of trees and branch lengths were summarized by the “sump burnin” and “sumt burnin” commands respectively. The values in the following commands were adjusted as per the 25% of our samples. The cladogram with the posterior probabilities for each split and a phylogram with mean branch lengths were generated and subsequently read by the tree drawing program Tree view V1.6.6 [23].

Motif identification, testing and validation

The ITS sequence motifs were identified from aligned sequences of the data set for the species using PRATT software ( The minimum percentage of sequences to match (C %) parameter was adjusted to report pattern matching at 100% of the sequence input. The motifs were expressed using the DNA alphabet (A, T, C, G) in PROSITE language. The validation of the motifs was performed for each species using a “PATTERN MATCHING” Web application ( considered that a motif was highly specific to a Fasciola species if it matched most or all the ITS sequences of this species but no other ITS of any other trematodes species.

Evaluation of the ITS motif specificity through BLAST analysis

The Fasciola ITS sequence motifs (patterns 1‐15) were subjected to BLAST algorithm against the non‐redundant GenBank database of the National Center for Biotechnology Information (nr at NCBI). The BLAST output were then analyzed to find only the exact or perfect pairwise matches showing significantly high scores in terms of percent identity and low expect (E) values for each species.

Predicted ITS2 RNA secondary structures and analyses

Secondary structures of ITS2 sequences of various fasciolid species were reconstructed by aligning their sequences using Bioedit [24]. The acquired structures with restrictions and constrains were submitted in MFOLD [25]. RNA was folded at a fixed temperature of 37° C, and the structure chosen from different output files was the desired 6‐helicoidal ring or the one with the highest negative free energy if various similar structures were obtained.


PCR amplification of ITS regions and its analysis

The PCR‐amplified products were successfully obtained using the primers as mentioned above. The nucleotide sequences were obtained for ITS2 of rDNA and were compared with sequences of other fasciolid species obtained from Genbank. The fragments of amplified DNA were estimated to be 480‐550bp long. For comparative purpose, the ITS2 sequences of fasciolids from various geographical regions were obtained from GenBank (Table 1). The Blast hit results show that the query ITS sequences were more similar to the sequences of various geographical isolates of Fasciola sp., F. hepatica and F. gigantica besides Fascioloides magna (belongs to the same family, i.e., Fasciolidae).

Phylogenetic trees

Phylogenetic trees were obtained by comparing the sequences of F. gigantica and available ITS sequences for other fasciolid species. Phylogenetic analyses using the various distance methods and character method like Maximum Parsimony showed that the topology is similar among the trees obtained with significant bootstrap support for the clades. The values of 70% and above in the bootstrap test of phylogenetic accuracy indicates reliable grouping among different members of fasciolids. The phylogenetic trees constructed in MEGA revealed a close relationship with isolates of F. gigantica from China, Indonesia and Japan (Figure 2).

Figure 2
Phylogenetic trees of ITS2 sequences of fasciolid species showing bootstrap values. (A) Neighbor Joining; (B) Maximum Parsimony; (C) Hypothetical Bayesian analysis

In silico identification of Fasciola sp. based on pattern matching ITS motifs

A total of 15 ITS2 PROSITE motifs were tested by BLAST analysis against the generalized GenBank database at NCBI (Figure 3). The Fasciola motifs exhibited exact or perfect matches with Fasciola from different geographical isolates with significant E‐value scores. Eight (pattern 1‐8) out of 15 Fasciola motifs exactly matched the sequences of Fasciola isolates from GenBank (best hits = more than 100; 100% of identity; E-values = 2e-16 to 1e-17). Five (pattern 11‐15) showed (best hits = more than 100; 95% identity and significant E‐value ranging from 5e-14 to 1e-14). The remaining two motifs (pattern 9‐10) exhibited more than 100 best hits with 92% identity and an E‐value score ranging from 4e-11 to 1e-11 (Tables 2 & 3 in supplementary material).

Figure 3
Sample of sequence motif in PROSITE format (from 5' to 3' ends) and Pattern matching of ITS motifs of different geographical isolates of fasciolid species

Secondary structure analysis

Six predicted RNA secondary structures werereconstructed from the unique sequences with highestnegative free energy of F. gigantica to provide the basicinformation for phylogenetic analysis (Figure 4). The ITS2 plus flanking regions of nuclear region ranged from 606bpin F. gigantica India to a minimum length of 361bp in F. gigantica China. The secondary structural features of ITS2 regions as shown in the figure were analysed based on conserved stems and loops. F. sp. Japan, F. gigantica Indonesia, F. gigantica Zambia and F. hepatica Uruguay exhibited same type of secondary structures. F. gigantica isolates from India and China show overall similarity in the ITS2 rRNA folding and have identical secondary structure. Secondary structures of the remaining species are weakly variable. The observed similarities at the secondary structural level are further reflected at the energy level. The only difference in their topology is due to differences in nucleotide sequences. These secondary structure predictions indicate that the domains basepair to form a core region central to several stem features implying that conservedness is more important for the proper rRNA folding pattern. Moreover the observed phylogenetic trend was identified with respect to the target accessibility sites for the seven different isolates. The orders of preference were interior loop, bulge loop, multiple branch loop, hairpin loop and exterior loop in all the isolates. The topology based only on the predicted RNA secondary structure of the ITS2 region resolved most relationships among the species studied. Bayesian analysis of the alignment retained the same topology and supported the same branches as the primary sequences (Figure 4). ITS2 sequence of the Indian isolate revealed closest similarity with the Chinese isolate (Fasciola gigantica) with significant bootstrap value.

Figure 4
Predicted ITS2 RNA secondary structures and their structure formation enthalpies according to MFOLD


Species identification and evolutionary inference generally use molecular and phylogenetic approaches on the ITS as target genomic loci. But taxonomy of Fasciola spp has been based mainly on morphological data complemented with ecological, cytological and pathological results as well as clinical manifestations. Morphological differences found on stained and mounted adult specimens have been widely used to discriminate between platyhelminth species. It is possible to distinguish between adult F. hepatica and F. gigantica on the basis of morphology, but much variation exists. Differentiating between two species is not possible on the basis of clinical, pathological, or immunological findings and their eggs are morphologically very similar [26]. Consequently, where both species occur concurrently or in overlapping geographical distribution, it is not possible to be certain as to which species is responsible for the disease. The low number of records of infection with F. gigantica may well be due to the lack of good tools to distinguish this species from F. hepatica [27].

The comparison of ITS sequences from worms of different hosts and of different countries indicates that there exists a high species‐specific homogeneity. In the present study, primary sequence analysis revealed a close relationship between the query sequence (from Northeastern region of India) and isolates of F. gigantica from China, Indonesia and Japan. Phylogenetic trees constructed showed that the groups of multiple closely related genotypes of F. gigantica from Asia are broadly sympatric. Such a pattern is expected for species with high gene flow whose populations have not been sundered by long term biogeographic barriers.

We present here a new approach of molecular morphometrics, in which the measurable structural parameters of the molecules are directly used as specific characters to construct a phylogenetic tree. These structures are inferred from the sequence of the nucleotides, often using energy minimization [28]. Several patterns of predicted secondary structures of RNA were constructed from unique ITS sequences from different geographical isolates of F. gigantica, which provided us with the additional information for correct identification of the species prevalent in the region. Molecular morphometrics appears to be complimentary to classical primary sequence analysis in phylogenetic studies as it takes into consideration only the size variations of homologous structural segments and this choice implies that the overall architecture of the molecule remains same among the observed taxa. This method helps in taking into account the regions where multiple alignments are barely reliable because of large number of insertion/deletion operations. In the present study the secondary structure analysis of the same data also confirmed the results mentioned for primary sequence analysis. Differences in their topology are only due to the fact that there are variable lengths of the sequences. However, there are difficulties in defining a distance between two related structures with variable topologies [29]. Nevertheless, because there were inconsistencies in the placement of a few Fasciola species, this study needs to be extended, in order to gain a better understanding of the systematics of this group as well as the evolution of their predicted ITS2 RNA secondary structures.


The present in silico identification of the Fasciola spp. with ITS sequence motifs and secondary structure is consistent with investigations made using traditional approaches (i.e., by morphology). The results also corroborate that the Fasciola species prevalent in Assam, Northeast India is in fact F. gigantica and not F. hepatica, which otherwise is the most common liver fluke throughout the globe. Lack of data on genotypic diversity of Fasciola species in Africa and India does not allow the origin of regional populations to be unambiguously determined. Further studies with additional molecular markers and bar coding will contribute towards determining the population structure and divergence between the two closely related species of this genus.

Supplementary material



This study was carried out under the ’All India coordinated Project on Capacity Building in Taxonomy: Research on Helminths‘, sanctioned to VT by Ministry of Environment & Forests, GOI; DBT Project to VT & AC; DSA (UGCSAP) in Zoology and UGC's UPE (Biosciences) programme in the School of Life Sciences at NEHU, Shillong.


Citation:Prasad et al., Bioinformation 3(7): 314-321 (2009)


1. Blair D, et al. Adv Parasitol. 1999;42:113. [PubMed]
2. Adlard RD, et al. Int J Parasitol. 1993;23:423. [PubMed]
3. Huang WY, et al. Vet Parasitol. 2004;120:75. [PubMed]
4. Holder M, Lewis PO. Nat Rev Genet. 2003;4:275. [PubMed]
5. Mas-Coma S, et al. Bull WHO. 1999;77:340. [PubMed]
6. Hammond JA. Trans R Soc Trop Med Hyg. 1974;68:253. [PubMed]
7. Mas-Coma S, et al. Int J Parasitol. 2005;35:1255. [PubMed]
8. Caetano-Anolles G. Nucleic Acids Res. 2002;30:2575. [PMC free article] [PubMed]
9. Billoud B, et al. Mol Biol Evol. 2000;17:1435. [PubMed]
10. Zwieb C, et al. Nucleic Acids Res. 1981;9:3621. [PMC free article] [PubMed]
11. Schultz J, et al. RNA. 2005;11:361. [PubMed]
12. Grajales A, et al. BMC Evol Biol. 2007;7:90. [PMC free article] [PubMed]
13. Ebach MC, Holdrege C. Nature. 2005;434:697. [PubMed]
14. Schindel DE, Miller SE. Nature. 2005;435:17. [PubMed]
15. Sambrooke J, et al. Molecular cloning: a laboratory manual. 1989.
16. Prasad PK, et al. Parasitol Res. 2007;101:1581. [PubMed]
17. Bowles J, et al. Mol Phylogenet Evol. 1995;4:103. [PubMed]
18. White BA. PCR Protocols, current methods and applications. 1993.
19. Thompson JD, et al. Nucleic Acids Res. 1994;22:4673. [PMC free article] [PubMed]
20. Hillis DM, JJ Syst Biol. 1993;42:182.
21. Larkin MA, et al. Bioinformatics. 2007;23:2947. [PubMed]
22. Huelsenbeck JP, Ronquist F. Bioinformatics. 2001;17:754. [PubMed]
23. DM R, et al. Page Comp Appl Biosci. 1996;12:357. [PubMed]
24. Hall TA. Nucl Acids Symp. 1999;41:95.
25. Zuker M. Nucl Acids Res. 2003;31:3406. [PMC free article] [PubMed]
26. Lotfy WM, Hillyer GV. Exp Pathol Parasitol. 2003;11:9.
27. Marcilla A, et al. Mol Cell Probes. 2002;16:327. [PubMed]
28. Zuker M. Methods Mol Biol. 1994;25:267. [PubMed]
29. Shapiro BA. Comput Appl Biosci. 1998;4:387. [PubMed]

Articles from Bioinformation are provided here courtesy of Biomedical Informatics Publishing Group