2.  Inter-genomic DNA Exchanges and Homeologous Gene Silencing Shaped the Nascent Allopolyploid Coffee Genome (Coffea arabica L.) 
G3: Genes|Genomes|Genetics  2016;6(9):2937-2948.
Allopolyploidization is a biological process that has played a major role in plant speciation and evolution. Genomic changes are common consequences of polyploidization, but their dynamics over time are still poorly understood. Coffea arabica, a recently formed allotetraploid, was chosen to study genetic changes that accompany allopolyploid formation. Both RNA-seq and DNA-seq data were generated from two genetically distant C. arabica accessions. Genomic structural variation was investigated using C. canephora, one of its diploid progenitors, as reference genome. The fate of 9047 duplicate homeologous genes was inferred and compared between the accessions. The pattern of SNP density along the reference genome was consistent with the allopolyploid structure. Large genomic duplications or deletions were not detected. Two homeologous copies were retained and expressed in 96% of the genes analyzed. Nevertheless, duplicated genes were found to be affected by various genomic changes leading to homeolog loss or silencing. Genetic and epigenetic changes were evidenced that could have played a major role in the stabilization of the unique ancestral allotetraploid and its subsequent diversification. While the early evolution of C. arabica mainly involved homeologous crossover exchanges, the later stage appears to have relied on more gradual evolution involving gene conversion and homeolog silencing.
PMCID: PMC5015950  PMID: 27440920
polyploidy; evolution; gene conversion; homoeologous recombination; genome dominance
3.  Gigwa—Genotype investigator for genome-wide analyses 
GigaScience  2016;5:25.
Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly true with regards to studies of genomic variation, which are currently lacking scalable and user-friendly data exploration solutions.
Here we present Gigwa, a web-based tool that provides an easy and intuitive way to explore large amounts of genotyping data by filtering it not only on the basis of variant features, including functional annotations, but also on genotype patterns. The data storage relies on MongoDB, which offers good scalability properties. Gigwa can handle multiple databases and may be deployed in either single- or multi-user mode. In addition, it provides a wide range of popular export formats.
The Gigwa application is suitable for managing large amounts of genomic variation data. Its user-friendly web interface makes such processing widely accessible. It can either be simply deployed on a workstation or be used to provide a shared data portal for a given community of researchers.
Electronic supplementary material
The online version of this article (doi:10.1186/s13742-016-0131-8) contains supplementary material, which is available to authorized users.
PMCID: PMC4897896  PMID: 27267926
Genomic variations; VCF; HapMap; NoSQL; MongoDB; SNP; INDEL; Web interface
4.  Genetic diversity, linkage disequilibrium and power of a large grapevine (Vitis vinifera L) diversity panel newly designed for association studies 
BMC Plant Biology  2016;16:74.
As for many crops, new high-quality grapevine varieties requiring less pesticide and adapted to climate change are needed. In perennial species, breeding is a long process which can be sped up by gaining knowledge about quantitative trait loci linked to variation in agronomic traits. However, due to the long juvenile period of these species, establishing numerous highly recombinant populations for high resolution mapping is both costly and time-consuming. Genome-wide association studies in germplasm panels are an alternative method of choice, since they allow the main quantitative trait loci to be identified with high resolution by exploiting past recombination events between cultivars. Such studies require adequate panel design to represent most of the available genetic and phenotypic diversity. Assessing linkage disequilibrium extent and panel power is also needed to determine the marker density required for association studies.
Starting from the largest grapevine collection worldwide maintained in Vassal (France), we designed a diversity panel of 279 cultivars with limited relatedness, reflecting the low structuration in three genetic pools resulting from different uses (table vs wine) and geographical origin (East vs West), and including the major founders of modern cultivars. With 20 simple sequence repeat markers and five quantitative traits, we showed that our panel adequately captured most of the genetic and phenotypic diversity existing within the entire Vassal collection. To assess linkage disequilibrium extent and panel power, we genotyped single nucleotide polymorphisms: 372 over four genomic regions and 129 distributed over the whole genome. Linkage disequilibrium, measured by correlation corrected for kinship, reached 0.2 for a physical distance between 9 and 458 Kb depending on genetic pool and genomic region, with varying size of linkage disequilibrium blocks. This panel achieved reasonable power to detect associations between traits with high broad-sense heritability (> 0.7) and causal loci with intermediate allelic frequency and strong effect (explaining > 10 % of total variance).
Our association panel constitutes a new, highly valuable resource for genetic association studies in grapevine, and deserves dissemination to diverse field and greenhouse trials to gain more insight into the genetic control of many agronomic traits and their interaction with the environment.
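As an illustration of the linkage disequilibrium measure mentioned in this abstract, the following is a minimal sketch of the plain squared correlation (r²) between two SNPs. It assumes genotypes coded as 0/1/2 allele counts with `None` for missing calls. The function name `ld_r2` is hypothetical, and the sketch does not apply the kinship correction used in the study.

```python
def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two SNPs coded as 0/1/2 allele counts.

    Illustrative only: this is the uncorrected r^2, not the
    kinship-corrected correlation described in the abstract.
    """
    # Keep only individuals genotyped at both SNPs.
    pairs = [(a, b) for a, b in zip(geno_a, geno_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two individuals genotyped at both SNPs")

    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)

    # A monomorphic SNP carries no information about association.
    if var_a == 0 or var_b == 0:
        return 0.0
    return (cov * cov) / (var_a * var_b)
```

Plotting such pairwise values against physical distance gives the decay curve from which a threshold distance (for example, where r² drops to 0.2) can be read.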
Electronic supplementary material
The online version of this article (doi:10.1186/s12870-016-0754-z) contains supplementary material, which is available to authorized users.
PMCID: PMC4802926  PMID: 27005772
Vitis; Association panel; Linkage disequilibrium; Power; Genome-wide association studies; SSR; SNP; sylvestris; Vassal collection; Haplotype; Kinship
5.  The Greater Phenotypic Homeostasis of the Allopolyploid Coffea arabica Improved the Transcriptional Homeostasis Over that of Both Diploid Parents 
Plant and Cell Physiology  2015;56(10):2035-2051.
Polyploidy impacts the diversity of plant species, giving rise to novel phenotypes and leading to ecological diversification. In order to observe adaptive and evolutionary capacities of polyploids, we compared the growth, primary metabolism and transcriptomic expression level in the leaves of the newly formed allotetraploid Coffea arabica with those of its two diploid parental species (Coffea eugenioides and Coffea canephora), exposed to four thermal regimes (TRs; 18–14, 23–19, 28–24 and 33–29°C). The growth rate of the allopolyploid C. arabica was similar to that of C. canephora under the hottest TR and that of C. eugenioides under the coldest TR. For metabolite contents measured at the hottest TR, the allopolyploid showed similar behavior to C. canephora, the parent which tolerates higher growth temperatures in the natural environment. However, at the coldest TR, the allopolyploid displayed higher sucrose, raffinose and ABA contents than those of its two parents and similar linolenic acid leaf composition and Chl content to those of C. eugenioides. At the gene expression level, few differences between the allopolyploid and its parents were observed for studied genes linked to photosynthesis, respiration and the circadian clock, whereas genes linked to redox activity showed a greater capacity of the allopolyploid for homeostasis. Finally, we found that the overall transcriptional response to TRs of the allopolyploid was more homeostatic than that of its parents. This better transcriptional homeostasis of the allopolyploid C. arabica afforded a greater phenotypic homeostasis when faced with environments that are unsuited to the diploid parental species.
PMCID: PMC4679393  PMID: 26355011
Coffea arabica; Homeostasis; Natural allopolyploid; Primary metabolism; Temperature; Transcriptomic expression level
6.  SNiPlay3: a web-based application for exploration and large scale analyses of genomic variations 
Nucleic Acids Research  2015;43(Web Server issue):W295-W300.
SNiPlay is a web-based tool for detection, management and analysis of genetic variants including both single nucleotide polymorphisms (SNPs) and InDels. Version 3 now extends functionalities in order to easily manage and exploit SNPs derived from next generation sequencing technologies, such as GBS (genotyping by sequencing), WGRS (whole genome re-sequencing) and RNA-Seq technologies. Based on the standard VCF (variant call format), the application offers an intuitive interface for filtering and comparing polymorphisms using user-defined sets of individuals and then establishing a reliable genotyping data matrix for further analyses. In addition to the various scaled-up analyses allowed by the application (genomic annotation of SNPs, diversity analysis, haplotype reconstruction and network, linkage disequilibrium), SNiPlay3 offers new modules for GWAS (genome-wide association studies), population stratification, distance tree analysis and visualization of SNP density. Additionally, we developed a suite of Galaxy wrappers for each step of the SNiPlay3 process, so that the complete pipeline can also be deployed on a Galaxy instance using the Galaxy ToolShed procedure and then be computed as a Galaxy workflow. SNiPlay is accessible at
PMCID: PMC4489301  PMID: 26040700
7.  Regulatory Divergence between Parental Alleles Determines Gene Expression Patterns in Hybrids 
Genome Biology and Evolution  2015;7(4):1110-1121.
Both hybridization and allopolyploidization generate novel phenotypes by reconciling divergent genomes and regulatory networks in the same cellular context. To understand the rewiring of gene expression in hybrids, the total expression of 21,025 genes and the allele-specific expression of over 11,000 genes were quantified in interspecific hybrids and their parental species, Coffea canephora and Coffea eugenioides, using RNA-seq technology. Between parental species, cis- and trans-regulatory divergences affected around 32% and 35% of analyzed genes, respectively, with nearly 17% of them showing both. The relative importance of trans-regulatory divergences between both species could be related to their low genetic divergence and perennial habit. In hybrids, among divergently expressed genes between parental species and hybrids, 77% were expressed like one parent (expression level dominance), including 65% like C. eugenioides. Gene expression was shown to result from the expression of both alleles affected by intertwined parental trans-regulatory factors. A strong impact of C. eugenioides trans-regulatory factors on the upregulation of C. canephora alleles was revealed. The gene expression patterns appeared to be determined by complex combinations of cis- and trans-regulatory divergences. In particular, the observed biased expression level dominance seemed to be derived from the asymmetric effects of trans-regulatory parental factors on regulation of alleles. More generally, this study illustrates the effects of divergent trans-regulatory parental factors on the gene expression pattern in hybrids. The characteristics of the transcriptional response to hybridization appear to be determined by the compatibility of gene regulatory networks and therefore depend on genetic divergences between the parental species and their evolutionary history.
PMCID: PMC4419803  PMID: 25819221
hybridization; cis- and trans-regulation; allele-specific expression; allopolyploidy
8.  The coffee genome hub: a resource for coffee genomes 
Nucleic Acids Research  2014;43(Database issue):D1028-D1035.
The whole genome sequence of Coffea canephora, the perennial diploid species known as Robusta, has been recently released. In the context of the C. canephora genome sequencing project and to support post-genomics efforts, we developed the Coffee Genome Hub, an integrative genome information system that allows centralized access to genomics and genetics data and analysis tools to facilitate translational and applied research in coffee. We provide the complete genome sequence of C. canephora along with gene structure, gene product information, metabolism, gene families, transcriptomics, syntenic blocks, genetic markers and genetic maps. The hub relies on generic software (e.g. GMOD tools) for easy querying, visualizing and downloading research data. It includes a Genome Browser enhanced by a Community Annotation System, enabling the improvement of automatic gene annotation through an annotation editor. In addition, the hub aims at developing interoperability among other existing South Green tools managing coffee data (phylogenomics resources, SNPs) and/or supporting data analyses with the Galaxy workflow manager.
PMCID: PMC4383925  PMID: 25392413
9.  SNiPloid: A Utility to Exploit High-Throughput SNP Data Derived from RNA-Seq in Allopolyploid Species 
High-throughput sequencing is a common approach to discover SNP variants, especially in plant species. However, methods to analyze predicted SNPs are often optimized for diploid plant species whereas many crop species are allopolyploids and combine related but divergent subgenomes (homoeologous chromosome sets). We created a software tool, SNiPloid, that exploits and interprets putative SNPs in the context of allopolyploidy by comparing SNPs from an allopolyploid with those obtained in its modern-day diploid progenitors. SNiPloid can compare SNPs obtained from a sample to estimate the subgenome contribution to the transcriptome or SNPs obtained from two polyploid accessions to search for SNP divergence.
PMCID: PMC3791807  PMID: 24163691
10.  An Improved Method for TAL Effectors DNA-Binding Sites Prediction Reveals Functional Convergence in TAL Repertoires of Xanthomonas oryzae Strains 
PLoS ONE  2013;8(7):e68464.
Transcription Activators-Like Effectors (TALEs) belong to a family of virulence proteins from the Xanthomonas genus of bacterial plant pathogens that are translocated into the plant cell. In the nucleus, TALEs act as transcription factors inducing the expression of susceptibility genes. A code for TALE-DNA binding specificity and high-resolution three-dimensional structures of TALE-DNA complexes were recently reported. Accurate prediction of TAL Effector Binding Elements (EBEs) is essential to elucidate the biological functions of the many sequenced TALEs as well as for robust design of artificial TALE DNA-binding domains in biotechnological applications. In this work a program with improved EBE prediction performances was developed using an updated specificity matrix and a position weight correction function to account for the matching pattern observed in a validation set of TALE-DNA interactions. To gain a systems perspective on the large TALE repertoires from X. oryzae strains, this program was used to predict rice gene targets for 99 sequenced family members. Integrating predictions and available expression data in a TALE-gene network revealed multiple candidate transcriptional targets for many TALEs as well as several possible instances of functional convergence among TALEs.
PMCID: PMC3711819  PMID: 23869221
11.  The Banana Genome Hub 
Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allow harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved existing banana systems (e.g. web front end, query builder), integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), made other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) interoperable with the banana hub and, finally, generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several use cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of interoperability for data integration between existing information systems.
Database URL:
PMCID: PMC3662865  PMID: 23707967
12.  Deep Sequencing Reveals Differences in the Transcriptional Landscapes of Fibers from Two Cultivated Species of Cotton 
PLoS ONE  2012;7(11):e48855.
Cotton (Gossypium) fiber is the most prevalent natural product used in the textile industry. The two major cultivated species, G. hirsutum (Gh) and G. barbadense (Gb), are allotetraploids with contrasting fiber quality properties. To better understand the molecular basis for their fiber differences, EST pyrosequencing was used to document the fiber transcriptomes at two key development stages, 10 days post anthesis (dpa), representing the peak of fiber elongation, and 22 dpa, representing the transition to secondary cell wall synthesis. The 617,000 high quality reads (89% of the total 692,000 reads) from 4 libraries were assembled into 46,072 unigenes, comprising 38,297 contigs and 7,775 singletons. Functional annotation of the unigenes together with comparative digital gene expression (DGE) revealed a diverse set of functions and processes that were partly linked to specific fiber stages. Globally, 2,770 contigs (7%) showed differential expression (>2-fold) between 10 and 22 dpa (irrespective of genotype), with 70% more highly expressed at 10 dpa, while 2,248 (6%) were differentially expressed between the genotypes (irrespective of stage). The most significant genes with differential DGE at 10 dpa included expansins and lipid transfer proteins (higher in Gb), while at 22 dpa tubulins, cellulose, and sucrose synthases showed higher expression in Gb. DGE was compared with expression data of 10 dpa-old fibers from Affymetrix microarrays. Among 543 contigs showing differential expression on both platforms, 74% were consistent in being either over-expressed in Gh (242 genes) or in Gb (161 genes). Furthermore, the unigene set served to identify 339 new SSRs and close to 21,000 inter-genotypic SNPs. Subsets of 88 SSRs and 48 SNPs were validated through mapping and added 65 new loci to a RIL genetic map. The new set of fiber ESTs and the gene-based markers complement existing available resources useful in basic and applied research for crop improvement in cotton.
PMCID: PMC3499527  PMID: 23166598
13.  Transposable Elements Are a Major Cause of Somatic Polymorphism in Vitis vinifera L. 
PLoS ONE  2012;7(3):e32973.
Through multiple vegetative propagation cycles, clones accumulate mutations in somatic cells that are at the origin of clonal phenotypic diversity in grape. Clonal diversity provided clones such as Cabernet-Sauvignon N° 470, Chardonnay N° 548 and Pinot noir N° 777, which all produce wines of superior quality. The economic impact of clonal selection is therefore very high, since approx. 95% of the grapevines produced in French nurseries originate from the French clonal selection. In this study we provide the first broad description of polymorphism in different clones of a single grapevine cultivar, Pinot noir, in the context of vegetative propagation. Genome sequencing was performed using 454 GS-FLX methodology without a priori assumptions, in order to identify and quantify for the first time molecular polymorphisms responsible for clonal variability in grapevine. New generation sequencing (NGS) was used to compare a large portion of the genome of three Pinot noir clones selected for their phenotypic differences. Reads obtained with NGS and the sequence of Pinot noir ENTAV-INRA® 115 sequenced by Velasco et al. were aligned on the PN40024 reference sequence. We then searched for molecular polymorphism between clones. Three types of polymorphism (SNPs, Indels, mobile elements) were found, but insertion polymorphism generated by mobile elements of many families accounted for the largest share of mutational events with respect to clonal variation. Mobile elements inducing insertion polymorphism in the genome of Pinot noir were identified and classified, and a list is presented in this study as potential markers for the study of clonal variation. Among these, the dynamics of four mobile elements with a high polymorphism level were analyzed and insertion polymorphism was confirmed in all the Pinot clones registered in France.
PMCID: PMC3299709  PMID: 22427919
14.  SNiPlay: a web-based tool for detection, management and analysis of SNPs. Application to grapevine diversity projects 
BMC Bioinformatics  2011;12:134.
High-throughput re-sequencing, new genotyping technologies and the availability of reference genomes allow the extensive characterization of Single Nucleotide Polymorphisms (SNPs) and insertion/deletion events (indels) in many plant species. The rapidly increasing amount of re-sequencing and genotyping data generated by large-scale genetic diversity projects requires the development of integrated bioinformatics tools able to efficiently manage, analyze, and combine these genetic data with genome structure and external data.
In this context, we developed SNiPlay, a flexible, user-friendly and integrative web-based tool dedicated to polymorphism discovery and analysis. It integrates:
1) a pipeline, freely accessible through the internet, combining existing software with new tools to detect SNPs and to compute different types of statistical indices and graphical layouts for SNP data. From standard sequence alignments, genotyping data or Sanger sequencing traces given as input, SNiPlay detects SNPs and indel events and outputs submission files for the design of Illumina's SNP chips. Subsequently, it sends sequences and genotyping data into a series of modules in charge of various processes: physical mapping to a reference genome, annotation (genomic position, intron/exon location, synonymous/non-synonymous substitutions), SNP frequency determination in user-defined groups, haplotype reconstruction and network, linkage disequilibrium evaluation, and diversity analysis (Pi, Watterson's Theta, Tajima's D).
Furthermore, the pipeline allows the use of external data (such as phenotype, geographic origin, taxa, stratification) to define groups and compare statistical indices.
2) a database storing polymorphisms, genotyping data and grapevine sequences released by public and private projects. It allows the user to retrieve SNPs using various filters (such as genomic position, missing data, polymorphism type, allele frequency), to compare SNP patterns between populations, and to export genotyping data or sequences in various formats.
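The diversity indices named in item 1 (Pi, Watterson's Theta, Tajima's D) can be sketched as follows. This is a minimal illustration using the standard textbook formulas (Tajima 1989) on a small set of aligned, equal-length sequences without gaps. It is not SNiPlay's own implementation, and all function names are hypothetical. Values are per sequence rather than per site.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Number of alignment columns showing more than one allele (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """Pi: mean number of differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / len(pairs)


def watterson_theta(seqs):
    """Watterson's estimator: S divided by the harmonic number a1."""
    a1 = sum(1.0 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a1


def tajimas_d(seqs):
    """Tajima's D: standardized difference between Pi and Watterson's Theta."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if s == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (nucleotide_diversity(seqs) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

For example, `tajimas_d(["AAAA", "AAAT", "AATT", "AAAA"])` is positive because Pi exceeds Watterson's Theta for these four toy sequences, which indicates an excess of intermediate-frequency variants relative to neutral expectation.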
Our experiments on grapevine genetic projects showed that SNiPlay allows geneticists to rapidly obtain advanced results in several key research areas of plant genetic diversity. Both the management and treatment of large amounts of SNP data are rendered considerably easier for end-users through automation and integration. Current developments are taking into account new advances in high-throughput technologies.
SNiPlay is available at:
PMCID: PMC3102043  PMID: 21545712
15.  Patterns of sequence polymorphism in the fleshless berry locus in cultivated and wild Vitis vinifera accessions 
BMC Plant Biology  2010;10:284.
Unlike in tomato, little is known about the genetic and molecular control of fleshy fruit development of perennial fruit trees like grapevine (Vitis vinifera L.). Here we present the study of the sequence polymorphism in a 1 Mb grapevine genome region at the top of chromosome 18 carrying the fleshless berry mutation (flb) in order, first to identify SNP markers closely linked to the gene and second to search for possible signatures of domestication.
In total, 62 regions (17 SSR, 3 SNP, 1 CAPS and 41 re-sequenced gene fragments) were scanned for polymorphism along a 3.4 Mb interval (85,127-3,506,060 bp) at the top of chromosome 18, in both V. vinifera cv. Chardonnay and a genotype carrying the flb mutation, V. vinifera cv. Ugni Blanc mutant. A nearly complete homozygosity in Ugni Blanc (wild and mutant forms) and an expected high level of heterozygosity in Chardonnay were revealed. Experiments using qPCR and BAC FISH confirmed the observed homozygosity. Under the assumption that flb could be one of the genes involved in the domestication syndrome of grapevine, we sequenced 69 gene fragments, spread over the flb region, representing 48,874 bp in a highly diverse set of cultivated and wild V. vinifera genotypes, to identify possible signatures of domestication in the cultivated V. vinifera compartment. We identified eight gene fragments presenting a significant deviation from neutrality of the Tajima's D parameter in the cultivated pool. One of these also showed higher nucleotide diversity in the wild compartments than in the cultivated compartments. In addition, SNPs significantly associated with berry weight variation were identified in the flb region.
We observed the occurrence of a large homozygous region in a non-repetitive region of the otherwise highly heterozygous grapevine genome and propose a hypothesis for its formation. We demonstrated the feasibility of applying BAC FISH to the very small grapevine chromosomes and provided a specific probe for the identification of chromosome 18 on a cytogenetic map. We identified genes showing putative signatures of selection and SNPs significantly associated with berry weight variation in the flb region. In addition, we provided the community with 554 SNPs at the top of chromosome 18 for the development of a genotyping chip for future fine mapping of the flb gene in an F2 population when available.
PMCID: PMC3022909  PMID: 21176183
16.  BLAST-EXPLORER helps you building datasets for phylogenetic analysis 
The right sampling of homologous sequences for phylogenetic or molecular evolution analyses is a crucial step, the quality of which can have a significant impact on the final interpretation of the study. There is no single way of constructing datasets suitable for phylogenetic analysis, because this task intimately depends on the scientific question we want to address. Moreover, database mining software such as BLAST, which is routinely used for searching homologous sequences, is not specifically optimized for this task.
To fill this gap, we designed BLAST-Explorer, an original and friendly web-based application that combines a BLAST search with a suite of tools that allows interactive, phylogenetic-oriented exploration of the BLAST results and flexible selection of homologous sequences among the BLAST hits. Once the selection of the BLAST hits is done using BLAST-Explorer, the corresponding sequence can be imported locally for external analysis or passed to the phylogenetic tree reconstruction pipelines available on the platform.
BLAST-Explorer provides a simple, intuitive and interactive graphical representation of the BLAST results and allows selection and retrieval of the BLAST hit sequences based on a wide range of criteria. Although BLAST-Explorer primarily aims at helping the construction of sequence datasets for further phylogenetic study, it can also be used as a standard BLAST server with enriched output. BLAST-Explorer is available at
PMCID: PMC2821324  PMID: 20067610
17.  The Generation Challenge Programme Platform: Semantic Standards and Workbench for Crop Science 
The Generation Challenge programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making.
PMCID: PMC2375972  PMID: 18483570
18.  SAT, a flexible and optimized Web application for SSR marker development 
BMC Bioinformatics  2007;8:465.
Simple Sequence Repeats (SSRs), or microsatellites, are among the most powerful genetic markers known. A common method for the development of SSR markers is the construction of genomic DNA libraries enriched for SSR sequences, followed by DNA sequencing. However, designing optimal SSR markers from bulk sequence data is a laborious and time-consuming process.
SAT (SSR Analysis Tool) is a user-friendly Web application developed to minimize tedious manual operations and reduce errors. This tool facilitates the integration, analysis and display of sequence data from SSR-enriched libraries.
SAT is designed to successively perform base calling and quality evaluation of chromatograms, eliminate cloning vector, adaptors and low quality sequences, detect chimera or partially digested sequences, search for SSR motifs, cluster and assemble the redundant sequences, and design SSR primer pairs. An additional virtual PCR step establishes primer specificity. Users may modify the different parameters of each step of the SAT analysis.
Although certain steps are compulsory, such as SSR motifs search and sequence assembly, users do not have to run the entire pipeline, and they can choose selectively which steps to perform. A database allows users to store and query results, and to redo individual steps of the workflow.
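The SSR motif search step described above can be illustrated with a short sketch. It assumes a simple regular-expression scan for perfect tandem repeats. The function name `find_ssrs` and its default thresholds are hypothetical and are not taken from SAT, whose actual search method is not described in this abstract.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    Illustrative scan only: imperfect or compound microsatellites
    are not handled.
    """
    # A unit of min_unit..max_unit bases followed by at least
    # (min_repeats - 1) further copies of itself. The lazy quantifier
    # makes the shortest possible unit be tried first.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        # Skip homopolymer runs such as AAAAAAAA reported as (AA)n.
        if len(set(unit)) == 1:
            continue
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits
```

For instance, `find_ssrs("TTGAGAGAGAGACC")` returns `[(2, "GA", 5)]`: a GA dinucleotide repeated five times starting at 0-based position 2. Primer design would then target the unique sequence flanking each hit.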
The SAT Web application is available at , and a standalone command-line version is also freely downloadable. Users must send an email to the SAT administrator to request a login and password.
PMCID: PMC2216045  PMID: 18047663

