The Solanaceae family encompasses a number of species of agronomic and ornamental importance. With regard to cultivation for food, in 2003 potato was the world's fifth largest crop by worldwide production acreage, and the solanaceous vegetables tomato, eggplant, and pepper ranked 11th, 19th, and 22nd, respectively [1]. Species grown for ornamental purposes include petunia and Nicotiana species. While not consumed for food, these horticultural species are a substantial component of the US agricultural economy. For example, petunia represents more than $148M in output per year in the US [2]. Tobacco is another crop of significant economic importance, with $1.6B in crop value in 2003 [3]. A close relative of tobacco, Nicotiana benthamiana, has been utilized as an experimental model for viral research and disease resistance studies. Coupled with the robust ability of virus-induced gene silencing to silence transcripts [4], N. benthamiana has emerged as a model species for disease resistance research.
The Solanaceae have been bred and developed for a variety of purposes. Potato has been bred for tubers (modified stems), while tomato, pepper, and eggplant have been bred for enhanced fruit production. Likewise, petunia has been bred and selected for floral phenotypes, while tobacco has been bred for leaf size. Although modern varieties have been selected to accentuate particular morphological features, these species share common taxonomic features of the Solanaceae, such as alternate leaves, flower parts in fives, and fruit as a berry or capsule. Compared with other plant families such as the Poaceae, the range of genome sizes among solanaceous species is fairly narrow, from 900 to 4600 Mb per haploid genome [5]. Early studies of the Solanaceae genome revealed conservation of gene content among potato, tomato, tobacco, petunia, and eggplant. These studies employed relatively small-scale cross-hybridization experiments using cDNA and random genomic DNA clones [6], in which a set of 20 tomato cDNA clones was hybridized with a panel of solanaceous species including Lycopersicon and Nicotiana. For the cDNA clones, there was strong hybridization across the Solanaceae; however, with the genomic clones (50 in total), there was a reduced degree of cross-hybridization with the non-Lycopersicon species. These data suggested conservation among the coding sequences, while the non-coding sequences had undergone substantial divergence.
Conserved gene content prompts the question of conserved gene order, i.e., synteny, across the Solanaceae. A number of solanaceous species have a base chromosome number of 12, including the main vegetable crop species potato, tomato, pepper, and eggplant. Using markers developed from tomato, a strong degree of co-linearity between potato and tomato has been demonstrated, with the differences attributable to paracentric inversions between these two species [7]. Using the same approach in pepper, 18 homologous linkage blocks between tomato and pepper could be identified [9]. In eggplant, tomato markers again revealed syntenic regions between tomato and eggplant [10]. While these synteny studies utilized anonymous DNA clones as markers, comparative mapping of phenotypes such as fruit morphology [11], pigmentation [12], and disease resistance [13] revealed syntenic mapping of these traits across the Solanaceae.
These early studies relied heavily on cDNA and random genomic clones. The advent of high-throughput sequencing of Expressed Sequence Tags (ESTs) [14] has generated hundreds of thousands of sequences for solanaceous species. For this study, a total of 441,154 ESTs were collected from the public database dbEST, representing the solanaceous species tomato (162,621), potato (189,864), pepper (29,894), tobacco (26,497), and N. benthamiana (26,918). The available solanaceous ESTs, along with Expressed Transcripts (ETs) available in GenBank, can be clustered into gene indices [15] that represent a non-redundant set of transcripts and facilitate analysis of redundant EST collections. Using potato and tomato gene indices, a comparative analysis of tomato and potato ESTs revealed that approximately 80% of the potato ESTs had a significant sequence match with a tomato EST at the nucleotide level (E-value cutoff of 10^-10).
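As a minimal illustration of how a proportion such as the 80% figure above can be derived, the following Python sketch counts the query sequences that have at least one hit at or below an E-value cutoff. It assumes the similarity search results are available as 12-column BLAST tabular output, with the query identifier in the first column and the E-value in the eleventh. The function name, the identifiers, and the toy records are hypothetical and are not taken from the study.

```python
from typing import Iterable


def fraction_with_match(hit_lines: Iterable[str],
                        total_queries: int,
                        evalue_cutoff: float = 1e-10) -> float:
    """Fraction of query sequences with at least one hit at or below the cutoff.

    Assumes tab-separated records in the 12-column BLAST tabular layout:
    column 1 is the query ID and column 11 is the E-value.
    """
    matched = set()
    for line in hit_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id = fields[0]
        evalue = float(fields[10])
        if evalue <= evalue_cutoff:
            matched.add(query_id)  # each query is counted once
    return len(matched) / total_queries


# Toy example: three hypothetical potato queries with hits, four queries in total.
toy_hits = [
    "pot1\ttom1\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-50\t300",
    "pot2\ttom2\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-5\t50",
    "pot3\ttom1\t90.0\t150\t15\t0\t1\t150\t1\t150\t1e-10\t150",
]
toy_fraction = fraction_with_match(toy_hits, total_queries=4)
```

In this toy input, pot1 and pot3 pass the cutoff and pot2 does not, so two of the four queries count as matched. Queries with no hits at all contribute only through `total_queries`, which is why the total is passed in rather than inferred from the hit records.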
In this study, we report the construction and comparative analyses of gene indices for six solanaceous species (tomato, potato, tobacco, pepper, petunia, and N. benthamiana). These gene indices represent a total of 116,207 non-redundant sequences, which we have utilized to assess sequence conservation among the Solanaceae on a genomic scale. We have significantly extended previous studies on sequence similarity and conservation among these species and have documented more thoroughly the characteristics of the coding portion of the Solanaceae genome. Using computational methods, we have identified putative orthologs among these species and generated a phylogenetic tree to ascertain the relationships and sequence divergence among them. In addition to these computational approaches, we assessed the similarity of expression profiles in mature leaves, using heterologous hybridization to potato cDNA microarrays to experimentally validate the sequence conservation among these species. Comparison of the solanaceous transcripts to the predicted proteomes from the near-complete genome sequences of Arabidopsis and rice, as well as to 21 other plant gene indices, identified solanaceous transcripts without putative homologs, suggesting that a portion of these transcripts have a high likelihood of being unique to the Solanaceae. These analyses provide insight into the overall sequence conservation among eudicots (Arabidopsis and Solanaceae) as well as between the Solanaceae and the monocots (i.e., rice).
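This section does not state which computational method was used to identify putative orthologs. Purely as an illustration, one widely used heuristic is the reciprocal best hit: two sequences from different species are paired when each is the other's top-scoring match. The Python sketch below assumes that pairwise similarity scores are already available as (query, subject, score) tuples. All names and values are hypothetical, and the sketch should not be read as the procedure used in this study.

```python
def best_hits(scores):
    """Map each query to its single highest-scoring subject.

    `scores` is an iterable of (query, subject, score) tuples.
    On ties, the first record encountered is kept.
    """
    best = {}
    for query, subject, score in scores:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs in which each sequence is the other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy example with hypothetical tomato (tom*) and potato (pot*) identifiers.
tomato_vs_potato = [("tom1", "pot1", 300), ("tom1", "pot2", 250), ("tom2", "pot2", 180)]
potato_vs_tomato = [("pot1", "tom1", 310), ("pot2", "tom1", 240), ("pot2", "tom2", 150)]
pairs = reciprocal_best_hits(tomato_vs_potato, potato_vs_tomato)
```

Here tom1 and pot1 select each other and form a pair, whereas tom2 selects pot2 but pot2 prefers tom1, so tom2 is left unpaired. The reciprocity requirement makes the heuristic conservative, at the cost of missing orthologs in gene families with recent duplications.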