Inferring the genome organization and gene content of an extinct species has the potential to provide detailed information about the recent evolution of species descended from it. If we know what was present in the genome of an ancestor, we can deduce how a current-day descendant differs from it. We can then ask questions about how it came to be different. The most recent changes in a genome are often the most interesting ones, because they reflect the most recent (or even current) evolutionary pressures acting on that genome
[1],
[2].
Yeast species offer the potential for the precise reconstruction of ancestral genomes, because many genomes have been sequenced and they show extensive colinearity of gene order among species
[3]–
[6]. As the number of sequenced genomes from related species rises, so does the precision with which we can reconstruct their history. In this study we compare the genomes of a group of species in the subphylum Saccharomycotina, spanning an evolutionary time-depth that is comparable to that of the vertebrates
[7]. A whole-genome duplication (WGD) event occurred during the evolution of this subphylum
[8], and we can compare the genomes of several species (including
S. cerevisiae) that are descended from this event to the genomes of several species that branched off before the WGD occurred. We focus on an ancestor that existed approximately 100–200 Mya, at the point immediately before the WGD occurred. The evolutionary period beginning with this ancestor corresponds to a time during which the
S. cerevisiae lineage became increasingly adapted to rapid fermentative growth
[9],
[10] and extensive rearrangement of the genome occurred (including the deletion of thousands of redundant copies of duplicated genes)
[11].
Previous studies in other systems have employed both manual and computational approaches to reconstructing ancestral genomes. One of the most successful applications of computational methods has been the estimation of the ancestral order of orthologous genes in the common ancestor of 12 Drosophila species
[12],
[13]. Ancestral reconstruction is more difficult when ancient polyploidizations are present
[14]. In studies of the 2R duplications in vertebrates, for example, the emphasis has been on establishing the ancestral gene content of paralogous chromosomal regions rather than on their precise gene order
[15],
[16]. We chose to use a manual, parsimony-based approach to reconstructing the yeast ancestor at the point of WGD. The manual approach has the attractions of being tractable (whereas computational methods are still under development
[17],
[18]), of providing an independent result to which computational results can be compared, and of forcing us to examine every rearrangement event without prejudice as to what mechanism might have caused it.
Sankoff and colleagues
[14],
[17],
[18] have developed computational methods that aim to reconstruct ancestral gene order in datasets that include polyploidizations. In recent work
[18], they evaluated their ‘guided genome halving’ (GGH) algorithm by comparing its results to ours, using as a ‘gold standard’ a preliminary version of the manually derived ancestral yeast gene order that we report here. As currently implemented, the GGH algorithm can consider input from only a single post-WGD genome and 1–2 non-WGD outgroups, and considers only genes that are duplicated in the post-WGD genome.
Inferring the set of genes that existed in a yeast ancestor, and the order of those genes along the chromosomes, is of interest from both genome-evolutionary and organismal-evolutionary standpoints. Knowing the ancestral gene order enables us to trace all the inter- and intra-chromosomal rearrangements that occurred
en route from this ancestor to the current
S. cerevisiae genome, which is informative about the molecular mechanisms of evolutionary genome rearrangement and is also phylogenetically informative. Knowing the ancestral gene content allows us to identify genes that have been added to, or lost from, the
S. cerevisiae genome during the past 100 Myr. Previous studies have shown that changes in gene content can provide a strong indication of changing evolutionary circumstances, either in cases of gene loss (such as the losses of
GAL,
DAL and
BNA genes in
Candida glabrata [1],
[19],
[20]) or in cases of gene gain (such as the
ADH2 and
URA1 genes of
S. cerevisiae [9],
[21],
[22]). Even though it may not be possible to conclude that any particular gene gain was adaptive, the clear links between the functions of the gained genes
ADH2 and
URA1 and the adaptation of
S. cerevisiae to a fermentative lifestyle
[23] suggested to us that a systematic search for all the genes that were gained by
S. cerevisiae since WGD would be worthwhile.