The genomic structural analysis (DNA content, PFGE, and aCGH analysis) indicated that YJS329 retained a diploid karyotype and had far fewer structural polymorphisms than the bioethanol strain JAY270 and some other industrial strains [1]. We also sequenced the genome of YJSH2 (a haploid spore derived from the same tetrad as YJSH1) using the Illumina paired-end method. After mapping the YJSH2 reads to the YJSH1 genome, we estimated that the YJS329 genome had about 0.6 SNP/kb between allelic regions of homologous chromosomes (unpublished data). These results indicated that the YJS329 strain was genetically very stable, a desirable characteristic for industrial practice. Although S288c has been widely used in scientific research, its genome seems to be more plastic because of its high number of Ty elements [31]. High expression of Ty element genes in the S288c-derived strain BYZ1 was confirmed to result from a dosage effect (Additional file 4). The duplicated region on chromosome 4 in BYZ1 is probably the result of chromosomal translocations by ectopic recombination mediated by the flanking Ty elements. Strikingly, no dosage-compensation mechanism appeared to normalize the expression of each gene: the higher expression (1.59-fold) of this duplicated region almost matched the higher gene dose (1.5-fold). These results indicated that spontaneous Ty-driven rearrangements could be quite common and, if ignored, could easily lead to incorrect experimental results in genetic studies, especially for S288c-derived strains.
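As an illustrative check of the dosage argument above, the short Python sketch below divides the observed expression fold change by the gene-dose fold change. The function name `expression_per_copy` is hypothetical and is not part of the original analysis; only the 1.59-fold and 1.5-fold values come from the text. A ratio near 1 is what one would expect if expression simply scales with copy number, whereas full compensation would pull the ratio down toward 1/1.5.

```python
def expression_per_copy(expression_fold: float, dose_fold: float) -> float:
    """Expression fold change normalized by gene-dose fold change.

    Hypothetical helper for illustration only:
    ~1.0   -> expression scales with copy number (no compensation)
    ~1/dose_fold -> expression fully normalized (complete compensation)
    """
    if dose_fold <= 0:
        raise ValueError("dose_fold must be positive")
    return expression_fold / dose_fold


# Values reported in the text for the duplicated region on chromosome 4 of BYZ1.
observed_ratio = expression_per_copy(1.59, 1.5)

# Reference point: the ratio expected under complete compensation,
# i.e. no change in expression despite the 1.5-fold dose.
full_compensation_ratio = expression_per_copy(1.0, 1.5)

print(f"observed ratio: {observed_ratio:.2f}")
print(f"ratio under full compensation: {full_compensation_ratio:.2f}")
```

With the values reported here, the ratio is about 1.06, which is consistent with little or no dosage compensation in the duplicated region.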
Second-generation sequencing technology has proven to be an effective tool for investigating the genome sequences and structures of yeast strains and has provided many new insights into genome evolution and phenotypic effects [1]. The level of nucleotide polymorphism between YJSH1 and S288c (0.57%) is very similar to the level separating S288c from AWRI1631 (wine strain), YJM789 (pathogenic strain), M22 (vineyard strain), or YPS163 (oak tree strain) [21]; interestingly, however, YJSH1 grouped closely with sake strains, consistent with their geographical distributions. To the best of our knowledge, YJS329 is the first bioethanol strain for which a high-quality assembled genome has been completed. The SNPs and indels that we identified in the aligned regions of YJSH1 and S288c constitute the main genomic differences between these two strains. Mutation frequencies were higher in the intergenic regions than in the coding regions: up to 40% of the SNPs and 88% of the indels were located in intergenic sequences, which account for about 27% of the genome. This pattern could arise from the sequence characteristics of intergenic regions (for example, the abundance of repeated sequences). However, we also observed a considerable number of mutations in ORFs that play important roles in specific physiological activities. Remedying some of these mutations may improve the capabilities or change specific phenotypes of YJS329. A total of 11 ORFs that are absent from the S288c genome were predicted in the YJS329 genome. Remarkably, some of these ORFs may be very similar to those in other Saccharomyces species, including S. paradoxus, S. carlsbergensis, and S. mikatae. Therefore, repeated yeast hybridization events followed by the gradual loss of one of the contributing genomes might have occurred during the evolution of the YJS329 genome. The genotypic characteristics of YJS329 revealed in the present study will enrich the genetic resources of this species and will be valuable for breeding strains with desired phenotypes.
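The intergenic mutation percentages reported above can be turned into a simple fold enrichment by comparing the share of mutations found in intergenic sequence with the share of the genome that intergenic sequence occupies. The sketch below is illustrative only: the function name `fold_enrichment` is hypothetical, the 40% SNP figure is an upper bound ("up to 40%") in the text, and no statistical test is implied.

```python
def fold_enrichment(fraction_of_mutations: float, fraction_of_genome: float) -> float:
    """Ratio of observed to expected share of mutations in a region.

    Hypothetical helper for illustration: a value of 1.0 means mutations
    fall in the region in proportion to its size; values above 1.0 mean
    the region carries more mutations than its size alone would predict.
    """
    if not 0 < fraction_of_genome <= 1:
        raise ValueError("fraction_of_genome must be in (0, 1]")
    return fraction_of_mutations / fraction_of_genome


# Figures taken from the text: intergenic sequence is about 27% of the genome.
INTERGENIC_GENOME_FRACTION = 0.27

snp_enrichment = fold_enrichment(0.40, INTERGENIC_GENOME_FRACTION)    # up to 40% of SNPs
indel_enrichment = fold_enrichment(0.88, INTERGENIC_GENOME_FRACTION)  # 88% of indels

print(f"SNP enrichment in intergenic regions:   {snp_enrichment:.2f}x")
print(f"indel enrichment in intergenic regions: {indel_enrichment:.2f}x")
```

Under these figures, SNPs are at most about 1.5 times and indels about 3.3 times as frequent in intergenic sequence as a uniform distribution would predict, which matches the stronger intergenic bias described for indels.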
The recently developed RNA-Seq approach was used to explore the transcription profiles of the S. cerevisiae strains YJS329 and BYZ1. Among the 2,611 genes differentially expressed between these two strains, many were related to trehalose metabolism pathways, antioxidative factors, and the biosynthesis of membrane components, all of which are closely linked to multiple-stress tolerance and fermentation characteristics. For example, consistent with the higher oleic acid content of the membranes, the genes encoding the subunits of fatty acid synthetase (FAS1), the acetyl-CoA carboxylase gene (ACC1), and the genes that function in fatty-acid desaturation and elongation (ELO1) were considerably up-regulated in YJS329. Our results indicated that most of the differences in the physiological factors were consistent with the mRNA transcription differences between these two strains. Transcription-regulatory network analyses revealed that the transcription factors Msn2/4p, Hap1p, Hsf1p, and Arr1p might contribute prominently to the differentially expressed genes and the phenotypic differences between the two strains. This result was consistent with the observation that trans variation is more common in expression polymorphism in yeast [37]. Nevertheless, the contributions of cis variations to the divergence of mRNA expression and physiological metabolism should not be neglected, because our results confirmed that mutations in the promoters of some important transcription factors and genes could directly affect promoter efficiency. Overall, the molecular mechanisms underlying the mRNA expression differences between YJS329 and BYZ1 might involve: (i) SNPs and indels in the cis-acting elements that affect the expression efficiency of the genes; (ii) the inactivation of transcription factors by SNPs or indels; and (iii) changes in gene copy number. Remarkably, the discrepancies between the transcriptional profile (for example, of Hap1p) and the phenotype in the two strains might reflect variations in the activities of homologous proteins or posttranscriptional regulation, which deserve further assessment. In addition, the expression of some novel ORFs under different conditions has been determined here for the first time. Our study shows that whole-genome sequencing combined with RNA-Seq is a powerful tool for linking genotypes and phenotypes in functional genomic studies.