Chinese hamster ovary (CHO)-derived immortalized cell lines are the preferred host system for therapeutic protein production. CHO cell line engineering has made remarkable progress in optimizing products and titers by manipulating single genes [2] and by selecting clones with desirable traits following various treatments (e.g., mutagenesis or media adjustment). This progress has been accomplished without the availability of genomic sequences. Here, we present a publicly available annotated genome sequence for a CHO cell line, which represents yet another tool in the bioprocessing toolbox. We do not anticipate that this draft sequence will directly improve product titers to the extent achieved through careful screens in the past. However, the CHO-K1 genomic sequence will facilitate the design of targeted genetic manipulations to aid in cell-line engineering, help elucidate the components underlying poorly characterized phenotypes, and allow for more comprehensive deployment of “omic” tools for CHO-K1 and related cell lines.
A genome-scale analysis of the glycosylation genes in the CHO-K1 genome identifies homologs to 99% of the human glycosylation-associated transcripts, with 53% of them expressed. The high coverage of homologs provides a unique opportunity for glycoform manipulation in CHO cells. Indeed, the high variability of gene silencing has led to the generation of the diverse selection of Lec mutant cell lines [20]. Moreover, it has been shown that clonal selection can lead to a sub-population of CHO cells expressing genes, such as GGTA1, that were thought to be inactive [31]. This result suggests that many other unexpressed glycosylation genes in the CHO genome could potentially be activated or silenced to alter the repertoire of glycan structures produced by CHO cells. In addition, the genome sequence will facilitate the development of genome-scale metabolic models for CHO cells. Such models allow for the assessment of the network-level effects of cell line treatments, and have been successful at predicting optimal designs for bioprocess optimization in prokaryotes [52–54].
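As an illustration of how the homolog coverage and expression fractions discussed above could be tallied from an annotation table, a minimal Python sketch follows. It is not the pipeline used in this work: the `GeneRecord` fields, the gene names, the expression values, and the threshold of 1.0 are all hypothetical, and the expressed fraction is assumed to be computed relative to the genes that have a CHO-K1 homolog.

```python
from dataclasses import dataclass


@dataclass
class GeneRecord:
    human_gene: str     # human glycosylation gene symbol (placeholder names below)
    cho_homolog: bool   # True if a CHO-K1 homolog was identified
    expression: float   # hypothetical expression value in CHO-K1 (arbitrary units)


def summarize(records, threshold=1.0):
    """Tally homolog coverage and expression for a list of GeneRecord objects.

    The threshold is an assumed cutoff; a gene counts as expressed when its
    value is greater than or equal to it.
    """
    with_homolog = [r for r in records if r.cho_homolog]
    expressed = [r for r in with_homolog if r.expression >= threshold]
    # Genes with a homolog but no detectable expression are candidates
    # for activation studies.
    silent = sorted(r.human_gene for r in with_homolog if r.expression < threshold)
    return {
        "homolog_fraction": len(with_homolog) / len(records) if records else 0.0,
        "expressed_fraction": len(expressed) / len(with_homolog) if with_homolog else 0.0,
        "silent_candidates": silent,
    }


# Toy input: made-up values for illustration only, not measurements.
records = [
    GeneRecord("GENE_A", True, 25.0),
    GeneRecord("GENE_B", True, 0.0),
    GeneRecord("GENE_C", True, 12.5),
    GeneRecord("GENE_D", True, 0.2),
    GeneRecord("GENE_E", False, 0.0),
]

summary = summarize(records)
print(summary)
```

The `silent_candidates` list corresponds to the class of genes discussed above: those with a homolog in the genome but no detectable expression, which may be of interest for activation experiments.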
The genome of CHO cells can also provide insight into less well-characterized phenotypes. For example, the global analysis of viral susceptibility genes in the CHO genome demonstrates that key plasma membrane receptor genes, cell adhesion molecules (CAMs), and genes involved in T-cell activation and macromolecular assembly are not expressed in CHO-K1. Furthermore, the lack of expression of several key viral entry receptors for HSV-1, HIV, HBV, and pseudorabies virus opens up the possibility of an in-depth analysis of CHO cell resistance to viral infection. In addition, we found that several key regulatory molecules, such as histone factors, lack expression in CHO-K1. This analysis demonstrates that the genome sequence can be integrated with omic data analysis to generate hypotheses that guide further study into poorly characterized phenotypes of CHO cells.
The CHO-K1 genome should facilitate the interpretation of various omic data types. However, it is important to note that CHO-K1 is an ancestral cell line from which many CHO cell lines have been derived. During the course of the rather stringent manipulations involved in optimizing cell lines (e.g., selection for growth in different media compositions and switching cells from adherent culture to suspension-adapted growth), many genomic changes (e.g., SNPs, indels and other structural variations) have likely occurred due to the inherent genomic instability of these cell lines. Moreover, the cell lines derived from CHO-K1 that are widely used in the industry (e.g., DUKX-B11 and DG44) may contain additional genetic changes from chemical and radiation mutagenesis [5, 6]. Thus, this genome sequence of the ancestral K1 cell line should not be considered directly representative of all CHO cell lines. However, the full-coverage draft genomic sequence of the ancestral K1 cell line will serve as a foundation to support efforts in sequencing other CHO cell lines. These additional genomic sequences will provide a context for transcriptomic and proteomic data interpretation in the respective cell lines. They will also facilitate the identification or design of other potential targets or tools for cell line engineering (e.g., miRNAs and siRNAs).
The availability of the CHO-K1 genomic sequence provides a valuable resource for genome-scale CHO-cell research and will aid in manufacturing applications. However, we expect that the quality of the genomic sequence will be iteratively improved over time as more genomic information becomes available for CHO-K1 and other CHO cell lines. Moreover, we anticipate that characterizing the effects of sequence variations on gene products and expression will improve the functional annotation of these cell lines. These improvements may enhance the application of CHO-cell engineering and other techniques to improve protein production and quality.