Viruses are ubiquitous parasites of all cellular life forms. As a group, they are united by their intracellular reproduction and reliance on the host cell translation system, but not necessarily by common origin [1]. Indeed, not a single gene is represented in the genomes of all known viruses, although a small group of ‘viral hallmark genes’ encoding some of the key proteins involved in genome replication and virion structure formation is shared by extremely diverse subsets of viruses [2, 3]. Thus, viruses as a class of biological agents are not monophyletic, at least not within the traditional concept of monophyly. Nevertheless, several large groups of viruses infecting diverse hosts do appear to share common ancestry in the strict sense – that is, to have evolved from a single ancestral virus – which is indicated by the conservation of sets of genes encoding proteins responsible for many functions essential for virus reproduction.
One of the most expansive apparently monophyletic viral divisions currently includes six families of eukaryotic viruses with large DNA genomes that are collectively denoted nucleo-cytoplasmic large DNA viruses (NCLDV; table ) [4, 5]. The best known of these viral families, Poxviridae, is a large assemblage of animal viruses that includes a major human pathogen, the smallpox virus, important animal pathogens, such as rabbit myxoma virus, as well as vaccinia virus, one of the best-characterized model systems in molecular biology [6, 7, 8]. Another family of the NCLDV that recently became the focus of much attention and fascination is the Mimiviridae, which so far includes two closely related giant viruses isolated from Acanthamoeba – Mimivirus and Mamavirus. With genomes slightly larger than 1 megabase, these viruses are the undisputed genome size record holders in the virosphere: their genomes exceed those of numerous parasitic bacteria and approach those of the simplest free-living prokaryotes [9, 10, 11, 12, 13].
The NCLDV infect animals and diverse unicellular eukaryotes, and either replicate exclusively in the cytoplasm of the host cells, or possess both cytoplasmic and nuclear stages in their life cycle (table ). The NCLDV typically do not strongly depend on the host replication or transcription systems for completing their replication [6, 14]. In line with this relative independence of virus reproduction from the host cell functions (apart from translation, of course), the NCLDV encode several conserved proteins that mediate most of the processes essential for viral reproduction. These key proteins include DNA polymerases, helicases and primases responsible for DNA replication, Holliday junction resolvases and topoisomerases involved in genome DNA processing and maturation, transcription factors that function in transcription initiation and elongation, ATPase pumps mediating DNA packaging, chaperones involved in capsid assembly, and capsid proteins themselves [4, 5, 15]. Although several viral hallmark genes [3] are shared by NCLDV and other large DNA viruses, such as herpesviruses and baculoviruses, the conservation of the entire set of core genes clearly demarcates the NCLDV as a distinct class of viruses [5].
Recently, a novel giant virus, denoted Marseillevirus, has been isolated from Acanthamoeba. Genome analysis of Marseillevirus indicated that it represents a putative new family of NCLDV that appears to be distantly related to iridoviruses and ascoviruses [16]. In addition, comparative genomic analysis revealed probable gene exchange between Marseillevirus and Mimiviruses, an observation that suggests a role for amoebae as a ‘melting pot’ of giant virus evolution.
We performed a new comparison of the updated collection of NCLDV genomes and constructed clusters of orthologous NCLDV genes (NCVOGs), 177 of which were represented in two or more viral families [15]. The NCVOGs were employed for phylogenetic analysis and for reconstruction of the ancestral viral gene set. Here we review the results of these analyses in the context of the origin and evolution of the NCLDV, and attempt to decipher the origins of the NCLDV genes that are mapped to the last common ancestral virus.
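
To make the criterion "represented in two or more viral families" concrete, the following minimal Python sketch counts the families that contribute genes to each cluster and keeps the clusters that meet a family threshold. It is an illustration only, not the procedure used to build the NCVOGs: the cluster identifiers, the toy assignments, and the function names `families_per_cluster` and `shared_clusters` are all hypothetical, and the family names are used purely as labels.

```python
from collections import defaultdict


def families_per_cluster(assignments):
    """Map each cluster id to the set of viral families that have a member gene in it."""
    families = defaultdict(set)
    for cluster_id, family in assignments:
        families[cluster_id].add(family)
    return dict(families)


def shared_clusters(assignments, min_families=2):
    """Return the sorted ids of clusters represented in at least `min_families` families."""
    by_cluster = families_per_cluster(assignments)
    return sorted(
        cluster_id
        for cluster_id, fams in by_cluster.items()
        if len(fams) >= min_families
    )


# Hypothetical (cluster id, family) pairs; these are NOT real NCVOG data.
toy_assignments = [
    ("toy_cluster_A", "Poxviridae"),
    ("toy_cluster_A", "Mimiviridae"),
    ("toy_cluster_A", "Iridoviridae"),
    ("toy_cluster_B", "Mimiviridae"),   # two genes from the same family
    ("toy_cluster_B", "Mimiviridae"),   # still count as one family
    ("toy_cluster_C", "Iridoviridae"),
    ("toy_cluster_C", "Ascoviridae"),
]

shared = shared_clusters(toy_assignments)
print(shared)
```

In this toy input, the cluster whose genes all come from a single family is excluded, because paralogs or multiple isolates within one family do not increase the family count. Raising `min_families` narrows the selection toward more widely conserved clusters.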