Conceived and designed the experiments: MH SWH JPV. Performed the experiments: MH. Analyzed the data: CT SWH JPV. Contributed reagents/materials/analysis tools: MP. Wrote the paper: SWH JPV.
The APOBEC3 gene cluster encodes six cytidine deaminases (A3A-C, A3DE, A3F-H) with single-stranded DNA (ssDNA) substrate specificity. For the moment A3A is the only enzyme that can initiate catabolism of both mitochondrial and nuclear DNA. Human A3A expression is initiated from two different methionine codons, M1 or M13, both of which are in adequate but sub-optimal Kozak environments. In the present study, we have analyzed the genetic diversity among A3A genes across a wide range of 12 primates including New World monkeys, Old World monkeys and Hominids. Sequence variation was observed in exons 1–4 in all primates, with up to 31% overall amino acid variation. Importantly, for 3 hominids codon M1 was mutated to a threonine codon or valine codon, while for 5/12 primates strong Kozak M1 or M13 codons were found. Positive selection was apparent along a few branches, in a pattern that differed from the positive selection seen in the carboxy-terminal domain of A3G, which clusters with A3A among human cytidine deaminases. In the course of analyses, two novel non-functional A3A-related fragments were identified, one on chromosome 4 and one 8 kb upstream of the A3 locus. This qualitative and quantitative variation among primate A3A genes suggests that subtle differences in function might ensue as more light is shed on this increasingly important enzyme.
The APOBEC3 seven gene cluster (A3A-C, A3DE, A3F-H) encodes six cytidine deaminases with single-stranded DNA (ssDNA) substrate specificity. Several are clearly innate restriction factors for viruses, notably for retroviruses, hepadnaviruses or parvoviruses. A3G and A3F constituted such a strong barrier for the lentiviral group of retroviruses that all but one encode a vif gene whose protein (Vif) is a powerful antagonist. Hepatitis B virus (HBV) is restricted by at least two A3 enzymes while herpes simplex virus type 1 is restricted by A3C. To date there are no reports of A3 antagonists encoded by these viral genomes. This antiviral role fits with the repeated observation that several A3 genes are up-regulated by type I and II interferons. However, recent work has shown that this antiviral role is just part of a bigger picture. For example, A3A can restrict LINE transposition. Several A3 enzymes can initiate catabolism of mitochondrial DNA, in which uracil N-glycosylase plays a major role downstream of editing. For the moment A3A is the only enzyme that can initiate catabolism of both mitochondrial and nuclear DNA.
These A3 proteins mediate hydrolytic deamination at the C4 position that converts cytosine to uracil in ssDNA, so generating C→U hyper-edited molecules. The active sites of A3 enzymes are characterized by a conserved zinc-finger HAEX23–28PCX2–4C motif. These A3 enzymes show a strong preference for cytidine deamination occurring in a particular dinucleotide context, and a segment carboxy-terminal to the zinc finger impacts this dinucleotide specificity. Human A3A expression is initiated from two different methionine codons (M1 and M13), both of which are in adequate but sub-optimal Kozak environments.
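The zinc-finger motif quoted above can be expressed as a regular expression. The following Python sketch is an illustration only: it assumes the general form H-X-E-X(23–28)-P-C-X(2–4)-C, allowing any residue at the second position rather than only alanine, and the function name and toy sequences are hypothetical and not taken from the study.

```python
import re

# General cytidine deaminase zinc-finger pattern (assumed form):
# H, any residue, E, 23-28 residues, P, C, 2-4 residues, C.
ZINC_FINGER = re.compile(
    r"H(?P<second>[A-Z])E[A-Z]{23,28}PC(?P<spacer>[A-Z]{2,4})C"
)


def find_zinc_finger(protein):
    """Return details of the first motif match in a protein sequence, or None."""
    match = ZINC_FINGER.search(protein.upper())
    if match is None:
        return None
    spacer_length = len(match.group("spacer"))
    return {
        "start": match.start(),                  # 0-based position of the H
        "second_residue": match.group("second"),  # A in HAE, V in HVE, ...
        "spacer_length": spacer_length,
        "variant": f"PCX{spacer_length}C",
    }


# Toy sequences built only to exercise the pattern (not real A3 sequences).
toy_x4 = "MK" + "HAE" + "G" * 25 + "PC" + "AAAA" + "C" + "KL"
toy_x2 = "MK" + "HVE" + "G" * 23 + "PC" + "AA" + "C" + "KL"

print(find_zinc_finger(toy_x4)["variant"])
print(find_zinc_finger(toy_x2)["variant"])
```

Capturing the spacer between PC and the final C makes it straightforward to tell spacer variants of the motif apart when scanning a set of aligned sequences.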
Even though a number of primate genomes are available, only the chimpanzee locus is colinear. For the orang-utan the A3A gene is incomplete while the entire locus contains 12 exon 3/exon 6 domains rather than 11. The A3A gene is missing in the Rhesus macaque assembly, while the marmoset locus doesn't exist per se, sequences being distributed over numerous contigs. As the A3 locus shows signs of extensive gene conversion, the apparent gaps might reflect assembly problems.
We have analyzed the genetic diversity among A3A genes across a wide range of primates including New World monkeys, Old World monkeys and Hominids. There is variation among the Kozak motifs with the M1 initiator methionine being absent for chimpanzees, bonobos and gorillas. Some, but not all, A3A lineages show positive selection suggesting that A3A enzymes may not be truly orthologous.
Twelve primate A3A sequences spanning New and Old World monkeys were derived by amplification of genomic DNA and aligned to the human sequence (Figure 1). The A3A protein is initiated at codons M1 or M13, giving rise to two different proteins, both with ssDNA cytidine deaminase activity. The Kozak context of both human A3A initiator codons is considered to be adequate. For 3 hominids, codon M1 was mutated to a threonine codon or valine codon, which probably abrogates translation initiation (Figure 1, Table 1). For both New World monkey sequences, the M1 Kozak context was strong, suggesting that translation initiation at M13 would be reduced. In addition, the Kozak context of the M13 codon was strong for 3/12 primates, notably C. guereza, C. aethiops and C. neglectus (Table 1). For all the others, the context is considered to be adequate for translation initiation.
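To make the terms strong and adequate concrete, the sketch below classifies the context of an initiator ATG. It assumes the commonly used Kozak convention, which the text does not spell out: a context is strong when position -3 is a purine and position +4 is G, adequate when only one of these holds, and weak otherwise. The function name and the example sequences are hypothetical.

```python
def kozak_strength(mrna, atg_index):
    """Classify the Kozak context of the ATG starting at atg_index (0-based).

    Assumed convention: 'strong' needs a purine at -3 and G at +4,
    'adequate' needs one of the two, 'weak' has neither.
    """
    seq = mrna.upper().replace("U", "T")  # accept RNA or DNA letters
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("need 3 nt upstream and 1 nt downstream of the ATG")
    purine_at_minus3 = seq[atg_index - 3] in "AG"  # A of ATG is position +1
    g_at_plus4 = seq[atg_index + 3] == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


# Made-up contexts, with the ATG at index 6 in each case.
for context in ("GCCACCATGG", "GCCTCCATGG", "GCCTCCATGC"):
    print(context, kozak_strength(context, 6))
```

Under this convention, the configurations summarised in Table 1 would differ only in the two flanking positions checked here, plus the presence or absence of the ATG itself.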
Sequence variation was observed in all exons apart from the very small exon 5. Some of the exon 5 sequences differ from some recently reported. On a pairwise basis up to 31% amino acid divergence was observed overall, with 6%, 21% and 30% among hominids, Old World small monkeys and New World monkeys respectively. That the variation between the New World monkeys is as great as that seen overall suggests that there has not been too much gene conversion in the New World lineage. Exon 3 encodes the hallmark HXEX23–28PCX2–4C motif for cytidine deaminases (Figure 1). Among all the human A3 enzymes only A3A encodes the PCX4C variant. Interestingly, the New World A3A sequences are singular in that they encode the PCX2C variant typical of all other A3 enzymes.
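Pairwise amino acid divergence of the kind quoted above can be computed from an alignment as follows. This is a minimal sketch that assumes divergence is the percentage of differing residues over columns where neither sequence has a gap; the study may have treated gaps differently. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def percent_divergence(seq_a, seq_b):
    """Percentage of differing residues between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * differing / compared


def max_pairwise_divergence(alignment):
    """Return (pair of names, divergence) for the most divergent pair."""
    def score(pair):
        return percent_divergence(alignment[pair[0]], alignment[pair[1]])
    best = max(combinations(sorted(alignment), 2), key=score)
    return best, score(best)


# Toy alignment with invented names and residues.
toy_alignment = {
    "sp1": "MEASPASG",
    "sp2": "MEASPASG",
    "sp3": "MKAS-ATG",
}
pair, divergence = max_pairwise_divergence(toy_alignment)
print(pair, round(divergence, 1))
```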
In order to characterise whether this variation shows signs of selection, we estimated the relative numbers of non-synonymous (dN) and synonymous (dS) nucleotide substitutions per site and dN/dS ratios over the twelve primate species using the HyPhy package and the FEL and REL methods. We also investigated models in which the dN/dS ratio is allowed to vary among branches across the complete sequence using the GA-branch analysis. There was significant positive selection, with estimated dN/dS ratios >1.0 (p>0.95), at five sites, notably D41, L62, C64, H160 and H168 (Figure 1, in blue). By contrast, several sites were under significant negative selection, notably S7, N24, V25c, A107, F125, E157 and W162 (Figure 1, in green).
A phylogenetic tree for the complete sequence of A3A was constructed using BioNJ (Figure 2A). The red internal branches denote those where dN/dS>1 (p>0.9); these are confined to a small fraction of the total number of branches. Among A3 enzymes, the A3A sequence is phylogenetically closest to the carboxy-terminal domains of A3B and A3G. In view of a large collection of A3G sequences, a comparable analysis was made using the A3Gc sequences (Figure 2B). The branch-specific patterns of dN/dS variation for the A3A and A3Gc cytidine deaminases are different, a good example being the New World monkey lineage.
When performing BLAT searches for this study (UCSC Genome Bioinformatics), we identified a segment of 288 bp on human chromosome 4 with strong homology to exon 3 of the A3A/A3Bc/A3Gc cluster (Figure 3A), which will be referred to as ΨA3chr4. Homology extended a few hundred bases on either side, with the splice sites perfectly conserved. The sequence is present in human, chimpanzee, gorilla, orang-utan, macaque and marmoset genomes while absent in horse, dog, cat and rodent genomes. At the protein level, the exon revealed a HVEXnSCX2C motif similar to that for all A3 deaminases (HAEXnPCX2–4C) (Figure 3A). While the A→V substitution is found in AID and APOBEC1 sequences, the P→S substitution is without precedent. Phylogenetic analysis based on amino acid sequences showed that it emerged after the (A3A, A3Bc)A3Gc split (Figure 3B).
5′ and 3′ RACE failed to identify any transcripts, and no EST was found in the databases. Nonetheless, to ascertain whether this exon encoded a functional domain, we synthesized a fusion gene with the exon surrounded by exons 1, 2, 4 and 5 of the human A3A gene. The construct was cloned in pcDNA3.1 TOPO, resulting in addition of the V5 tag. When transfected into HeLa cells and stained with FITC-conjugated anti-V5 antibody, the construct was viable and strongly nuclear, more so than hA3A, indicating that residues impacting A3A localization lie in exon 3 (Figure 3C). In order to test for editing activity, HeLa cells were co-transfected with the reconstructed pΨA3chr4 clone and an infectious molecular clone of hepatitis B virus. Total DNA was analysed at 72 h by a nested PCR/3DPCR approach as previously described.
The minimal denaturation temperature (Td) for the HBV X gene segment analysed is 91.8°C (Figure 3D). When co-transfected with the reconstructed pΨA3chr4 clone, the lowest Td was equally 91.8°C, indicating that the recombinant may not be packaged into assembling HBV virions. Accordingly, a non-viral region corresponding to the MT-COI gene was amplified by PCR/3DPCR. The minimal Td for MT-COI DNA was 87°C with or without pΨA3chr4 (Figure 3E), suggesting that the chromosome 4 fragment is indeed devoid of ssDNA cytidine deaminase activity.
Finally, an additional ~1.1 kb A3A-related fragment was identified ~8 kb upstream of the human A3A gene in the same orientation as the entire A3 locus. For comparison, the A3A-A3B intergenic region is ~19 kb. The fragment comprises 104 bp (37%) of intron 4, exon 5 and downstream sequences. Overall it shows 96% nucleic acid homology to hA3A. As such it must represent a vestige of prior gene conversion. Indeed, the sequence is surrounded by repeat elements, some of which are found surrounding the hA3A and hA3B genes. This A3A remnant is found in the chimpanzee, orang-utan and rhesus macaque genomes.
The primate A3A gene shows considerable qualitative and quantitative genetic variation, with up to 31% amino acid variation. Translation initiation sites vary, there being at least four different configurations (Table 1). Positive selection is apparent along a few but not all branches, suggesting that differences may emerge when more attention is turned to this important enzyme.
From the outset, differences in the restriction patterns of primate A3G on HIV-1Δvif were noted. More recent reports show that several human and macaque A3 cytidine deaminases are not strictly equivalent when using HIV-1 as a readout. Indeed, as several reports have shown subtle differences for A3B, A3DE and A3G, this should transpire for A3A. However, as this enzyme impacts the integrity of the human genome, it is possible that the variation in structure and evolution of the A3A gene could impact cell biology.
During data analyses, two A3A-related fragments were identified. The ΨA3chr4 exon 3 fragment proved to be devoid of catalytic activity when spliced together with exons 1, 2, 4 and 5 from A3A. This solo A3 exon is reminiscent of the recent finding of an isolated APOBEC1 exon in the tetrapod lineage that was subsequently lost. The second A3A fragment is particularly interesting in that it shows that the present organization of the primate A3 locus might well have come about via more gene conversion than previously thought. In conclusion, there is subtle qualitative and quantitative variation among primate A3A genes. In turn, gene expression and perhaps interferon sensitivity might follow.
Faecal samples were collected from wild non-habituated western gorillas (Gorilla gorilla gorilla) and chimpanzees (Pan troglodytes troglodytes) in Cameroon with permission of the Cameroonian Ministries of Health, Research and Environment and Forestry and Wildlife, and from bonobos (Pan paniscus) in the Democratic Republic of Congo with the permission of the Ministries of Science and Technology and Forest Economy. DNA was extracted as previously described. For mantled guereza (Colobus guereza) and mandrills (Mandrillus sphinx), DNA was extracted from whole blood samples that were collected from primate bushmeat with permission from the Cameroonian Ministries of Health, Research and Environment and Forestry and Wildlife, as previously reported. Primary cells and a cell line were obtained, respectively, from an orang-utan (Pongo pygmaeus) that died of natural causes while housed at the Wanariset orang-utan Reintroduction Center in East Kalimantan, Indonesia, and from white-handed gibbon (Hylobates lar, ATCC 57763), while samples of rhesus monkey (Macaca mulatta, ATCC CCL-7), vervet monkey (Cercopithecus aethiops, ATCC CCL-81), and necropsy tissue samples from a squirrel monkey (Saimiri sciureus) and a cotton-top tamarin (Saguinus oedipus) that died of natural causes while kept in a zoo have already been described. Primary cells from De Brazza's monkey (Cercopithecus neglectus) came from an animal that died of natural causes while housed at the zoo de la Palmyre (France).
Hot start PCR was performed with the corresponding primers (Table 2). The first reaction involved standard amplification; the reaction parameters were 95°C for 5 min, followed by 35 cycles (95°C for 30 s, 50–55°C for 30 s and 72°C for 1 min) and finally 10 min at 72°C. Differential amplification occurred in the second round using the equivalent of 0.2 µL of the first round reaction as input. Conditions were identical to the first PCR. The buffer conditions for all amplifications were 2.5 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl pH 8.3, 200 µM of each dNTP, 100 µM of each primer, and 2.5 units of BIOTaq polymerase (Bioline) in a final volume of 50 µL. PCR products were purified from agarose gels (Qiaex II kit, Qiagen, France) and ligated into the TOPO TA cloning vector (Invitrogen, France). After transformation of Top10 electrocompetent cells (Invitrogen), up to 15 clones were picked. Sequencing was outsourced to GATC Biotech. All mutations were confirmed by inspection of the chromatogram. The pΨA3chr4 insert was synthesized by GeneCust and cloned into the pcDNA3.1 TOPO-V5 vector (Invitrogen).
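As a quick check on the cycling programme, the total programmed hold time of one round can be worked out from the stated parameters. The sketch below is simple arithmetic; it ignores ramp times between temperatures, which depend on the instrument, and the function name is hypothetical.

```python
def cycling_hold_time_seconds(initial_denaturation_s, cycles,
                              step_durations_s, final_extension_s):
    """Sum of programmed hold times for one PCR round (ramp times excluded)."""
    return initial_denaturation_s + cycles * sum(step_durations_s) + final_extension_s


# Parameters stated above: 95°C for 5 min, then 35 cycles of
# 30 s / 30 s / 1 min, then 10 min at 72°C.
total_s = cycling_hold_time_seconds(
    initial_denaturation_s=5 * 60,
    cycles=35,
    step_durations_s=(30, 30, 60),
    final_extension_s=10 * 60,
)
print(total_s / 60, "minutes of hold time per round")
```

With these values each round amounts to 85 minutes of hold time, so the nested protocol involves about 170 minutes of hold time in total before ramping is taken into account.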
Briefly, 10⁵ QT6 cells (ATCC CRL 1708) were cotransfected with 1 µg of pΨA3chr4 plasmid DNA along with 1 µg pCayw, a plasmid encoding an infectious molecular genome of hepatitis B virus (HBV), using FuGENE 6 (Roche). Total DNA was extracted using the MasterPure™ complete DNA and RNA purification kit (Epicentre). QT6 cells were maintained in HAM's F40 medium, supplemented with 1% chicken serum, 10% FCS, 5% tryptose phosphate, 2 mM L-glutamine, 50 U/ml penicillin and 50 µg/ml streptomycin.
HeLa cells (ATCC CCL 2) were grown to a density of 5 × 10⁵ cells per dish and transfected with 1 µg of pΨA3chr4 or pA3A using FuGENE 6 (Roche). After 48 hours, the cells were washed twice with PBS and fixed for 45 minutes in a 50/50 methanol/ethanol mix. As primary antibody, a mouse monoclonal antibody specific for the V5 epitope tag (Invitrogen) was used at a 1/200 dilution for 1 hour at room temperature. Cells were washed twice with PBS, and FITC-conjugated anti-mouse antibody (Sigma) was used as secondary antibody at a 1/200 dilution for 30 minutes at room temperature. We used Vectashield mounting medium for fluorescence with DAPI (Vector Laboratories, Inc.). Immunofluorescence was observed by microscopy (Zeiss).
Sequences were aligned using the MUSCLE program, and neighbor-joining trees were obtained using BioNJ as implemented in http://phylogeny.fr. The final output was edited using Treeview. The relative numbers of non-synonymous (dN) and synonymous (dS) nucleotide substitutions per site were estimated using the random effects likelihood (REL) and the fixed effects likelihood (FEL) methods available via the Datamonkey web interface of the HyPhy package. Estimates of dN/dS ratios were based on neighbor-joining trees obtained from phylogeny.fr.
We used the genetic algorithm (GA-Branch) method available in HyPhy to detect lineage-specific variation in selection pressure. This assigns different classes of dN/dS ratios to each lineage to determine the best-fit model of lineage-specific evolution, and it calculates the probability (≥90%) that along a specific lineage dN/dS>1.
Accession numbers were deposited at GenBank: Colobus guereza (JN177339), Cercopithecus aethiops (JN177340), Cercopithecus neglectus (JN177341), Mandrillus sphinx (JN177342), Macaca mulatta (JN177343), Hylobates lar (JN177344), Gorilla gorilla (JN177345), Pongo pygmaeus (JN177346), Pan paniscus (JN177347), Pan troglodytes troglodytes (JN177348), Saguinus oedipus (JN177349) and Saimiri sciureus (JN177350).
The Molecular Retrovirology Unit is “Equipe labelisée LIGUE 2010”. Faecal samples and blood from bushmeat were collected in Cameroon with the approval of Cameroonian Ministries of Health, Research, and Environment and Forestry and Wildlife. We thank the zoo de la Palmyre and Dr Pascal Pineau for providing primate samples.
Competing Interests: The authors have declared that no competing interests exist.
Funding: Funding was received from the Institut Pasteur, Agence Nationale de Recherche sur le SIDA (ANRS), Agence Nationale de Recherche (ANR), l'Institut de Recherche pour le Développement (IRD), l'Institut National de la santé et de la Recherche (INSERM) and the Centre National de Recherche Scientifique (CNRS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.