Biologics. 2010; 4: 75–81.
Published online 2010 May 25.
PMCID: PMC2880343

Hepatitis B virus and Homo sapiens proteome-wide analysis: A profusion of viral peptide overlaps in neuron-specific human proteins


The primary amino acid sequence of the hepatitis B virus (HBV) proteome was searched for identity spots in the human proteome using the Protein Information Resource database. We find that the HBV polyprotein shares sixty-five heptapeptides, one octapeptide, and one nonapeptide with human proteins. The viral matches are distributed among fundamental human proteins such as adhesion molecules, leukocyte differentiation antigens, enzymes, proteins associated with spermatogenesis, and transcription factors. Of special interest, a number of peptide motifs are shared between the virus and brain-specific antigens involved in neuronal protection. This study may help to evaluate the potential cross-reactions and side effects of HBV antigen-based vaccines.

Keywords: HBV proteome, human proteome, similarity analysis, viral versus human proteome overlapping, vaccine-related cross-reactions


Vaccination for infectious diseases may be associated with potential adverse events and possible long-term health disorders. Indeed, antigen-specific immunotherapy protocols may target not only the antigen from the infectious microorganism, but also host tissues expressing antigens that share sequences with the target.1 In general, a vaccine produces a weak immune response, and autoimmune cross-reactions are extremely rare events.2–5 Under normal non-stimulated conditions, the immune system fails to mount immune responses to protein vaccines unless adjuvants are added.6,7 Consequently, the active vaccine preparations currently in use contain adjuvants to achieve the desired immunogenicity,8,9 and so intrinsically carry a certain risk of inducing or enhancing cross-reactivity.

In order to define quantitatively and qualitatively the molecular basis of active vaccine (auto)immunity, we are undertaking proteomic sequence-to-sequence profile analyses between microbial and human proteins.10 Here, the HBV polyprotein was examined for amino acid sequence similarity to the human proteome at the heptamer level. We describe a high level of sharing of heptapeptide motifs between HBV and human proteins, with numerous neuronal proteins involved in the viral-human peptide overlap.


The HBV polyprotein primary sequence (Taxonomic ID: 10407; EMBL Accession: X51970) was dissected into heptamers that were analyzed for exact sequence identity to the human proteome using the PIR perfect match program. The heptamers were offset by one residue, ie, overlapping by six residues: MQLFHLC, QLFHLCL, LFHLCLI, FHLCLII, etc. The human proteome consisted of 36,103 proteins at the time of analysis. The function of the human proteins and potential disease associations were analyzed using the Universal Protein Resource (UniProt). Repeated sequences, fragments, and uncharacterized entries were filtered out.
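The scanning procedure described above can be sketched in a few lines of Python. This is only an illustration of the sliding-window idea, not the PIR perfect match program itself: the function names `kmers` and `shared_kmers` are hypothetical, the viral fragment is just the ten residues implied by the four heptamers listed in the text, and the "human" sequence is invented for the example.

```python
from collections import defaultdict


def kmers(seq, k=7):
    """Yield (1-based start position, k-mer) for every window, offset by one residue."""
    for i in range(len(seq) - k + 1):
        yield i + 1, seq[i:i + k]


def shared_kmers(viral_seq, human_proteins, k=7):
    """Map each viral k-mer found verbatim in a human protein to its hits.

    human_proteins: dict of {protein_id: sequence} (hypothetical toy input).
    Returns {kmer: {"viral_pos": [...], "human_ids": [...]}}.
    """
    # Index every human k-mer once so each viral lookup is a dictionary hit.
    index = defaultdict(set)
    for protein_id, seq in human_proteins.items():
        for _, kmer in kmers(seq, k):
            index[kmer].add(protein_id)

    hits = {}
    for pos, kmer in kmers(viral_seq, k):
        if kmer in index:
            entry = hits.setdefault(
                kmer, {"viral_pos": [], "human_ids": sorted(index[kmer])}
            )
            entry["viral_pos"].append(pos)
    return hits


# The first ten residues implied by the heptamers listed in the text.
viral_fragment = "MQLFHLCLII"
# Invented sequence for illustration only; not a real human protein.
toy_human = {"toy_protein_1": "AAAQLFHLCLAAA"}

print([kmer for _, kmer in kmers(viral_fragment)])
# ['MQLFHLC', 'QLFHLCL', 'LFHLCLI', 'FHLCLII']
print(shared_kmers(viral_fragment, toy_human))
# {'QLFHLCL': {'viral_pos': [2], 'human_ids': ['toy_protein_1']}}
```

Indexing the larger proteome first keeps the scan linear in the total sequence length; a real analysis would also need the filtering of repeated sequences, fragments, and uncharacterized entries mentioned above, which this sketch omits.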


HBV proteins were analysed for amino acid sequence identity to the human proteome using heptamers as scanning units. The theoretical probability of a sequence of 7 amino acids occurring at random in two proteins may be calculated as 20^-7, or 1 in 1,280,000,000, assuming that all amino acids occur with the same frequency.1 Moreover, to determine the number of times a given viral heptamer might occur at random in the human proteome, one must consider the size of the viral and human proteomes. The analyzed human proteome was formed by 36,103 proteins and 10,431,975 unique 7-mers, and the HBV polyprotein was formed by 1,586 unique 7-mers.10 Therefore, the number of times we would expect to see an HBV 7-mer at random in the human proteome is 20^-7 multiplied by the product of the numbers of unique 7-mers in the two proteomes. This expected number is 12.9. In contrast, Table 1 illustrates that HBV proteins actually share peptide sequences with the human proteome for a total of 65 heptamers. The table also shows that the HBV and human proteomes share one octamer (RLGLSRPL peptide, AA Pos 796–803 in the HBV polymerase protein) and one nonamer (SPRRRTPSP peptide, AA Pos 186–194 in the viral HBV core protein).
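The expected-match arithmetic can be checked with a short calculation. The sketch below assumes, as the text does, that all 20 amino acids are equally frequent and that positions are independent. Multiplying the two 7-mer counts by the per-pair match probability is our reading of the calculation, and it reproduces the reported figure of 12.9. The variable names are illustrative.

```python
ALPHABET_SIZE = 20               # standard amino acids, assumed equally frequent
K = 7                            # heptamer length
HUMAN_UNIQUE_7MERS = 10_431_975  # from the text
HBV_UNIQUE_7MERS = 1_586         # from the text
OBSERVED_SHARED_7MERS = 65       # from Table 1 as summarized in the text

# Probability that two random 7-mers are identical: 1 in 20**7.
pairs_per_match = ALPHABET_SIZE ** K
p_match = 1 / pairs_per_match

# Expected number of chance matches over all viral x human 7-mer pairs.
expected = HBV_UNIQUE_7MERS * HUMAN_UNIQUE_7MERS * p_match

print(f"1 in {pairs_per_match:,}")                   # 1 in 1,280,000,000
print(f"expected chance matches: {expected:.1f}")    # 12.9
print(f"observed / expected: {OBSERVED_SHARED_7MERS / expected:.1f}")  # 5.0
```

Under these simplifying assumptions the observed 65 shared heptamers are roughly five times the chance expectation. Real amino acid frequencies are not uniform, so the true chance expectation would differ.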

Table 1
Sharing of 7-mer motifs between HBV and human proteomes. Location in the viral protein and amino acid sequence of the heptapeptide motifs are reported. The human proteins sharing heptapeptides with the HBV proteome are characterized by accession number ...

Moreover, Table 1 shows that the human proteins hosting heptapeptides from the HBV proteome include numerous critical antigens specifically (or, in a few instances, uniquely) expressed in the brain. The critical neuronal role of the human molecules hosting viral motifs is illustrated by the following examples, among others: RNF19, or E3 ubiquitin-protein ligase, is involved in neuronal protection;47 BSN, or protein bassoon, is exclusively expressed in brain and functions in the organization of the cytomatrix at the nerve terminal active zone, which regulates neurotransmitter release;51 and CENG1, or phosphatidylinositol-3-kinase enhancer, participates in the prevention of neuronal apoptosis.23 It is therefore logical to postulate that immune cross-reactions with these neuronal antigens might lead to inflammatory brain lesions.

Furthermore, Table 1 shows that another set of human proteins hosting 7-mer viral motifs is represented by spliceosomal proteins.18,21,25 This datum is worth noting in the light of the numerous reports on a possible link between splicing phenomena and neurodegenerative diseases. Indeed, (dysregulated) splicing has been implicated in: 1) selection of the autoimmune T-cell repertoire in multiple sclerosis (MS);66 2) reduction of the adenosine A1 receptor-β transcript in MS patients, which potentially leads to increased macrophage activation and central nervous system (CNS) inflammation;67 3) expression of the citrullinated myelin basic protein isomer, an autoantigen in MS;68 and 4) generation of alternatively spliced transcripts of the gene for human Cu,Zn superoxide dismutase, a causative gene for autosomal dominant amyotrophic lateral sclerosis.69 Moreover, a complex splicing pattern characterizes the human myelin/oligodendrocyte glycoprotein, a highly encephalitogenic autoantigen and a target for autoaggressive immune responses in CNS inflammatory demyelinating diseases.70 Finally, aberrant splicing has been implicated in the generation of an aberrant transcript of excitatory amino acid transporter 2 that has been associated with amyotrophic lateral sclerosis.71 In this regard, it is also remarkable that the long viral nonamer motif, ie, the SPRRRTPSP peptide sequence (AA Pos 186–194 in the viral HBV core protein), is present in the human Ser/Arg repetitive matrix protein 1 (SRRM1), which is part of pre- and post-splicing multiprotein mRNP complexes.21 SRRM1 is involved in a number of pre-mRNA processing events (see Table 1 for details). Again, it is logical to postulate that a cross-reaction with SRRM1 would alter a number of physiological functions.


To our knowledge, this study is the first of its kind to provide a clear-cut analysis of the identity platform linking the HBV and Homo sapiens proteomes. Two considerations emerge from the data reported here. First, although the theoretical probability of sharing perfectly identical heptapeptide fragments is relatively low, we find 65 perfect matches between the viral and human proteomes. Given that five or six amino acids are sufficient to induce a monoclonal antibody response,1,72 the 65 heptapeptide overlaps might induce autoimmune reactions. Second, the nature of the overlap is also of interest, since a number of viral motifs occur in human proteins that are crucially involved in neuronal structure and function.

Given the premises illustrated in the Introduction, these data warn against adverse side effects of active vaccination using entire HBV antigens. In parallel, the present study might be useful for designing anti-HBV vaccines based on non-shared portions of the viral antigens. More generally, the data reported in this study outline a practicable procedure for identifying possible cross-reactions potentially associated with active vaccines.


RR and DK were supported in this work by the Ministry of University, Italy (60% of funds). The authors report no conflicts of interest in this work.


1. Oldstone MB. Molecular mimicry and immune-mediated diseases. FASEB J. 1998;12:1255–1265. [PubMed]
2. Janeway CA Jr. Approaching the asymptote? Evolution and revolution in immunology. Cold Spring Harb Symp Quant Biol. 1989;54:1–13. [PubMed]
3. Fraser CK, Diener KR, Brown MP, Hayball JD. Improving vaccines by incorporating immunological coadjuvants. Expert Rev Vaccines. 2007;6:559–578. [PubMed]
4. Mizrahi M, Lalazar G, Ben Ya’acov A, et al. Beta-glycoglycosphingolipid-induced augmentation of the anti-HBV immune response is associated with altered CD8 and NKT lymphocyte distribution: a novel adjuvant for HBV vaccination. Vaccine. 2008;26:2589–2595. [PubMed]
5. Havarinasab S, Pollard KM, Hultman P. Gold- and silver-induced murine autoimmunity requirement for cytokines and CD28 in murine heavy metal-induced autoimmunity. Clin Exp Immunol. 2009;155:567–576. [PubMed]
6. Bryan JT. Developing an HPV vaccine to prevent cervical cancer and genital warts. Vaccine. 2007;25:3001–3006. [PubMed]
7. Schmidt CS, Morrow WJ, Sheikh NA. Smart adjuvants. Expert Rev Vaccines. 2007;6:391–400. [PubMed]
8. Tong NK, Beran J, Kee SA, et al. Immunogenicity and safety of an adjuvanted hepatitis B vaccine in pre-hemodialysis and hemodialysis patients. Kidney Int. 2005;68:2298–2303. [PubMed]
9. Halperin SA, Dobson S, McNeil S, et al. Comparison of the safety and immunogenicity of hepatitis B virus surface antigen co-administered with an immunostimulatory phosphoro-thioate oligonucleotide and a licensed hepatitis B vaccine in healthy young adults. Vaccine. 2006;24:20–26. [PubMed]
10. Kanduc D, Stufano A, Lucchese G, Kusalik A. Massive peptide sharing between viral and human proteomes. Peptides. 2008;29:1755–1766. [PubMed]
11. Kawakami B, Sugiyama A, Takemoto M, et al. NEDO human cDNA sequencing project. EMBL/GenBank/DDBJ databases; 2003.
12. The MGC Project Team. The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res. 2004;14:2121–2127. [PubMed]
13. Totoki Y, Toyoda A, Takeda T, et al. Homo sapiens protein coding cDNA. EMBL/GenBank/DDBJ databases; 2005.
14. Gendron RL, Good WV, Adams LC, Paradis H. Suppressed expression of tubedown-1 in retinal neovascularization of proliferative diabetic retinopathy. Invest Ophthalmol Vis Sci. 2001;42:3000–3007. [PubMed]
15. Bevilacqua MP, Stengelin S, Gimbrone MA, Jr, Seed B. Endothelial leukocyte adhesion molecule 1: an inducible receptor for neutrophils related to complement regulatory proteins and lectins. Science. 1989;243:1160–1165. [PubMed]
16. Ota T, Suzuki Y, Nishikawa T, et al. Complete sequencing and characterization of 21,243 full-length human cDNAs. Nat Genet. 2004;36:40–45. [PubMed]
17. Van Rompay AR, Norda A, Linden K, Johansson M, Karlsson A. Phosphorylation of uridine and cytidine nucleoside analogs by two human uridine-cytidine kinases. Mol Pharmacol. 2001;59:1181–1186. [PubMed]
18. Makarov EM, Cowger JJM, Longman D, et al. Mammalian PRP4 kinase copurifies and interacts with components of both the U5 snRNP and the N-CoR deacetylase complexes. Mol Cell Biol. 2002;22:5141–5156. [PMC free article] [PubMed]
19. Mungall AJ, Palmer SA, Sims SK, et al. The DNA sequence and analysis of human chromosome 6. Nature. 2003;425:805–811. [PubMed]
20. Bahe S, Stierhof YD, Wilkinson CJ, Leiss F, Nigg EA. Rootletin forms centriole-associated filaments and functions in centrosome cohesion. J Cell Biol. 2005;171:27–33. [PMC free article] [PubMed]
21. Blencowe BJ, Issner R, Nickerson JA, Sharp PA. A coactivator of pre-mRNA splicing. Genes Dev. 1998;12:996–1009. [PubMed]
22. Itoh-Satoh M, Hayashi T, Nishi H, et al. Titin mutations as the molecular basis for dilated cardiomyopathy. Biochem Biophys Res Commun. 2002;291:385–393. [PubMed]
23. Rong R, Ahn JY, Huang H, et al. PI3 kinase enhancer-Homer complex couples mGluRI to PI3 kinase, preventing neuronal apoptosis. Nat Neurosci. 2003;6:1153–1161. [PubMed]
24. Sato S, Cerny RL, Buescher JL, Ikezu T. Tau-tubulin kinase 1 (TTBK1), a neuron-specific tau kinase candidate, is involved in tau phosphorylation and aggregation. J Neurochem. 2006;98:1573–1584. [PubMed]
25. Chaudhary N, McMahon C, Blobel G. Primary structure of a human arginine-rich nuclear protein that colocalizes with spliceosome components. Proc Natl Acad Sci U S A. 1991;88:8189–8193. [PubMed]
26. Scherl A, Coute Y, Deon C, et al. Functional proteomic analysis of human nucleolus. Mol Biol Cell. 2002;13:4100–4109. [PMC free article] [PubMed]
27. Maines MD, Polevoda BV, Huang TJ, McCoubrey WK Jr. Human biliverdin IXalpha reductase is a zinc-metalloprotein. Characterization of purified and Escherichia coli expressed enzymes. Eur J Biochem. 1996;235:372–381. [PubMed]
28. Thompson PM, Gotoh T, Kok M, White PS, Brodeur GM. CHD5, a new member of the chromodomain gene family, is preferentially expressed in the nervous system. Oncogene. 2003;22:1002–1011. [PubMed]
29. Wu LC, Wang ZW, Tsan JT, et al. Identification of a RING protein that can interact in vivo with the BRCA1 gene product. Nat Genet. 1996;14:430–440. [PubMed]
30. Taylor TD, Noguchi H, Totoki Y, et al. Human chromosome 11 DNA sequence and analysis including novel gene identification. Nature. 2006;440:497–500. [PubMed]
31. Grandori C, Gomez-Roman N, Felton-Edkins ZA, et al. c-Myc binds to human ribosomal DNA and stimulates transcription of rRNA genes by RNA polymerase I. Nat Cell Biol. 2005;7:311–318. [PubMed]
32. Oshiumi H, Matsumoto M, Funami K, Akazawa T, Seya T. TICAM-1, an adapter molecule that participates in Toll-like receptor 3 mediated interferon-beta induction. Nat Immunol. 2003;4:161–167. [PubMed]
33. Ninomiya K, Wagatsuma M, Kanda K, et al. NEDO human cDNA sequencing project. EMBL/GenBank/DDBJ databases; 2002.
34. Putilina T, Wong P, Gentleman S. The DHHC domain: a new highly conserved cysteine-rich motif. Mol Cell Biochem. 1999;195:219–226. [PubMed]
35. Olives B, Sonia M, Mattei MG, et al. Molecular characterization of a new urea transporter in the human kidney. FEBS Lett. 1996;386:156–160. [PubMed]
36. Garcia-Gonzalo FR, Munoz P, Gonzalez E, et al. The giant protein HERC1 is recruited to aluminum fluoride-induced actin-rich surface protrusions in HeLa cells. FEBS Lett. 2004;559:77–83. [PubMed]
37. Katoh M, Katoh M. Identification and characterization of the human FMN1 gene in silico. Int J Mol Med. 2004;14:121–126. [PubMed]
38. Nilsson NE, Kotarsky K, Owman C, Olde B. Identification of a free fatty acid receptor, FFA2R, expressed on leukocytes and activated by short-chain fatty acids. Biochem Biophys Res Commun. 2003;303:1047–1052. [PubMed]
39. Yoshida T, Imai T, Kakizaki M, Nishimura M, Takagi S, Yoshie O. Identification of single C motif-1/lymphotactin receptor XCR1. J Biol Chem. 1998;273:16551–16554. [PubMed]
40. Holness CL, Simmons DL. Molecular cloning of CD68, a human macrophage marker related to lysosomal glycoproteins. Blood. 1993;81:1607–1613. [PubMed]
41. Clark HF, Gurney AL, Abaya E, et al. The secreted protein discovery initiative (SPDI), a large-scale effort to identify novel human secreted and transmembrane proteins: a bioinformatics assessment. Genome Res. 2003;13:2265–2270. [PubMed]
42. Lee JD, Ulevitch RJ, Han J. Primary structure of BMK1: a new mammalian map kinase. Biochem Biophys Res Commun. 1995;213:715–724. [PubMed]
43. Hibino H, Pironkova R, Onwumere O, Vologodskaia M, Hudspeth AJ, Lesage F. RIM binding proteins (RBPs) couple Rab3-interacting molecules (RIMs) to voltage-gated Ca(2+) channels. Neuron. 2002;34:411–423. [PMC free article] [PubMed]
44. Erickson JD, Schaefer MKH, Bonner TI, Eiden LE, Weihe E. Distinct pharmacological properties and distribution in neurons and endocrine cells of two isoforms of the human vesicular monoamine transporter. Proc Natl Acad Sci U S A. 1996;93:5166–5171. [PubMed]
45. Ring HZ, Chang H, Guilbot A, Brice A, LeGuern E, Francke U. The human neuregulin 2 (NRG2) gene: cloning, mapping and evaluation as a candidate for the autosomal recessive form of Charcot-Marie-Tooth disease linked to 5q. Hum Genet. 1999;104:326–332. [PubMed]
46. Dickinson LA, Edgar AJ, Ehley J, Gottesfeld JM. Cyclin L is an RS domain protein involved in pre-mRNA splicing. J Biol Chem. 2002;277:25465–25473. [PubMed]
47. Ishigaki S, Hishikawa N, Niwa J, et al. Physical and functional interaction between dorfin and valosin-containing protein that are colocalized in ubiquitylated inclusions in neurodegenerative disorders. J Biol Chem. 2004;279:51376–51385. [PubMed]
48. Vasicek TJ, Leder PJ. Structure and expression of the human immunoglobulin lambda genes. J Exp Med. 1990;172:609–620. [PMC free article] [PubMed]
49. Courvalin JC, Lassoued K, Bartnik E, Blobel G, Wozniak RW. The 210-kD nuclear envelope polypeptide recognized by human autoantibodies in primary biliary cirrhosis is the major glycoprotein of the nuclear pore. J Clin Invest. 1990;86:279–285. [PMC free article] [PubMed]
50. Scherer SW, Cheung J, MacDonald JR, et al. Human chromosome 7: DNA sequence and biology. Science. 2003;300:767–772. [PMC free article] [PubMed]
51. Winter C, Dieck S, Boeckers T, et al. The presynaptic cytomatrix protein Bassoon: sequence and chromosomal localization of the human BSN gene. Genomics. 1999;57:389–397. [PubMed]
52. Grimwood J, Gordon LA, Olsen AS, et al. The DNA sequence and biology of human chromosome 19. Nature. 2004;428:529–535. [PubMed]
53. Zenz T, Roessner A, Thomas A, et al. HIan5: the human ortholog to the rat Ian4/Iddm1/lyp is a new member of the Ian family that is overexpressed in B-cell lymphoid malignancies. Genes Immun. 2004;5:109–116. [PubMed]
54. Colonna M, Samaridis J. Cloning of immunoglobulin-superfamily members associated with HLA-C and HLA-B recognition by human natural killer cells. Science. 1995;268:405–408. [PubMed]
55. Tashiro H, Yamazaki M, Watanabe K, et al. NEDO human cDNA sequencing project. EMBL/GenBank/DDBJ databases; 2003.
56. Katoh Y, Katoh M. Comparative genomics on HHIP family orthologs. Int J Mol Med. 2006;17:391–395. [PubMed]
57. Nagase T, Ishikawa K, Suyama M, et al. Prediction of the coding sequences of unidentified human genes. XI. The complete sequences of 100 new cDNA clones from brain which code for large proteins in vitro. DNA Res. 1998;5:277–286. [PubMed]
58. Walters KJ, Kleijnen MF, Goh AM, Wagner G, Howley PM. Structural studies of the interaction between ubiquitin family proteins and proteasome subunit S5a. Biochemistry. 2002;41:1767–1777. [PubMed]
59. Oshima A, Takahashi-Fujii A, Tanase T, et al. NEDO human cDNA sequencing project. EMBL/GenBank/DDBJ databases; 2003.
60. Hosoi T, Koguchi Y, Sugikawa E, et al. Identification of a novel human eicosanoid receptor coupled to G(i/o). J Biol Chem. 2002;277:31459–31465. [PubMed]
61. Tarpey PS, Raymond FL, O’Meara S, et al. Mutations in CUL4B, which encodes a ubiquitin E3 ligase subunit, cause an X-linked mental retardation syndrome associated with aggressive outbursts, seizures, relative macrocephaly, central obesity, hypogonadism, pes cavus, and tremor. Am J Hum Genet. 2007;80:345–352. [PubMed]
62. Inokuchi J, Komiya M, Baba I, Naito S, Sasazuki T, Shirasawa S. Deregulated expression of KRAP, a novel gene encoding actin-interacting protein, in human colon cancer cells. J Hum Genet. 2004;49:46–52. [PubMed]
63. Kurowska M, Rudnicka W, Maslinska D, Maslinski W. Expression of IL-15 and IL-15 receptor isoforms in select structures of human fetal brain. Ann N Y Acad Sci. 2002;966:441–445. [PubMed]
64. Bonnefond L, Fender A, Rudinger-Thirion J, Giege R, Florentz C, Sissler M. Toward the full set of human mitochondrial aminoacyl-tRNA synthetases: characterization of AspRS and TyrRS. Biochemistry. 2005;44:4805–4816. [PubMed]
65. Veugelers M, Bressan M, McDermott DA, et al. Mutation of perinatal myosin heavy chain associated with a Carney complex variant. N Engl J Med. 2004;351:460–469. [PubMed]
66. Klein L, Klugmann M, Nave KA, Tuohy VK, Kyewski B. Shaping of the autoreactive T-cell repertoire by a splice variant of self protein expressed in thymic epithelial cells. Nat Med. 2000;6:56–61. [PubMed]
67. Johnston JB, Silva C, Gonzalez G, et al. Diminished adenosine A1 receptor expression on macrophages in brain and blood of patients with multiple sclerosis. Ann Neurol. 2001;49:650–658. [PubMed]
68. Tranquilli LR, Cao L, Ling NC, Kalbacher H, Martin RM, Whitaker JN. Enhanced T cell responsiveness to citrulline-containing myelin basic protein in multiple sclerosis patients. Mult Scler. 2000;6:220–225. [PubMed]
69. Hirano M, Hung WY, Cole N, Azim AC, Deng HX, Siddique T. Multiple transcripts of the human Cu,Zn superoxide dismutase gene. Biochem Biophys Res Commun. 2000;276:52–66. [PubMed]
70. Delarasse C, Della Gaspera B, Lu CW, et al. Complex alternative splicing of the myelin oligodendrocyte glycoprotein gene is unique to human and non-human primates. J Neurochem. 2006;98:1707–1717. [PubMed]
71. Lauriat TL, Richler E, McInnes LA. A quantitative regional expression profile of EAAT2 known and novel splice variants reopens the question of aberrant EAAT2 splicing in disease. Neurochem Int. 2007;50:271–280. [PubMed]
72. Lucchese G, Stufano A, Trost B, Kusalik A, Kanduc D. Peptidology: short amino acid modules in cell biology and immunology. Amino Acids. 2007;33:703–707. [PubMed]

Articles from Biologics : Targets & Therapy are provided here courtesy of Dove Press