Human rhinoviruses (RVs) of the A, B, and C species are defined agents of the common cold. But more than that, RV-A and RV-C are the dominant causes of hospitalization category infections in young children, especially those with asthma. The use of cadherin-related family member 3 (CDHR3) by RV-C as its cellular receptor creates a direct phenotypic link between human genetics (G versus A alleles cause Cys529 versus Tyr529 protein variants) and the efficiency with which RV-C can infect cells. With a lower cell surface display density, the human-specific Cys529 variant apparently confers partial protection from the severest virus-induced asthma episodes. Selective pressure favoring the Cys529 codon may have coemerged with the evolution of RV-C and helped shape modern human genomes against the virus-susceptible, albeit ancestral Tyr529.
“Almost everyone has suffered at one time or another, from the sneezing, sniffling, watery-eyed misery of a good old-fashioned common cold. The culpable agent in many cases is a rhinovirus, one of the small, positive-sense RNA-containing picornaviruses, whose very name evokes images of a runny nose.” A little over 30 years ago, I started a minireview (1) with that exact statement in a technical article discussing the structure considerations behind experimental failure of peptide-based vaccine directions (quite in vogue at the time) to “cure” the common cold. In the ensuing decades, disappointment, echoing truth, was presaged by the final lines of that review. “Sadly this means that until someone has a better idea, the best advice when you have the sniffles is to stay in bed, drink plenty of liquids and call your doctor in the morning. It will be some time before we can cure the common cold.”
In 1985, the atomic structure of the first rhinovirus (RV), B14, was published (2), and there was great hope in the clinical world that the new data would lead to effective antivirals and vaccines, ridding the United States (alone) of ~62 million cases each year of RV-caused colds, which are by far the leading cause of lost work days (adults) and school time (children). The Protein Data Bank (PDB) now has records of >70 iterations of RV structures (35 for B14 alone) documenting a myriad of receptor and immunogenic complexes, as well as putative antiviral drug interactions. With these data, combined with deep sequencing efforts and phylogenetic considerations, spectacular advances (reviewed in reference 3) have indeed been made in our understanding of the physical and genetic characteristics of these viruses. The canonical isolates in the rhinovirus A (RV-A) and RV-B species cluster as distinct clades in the Enterovirus genus (Picornaviridae), with members sharing >70% amino acid identity over the nonstructural regions of their polyproteins (~2,150 amino acids [aa]). Within species, further subdivision into genotypes (>85% nucleotide identity in key capsid regions) is reasonably synonymous with experimental serotypes and currently defines ~77 RV-A and ~30 RV-B genotypes. Neutralizing B-cell epitopes (NIMs) map to raised regions on prominent surface protrusions of the three largest capsid proteins, VP1, VP2, and VP3, where extensive sequence variation accounts for the lack of protective cross-reactivity. The majority of these viruses (major group) use intercellular adhesion molecule 1 (ICAM-1) as their cellular receptor, recognizing contacts in the N-terminal domain of this protein to trigger receptor-mediated endocytosis as the initiating process of infection. A minority of isolates (minor group, ~10 RV-A) instead prefer low-density lipoprotein receptor (LDLR) family members for the same function.
Hydrophobic capsid binding drugs, based on pleconaril derivatives, which interfere with receptor binding or requisite uncoating transitions, have efficacy against RV-A and RV-B (RV-A+B) if they can intercalate into an internal pocket-like feature within the capsid VP1 protein. Unfortunately, the pocket varies in shape and charge among many isolates, so the efficacy of this approach to antiviral therapy differs widely from one isolate to the next. Given these properties and that the combined plethora of 100-plus defined RV-A+B genotypes apparently circulate worldwide endlessly, only the most stoic of optimists has seriously poked at the idea of a realistic “cure” in the last 30 years. Certainly for the near term, the shared misery inherent to common colds will continue to be borne as an innate part of the human condition. Still, these viruses really do not do anything serious, right?
Enter rhinovirus C (RV-C). During worldwide surveillance for emerging respiratory viruses, mounted in response to early severe acute respiratory syndrome (SARS) epidemics, clinics started capturing circulating virus sequences that were similar to those of RV-A+B but in their own way unique. First described by genome sequencing in 2006 (4) but not physically grown in culture until 2011 (5), RV-C now includes ~55 additional defined genotypes. The isolates are binned taxonomically using criteria similar to those for RV-A+B. Although technically no pair of RV-C genotypes has yet been tested for cross-protection, the sequences (and structures) again suggest that these types will have nonoverlapping antigenicity. When the initial surveillance data were indexed with patient records, it quickly became apparent that RV-C viruses, previously invisible to clinical screens, had special, unanticipated medical relevance in addition to being previously missed, frequent agents of the common cold. RV-C isolates are found in both the upper and lower airways and have been identified in some studies as the dominant virus type in up to half of diagnosed respiratory infections in young children worldwide (6). Of greater importance, the deep-lung etiology is now known to have special clinical significance, since many of these strains are associated with severe, hospitalization category infections in young children, especially those with asthma (5, 7). In a typical study from Wisconsin, 289 children recorded an average of 12.2 RV infections per infant year. Of these, RV-A and RV-C were each about 7 times more likely than RV-B to cause moderate-to-severe illness (8). RV or RV coinfections were detected in ~90% of kids hospitalized with acute asthma episodes, of which RV-C was the primary agent, presenting most frequently (59.4%) in patients with the severest symptoms (9).
Unlike RV-A+B, which for the most part will effortlessly lyse whole plates of cultured monolayers (e.g., HeLa cells) in plaque assays, RV-C is completely refractory to growth in standard tissue culture. After the virus discovery, though, it was quickly found that one could productively transfect cells with full-length genome RNA, native or recombinant. In other words, sequential virus passage must require at least one missing surface receptor that is neither the ubiquitous ICAM-1 nor LDLR. The lack of such display in normal cultured cells precluded virus growth. Painstaking low-level infections were achieved in some human organ cultures at first and then in air-liquid interface cells in protocols which failed more often than not (5). But eventually these directions produced sufficient materials to suggest receptor-like candidates. Tests for this involved transfection of contender cDNAs into nonsusceptible cells with subsequent probing for signs of virus infection. Only a single gene, cadherin-related family member 3 (CDHR3), gave even the faintest glimmer of a positive reaction (10). Serendipitously, there was an almost simultaneous clinical report, the first known publication on CDHR3, stating that the human gene, in fact, had two alleles, presenting TGT and TAT codons on the same single-nucleotide polymorphism (SNP), rs6967330 (11). The native high prevalence of the G-encoded Cys529 protein variant (69% of participants were G/G in that study) had little clinical relevance, but the minority A-encoded Tyr529 variant (3% A/A homozygous) tagged this site as a highly potent “genome-wide … susceptibility locus for early childhood asthma with severe exacerbations” (11). CDHR3, regardless of allele, was found expressed to high levels in airway epithelium, including deep lung sites. But although both alleles seemed to give similar protein synthesis levels, the proteins themselves differed considerably in relative cell surface display.
When these genes are transfected (11) or transduced (10) into cells (e.g., 293T, HeLa), the asthma susceptibility protein (Tyr529) dominates the cell surface, while the asthma-protective protein (Cys529) does not (Fig. 1). Capitalizing on this concept, the RV-C receptor hunt switched focus to the Tyr529 variant and was almost instantly rewarded with fully infectious virus amplification systems. The direct and immediate experimental outcomes included tissue culture-adapted virus strains for enhanced growth (12) and satisfactory virus isolation for a high-resolution cryo-electron microscopy (cryoEM) determination of the RV-C15a capsid structure (13). Since that point, RV-C and CDHR3 investigations have been necessarily coevolving.
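The allele-to-protein relationship described above can be made concrete with a small sketch. It shows how the G and A alleles at the SNP give the TGT (Cys) and TAT (Tyr) codons, and what allele frequencies the quoted genotype fractions (69% G/G, 3% A/A) would imply. The function names are illustrative only, and the arithmetic assumes that every participant who was not homozygous was A/G heterozygous, which the text does not state explicitly.

```python
# Minimal sketch: the two codons at the CDHR3 SNP and the allele frequencies
# implied by the genotype fractions quoted in the text (69% G/G, 3% A/A).
# Function names are hypothetical, chosen for illustration.

# Only the two codons relevant here, taken from the standard genetic code.
CODON_TO_RESIDUE = {"TGT": "Cys", "TAT": "Tyr"}


def variant_for_allele(allele: str) -> str:
    """Return the residue-529 variant for the middle codon base (G or A)."""
    codon = "T" + allele.upper() + "T"
    return CODON_TO_RESIDUE[codon] + "529"


def allele_frequencies(frac_gg: float, frac_aa: float) -> dict:
    """Derive allele frequencies from the two homozygote fractions.

    Assumption: all remaining participants are A/G heterozygous.
    """
    frac_ag = 1.0 - frac_gg - frac_aa
    freq_g = frac_gg + frac_ag / 2
    freq_a = frac_aa + frac_ag / 2
    return {"A/G": frac_ag, "G": freq_g, "A": freq_a}


if __name__ == "__main__":
    print(variant_for_allele("G"))
    print(variant_for_allele("A"))
    print(allele_frequencies(0.69, 0.03))
```

Under that assumption, roughly 28% of the cohort would be heterozygous and the A allele frequency would be about 0.17, so carriers of at least one risk allele would be far more common than the 3% homozygous figure alone suggests.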
Classical cadherins are multifunctional Ca2+-dependent cell adhesion proteins, whose primary job is holding cells together through homologous contacts on or between cell surfaces (reviewed in reference 14). The multiple members of the superfamily and related subfamilies hold in common a linear arrangement of tandem extracellular (EC) repeat domains (5 are typical), preceded by a signal sequence and tailed with a transmembrane domain (TM) linked to cytoplasmic recognition units. As a rule, the EC repeats (6 β-strands each) orient themselves linearly, in long, slightly curved stalks (Fig. 2) according to obligate Ca2+ binding at unit junctions mediated by clusters of acidic residues (Asp, Glu). Interprotein contacts, commonly involving the outermost EC domains, are responsible for the adhesion properties. In turn, these are influenced by the stiffness or flexibility of the protein, conferred by the bound Ca2+ ions, removal of which can trigger withdrawal of the cadherin from the cell surface (15).
CDHR3 has no currently described native roles in organism development or lung function, although all known animal genomes maintain this gene with a high degree of sequence conservation. Modeling algorithms identify 6 EC domains (human sequence, aa 24 to 681), preceded and tailed by common signal (aa 1 to 23), TM (~aa 714 to 735), and cytoplasmic units (aa 736 to 885). Iterations of structure representations (10, 11) similar to Fig. 2 have been published along with predictions that the presumed RV-C contacts required for infection of humans probably involve the outermost 3 repeats (EC1, EC2, EC3). Three motif-anticipated glycosylation sites, Asn186 (EC2), Asn384 (EC4), and Asn624 (EC6), have been confirmed by mass spectrometry (data not shown), of which sites the Asn186 location is an intriguingly suggestive computational docking match for a glycan binding pocket on the virion surface discovered as part of the C15a structure determination (13). The possible involvement of other cellular proteins and the comapping of virus-CDHR3 binding footprints are currently undergoing rigorous experimental examination, as are the molecular mechanisms by which asthma susceptibility and virus infection frequency are apparently conferred by the Cys529 or Tyr529 variants.
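The domain layout quoted above can be written down as a simple lookup table, which makes it easy to check where a given residue falls. The sketch below is illustrative only: the `Region` type and `region_of` function are hypothetical names, the TM start is approximate in the text (~aa 714), and individual EC1 to EC6 boundaries are not given here, so the six repeats are treated as one block. Residues 682 to 713 are not assigned to any unit in the text and are reported as unassigned.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    start: int  # first residue, 1-based, inclusive
    end: int    # last residue, inclusive


# Boundaries as quoted in the text for the human sequence.
# Assumption: the approximate TM start (~aa 714) is taken at face value.
CDHR3_REGIONS = [
    Region("signal sequence", 1, 23),
    Region("EC repeats (EC1-EC6)", 24, 681),
    Region("TM", 714, 735),
    Region("cytoplasmic", 736, 885),
]


def region_of(residue: int) -> str:
    """Return the name of the region containing a residue number."""
    for region in CDHR3_REGIONS:
        if region.start <= residue <= region.end:
            return region.name
    # Covers aa 682-713 (not assigned in the text) and out-of-range input.
    return "unassigned"


if __name__ == "__main__":
    # Position 529 and the three confirmed glycosylation sites.
    for residue in (529, 186, 384, 624):
        print(residue, region_of(residue))
```

As expected from the text, position 529 and all three glycosylation sites fall within the extracellular repeat block.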
The initial description of CDHR3 posited that Cys529 in EC5 might impinge on an expected disulfide bridge between Cys566 and Cys592, ~20 Å distant in EC6, interfering with protein stability (11) and causing removal from the cell surface. It is equally plausible that this residue, nested intimately in the EC5-EC6 contact face, could easily reorient any required subset of the 12 nearby Asp/Glu residues (red in Fig. 2) contributing to Ca2+ binding and protein rigidity, as is also required for consequent surface display. The 529 position is actually flanked by Asp528 and Glu530 (Asp-Cys-Glu) and modeled within 3 to 8 Å of Glu458, Glu459, Asp561, Glu562, Asp696, and Asp698, all of which are presumed analogs to known Ca2+ chelator residues in determined cadherin structures (15). Adding even more relevance to this region, Cys529 is <15 Å from Asn624, one of the mapped CDHR3 glycosylation sites. The charge and size differences between Cys529 and Tyr529 certainly have the structural potential to influence any of these parameters.
Putting this all together, the key medical and biological information converge on the idea that a relatively rare, but phenotypically dominant, asthma-related A-encoded Tyr529 variant of the CDHR3 gene apparently causes greater RV-C receptor display on the pulmonary cell surfaces of children, making these kids, particularly the youngest, 2 to 5 times more susceptible to devastating virus-triggered asthma episodes (10, 11). In cell culture it is also becoming clear that RV-C cell-to-cell spread (i.e., the ability to form a plaque) is likewise dependent upon CDHR3 display density, a principle likely to hold true for natural infections as well. The probabilities of an initial contact triggering a productive infection and of subsequent virulent spread throughout the bronchi and lungs are logically then both under the influence of CDHR3 genetics. The A risk allele manifests itself as a measurable virus and asthma susceptibility factor, even for children who are A/G heterozygous (11). Why then, if one can be (partially) protected by the G allele (Cys529), does the extant human population still encode the obviously detrimental alternative, A (Tyr529)?
Apparently, Nature has been working on this! In ongoing efforts as part of a very rich set of collaborative investigations involving (among others) Nels Elde and Alesia McKeown (University of Utah) and John Hawks, David O'Connor, and Mary O'Neill (University of Wisconsin-Madison), it is now becoming clear that among all known recorded animal consensus or individual genome sequences, the Tyr529 variant was the singular ancestral protein, and its gene is the obvious sole homolog to prehuman, prehominid, preprimate, and pre-anything else, all the way back to the emergence of the first organisms with lungs. Modern humans alone encode a protective Cys529 variant. The 1000 Genomes Project (www.1000genomes.org) has recorded allele frequencies that range from >95% G (most Asian populations) to >31% A (most African populations), dependent upon specific ancestral lineages. Both alleles circulate as part of larger haplotypes on chromosome 7 that (preliminary data suggest) may date back at least 50,000 years, indicating the existence of Cys529 proteins and their availability for positive selection, although at some low level, prior to “Out of Africa.” We are only beginning initial forays into the paleogenetics data sets to trace the factors that led to this virus-protective phenotype, but there are intriguing hints that the G allele (Cys529), while completely absent in Neanderthal or Denisovan artifacts, was detectably present in Eurasian Neolithic populations ~8,000 years ago (16), albeit at frequencies much lower than in modern populations from the same area. Perhaps not coincidentally, our current best estimates on virus evolution place the RV-C divergence from RV-A (i.e., switch to CDHR3 receptors) within a similar time frame. Over the coming year, we hope to explore the hypothesis that the RV-C preference for CDHR3 may have helped shape fixation of the G-dependent Cys529 protein into our present asthma-resistant genotypes.
Apparently, RV-C is not just a cause of the common cold! If the observed, devastating effects of RV-C infections on modern children expressing Tyr529 proteins are any predictor, surely this infectious combination had, from the very beginning, the lethal potential to shape human evolution.
RV efforts in the Palmenberg group are supported by NIH grant U19 AI104317.
I thank Jim Gern, Kelly Watters, and Yury Bochkov for helpful suggestions and critical readings of the manuscript.