“Genomic medicine” refers to the diagnosis, optimized management, and treatment of disease—as well as screening, counseling, and disease gene identification—in the context of information provided by an individual patient’s personal genome. Genomic medicine, to some extent synonymous with “personalized medicine,” has been made possible by recent advances in genome technologies. Genomic medicine represents a new approach to health care and disease management that attempts to optimize the care of a patient based upon information gleaned from his or her personal genome sequence. In this review, we describe recent progress in genomic medicine as it relates to neurological disease. Many neurological disorders either segregate as Mendelian phenotypes or occur sporadically in association with a new mutation in a single gene. Heritability also contributes to other neurological conditions that appear to exhibit more complex genetics. In addition to discussing current knowledge in this field, we offer suggestions for maximizing the utility of genomic information in clinical practice as the field of genomic medicine unfolds.
We have entered the personal genomic era. Complete diploid genomes of multiple individuals have been sequenced (Schuster et al. 2010; Bentley et al. 2008; Wang et al. 2008; Ahn et al. 2009; Wheeler et al. 2008), in some cases offering diagnostic (Lupski et al. 2010) or even therapeutic (Worthey et al. 2010) insights. Genomic methods, including array comparative genomic hybridization (aCGH) and single-nucleotide polymorphism microarrays (“SNP-chips”) have already proven their clinical utility by providing information about the content of individual patients’ genomes; whole-genome sequencing (WGS) and its more streamlined counterpart, whole-exome sequencing (WES), appear to be on the verge of doing so. Although these genomic methodologies differ in many ways from “traditional” medical tests (e.g. complete blood counts, electroencephalography, or magnetic resonance imaging) and even from single-locus genetic tests (i.e. DNA sequencing of ‘candidate genes’ or enzyme assays for diseases considered during a differential diagnosis), they have similarly established their place among the tools available to physicians to anticipate, diagnose, and effectively manage illness in individual patients.
The neurological diseases represent a vast category of illness encompassing diseases of varying severity, age of onset, clinical presentation, genetic influences, and therapeutic responses. Many have a specific genetic basis (OMIM; http://www.ncbi.nlm.nih.gov/omim); whether this refers to an established cause-and-effect relationship between genotype at a single locus and phenotype (as in single-gene Mendelian disorders caused by highly penetrant variants) or to a genotype predisposing to illness (with possible modification by the environment or by other features of the genome) varies among different neurological diseases. The substantial number and variety of neurological conditions for which genomic variation plays a role in the susceptibility, course, and/or treatability of disease makes this an exciting class of disorders to consider in the context of genomic medicine.
We provide a working definition of genomic medicine, explain relevant genomic technologies and the types of genome variation they detect, and then review recent achievements in the field of genomic medicine as they pertain to selected neurological conditions. Conditions that have been reviewed recently (or concurrently in this special issue of Human Genetics) in a genomic context and those beyond the scope of this short review, for example, nervous system cancer and tumor syndromes, stroke, deafness, dementia, Parkinson disease, and autism, will not be covered. Rather, we will focus on a few neurological conditions most illustrative of the principles and possibilities of genomic medicine. We will conclude the article by offering suggestions for strategically maximizing the value of personal genome information in the practice of genomic medicine, particularly as this discipline pertains to neurological disease.
The term “genetic medicine” entered the scientific vocabulary almost four decades ago (Osmundsen 1973). This concept, made more relevant to clinical practice by the advent of disease gene mapping (Botstein et al. 1980; Gusella et al. 1983), gene identification by positional cloning and other methods (Royer-Pokora et al. 1986; Riordan et al. 1989; Wallace et al. 1990), and clinical mutation testing (Nowak 1994), referred—and still refers—to medicine in the context of an individual’s genotype at a single locus. Related terms like “pharmacogenetics” (the impact of genotype on pharmacology) had already been in use for some time (Evans and Clarke 1961).
The term “genomic medicine” arose in the late 1990s (Sikorski and Peters 1997; Beaudet 1999). In his 1998 Presidential Address to the American Society of Human Genetics, Dr. Arthur Beaudet described it as “the routine use of genotypic analysis, usually in the form of DNA testing, to enhance the quality of medical care” (Beaudet 1999). Since that time, the initial phases of the Human Genome Project (HGP) have been completed (International Human Genome Sequencing Consortium 2001, 2004), which allows a modified definition of genomic medicine to be constructed: The screening, diagnosis, counseling, and/or treatment of individual patients that is enhanced by knowledge of their genomes (and, in certain situations, their cancer genome(s) and/or the genomes of their microbial flora; Auffray et al. 2011). By this definition, genomic medicine contrasts with genetic medicine in that multiple loci are considered, a broadening of our thinking to incorporate genome-wide (as opposed to locus-specific) information (Table 1). Importantly, the genome is finite and contains all of the genetic information required for the existence of an individual living organism. The sum total of genome variation, including both simple nucleotide variation (SNV) and copy number variation (CNV), specifies much or most of the biological individuality of a person. For the purpose of this review, genomic medicine will also refer to disease gene identification, both in inherited and sporadic disease, using genomic technologies.
Disease gene discovery is entering the realm of the clinic. Although the research laboratory remains paramount in this pursuit, clinical diagnostic laboratories have in recent years established many novel gene–phenotype connections (Samuels et al. 2008; Vissers et al. 2010b). As some familiarity with the genomic technologies used in gene discovery is necessary to understand their capabilities and limitations, a brief description of each follows. When appropriate, citations are chosen from studies of neurological disease.
Microarrays consist of small pieces of the human genome arrayed on a glass slide and enable simultaneous genome-wide interrogation of multiple loci. The more sequences placed on the array, the higher the resolution of genomic interrogation, analogous to the better ‘picture’ resolved by increasing the pixel count of a digital camera, CT/MRI machine, etc. Microarrays were originally designed to measure relative mRNA expression (expression arrays; Schena et al. 1995), then subsequently co-opted to measure relative genomic content (aCGH; Pinkel et al. 1998), invaluable for detecting genomic rearrangements (deletions and duplications) genome wide. It became apparent by the 1990s that many disease phenotypes could result from structural changes of the human genome; these afflictions were referred to as genomic disorders (Lupski 1998, 2009). Clinically, aCGH has been used to diagnose patients with genomic disorders originating from both large (Cheung et al. 2005) and small (Boone et al. 2010) genomic copy number changes. In research and clinical settings, aCGH has been used to discover new disease loci by associating specific CNVs with clinical phenotypes. Such disease-associated CNVs can include genomic intervals encompassing many genes (Potocki et al. 2007, 2000), affect individual genes (Walsh et al. 2008), or even occur in non-coding regions of the genome (Smyk et al. 2007).
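To make the idea of measuring "relative genomic content" concrete, the following is a minimal sketch of how per-probe aCGH signals could be turned into copy-number calls. It assumes each probe yields a patient (test) intensity and a reference intensity, that the log2 ratio of the two is the quantity of interest, and that fixed cutoffs of ±0.3 separate gains and losses from normal dosage. The cutoffs, function names, and example values are illustrative assumptions, not those of any particular clinical platform.

```python
import math
from itertools import groupby

# Hypothetical cutoffs; real platforms calibrate these empirically.
GAIN_THRESHOLD = 0.3
LOSS_THRESHOLD = -0.3


def log2_ratio(test_signal, reference_signal):
    """Log2 of patient signal over reference signal for one probe."""
    return math.log2(test_signal / reference_signal)


def call_probe(test_signal, reference_signal):
    """Classify a single probe as 'gain', 'loss', or 'normal'."""
    ratio = log2_ratio(test_signal, reference_signal)
    if ratio >= GAIN_THRESHOLD:
        return "gain"
    if ratio <= LOSS_THRESHOLD:
        return "loss"
    return "normal"


def call_segments(probes):
    """Merge consecutive probes with the same non-normal call.

    probes: list of (position, test_signal, reference_signal),
            ordered by genomic position.
    Returns a list of (start_position, end_position, call).
    """
    calls = [(pos, call_probe(test, ref)) for pos, test, ref in probes]
    segments = []
    for call, run in groupby(calls, key=lambda item: item[1]):
        run = list(run)
        if call != "normal":
            segments.append((run[0][0], run[-1][0], call))
    return segments


example_probes = [
    (100, 2.0, 2.0),  # equal dosage
    (200, 1.0, 2.0),  # one copy versus two: log2 ratio of -1
    (300, 1.1, 2.0),
    (400, 2.0, 2.0),
    (500, 3.0, 2.0),  # three copies versus two: log2 ratio of about 0.58
]
print(call_segments(example_probes))
```

Under these assumptions, a heterozygous deletion gives a theoretical log2 ratio of -1 and a single-copy duplication gives about 0.58, which is why denser probe coverage (more probes per interval) allows smaller rearrangements to be resolved.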
Another type of microarray is the SNP (single-nucleotide polymorphism) array, which enables genotyping of >10^5 SNPs throughout the genome. It is used clinically to detect absence of heterozygosity (AOH) owing to uniparental disomy or consanguinity, and even allows an estimation of the degree of relatedness of parents of consanguineous offspring (Altug-Teber et al. 2005; Schaaf et al. 2011). AOH detection enables homozygosity mapping, which can aid in disease gene identification in recessive conditions (Gundesli et al. 2010). SNP arrays are also the technology used in modern linkage mapping (a technique to map and identify disease loci responsible for Mendelian disorders) (Sas et al. 2010) and genome-wide association studies (GWAS; Grant and Hakonarson 2008).
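The detection of AOH from SNP-array genotypes can be sketched as a search for uninterrupted runs of homozygous calls. The example below is a simplified illustration only: it assumes genotypes are given as two-letter strings ordered by position, uses a hypothetical minimum run length counted in SNPs, and does not tolerate genotyping errors. Practical analyses typically define regions by physical length and allow for occasional miscalls.

```python
def find_aoh_runs(genotypes, min_snps=3):
    """Find runs of consecutive homozygous SNP calls.

    genotypes: list of (position, genotype) ordered by position, where
               genotype is a two-character string such as "AA" or "AB".
    min_snps:  hypothetical minimum number of consecutive homozygous
               SNPs required to report a run.
    Returns a list of (start_position, end_position, snp_count).
    """
    runs = []
    start = None
    last = None
    count = 0
    for position, genotype in genotypes:
        if genotype[0] == genotype[1]:
            if start is None:
                # A new homozygous run begins here.
                start = position
                count = 0
            count += 1
            last = position
        else:
            # A heterozygous call ends any run in progress.
            if start is not None and count >= min_snps:
                runs.append((start, last, count))
            start = None
    # Report a run that extends to the final SNP.
    if start is not None and count >= min_snps:
        runs.append((start, last, count))
    return runs


example_genotypes = [
    (1, "AA"), (2, "AB"), (3, "BB"), (4, "AA"),
    (5, "BB"), (6, "AB"), (7, "AA"),
]
print(find_aoh_runs(example_genotypes))
```

In homozygosity mapping, the intervals reported by such a search in an affected child of consanguineous parents would be the regions prioritized when looking for a recessive disease gene.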
The goal of GWAS is to identify either risk genotypes (i.e. those found more commonly in a cohort of unrelated patients with a disease) or protective genotypes (found more commonly in unaffected control individuals) at one or more SNPs throughout the genome. As multiple loci may be identified in a single GWAS, this technique is amenable to the study of diseases with an oligo- or polygenic origin (i.e. an individual’s genotypes at a few or several loci can each contribute to disease risk) and of genetically heterogeneous single-gene disorders (i.e. a mutation in any one of several genes results in the same clinical phenotype). Susceptibility to many common diseases is thought to have these types of genetic underpinnings. For this reason, and because GWAS requires both a large number of patients (difficult to enroll in studies of rare diseases) and a relatively large contribution to disease heritability from any haplotype that is to be identified, GWAS is said to be ideal for the study of “common diseases [caused by] common variants” (i.e. investigating the CDCV hypothesis).
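The case-versus-control comparison at a single SNP can be illustrated with a short sketch that computes an allelic odds ratio and a Pearson chi-square statistic from a 2×2 table of allele counts. The function name and the counts are invented for illustration. An actual GWAS repeats such a test at every SNP and must additionally correct for multiple testing and population structure, both of which are omitted here.

```python
def allelic_association(case_allele, case_other, control_allele, control_other):
    """Odds ratio and chi-square statistic for one SNP.

    Arguments are allele counts (not individuals): copies of the allele
    of interest and of the alternative allele, in cases and in controls.
    Returns (odds_ratio, chi_square) for the 2x2 table.
    """
    a, b, c, d = case_allele, case_other, control_allele, control_other
    if min(a, b, c, d) <= 0:
        raise ValueError("all four counts must be positive in this sketch")
    odds_ratio = (a * d) / (b * c)
    total = a + b + c + d
    # Pearson chi-square for a 2x2 table, one degree of freedom.
    chi_square = total * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    return odds_ratio, chi_square


# Invented counts: the allele is enriched in cases (60 of 100 alleles)
# relative to controls (40 of 100 alleles).
odds_ratio, chi_square = allelic_association(60, 40, 40, 60)
print(odds_ratio, chi_square)
```

An odds ratio above 1 corresponds to a candidate risk allele (more common in patients), and one below 1 to a candidate protective allele (more common in controls).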
An unanticipated finding has been the repeated observation that even well-executed GWAS may only account for a small fraction of the heritability of many diseases (Bogardus 2009). The “rare variants, common disease” (RVCD) hypothesis suggests that multitudinous, rare variants, potentially at many different loci (i.e. constituting genetic heterogeneity), may contribute to the remaining unexplained heritability. For rare variants that are ‘non-ancient,’ or even de novo in the affected individual, the haplotype in which they exist will be uninformative to GWAS. This hypothesis has played out in the case of Parkinson disease, where high-throughput sequencing of a single gene (GBA) may explain more heritability of disease burden than GWAS in certain populations (Sidransky et al. 2009; Tsuji 2010).
Once a single haploid reference human genome had been completed (International Human Genome Sequencing Consortium 2001), genome resequencing could begin. In resequencing (referred to below as “sequencing”), newly derived sequences are mapped to the reference genome, which allows the sum total of variation in a personal genome to be identified by virtue of comparison to the reference and specific variations to be correlated to specific phenotypes. Only with the invention of massively parallel sequencing (MPS) using next-generation sequencing technology (methods that are faster and cheaper than Sanger dideoxynucleotide sequencing, the method used to complete the reference human genome) (Metzker 2010) has routine genomic sequencing become feasible. Two general approaches to genome sequencing exist: (1) whole-genome sequencing (WGS) and (2) whole-exome sequencing (WES), in which exonic sequence (~1–2% of the genome) is enriched via a ‘capture’ step, allowing preferential sequencing of exons to reduce time and cost (Okou et al. 2007; Hodges et al. 2007; Albert et al. 2007). Two drawbacks of exome sequencing are that mutations in non-exonic sequence or poorly captured exons may be missed and that very little, if any, CNV data are provided.
Linkage mapping in Mendelian disease may eventually be replaced by sequencing, especially as SNPs are usually not themselves the disease-causing or predisposing variants. Sequencing has not yet been used successfully to replace GWAS in cases of multi-gene disease.
Diseases displaying genetic heterogeneity arise from mutations in any of several genes. Thus, an efficient way to test clinically for mutations is for a physician to order a panel test, in which multiple genes are sequenced. Some genomic panels are diagnostic, whereas others are predictive.
Predictive panel tests fall into several categories: carrier tests, disease susceptibility tests, and pharmacogenomic tests. A further distinction is that some are direct-to-consumer genetic tests (DTC-GTs) while others are ordered by physicians. Predictive tests generally use SNP arrays to assess an individual’s genome for both known disease-causing mutations and disease-associated SNPs. In the latter case, the strength of the disease association varies widely among SNPs (United States National Human Genome Research Institute; http://www.genome.gov/gwastudies/). In the case of one currently offered DTC-GT panel, 12 of 24 (50%) carrier status tests involve neurological/muscular conditions (e.g. familial dysautonomia and limb-girdle muscular dystrophy), 10 of 18 (56%) pharmacogenetic tests involve neurological/psychiatric conditions (e.g. response to interferon-beta therapy and naltrexone treatment response), and 27 of 93 (29%) disease risk tests involve neurological/psychiatric conditions (e.g. restless leg syndrome and stroke; only five of the 27 conditions (19%) are “conditions and traits for which there are genetic associations supported by multiple, large, peer-reviewed studies”) (https://www.23andme.com/health/all/). Legal and ethical issues of DTC-GT are discussed in an accompanying article in this issue of Human Genetics (Caulfield 2011).
The large number of known genetic conditions (even among the neurological disorders), genes involved in them, variants reported in these genes, and the incredible rate with which discoveries in these areas are reported have made online sources the ideal repository for this information. Table 2 constitutes a brief list of internet resources valuable for the practice of genomic medicine. These and other online resources are discussed by De Sevo (2010), Waggoner and Pagon (2009), and Mangan et al. (2009).
“Classic” examples of mutations causing human disease are often point mutations (e.g. HbS in sickle cell anemia; Ingram 1959). Although illustrative of the genetic code, this type of mutation does not portray the entire picture with respect to genomic variation and the genomic basis of disease. Indeed, visible structural mutations (aneuploidy, translocation), genomic/genic/intragenic/non-coding copy-number variation, inversions, microsatellite expansion, indels, mutations in non-coding RNAs, regulatory sequences, and other non-coding regions of the genome, and epimutations may lead to neurological (and other) diseases. A brief, illustrative list of neurological conditions resulting from various mutation types constitutes Table 3. Furthermore, a given disorder may be caused by any of several mutation types (Fig. 1). At the present time, a dichotomy exists between genomic methods that detect SNVs (detectable with SNP arrays and sequencing) and those that detect CNVs (best detected using aCGH). It is anticipated that whole-genome sequencing may eventually rival aCGH at efficiently detecting CNVs (Mills et al. 2011).
Intellectual disability or ID (previously referred to as mental retardation) can be caused either by environmental insults or by genetic lesions and may exist in isolation (nonsyndromic ID) or as part of a complex of symptoms (syndromic ID). A large number of genes has been implicated in ID (Leiden Open Variation Database Mental Retardation Database; http://www.LOVD.nl/MR). The number of genes reported to be associated with X-linked intellectual disability (XLID), the most common mode of inheritance in ID, is climbing towards 100 (Chiurazzi et al. 2008); a figure of ~10% of X-chromosome genes is given by Tarpey et al. (2009). Molecular diagnostic tests for XLID have become quite sophisticated (GeneTests; http://www.ncbi.nlm.nih.gov/sites/GeneTests/?db=GeneTests). For example, next-generation sequencing of the exons and flanking intronic sequence of 92 genes implicated in XLID has just been introduced by one diagnostic laboratory (http://www.ggc.org/diagnostics/next-gen-sequencing.htm). Yet, despite the large number of known XLID genes and tests to detect mutations in them, substantial missing heritability exists in this condition. As such, XLID has been the subject of several recent large-scale gene- and mutation-finding efforts.
Tarpey et al. (2009) sequenced the coding exons of most X-chromosome genes in 208 ID families where pedigrees were consistent with X-linked inheritance and previous mutation detection efforts had failed. Mutations were detected in both known ID-related genes and newly proposed ones, including SYP and ZNF711. As a family based approach was used, only mutations which segregated with disease in males and obligate carrier females were considered candidates. This approach, although powerful, was only informative for a minority (35/208; 17%) of families. This may be on account of disease-causing mutations in non-captured sequence (including non-annotated exons), synonymous mutations being overlooked (intentionally) as part of the computational mutation-finding algorithm, or CNVs that were not detected by exome sequencing, among other reasons.
Another family based approach was that of Whibley et al. (2010), who performed high-density X-chromosome aCGH in 251 families with evidence of XLID, most of which were mutation-negative families from the study by Tarpey et al. (2009). CNVs deemed likely to be pathogenic, which ranged from 2 kb to 11 Mb in size, were found in approximately 10% of families. These CNVs interrupted or encompassed both known XLID genes and new candidates (PTCHD1, WDR13, FAAH2, and GSPT2). Complementing this study of X-linked ID CNVs, the search for autosomal CNVs associated with ID has also been fruitful (reviewed in Vissers et al. 2010b).
The aforementioned XLID studies were performed on families in which segregation of the mutations with disease in males and carrier females strengthened evidence for causation. Furthermore, the inheritance pattern (X-linked) limited the genomic scope of capture sequencing and aCGH. Vissers et al. (2010a), contrastingly, performed whole-exome sequencing on 10 pre-screened subjects with sporadic mental retardation. In each case, both parents were available for testing, which allowed the search for potentially causative mutations to be narrowed to include only de novo variants. Assuming autosomal dominant causality, the authors identified a heterozygous, de novo, non-synonymous mutation in each of nine different genes, six of which were suggested to explain sporadic ID in as many patients. Two of these were in known ID genes, and four were in genes that made ‘biological sense,’ though no functional experiments were performed. Of the remaining three patients, one had inherited a mutation in a known XLID gene from a carrier mother in whom the mutation occurred de novo. These data suggest that new mutations can contribute significantly to ID (reviewed by Lupski 2010). Whole-genome sequencing in ID is likely imminent, though issues of data analysis, replication, appropriate use of controls, and functional studies will remain important to consider.
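The trio-based narrowing described above can be sketched as a set comparison: variants observed in the child but in neither parent are retained as de novo candidates, and these are then restricted to non-synonymous changes. This is a simplified illustration, not the pipeline used by Vissers et al. (2010a). The variant representation, effect labels, function names, and coordinates are hypothetical, and a real analysis must also account for sequencing errors and uneven coverage in the parents, which can produce spurious de novo calls.

```python
def de_novo_candidates(child, mother, father):
    """Return variants present in the child but in neither parent.

    Each variant is a (chromosome, position, ref_allele, alt_allele)
    tuple. The result is sorted for reproducible output.
    """
    parental = set(mother) | set(father)
    return sorted(v for v in child if v not in parental)


def keep_non_synonymous(variants, effects):
    """Keep variants annotated as non-synonymous.

    effects: dict mapping a variant tuple to a hypothetical annotation
             string, either "synonymous" or "non-synonymous".
    """
    return [v for v in variants if effects.get(v) == "non-synonymous"]


# Invented example data for one parent-child trio.
child = {
    ("chr1", 100, "A", "G"),
    ("chr2", 200, "C", "T"),
    ("chr3", 300, "G", "A"),
}
mother = {("chr1", 100, "A", "G")}
father = {("chr5", 1, "T", "C")}
effects = {
    ("chr2", 200, "C", "T"): "non-synonymous",
    ("chr3", 300, "G", "A"): "synonymous",
}

candidates = de_novo_candidates(child, mother, father)
print(keep_non_synonymous(candidates, effects))
```

The inherited variant on chr1 is removed by the parental comparison and the synonymous variant on chr3 by the effect filter, leaving a single candidate for follow-up.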
Much progress has been made finding genes responsible for Mendelian syndromes involving ID. The gene for Kabuki syndrome (MIM 147920), MLL2, was recently identified by whole-exome sequencing of a cohort of Kabuki syndrome patients and confirmed by subsequent MLL2 sequencing in multiple other affected individuals (Ng et al. 2010). Similarly, a gene for Schinzel–Giedion midface retraction syndrome (MIM 269150), SETBP1, was discovered by Hoischen et al. (2010) by exome sequencing.
Part of the excitement of genomic medicine is the potential to understand how genotypes at more than one locus might modify a given phenotype. Girirajan et al. (2010) identified a rare, recurrent genomic deletion at 16p12.1 associated with ID/developmental delay that, in combination with other CNVs, has been proposed to cause disease in a “two-hit” fashion. Alone, the CNV was found to predispose to neuropsychiatric disease (for example in transmitting parents). Interestingly, probands with this CNV were more likely to have an additional large CNV in their genome than were controls, and those with two CNVs had symptoms more severe than those with only the 16p12.1 deletion. Biological reasons for this phenomenon can be hypothesized, although these will have to be tested by experimentation in the future. Furthermore, other explanations for the observed data will have to be ruled out. It is likely that many other two-hit pairs of loci exist, and that three-hit, four-hit, and higher order models, as proposed for digenic (Kajiwara et al. 1994) and triallelic (Katsanis et al. 2001) inheritance, may eventually more accurately explain the risk of ID/developmental delay and other phenotypes.
Although not always thought of as a disease itself, pain is certainly a phenomenon of neurological origin and importance. Genome-wide studies of pain are limited in number, though a handful of single-gene pain or pain insensitivity syndromes have provided insights. Paroxysmal extreme pain disorder (PEXPD; MIM 167400) consists of episodic and intense rectal, mandibular, and/or ocular pain. Fertleman et al. (2006) mapped this condition to chromosome 2q using linkage analysis in a single, large pedigree. They subsequently identified causative heterozygous missense mutations in SCN9A, a gene in this region that encodes the alpha subunit of the Na(v)1.7 voltage-gated sodium channel, in multiple PEXPD families and individuals. Intriguingly, heterozygous gain of function mutations in this gene have also been shown to cause autosomal-dominant primary erythermalgia (MIM 133020), a disorder of vascular congestion/dilation and burning pain of the distal lower extremities that is precipitated by heat and other factors (Drenth and Michiels 1992). Functional in vitro analysis demonstrated that PEXPD-causing and primary erythermalgia-causing mutations have differing effects on cellular electrophysiology (Fertleman et al. 2006). Thus, these are allelic channelopathies that arise from separable ion conduction phenomena. Furthermore, carbamazepine, which effectively treats PEXPD but not primary erythermalgia, rectifies the electrophysiological abnormalities caused by PEXPD mutations, but not those of primary erythermalgia mutations. This example is instructive as to how genotype may dictate not only phenotype, but also pharmacologic response.
Very interestingly, homozygous loss of function mutations in SCN9A can cause a third disorder linked to this locus, autosomal recessive congenital indifference to pain (MIM 243000). Cox et al. (2006) mapped this condition in three families to 2q and identified homozygous nonsense mutations in the gene. They demonstrated that Na(v)1.7 function was nonexistent in cells with these mutations and suggested that SCN9A and the channel it encodes are attractive targets for drug development to treat pain.
Two recent studies have identified genetic risk factors for neuropathic pain by combining human and model organism genomic approaches. Costigan et al. (2010) performed genome-wide expression analysis in dorsal root ganglia of rat models of neuropathic pain following nerve injury. The expression of KCNS1, a potassium channel alpha subunit, was found to be decreased following nerve injury. The authors then tested for sequence differences at this locus in patient cohorts, and a risk allele for neuropathic pain was identified. Nissenbaum et al. (2010) mapped a mouse quantitative trait locus (QTL) for pain to a ~4 Mb region of the mouse genome. Subsequent expression and informatic analyses identified CACNG2 as a candidate pain susceptibility gene in this region, further implicated by experiments in CACNG2 hypomorphic mice. The authors then described CACNG2 polymorphisms that segregated with chronic neuropathic pain in post-surgical human subjects. These studies, each unique in methodology, provide examples of how model organism genomics may inform human genomics, identifying susceptibility loci that could aid drug development and/or risk stratification in patients.
Over 20 genes have been implicated in the monogenic epilepsies (Rees 2010). Most of these constitute channelopathies or result from some other manner of dysfunctional modulation of neurotransmission. Many display autosomal dominant inheritance, suggesting that pathogenicity may derive from haploinsufficiency or altered stoichiometry of channel subunits, from a dominant-negative effect, or from gain of function. Mutations in putative epilepsy genes may be tested electrophysiologically in vivo and in vitro, providing evidence for or against their implication in the disease. Despite remarkable progress in family-based mapping and identification of epilepsy disease genes, over half of epilepsy patients remain genetically undiagnosable (Rees 2010). Unfortunately, both a recent case–control approach examining candidate genes (Cavalleri et al. 2007) and GWAS (Kasperavičiūte et al. 2010) failed to find additional susceptibility loci. Equally unfortunate for patients is that a sophisticated and effective genome-based algorithm for treatment optimization, ideal for a genetically heterogeneous condition for which multiple treatment options exist, is not yet a reality (Anderson 2008).
Multiple mutations have been described in the SCN1A gene, which cause a handful of related epilepsy phenotypes including Dravet syndrome (also known as severe myoclonic epilepsy of infancy [SMEI]; MIM 607208) and generalized epilepsy with febrile seizures plus (GEFS+; MIM 604233) (Human Gene Mutation Database; http://www.hgmd.cf.ac.uk/ac/index.php). Individuals with somatic or germline mosaicism for this gene have been described (Gennaro et al. 2006). Interestingly, and expectedly, a mosaic, transmitting parent may have mild or nonexistent symptoms, whereas that individual’s child, who is expected to have the mutation in all of his or her (brain) cells, is fully affected. Furthermore, Vadlamudi et al. (2010) recently described a set of monozygotic twins discordant for Dravet syndrome in which the affected individual had a mutation in SCN1A and the unaffected twin did not. These and similar studies indicate that the concept of mosaicism is important in epilepsy and support the idea that mutations can arise at any stage of development (Fig. 2) (Lupski 2010). In addition, they suggest something more profound: that perhaps many patients have epilepsies that result from mutations during their own development, such that the brain contains cells with a mutation—potentially even in a known epilepsy gene—that is not present in ‘testable tissues’ (most commonly blood or potentially fibroblasts) (Lindhout 2008). These patients would evade a genetic diagnosis. In addition, perhaps in some focal epilepsies only part of the brain carries a mutation, or perhaps some types of epilepsies can only arise when two genetically distinct populations of cells co-exist in a single individual’s brain. As Lindhout (2008) suggests, these hypotheses could be tested by searching for somatic mutations in postmortem or surgically resected brain samples from patients with epilepsy (and other neurological disorders!). 
One can imagine an initiative akin to The Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov/abouttcga) or the Catalogue of Somatic Mutations in Cancer (COSMIC, http://www.sanger.ac.uk/genetics/CGP/cosmic/) in which genome-wide methods like aCGH or genome/exome sequencing are used, although a simple ‘candidate’ gene approach focusing on known disease genes may be sufficient as an initial, proof-of-principle experiment.
Multiple sclerosis (MS) is a complex disease that affects multiple central nervous system substructures and is characterized by inflammation, demyelination, and neuronal pathology (Hauser and Oksenberg 2006). Although epidemiologic studies indicate that an individual’s environment contributes heavily to disease susceptibility (Ebers 2008), a modest level of heritability is indicated by familial aggregation (Oksenberg and Baranzini 2010). Several susceptibility alleles or haplotypes have been found by GWAS and candidate-gene approaches, the highest risk of which correspond to HLA genes on chromosome 6p21–p23 (Svejgaard 2008; Oksenberg and Baranzini 2010). De Jager et al. (2009) extracted additional statistical power from several GWAS by pooling their results in a meta-analysis. Risk alleles in three new non-HLA genes were identified, all of which were replicated in follow-up studies on independent control and MS-afflicted individuals.
None of the GWAS-identified risk genotypes is completely penetrant, and no families transmitting the MS phenotype with a clear inheritance pattern have been described. Despite these indications of the genetic complexity of MS, Baranzini et al. (2010) took a whole genome, whole epigenome, and whole transcriptome sequencing approach to determine whether detectable changes might explain disease in a small number of individuals. Specifically, these authors sequenced the entire genomes of one pair of MS-discordant monozygotic twins, and the epigenomes and mRNA transcriptomes of CD4+ T cells from three pairs of MS-discordant monozygotic twins. No informative differences were revealed in any of the three arms of the study; the features compared included indels, SNPs (even confirmed MS susceptibility SNPs), HLA haplotypes, CNVs, methylation differences, and mRNA expression changes.
The question has arisen whether clinical genetic testing might have some utility to estimate MS risk, diagnose the disease, or offer prognostic information. In one recent analysis, Sawcer et al. (2010) argued that doing so at this time would offer few, if any, patients any medically actionable information. Future MS genomic studies are needed to assemble a usable, multi-locus model of MS susceptibility and progression. Population-scale whole-genome sequencing and elegant computational analyses are likely to be the necessary technologies in this pursuit.
Two recent studies (Ramagopalan et al. 2009, 2010) provided exciting biological evidence to support the idea that the genetic and environmental influences on MS susceptibility may interact (Ebers 2008). Latitude is the environmental variable most predictive of MS risk (Lauer 1997), an effect proposed to be mediated by vitamin D via differences in sunlight exposure (Ebers 2008). Ramagopalan et al. described a vitamin D responsive element in the MS risk-associated locus HLA-DRB1 (Ramagopalan et al. 2009), then subsequently identified vitamin D receptor-binding sites genome-wide via chromatin immunoprecipitation followed by sequencing (ChIP-seq), providing experimental evidence for binding sites that overlap with several susceptibility loci for MS and other autoimmune diseases (Ramagopalan et al. 2010).
Studies that probe the interface of the environment and the genome may prove to be particularly useful in that an individual’s environment could potentially be modified; thus, such gene × environment studies might suggest options for disease prevention, modification, or cure. An instructive example of this involves dietary modification of another neurological disorder: ataxia. Familial isolated deficiency of vitamin E (VED; MIM 277460) is a recessive disorder caused by mutations in the gene encoding α-tocopherol transfer protein, TTPA (Ouahchi et al. 1995). Deficiency of this protein prevents α-tocopherols, members of the vitamin E family of molecules, from being incorporated into plasma VLDL (Traber et al. 1990) and results in vitamin E deficiency and resultant neurological symptoms, most strikingly a Friedreich-like ataxia (Harding et al. 1985). Vitamin E supplementation slows progression of the disease or may even lead to some improvement of neurological symptoms (Gabsi et al. 2001; Mariotti et al. 2004). In one report, presymptomatic treatment prevented symptom onset in the younger siblings of an affected patient for the 5-year duration of the study (Amiel et al. 1995).
Charcot–Marie–Tooth disease (CMT) is the most common inherited neurological disease (Szigeti and Lupski 2009). Patients with CMT experience progressive deterioration of the peripheral nerves with secondary muscle wasting and weakness in a distal distribution (i.e. distal symmetric polyneuropathy, or DSP) (England et al. 2009a, b). The disease is extremely heterogeneous both clinically and molecularly. Clinical subtypes include Charcot–Marie–Tooth disease, type 1 (CMT1), Charcot–Marie–Tooth disease, type 2 (CMT2), Dejerine–Sottas neuropathy (DSN; MIM 145900), congenital hypomyelinating neuropathy (CHN; MIM 605253), and Roussy–Levy syndrome (RLS; MIM 180800) (Jani-Acsadi et al. 2008). Electrophysiological studies enable a distinction between the two major classes of the disease: demyelinating, with symmetrically slowed nerve conduction velocity (NCV); and axonal, which is associated with normal NCV but reduced muscle action potentials.
Thus far, over 40 different genetic loci have been linked to CMT; for approximately 30 of these loci, specific genes have been identified (Bird 2011). These genes encode proteins involved in myelination, axonal transport, Schwann cell differentiation, signal transduction, mitochondrial function, protein translation, and DNA single-strand break repair (Szigeti and Lupski 2009). CMT displays autosomal dominant, autosomal recessive, or X-linked transmission, predominantly depending on the locus/gene involved; intriguingly, however, for mutations in some loci either dominant or recessive inheritance may be observed depending on the specific mutation (De Jonghe et al. 1997; Keller and Chance 1999; Nelis et al. 1999). The most prevalent form of Charcot–Marie–Tooth disease, CMT1A (MIM 118200), is caused in the vast majority of cases by copy-number gain of the PMP22 gene and a gene dosage effect (Lupski et al. 1991, 1992; Raeymaekers et al. 1991). This condition was the first genomic disorder described (Lupski 1998, 2009). The 1.4 Mb CMT1A genomic duplication in 17p12 results from unequal crossing over of homologous chromosomes at repeated sequences that flank the duplicated region (Pentao et al. 1992). The reciprocal deletion leads to hereditary neuropathy with liability to pressure palsies (HNPP; MIM 162500), which manifests with recurrent nerve dysfunction secondary to nerve compression (Chance et al. 1993, 1994). Point mutations in PMP22 can lead to autosomal-dominant or autosomal-recessive forms of CMT (Roa et al. 1993; Shy et al. 2006).
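The gene dosage relationship at the PMP22 locus can be summarized in a few lines of code. The following is only an illustrative sketch: the function name and return strings are hypothetical, it assumes a simple integer copy number for PMP22 per diploid genome, and it deliberately ignores point mutations and any copy number other than one, two, or three. It is not a diagnostic algorithm.

```python
def pmp22_dosage_phenotype(copies: int) -> str:
    """Map PMP22 copy number to the dosage-associated condition named in the text.

    Hypothetical helper for illustration only; real interpretation requires
    clinical and electrophysiological correlation.
    """
    if copies < 0:
        raise ValueError("copy number cannot be negative")
    if copies == 3:
        # One extra copy: the 17p12 duplication associated with CMT1A.
        return "CMT1A (17p12 duplication)"
    if copies == 1:
        # One copy lost: the reciprocal 17p12 deletion associated with HNPP.
        return "HNPP (reciprocal 17p12 deletion)"
    if copies == 2:
        # Two copies: the usual diploid dosage.
        return "typical dosage"
    # Other copy numbers are outside the scope of this sketch.
    return "not covered by this sketch"


for n in (1, 2, 3):
    print(n, pmp22_dosage_phenotype(n))
```

The point of the sketch is that the same genomic interval yields two distinct conditions depending solely on whether a copy is gained or lost.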
Despite the significant advancements in CMT research, there are still multiple unidentified CMT genes and likely many genetic factors affecting the phenotypic expressivity that remain to be discovered. Progress in these areas has the potential to accelerate in the coming years owing to recent technological advances. For example, the application of whole-genome sequencing in a patient with recessive CMT recently enabled identification of two different causative alleles in the SH3TC2 gene and documented the first successful application of this powerful technique for the identification of disease-causing mutations (Lupski et al. 2010). In addition, as family members of the affected individual were available for testing, it was demonstrated that carriers of either mutation exhibited milder, distinct, dominant neuropathic symptoms. These included an electrophysiologically characterized axonal neuropathy segregating with a missense, potentially gain-of-function variant and susceptibility to carpal tunnel syndrome (CTS) segregating with a nonsense allele; the latter variant demonstrated, as has been shown by Young et al. (1997) and Potocki et al. (1999) for PMP22, that haploinsufficiency of a CMT gene can result in genetic susceptibility to the common complex trait of CTS.
Spinal muscular atrophy (SMA) is one of the most common autosomal-recessive nervous system disorders, and is characterized by degeneration of the alpha motor neurons in the spinal cord and brain stem nuclei leading to progressive muscle weakness (Wang et al. 2007; Lunn and Wang 2008). Patients with SMA present with a broad clinical spectrum ranging from death in early infancy to normal adult life with only mild weakness. SMA is classically divided into four types (SMA I–IV) based on the age of onset and highest function achieved. All types of SMA are caused by disruption of the SMN1 gene that is localized to chromosome 5q13 (Melki et al. 1990; Lefebvre et al. 1995). This gene-harboring region is unstable and subject to intrachromosomal rearrangements including gene duplications, conversions, and deletions. This fact delayed the cloning of the disease-causing gene (Burghes 1997; McLean et al. 1994).
There are two paralogous (telomeric and centromeric) elements of ~500 kb in the SMA region, with SMN1 localized to the telomeric copy and the SMN2 gene localized to the centromeric copy. Increased SMN2 number and reduction or absence of SMN1, resulting from gene conversion from SMN1 to SMN2, occasionally occur on account of the genomic architecture at this locus (Burghes 1997). SMN1 is a fully active gene that produces full-length protein (Lefebvre et al. 1995). SMN2 differs from this gene by only five nucleotides; one of these changes causes abnormal splicing and skipping of exon 7 in 90% of SMN2 mRNA (Lorson et al. 1999). However, 10% of SMN2 mRNA retains exon 7, yielding fully active protein (Lefebvre et al. 1995). Quantitative analyses of SMN2 copy number in SMA patients have shown that higher copy number correlates with milder phenotype (Mailman et al. 2002). This provides an interesting example of how genomic architecture may play a role not only in generating disease-causing mutations, but also in modulating disease phenotype.
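The arithmetic behind the SMN2 dosage effect can be made explicit with a toy model. The sketch below rests on several simplifying assumptions that are not from the studies cited: every gene copy is transcribed at the same rate, all SMN1 transcripts are full-length, and exactly 10% of SMN2 transcripts retain exon 7 (the figure quoted above). The function and constant names are hypothetical, and the output is a relative quantity in units of "one SMN1 copy", not a measured protein level.

```python
# Fraction of transcripts that include exon 7 (full-length), per gene copy.
FULL_LENGTH_FRACTION_SMN1 = 1.0   # assumption: SMN1 mRNA treated as fully full-length
FULL_LENGTH_FRACTION_SMN2 = 0.10  # from the text: about 10% of SMN2 mRNA retains exon 7


def relative_full_length_smn(smn1_copies: int, smn2_copies: int) -> float:
    """Relative full-length SMN transcript output, in units of one SMN1 copy.

    Toy model: assumes equal transcription per copy and no other modifiers.
    """
    if smn1_copies < 0 or smn2_copies < 0:
        raise ValueError("copy numbers cannot be negative")
    return (smn1_copies * FULL_LENGTH_FRACTION_SMN1
            + smn2_copies * FULL_LENGTH_FRACTION_SMN2)


# With no functional SMN1, output scales with SMN2 copy number.
for smn2 in (1, 2, 3, 4):
    level = relative_full_length_smn(0, smn2)
    print(f"SMN1=0, SMN2={smn2}: {level:.2f} of one SMN1 copy")
```

Under these assumptions, each additional SMN2 copy adds a fixed increment of full-length transcript, which is consistent in direction with the reported correlation between higher SMN2 copy number and milder phenotype.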
Another mechanism explaining clinical heterogeneity of SMA involves the existence of modifiers in trans. Recently, Plastin 3 was identified as a potential modifier of the clinical course of SMA through transcriptome-wide differential expression analysis (Oprea et al. 2008). The work of Oprea et al. (2008) suggests that Plastin 3 may affect axonogenesis, which is altered in individuals lacking SMN expression, and may thereby act as a protective modifier of SMA.
Duchenne muscular dystrophy (DMD; MIM 310200), Becker muscular dystrophy (MIM 300376), and X-linked dilated cardiomyopathy (MIM 302045) are allelic disorders caused by mutations of the dystrophin (DMD) gene. The most severe form, DMD, starts early in life and is characterized by progressive muscle weakness (Darras et al. 2008). The dystrophin gene is localized to Xp21 and encodes a large glycoprotein. The gene spans 2.4 Mb and consists of 79 exons (Koenig et al. 1987). Deletions of one or more exons are prevalent and observed in up to 65% of patients, whereas duplications of gene fragments are documented in 5–15% of subjects with DMD (Den Dunnen et al. 1989), suggesting that point mutations and potentially other structural variants such as inversions account for the remaining cases of the disease. In dystrophinopathies, the severity of the phenotype varies with the level of expression of dystrophin and whether or not the translational reading frame has been disrupted by mutation (Aartsma-Rus et al. 2006). Gene rearrangements in dystrophin are clustered in the 5′ and central part of the gene (Den Dunnen et al. 1989). To date, few benign CNVs have been identified in the DMD gene; one study showed the presence of benign CNVs in 6/341 subjects, all of which were localized to intronic sequences (del Gaudio et al. 2008). Interestingly, the DMD gene contains few low copy repeats (LCRs) or other repetitive sequences that might promote genome instability and render the region prone to rearrangement; indeed, only 3/26 breakpoints found by del Gaudio et al. localized in proximity to repetitive sequences.
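Whether an exonic deletion disrupts the translational reading frame comes down to simple arithmetic: the frame is preserved only if the total number of coding nucleotides removed is a multiple of three. The sketch below illustrates that check. The function name is hypothetical, the exon lengths in the usage example are invented for illustration and are not real DMD exon sizes, and the check assumes that whole coding exons are removed and that splicing joins the flanking exons directly. It reports frame status only and does not predict clinical severity.

```python
from typing import Sequence


def deletion_preserves_frame(deleted_exon_lengths_bp: Sequence[int]) -> bool:
    """Return True if removing the given coding exons keeps the reading frame.

    The frame is preserved when the total deleted coding length is divisible
    by 3 (one codon = 3 nucleotides).
    """
    if not deleted_exon_lengths_bp:
        raise ValueError("at least one deleted exon length is required")
    if any(length <= 0 for length in deleted_exon_lengths_bp):
        raise ValueError("exon lengths must be positive")
    return sum(deleted_exon_lengths_bp) % 3 == 0


# Invented exon lengths (bp), for illustration only.
in_frame_example = [150, 120]      # 270 bp removed, divisible by 3
out_of_frame_example = [150, 121]  # 271 bp removed, not divisible by 3

print(deletion_preserves_frame(in_frame_example))
print(deletion_preserves_frame(out_of_frame_example))
```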
A search for genetic factors able to modify the clinical course of DMD resulted in the identification of the SPP1 (osteopontin) gene as a potential genetic modifier of the disease (Pegoraro et al. 2011). A polymorphism localized to the SPP1 gene promoter was shown to change promoter activity, resulting in lower SPP1 mRNA production. The effect of SPP1 genotype on DMD progression was relatively substantial—similar to the effect of the pharmacologic use of steroids (about 1 year difference in the time to loss of ambulation).
Amyotrophic lateral sclerosis (ALS), known also as Lou Gehrig’s disease, is a progressive degenerative disease of both lower and upper motor neurons (Wijesekera and Leigh 2009). Patients with ALS present with focal muscle weakness, bulbar symptoms, muscle fasciculations, and neuropsychiatric deficits. It has been estimated that familial forms of ALS account for 5–10% of cases. Mutations in superoxide dismutase 1 (SOD1) (Rosen et al. 1993) are the most frequent genetic culprit in familial ALS (FALS), being found in 20% of patients (Andersen 2006). Other genes commonly involved in FALS include FUS (also known as TLS; mutated in ALS6 [MIM 608030]) (Kwiatkowski et al. 2009; Vance et al. 2009) and TARDBP (mutated in ALS10 [MIM 612069]) (Yokoseki et al. 2008; Sreedharan et al. 2008; Kabashi et al. 2008); however, mutations in these genes are only found in up to 8% of subjects with ALS (Valdmanis et al. 2009). Collectively, mutations in known genes explain disease in only 25–30% of familial ALS cases (Tsuji 2010).
Two FALS-associated genes provide interesting examples of how different mutations at a single locus may result in multiple disease phenotypes. Missense mutations in senataxin (SETX) have been reported in patients with juvenile amyotrophic lateral sclerosis 4 (ALS4; MIM 602433) (Chen et al. 2004). SETX mutations are also the cause of autosomal recessive spinocerebellar ataxia-1 (SCAR1; MIM 606002) (Moreira et al. 2004), also known as ataxia-ocular apraxia-2 (AOA2). Similarly, mutations of FIG4, which cause an autosomal recessive form of Charcot–Marie–Tooth disease (CMT4J; MIM 611228) when homozygous or compound heterozygous (Chow et al. 2007), have recently been reported in the heterozygous state in patients with ALS11 (MIM 612577) and adult-onset primary lateral sclerosis (PLSA1; MIM 611637) (Chow et al. 2009).
Recent studies argue against CNVs playing a large role in the pathogenesis of ALS (Blauw et al. 2008, 2010), with one exception (Broom et al. 2008). GWA studies have identified a few genes that may play a role in the pathogenesis of ALS (Valdmanis et al. 2009), for example DPP6 in different populations of European ancestry (van Es et al. 2008); however, additional investigation is needed to confirm these observed associations. One recent hypothesis proposes a role for retroelements in the pathogenesis of ALS, as the activity of reverse transcriptase in plasma of subjects with ALS was found to be higher than in controls; however, the meaning and significance of this finding are currently unclear (Mougeot et al. 2009).
Even before the initial phases of the HGP were complete, the inception of the “postgenomic era” was claimed (Scangos 1997). Although a scientific achievement of Herculean importance, the HGP provided only a single haploid genome sequence representing an individual who was healthy at the time the DNA sample was obtained. Countless questions concerning genomic variation among populations with differing ancestry, among persons afflicted with heritable disease, and even among healthy individuals within a family remained unanswered. The continuing insights provided by the International HapMap Project (International HapMap3 Consortium 2010), 1000 Genomes Project (The 1000 Genomes Project Consortium 2010), and other genomic studies large and small—and the new questions that arise from this work—remind us that we are most certainly in the genomic era, and that, as personal genomes are probed in various ways, personal genomics is becoming a reality.
Two current challenges that must be overcome are an incomplete reference genome (The 1000 Genomes Project Consortium 2010) and incomplete or incorrectly annotated mutation databases. Particularly in the case of predictive genomic testing, it will be of the utmost importance for clinicians to thoughtfully consider the disease-causing potential of rare or novel mutations in view of published evidence. Prudent questions should include: did a study or studies reporting a mutation as disease-causing fail to find it in a statistically significant number of appropriately selected controls? Might there exist protective variants that reduce penetrance in certain ethnic groups or individuals? Have healthy family members been tested? Has mosaicism been considered? Has the patient’s mutation been confirmed by a second methodology? Wheeler et al. (2008), in sequencing the genome of an apparently healthy adult, found multiple risk alleles and purportedly disease-causing variants, and Lupski et al. (2010), in a patient with Charcot–Marie–Tooth disease, identified a purportedly pathogenic variant documented in a mutation database as being associated with a “persistent vegetative state;” these studies indicate that the above questions will need to be answered even in the case of screening and carrier testing.
Physicians must consider the clinical utility of a genomic test before it is ordered. This is no different from ordering traditional medical tests. Unfortunately, in any genomic panel, the utility of each individual result will depend on the locus tested and the variant detected. Therefore, clinicians should become familiar with each component of a panel test and decide, in consultation with the patient, whether ordering it would be of greater potential benefit than harm. Given the vast amount of genomic variation that exists among healthy individuals (The 1000 Genomes Project Consortium 2010; Conrad et al. 2010; International HapMap3 Consortium 2010), it is likely that no result of a genomic test will be completely ‘normal’ (i.e. completely free of carrier mutations, risk alleles, variants of unknown significance, or disease-causing mutations with reduced or age-dependent penetrance). Findings such as these may be unsettling to patients and should be considered before any genomic test is utilized.
We have attempted to provide evidence for both the promise and limitations of whole-genome resequencing and its potential for genomic medicine. This technology will soon become commonplace in both research studies and clinical practice. Even the most cheaply and easily obtainable sequence data requires proper analysis—a current and future challenge (Mardis 2010). For example, one current difficulty for sequencing-based mutation detection is the identification of genomic inversions and variants within or consisting of repeated sequences. Another such hurdle to overcome is assigning significance to mutations found in non-coding regions of the genome (i.e. establishing the genomic code). Yet another is making sense of genomic contributions to complex traits: might multiple common mutations with minor effects interact to manifest a phenotype? Or might oligogenic inheritance in the individual with one or a few variants of major effects—but tremendous genetic heterogeneity in the population of patients—be at play? And what of gene–environment interactions? Both innovative bioinformaticists and computational biologists will be required to meet these and similar challenges.
The neurogenomics community—composed of clinicians, scientists, informaticists, and even patients and healthy volunteers—should be proud of the accomplishments of the past and excited about the prospects of the future to improve human quality of life and explain some of the mysteries of the human nervous system and brain—our most unique and defining organ.
P.M.B. is a fellow of the Baylor College of Medicine Medical Scientist Training Program (T32GM007330-34). This work was supported in part by a National Eye Institute (NEI) Training Program Grant (T32EY007102) (P.M.B.) from the United States National Institutes of Health (NIH), and by a National Institute of Neurological Disorders and Stroke (NINDS) Grant (R01NS058529) (J.R.L.) from the NIH. J.R.L. is a paid consultant for Athena Diagnostics and Ion Torrent Systems and is a co-inventor on multiple United States and European patents related to molecular diagnostics. The Department of Molecular and Human Genetics at Baylor College of Medicine derives revenue from the chromosomal microarray analysis offered in the Medical Genetics Laboratory.
Philip M. Boone, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Room 604B, Houston, TX 77030, USA.
Wojciech Wiszniewski, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Room 604B, Houston, TX 77030, USA.
James R. Lupski, Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Texas Children’s Hospital, Houston, TX 77030, USA.