Originally, locus symbols (e.g., DYT1) were introduced to specify chromosomal regions that had been linked to a familial disorder with a yet unknown gene. Symbols were systematically assigned in a numerical series to designate mapped loci for a specific phenotype or group of phenotypes. Since the system of designating and using locus symbols was originally established, both our knowledge and our techniques of gene discovery have evolved substantially. The current system has problems that are sources of confusion, perpetuate misinformation, and misrepresent the system as a useful reference tool for a list of inherited disorders of a particular phenotypic class. These include erroneously assigned loci, duplicated loci, missing symbols, missing loci, unconfirmed loci in a consecutively numbered system, combining causative genes and risk factor genes in the same list, and discordance between phenotype and list assignment. In this article, we describe these problems and their impact, and propose solutions. The system could be significantly improved by creating distinct lists for clinical and research purposes, creating more informative locus symbols, distinguishing disease-causing mutations from risk factors, raising the threshold of evidence prior to assigning a locus symbol, paying strict attention to the predominant phenotype when assigning symbols to lists, and having a formal system for reviewing and continually revising the list that includes input from both clinical and genetics experts.
Originally, locus symbols (e.g., DYT1) were introduced to specify chromosomal regions that had been linked to a familial disorder with a yet unknown gene.1 Symbols were systematically assigned in a numerical series (e.g., PARK1, 2) to designate mapped loci for a specific phenotype. Many of the locus symbols that were repeatedly confirmed have become used by clinicians and researchers as the name for the gene and the condition of interest (e.g., DYT1 dystonia) even where ultimately a gene (DYT1 = TOR1A) has been discovered.2 Thus, the locus symbols are entrenched in the medical lexicon of everyday use. A list inevitably is looked to as a source of reference, for example assuming that the “DYTs” provide a complete and accurate catalogue of the known inherited dystonias. Unfortunately this is not the case.
Over time a number of problems with the system of designating and using locus symbols have arisen. There are locus symbols of questionable significance that have resulted from irreproducible linkage studies or failure to ever establish linkage to a chromosomal region. There are known pathogenic mutations whose genes do not have locus symbols. Furthermore, the membership on specific lists sometimes bears a questionable relationship to the main clinical features of interest. In this article, we discuss the origins of these problems and propose solutions. We will use the DYT and PARK lists as illustrations, but the issues are generalizable to many different classes of disorders.
Many locus symbols are assigned by the Human Genome Organisation Gene Nomenclature Committee (HGNC). The committee is located at the European Bioinformatics Institute in Cambridge, UK, and is jointly funded by the US National Human Genome Research Institute and the Wellcome Trust (UK) and operates with policy advice from an International Advisory Committee.
To obtain a new locus symbol through HGNC, researchers fill in an online form and specify the phenotype, the cytogenetic location, flanking markers of the locus, and the maximum lod score obtained. However it is not uncommon that investigators assign their own locus symbols, without seeking approval by HGNC. When this happens HGNC tries to contact them with the aim of assigning a novel symbol postpublication.
Historically, clinical disorders were linked to chromosomal regions before a gene would be discovered. If a responsible gene is subsequently identified, the mapped locus entry is withdrawn and merged into the entry for the cloned gene in the HGNC database, if HGNC becomes aware of the new discovery. For example, PARK1 has been withdrawn and replaced with SNCA. For some loci the responsible protein-encoding gene has been identified but there is no currently agreed-upon gene name (e.g., PARK2, PARK7). For these genes the locus symbol has been retained.
In modern genetic times additional layers of complexity are added. Due to our expanding knowledge of the function of genes and the advent of next generation sequencing approaches to genetic discovery, specific genetic mutations or rare variants will increasingly be identified in individuals with a familial disorder without the antecedent discovery of a linked chromosomal region. Furthermore, genome-wide association studies are identifying loci that contain frequent variations (genetic polymorphisms) as being associated with disease risk on a population level. This leads to the discovery of genetic risk factors rather than disease-determining gene mutations. Currently, this process may lead to a new locus symbol (e.g., PARK16)3 or may not (e.g., the MAPT gene associated with Parkinson disease [PD]),4 depending only on whether or not the discoverers request a symbol. The current status of locus symbols from the DYT and PARK lists is shown in the figure.
The problems we have identified are each discussed below. Tables 1 and 2 show the DYT and PARK lists and we refer to them repeatedly in the examples. Although some of the symbols have already been withdrawn by HGNC, we include them as this is the common practice in the literature. The last column in each table indicates whether or not the entry represents one of the problems described.
Classic linkage analysis has identified monogenic forms with dominant inheritance (e.g., DYT1 and PARK1) and recessive inheritance (e.g., PARK7) based on the assumption that all affected members of a family share the same monogenic cause. Carrying several common risk polymorphisms represents a different form of genetic influence or disease risk.5 Some of these genetic risk factors have had locus symbols assigned to them. Linkage studies in unrelated individuals in isolated communities (e.g., leading to the assignment of the PARK10 locus)6 and more recently genome-wide association studies (e.g., PARK16, PARK17, and PARK18 loci)3,4,7 have been performed to identify such genetic risk factors. However there are equally or better-established genetic risk factors that have not been provided with locus symbols, e.g., the MAPT or GBA genes and PD.
Genetic risk factors have very different implications than a disease-causing mutation for the relationship between a disease and a chromosomal region.5 For instance, carriers of a homozygous mutation in the PARK2 gene encoding Parkin have a risk close or equal to 100% of developing PD within a normal lifespan. In contrast, based on a meta-analysis of 5 large genome-wide association studies and independent replication of these findings, a 2.5 times higher risk of PD has been estimated based on carrying the 11 most significantly associated variants.8 Assuming a prevalence of PD of 0.14% in the general population,9 a 2.5-fold increase in risk corresponds to a lifetime risk for developing PD of only 0.35%. Assigning a locus symbol to genetic risk factors invites a misleading view of their significance for disease causation and leads to misuse of genetic testing resources.
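The arithmetic behind this comparison can be checked with a short sketch. It follows the text in treating the 0.14% prevalence figure as the baseline risk and the 2.5-fold figure as a simple multiplier; the function name is illustrative only.

```python
def risk_with_variants(baseline_risk: float, relative_risk: float) -> float:
    """Scale a baseline risk by a relative risk, capped at 1.0 (100%)."""
    return min(baseline_risk * relative_risk, 1.0)


baseline = 0.0014      # PD prevalence of 0.14%, as assumed in the text
relative_risk = 2.5    # estimate for carrying the 11 associated variants

carrier_risk = risk_with_variants(baseline, relative_risk)
print(f"{carrier_risk:.2%}")  # prints 0.35%
```

Even with the 2.5-fold increase, the resulting figure remains far below the near-complete penetrance described for homozygous PARK2 mutations.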
Policy-making on this subject is complicated by the fact that a single gene can provide both risk variants and causative mutations (e.g., PD associated with variations in the SNCA or LRRK2 genes).8 Currently there is no way in the locus symbol system to make the distinction between disease-causing and risk-conferring genetic variations.
The disorders belonging to lists of loci can include a broad range of phenotypes. For example, the DYT loci range from early-onset generalized dystonias (e.g., DYT1, DYT6) to paroxysmal dyskinesias (DYT8, DYT9, DYT10, DYT18, DYT19, DYT20) with variable movement disorder phenomenology (e.g., dystonia, chorea, ballism).10 This is not necessarily a problem as long as the core movement disorder (dystonia) is a prominent and consistent feature. However, this is not always the case; for example, the movement phenotype in the paroxysmal dyskinesias may be variable or mixed and may never demonstrate dystonia. As a second example, parkinsonism is a rare manifestation of mutations in the PLA2G6 gene. Disconnections between list membership and phenotype may arise for several reasons. First, the cases that serve to establish the linkage to a particular locus may not be typical, and as additional cases are discovered, it is clear that the main movement disorder is different from the list to which the disorder was assigned. Second, nonscientific motivations can have an influence on the choice of symbol used or requested. For example, in cases of mixed phenotypes, a “PARK” symbol may be more desirable than a “DYT” symbol because of the potentially increased chances of publication in a higher impact journal and of obtaining funding for studying a (seemingly) more common disorder with a higher socioeconomic burden.
Some conditions include more than one prominent movement disorder. For example, DYT3 or X-linked dystonia-parkinsonism (“Lubag”) is a member of only the DYT list and not listed among the PARKs. The same can be said for rapid-onset dystonia-parkinsonism (DYT12). Currently, there is no precedent for representing these disorders on more than one list.
Some established genetic disorders are never granted a locus symbol. In some cases this is because the discovery does not come to the attention of HGNC. In others, the causative gene may be discovered without antecedent linkage to a chromosomal region. The latter can occur with candidate gene approaches to gene discovery. An example of this is mutations in the SPR gene coding for sepiapterin reductase causing dopa-responsive dystonia.11 Although the SPR gene has been proven as a cause of dystonia, the chromosomal region has never been assigned a place in the list of locus symbols. This would not be within the remit of HGNC, which is charged with naming genes and is not a phenotypic database. This example highlights a paradox; the lists of locus symbols have widespread use as a link between phenotype and genotype, and such a linking list is an essential resource for clinical diagnostic purposes. However, the body in charge of assigning locus symbols does not have a mandate to retain the link between phenotype and genotype. This problem will be magnified as next generation sequencing allows mutation detection to be the first step in finding the genetic basis for a familial disorder and the lists of loci will become increasingly incomplete.
There are a number of loci that have not yet been linked to a gene despite many years since the symbol was designated (tables 1 and 2). For instance, since linkage of dystonia to a region on chromosome 1 (DYT13) was identified in 2001,12 dystonia in no other family has been mapped to the same locus. In a similar vein, mutations in some genes implicated in PD or dystonia have not been found in any other family despite extensive mutational screening (e.g., the UCHL1 gene for PARK5).13 Without independent replication there is considerable uncertainty whether or not a locus or a gene is truly linked with a disorder. In the current system of nomenclature there is no way to distinguish confirmed from unconfirmed loci.
Occasionally, technical errors or uncertainties of clinical phenotyping result in erroneous linkage with disease. PARK4, for example, was designated as a chromosomal region associated with PD14 but was later found to be identical with PARK1.15 This misdesignation was based on a sample handling error. DYT14 was erroneously designated because of clinical misclassification of one family member and lack of gene dosage analysis of previously known genes.16 The family was later found to have dopa-responsive dystonia associated with a deletion in GCH1 (DYT5a).17 Once symbols are assigned they become very difficult to retract, particularly when they are given a number which provides them a holding place in the list.
Sometimes symbols are established to indicate a family with an apparently inherited disorder, in the absence of any linkage to a chromosomal region, resulting in a locus symbol with no related locus. DYT2 and DYT4 are examples of symbols assigned based on clinical observations of single families. While this may serve the purpose of drawing the attention of other investigators who may have seen similar families, it does not appear sensible to place these observations in a list of locus symbols. The possible dangers are illustrated by the case of DYT2, which has been misused by some as a name for recessively inherited dystonia that is not linked to any of the known loci or due to mutations in known genes.18–20 This results in DYT2 cases being lumped together even though they may have completely different genetic bases.
With investigative teams working in parallel to discover genes and gene loci, there is the potential for erroneous duplication of loci. For example, the symbol DYT18 has been assigned to GLUT1 deficiency due to mutations in the SLC2A1 gene.21 Phenotypic similarity and localization of the DYT18 and DYT9 loci in the same chromosomal region prompted re-examination of the DYT9-linked family.22 Affected members of this family were subsequently found to also carry SLC2A1 mutations and thus, DYT9 was found to be identical with DYT18.23 Clinical and locus overlap also exists for paroxysmal dyskinesias DYT10 and DYT19, as well as for DYT8 and DYT20.24 It remains to be seen whether they also represent examples of locus duplication.
In response to these problems we have developed specific recommendations for revising the system. One difficulty with the current state of affairs is that the same lists of locus symbols are used by both clinicians and researchers, who have very different needs. It is unlikely that a single list is going to serve both of these groups. Our recommendations are framed from a clinical point of view to guide the development of a list that will show clinicians the known clinical spectrum of genetically determined disorders that display that phenotype and will guide clinical evaluation and diagnostic testing. This clinical list would need to be complemented by a comprehensive research database of reported loci and variants, including those that have not yet been confirmed or where a gene has not yet been identified.
Tables 3, 4, and 5 provide the updated lists of PARK and DYT symbols for clinical use as they would appear with the recommendations in place. When describing the locus symbols we refer to the phenotype designator (e.g., PARK) as the locus prefix and the gene name as the suffix.
If we know only the chromosomal region associated with a particular phenotype and there is no specific haplotype that is consistently associated with the condition, there are no direct implications for diagnostic testing. Therefore, including these loci in the list of genetically determined dystonias, for example, invites inappropriate requests for genetic testing of individuals with similar phenotypes.
Numbering (e.g., DYT1, 2) conveys no useful information except possibly as a mnemonic aid for the total number of defined loci. However, the mnemonic aid is negated when loci fail to be confirmed, are subsequently revealed to be misassigned, or are not assigned a symbol at all. One would think that there are 21 DYT-assigned disorders linked to chromosomal loci because the highest number is DYT21; however, there are only approximately 13. We recommend that the symbol prefix be followed by the gene name (e.g., DYT-SGCE [currently DYT11]). This naming system conveys that the responsible gene has been defined, and maintains the connection between the phenotype (dystonia) and the gene. The current system used by HGNC of withdrawing locus symbols and merging them into the entry for the causative gene does not maintain the connection between the phenotype and gene in the symbol name and thus does not effectively disseminate the knowledge of gene discovery within the clinical and research communities.
Although remembering gene names is more difficult than remembering single numbers, we argue that the expansion of our knowledge of genetically determined disorders will quickly result in the inability to remember complete lists of any sort. From a clinical perspective the important step is to recognize that a particular phenotype can be genetically determined, which should then trigger referring to the appropriate list that should convey the genetic information.
We strongly recommend that a locus symbol prefix (e.g., PARK) be conferred only upon disease-causing genes (causing monogenic disorders) and not upon risk factors. One would never use the risk factors to diagnose a condition and a clear distinction of these situations will help to avoid clinical misuse.
There is already a very useful systematic list of genes and loci associated with risk of PD on the PD GENE website (http://www.pdgene.org). This important resource for research has been developed by the Max Planck Institute for Molecular Genetics, the Michael J. Fox Foundation, and the Alzheimer Research Forum. A broader effort not restricted to PD is the Human Variome Project which aims to catalogue human genetic variation and its relationship to disease to facilitate genetic research (http://www.humanvariomeproject.org/). A particular challenge will be agreeing upon a threshold cutoff between disease-causing gene and risk factor due to the phenomenon of reduced penetrance in monogenic forms (e.g., DYT1 or G2019S mutations in the LRRK2 gene). Recognizing that there are disease-causing mutations and risk factors that may arise from the same gene (e.g., SNCA), we propose that such genes be represented on both lists.
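One way to picture the two-list proposal is as a pair of sets of gene names, where a gene such as SNCA may belong to both. The sketch below is illustrative only: the set names and the helper function are hypothetical, and the memberships are limited to examples mentioned in this article rather than a complete or authoritative catalogue.

```python
# Illustrative memberships drawn from examples in this article only;
# this is not a complete or authoritative catalogue.
CAUSATIVE_GENES = {"SNCA", "LRRK2", "PARK2"}           # disease-causing mutations
RISK_FACTOR_GENES = {"SNCA", "LRRK2", "MAPT", "GBA"}   # risk-conferring variants


def lists_for(gene: str) -> list[str]:
    """Return the (hypothetical) list names on which a gene appears."""
    lists = []
    if gene in CAUSATIVE_GENES:
        lists.append("causative")
    if gene in RISK_FACTOR_GENES:
        lists.append("risk factor")
    return lists


print(lists_for("SNCA"))  # ['causative', 'risk factor']
print(lists_for("MAPT"))  # ['risk factor']
```

Keeping the two memberships separate means a gene can appear on both lists without the lists themselves being merged.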
A number of the inaccuracies and redundancies in the current list of locus symbols could have been avoided if a higher level of evidence had been required prior to symbol assignment. We acknowledge that there is a need for a catalogue of unconfirmed findings to allow researchers to build upon the work of others. However, the list of locus symbols that inevitably becomes a master list of genetically determined disorders is not the place for these preliminary findings. The Venice criteria, which were developed to assess and rate the cumulative epidemiologic evidence in genetic associations,25 are used to help assess the certainty of risk factor associations. A similar process is warranted for disease-causing mutations. Specific recommendations on the levels of evidence are beyond the scope of this article and should be decided by a committee of experts.
For a locus to be a member of a particular phenotypic list, the phenotype (e.g., dystonia in the case of DYTs) should be the most prominent movement disorder seen in the disease linked to that locus and should be a consistent feature. For the paroxysmal dyskinesias we recommend that these disorders be provided with a unique prefix (e.g., PDYSK, table 4) as their paroxysmal nature is the most striking clue to the diagnosis and dystonia may not be a feature at all.
Where more than one movement disorder is a prominent and consistent feature of cases, or when a second movement disorder can predominate in individual cases, a double prefix could be assigned to this locus (e.g., DYT/PARK-ATP1A3) and the symbol would belong to more than one list. This would ensure that any one list is, in fact, a complete reference of the established monogenic conditions that manifest with a specific movement disorder and thus a reliable reference for clinicians.
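The proposed symbol format, one or more phenotype prefixes followed by the gene name, can be sketched as a small data type. This is a minimal illustration under stated assumptions: the class name, its methods, and the use of "/" and "-" as separators simply mirror the examples DYT-SGCE and DYT/PARK-ATP1A3 and are not part of any HGNC specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocusSymbol:
    """Hypothetical model of a proposed symbol such as DYT/PARK-ATP1A3."""
    prefixes: tuple[str, ...]  # phenotype prefix(es), e.g. ("DYT", "PARK")
    gene: str                  # gene name used as the suffix

    def __str__(self) -> str:
        return "/".join(self.prefixes) + "-" + self.gene

    @classmethod
    def parse(cls, text: str) -> "LocusSymbol":
        # Split at the first hyphen: the prefix part precedes it.
        prefix_part, sep, gene = text.partition("-")
        if not sep or not prefix_part or not gene:
            raise ValueError(f"not a prefix-gene symbol: {text!r}")
        return cls(tuple(prefix_part.split("/")), gene)

    def on_list(self, prefix: str) -> bool:
        """True if the symbol belongs to the list for this phenotype prefix."""
        return prefix in self.prefixes


symbol = LocusSymbol.parse("DYT/PARK-ATP1A3")
print(symbol)                         # DYT/PARK-ATP1A3
print(symbol.on_list("PARK"))         # True
print(LocusSymbol(("DYT",), "SGCE"))  # DYT-SGCE
```

Because list membership is derived from the prefixes, a double-prefix symbol appears on both the DYT and PARK lists without being entered twice.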
Revising locus symbols is a task with wide-reaching implications for the way researchers and clinicians communicate information about genetic disorders. The issues are complex and finding the best system will require extensive discussions hearing many points of view. The goal of our article is to provide a framework for starting these important discussions. This task will require input from both genetics and subspecialty clinical experts internationally. Task forces with subspecialty knowledge, perhaps organized by professional bodies such as the Movement Disorder Society in the case of parkinsonism and dystonia, could act in an advisory capacity to revise the current system. Further, since HGNC's mandate is to maintain a gene database rather than a phenotypic one, it would be important to involve representatives from phenotypic databases such as Online Mendelian Inheritance in Man (OMIM, http://www.ncbi.nlm.nih.gov/omim) and Genetests (www.genetests.org). Collaboration between clinical experts and geneticists will also be important going forward when a new locus symbol is requested. This may help avoid symbol assignment that is not the most consistent with the phenotype. A more formal mechanism for incorporating new data into the naming system needs to be established and should include input from the same groups. It will also be important for journal editorial boards to ensure that new symbols are published only after the appropriate process has been undertaken.
Although we have drawn on parkinsonism and dystonia as examples of existing problems, the suggestions for a new system are applicable to any field with disorders that have several genetic determinants; in neurology, other examples include the series of SCA (spinocerebellar ataxia), SPG (spastic paraplegia), and JBTS (Joubert syndrome) loci. Currently, all known genes implicated in monogenic forms of PD and risk single nucleotide polymorphisms combined explain only a fraction of all PD cases etiologically. It seems likely that much of this “missing heritability” may be accounted for by rare sequence variants,26 which can now be assessed in more detail after finishing the pilot phase of the 1,000 Genomes Project by next generation sequencing.27 In addition to lists of causative genes and genetic risk factors we may soon see the evolution of a list of rare sequence variants linked to PD, dystonia, and many other diseases. With the advent of next generation sequencing, we expect that we will see a rapidly increasing number of genes and variants of unknown pathogenicity that, undoubtedly, will be assigned locus symbols by either HGNC or the investigators themselves. Although of great interest scientifically, associated risk variants are practically meaningless in the clinical setting. It will be important to guide the field through these changes and prepare a system of nomenclature that will ensure that the evolving technologies do not create greater nosologic problems than already exist.
The authors thank Dr. Elspeth Bruford from HGNC for her comments and review of the manuscript.
Dr. Marras: design and conceptualization of the study, drafting and revision of manuscript. Dr. Lohmann: drafting and revision of manuscript. Dr. Lang: drafting and revision of manuscript. Dr. Klein: design and conceptualization of the study, drafting and revision of manuscript.
Dr. Marras has served as a consultant for Solvay Pharmaceuticals, Inc. and has received research support from Merck Serono, the National Parkinson Foundation, Parkinson Study Group, Parkinson Disease Foundation, Michael J Fox Foundation, and the Canadian Institutes of Health Research. Dr. Lohmann received funding from the German research foundation and the Bachmann-Strauss dystonia and Parkinson foundation. Dr. Lang has served on scientific advisory boards for Abbott, Allon Therapeutics, Inc., Biovail Corporation, Boehringer Ingelheim, Cephalon, Inc., Ceregene, Eisai Inc., Medtronic, Inc., Lundbeck Inc., NeuroMolecular Pharmaceuticals, Novartis, Merck Serono, Solvay Pharmaceuticals, Inc., TaroPharma, and Teva Pharmaceutical Industries Ltd.; has received speaker honoraria from GlaxoSmithKline and UCB; receives/has received research support from the Canadian Institutes of Health Research, the Dystonia Medical Research Foundation, the Michael J Fox Foundation, the National Parkinson Foundation, and the Ontario Problem Gambling Research Centre; and has served as an expert witness in cases related to the welding industry. Dr. Klein has received speaker honoraria from GlaxoSmithKline, Boehringer Ingelheim, Merz Pharmaceuticals, LLC, and Orion Corporation; serves on the editorial boards of Neurology®, Movement Disorders, and Parkinson's Disease; serves as a consultant for Centogene, Boehringer Ingelheim, and Link Medicine; and receives research support from the German Research Foundation (DFG), intramural funding from the University of Lübeck, the Volkswagen Foundation, the Possehl Foundation, the Hermann and Lilly Schilling Foundation, and the Fritz Thyssen Foundation.