The third Human Variome Project (HVP) Meeting “Integration and Implementation” was held under UNESCO Patronage in Paris, France, at the UNESCO Headquarters May 10–14, 2010. The major aims of the HVP are the collection, curation, and distribution of all human genetic variation affecting health. The HVP has drawn together disparate groups, by country, gene of interest, and expertise, who are working for the common good with the shared goal of pushing the boundaries of the human variome and collaborating to avoid unnecessary duplication. The meeting addressed the 12 key areas that form the current framework of HVP activities: Ethics; Nomenclature and Standards; Publication, Credit and Incentives; Data Collection from Clinics; Overall Data Integration and Access—Peripheral Systems/Software; Data Collection from Laboratories; Assessment of Pathogenicity; Country Specific Collection; Translation to Healthcare and Personalized Medicine; Data Transfer, Databasing, and Curation; Overall Data Integration and Access—Central Systems; and Funding Mechanisms and Sustainability. In addition, three societies that support the goals and the mission of HVP also held their own Workshops with the view to advance disease-specific variation data collection and utilization: the International Society for Gastrointestinal Hereditary Tumours, the Micronutrient Genomics Project, and the Neurogenetics Consortium.
The history of human genetics features a strong tradition of collaboration, which has produced remarkable achievements. One such example is the unraveling of the complex genetics of the HLA system, which was built on sharing of reagents and unpublished data. A more recent example is the Human Variome Project (HVP) (www.humanvariomeproject.org), which stems from a similar need to understand genetic variation and its implications for the treatment of diseases that arise from mutations and variations in the human genome. However, in the era of high-throughput analysis as well as ultraspecialization, one challenge is to deal with the unprecedented data overload and harness it for the benefit of routine healthcare. HVP is not “collaborative and exclusive” like many other scientific endeavors but “exclusively collaborative.” Its global nature demands “local action for the global good” and aims to bring together key individuals who are currently working on the development of protocols and systems relating to the generation, interpretation, and storage of data on human genetic variation as well as those generating the data.
The major aims of the HVP are the collection, curation, and distribution of all human genetic variation affecting health. This global project involves community activity to collect this information for databasing. It is estimated that there will be at least 100 to 1,000 variations in each of the estimated 20,000 human genes. This ambitious project is the vision of Prof. Richard Cotton (Melbourne, Australia) and the attendees at the official launch in 2006 in Melbourne [Cotton et al., 2007; Ring et al., 2006]. The first meeting was sponsored by the World Health Organisation (WHO) and supported by the American College of Medical Genetics and the Genomic Disorders Research Center. It brought together representatives from the European Union (EU), United Nations Educational, Scientific, and Cultural Organization (UNESCO), Organization for Economic Co-operation and Development (OECD), National Center for Biotechnology Information (NCBI), and the European Bioinformatics Institute (EBI), as well as geneticists from 30 countries. In 2008, the HVP Planning Meeting was held in Spain, which formulated the actions needed to address the challenge and consolidated the 12 key areas that form the current framework of HVP activities [Kaput et al., 2009]. To simplify this report, we have kept references to the substantial body of work published by members of the HVP/HGVS community to a minimum. Reference to such work can be found in the following: The HGVS Website, Cotton et al., Kaput et al., and the HVP Website (www.humanvariomeproject.org/index.php/publications).
The third HVP meeting on “Implementation and Integration” was held under UNESCO Patronage in Paris, France, at the UNESCO Headquarters May 10–14, 2010. It was co-organized by the Genomic Disorders Research Center (the HVP Coordinating Office) and the Division of Basic and Engineering Sciences, the Natural Sciences Sector of UNESCO, in association with the American College of Medical Genetics. It was attended by over 150 registrants from 30 countries representing all continents. The aim was to develop procedures for the implementation of the recommendations and actions from the first two meetings in a global collaborative context and to prepare the systems necessary to routinely and systematically gather the growing number of disease mutations now being discovered.
The objectives were:
The meeting was officially opened by Dr. Walter Erdelen, UNESCO Assistant Director-General for Natural Sciences, who welcomed the participants and expressed UNESCO’s support for the principles and mission of the Human Variome Project. He outlined UNESCO’s active involvement in setting up the first meeting of the Human Genome Project in Paris in 1989, in response to the “revolution in biology that occurred in the late 1970s and 1980s” and the promise of “genetic engineering to solve the problems of both old and new diseases.” He emphasized UNESCO’s role in facilitating international scientific cooperation in this effort, particularly among developing countries and between developing and developed nations, and concluded that he saw “the Human Variome Project as a natural successor to the Human Genome Project.”
The Plenary Lecture was given by Sir John Burn, the Director of the Institute of Human Genetics at the Centre for Life in Newcastle upon Tyne, UK. Among his many contributions to clinical genetics, he was the Founding President of the International Society for Gastrointestinal Hereditary Tumours (InSiGHT), which is closely associated with the Human Variome Project. He gave a spirited overview of the use of genetic knowledge in the current healthcare system, where gene mutation testing is not always applied when necessary. He outlined the extraordinary promise of Next Generation sequencing and other new technologies and their potential benefits for routine clinical practice. He predicted that diagnostic labs will continue to be important players in the discovery of less predictable intronic and promoter variants. However, the key driver for success is access to detailed phenotypic data with the mutation data, which will allow therapeutic intervention for mutation carriers before the onset of the disease. He concluded that in order to offer better treatment options for patients, genetic risk has to be quantified using mathematical models such as Bayes’ Theorem, which has already been applied for many unclassified variants in hereditary breast and colorectal cancers.
The plans and projects are ready for specific grant applications to carry out the defined and agreed activities. These could be piloted by students. More detailed summaries from each working group can be found in the online Supporting Information.
The Ethics Workgroup has produced detailed guidelines that are intended to help the curators of locus-specific or disease-specific variation databases (LSDBs) resolve ethical issues [Povey et al., 2010].
It was resolved to explore the diverse needs of different cultures with a view to modification of future guidelines, with particular emphasis on acceptable forms of consent (both written and unwritten), privacy, and rules for collection and display of clinical data. The group will work with the Workgroup for the Data Collection from Clinics to find a method for the ethical export of data, including the possibility that explicit consent from the donor may not always be required or feasible. The group also voted against restrictions being placed on the secondary use of data that have already been published and are electronically available, but will maintain awareness of these concerns. Another developing area is a trend toward “open consent,” where donors give consent while acknowledging that, although researchers will make every effort to protect privacy, such protection cannot in reality be guaranteed. In other words, participants agree that privacy violations are a reality of research, specifically in regard to future possible uses of data. The ethical rationale behind this notion is transparency, namely, that it is ethical to be utterly transparent about actual practice.
The meeting discussed the proposed changes to HGVS recommendations for the description of sequence variants, which have evolved considerably over time (www.HGVS.org/mutnomen). The meeting agreed with the most significant changes, such as the introduction of a versioning system, which allows users to indicate the version used, and the recommendation to use the Locus Reference Genomic (LRG) sequence format as the new standard for the description of sequence variants. Details on how to obtain an LRG can be found at the LRG Website (www.lrg-sequence.org). Other subjects were discussed in depth, for example, changing the symbol for a stop codon from X to a * in the description of changes at the protein level, to follow existing guidelines (IUPAC/IUBMB). Additionally, it was agreed that the HGVS recommendations should be advertised more widely (journals, human genetics societies, granting organizations, etc.), and the HGVS should try to get an ISO certification. The meeting agreed that guidelines are needed regarding the description of copy number variant (CNV) genes and complex sequence variants in the human genome. Proposals will be available for comment on the HUGO Gene Nomenclature Committee (HGNC) Website (www.genenames.org/index.html) and the HGVS Website (www.HGVS.org/mutnomen/HGVS_extend_PT.doc). Finally, agreement was reached that HVP should implement a gene ontology and consider the VariO proposal, which is available for comment on the VariO Website (variationontology.org). Mauno Vihinen has undertaken to review all the standards that are available and those that are needed, and to develop them.
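The change of stop codon symbol from X to * can be illustrated with a short conversion routine. The following is a minimal sketch only, not an official HGVS tool: the function name `modernize_stop_codon` and the example descriptions are hypothetical, and the pattern is assumed to cover only simple nonsense changes written with one-letter or three-letter amino acid codes. More complex descriptions, such as frameshifts, are deliberately left untouched.

```python
import re

# Assumption: a legacy nonsense change looks like "p.Trp26X" or "p.W26X",
# i.e. "p." + one amino acid (three-letter or one-letter code) + position + "X".
_LEGACY_STOP = re.compile(r"^(p\.(?:[A-Z][a-z]{2}|[A-Z])\d+)X$")


def modernize_stop_codon(description: str) -> str:
    """Replace a trailing legacy stop symbol 'X' with '*'.

    Descriptions that do not match the simple pattern above are
    returned unchanged rather than guessed at.
    """
    return _LEGACY_STOP.sub(r"\1*", description)


# Hypothetical inputs used purely for illustration.
converted = [modernize_stop_codon(d) for d in ("p.Trp26X", "p.W26X", "p.Trp26Cys")]
```

Returning unmatched descriptions as-is is a conservative design choice: a curator can review anything the narrow pattern does not recognize instead of relying on an automated guess.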
One of the recommendations from the previous HVP Meetings was to encourage more submission of mutation data to databases, by developing incentives, such as microattribution to the individual submitting data [Axton, 2008]. This concept was further discussed and developed by the Workgroup and integrated with new initiatives, such as Open Researcher and Contributor ID (ORCID, www.orcid.org), DataCite (www.datacite.org), Bioresource Impact Factor (BRIF), and nanopublication. To test the idea of microattribution reviews [Axton, 2008], hemoglobin variants have been collectively annotated together with terms from the MeSH controlled vocabulary describing phenotypes, population frequency data, geographic origin in the HbVar database (globin.bx.psu.edu/hbvar), and Globin Gene Server LOVD (lovd.bx.psu.edu). Each entry in the review gives citation credit to the original observers and the curators, both for published and unpublished data. After publication, these data can be represented on a wiki browser to collect comments and future variant reports, and can form the substrate for Concept Web nanopublication constructs (RDF triples). Unique, persistent identifiers for bioresources will be piloted in several projects, including the ongoing GEN2PHEN project (www.gen2phen.org) and Bio-SHARE-EU, which will commence in early 2011. Tracking citations to these identifiers from various scholarly works will enable the application of bibliometrics to assess impact of individual publications, as well as larger data compilations (databases). Via the DataCite initiative, bioresource IDs can be equated with DOIs and cited in the same way as preprints and other nonreviewed public resources. Publication of exacting research and “mutation updates,” where deposition of variants is rewarded by authorship, has previously been restricted to Nature Genetics and Human Mutation.
Clinics have a central role in the collection and utilization of genetic knowledge because of their direct interaction with the patient and with the asymptomatic family members who may request genetic testing. Therefore, accurate and standardized phenotype data must accompany genotype data to maximize their clinical benefit. The Workgroup voted unanimously that phenotype data should be provided by clinics despite the challenges posed, and agreed that advancing our understanding of genome variation will not proceed without correlation with human phenotypes. To accomplish this task, standardized phenotyping definitions need to be developed that have universal rather than disease-specific applications. Furthermore, bioinformatic systems are needed to assemble these data and to collate with published information. The issue of the role of the lab versus the clinic in acquiring phenotype information remains undecided. Regulation or legislation might be used to require either or both to participate in this process. However, there were serious misgivings that enforced collection would be difficult to supervise and might impede the provision of genetic tests, especially for rare disorders.
This Workgroup discussed the challenges of data integration and access from heterogeneous systems outside the central databases, such as disease-specific mutation databases. The systems need to have a user-friendly interface and data need to be standardized. It was concluded that many of the available systems are unfamiliar to the individual researcher and professionals working in the healthcare system. It was resolved to establish a Webpage on the HVP Website, which gives a listing of all the available software/tools and further information, such as the IT language used for their development, availability, price, URL, contact persons or institutes, and the experience with their use and implementation. A discussion forum should be created for people to submit problems and queries. Because a related initiative has already been started in the GEN2PHEN project, HVP will interact with this subgroup to share the large amount of work. An upcoming special issue of Human Mutation, co-edited by Peter Robinson and Annika Lindblom, will discuss these systems in more detail.
There have been encouraging advances in this area in recent years, including (1) standard international nomenclature rules (www.hgvs.org), (2) professional Standards and Guidelines for Clinical Molecular Laboratories (see American College of Medical Genetics, www.acmg.net), (3) multiple gene-specific databases, and (4) algorithms using multiple predictive software tools, functional assays, and family studies that are available for prediction of pathogenicity of variants of unknown clinical significance. However, the collection of data remains a challenge. The barriers identified include lack of incentive for laboratories to provide genotypic data and for clinicians to provide phenotypic data, concerns of patient confidentiality and HIPAA regulations, and practical issues surrounding data transfer. Also, the trend toward clinical syndrome, exome or genome sequencing is going to increase the scale of data substantially, and gene-specific databases may not be appropriate for these data.
Unclassified variants (UVs), such as missense mutations in disease-associated genes, are a major problem in the interpretation of clinical significance of mutations. For example, a very significant proportion of the described BRCA1 and BRCA2 sequence variants fall in this category. The overall aims of this group are (1) to classify the pathogenicity of genetic variants with sufficient confidence to use clinically, (2) to pursue classification using methods that incorporate multiple lines of evidence or data types, and (3) to establish standards, validation, quantification, and transparency in evaluating each data type. Since the 2008 HVP meeting, considerable progress has been made with assigning pathogenicity of UVs, particularly in hereditary breast and colorectal cancers by the International Agency for Research on Cancer (IARC) Unclassified Genetic Variants Working Group. This work and other important initiatives were discussed in the meeting.
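The idea of combining multiple lines of evidence can be illustrated with Bayes’ Theorem in its odds form, where posterior odds equal prior odds multiplied by the likelihood ratio of each evidence type. The following is a minimal sketch only and is not the working group's actual model: it assumes the evidence types are independent, the function name `posterior_probability` is hypothetical, and the prior and likelihood ratios are invented numbers chosen for illustration.

```python
from math import prod
from typing import Sequence


def posterior_probability(prior: float, likelihood_ratios: Sequence[float]) -> float:
    """Combine a prior probability of pathogenicity with likelihood ratios.

    Assumption: each likelihood ratio comes from an independent line of
    evidence (e.g. co-segregation, in silico prediction), so they multiply.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Invented illustration: prior of 0.10 and two lines of evidence with
# likelihood ratios 9 and 2 in favor of pathogenicity.
example = posterior_probability(0.10, [9.0, 2.0])
```

In this invented case the prior odds of 1:9 are multiplied by 18 to give posterior odds of 2:1, that is, a posterior probability of about 0.67. With no evidence supplied, the function simply returns the prior.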
A unique set of challenges for the global aspirations of HVP is created by the diverse ethical and legal requirements of individual countries and cultural differences relating to human health. Therefore, a centrally mandated one-size-fits-all approach is not feasible, and HVP has developed its One Project-Two Channels-Multiple Locations strategy. Central to this strategy is the establishment of HVP Country-Specific Nodes, which can range from one large country having several nodes to several countries grouping together to form a single node. The steps required to begin the process of creating an HVP Country-Specific Node are shown in Box 6.
The Workgroup recommended that HVP commence a concerted worldwide educational campaign, which targets policy makers, healthcare providers, scientists, patients, families, patient support groups, and the public. This campaign should aim to explain the importance of data collection demonstrating the potential universal benefit, increase the number of efficient curators, and advise about setting up a country node and maintaining it. One of the objectives of HVP is to involve emerging countries in the generation of molecular data, by developing capacity and skill and funding training programs. The Workgroup recommended establishing further regional/specialist networks, such as the African Genome Initiative, Pan Asian Consortium, the Center for Arab Genomic Studies, and the emerging Ibero–American HVP network initiative. These networks and linkages ensure regional networking and sustainability. An article detailing the setting up of a country node is in preparation, and a pilot study has been funded in Australia to join other country initiatives, which need to be implemented.
This Workgroup discussed how to improve the clinical utility of genetic information stored in databases and assist translation of gene tests from research to clinical practice. The data collected should be complete enough to be clinically useful, including both genotype and phenotype information. This is particularly important as genetic testing is increasingly integrated into mainstream clinical services, because many medical professionals are likely to have less experience in the interpretation of genetic data than clinical geneticists. The Workgroup recommended that laboratories and clinics submit genotypes with succinct data sets of phenotypes and family data in coded form.
Reports should include as much phenotypic data as possible to facilitate interpretation of whether a variant is clearly pathogenic, clearly neutral, or a Variant of Uncertain Significance (VUS, also called UVs). However, final classification of pathogenicity for sequence variants should be undertaken by expert panels, to avoid presenting conflicting conclusions to the biomedical and clinical communities. A pilot study is underway under the auspices of InSiGHT to classify UVs in the DNA mismatch repair genes responsible for Lynch syndrome. The need was identified for Web-based tools to be used for immediate classification of variants. The online resource should list a network of interested scientists offering in silico studies, functional studies, and association studies to classify UVs in the genes of interest. The Workgroup also recommended that HVP should endorse accreditation of labs with standards used in each country and a system of a limited number of accredited LSDBs for high quality. HVP should have a way of certifying databases with a scoring system to meet minimum requirements.
There are 1,550 LSDBs currently listed by the HGVS (www.hgvs.org/dblist/glsdb.html). Major initiatives, such as the EU Framework Project 7 funded GEN2PHEN project, have provided the infrastructure necessary to establish and maintain more LSDBs. Specialized software has been developed that allows fully Web-based initiation and curation of LSDBs following HGVS recommendations and has led to a more uniform presentation and content. This Workgroup covered three important development areas for LSDBs: data transfer between databases, and from laboratories to databases, as well as database systems and curation.
As much more data are going to be available in the near future it is important to automate routine steps to allow curators to concentrate on development and quality control. All variation sources should use the same standards, such as HGVS nomenclature, LRG or RefSeqGene sequences, HGNC gene names, dbSNP reference numbers, and ISO certified standards. The Workgroup recommended that a rating system be adopted for database quality control, which should be reevaluated at frequent intervals, for example, every 3 years. A training module for LSDB curators should be established to educate about the quality requirements and standards. Database curators should be given credit in published articles and each database should obtain a digital identifier. LSDBs should be consistent with the LSDB object model developed at GEN2PHEN and allow easy integration with other resources.
This session discussed how information related to human variants and associated phenotypes is being processed and represented by central databases, and focused on recommendations for data elements that should be centralized. The EBI and NCBI are ready to accept submissions of consented summary data from LSDBs and diagnostic laboratories and have developed a clearly defined process to do so. The minimum dataset required for data submission is shown in Box 9, together with additional, optional recommended fields [Cotton et al., 2008]. For their part, the central databases are committed to allowing data to be imported and displayed on common genome browsers and to supporting bidirectional data exchange. They will validate HGVS names of the submissions, assign database identifiers where necessary, and provide the technical solutions to enable placement on current and future genome assemblies. They can provide an easy-to-download report with Database, PubMed and OMIM identifiers, allele frequencies from population studies, GWAS associations, representation on LRG/RefSeqGene sequences, and validation results. For data that are not consented for release, the EBI and NCBI provide archives for controlled access data (EGA/dbGaP). A number of specific actions were agreed upon, such as NCBI and EBI to develop standard exchange formats, in conjunction with GEN2PHEN. The central database work group recommends creating a single LSDB Registry in a central location and the use of a phenotype and variation ontology.
Expert data curation is necessary to ensure that the publicly available gene mutation data are accurate. No single body worldwide can be expected to bear all the cost. This session proposed various approaches to the sustainability of this program, examined the cost-effectiveness of using virtual genomic information as a tool to enhance disease screening (cancer), and finally proposed a global mode of cooperation to sustain the HVP effort. The existing sustainability models that have been successful for other scientific consortia include (1) mixed model of institutional/government funding and cost recovery; (2) partnership between academic and private sector; (3) distributed cost model, where the cost of infrastructure and its administration is distributed across different grants and contracts within an overall scientific and funding plan; (4) split cost by country, disease, gene, such as the “Adopt-a-Gene™ or Adopt-a-Disease™ approach”; (5) top-down (UNESCO/WHO imprimatur) and bottom-up (individual groups/consortia seeking and lobbying for funding) strategies to seek funding. It was recommended that HVP develop an overall scientific plan for specific research and translational projects that will incorporate infrastructure cost, following the approach that is being piloted by the InSiGHT group. HVP should maintain a running list of grants and research projects that rely on or benefit from the HVP initiative to provide leverage with funders and establish a consortium of global funders interested in the HVP goals and mission.
Three societies that support the goals and the mission of HVP, and that aim to spread activities to their genes of interest, also held their own Workshops with the view to advance disease-specific variation data collection and utilization. InSiGHT is the peak organization representing health professionals and researchers working on inheritable gastrointestinal cancers (www.insightgroup.org). InSiGHT has volunteered to be a pilot for collecting mutation data for all genes from all countries, and this system is intended to form the template for other disease- or gene-specific collections. The purpose of the InSiGHT Workshop was to build on the initiatives founded at the 2009 meeting in Duesseldorf between InSiGHT, the HVP, and the National Institutes of Health Colon Family Register [Kohonen-Corish et al., 2010], to understand the progress and direction of the LOVD Colon Cancer Gene Variant Databases, to establish a robust process for addressing unclassified variants, and to commission a phenotype template on the database.
A summary of this meeting is included in the Supporting Information and a full report is being prepared for publication.
The Nutrigenomics Organization (www.nugo.org) organized a Micronutrient Genomics Project (MGP) Workshop, the fourth in a series of meetings to establish and organize the creation of an international micronutrient genomics knowledge base (www.nugo.org/micronutrients) and research effort. The MGP is planning a public bioinformatics resource consisting of a knowledge base with integrated analytical tools and databases. A key distinction of this effort will be the ability for new research results to be stored, managed, and retrieved for analyses. The three components of the MGP knowledge base are (1) a genetic variation module for all micronutrient-relevant variations; (2) a micronutrient pathway module that links pathways and gene–nutrient interactions at the level of RNA, protein, and metabolites (micronutrients.wikipathways.org); and (3) a database of omics data, phenotype, and study design. Working groups consisting of researchers and experts have been or are being established for each micronutrient. A summary of this meeting is included in the Supporting Information, and a full report is being prepared for publication.
The Neurogenetics Consortium held its second HVP Workshop for the implementation and improvement of mutation databases for genetic disorders of the nervous system. The specific challenges of this task were discussed, including clinical complexity and overlap, genetic heterogeneity, phenotype nomenclature, variant interpretation, informatics procedures, and ethical aspects. Specific case examples on inherited neuropathies, channelopathies, motor neuron diseases, Parkinson’s disease, and mitochondrial cytopathies were presented. This multidisciplinary meeting was attended by clinical neurologists, clinical geneticists, basic researchers, private companies, and informaticians. The meeting resolved to form international expert working groups with the aim to establish coordinated LSDBs on neurogenetic disorders such as spastic paraparesis, Charcot-Marie-Tooth, and mitochondrial diseases. Researchers and healthcare professionals with an interest in the field are encouraged to join these efforts. The main goals and specific tasks emanating from the meeting are summarized in Box 11.
A more detailed report of the two Neurogenetics Consortium meetings will be published elsewhere.
“The vision of the Human Variome Project is to be a catalyst for reduction in human disease in the 21st century by facilitating the establishment and maintenance of standards, systems, and infrastructure for the worldwide collection and sharing of all genetic variations affecting human disease. The HVP is an international consortium committed to reducing the burden of genetic disease on the world’s population. We believe that the collection of information on every instance of a genetic variation and its effect on human health is the only way that our vision can be achieved. The sharing of information on genetic variation and its consequences allows existing treatments to be delivered more effectively to patients and new treatments and cures to be developed. To ensure the complete capture of all human genetic variation, the HVP is focused on collecting information through two separate, yet complementary, channels: country-specific collection and gene/disease specific collection.”
The full document “Project Roadmap 2010–2012,” which outlines all HVP activities and its governance structure, is publicly available (www.humanvariomeproject.org/index.php/publications/policy-documents).
Any interested individuals can join the HVP Consortium by registering at www.humanvariomeproject.org. Countries, databases, organizations, and groups are encouraged to participate by becoming a Partner Initiative or an Affiliated Initiative and should apply in writing to the Scientific Advisory Committee (www.humanvariomeproject.org/index.php/about/scientific-advisory-committee) via the HVP Coordinating Office in Melbourne, Australia.
The collation of genetic variations began in the 1950s by researchers and clinicians who established LSDBs to assist research and clinical care. In the 1990s, the discovery of single nucleotide polymorphisms (SNPs) grew into a colossal effort, resulting in dbSNP. Efforts within the field have achieved increasing alignment between the two approaches; however, their application and infrastructure still require greater consistency. As exemplified in this meeting, the HVP has drawn together disparate groups, by country, gene of interest, and expertise, who are working for the common good with the shared goal of pushing the boundaries of the human variome and collaborating to avoid the unnecessary duplication often prevalent in this area of healthcare. Consequently, more countries are expressing interest, both those gathering and curating mutational data and those with critical relevant technologies.
The 12 sessions dealt with the general aspects needed for phenotype and genotype documentation in inherited disease. In concert with these sessions, three “applied” satellites were convened, one of which addressed the initial HVP/InSiGHT pilot study into inherited colon cancer. Importantly, this gives the variome community a test bed for approaches chosen for their apparent ability to assist busy labs and clinical workers to submit and access data critical to their patients. The community represented at this meeting has focused on the projects that are critical to further their area and represent targets for collaborations, major projects, and student projects. This list is a substantial advance on the 96 recommendations outlined by the first meeting [Cotton et al., 2007]. Furthermore, many of these 96 recommendations have been acted upon during the past 4 years. It is hoped that meetings in specialized areas, such as pathogenicity of variants and disease-specific genes, will lead to similar advances in the next years.
Special thanks must go to those at UNESCO whose support allowed this special meeting to occur in such a high profile venue, especially to Maciej Nalecz and Walter Erdelen. The Patronage by UNESCO was particularly welcome. Dr. Erdelen is thanked for his wonderful introductory comments. Julia Hasler and Casimiro Vizzini are thanked for their careful attention to detail and day-to-day support, which allowed the meeting to run smoothly and productively. We are grateful to Heather Howard, Rania Horaitis, Tim Smith, and Lauren Martin for their diligence, and the excellent organizational work they put into the meeting. We thank Wiley–Blackwell Publishing for sponsoring the Editors Meeting and EBI for holding a training workshop as a satellite event. Finally, we thank the speakers and attendees who made the meeting such a successful event.
Additional Supporting Information may be found in the online version of this article.
Communicated by Mark Paalman
Disclaimer: The views discussed in this publication do not necessarily reflect those of the U.S. FDA.