In our studies of genetically isolated populations in a remote mountain area in the center of Sardinia (Italy), we found that 80–85% of the inhabitants of each village belong to a single huge pedigree with families closely connected to each other through hundreds of loops. Moreover, intermarriages between villages join pedigrees of different villages through links that make family trees even more complicated. Unfortunately, none of the commonly used pedigree drawing tools can draw the complete pedigree, even though visual representation of families is widely regarded as important because it helps researchers identify clusters of inherited traits and genotypes. This representation problem compels researchers to work with subsets extracted from the overall genealogy, causing a serious loss of information on familial relationships.
To visually explore such complex pedigrees, we developed PedNavigator, a browser for genealogical databases properly suited for genetic studies.
PedNavigator is useful for genealogical research because it can represent family relations between persons and allows visual verification of the links during family history reconstruction. For genetic studies, it helps to follow the propagation of a specific set of genetic markers (haplotype) or to select people for linkage analysis by showing the relations between the various branches of a family tree of affected subjects.
PedNavigator is an application integrated into a framework, based on the Oracle platform, designed to handle data for human genetic studies. To make PedNavigator available to people who lack the required informatics infrastructure, we developed PedNavigator Lite, which has largely the same features as the integrated version and is based on the MySQL database server. This version is free for academic users, and it is available for download from our site
Meiotic crossovers are the major mechanism by which haplotypes are shuffled to generate genetic diversity. Previously available methods for the genome-wide, high-resolution identification of meiotic crossover sites are limited by the laborious nature of the assay (as in sperm typing).
Several methods have been introduced to identify crossovers using high density single nucleotide polymorphism (SNP) array technologies, although programs are not widely available to implement such analyses.
Here we present a two-generation "reverse pedigree analysis" method (analyzing the genotypes of two children relative to each parent) and a web-accessible tool to determine and visualize inheritance differences among siblings and crossover locations on each parental gamete. This approach is complementary to existing methods and uses informative markers which provide high resolution for locating meiotic crossover sites. We introduce a segmentation algorithm to identify crossover sites and, using a synthetic data set, determined that its specificity was 92% and its sensitivity was 89%. The use of reverse pedigrees allows the inference of crossover locations on the X chromosome in a maternal gamete through analysis of two sons and their father. We further analyzed genotypes from eight multiplex autism families, observing a maternal-to-paternal recombination ratio of 1.462 and no significant differences between affected and unaffected children. Meiotic recombination results from pediSNP can also be used to identify haplotypes that are shared by probands within a pedigree, as we demonstrated with a multiplex autism family.
Using "reverse pedigrees" and defining unique sets of genotype markers within pedigree data, we introduce a method that identifies inherited allelic differences and meiotic crossovers. We implemented the method in the pediSNP software program, and we applied it to several data sets. This approach uses data from two generations to identify crossover sites, facilitating studies of recombination in disease. pediSNP is available online at .
To assess the ability of My Family Health Portrait (MFHP) to accurately collect family history for six common heritable disorders.
Family history is useful to assess disease risk, but is not widely used. We compared the pedigree from MFHP, an online tool for collection of family history, to a pedigree supplemented by a genetics professional.
150 volunteers collected their family histories using MFHP. A genetic counselor interviewed the volunteers to validate the entries and add diagnoses, as needed. The content and the affection assignments of the pedigrees were compared. The pedigrees were entered into Family Healthware™ to assess risks for the diseases.
The sensitivity of MFHP varied among the 6 diseases (67–100%) compared to the supplemented pedigree. The specificities ranged from 92–100%. When the pedigrees were used to generate risk scores, MFHP yielded identical risks to the supplemented pedigree for 94–99% of the volunteers for diabetes and colon, breast, and ovarian cancer. The agreement was lower for coronary artery disease (68%) and stroke (83%).
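The sensitivity and specificity figures above follow the usual confusion-matrix definitions, with the supplemented pedigree as the reference standard. The following is a minimal sketch of that arithmetic; the function name and the example counts are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: affected per the reference pedigree, captured/missed by the tool.
    tn/fp: unaffected per the reference pedigree, correctly/incorrectly flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one affected and one unaffected case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for a single disease (illustration only, not study data).
sens, spec = sensitivity_specificity(tp=20, fn=10, tn=115, fp=5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```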
These data support the validity of MFHP pedigrees for four common conditions – diabetes and colon, breast, and ovarian cancer. The tool performed less well for coronary artery disease and stroke. We recommend that the tool be improved to better capture information for these two common conditions.
My Family Health Portrait; common disease; family history; risk assessment; pedigree
A significant portion of frontotemporal lobar degeneration (FTLD) is due to inherited gene mutations, and we are unaware of a large sequential series that includes a recently discovered inherited cause of FTLD. There is also great need to develop clinical tools and approaches that will assist clinicians in the identification and counseling of patients with FTLD and their families regarding the likelihood of an identifiable genetic cause.
To ascertain the frequency of inherited FTLD and develop validated pedigree classification criteria for FTLD that provide a standardized means to evaluate pedigree information and insight into the likelihood of mutation-positive genetic test results for C9orf72, MAPT, and GRN.
Information about pedigrees and DNA was collected from 306 serially assessed patients with a clinical diagnosis of FTLD. This information included gene test results for C9orf72, MAPT, and GRN. Pedigree classification criteria were developed based on a literature review of FTLD genetics and pedigree tools and then refined by reviewing mutation-positive and -negative pedigrees to determine differentiating characteristics.
Academic medical center.
Patients with FTLD.
MAIN OUTCOMES AND MEASURES
The rate of C9orf72, MAPT, or GRN mutation–positive FTLD in this series was 15.4%. Categories designating the risk level for hereditary cause were termed high, medium, low, apparent sporadic, and unknown significance. Thirty-nine pedigrees (12.7%) met criteria for high, 31 (10.1%) for medium, 46 (15.0%) for low, 91 (29.7%) for apparent sporadic, and 99 (32.4%) for unknown significance. The mutation-detection rates were as follows: high, 64.1%; medium, 29%; low, 10.9%; apparent sporadic, 1.1%; and unknown significance, 7.1%. Mutation-detection rates differed significantly between the high and other categories.
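As a consistency check on the figures above, the category shares can be recomputed from the reported counts. The per-category numbers of mutation-positive pedigrees are not reported in the abstract; in this sketch they are back-calculated from the rounded detection rates and should be read as an assumption.

```python
# Pedigree counts and mutation-detection rates (%) as reported in the text.
counts = {
    "high": 39,
    "medium": 31,
    "low": 46,
    "apparent sporadic": 91,
    "unknown significance": 99,
}
detection_rate_pct = {
    "high": 64.1,
    "medium": 29.0,
    "low": 10.9,
    "apparent sporadic": 1.1,
    "unknown significance": 7.1,
}

total = sum(counts.values())
shares_pct = {k: round(100 * n / total, 1) for k, n in counts.items()}

# Assumption: implied positives are inferred from rounded percentages.
implied_positive = {
    k: round(detection_rate_pct[k] * counts[k] / 100) for k in counts
}
overall_pct = round(100 * sum(implied_positive.values()) / total, 1)

print(total, shares_pct)
print(implied_positive, overall_pct)
```

Under that assumption, the implied overall rate agrees with the 15.4% stated for the whole series.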
CONCLUSIONS AND RELEVANCE
Mutation rates are high in FTLD spectrum disorders, and the proposed criteria provide a validated standard for the classification of FTLD pedigrees. The combination of pedigree criteria and mutation-detection rates has important implications for genetic counseling and testing in clinical settings.
Pedigree genotype datasets are used for analysing genetic inheritance and to map genetic markers and traits. Such datasets consist of hundreds of related animals genotyped for thousands of genetic markers and invariably contain multiple errors in both the pedigree structure and in the associated individual genotype data. These errors manifest as apparent inheritance inconsistencies in the pedigree, and invalidate analyses of marker inheritance patterns across the dataset. Cleaning raw datasets of bad data points (incorrect pedigree relationships, unreliable marker assays, suspect samples, bad genotype results etc.) requires expert exploration of the patterns of exposed inconsistencies in the context of the inheritance pedigree. In order to assist this process we are developing VIPER (Visual Pedigree Explorer), a software tool that integrates an inheritance-checking algorithm with a novel space-efficient pedigree visualisation, so that reported inheritance inconsistencies are overlaid on an interactive, navigable representation of the pedigree structure.
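To illustrate what an inheritance inconsistency looks like, the sketch below checks each child genotype against its parents at an autosomal marker. It is a simple trio check written for illustration and is not VIPER's own inheritance-checking algorithm; the data layout, function names, and example animals are assumptions.

```python
def is_mendelian_consistent(child, father, mother):
    """Check one autosomal marker; genotypes are 2-tuples of alleles.

    A missing genotype (None) cannot contradict inheritance, so it passes.
    """
    if child is None or father is None or mother is None:
        return True
    a, b = child
    # One allele must come from the father and the other from the mother.
    return (a in father and b in mother) or (b in father and a in mother)


def find_inconsistencies(pedigree, genotypes):
    """Return (child_id, marker) pairs that violate Mendelian inheritance.

    pedigree:  {individual_id: (father_id, mother_id)}, None for founders.
    genotypes: {individual_id: {marker: (allele1, allele2)}}.
    """
    errors = []
    for child, (father, mother) in pedigree.items():
        for marker, child_gt in genotypes.get(child, {}).items():
            father_gt = genotypes.get(father, {}).get(marker)
            mother_gt = genotypes.get(mother, {}).get(marker)
            if not is_mendelian_consistent(child_gt, father_gt, mother_gt):
                errors.append((child, marker))
    return errors


# Hypothetical trio: the calf cannot be B/B at m1 if the sire is A/A.
pedigree = {"sire": (None, None), "dam": (None, None), "calf": ("sire", "dam")}
genotypes = {
    "sire": {"m1": ("A", "A"), "m2": ("A", "B")},
    "dam": {"m1": ("A", "B"), "m2": ("B", "B")},
    "calf": {"m1": ("B", "B"), "m2": ("A", "B")},
}
print(find_inconsistencies(pedigree, genotypes))
```

A flagged pair does not say which data point is wrong: the same inconsistency can arise from a genotyping error in any member of the trio or from an incorrect parent assignment, which is why the text stresses exploring the pattern of errors in the context of the pedigree.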
Methods and results
This paper describes an evaluation of how VIPER displays the different scales and types of dataset that occur experimentally, with a description of how VIPER's display interface and functionality meet the challenges presented by such data. We examine a range of possible error types found in real and simulated pedigree genotype datasets, demonstrating how these errors are exposed and explored using the VIPER interface and we evaluate the utility and usability of the interface to the domain expert.
Evaluation was performed as a two stage process with the assistance of domain experts (geneticists). The initial evaluation drove the iterative implementation of further features in the software prototype, as required by the users, prior to a final functional evaluation of the pedigree display for exploring the various error types, data scales and structures.
The VIPER display was shown to effectively expose the range of errors found in experimental genotyped pedigrees, allowing users to explore the underlying causes of reported inheritance inconsistencies. This interface will provide the basis for a full data cleaning tool that will allow the user to remove isolated bad data points, and reversibly test the effect of removing suspect genotypes and pedigree relationships.
A complete family history is critical to the assessment of genetic risk of hereditary diseases. Kinsys© is a family history-tracking program used by genetic counselors and healthcare professionals for risk assessment. A pedigree, which is a graphic drawing of family history, is the most important component in such a program. In this project, we analyzed the current pedigree displays of Kinsys© and identified their problems and limitations. Following the principles of user-centered visualization methodology, we then developed innovative pedigree displays for a new version of Kinsys© that overcome some limitations of the original displays.
A multigenerational medical family history graphically recorded as a pedigree or family tree is a cost-effective tool in preconception counseling to identify couples at risk to have offspring with inherited disorders and to identify if either partner has a personal risk for a disorder with a genetic etiology. Interpretation of a medical family history can provide risk assessment for reproductive planning and choices, inform a diagnosis to help identify a patient’s medical screening needs and clinical management, and build rapport with the patient or couple. The use of standardized pedigree nomenclature is paramount to healthcare delivery as electronic medical records become universal. The trend towards having patients prepare a medical family history in advance of the first clinic visit is a way to empower patients to take charge of their health, and also allow health professionals to spend more focused time in confirming and interpreting family history at the visit instead of constructing family history. This article reviews standardized pedigree symbols, clues to identifying “red flags” in family history (with a focus on preconception genetic counseling), the pedigree as a psychosocial tool, and resources for obtaining a medical family history.
Electronic medical record; Family history; Genetic counseling; Pedigree; Standardized pedigree nomenclature
To evaluate the potential effect of computer support on general practitioners' management of familial breast and ovarian cancer, and to compare the effectiveness of two different types of computer program.
Crossover experiment with balanced block design.
Of a random sample of 100 general practitioners from Buckinghamshire who were invited, 41 agreed to participate. From these, 36 were selected for a fully balanced study.
Doctors managed 18 simulated cases: 6 with computerised decision support system Risk Assessment in Genetics (RAGs), 6 with Cyrillic (an established pedigree drawing program designed for clinical geneticists), and 6 with pen and paper.
Main outcome measures
Number of appropriate management decisions made (maximum 6), mean time taken to reach a decision, number of pedigrees accurately drawn (maximum 6). Secondary measures were method of support preferred for particular aspects of managing family histories of cancer; importance of specific information on cancer genetics that might be provided by an “ideal computer program.”
RAGs resulted in significantly more appropriate management decisions (median 6) than either Cyrillic (median 3) or pen and paper (median 3); median difference between RAGs and Cyrillic 2.5 (95% confidence interval 2.0 to 3.0; P<0.0001). RAGs also resulted in significantly more accurate pedigrees (median 5) than both Cyrillic (median 3.5) and pen and paper (median 2); median difference between RAGs and Cyrillic 1.5 (1.0 to 2.0; P<0.0001). The time taken to use RAGs (median 178 seconds) was 51 seconds longer per case (95% confidence interval 36 to 65; P<0.0001) than pen and paper (median 124 seconds) but was less than Cyrillic (median 203 seconds; difference 23 (5 to 43; P=0.02)). 33 doctors (92% (78% to 98%)) preferred using RAGs overall. The most important elements of an “ideal computer program” for genetic advice in primary care were referral advice, the capacity to create pedigrees, and provision of evidence and explanations to support advice.
RAGs could enable general practitioners to be more effective gatekeepers to genetics services, empowering them to reassure the majority of patients with a family history of breast and ovarian cancer who are not at increased genetic risk.
Health care providers need simple tools to identify patients at genetic risk of breast and ovarian cancers. Genetic risk prediction models such as BRCAPRO could fill this gap if incorporated into Electronic Medical Records or other Health Information Technology solutions. However, BRCAPRO requires potentially extensive information on the counselee and her family history. Thus, it may be useful to provide simplified version(s) of BRCAPRO for use in settings that do not require exhaustive genetic counseling.
We explore four simplified versions of BRCAPRO, each using less complete information than the original model. BRCAPROLYTE uses information on affected relatives only up to second degree. It is in clinical use but has not been evaluated. BRCAPROLYTE-Plus extends BRCAPROLYTE by imputing the ages of unaffected relatives. BRCAPROLYTE-Simple reduces the data collection burden associated with BRCAPROLYTE and BRCAPROLYTE-Plus by not collecting the family structure. BRCAPRO-1Degree only uses first-degree affected relatives. We use data on 2713 individuals from seven sites of the Cancer Genetics Network and MD Anderson Cancer Center to compare these simplified tools with the Family History Assessment Tool (FHAT) and BRCAPRO, with the latter serving as the benchmark.
BRCAPROLYTE retains high discrimination; however, because it ignores information on unaffected relatives, it overestimates carrier probabilities. BRCAPROLYTE-Plus and BRCAPROLYTE-Simple provide better calibration than BRCAPROLYTE, so they have higher specificity for similar values of sensitivity. BRCAPROLYTE-Plus performs slightly better than BRCAPROLYTE-Simple. The areas under the ROC curve are 0.783 (BRCAPRO), 0.763 (BRCAPROLYTE), 0.772 (BRCAPROLYTE-Plus), 0.773 (BRCAPROLYTE-Simple), 0.728 (BRCAPRO-1Degree), and 0.745 (FHAT). The simpler versions, especially BRCAPROLYTE-Plus and BRCAPROLYTE-Simple, lead to only modest loss in overall discrimination compared to BRCAPRO in this dataset.
Simplified implementations of BRCAPRO can be used for genetic risk prediction in settings where collection of complete pedigree information is impractical.
A fundamental goal of single nucleotide polymorphism (SNP) genotyping is to determine the sharing of alleles between individuals across genomic loci. Such analyses have diverse applications in defining the relatedness of individuals (including unexpected relationships in nominally unrelated individuals, or consanguinity within pedigrees), analyzing meiotic crossovers, identifying a broad range of chromosomal anomalies such as hemizygous deletions and uniparental disomy, and analyzing population structure.
We present SNPduo, a command-line and web accessible tool for analyzing and visualizing the relatedness of any two individuals using identity by state. Using identity by state does not require prior knowledge of allele frequencies or pedigree information, is more computationally tractable, and is less affected by population stratification than calculating identity by descent probabilities. The web implementation visualizes shared genomic regions, and generates UCSC viewable tracks. The command-line version requires pedigree information for compatibility with existing software and for determining specified relationships, even though pedigrees are not required for IBS calculation; it generates no visual output, is written in portable C++, and is well-suited to analyzing large datasets. We demonstrate how the SNPduo web tool identifies meiotic crossover positions in siblings, and confirm our findings by visualizing meiotic recombination in synthetic three-generation pedigrees. We applied SNPduo to 210 nominally unrelated Phase I / II HapMap samples and, consistent with previous findings, identified six undeclared pairs of related individuals. We further analyzed identity by state in 2,883 individuals from multiplex families with autism and identified a series of anomalies including related parents, an individual with mosaic loss of chromosome 18, an individual with maternal heterodisomy of chromosome 16, and unexplained replicate samples.
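Identity by state at a single SNP is simply the number of alleles (0, 1, or 2) that two genotypes have in common, which is why no allele frequencies or pedigree are needed. The sketch below illustrates that calculation; it is written in Python for illustration and is not SNPduo's C++ implementation, and the genotype encoding (two-letter strings, None for a missing call) and function names are assumptions.

```python
def ibs_state(g1, g2):
    """Return the number of alleles (0, 1, or 2) shared identical by state.

    Genotypes are unordered two-character strings such as "AA" or "AB".
    """
    a, b = sorted(g1), sorted(g2)
    if a == b:
        return 2  # same genotype, e.g. AB vs BA
    if set(a) & set(b):
        return 1  # exactly one allele in common, e.g. AA vs AB
    return 0      # opposite homozygotes, e.g. AA vs BB


def ibs_profile(genos1, genos2):
    """IBS state per SNP for two individuals, skipping missing calls (None)."""
    return [ibs_state(x, y) for x, y in zip(genos1, genos2) if x and y]


# Hypothetical genotypes for two individuals at five SNPs.
person1 = ["AA", "AB", "BB", "AB", None]
person2 = ["AA", "BB", "AA", "AB", "AB"]

states = ibs_profile(person1, person2)
mean_ibs = sum(states) / len(states)
print(states, mean_ibs)
```

Plotting the per-SNP states along a chromosome, rather than only their mean, is what exposes block-like patterns in the sharing between two individuals.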
SNPduo provides the ability to explore and visualize SNP data to characterize the relatedness between individuals. It is compatible with, but distinct from, other established analysis software such as PLINK, and performs favorably in benchmarking studies for the analyses of genetic relatedness.
Glaucoma is the leading cause of irreversible blindness worldwide. Most cases are primary open angle glaucoma (POAG). POAG is a genetically heterogeneous disease; autosomal dominance is the most frequent type of monogenic inheritance. In this study, we identified the genotype of a MYOC mutation and investigated the phenotype of a Chinese juvenile-onset open angle glaucoma (JOAG) pedigree (GZ.1 pedigree).
Blood samples were obtained from 24 participants. We performed sequence and gene linkage analysis in the GZ.1 pedigree retrospectively. Comprehensive ophthalmologic examinations were performed for each family member. Pharmacological treatment or filtering surgery was performed as needed according to the intraocular pressure (IOP) of each individual.
A Pro370Leu myocilin mutation located in exon 3 of MYOC was identified in 24 members of the GZ.1 pedigree. Sixteen patients had juvenile-onset primary open-angle glaucoma (JOAG), and the others participating in the project had no such genotype. Analysis of polymorphic microsatellite markers indicated that the disease in GZ.1 follows an autosomal dominant inheritance pattern. The patients in GZ.1 are characterized by early age of onset (before 35 years of age), severe clinical presentations, and high intraocular pressure unresponsive to pharmacological treatment, so that 89.5% of the patients required filtering surgery. Fortunately, the success rate of surgery was high. None of the patients required further medical treatment and only one demonstrated low IOP fundus changes.
This is the first evidence of a founder effect for a Pro370Leu myocilin mutation in a Chinese POAG pedigree. The family with the Pro370Leu myocilin mutation presents with juvenile-onset glaucoma. After 10 years of follow-up, it is evident that the mutation is closely associated with the phenotype of the patients. Analysis of MYOC in JOAG patients may enable the identification of at-risk individuals and help prevent disease progression toward the degeneration of the optic nerve, and may also contribute to genetic counseling.
This is a guide for fieldwork in Population Medical Genetics research projects. Data collection, handling, and analysis from large pedigrees require the use of specific tools and methods not widely familiar to human geneticists, which unfortunately leads to ineffective graphic pedigrees. Initially, the objective of the pedigree must be decided, and the available information sources need to be identified and validated. Data collection and recording by the tabulated method is advocated, and the involved techniques are presented. Genealogical and personal information are the two main components of pedigree data. While the latter is unique to each investigation project, the former is solely represented by gametic links between persons. The triad of a given pedigree member and its two parents constitutes the building unit of a genealogy. Likewise, the three ID numbers representing the elements of the triad form the record required for any pedigree analysis. Pedigree construction, as well as pedigree and population data analysis, varies according to the pre-established objectives, the existing information, and the available resources.
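The triad record described above can be sketched as a table of three ID numbers per person, from which the whole genealogy is recoverable. The example below is a minimal illustration; the use of 0 for an unknown parent, the function names, and the sample records are assumptions and are not taken from the guide.

```python
# Each record is (individual_id, father_id, mother_id).
# Assumed convention: 0 marks an unknown parent (a founder).
records = [
    (1, 0, 0),
    (2, 0, 0),
    (3, 1, 2),
    (4, 1, 2),
    (5, 0, 0),
    (6, 5, 3),
]


def build_parent_map(records):
    """Index the triads by individual ID."""
    return {ind: (father, mother) for ind, father, mother in records}


def ancestors(ind, parents):
    """Collect every known ancestor by following the gametic links upward."""
    found = set()
    stack = [p for p in parents.get(ind, (0, 0)) if p]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(p for p in parents.get(current, (0, 0)) if p)
    return found


parents = build_parent_map(records)
print(ancestors(6, parents))
# Shared ancestors of two individuals reveal how they are related.
print(ancestors(4, parents) & ancestors(6, parents))
```

Personal information (phenotypes, dates, places) can then be kept in separate columns or tables keyed by the same individual ID, keeping the genealogical component independent of the investigation.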
medical genetics; population medical genetics; geographic clusters; isolates; rare diseases
The standard method of studying inherited disease is to observe its pattern of distribution in families, that is, its pattern in a pedigree. For clinical studies focused on inherited disease, a pedigree diagram is a valuable visual tool for the display of inheritance patterns. We describe the creation of a web-based pedigree display module for Trial/DB, a Web accessible database developed at the Yale Center for Medical Informatics (YCMI) to support clinical research studies. The pedigree diagram is generated dynamically from the database. The icons representing each subject in the pedigree are selectable hyperlinks that will display detailed clinical data collected on the subject. Microsoft Active Server Page and Scalable Vector Graphics (SVG) are used to create the interactive pedigree diagrams.
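The idea of a dynamically generated diagram whose icons are hyperlinks can be sketched as follows. This is an illustration in Python rather than the Active Server Page code used in Trial/DB; the URL pattern, function names, and layout values are hypothetical. The plain `href` attribute assumes an SVG 2 capable viewer (older SVG 1.1 viewers expect `xlink:href`).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def subject_icon(subject_id, sex, x, y, affected=False, url_base="/subject?id="):
    """Return an SVG fragment: a square (male) or circle (female) inside a link."""
    fill = "black" if affected else "white"
    if sex == "M":
        shape = (f'<rect x="{x - 10}" y="{y - 10}" width="20" height="20" '
                 f'fill="{fill}" stroke="black"/>')
    else:
        shape = f'<circle cx="{x}" cy="{y}" r="10" fill="{fill}" stroke="black"/>'
    # quoteattr escapes the URL and adds the surrounding quotes.
    href = quoteattr(url_base + str(subject_id))
    return f"<a href={href}>{shape}</a>"


def pedigree_svg(subjects):
    """Wrap all subject icons in a single SVG document."""
    icons = "".join(subject_icon(**s) for s in subjects)
    return ('<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">'
            f"{icons}</svg>")


# Hypothetical subjects, as they might be read from a database query.
subjects = [
    {"subject_id": 101, "sex": "M", "x": 50, "y": 50},
    {"subject_id": 102, "sex": "F", "x": 120, "y": 50, "affected": True},
]
svg = pedigree_svg(subjects)
root = ET.fromstring(svg)  # raises ParseError if the markup is not well-formed
print(svg)
```

Relationship lines between icons are omitted here; a fuller version would add them from the parent links stored in the database.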
More involvement of sub-Saharan African countries in biomedical studies, specifically in genetic research, is needed to advance individualized medicine that will benefit non-European populations. Missing infrastructure, cultural and religious beliefs as well as lack of understanding of research benefits can pose a challenge to recruitment. Here we describe recruitment efforts for a large genetic study requiring three-generation pedigrees within the Yoruba homelands of Nigeria. The aim of the study was to identify genes responsible for keloids, a wound healing disorder. We also discuss ethical and logistical considerations that we encountered in preparation for this research endeavor.
Protocols for this bi-national intercultural study were approved by the Institutional Review Board (IRB) in the US and the ethics committees of the Nigerian institutions for consideration of cultural differences. Principles of community based participatory research were employed throughout the recruitment process. Keloid patients (patient advisors), community leaders, kings/chiefs and medical directors were engaged to assist the research teams with recruitment strategies. Community meetings, church forums, and media outlets (study flyers, radio and TV announcements) were utilized to promote the study in Nigeria. Recruitment of research participants was conducted by trained staff from the local communities. Pedigree structures were re-analyzed on a regular basis as new family members were recruited and recruitment challenges were documented.
Total recruitment surpassed 4200 study participants over a 7-year period, including 79 families with complete three-generation pedigrees. In 9 families, more than 20 family members participated; however, in 5 of these families we encountered issues with pedigree structure, as members from different branches presented inconsistent family histories. These issues were due to the traditional open family structure amongst the Yoruba and to beliefs in voodoo or juju. In addition, family members living in other parts of the country or abroad complicated timely and complete family recruitment.
Organizational, logistics and ethics challenges can be overcome by additional administrative efforts, good communication, community involvement and education of staff members. However, recruitment challenges due to infrastructural shortcomings or cultural and religious beliefs can lead to significant delays, which may negatively affect study time lines and expectations of funding agencies.
Keloid; Recruitment; Genetics; Families; Yoruba; Nigeria; Low resource settings
Murine models with modified gene function as a result of N-ethyl-N-nitrosourea (ENU) mutagenesis have been used to study phenotypes resulting from genetic change. This study investigated genetic factors associated with red blood cell (RBC) physiology and structural integrity that may impact on blood component storage and transfusion outcome. Forward and reverse genetic approaches were employed with pedigrees of ENU-treated mice using a homozygous recessive breeding strategy. In a “forward genetic” approach, pedigree selection was based upon identification of an altered phenotype followed by exome sequencing to identify a causative mutation. In a second strategy, a “reverse genetic” approach based on selection of pedigrees with mutations in genes of interest was utilised and, following breeding to homozygosity, phenotype assessed. Thirty-three pedigrees were screened by the forward genetic approach. One pedigree demonstrated reticulocytosis, microcytic anaemia and thrombocytosis. Exome sequencing revealed a novel single nucleotide variation (SNV) in Ank1 encoding the RBC structural protein ankyrin-1 and the pedigree was designated Ank1EX34. The reticulocytosis and microcytic anaemia observed in the Ank1EX34 pedigree were similar to clinical features of hereditary spherocytosis in humans. For the reverse genetic approach three pedigrees with different point mutations in Spnb1 encoding RBC protein spectrin-1β, and one pedigree with a mutation in Epb4.1, encoding band 4.1 were selected for study. When bred to homozygosity two of the spectrin-1β pedigrees (a, b) demonstrated increased RBC count, haemoglobin (Hb) and haematocrit (HCT). The third Spnb1 mutation (spectrin-1β c) and mutation in Epb4.1 (band 4.1) did not significantly affect the haematological phenotype, despite these two mutations having a PolyPhen score predicting the mutation may be damaging. 
Exome sequencing allows rapid identification of causative mutations and development of databases of mutations predicted to be disruptive. These tools require further refinement but provide new approaches to the study of genetically defined changes that may impact on blood component storage and transfusion outcome.
Red Blood Cell; N-ethyl-N-nitrosourea (ENU); Ankyrin; Spectrin-1β; Band 4.1; Missense Library
The CDC's Family History Public Health Initiative encourages the adoption of, and increased awareness of, family health history. To meet these goals and develop a personalized medicine implementation science research agenda, the Genomedical Connection is using an implementation research (T3 research) framework to develop and integrate a self-administered computerized family history system with built-in decision support into 2 primary care clinics in North Carolina.
The family health history system collects a three-generation family history on 48 conditions and provides decision support (pedigree and tabular family history, provider recommendation report and patient summary report) for 4 pilot conditions: breast cancer, ovarian cancer, colon cancer, and thrombosis. All adult, English-speaking, non-adopted patients scheduled for well-visits are invited to complete the family health system prior to their appointment. Decision support documents are entered into the medical record and available to providers prior to the appointment. In order to optimize integration, components were piloted by stakeholders prior to and during implementation. Primary outcomes are change in appropriate testing for hereditary thrombophilia and screening for breast cancer, colon cancer, and ovarian cancer one year after study enrollment. Secondary outcomes include implementation measures related to the benefits and burdens of the family health system and its impact on clinic workflow, patients' risk perception, and intention to change health related behaviors. Outcomes are assessed through chart review, patient surveys at baseline and follow-up, and provider surveys. Clinical validity of the decision support is calculated by comparing its recommendations to those made by a genetic counselor reviewing the same pedigree; and clinical utility is demonstrated through reclassification rates and changes in appropriate screening (the primary outcome).
This study integrates a computerized family health history system within the context of a routine well-visit appointment to overcome many of the existing barriers to collection and use of family history information by primary care providers. Results of the implementation process, its acceptability to patients and providers, modifications necessary to optimize the system, and impact on clinical care can serve to guide future implementation projects for both family history and other tools of personalized medicine, such as health risk assessments.
Genomewide association studies have resulted in a great many genomic regions that are likely to harbor disease genes. Thorough interrogation of these specific regions is the logical next step, including regional haplotype studies to identify risk haplotypes upon which the underlying critical variants lie. Pedigrees ascertained for disease can be powerful for genetic analysis due to the cases being enriched for genetic disease. Here we present a Monte Carlo based method to perform haplotype association analysis. Our method, hapMC, allows for the analysis of full-length and sub-haplotypes, including imputation of missing data, in resources of nuclear families, general pedigrees, case-control data or mixtures thereof. Both traditional association statistics and transmission/disequilibrium statistics can be performed. The method includes a phasing algorithm that can be used in large pedigrees and optional use of pseudocontrols.
Our new phasing algorithm substantially outperformed the standard expectation-maximization algorithm that is ignorant of pedigree structure, and hence is preferable for resources that include pedigree structure. Through simulation we show that our Monte Carlo procedure maintains the correct type 1 error rates for all resource types. Power comparisons suggest that transmission-disequilibrium statistics are superior for performing association in resources of only nuclear families. For mixed structure resources, however, the newly implemented pseudocontrol approach appears to be the best choice. Results also indicated the value of large high-risk pedigrees for association analysis, which, in the simulations considered, were comparable in power to case-control resources of the same sample size.
We propose hapMC as a valuable new tool to perform haplotype association analyses, particularly for resources of mixed structure. The availability of meta-association and haplotype-mining modules in our suite of Monte Carlo haplotype procedures adds further value to the approach.
For omics experiments, detailed characterisation of experimental material with respect to its genetic features, its cultivation history and its treatment history is required for analysis with bioinformatics tools and for publication. Furthermore, meta-analysis of several experiments in systems biology approaches makes it necessary to store this information in a standardised manner, preferably in relational databases. With the Golm Plant Database System, we devised a data management system based on a classical Laboratory Information Management System combined with web-based user interfaces for data entry and retrieval to collect this information in an academic environment.
The database system contains modules representing the genetic features of the germplasm, the experimental conditions and the sampling details. In the germplasm module, genetically identical lines of biological material are generated by defined workflows, starting with the import workflow, followed by further workflows like genetic modification (transformation), vegetative or sexual reproduction. The latter workflows link lines and thus create pedigrees. For experiments, plant objects are generated from plant lines and united in so-called cultures, to which the cultivation conditions are linked. Materials and methods for each cultivation step are stored in a separate ACCESS database of the plant cultivation unit. For all cultures and thus every plant object, each cultivation site and the culture's arrival time at a site are logged by a barcode-scanner based system. Thus, for each plant object, all site-related parameters, e.g. automatically logged climate data, are available. These life history data and genetic information for the plant objects are linked to analytical results by the sampling module, which links sample components to plant object identifiers. This workflow uses controlled vocabulary for organs and treatments. Unique names generated by the system and barcode labels facilitate identification and management of the material. Web pages are provided as user interfaces to facilitate maintaining the system in an environment with many desktop computers and a rapidly changing user community. Web based search tools are the basis for joint use of the material by all researchers of the institute.
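How workflows that link lines can give rise to a queryable pedigree may be sketched with a small relational example. The table and column names below (`plant_line`, `line_link`, `workflow`) are hypothetical and are not taken from the Golm Plant Database System; the sketch uses an in-memory SQLite database purely for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical line/link schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE plant_line (
            id   INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE line_link (
            parent_id INTEGER NOT NULL REFERENCES plant_line(id),
            child_id  INTEGER NOT NULL REFERENCES plant_line(id),
            workflow  TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO plant_line (id, name) VALUES (?, ?)",
        [(1, "imported_line"), (2, "transformed_line"), (3, "progeny_line")],
    )
    # Each workflow that derives one line from another adds a pedigree edge.
    conn.executemany(
        "INSERT INTO line_link (parent_id, child_id, workflow) VALUES (?, ?, ?)",
        [(1, 2, "transformation"), (2, 3, "sexual reproduction")],
    )
    return conn


def ancestors(conn, line_name):
    """Return the names of all ancestor lines of the given line."""
    rows = conn.execute(
        """
        WITH RECURSIVE anc(id) AS (
            SELECT l.parent_id
            FROM line_link AS l
            JOIN plant_line AS p ON p.id = l.child_id
            WHERE p.name = ?
            UNION
            SELECT l.parent_id
            FROM line_link AS l
            JOIN anc ON l.child_id = anc.id
        )
        SELECT name FROM plant_line WHERE id IN (SELECT id FROM anc) ORDER BY id
        """,
        (line_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Storing one row per parent-child link keeps the pedigree open-ended: a recursive query can walk back through any number of workflow steps.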
The Golm Plant Database system, which is based on a relational database, collects the genetic and environmental information on plant material during its production or experimental use at the Max-Planck-Institute of Molecular Plant Physiology. It thus provides information according to the MIAME standard for the component 'Sample' in a highly standardised format. The Plant Database system thus facilitates collaborative work and allows efficient queries in data analysis for systems biology research.
It is commonly assumed that prediction of genome-wide breeding values in genomic selection is achieved by capitalizing on linkage disequilibrium between markers and QTL but also on genetic relationships. Here, we investigated the reliability of predicting genome-wide breeding values based on population-wide linkage disequilibrium information, based on identity-by-descent relationships within the known pedigree, and to what extent linkage disequilibrium information improves predictions based on identity-by-descent genomic relationship information.
The study was performed on milk, fat, and protein yield, using genotype data on 35 706 SNP and deregressed proofs of 1086 Italian Brown Swiss bulls. Genome-wide breeding values were predicted using a genomic identity-by-state relationship matrix and a genomic identity-by-descent relationship matrix (averaged over all marker loci). The identity-by-descent matrix was calculated by linkage analysis using one to five generations of pedigree data.
We showed that genome-wide breeding value prediction based only on identity-by-descent genomic relationships within the known pedigree was at least as reliable as prediction based on identity-by-state, which implicitly also accounts for genomic relationships that arose before the known pedigree. Furthermore, combining the two matrices did not improve the prediction compared to using identity-by-descent alone. Including different numbers of generations in the pedigree showed that most of the information in genome-wide breeding value prediction comes from animals with known common ancestors less than four generations back in the pedigree.
Our results show that, in pedigreed breeding populations, the accuracy of genome-wide breeding values obtained by identity-by-descent relationships was not improved by identity-by-state information. Although, in principle, genomic selection based on identity-by-state does not require pedigree data, it does use the available pedigree structure. Our findings may explain why the prediction equations derived for one breed may not predict accurate genome-wide breeding values when applied to other breeds, since family structures differ among breeds.
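To make the notion of an identity-by-state relationship concrete, the following minimal sketch computes one simple identity-by-state similarity: the mean proportion of alleles shared across loci. It assumes biallelic SNP genotypes coded as 0, 1 or 2 copies of a reference allele. The exact formula used to build the genomic relationship matrix in the study is not given here, so this measure and the function names are illustrative assumptions.

```python
def ibs_similarity(g1, g2):
    """Mean proportion of alleles shared identical-by-state between two animals.

    g1 and g2 are equal-length sequences of genotypes coded 0, 1 or 2.
    """
    if len(g1) != len(g2) or len(g1) == 0:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    # At each locus two genotypes share 2 - |difference| alleles (0, 1 or 2).
    shared = sum(2 - abs(a - b) for a, b in zip(g1, g2))
    return shared / (2 * len(g1))


def ibs_matrix(genotypes):
    """Symmetric matrix of pairwise identity-by-state similarities."""
    n = len(genotypes)
    return [
        [ibs_similarity(genotypes[i], genotypes[j]) for j in range(n)]
        for i in range(n)
    ]
```

Unlike an identity-by-descent matrix derived from a recorded pedigree, this measure needs only marker genotypes, which is why it also reflects relationships older than the known pedigree.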
To investigate the completeness of pedigrees and the number of pedigree analyses needed to obtain an acceptable family history in Korean women with ovarian cancer.
Fifty ovarian cancer patients were interviewed three times over 6 weeks to obtain their family history. Pedigree completeness was estimated in terms of family history of disease (cancer), health status (living in good health, disease, and death), and age at onset of disease and at death.
Pedigree completeness was 79.3%, 85.1%, and 85.6% at the first, second, and third interviews, and the time needed for pedigree analysis was 34.3, 10.8, and 3.1 minutes, respectively. The factors limiting pedigree analysis were as follows: loss of contact with relatives (38%), no living ancestors who knew the family history (34%), family members dispersed by the Korean War (16%), unknown cause of death (12%), reluctance to ask relatives about their medical history (10%), and concealment of the patient's own ovarian cancer (10%). The percentage of cancers identified in first-degree (2%) and second-degree (8%) relatives increased over the surveys, especially colorectal cancer related to Lynch syndrome (4%).
Based on this first study, analyzing the pedigree at least twice is acceptable in Korean women with ovarian cancer. Over the three surveys, pedigree completeness increased while the time needed to take the family history decreased.
Completeness; Family history; Lynch syndrome; Ovarian cancer; Pedigree
Pedigree reconstruction using genetic analysis provides a useful means to estimate fundamental population biology parameters relating to population demography, trait heritability and individual fitness when combined with other sources of data. However, there remain limitations to pedigree reconstruction in wild populations, particularly in systems where parent-offspring relationships cannot be directly observed, there is incomplete sampling of individuals, or molecular parentage inference relies on low quality DNA from archived material. While much can still be inferred from incomplete or sparse pedigrees, it is crucial to evaluate the quality and power of the available genetic information before testing specific biological hypotheses. Here, we used microsatellite markers to reconstruct a multi-generation pedigree of wild Atlantic salmon (Salmo salar L.) using archived scale samples collected with a total trapping system within a river over a 10-year period. Using a simulation-based approach, we determined the optimal microsatellite marker number for accurate parentage assignment, and evaluated the power of the resulting partial pedigree to investigate important evolutionary and quantitative genetic characteristics of salmon in the system.
We show that at least 20 microsatellites (average 12 alleles/locus) are required to maximise parentage assignment and to improve the power to estimate reproductive success and heritability in this study system. We also show that 1.5-fold differences can be detected between groups simulated to have differing reproductive success, and that it is possible to detect moderate heritability values for continuous traits (h² ~ 0.40) with more than 80% power when using 28 moderately to highly polymorphic markers.
The methodologies and work flow described provide a robust approach for evaluating archived samples for pedigree-based research, even where only a proportion of the total population is sampled. The results demonstrate the feasibility of pedigree-based studies to address challenging ecological and evolutionary questions in free-living populations, where genealogies can be traced only using molecular tools, and that significant increases in pedigree assignment power can be achieved by using higher numbers of markers.
Atlantic salmon; Heritability; Incomplete sampling; MasterBayes; Parentage assignment; Power analysis; Reproductive success
Dilated cardiomyopathy (DCM) is a heart muscle disease characterized by ventricular dilatation and impaired systolic function. Patients with DCM suffer from heart failure and arrhythmia and are at risk of premature death. DCM has a prevalence of one case per 2500 individuals with an incidence of 7/100,000/year (but may be underdiagnosed). In many cases the disease is inherited and is termed familial DCM (FDC). FDC may account for 20–48% of DCM. FDC is principally caused by mutations in FDC genes that encode cytoskeletal and sarcomeric proteins in the cardiac myocyte. Family history analysis is an important tool for identifying families affected by FDC. Standard criteria for evaluating FDC families have been published and the use of such criteria is increasing. Clinical genetic testing has been developed for some FDC genes and will be increasingly utilized for evaluating FDC families. Through the use of family screening by pedigree analysis and/or genetic testing, it is possible to identify patients at earlier, or even presymptomatic, stages of their disease. This presents an opportunity to invoke lifestyle changes and to provide pharmacological therapy earlier in the course of disease. Genetic counseling is used to identify additional asymptomatic family members who are at risk of developing symptoms, allowing for regular screening of these individuals. The management of FDC focuses on limiting the progression of heart failure and controlling arrhythmia, and is based on currently accepted treatment guidelines for DCM. It includes general measures (salt and fluid restriction, treatment of hypertension, limitation of alcohol intake, control of body weight, moderate exercise) and pharmacotherapy. Cardiac resynchronization, implantable cardioverter defibrillators and left ventricular assist devices are being used increasingly.
Patients with severe heart failure, severe reduction of the functional capacity and depressed left ventricular ejection fraction have a low survival rate and may require heart transplant.
To determine the genetic cause of Duane’s retraction syndrome (DRS) in two families segregating DRS as an autosomal dominant trait.
Members of two unrelated pedigrees were enrolled in an ongoing genetic study. Linkage analysis was performed using fluorescent microsatellite markers flanking the CHN1 locus. Probands and family members were screened for CHN1 mutations.
Of the six clinically affected individuals in the two pedigrees, three have bilateral and three have unilateral DRS. Both pedigrees are consistent with linkage to the DURS2 locus, one with complete and one with incomplete penetrance. Sequence analysis revealed the pedigrees segregate novel heterozygous missense CHN1 mutations, c.422C>T and c.754C>T, predicted to result in α2-chimaerin amino acid substitutions P141L and P252S, respectively.
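The correspondence between the coding-sequence positions and the amino acid changes reported above can be checked with simple codon arithmetic. The sketch below assumes the standard convention that coding DNA position 1 is the A of the ATG start codon, and uses the standard genetic code. Because the actual CHN1 codon sequences are not given in the text, it enumerates all four proline codons instead of assuming one; the function names are hypothetical.

```python
# Minimal codon table: only the codons needed for this illustration.
CODON_TO_AA = {
    "CCT": "P", "CCC": "P", "CCA": "P", "CCG": "P",  # proline
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",  # leucine
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S",  # serine
}
PROLINE_CODONS = ("CCT", "CCC", "CCA", "CCG")


def codon_position(cdna_pos):
    """Map a 1-based coding nucleotide to (codon number, position 1-3 in codon)."""
    codon = (cdna_pos - 1) // 3 + 1
    offset = (cdna_pos - 1) % 3 + 1
    return codon, offset


def c_to_t_outcomes(offset):
    """Amino acids produced by a C>T change at this position of any proline codon."""
    idx = offset - 1
    outcomes = set()
    for codon in PROLINE_CODONS:
        if codon[idx] != "C":
            continue  # no C at this position, so C>T cannot occur here
        mutated = codon[:idx] + "T" + codon[idx + 1:]
        outcomes.add(CODON_TO_AA[mutated])
    return outcomes


for variant_pos in (422, 754):
    codon, offset = codon_position(variant_pos)
    print(variant_pos, codon, offset, sorted(c_to_t_outcomes(offset)))
# prints:
# 422 141 2 ['L']
# 754 252 1 ['S']
```

Nucleotide 422 is the second base of codon 141 and nucleotide 754 is the first base of codon 252, so a C>T change in a proline codon yields leucine and serine respectively, in agreement with P141L and P252S.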
Genetic analysis of two pedigrees segregating nonsyndromic DRS reveals two novel mutations in CHN1, raising both the number of DRS pedigrees known to harbor CHN1 mutations and the number of unique CHN1 mutations from seven to nine. Both mutations identified in this study alter residues that participate in intramolecular interactions that stabilize the inactive, closed conformation of α2-chimaerin, and thus are predicted to result in its hyper-activation. Moreover, amino acid residue P252 was altered to a different residue in a previously reported DRS pedigree; thus, this is the first report of two CHN1 mutations altering the same residue, further supporting a gain-of-function etiology.
Members of families segregating DRS as an autosomal dominant trait should be screened for mutations in the CHN1 gene, enhancing genetic counseling and permitting earlier diagnosis.
We introduce PedWiz, a novel web-based tool that pipelines the informatics process for pedigree data. PedWiz is designed to assist researchers in the analysis of pedigree data. It provides a convenient tool for pedigree informatics: descriptive statistics, relative pairs, genetic similarity coefficients, the variance-covariance matrix for three estimated coefficients of allele identical-by-descent sharing as well as mean allele sharing, a plot of the pedigree structures, and a visualization of the identity coefficients. With renewed interest in linkage and other family-based methods, PedWiz will be a valuable tool for the analysis of family data.
pedigree; informatics; genetic similarity; identity-by-descent; relative pairs; family data
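As an illustration of the kind of genetic similarity coefficient that can be derived from a pedigree, the following sketch computes the kinship coefficient with the standard recursive definition. It is not PedWiz code, and the data layout (a dictionary mapping each individual to a father/mother pair) and function names are assumptions made for the example. Unknown parents are treated as unrelated founders.

```python
from functools import lru_cache


def make_kinship(pedigree):
    """Return a function phi(i, j) giving the kinship coefficient of i and j.

    pedigree maps each individual id to a (father, mother) tuple;
    an unknown parent is given as None.
    """

    @lru_cache(maxsize=None)
    def depth(i):
        # Founders have depth 0; otherwise one more than the deepest known parent.
        parents = [p for p in pedigree[i] if p is not None]
        if not parents:
            return 0
        return 1 + max(depth(p) for p in parents)

    @lru_cache(maxsize=None)
    def phi(i, j):
        if i is None or j is None:
            return 0.0  # unknown parents are assumed unrelated to everyone
        if i == j:
            father, mother = pedigree[i]
            return 0.5 * (1.0 + phi(father, mother))
        # Expand the deeper individual: it cannot be an ancestor of the other.
        if depth(i) < depth(j):
            i, j = j, i
        father, mother = pedigree[i]
        return 0.5 * (phi(father, j) + phi(mother, j))

    return phi


# Example: A and B are founders, C and D are full sibs, E is their child.
example = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),
}
phi = make_kinship(example)
print(phi("C", "D"))  # 0.25 for full sibs
print(phi("E", "E"))  # 0.625, i.e. inbreeding coefficient 0.25
```

Memoising the recursion keeps the computation tractable for pedigrees with many loops, since each pair of individuals is evaluated only once.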
Cancer risk prediction tools provide valuable information to clinicians but remain computationally challenging. Many clinics find that Ca Gene or Hughes Risk Apps fit their needs for easy- and ready-to-use software to obtain cancer risks; however, these resources may not fit all clinics’ needs. The Hughes Risk Apps Group and Bayes Mendel Lab therefore developed a web service, called “Risk Service”, which may be integrated into any client software to quickly obtain standardized and up-to-date risk predictions for Bayes Mendel tools (BRCAPRO, MMRpro, PancPRO, and MelaPRO), the Tyrer-Cuzick IBIS Breast Cancer Risk Evaluation Tool, and the Colorectal Cancer Risk Assessment Tool.
Software clients that can convert their local structured data into the HL7 XML-formatted family and clinical patient history (Pedigree model) may integrate with the Risk Service. The Risk Service uses Apache Tomcat and Apache Axis2 technologies to provide an all Java web service. The software client sends HL7 XML information containing anonymized family and clinical history to a Dana-Farber Cancer Institute (DFCI) server where it is parsed, interpreted, and processed by multiple risk tools. The Risk Service then formats the results into an HL7 style message and returns the risk predictions to the originating software client. Upon consent, users may allow DFCI to maintain the data for future research. The Risk Service implementation is exemplified through Hughes Risk Apps.
The Risk Service broadens the availability of valuable, up-to-date cancer risk tools and allows clinics and researchers to integrate risk prediction tools into their own software interface designed for their needs. Each software package can collect risk data using its own interface, and display the results using its own interface, while using a central, up-to-date risk calculator. This allows users to choose from multiple interfaces while always getting the latest risk calculations. Consenting users contribute their data for future research, thus building a rich multi-center resource.
Risk Prediction; BayesMendel; BRCAPRO; IBIS Breast Cancer Risk Evaluation Tool; Colorectal Cancer Risk Assessment Tool; Web Service; Risk Service