
Results 1-23 (23)

1.  Major depressive disorder subtypes to predict long-term course 
Depression and anxiety  2014;31(9):765-777.
Variation in course of major depressive disorder (MDD) is not strongly predicted by existing subtype distinctions. A new subtyping approach is considered here.
Two data mining techniques, ensemble recursive partitioning and Lasso generalized linear models (GLMs) followed by k-means cluster analysis, are used to search for subtypes based on index episode symptoms predicting subsequent MDD course in the World Mental Health (WMH) Surveys. The WMH surveys are community surveys in 16 countries. Lifetime DSM-IV MDD was reported by 8,261 respondents. Retrospectively reported outcomes included measures of persistence (number of years with an episode; number of years with an episode lasting most of the year) and severity (hospitalization for MDD; disability due to MDD).
Recursive partitioning found significant clusters defined by the conjunctions of early onset, suicidality, and anxiety (irritability, panic, nervousness-worry-anxiety) during the index episode. GLMs found additional associations involving a number of individual symptoms. Predicted values of the four outcomes were strongly correlated. Cluster analysis of these predicted values found three clusters having consistently high, intermediate, or low predicted scores across all outcomes. The high-risk cluster (30.0% of respondents) accounted for 52.9-69.7% of high persistence and severity and was most strongly predicted by index episode severe dysphoria, suicidality, anxiety, and early onset. A total symptom count, in comparison, was not a significant predictor.
Despite being based on retrospective reports, results suggest that useful MDD subtyping distinctions can be made using data mining methods. Further studies are needed to test and expand these results with prospective data.
PMCID: PMC5125445  PMID: 24425049
Epidemiology; Depression; Anxiety/Anxiety Disorders; Suicide/Self Harm; Panic Attacks
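The final step described in entry 1, clustering respondents on their predicted outcome scores, can be illustrated with a minimal k-means sketch. This is not the authors' code: the function name `kmeans`, the toy scores, and the hand-picked starting centroids are all assumptions made for illustration, and the four columns merely stand in for the four predicted outcomes.

```python
import math

def kmeans(points, centers, iterations=50):
    """Plain Lloyd's algorithm; `centers` are caller-supplied starting centroids."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the centroids are stable too
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical predicted scores for four outcomes (rows are respondents).
scores = [
    (0.1, 0.2, 0.1, 0.1), (0.2, 0.1, 0.2, 0.1),   # low predicted risk
    (0.5, 0.5, 0.4, 0.5), (0.6, 0.5, 0.5, 0.4),   # intermediate
    (0.9, 0.8, 0.9, 0.9), (0.8, 0.9, 0.8, 0.9),   # high
]
labels, centers = kmeans(scores, [scores[0], scores[2], scores[4]])
print(labels)
```

With well-separated toy data like this, the three clusters correspond to consistently low, intermediate, and high scores across all four columns, which mirrors the pattern the abstract reports. Real data would need a principled choice of starting centroids and of the number of clusters.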
2.  Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports 
Molecular psychiatry  2016;21(10):1366-1371.
Heterogeneity of major depressive disorder (MDD) illness course complicates clinical decision-making. While efforts to use symptom profiles or biomarkers to develop clinically useful prognostic subtypes have had limited success, a recent report showed that machine learning (ML) models developed from self-reports about incident episode characteristics and comorbidities among respondents with lifetime MDD in the World Health Organization World Mental Health (WMH) Surveys predicted MDD persistence, chronicity, and severity with good accuracy. We report results of model validation in an independent prospective national household sample of 1,056 respondents with lifetime MDD at baseline. The WMH ML models were applied to these baseline data to generate predicted outcome scores that were compared to observed scores assessed 10–12 years after baseline. ML model prediction accuracy was also compared to that of conventional logistic regression models. Area under the receiver operating characteristic curve (AUC) based on ML (.63 for high chronicity and .71–.76 for the other prospective outcomes) was consistently higher than for the logistic models (.62–.70) despite the latter models including more predictors. 34.6–38.1% of respondents with subsequent high persistence-chronicity and 40.8–55.8% with the severity indicators were in the top 20% of the baseline ML predicted risk distribution, while only 0.9% of respondents with subsequent hospitalizations and 1.5% with suicide attempts were in the lowest 20% of the ML predicted risk distribution. These results confirm that clinically useful MDD risk stratification models can be generated from baseline patient self-reports and that ML methods improve on conventional methods in developing such models.
PMCID: PMC4935654  PMID: 26728563
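The AUC figures quoted in entry 2 measure how often a model ranks a respondent with the outcome above one without it. A minimal sketch of that calculation follows, assuming binary outcomes coded 0/1; the function name `auc` and the toy data are hypothetical and are not taken from the study.

```python
def auc(scores, outcomes):
    """Probability that a random case outranks a random non-case (ties count 0.5)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    # Compare every case with every non-case and count the pairs ranked correctly.
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))

# Hypothetical predicted risks and observed outcomes for five respondents.
predicted = [0.9, 0.8, 0.4, 0.3, 0.2]
observed = [1, 0, 1, 0, 0]
print(round(auc(predicted, observed), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The pairwise form shown here is quadratic in sample size, so a rank-based implementation would be preferable for large samples.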
3.  Sparse factors for the positive and negative syndrome scale: Which symptoms and stage of illness? 
Psychiatry research  2015;225(3):283-290.
The Positive and Negative Syndrome Scale (PANSS) is frequently described with five latent factors, yet published factor models consistently fail to replicate across samples and related disorders. We hypothesize that (1) a subset of the PANSS, instead of the entire PANSS scale, would produce the most replicable five-factor models across samples, and that (2) the PANSS factor structure may be different depending on the treatment phase, influenced by the responsiveness of the positive symptoms to treatment. Using exploratory factor analysis, confirmatory factor analysis and cross validation on baseline and post-treatment observations from 3647 schizophrenia patients, we show that five-factor models fit best across samples when substantial subsets of the PANSS items are removed. The optimal model at baseline (five factors) omits 12 items: Motor Retardation, Grandiosity, Somatic Concern, Lack of Judgment and Insight, Difficulty in Abstract Thinking, Mannerisms and Posturing, Disturbance of Volition, Preoccupation, Disorientation, Excitement, Guilt Feelings and Depression. The PANSS factor models fit differently before and after patients have been treated. Patients with larger treatment response in positive symptoms have larger variations in factor structure across treatment stage than the less responsive patients. Negative symptom scores better predict the positive symptom scores after treatment than before treatment. We conclude that sparse factor models replicate better on new samples, and the underlying disease structure of schizophrenia changes upon treatment.
PMCID: PMC4346367  PMID: 25613662
PANSS; Confirmatory factor analysis; Exploratory factor analysis; Schizophrenia; RDoC; Dimensional Measures
4.  Causal inference methods to assess safety upper bounds in randomized trials with noncompliance 
Clinical Trials (London, England)  2015;12(3):265-275.
Premature discontinuation and other forms of noncompliance with treatment assignment can complicate causal inference of treatment effects in randomized trials. The intent-to-treat analysis gives unbiased estimates for causal effects of treatment assignment on outcome, but may understate potential benefit or harm of actual treatment. The corresponding upper confidence limit can also be underestimated.
The aim was to compare estimates of the hazard ratio and upper bound of the two-sided 95% confidence interval from causal inference methods that account for noncompliance with those from the intent-to-treat analysis.
We used simulations with parameters chosen to reflect cardiovascular safety trials of diabetes drugs, with a focus on upper bound estimates relative to 1.3, based on regulatory guidelines. A total of 1000 simulations were run under each parameter combination for a hypothetical trial of 10,000 total subjects randomly assigned to active treatment or control at 1:1 ratio. Noncompliance was considered in the form of treatment discontinuation and cross-over at specified proportions, with an assumed true hazard ratio of 0.9, 1, and 1.3, respectively. Various levels of risk associated with being a non-complier (independent of treatment status) were evaluated. Hazard ratio and upper bound estimates from causal survival analysis and intent-to-treat were obtained from each simulation and summarized under each parameter setting.
Causal analysis estimated the true hazard ratio with little bias in almost all settings examined. Intent-to-treat was unbiased only when the true hazard ratio = 1; otherwise it underestimated both benefit and harm. When upper bound estimates from intent-to-treat were ≥1.3, corresponding estimates from causal analysis were also ≥1.3 in almost 100% of the simulations, regardless of the true hazard ratio. When upper bound estimates from intent-to-treat were <1.3 and the true hazard ratio = 1, corresponding upper bound estimates from causal analysis were ≥1.3 in up to 66% of the simulations under some settings.
Simulations cannot cover all scenarios for noncompliance in real randomized trials.
Causal survival analysis was superior to intent-to-treat in estimating the true hazard ratio with respect to bias in the presence of noncompliance. However, its large variance should be considered for safety upper bound exclusion especially when the true hazard ratio = 1. Our simulations provided a broad reference for practical considerations of bias–variance trade-off in dealing with noncompliance in cardiovascular safety trials of diabetes drugs. Further research is warranted for the development and application of causal inference methods in the evaluation of safety upper bounds.
PMCID: PMC4420771  PMID: 25733675
Noncompliance; major adverse cardiovascular events; safety upper bound; causal survival analysis
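The finding in entry 4 that intent-to-treat underestimates both benefit and harm unless the true hazard ratio is 1 can be illustrated with a crude back-of-the-envelope calculation. The sketch below is not the authors' simulation design: it assumes that noncompliers switch at time zero, that hazards are constant, and that noncompliers carry no extra risk. The function name `itt_hazard_ratio` and its parameters are hypothetical.

```python
def itt_hazard_ratio(true_hr, discontinue, cross_over):
    """Approximate arm-level hazard ratio at time zero under immediate noncompliance.

    true_hr      hazard ratio of actual treatment versus control
    discontinue  fraction of the active arm that stops treatment (control hazard)
    cross_over   fraction of the control arm that starts treatment (treated hazard)
    """
    # Each arm's hazard is a mixture of treated and untreated participants,
    # with the control hazard normalised to 1.
    active_arm = (1 - discontinue) * true_hr + discontinue * 1.0
    control_arm = (1 - cross_over) * 1.0 + cross_over * true_hr
    return active_arm / control_arm

for hr in (0.9, 1.0, 1.3):
    print(hr, round(itt_hazard_ratio(hr, discontinue=0.2, cross_over=0.0), 3))
```

Under these assumptions the arm-level ratio is pulled toward 1 whenever the true hazard ratio differs from 1, and is unchanged when it equals 1. The survival-analysis setting in the paper is more involved, so this should be read only as intuition for the direction of the bias.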
5.  Statistical Epistasis and Progressive Brain Change in Schizophrenia: An Approach for Examining the Relationships Between Multiple Genes 
Molecular psychiatry  2011;17(11):1093-1102.
Although schizophrenia is generally considered to occur as a consequence of multiple genes that interact with one another, very few methods have been developed to model epistasis. Phenotype definition has also been a major challenge for research on the genetics of schizophrenia. In this report we use novel statistical techniques to address the high dimensionality of genomic data, and we apply a refinement in phenotype definition by basing it on the occurrence of brain changes during the early course of the illness, as measured by repeated MR scans (i.e., an “intermediate phenotype”). The method combines a machine learning algorithm, the ensemble method using stochastic gradient boosting, with traditional general linear model statistics. We began with fourteen genes that are relevant to schizophrenia based on association studies or their role in neurodevelopment and then used statistical techniques to reduce them to five genes and 17 SNPs that had a significant statistical interaction: 5 for PDE4B, 4 for RELN, 4 for ERBB4, 3 for DISC1, and one for NRG1. Five of the SNPs involved in these interactions replicate previous research, in that these five SNPs have previously been identified as schizophrenia vulnerability markers or implicate cognitive processes relevant to schizophrenia. This ability to replicate previous work suggests that our method has potential for detecting meaningful epistatic relationships among the genes that influence brain abnormalities in schizophrenia.
PMCID: PMC3235542  PMID: 21876540
6.  Detecting Rare Variant Associations: Methods for Testing Haplotypes and Multiallelic Genotypes 
Genetic Epidemiology  2011;35(Suppl 1):S85-S91.
We summarize the work done by the contributors to Group 13 at Genetic Analysis Workshop 17 (GAW17) and provide a synthesis of their data analyses. The Group 13 contributors used a variety of approaches to test associations of both rare variants and common single-nucleotide polymorphisms (SNPs) with the GAW17 simulated traits, implementing analytic methods that incorporate multiallelic genotypes and haplotypes. In addition to using a wide variety of statistical methods and approaches, the contributors exhibited a remarkable amount of flexibility and creativity in coding the variants and their genes and in evaluating their proposed approaches and methods. We describe and contrast their methods along three dimensions: (1) selection and coding of genetic entities for analysis, (2) method of analysis, and (3) evaluation of the results. The contributors consistently presented a strong rationale for using multiallelic analytic approaches. They indicated that power was likely to be increased by capturing the signals of multiple markers within genetic entities defined by sliding windows, haplotypes, genes, functional pathways, and the entire set of SNPs and rare variants taken in aggregate. Despite this variability, the methods were fairly consistent in their ability to identify two associated genes for each simulated trait. The first gene was selected for the largest number of causal alleles and the second for a high-frequency causal SNP. The presumed model of inheritance and choice of genetic entities are likely to have a strong effect on the outcomes of the analyses.
PMCID: PMC3274416  PMID: 22128065
rare variants; sequence data; multiallelic data; Bayesian regression; penalized regression; tree-based clustering; pathway analysis; haplotypes
7.  Rare variant collapsing in conjunction with mean log p-value and gradient boosting approaches applied to Genetic Analysis Workshop 17 data 
BMC Proceedings  2011;5(Suppl 9):S94.
In addition to methods that can identify common variants associated with susceptibility to common diseases, there has been increasing interest in approaches that can identify rare genetic variants. We use the simulated data provided to the participants of Genetic Analysis Workshop 17 (GAW17) to identify both rare and common single-nucleotide polymorphisms and pathways associated with disease status. We apply a rare variant collapsing approach and the usual association tests for common variants to identify candidates for further analysis using pathway-based and tree-based ensemble approaches. We use the mean log p-value approach to identify a top set of pathways and compare it to those used in simulation of the GAW17 dataset. We conclude that the mean log p-value approach is able to identify those pathways in the top list and also related pathways. We also use the stochastic gradient boosting approach for the selected subset of single-nucleotide polymorphisms. When we compare the result of this tree-based method with the list of single-nucleotide polymorphisms used in dataset simulation, we observe a number of false positives in addition to the correct SNPs.
PMCID: PMC3287936  PMID: 22373203
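Entry 7 ranks pathways with a mean log p-value approach. The abstract does not spell out the scoring details, so the sketch below is one plausible reading and an assumption throughout: each pathway is scored by the average log10 p-value of its member genes, and pathways are ordered from most to least negative. The function name `rank_pathways`, the pathway labels, and the p-values are all hypothetical.

```python
import math

def rank_pathways(pathways, p_values):
    """Order pathway names by mean log10 p-value, most negative (strongest) first."""
    scores = {}
    for name, members in pathways.items():
        # Skip members that have no association test result.
        logs = [math.log10(p_values[m]) for m in members if m in p_values]
        if logs:
            scores[name] = sum(logs) / len(logs)
    return sorted(scores, key=scores.get)

# Hypothetical gene-level p-values and pathway memberships.
gene_p = {"g1": 1e-4, "g2": 1e-2, "g3": 0.5, "g4": 0.1}
pathway_genes = {"pathway_A": ["g1", "g2"], "pathway_B": ["g3", "g4"]}
print(rank_pathways(pathway_genes, gene_p))
```

Averaging on the log scale lets a few strongly associated members lift a pathway without one extreme p-value dominating as it would in a product. How the study handled pathway size or overlapping membership is not stated in the abstract and is not modelled here.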
8.  Phenotype Definition and Development – Contributions from Group 7 
Genetic epidemiology  2009;33(Suppl 1):S40-S44.
The papers in Genetic Analysis Workshop 16 Group 7 covered a wide range of topics. The effects of confounder misclassification and selection bias on association results were examined by one group. Another focused on bias introduced by various methods of accounting for treatment effects. Two groups used related methods to derive phenotypic traits. They used different analytic strategies for genetic associations with non-overlapping results (but because they used different sets of single-nucleotide polymorphisms and significance criteria, this is not surprising). Another group relied on the well characterized definition of type 2 diabetes to show benefits of a novel predictive test. Transmission-ratio distortion was the focus of another paper. The results were extended to show a potential secondary benefit of the test to identify potentially mis-called single-nucleotide polymorphisms.
PMCID: PMC3033653  PMID: 19924715
Genetic Analysis Workshop 16; association analysis; confounder misclassification; selection bias; optimal robust ROC; structural equation modelling; treatment adjustment; empirically derived phenotypes; transmission disequilibrium; transmission distortion; candidate genes; genome-wide association
9.  Hypomethylation of MB-COMT promoter is a major risk factor for schizophrenia and bipolar disorder 
Human molecular genetics  2006;15(21):3132-3145.
The variability in phenotypic presentations and the lack of consistency of genetic associations in mental illnesses remain a major challenge in molecular psychiatry. Recently, it has become increasingly clear that altered promoter DNA methylation could play a critical role in mediating differential regulation of genes and in facilitating short-term adaptation in response to the environment. Here, we report the investigation of the differential activity of membrane-bound catechol-O-methyltransferase (MB-COMT) due to altered promoter methylation and the nature of the contribution of COMT Val158Met polymorphism as risk factors for schizophrenia and bipolar disorder by analyzing 115 post-mortem brain samples from the frontal lobe. These studies are the first to reveal that the MB-COMT promoter DNA is frequently hypomethylated in schizophrenia and bipolar disorder patients, compared with the controls (methylation rate: 26 and 29 versus 60%; P = 0.004 and 0.008, respectively), particularly in the left frontal lobes (methylation rate: 29 and 30 versus 81%; P = 0.003 and 0.002, respectively). Quantitative gene-expression analyses showed a corresponding increase in transcript levels of MB-COMT in schizophrenia and bipolar disorder patients compared with the controls (P = 0.02) with an accompanying inverse correlation between MB-COMT and DRD1 expression. Furthermore, there was a tendency for the enrichment of the Val allele of the COMT Val158Met polymorphism with MB-COMT hypomethylation in the patients. These findings suggest that MB-COMT over-expression due to promoter hypomethylation and/or hyperactive allele of COMT may increase dopamine degradation in the frontal lobe providing a molecular basis for the shared symptoms of schizophrenia and bipolar disorder.
PMCID: PMC2799943  PMID: 16984965
10.  Genome-wide association study for empirically derived metabolic phenotypes in the Framingham Heart Study offspring cohort 
BMC Proceedings  2009;3(Suppl 7):S53.
We used data reduction and clustering methods to identify five phenotypically homogeneous groups of study participants with similar profiles for cardiovascular disease risk factors. We constructed both qualitative (binary subgroup membership) and quantitative traits (probability of subgroup membership) for each individual. Cluster 1 comprised individuals who were generally healthy and had some history of smoking. Cluster 2 was dropped from the analyses due to the preponderance of missing data. Cluster 3, healthy non-smokers, was used as the control group. Members of Cluster 4 had features of the metabolic syndrome and were generally not as obese as Cluster 5. Obesity was the hallmark of Cluster 5, the members of which also had some features of the metabolic syndrome.
We then examined the genetic associations with both qualitative and quantitative representations of these empirically derived traits. Genetic analyses of the qualitative traits were conducted, comparing each of the affected groups with the unaffected cluster alone and, to increase statistical power, the unaffected group and healthy smokers combined. One single-nucleotide polymorphism on chromosome 4 met a conservative genome-wide significance level, but the effect was muted when we accounted for population stratification. The results for the quantitative traits were similar, with a small number of genome-wide significant findings muted by control for admixture. The directional findings will provide the basis for hypothesis generation for syndromes such as the metabolic syndrome and obesity.
PMCID: PMC2795953  PMID: 20018046
11.  Comparison of methods for correcting population stratification in a genome-wide association study of rheumatoid arthritis: principal-component analysis versus multidimensional scaling 
BMC Proceedings  2009;3(Suppl 7):S109.
Population stratification (PS) represents a major challenge in genome-wide association studies. Using the Genetic Analysis Workshop 16 Problem 1 data, which include samples of rheumatoid arthritis patients and healthy controls, we compared two methods that can be used to evaluate population structure and correct PS in genome-wide association studies: the principal-component analysis method and the multidimensional-scaling method. While both methods identified similar population structures in this dataset, principal-component analysis performed slightly better than the multidimensional-scaling method in correcting for PS in genome-wide association analysis of this dataset.
PMCID: PMC2795880  PMID: 20017973
13.  The Validity of Cocaine Dependence Subtypes 
Addictive behaviors  2007;33(1):41-53.
Cocaine dependence (CD) is a multifactorial disorder, variable in its manifestations, and heritable. We examined the concurrent validity of homogeneous subgroups of CD as phenotypes for genetic analysis. We applied data reduction methods and an empirical cluster-analytic approach to measures of cocaine use, cocaine-related effects, and cocaine treatment history in 1393 subjects, from 660 small nuclear families. Four of the six clusters that were derived yielded heritability estimates in excess of 0.3. Linkage analysis showed genomewide significant results for two of the clusters. Here we examine the concurrent validity of the six clusters using a variety of demographic and substance-related measures. In addition to being differentiated by a variety of cocaine-related measures, the clusters differed significantly on measures that were independent of those used to generate the clusters, i.e., demographic features and prevalence rates of co-morbid substance use and psychiatric disorders. These findings support the validity of the methods used to derive homogeneous subgroups of CD subjects and the resulting CD subtypes. Independent replication of these findings would provide further validation of this approach.
PMCID: PMC2111173  PMID: 17582692
Cocaine Dependence; Subtyping; Cluster Analysis; Heritability; Phenotype
14.  Serum Heat Shock Protein 70 Level as a Biomarker of Exceptional Longevity 
Mechanisms of ageing and development  2006;127(11):862-868.
Heat shock proteins are highly conserved proteins that, when produced intracellularly, protect stress-exposed cells. In contrast, extracellular Hsp70 has been shown to have both protective and deleterious effects. In this study, we assessed heat shock protein 70 (Hsp70) for its potential role in human longevity. Because of the importance of HSP to disease processes, cellular protection, and inflammation, we hypothesized that: (1) Hsp70 levels in centenarians and centenarian offspring are different from controls and (2) alleles in genes associated with Hsp70 explain these differences. In this cross-sectional study, we assessed serum Hsp70 levels from participants enrolled in either the New England Centenarian Study (NECS) or the Longevity Genes Project (LGP): 87 centenarians (from LGP), 93 centenarian offspring (from NECS), and 126 controls (43 from NECS, 83 from LGP). We also examined genotypic and allelic frequencies of polymorphisms in HSP70-A1A and HSP70-A1B in 347 centenarians (266 from the NECS, 81 from the LGP), 260 NECS centenarian offspring, and 238 controls (NECS: 53 spousal controls and 106 septuagenarian offspring controls; LGP: 79 spousal controls). The adjusted mean serum Hsp70 levels (ng/mL) for the NECS centenarian offspring, LGP centenarians, LGP spousal controls, and NECS controls were 1.05, 1.13, 3.05, and 6.93, respectively, suggesting that a low serum Hsp70 level is associated with longevity; however, no genetic associations were found with two SNPs within two hsp70 genes.
PMCID: PMC1781061  PMID: 17027907
ageing; centenarian; chaperokine; heat shock proteins; longevity
16.  Empirically derived subgroups in rheumatoid arthritis: association with single-nucleotide polymorphisms on chromosome 6 
BMC Proceedings  2007;1(Suppl 1):S20.
Rheumatoid arthritis (RA) is a disorder with important public health implications. It is possible that there are clinically distinctive subtypes of the disorder with different genetic etiologies. We used the data provided to the participants in the Genetic Analysis Workshop 15 to evaluate and describe clinically based subgroups and their genetic associations with single-nucleotide polymorphisms (SNPs) on chromosome 6, which harbors the HLA region. Detailed two- and three-SNP haplotype analyses were conducted in the HLA region. We used demographic, clinical self-report, and biomarker data from the entire sample (n = 8477) to identify and characterize the subgroups. We did not use the RA diagnosis itself in the identification of the subgroups. Nuclear families (715 families, 1998 individuals) were used to examine the genetic association with the HLA region. We found five distinct subgroups in the data. The first comprised unaffected family members. Cluster 2 was a mix of affected and unaffected in which patients endorsed symptoms not corroborated by physicians. Clusters 3 through 5 represented a severity continuum in RA. Cluster 5 was characterized by early onset severe disease. Cluster 2 showed no association on chromosome 6. Clusters 3 through 5 showed association with 17 SNPs on chromosome 6. In the HLA region, Cluster 3 showed single-, two-, and three-SNP association with the centromeric side of the region in an area of linkage disequilibrium. Cluster 5 showed both single- and two-SNP association with the telomeric side of the region in a second area of linkage disequilibrium. It will be important to replicate the subgroup structure and the association findings in an independent sample.
PMCID: PMC2367493  PMID: 18466517
17.  Multifactor-dimensionality reduction versus family-based association tests in detecting susceptibility loci in discordant sib-pair studies 
BMC Genetics  2005;6(Suppl 1):S146.
Complex diseases are generally thought to be under the influence of multiple, and possibly interacting, genes. Many association methods have been developed to identify susceptibility genes assuming a single-gene disease model, referred to as single-locus methods. Multilocus methods consider joint effects of multiple genes and environmental factors. One commonly used method for family-based association analysis is implemented in FBAT. The multifactor-dimensionality reduction method (MDR) is a multilocus method, which identifies multiple genetic loci associated with the occurrence of complex disease. Many studies of late onset complex diseases employ a discordant sib pairs design. We compared the FBAT and MDR in their ability to detect susceptibility loci using a discordant sib-pair dataset generated from the simulated data made available to participants in the Genetic Analysis Workshop 14. Using FBAT, we were able to identify the effect of one susceptibility locus. However, the finding was not statistically significant. We were not able to detect any of the interactions using this method. This is probably because the FBAT test is designed to find loci with major effects, not interactions. Using MDR, the best result we obtained identified two interactions. However, neither of these reached a level of statistical significance. This is mainly due to the heterogeneity of the disease trait and noise in the data.
PMCID: PMC1866789  PMID: 16451606
18.  Whole-genome variance components linkage analysis using single-nucleotide polymorphisms versus microsatellites on quantitative traits of derived phenotypes from factor analysis of electroencephalogram waves 
BMC Genetics  2005;6(Suppl 1):S15.
Alcohol dependence is a serious public health problem. We studied data from families participating in the Collaborative Study on the Genetics of Alcoholism (COGA) and made available to participants in the Genetic Analysis Workshop 14 (GAW14) in order to search for genes predisposing to alcohol dependence. Using factor analysis, we identified four factors (F1, F2, F3, F4) related to the electroencephalogram traits. We conducted variance components linkage analysis with each of the factors. Our results using the Affymetrix single-nucleotide polymorphism dataset showed significant evidence for a novel linkage of F3 (factor comprised of the three midline channel EEG measures from the target case of the Visual Oddball experiment ttdt2, 3, 4) to chromosome 18 (LOD = 3.45). This finding was confirmed by analyses of the microsatellite data (LOD = 2.73) and Illumina SNP data (LOD = 3.30). We also demonstrated that, in a sample like the COGA data, a dense single-nucleotide polymorphism map provides better linkage signals for quantitative traits than a low-resolution microsatellite map.
PMCID: PMC1866785  PMID: 16451610
19.  Genome-wide linkage analysis for alcohol dependence: a comparison between single-nucleotide polymorphism and microsatellite marker assays 
BMC Genetics  2005;6(Suppl 1):S8.
Both theoretical and applied studies have proven that single nucleotide polymorphism (SNP) markers are more powerful and cost-effective in linkage analysis than current microsatellite marker assays. Here we performed a whole-genome scan on 115 White, non-Hispanic families segregating for alcohol dependence, using one 10.3-cM microsatellite marker set and two SNP data sets (0.33-cM, 0.78-cM spacing). Two definitions of alcohol dependence (ALDX1 and ALDX2) were used. Our multipoint nonparametric linkage analysis found that alcoholism was nominally linked to 12 genomic regions. The linkage peaks obtained by using the microsatellite marker set and the two SNP sets had a high degree of correspondence in general, but the microsatellite marker set was insufficient to detect some nominal linkage peaks. The presence of linkage disequilibrium between markers did not significantly affect the results. Across the entire genome, the SNP datasets had a much higher average linkage information content (0.33 cM: 0.93, 0.78 cM: 0.91) than did the microsatellite marker set (0.57). The linkage peaks obtained through the two SNP datasets were very similar, with some minor differences. We conclude that genome-wide linkage analysis using approximately 5,000 SNP markers evenly distributed across the human genome is sufficient and might be more powerful than current 10-cM microsatellite marker assays.
PMCID: PMC1866701  PMID: 16451694
21.  Search for genetic factors predisposing to atherogenic dyslipidemia 
BMC Genetics  2003;4(Suppl 1):S100.
Atherogenic dyslipidemia (AD) is a common feature in persons with premature coronary heart disease. While several linkage studies have been carried out to dissect the genetic etiology of lipid levels, few have investigated the AD lipid triad comprising elevated serum triglyceride, small low density lipoprotein (LDL) particles, and reduced high density lipoprotein (HDL) cholesterol levels. Here we report the results of a whole-genome screen for AD using the Framingham Heart Study population.
Our analyses provide some evidence for linkage to AD on chromosomes 1q31, 3q29, 10q26, 14p12, 14q13, 16q24, 18p11, and 19q13.
AD susceptibility is modulated by multiple genes in different chromosomes. Our study confirms results from other populations and suggests new areas of potential importance.
PMCID: PMC1866438  PMID: 14975168
22.  Empirically derived phenotypic subgroups – qualitative and quantitative trait analyses 
BMC Genetics  2003;4(Suppl 1):S15.
The Framingham Heart Study has contributed a great deal to advances in medicine. Most of the phenotypes investigated have been univariate traits (quantitative or qualitative). The aims of this study are to derive multivariate traits by identifying homogeneous groups of people and assigning both qualitative and quantitative trait scores; to assess the heritability of the derived traits; and to conduct both qualitative and quantitative linkage analysis on one of the heritable traits.
Multiple correspondence analysis, a nonparametric analogue of principal components analysis, was used for data reduction. Two-stage clustering, using both k-means and agglomerative hierarchical clustering, was used to cluster individuals based upon axes (factor) scores obtained from the data reduction. Probability of cluster membership was calculated using binary logistic regression. Heritability was calculated using SOLAR, which was also used for the quantitative trait analysis. GENEHUNTER-PLUS was used for the qualitative trait analysis.
We found four phenotypically distinct groups. Membership in the smallest group was heritable (38%, p < 1 × 10⁻⁶) and had characteristics consistent with atherogenic dyslipidemia. We found both qualitative and quantitative LOD scores above 3 on chromosomes 11 and 14 (11q13, 14q23, 14q31). There were two Kong & Cox LOD scores above 1.0 on chromosome 6 (6p21) and chromosome 11 (11q23).
This approach may be useful for the identification of genetic heterogeneity in complex phenotypes by clarifying the phenotype definition prior to linkage analysis. Some of our findings are in regions linked to elements of atherogenic dyslipidemia and related diagnoses; others may be novel or may be false positives.
PMCID: PMC1866449  PMID: 14975083
23.  Genome-wide screen for heavy alcohol consumption 
BMC Genetics  2003;4(Suppl 1):S106.
To find specific genes predisposing to heavy alcohol consumption (self-reported consumption of 24 grams or more of alcohol per day among men and 12 grams or more among women), we studied 330 families collected by the Framingham Heart Study and made available to participants in the Genetic Analysis Workshop 13 (GAW13).
Parametric and nonparametric methods of linkage analysis were used. No significant evidence of linkage was found; however, weak signals were identified in several chromosomal regions, including 1p22, 4q12, 4q25, and 11q24, which are in the vicinity of those reported in other similar studies.
Our study did not reveal significant evidence of linkage to heavy alcohol use; however, we found weak confirmation of studies carried out in other populations.
PMCID: PMC1866444  PMID: 14975174
