Purpose. The radioligand [11C]KR31173 has been introduced for PET imaging of the angiotensin II subtype 1 receptor (AT1R). The purpose of the present project was to employ and validate a compartmental model for quantification of the kinetics of this radioligand in a porcine model of renal ischemia followed by reperfusion (IR). Procedures. Ten domestic pigs were included in the study: five controls and five experimental animals with IR of the left kidney. To achieve IR, acute ischemia was created with a balloon inserted into the left renal artery and inflated for 60 minutes. Reperfusion was achieved by deflation and removal of the balloon. Blood chemistries, urine specific gravity and pH values, and circulating hormones of the renin-angiotensin system were measured, and PET imaging was performed one week after IR. Cortical time-activity curves obtained from a 90 min [11C]KR31173 dynamic PET study were processed with a compartmental model that included two tissue compartments connected in parallel. Radioligand binding quantified by radioligand retention (80 min value to maximum value ratio) was compared to the binding parameters derived from the compartmental model. A binding ratio was calculated as DVR = DVS/DVNS, where DVS and DVNS represented the distribution volumes of specific binding and nonspecific binding. Receptor binding was also determined by autoradiography in vitro. Results. Correlations between rate constants and binding parameters derived by the convolution and deconvolution curve fittings were significant (r > 0.9). Also significant was the correlation between the retention parameter derived from the tissue activity curve (Yret) and the retention parameter derived from the impulse response function (fret). Furthermore, significant correlations were found between these two retention parameters and DVR.
Measurements with PET showed no significant changes in the radioligand binding parameters caused by IR, and these in vivo findings were confirmed by autoradiography performed in vitro. Conclusions. Correlations between various binding parameters support the concept of the parallel connectivity compartmental model. If an arterial input function cannot be obtained, simple radioligand retention may be adequate for estimation of in vivo radioligand binding.
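The two simple binding measures named in this abstract, the retention ratio (80 min value to maximum value) and DVR = DVS/DVNS, can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names, the nearest-sample rule for picking the 80 min value, and the example time-activity curve are all assumptions made for the sketch.

```python
def retention_ratio(times_min, activity, late_time_min=80.0):
    """Activity at the sample nearest late_time_min divided by the peak activity.

    Assumption: the late value is taken from the nearest acquired frame,
    with no interpolation between frames.
    """
    if len(times_min) != len(activity) or len(activity) == 0:
        raise ValueError("times and activity must be non-empty and equally long")
    peak = max(activity)
    if peak <= 0:
        raise ValueError("peak activity must be positive")
    nearest = min(range(len(times_min)),
                  key=lambda i: abs(times_min[i] - late_time_min))
    return activity[nearest] / peak


def distribution_volume_ratio(dv_specific, dv_nonspecific):
    """DVR = DVS / DVNS, as defined in the abstract."""
    if dv_nonspecific <= 0:
        raise ValueError("nonspecific distribution volume must be positive")
    return dv_specific / dv_nonspecific


# Hypothetical cortical time-activity curve (arbitrary units), not study data.
times = [1, 5, 10, 20, 40, 60, 80, 90]
tac = [2.0, 8.0, 10.0, 9.0, 7.5, 6.0, 5.0, 4.8]

print(retention_ratio(times, tac))            # 0.5
print(distribution_volume_ratio(3.0, 1.5))    # 2.0
```

The retention ratio needs only the tissue curve, which is why it remains available when no arterial input function can be obtained, whereas DVS and DVNS come from fitting the compartmental model.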
Identifying prognostic factors for osteosarcoma (OS) aids in the selection of patients who require more aggressive management. CD133 has been found to be a prognostic factor in certain tumor types. However, the association between CD133 expression and the prognosis of OS remains unknown. In this study, we analyzed the association of CD133 expression in OS with clinical factors and overall survival, and further investigated its potential role in metastasis in vitro. Using immunohistochemistry, we found CD133 expression in 65.7% (46/70) of OS samples. CD133 expression was positively correlated with lung metastasis (Chi-square test, P=0.002) and with shorter overall survival (Kaplan-Meier method with log-rank test, P<0.001). Multivariate analysis showed that CD133 expression was an independent prognostic factor in patients with OS. To test for direct participation of CD133, we separated CD133+ and CD133− cells in the MG63 cell line using magnetic-activated cell sorting and found that CD133+ cells were more active than CD133− cells in migration (scratch wound-healing assay) and invasion (Matrigel invasion assay). Elevated mRNA expression of the stemness genes octamer-binding transcription factor 4 (Oct-4) and NANOG, and of the metastasis-related receptor C-X-C chemokine receptor type 4 (CXCR4), was also found in CD133+ cells by reverse transcription-polymerase chain reaction. Thus, expression of CD133 was correlated with lung metastasis and poor prognosis in OS patients. CD133+ cells may be a type of cancer stem cell with high expression of self-renewal and metastasis-related genes.
osteosarcoma; CD133; prognosis; lung metastasis
We study the influence of the quintic nonlinearity on the dynamics of dark solitons. Starting from the cubic-quintic nonlinear Schrödinger equation with a quadratic phase chirp term and using a similarity transformation technique, we give the exact dark soliton solution and derive exact expressions for the soliton's width, amplitude, central position, and velocity, which describe the dynamic behavior of the soliton's evolution. For two different kinds of quadratic phase chirp, we analyze how different quintic nonlinear terms affect the dark soliton's dynamics. The results show the following two points as the quintic nonlinearity coefficient increases: (1) If the coefficient of the quadratic phase chirp term depends on the propagation distance, the solitary wave changes periodically and the soliton's width increases, while its amplitude and velocity decrease. (2) If the coefficient of the quadratic phase chirp term does not depend on the propagation distance, the wave function appears only in a fixed region; the soliton's width increases, while its amplitude and velocity decrease.
Quantum dots (QDs) have attracted increasing interest in bioimaging and sensing. Here, we report a biosensor of complex I using ubiquinone-terminated disulphides with different alkyl spacers (QnNS, n = 2, 5 and 10) as surface-capping ligands to functionalise CdSe/ZnS QDs. The fluorescence of the QD bioconjugates is enhanced or quenched as a function of the redox state of QnNS, since QDs are highly sensitive to electron-transfer processes. The emission of the bioconjugated QnNS-QDs could be modulated by complex I in the presence of NADH, which simulates an electron-transfer system that is part of the mitochondrial respiratory chain, providing an in vitro and intracellular complex I sensor. Epidemiological studies suggest that Parkinson's patients have impaired activity of complex I in the mitochondrial electron-transfer chain. We have demonstrated that the QnNS-QDs system could aid in early-stage Parkinson's disease diagnosis and progression monitoring by following different complex I levels in SH-SY5Y cells.
Microarray experiments typically analyze thousands to tens of thousands of genes from small numbers of biological replicates. The fact that genes are normally expressed in functionally relevant patterns suggests that gene-expression data can be stratified and clustered into relatively homogenous groups. Cluster-wise dimensionality reduction should make it feasible to improve screening power while minimizing information loss.
We propose a powerful and computationally simple method for finding differentially expressed genes in small microarray experiments. The method incorporates a novel stratification-based tight clustering algorithm, principal component analysis and information pooling. Comprehensive simulations show that our method is substantially more powerful than the popular SAM and eBayes approaches. We applied the method to three real microarray datasets: one from a Populus nitrogen stress experiment with 3 biological replicates; and two from public microarray datasets of human cancers with 10 to 40 biological replicates. In all three analyses, our method proved more robust than the popular alternatives for identification of differentially expressed genes.
The C++ code to implement the proposed method is available upon request for academic use.
It has been postulated that multiple-marker methods may have added ability, over single-marker methods, to detect genetic variants associated with disease. The Wellcome Trust Case Control Consortium (WTCCC) provided the first successful large genome-wide association studies (GWAS), which included single-marker association analyses for seven common complex diseases. Of the signals detected, only one was associated with coronary artery disease (CAD), and none were identified for hypertension (HTN). Our objective was to find additional genetic associations and pathways for cardiovascular disease by examining the WTCCC data for variants associated with CAD and HTN using two-marker testing methods. We applied two-marker association testing to the WTCCC dataset, which includes ~2,000 affected individuals with each disorder and a shared pool of ~3,000 controls, all genotyped using Affymetrix GeneChip 500 K arrays. For CAD, we detected single nucleotide polymorphism (SNP) pairs in three genes showing genome-wide significance: HFE2, STK32B, and DIPC2. The most notable SNP pairs in a non-protein-coding region were at 9p21, a known major CAD-associated region. For HTN, we detected SNP pairs in five genes: GPR39, XRCC4, MYO6, ZFAT, and MACROD2. Four further associated SNP pair regions were at least 70 kb from any known gene. We have shown that novel multiple-marker statistical methods can be of use in finding variants in GWAS. We describe many new associated variants for both CAD and HTN and describe their known genetic mechanisms.
Interactions among genomic loci (also known as epistasis) have been suggested as one of the potential sources of missing heritability in single locus analysis of genome-wide association studies (GWAS). The computational burden of searching for interactions is compounded by the extremely low threshold for identifying significant p-values due to multiple hypothesis testing corrections. Utilizing prior biological knowledge to restrict the set of candidate SNP pairs to be tested can alleviate this problem, but systematic studies that investigate the relative merits of integrating different biological frameworks and GWAS data have not been conducted.
We developed four biologically based frameworks to identify pairwise interactions among candidate SNP pairs as follows: (1) for each human protein-coding gene, a set of SNPs associated with that gene was constructed providing a gene-based interaction model, (2) for each known biological pathway, a set of SNPs associated with the genes in the pathway was constructed providing a pathway-based interaction model, (3) a set of SNPs associated with genes in a disease-related subnetwork provides a network-based interaction model, and (4) a framework is based on the function of SNPs. The last approach uses expression SNPs (eSNPs or eQTLs), which are SNPs or loci that have defined effects on the abundance of transcripts of other genes. We constructed pairs of eSNPs and SNPs located in the target genes whose expression is regulated by eSNPs. For all four frameworks the SNP sets were exhaustively tested for pairwise interactions within the sets using a traditional logistic regression model after excluding genes that were previously identified to associate with the trait. Using previously published GWAS data for type 2 diabetes (T2D) and the biologically based pair-wise interaction modeling, we identify twelve genes not seen in the previous single locus analysis.
We present four approaches to detect interactions associated with complex diseases. The results show our approaches outperform the traditional single locus approaches in detecting genes that previously did not reach significance; the results also provide novel drug targets and biomarkers relevant to the underlying mechanisms of disease.
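The motivation for restricting tests to biologically defined SNP sets can be illustrated with a short sketch: enumerate only the pairs that fall within the same set, then compare the resulting Bonferroni threshold with the one an exhaustive pairwise scan would need. The gene and SNP identifiers below are hypothetical placeholders, and Bonferroni correction is an assumption used for illustration; the abstract does not state which multiple-testing correction was applied.

```python
from itertools import combinations


def within_set_pairs(snp_sets):
    """Return the unique unordered SNP pairs that share at least one set."""
    pairs = set()
    for snps in snp_sets.values():
        # Sorting gives every pair a canonical order, so a pair found in two
        # different sets is stored only once.
        for a, b in combinations(sorted(set(snps)), 2):
            pairs.add((a, b))
    return pairs


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold for n_tests hypotheses."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical gene-based sets (identifiers are placeholders, not study data).
gene_sets = {
    "GENE_A": ["rs1", "rs2", "rs3"],
    "GENE_B": ["rs3", "rs4"],
}

candidate_pairs = within_set_pairs(gene_sets)
n_snps = len(set().union(*gene_sets.values()))
n_exhaustive = n_snps * (n_snps - 1) // 2

print(len(candidate_pairs), n_exhaustive)        # 4 6
print(bonferroni_threshold(len(candidate_pairs)))  # 0.0125
```

With genome-scale SNP counts the gap between the two numbers of tests grows quadratically, which is what makes the per-test threshold so much less severe for the restricted search.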
Stroke is the second most common cause of mortality and the leading cause of neurological disability, cognitive impairment and dementia worldwide. Nimodipine is a dihydropyridinic calcium antagonist with a role in neuroprotection, making it a promising therapy for vascular cognitive impairment and dementia.
The NICE study is a multicenter, randomized, double-blind, placebo-controlled study being carried out in 23 centers in China. The study population includes patients aged 30–80 who have suffered an ischemic stroke (≤7 days). Participants are randomly allocated to nimodipine (90 mg/d) or placebo (90 mg/d). The primary efficacy outcome is the level of mild cognitive impairment after 6 months of treatment with nimodipine or placebo following an ischemic stroke. Safety is being assessed by observing side effects of nimodipine. Assuming a relative risk reduction of 22%, at least 656 patients are required in this study to obtain statistical power of 90%. The first patient was recruited in November 2010.
Previous studies suggested that nimodipine could improve cognitive function in vascular dementia and Alzheimer’s disease dementia. It is unclear at which time point intervention with nimodipine should occur. Therefore, the NICE study is designed to evaluate the benefits and safety of nimodipine, administered within seven days of the stroke, in preventing or treating mild cognitive impairment following ischemic stroke.
It is generally known that risk variants segregate together with a disease within families, but this information has not been used in the existing statistical methods for detecting rare variants. Here we introduce two weighted sum statistics that can be applied to either genome-wide association data or resequencing data for identifying rare disease variants, with weights calculated based on sibpairs and odds ratios, respectively. We evaluated the two methods via extensive simulations under different disease models. We compared the proposed methods with the weighted sum statistic (WSS) proposed by Madsen and Browning, keeping the same genotyping or resequencing cost. Our methods clearly demonstrate more statistical power than the WSS. In addition, we found that using sibpair information can increase power over using only unrelated samples by more than 40%. We applied our methods to the Framingham Heart Study (FHS) and Wellcome Trust Case Control Consortium (WTCCC) hypertension datasets. Although we did not identify any genes reaching a genome-wide significance level, we found variants in the candidate gene angiotensinogen (AGT) significantly associated with hypertension at P=6.9×10⁻⁴, whereas the most significant single-SNP association evidence is P=0.063. We further applied the odds ratio weighted method to the IFIH1 gene for type 1 diabetes in the WTCCC data. Our method yielded a P value of 4.82×10⁻⁴, much more significant than that obtained by haplotype-based methods. We demonstrated that family data are extremely informative in searching for rare variants underlying complex traits, and that the odds ratio weighted sum statistic is more efficient than currently existing methods.
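For orientation, the baseline that this abstract compares against, the Madsen and Browning weighted sum statistic, can be sketched as follows. This is a minimal illustration of the published WSS construction as commonly described (variant weights estimated from unaffected individuals, a per-individual genetic score, and the rank sum of the affected individuals); it is not the sibpair- or odds-ratio-weighted method proposed here, whose weights the abstract does not specify. The function names and the toy genotypes are assumptions, and the permutation step that turns the rank sum into a P value is omitted.

```python
import math


def wss_weights(genotypes, is_case):
    """Per-variant weights w_i = sqrt(n * q_i * (1 - q_i)).

    q_i = (m_i + 1) / (2 * n_u + 2), where m_i is the number of mutant alleles
    at variant i among the n_u unaffected individuals and n is the total
    number of individuals.
    """
    n_total = len(genotypes)
    controls = [g for g, case in zip(genotypes, is_case) if not case]
    n_ctrl = len(controls)
    weights = []
    for i in range(len(genotypes[0])):
        m_ctrl = sum(g[i] for g in controls)
        q = (m_ctrl + 1) / (2 * n_ctrl + 2)
        weights.append(math.sqrt(n_total * q * (1 - q)))
    return weights


def genetic_scores(genotypes, weights):
    """Score per individual: mutant-allele counts down-weighted by w_i."""
    return [sum(a / w for a, w in zip(g, weights)) for g in genotypes]


def case_rank_sum(scores, is_case):
    """Sum of the ranks of affected individuals (average ranks for ties)."""
    order = sorted(range(len(scores)), key=lambda j: scores[j])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return sum(r for r, case in zip(ranks, is_case) if case)


# Toy data: rows are individuals, columns are rare variants (0/1/2 mutant alleles).
genotypes = [[1, 0], [0, 1], [0, 0], [0, 0]]
is_case = [True, True, False, False]

weights = wss_weights(genotypes, is_case)
scores = genetic_scores(genotypes, weights)
print(case_rank_sum(scores, is_case))  # 7.0
```

In a full analysis the observed rank sum would be compared with its distribution under permutation of the case and control labels.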
To detect rare variants associated with a phenotype, we develop a novel statistical method that can use both family and unrelated case-control data. Unlike the currently existing methods, we first use family data to calculate weights to be given to rare variants, differentiating between concordantly affected and discordant sib pairs. These weights are then used in an association test applied to the unrelated case-control data. We applied the proposed method to the simulated sequencing data in Genetic Analysis Workshop 17 and identified two genes associated with the disease.
The field of pancreatic stem and progenitor cell biology has been hampered by a lack of in vitro functional and quantitative assays that allow for the analysis of the single cell. Analyses of single progenitors are of critical importance because they provide definitive ways to unequivocally demonstrate the lineage potential of individual progenitors. Although methods have been devised to generate "pancreatospheres" in suspension culture from single cells, several limitations exist. First, it is time-consuming to perform single cell deposition for a large number of cells, which in turn commands large volumes of culture media and space. Second, numeration of the resulting pancreatospheres is labor-intensive, especially when the frequency of the pancreatosphere-initiating progenitors is low. Third, the pancreatosphere assay is not an efficient method to allow both the proliferation and differentiation of pancreatic progenitors in the same culture well, restricting the usefulness of the assay.
To overcome these limitations, a semi-solid media based colony assay for pancreatic progenitors has been developed and is presented in this report. This method takes advantage of an existing concept from the hematopoietic colony assay, in which methylcellulose is used to provide viscosity to the media, allowing the progenitor cells to stay in three-dimensional space as they undergo proliferation as well as differentiation. To enrich insulin-expressing colony-forming progenitors from a heterogeneous population, we utilized cells that express neurogenin (Ngn) 3, a pancreatic endocrine progenitor cell marker. Murine embryonic stem (ES) cell-derived Ngn3-expressing cells tagged with the enhanced green fluorescent protein reporter were sorted, and as many as 25,000 cells per well were plated into low-attachment 24-well culture dishes. Each well contained 500 μL of semi-solid media with the following major components: methylcellulose, Matrigel, nicotinamide, exendin-4, activin βB, and conditioned media collected from murine ES cell-derived pancreatic-like cells. After 8 to 12 days of culture, insulin-expressing colonies with distinctive morphology were formed and could be further analyzed for pancreatic gene expression using quantitative RT-PCR and immunofluorescent staining to determine the lineage composition of each colony.
In summary, our colony assay allows easy detection and quantification of functional progenitors within a heterogeneous population of cells. In addition, the semi-solid media format allows uniform presentation of extracellular matrix components and growth factors to cells, enabling progenitors to proliferate and differentiate in vitro. This colony assay provides unique opportunities for mechanistic studies of pancreatic progenitor cells at the single cell level.
Although they have demonstrated success in searching for common variants for complex diseases, genome-wide association (GWA) studies are less successful in detecting rare genetic variants because of the poor statistical power of most current methods. We developed a two-stage method that can be applied to GWA studies for detecting rare variants. Here we report the results of applying this two-stage method to the Wellcome Trust Case Control Consortium (WTCCC) dataset, which includes 7 complex diseases: bipolar disorder, cardiovascular disease, hypertension, rheumatoid arthritis, Crohn’s disease, type 1 diabetes and type 2 diabetes. We identified 24 genes or regions that reach genome-wide significance. Eight of them are novel and were not reported in the WTCCC study. The cumulative risk (or protective) haplotype frequency for each of the 8 genes or regions is small, being at most 11%. For each of the novel genes, the risk (or protective) haplotype set cannot be tagged by the common SNPs available on chips (r²<0.32). The gene identified in hypertension was further replicated in the Framingham Heart Study (FHS), and is also significantly associated with type 2 diabetes. Our analysis suggests that searching for rare genetic variants is feasible in current genome-wide association studies and candidate gene studies, and the results can serve as guides to future resequencing studies to identify the underlying rare functional variants.
The aim of this study was to investigate the clinical heterogeneity of Parkinson’s disease (PD) among a cohort of Chinese patients in early stages. Clinical data on demographics, motor variables, motor phenotypes, disease progression, global cognitive function, depression, apathy, sleep quality, constipation, fatigue, and L-dopa complications were collected from 138 Chinese PD subjects in early stages (Hoehn and Yahr stages 1–3). The PD subjects were classified into subtypes using k-means cluster analysis of the clinical data, with five-, four-, and three-cluster solutions computed consecutively. Kappa statistical analysis was performed to evaluate the consistency among the different subtype solutions. The cluster analysis indicated four main subtypes: the non-tremor dominant subtype (NTD, n=28, 20.3%), rapid disease progression subtype (RDP, n=7, 5.1%), young-onset subtype (YO, n=50, 36.2%), and tremor dominant subtype (TD, n=53, 38.4%). Overall, 78.3% (108/138) of subjects were always classified into the same three groups (52 always in TD, 7 in RDP, and 49 in NTD), and 98.6% (136/138) were classified consistently between the five- and four-cluster solutions. However, subjects classified as NTD in the four-cluster analysis were dispersed into different subtypes in the three-cluster analysis, with low concordance between the four- and three-cluster solutions (kappa value=−0.139, P=0.001). This study defines the clinical heterogeneity of PD patients in early stages using a data-driven approach. The subtypes generated by the four-cluster solution appear to exhibit ideal internal cohesion and external isolation.
Parkinson’s disease; Heterogeneity; Subtype; Cluster analysis
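The concordance check between two cluster solutions can be sketched with Cohen's kappa. This is an illustrative assumption: the abstract says only that "kappa statistical analysis" was used, so the choice of Cohen's kappa, the function name, and the toy labels are not taken from the study. The sketch also assumes that cluster labels have already been matched between the two solutions, since raw k-means labels are arbitrary.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("labelings must be non-empty and equally long")
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Chance agreement from the marginal label frequencies of each labeling.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Toy subtype assignments for four subjects under two cluster solutions.
solution_a = ["TD", "TD", "NTD", "NTD"]
solution_b = ["TD", "TD", "NTD", "TD"]

print(cohen_kappa(solution_a, solution_b))  # 0.5
```

A kappa near 1 indicates that subjects keep their subtype across solutions, a value near 0 indicates agreement no better than chance, and a negative value indicates agreement below chance.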
We have demonstrated that growth differentiation factor 9 (GDF9) enhances activin A-induced inhibin βB-subunit mRNA levels in human granulosa-lutein (hGL) cells by regulating receptors and key intracellular components of the activin signaling pathway. However, we could not exclude its effects on follistatin (FST) and follistatin-like 3 (FSTL3), well recognized extracellular inhibitors of activin A.
hGL cells from women undergoing in vitro fertilization (IVF) treatment were cultured with and without siRNA transfection of FST, FSTL3 or GDF9 and then treated with GDF9, activin A, FST, FSTL3 or combinations. FST, FSTL3 and inhibin βB-subunit mRNA, and FST, FSTL3 and inhibin B protein levels were assessed with real-time RT-PCR and ELISA, respectively. Data were log transformed before ANOVA followed by Tukey's test.
GDF9 suppressed basal FST and FSTL3 mRNA and protein levels in a time- and dose-dependent manner and inhibited activin A-induced FST and FSTL3 mRNA and protein expression, effects attenuated by BMPR2 extracellular domain (BMPR2 ECD), a GDF9 antagonist. After GDF9 siRNA transfection, basal and activin A-induced FST and FSTL3 mRNA and protein levels increased, but changes were reversed by adding GDF9. Reduced endogenous FST or FSTL3 expression with corresponding siRNA transfection augmented activin A-induced inhibin βB-subunit mRNA levels as well as inhibin B levels (P values all <0.05). Furthermore, the enhancing effects of GDF9 in activin A-induced inhibin βB-subunit mRNA and inhibin B production were attenuated by adding FST.
GDF9 decreases basal and activin A-induced FST and FSTL3 expression, and this explains, in part, its enhancing effects on activin A-induced inhibin βB-subunit mRNA expression and inhibin B production in hGL cells.
Embryonic stem (ES) cell technology may serve as a platform for the discovery of drugs to treat diseases such as diabetes. However, because of difficulties in establishing reliable ES cell differentiation methods and in creating cost-effective plating conditions for the high-throughput format, screening for molecules that regulate pancreatic beta cells and their immediate progenitors has been limited. A relatively simple and inexpensive differentiation protocol that allows efficient generation of insulin-expressing cells from murine ES cells was previously established in our laboratories. In this report, this system is characterized in greater detail to map developmental cell stages for future screening experiments. Our results show that sequential activation of multiple gene markers for undifferentiated ES cells, epiblast, definitive endoderm, foregut, and pancreatic lineages was found to follow the sequence of events that mimics pancreatic ontogeny. Cells that expressed enhanced green fluorescent protein, driven by pancreatic and duodenal homeobox 1 or insulin 1 promoter, correctly expressed known beta cell lineage markers. Overexpression of Sox17, an endoderm fate-determining transcription factor, at a very early stage of differentiation (days 2–3) enhanced pancreatic gene expression. Overexpression of neurogenin3, an endocrine progenitor cell marker, induced glucagon expression at stages when pancreatic and duodenal homeobox 1 message was present (days 10–16). Forced expression (between days 16 and 25) of MafA, a pancreatic maturation factor, resulted in enhanced expression of insulin genes, glucose transporter 2 and glucokinase, and glucose-responsive insulin secretion. Day 20 cells implanted in vivo resulted in pancreatic-like cells. Together, our differentiation assay recapitulates the proceedings and behaviors of pancreatic development and will be valuable for future screening of beta cell effectors.
Genome-wide association (GWA) studies have identified common variants that are associated with a variety of traits and diseases, but most studies have been performed in European-derived populations. Here, we describe the first genome-wide analyses of imputed genotype and copy number variants (CNVs) for anthropometric measures in African-derived populations: 1188 Nigerians from Igbo-Ora and Ibadan, Nigeria, and 743 African-Americans from Maywood, IL. To improve the reach of our study, we used imputation to estimate genotypes at ∼2.1 million single-nucleotide polymorphisms (SNPs) and also tested CNVs for association. No SNPs or common CNVs reached a genome-wide significance level for association with height or body mass index (BMI), and the best signals from a meta-analysis of the two cohorts did not replicate in ∼3700 African-Americans and Jamaicans. However, several loci previously confirmed in European populations showed evidence of replication in our GWA panel of African-derived populations, including variants near IHH and DLEU7 for height and MC4R for BMI. Analysis of global burden of rare CNVs suggested that lean individuals possess greater total burden of CNVs, but this finding was not supported in an independent European population. Our results suggest that there are not multiple loci with strong effects on anthropometric traits in African-derived populations and that sample sizes comparable to those needed in European GWA studies will be required to identify replicable associations. Meta-analysis of this data set with additional studies in African-ancestry populations will be helpful to improve power to detect novel associations.
To determine the genetic variation of MHC class IIB exon 2 alleles in offspring, 700 fry from seven families of Japanese flounder were challenged with V. anguillarum, and mortality rates differed among the families. Five to ten surviving and dead fry from each of the seven families were selected to study the MHC class IIB exon 2 gene by PCR and direct sequencing. One hundred and sixteen different exon 2 sequences were found, corresponding to 116 different alleles, and a minimum of four loci were revealed in the MHC class IIB exon 2 gene. The ratio (dN/dS) of nonsynonymous substitutions (dN) to synonymous substitutions (dS) in the peptide-binding region (PBR) of the MHC class IIB gene was 6.234, which indicated that balancing selection is acting on the MHC class IIB genes. The MHC IIB alleles were thus being passed on to their progeny. Some alleles were significantly more frequent in surviving than in dead individuals. Altogether, our data suggest that the alleles Paol-DAB*4301, Paol-DAB*4601, Paol-DAB*4302, Paol-DAB*3803, and Paol-DAB*4101 were associated with resistance to V. anguillarum in flounder.
Recently, Steen et al. proposed a novel two-stage approach for family-based genome-wide association studies. In the first stage, a test based on between-family information is used to rank SNPs according to their P-values or the conditional power of the test. In the second stage, the R most promising SNPs are tested using a family-based association test. We call this two-stage approach the top R method. Ionita-Laza et al. proposed an exponential weighting method within a two-stage framework. In the second stage of this approach, instead of testing the top R SNPs, it tests all SNPs and weights the P-values of the association test according to the information from the first stage. However, both the top R and exponential weighting methods use the information from the first stage only to rank SNPs. It seems that the two methods do not use information from the first stage efficiently. Furthermore, it may be unreasonable for the exponential weighting method to use the same weight for all SNPs within a group when only one or a few SNPs are related to a disease. In this article, we propose a data-driven weighting scheme within a two-stage framework. In this method, we use the information from the first stage to determine a SNP-specific weight for each SNP. We use simulation studies to evaluate the performance of our method. The simulation results showed that our proposed method is consistently more powerful than the top R method and the exponential weighting method, regardless of the LD structure, population structure, and family structure.
two-stage; data-driven weighting; linkage disequilibrium; population stratification
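The general idea of giving each SNP its own weight in the second stage can be illustrated with a weighted Bonferroni procedure. This is a generic, well-known technique swapped in for illustration, not the data-driven scheme proposed in the article, whose weights the abstract does not define. The function name and the example P-values and weights are hypothetical. The procedure controls the family-wise error rate only if the weights are chosen independently of the second-stage test statistics, which is the role the first-stage information is meant to play.

```python
def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha * w_i / m, with mean weight 1.

    Weights are rescaled to average 1 so the thresholds still sum to alpha.
    """
    m = len(p_values)
    if m == 0 or m != len(weights):
        raise ValueError("p_values and weights must be non-empty and equally long")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    normalized = [w * m / total for w in weights]
    return [p <= alpha * w / m for p, w in zip(p_values, normalized)]


# Hypothetical second-stage P-values and first-stage weights for four SNPs.
p_values = [0.02, 0.02, 0.5, 0.5]
first_stage_weights = [3.0, 1.0, 0.0, 0.0]

print(weighted_bonferroni(p_values, first_stage_weights))
# [True, False, False, False]
print(weighted_bonferroni(p_values, [1.0, 1.0, 1.0, 1.0]))
# [False, False, False, False]
```

The first SNP is rejected only because the first stage up-weighted it. With equal weights the procedure reduces to ordinary Bonferroni, and a top R style selection corresponds to putting equal weight on R SNPs and zero on the rest.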
Large genome-wide association studies have been performed to detect common genetic variants involved in common diseases, but most of the variants found this way account for only a small portion of the trait variance. Furthermore, candidate gene based resequencing suggests that many rare genetic variants contribute to the trait variance of common diseases. Here we propose two designs, sibpair and unrelated-case designs, to detect rare genetic variants in either a candidate gene based or genome-wide association analysis. First we show that we can detect and classify together rare risk haplotypes using a relatively small sample with either of these designs, and then have increased power to test association in a larger case-control sample. This method can also be applied to resequencing data. Next we apply the method to the Wellcome Trust Case Control Consortium (WTCCC) coronary artery disease and hypertension data, the latter being the only trait for which no genome-wide association evidence was reported in the original WTCCC study, and identify one interesting gene associated with hypertension and four associated with coronary artery disease at a genome-wide significance level of 5%. These results suggest that searching for rare genetic variants is feasible and can be fruitful in current genome-wide association studies, candidate gene studies or resequencing studies.
With the recent rapid improvements in high-throughput genotyping techniques, researchers face the challenging task of analyzing large-scale genetic associations, especially at the whole-genome level, for which no optimal solution exists. In this study, we propose a new approach for genetic association analysis that is based on a variable-sized sliding-window framework and employs principal component analysis to find the optimum window size. With the help of the bisection algorithm in window-size searching, our method is more computationally efficient than available approaches. We evaluate the performance of the proposed method by comparing it with two other methods: a single-marker method and a variable-length Markov chain method. We demonstrate that, in most cases, the proposed method outperforms the other two methods. Furthermore, since the proposed method is based on genotype data, it does not require any computationally intensive phasing program to account for uncertain haplotype phase.
Recently, Steen et al.1 proposed a novel two-stage approach for family-based genome-wide association studies. In the first stage, a test based on between-family information is used to rank SNPs according to their p-values or conditional power of the test. In the second stage, the R most promising SNPs are tested using a family-based association test. We call this two-stage approach top R method. Ionita-Laza et al.2 proposed an exponential weighting method within a two-stage framework. In the second stage of this approach, instead of testing top R SNPs it tests all SNPs and weights the p-values of association test according to the information of the first stage. However, both of the top R and exponential weighting methods only use the information from the first stage to rank SNPs. It seems that the two methods do not use information from the first stage efficiently. Furthermore, it may be unreasonable for the exponential weighting method to use the same weight for all SNPs within a group when only one or a few SNPs are related to disease.
In this article, we propose a data-driven weighting scheme within a two-stage framework. In this method, we use the information from the first stage to determine a SNP-specific weight for each SNP. We use simulation studies to evaluate the performance of our method. The simulation results show that our proposed method is consistently more powerful than the top R method and the exponential weighting method regardless of LD structure, population structure, and family structure.
two-stage; data-driven weighting; linkage disequilibrium; population stratification
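The two second-stage strategies contrasted above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Bonferroni thresholds, and the normalization of weights to mean 1 are assumptions chosen for illustration, and the weights passed to `weighted_bonferroni` are arbitrary placeholders rather than the exponential or data-driven weights of the cited methods.

```python
import numpy as np

def top_r_select(stage1_p, stage2_p, r, alpha=0.05):
    """Top R idea: test only the r SNPs ranked best in stage 1.

    Assumption: significance in stage 2 is declared at alpha / r (Bonferroni).
    """
    stage1_p = np.asarray(stage1_p, dtype=float)
    stage2_p = np.asarray(stage2_p, dtype=float)
    top = np.argsort(stage1_p, kind="stable")[:r]   # indices of the r smallest stage-1 p-values
    hits = top[stage2_p[top] <= alpha / r]
    return sorted(int(i) for i in hits)

def weighted_bonferroni(stage2_p, weights, alpha=0.05):
    """Weighted idea: test every SNP, with SNP i significant when p_i <= alpha * w_i / m.

    Assumption: weights are rescaled to average 1 so the overall level stays at alpha.
    """
    stage2_p = np.asarray(stage2_p, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.mean()
    m = stage2_p.size
    return [int(i) for i in np.flatnonzero(stage2_p <= alpha * w / m)]
```

In this sketch a SNP with a poor stage-1 rank can never be detected by `top_r_select`, whereas `weighted_bonferroni` still tests it, only at a stricter threshold; a SNP-specific weight simply means each entry of `weights` may differ.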
The CuII atom in the title compound, [CuI2(C20H14N4)], has a distorted square-pyramidal coordination formed by the N atoms of the tridentate 4′-(4-pyridyl)-2,2′:6′,2′′-terpyridine (pyterpy) ligand and two I atoms; one of the I atoms is in the apical position. In contrast to other known square-pyramidal diiodido- and dibromidocopper complexes of the pyterpy ligand, in which the metal–halogen distances differ significantly, in the title compound the apical and equatorial Cu–I bonds are almost identical [2.6141 (8) and 2.6025 (8) Å, respectively].
Population stratification is one of the major causes of spurious associations in association studies. A unified association approach based on principal-component analysis can overcome the effect of population stratification, as well as make use of both family and unrelated samples combined to increase power (family-case-control, or FamCC). In this study, we compared FamCC and the transmission-disequilibrium test (TDT) using data on hypertension, systolic blood pressure, and diastolic blood pressure in the Framingham Heart Study. Our study indicated FamCC has reasonable type I error for both the unrelated sample and the family sample for all three traits. For these three traits, we found results from FamCC were inconsistent with those from the TDT. We discuss the reasons for this inconsistency. After correcting for multiple tests, we did not detect any significant single-nucleotide polymorphisms by either FamCC or the TDT.
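For readers unfamiliar with the TDT used as the comparison method, its standard statistic can be sketched as follows. This is the textbook McNemar-type form for a biallelic marker, not code from the study; the function and argument names are hypothetical.

```python
def tdt_statistic(transmitted, untransmitted):
    """Classical TDT chi-square statistic (1 degree of freedom).

    transmitted:   number of heterozygous parents who transmitted the test allele
    untransmitted: number of heterozygous parents who did not transmit it
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (b - c) ** 2 / (b + c)
```

For example, 60 transmissions against 40 non-transmissions give a statistic of 4.0, which exceeds the 1-df critical value of about 3.84 at the 0.05 level before any correction for multiple testing.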
To account for population stratification in association studies, principal-components analysis is often performed on single-nucleotide polymorphisms (SNPs) across the genome. Here, we use Framingham Heart Study (FHS) Genetic Analysis Workshop 16 data to compare the performance of local ancestry adjustment for population stratification based on principal components (PCs) estimated from SNPs in a local chromosomal region with global ancestry adjustment based on PCs estimated from genome-wide SNPs.
Standardized height residuals from unrelated adults from the FHS Offspring Cohort were averaged from longitudinal data. PCs of SNP genotype data were calculated to represent each individual's ancestry either 1) globally, using all SNPs across the genome, or 2) locally, using SNPs in adjacent 20-Mbp regions within each chromosome. We assessed the extent to which results of association studies of height differed depending on whether PCs for global, local, or both global and local ancestry were included as covariates.
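The distinction between global and local PCs can be sketched for a single chromosome as below. This is an illustrative outline under stated assumptions, not the pipeline used in the study: the function names, the standardization of each SNP, the use of a plain SVD, and the choice of two PCs are all assumptions.

```python
import numpy as np

def top_pcs(genotypes, n_pcs=2):
    """Return the leading PC scores (samples x n_pcs) of a samples x SNPs genotype matrix."""
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)                  # center each SNP
    sd = g.std(axis=0)
    g = g[:, sd > 0] / sd[sd > 0]           # drop monomorphic SNPs, scale the rest
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]

def local_pcs(genotypes, positions_bp, window_bp=20_000_000, n_pcs=2):
    """Compute PCs separately from the SNPs in each consecutive window of window_bp."""
    g = np.asarray(genotypes, dtype=float)
    bins = np.asarray(positions_bp) // window_bp   # window index of every SNP
    return {int(b): top_pcs(g[:, bins == b], n_pcs) for b in np.unique(bins)}
```

Calling `top_pcs` on all SNPs gives the global PCs, while `local_pcs` gives one set per 20-Mbp window; the agreement between the two could then be examined with, for example, `np.corrcoef` on the first PC of each.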
The correlations between local and global PCs were low (r < 0.12), suggesting variability between local and global ancestry estimates. Genome-wide association tests without any ancestry adjustment demonstrated an inflated type I error rate that decreased with adjustment for local ancestry, global ancestry, or both. A known spurious association was replicated for SNPs within the lactase gene, and this false-positive association was abolished by adjustment with local or global ancestry PCs.
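Inflation of the type I error rate of the kind described here is often summarized with two simple diagnostics, sketched below. Neither is stated to be the measure used in the study: the empirical rejection rate and the genomic inflation factor are common choices introduced here as assumptions, and the function names are hypothetical.

```python
import numpy as np

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def empirical_type1_rate(p_values, alpha=0.05):
    """Fraction of tests with p <= alpha; close to alpha when all tests are null and calibrated."""
    p = np.asarray(p_values, dtype=float)
    return float(np.mean(p <= alpha))

def genomic_inflation(chi2_stats):
    """Ratio of the observed median 1-df chi-square statistic to its expected null median."""
    stats = np.asarray(chi2_stats, dtype=float)
    return float(np.median(stats) / CHI2_1DF_MEDIAN)
```

Under this sketch, values of `genomic_inflation` well above 1, or a rejection rate well above `alpha` among SNPs presumed null, would indicate inflation; both would be expected to move toward their null values after adjustment for ancestry PCs.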
Population stratification is a potential source of bias in this seemingly homogeneous FHS population. However, local and global PCs derived from SNPs appear to provide adequate information about ancestry.