We previously reported a top-ranked risk gene [i.e., serine incorporator 2 gene (SERINC2)] for alcohol dependence in the subjects of European descent by analyzing the common variants in a genome-wide association study. In the present study, we comprehensively examined the rare variants [minor allele frequency (MAF) < 0.05] in the NKAIN1-SERINC2 region, in order to confirm our previous finding.
A discovery sample (1,409 European-American cases with alcohol dependence and 1,518 European-American controls) and a replication sample (6,438 European-Australian family subjects with 1,645 alcohol dependent probands) underwent association analysis. A total of 39,903 subjects from 19 other cohorts with 11 different neuropsychiatric and neurological disorders served as contrast groups. The entire NKAIN1-SERINC2 region was imputed in all cohorts using the same reference panels of genotypes that included rare variants from the whole-genome sequencing data. We stringently cleaned the phenotype and genotype data, and obtained a total of about 220 SNPs in the subjects of European descent and about 450 SNPs in the subjects of African descent with 0
Using a weighted regression analysis implemented in the program SCORE-Seq, we found a rare variant constellation across the entire NKAIN1-SERINC2 region that was associated with alcohol dependence in European-Americans (Fp: overall p=1.8×10−4; VT: overall p=1.4×10−4; Collapsing p=6.5×10−5) and European-Australians (Fp: overall p=0.028; Collapsing p=0.025), but not African-Americans, and not associated with any other disorder examined. Association signals in this region came mainly from SERINC2, a gene that codes for an activity-regulated protein expressed in brain that incorporates serine into lipids. Additionally, 26 individual rare variants were nominally associated with alcohol dependence in European-Americans (p<0.05). The associations of 5 of these rare variants that lay within SERINC2 exhibited region-wide significance (p<α=0.0006); and 25 associations survived correction for false discovery rate (q<0.05). The associations of 2 rare variants at SERINC2 were replicated in European-Australians (p<0.05).
We concluded that SERINC2 was a replicable and significant risk gene specific for alcohol dependence in the subjects of European descent.
SERINC2; alcohol dependence; rare variant constellations; European descent; association
To summarize baseline characteristics from a large multi-center infertility clinical trial.
Cross-sectional baseline data from a double-blind randomized trial of 2 treatment regimens (letrozole vs. clomiphene).
Academic Health Centers throughout the U.S.
Main Outcome Measure(s)
Historical, biometric, biochemical and questionnaire parameters.
750 women with PCOS and their male partners took part in the study.
Females averaged ~30 years old and were obese (BMI 35), with ~20% from a racial/ethnic minority. Most were hirsute (87%) and nulligravid (63%). Most of the females had an elevated antral follicle count and enlarged ovarian volume on ultrasound. Women had elevated mean circulating androgens, LH:FSH ratio (~2), and AMH levels (8.0 ng/mL). Additionally, women had evidence for metabolic dysfunction with elevated mean fasting insulin and dyslipidemia. Increasing obesity was associated with decreased LH:FSH levels, AMH levels and antral follicle counts but increasing cardiovascular risk factors, including prevalence of the metabolic syndrome. Males were obese (BMI 30) and had normal mean semen parameters.
The treatment groups were well-matched at baseline. Obesity exacerbates select female reproductive and most metabolic parameters. We have also established a database and sample repository that will eventually be accessible to investigators.
insulin resistance; hirsutism; infertility; ovulation induction; metabolic syndrome
Alcohol dependence is more common among men than among women. Potential explanations for this male excess include a role of genes on sex chromosomes (X and Y). In the present study, we scanned the entire Y chromosome and its homologues on the X chromosome in males, in order to identify male-specific risk genes for alcohol dependence. Two thousand nine hundred twenty-seven subjects in two independent cohorts were analyzed. The European-American male cohort [883 cases with alcohol dependence and 445 controls] served as the discovery cohort and the European-American female cohort [526 cases and 1,073 controls] served as a contrast group. All subjects were genotyped on the Illumina Human 1M beadchip. Two thousand two hundred twenty-four SNPs on the Y chromosome or in the homologues on the X chromosome were analyzed. The allele frequencies were compared between cases and controls within each cohort using logistic regression analysis. We found that, after experiment-wide correction, 2 SNPs on the X chromosome were significantly associated with alcohol dependence in European-American males (p=1.0×10−4 for rs5916144 and p=5.5×10−5 for rs5961794 at 3'UTR of NLGN4X), but not in females. A total of twenty-six SNPs at 3'UTR of or within NLGN4X were nominally associated with alcohol dependence in males (5.5×10−5≤p≤0.05), none of which was statistically significant in females. We conclude that NLGN4X was a significant male-specific risk gene for alcohol dependence in European-Americans. NLGN4X might harbor a causal variant(s) for alcohol dependence. A defect of synaptogenesis in neuronal circuitry caused by NLGN4X mutations is believed to play a role in alcohol dependence.
Alcohol dependence; NLGN4X; Y chromosome; Homologue; Male-specificity; Synaptogenesis
Intraventricular hemorrhage is a disorder of complex etiology. We analyzed genotypes for 7 genes from 224 inborn preterm neonates treated with antenatal steroids and Grade 3-4 intraventricular hemorrhage and 389 matched controls. Only methylenetetrahydrofolate reductase was more prevalent in cases of intraventricular hemorrhage, emphasizing the need for more comprehensive genetic strategies.
In genetic studies of complex diseases, particularly mental illnesses, and behavior disorders, two distinct characteristics have emerged in some data sets. First, genetic data sets are collected with a large number of phenotypes that are potentially related to the complex disease under study. Second, each phenotype is collected from the same subject repeatedly over time. In this study, we present a nonparametric regression approach to study multivariate and time-repeated phenotypes together by using the technique of the multivariate adaptive regression splines for analysis of longitudinal data (MASAL), which makes it possible to identify genes, gene-gene and gene-environment, including time, interactions associated with the phenotypes of interest. Furthermore, we propose a permutation test to assess the associations between the phenotypes and selected markers. Through simulation, we demonstrate that our proposed approach has advantages over the existing methods that examine each longitudinal phenotype separately or analyze the summarized values of phenotypes by compressing them into one-time-point phenotypes. Application of the proposed method to the Framingham Heart Study illustrates that the use of multivariate longitudinal phenotypes enhanced the significance of the association test.
Multivariate phenotypes; longitudinal data analysis; genetic association test; multivariate adaptive regression splines
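The permutation test mentioned in the abstract above can be illustrated with a generic sketch. The abstract does not specify the test statistic, so the one used here (absolute difference in mean phenotype between carriers and non-carriers of a marker) is a stand-in chosen for illustration, and all function names are hypothetical. The idea shown is only the general mechanism: shuffle marker labels across subjects, recompute the statistic, and compare with the observed value.

```python
import random


def mean(values):
    return sum(values) / len(values)


def abs_mean_diff(phenotype, genotype):
    """Illustrative statistic (an assumption, not the MASAL-based one):
    absolute difference in mean phenotype, carriers (1) vs non-carriers (0)."""
    carriers = [y for y, g in zip(phenotype, genotype) if g == 1]
    others = [y for y, g in zip(phenotype, genotype) if g == 0]
    return abs(mean(carriers) - mean(others))


def permutation_p_value(phenotype, genotype, n_perm=2000, seed=0):
    """Permutation p-value: share of label shuffles whose statistic is at
    least as large as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = abs_mean_diff(phenotype, genotype)
    shuffled = list(genotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any phenotype-marker link
        if abs_mean_diff(phenotype, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: carriers have clearly higher phenotype values.
    pheno = [0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15, 16, 17]
    geno = [0] * 8 + [1] * 8
    print(permutation_p_value(pheno, geno))
```

For longitudinal or multivariate phenotypes, the same shuffle would be applied to whole subjects so that each subject's repeated measurements stay together; that detail is not shown here.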
Background and Objectives
We previously reported a risk genomic region (i.e.,
PTP4A1-PHF3-EYS) for alcohol dependence in a genome-wide association
study (GWAS). We also reported a rare variant constellation across this region that was
significantly associated with alcohol dependence. In the present study, we significantly
increased the marker density within this region and examined the specificity of the
associations of common variants for alcohol dependence.
One African-American discovery sample (681 cases with alcohol dependence and
508 controls), one European-American replication sample (1,409 alcohol dependent cases
and 1,518 controls), and one European-Australian replication sample (a total of 6,438
family subjects with 1,645 alcohol dependent probands) underwent association analysis. A
total of 38,714 subjects from 18 other cohorts with 10 different neuropsychiatric
disorders served as contrast groups.
We found 289 SNPs that were nominally associated with alcohol dependence in the
discovery sample (p<0.05). Fifty-six associations of them were significant after
No markers were significantly associated with other neuropsychiatric disorders after
Conclusions and Scientific Significance
We confirmed our previous findings that PTP4A1-PHF3-EYS
variants were significantly associated with alcohol dependence, which were replicable
across multiple independent populations and were specific for alcohol dependence. These
findings suggested that this region might harbor a causal variant(s) for alcohol
Common variants; Alcohol dependence; PTP4A1; PHF3; EYS
Gut microbiota-mediated low-grade inflammation is involved in the onset of type 2 diabetes (T2DM). In this study, we used high fat sucrose (HFS) diet-induced pre-insulin resistance and low dose-STZ HFS rat models to study the effect and mechanism of Lactobacillus casei Zhang in protecting against T2DM onset. Hyperglycemia was favorably suppressed by L. casei Zhang treatment. Moreover, the hyperglycemia was connected with type 1 immune response, high plasma bile acids and urine chloride ion loss. This chloride ion loss was significantly prevented by L. casei via upregulation of chloride ion-dependent genes (ClC1-7, GlyRα1, SLC26A3, SLC26A6, GABAAα1, Bestrophin-3 and CFTR). A shift in the caecal microflora, particularly the reduction of bile acid 7α-dehydroxylating bacteria, and fecal bile acid profiles also occurred. These changes coincided with organ chloride influx. Thus, we postulate that the prevention of T2DM onset by L. casei Zhang may be via a microbiota-based bile acid-chloride exchange mechanism.
Humans express at least seven alcohol dehydrogenase (ADH) isoforms that are encoded by the ADH gene cluster (ADH7–ADH1C–ADH1B–ADH1A–ADH6–ADH4–ADH5) on chromosome 4. ADHs are key catabolic enzymes for retinol and ethanol. Functional ADH variants (mostly rare) have been implicated in alcoholism risk. In addition to catalyzing the oxidation of retinol and ethanol, ADHs may be involved in the metabolic pathways of several neurotransmitters that are implicated in the neurobiology of neuropsychiatric disorders. In the present study, we comprehensively examined the associations between common ADH variants [minor allele frequency (MAF) >0.05] and 11 neuropsychiatric and neurological disorders. A total of 50,063 subjects in 25 independent cohorts were analyzed. The entire ADH gene cluster was imputed across these 25 cohorts using the same reference panels. Association analyses were conducted, adjusting for multiple comparisons. We found 28 and 15 single nucleotide polymorphisms (SNPs), respectively, that were significantly associated with schizophrenia in African-Americans and autism in European-Americans after correction by false discovery rate (FDR) (q <0.05); and 19 and 6 SNPs, respectively, that were significantly associated with these two disorders after region-wide correction by SNPSpD (8.9 × 10−5 ≤ p ≤ 0.0003 and 2.4 × 10−5 ≤ p ≤ 0.0003, respectively). No variants were significantly associated with the other nine neuropsychiatric disorders, including alcohol dependence. We concluded that common ADH variants conferred risk for both schizophrenia in African-Americans and autism in European-Americans.
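The abstract above reports SNPs that remain significant after false discovery rate correction (q < 0.05) but does not name the procedure. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment; the function name and the toy p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values),
    in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


if __name__ == "__main__":
    toy_p = [0.01, 0.04, 0.03, 0.005]  # made-up values for illustration
    for p, q in zip(toy_p, benjamini_hochberg(toy_p)):
        print(f"p={p:.3f}  q={q:.3f}  significant at q<0.05: {q < 0.05}")
```

A SNP would then be called significant when its q-value falls below 0.05, which is the threshold quoted in the abstract.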
Let Y1, …, Yn be a sequence whose underlying mean is a step function with an unknown number of steps and unknown change points. The detection of the change points, namely the positions where the mean changes, is an important problem in such fields as engineering, economics, climatology and bioscience. This problem has attracted a lot of attention in statistics, and a variety of solutions have been proposed and implemented. However, there is scant literature on the theoretical properties of those algorithms. Here, we investigate a recently developed algorithm called the Screening and Ranking Algorithm (SaRa). We characterize the theoretical properties of SaRa and show its superiority over other commonly used algorithms. In particular, we develop a false discovery rate approach to the multiple change-point problem and show a strong sure coverage property for the SaRa.
Change-point detection; copy number variation; false discovery rate; high dimensional data; screening and ranking algorithm
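To make the screening-and-ranking idea in the abstract above concrete, here is a minimal sketch of one way such a procedure can be written. It is not the authors' implementation. The assumptions are: the local diagnostic at position x is the difference between the mean of the h observations to the right and the h observations to the left, candidates are local maximizers of its absolute value within a window of half-width h, and a fixed `threshold` stands in for the false-discovery-rate-based cutoff developed in the paper. The names `sara_sketch`, `h` and `threshold` are hypothetical.

```python
def sara_sketch(y, h, threshold):
    """Return candidate change points of a step-function mean.

    A returned index x means the mean appears to change between
    y[x - 1] and y[x].
    """
    n = len(y)
    # Screening: local diagnostic |mean(right window) - mean(left window)|.
    diag = {}
    for x in range(h, n - h + 1):
        left = sum(y[x - h:x]) / h
        right = sum(y[x:x + h]) / h
        diag[x] = abs(right - left)
    # Ranking: keep local maximizers of the diagnostic above the cutoff.
    candidates = []
    for x, value in diag.items():
        window = [diag[k] for k in range(x - h + 1, x + h) if k in diag]
        if value >= threshold and value == max(window):
            candidates.append(x)
    return sorted(candidates)


if __name__ == "__main__":
    # Noise-free toy sequence with jumps at positions 20 and 40.
    signal = [0] * 20 + [5] * 20 + [0] * 20
    print(sara_sketch(signal, h=5, threshold=2.5))  # -> [20, 40]
```

Each diagnostic value depends only on 2h neighbouring observations, so the screening step is linear in n, which is part of what makes this style of algorithm attractive for long sequences such as copy number data.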
Intraventricular hemorrhage (IVH) of the preterm neonate is a complex developmental disorder, with contributions from both the environment and the genome. IVH, or hemorrhage into the germinal matrix of the developing brain with secondary periventricular infarction, occurs in that critical period of time before the 32nd – 33rd week post-conception and has been attributed to changes in cerebral blood flow to the immature germinal matrix microvasculature. Emerging data suggest that genes subserving coagulation, inflammatory and vascular pathways, and their interactions with environmental triggers may influence both the incidence and severity of cerebral injury and are the subject of this review.
Polymorphisms in the Factor V Leiden gene are associated with the atypical timing of IVH, suggesting an as yet unknown environmental trigger. Methylenetetrahydrofolate reductase (MTHFR) variants render neonates more vulnerable to cerebral injury in the presence of perinatal hypoxia. The present study demonstrates that the MTHFR 677C>T polymorphism and low 5 minute Apgar score additively increase the risk of IVH. Finally, review of published preclinical data suggests the stressors of delivery result in hemorrhage in the presence of mutations in collagen 4A1 (COL4A1), a major structural protein of the developing cerebral vasculature. Maternal genetics and fetal environment may also play a role.
Economically, Leuconostoc lactis is one of the most important species in the genus Leuconostoc. It plays an important role in the food industry including the production of dextrans and bacteriocins. Currently, traditional molecular typing approaches for characterisation of this species at the isolate level are either unavailable or are not sufficiently reliable for practical use. Multilocus sequence typing (MLST) is a robust and reliable method for characterising bacterial and fungal species at the molecular level. In this study, a novel MLST protocol was developed for 50 L. lactis isolates from Mongolia and China.
Sequences from eight targeted genes (groEL, carB, recA, pheS, murC, pyrG, rpoB and uvrC) were obtained. Sequence analysis indicated 20 different sequence types (STs), with 13 of them being represented by a single isolate. Phylogenetic analysis based on the sequences of eight MLST loci indicated that the isolates belonged to two major groups, A (34 isolates) and B (16 isolates). Linkage disequilibrium analyses indicated that recombination occurred at a low frequency in L. lactis, indicating a clonal population structure. Split-decomposition analysis indicated that intraspecies recombination played a role in generating genotypic diversity amongst isolates.
Our results indicated that MLST is a valuable tool for typing L. lactis isolates that can be used for further monitoring of evolutionary changes and population genetics.
Historically, the Mongol Empire ranks among the world's largest contiguous empires, and the Mongolians developed their unique lifestyle and diet over thousands of years. In this study, the intestinal microbiota of Mongolians residing in Ulan Bator, TUW province and the Khentii pasturing area were studied using 454 pyrosequencing and q-PCR technology. We explored the impacts of lifestyle and seasonal dietary changes on the Mongolians' gut microbes. At the phylum level, the Mongolians' gut populations were marked by a dominance of Bacteroidetes (55.56%) and a low Firmicutes to Bacteroidetes ratio (0.71). Analysis based on the operational taxonomic unit (OTU) level revealed that the Mongolian core intestinal microbiota comprised the genera Prevotella, Bacteroides, Faecalibacterium, Ruminococcus, Subdoligranulum and Coprococcus. Urbanisation and lifestyle may have modified the compositions of the gut microbiota of Mongolians from Ulan Bator, TUW and Khentii. Based on a food frequency questionnaire, we found that the dietary structure was diverse and stable throughout the year in Ulan Bator and TUW, but was simple and varied during the year in Khentii. Accordingly, seasonal effects on intestinal microbiota were more distinct in Khentii residents than in TUW or Ulan Bator residents.
To estimate whether progestin-induced endometrial shedding, prior to ovulation induction with clomiphene citrate, metformin, or a combination of both, affects ovulation, conception, and live birth rates in women with polycystic ovary syndrome (PCOS).
A secondary analysis of the data from 626 women with PCOS from the National Institutes of Child Health and Human Development Cooperative Reproductive Medicine Network trial was performed. Women had been randomized to up to six cycles of clomiphene citrate alone, metformin alone, or clomiphene citrate plus metformin. Women were assessed for occurrence of ovulation, conception, and live birth in relation to prior bleeding episodes (after either ovulation or exogenous progestin-induced withdrawal bleed).
While ovulation rates were higher in cycles preceded by spontaneous endometrial shedding than after anovulatory cycles (with or without prior progestin withdrawal), both conception and live birth rates were significantly higher after anovulatory cycles without progestin-induced withdrawal bleeding (live birth per cycle: spontaneous menses 2.2%; anovulatory with progestin withdrawal 1.6%; anovulatory without progestin withdrawal 5.3%; p<0.001). The difference was more marked when rate was calculated per ovulation (live birth per ovulation: spontaneous menses 3.0%; anovulatory with progestin withdrawal 5.4%; anovulatory without progestin withdrawal 19.7%; p < .001).
Conception and live birth rates are lower in women with PCOS after a spontaneous menses or progestin-induced withdrawal bleeding as compared to anovulatory cycles without progestin withdrawal. The common clinical practice of inducing endometrial shedding with progestin prior to ovarian stimulation may have an adverse effect on rates of conception and live birth in anovulatory women with PCOS.
Polycystic ovary syndrome (PCOS) patients are at increased risk of pregnancy complications, which may impair pregnancy outcome. Transfer of fresh embryos after superovulation may lead to abnormal implantation and placentation and further increase risk for pregnancy loss and complications. Some preliminary data suggest that elective embryo cryopreservation followed by frozen–thawed embryo transfer into a hormonally primed endometrium could result in a higher clinical pregnancy rate than that achieved by fresh embryo transfer.
This study is a multicenter, prospective, randomized controlled clinical trial (1:1 treatment ratio of fresh vs. elective frozen embryo transfers). A total of 1,180 infertile PCOS patients undergoing the first cycle of in vitro fertilization (IVF) or intracytoplasmic sperm injection will be enrolled and randomized into two parallel groups. Participants in group A will undergo fresh embryo transfer on day 3 after oocyte retrieval, and participants in group B will undergo elective embryo cryopreservation after oocyte retrieval and frozen–thawed embryo transfer in programmed cycles. The primary outcome is the live birth rate. Our study is powered at 80% to detect an absolute difference of 10% at the significance level of 0.01 based on a two-sided test.
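The power statement in the design paragraph can be related to a sample size with the standard normal-approximation formula for comparing two proportions. The sketch below is illustrative only: the protocol summary does not state the baseline live birth rate, so the 40% versus 50% rates used in the example are purely hypothetical, and the function name `n_per_group` is an assumption. It reads the design as 80% power, a 10-percentage-point absolute difference, and a two-sided significance level of 0.01.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha, power):
    """Approximate subjects needed per arm to detect p1 vs p2 with a
    two-sided test at level alpha (normal approximation, equal arms)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    # Variance under the null (pooled) and under the alternative (unpooled).
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil((null_term + alt_term) ** 2 / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Hypothetical baseline of 40% live birth vs 50%; not from the protocol.
    print(n_per_group(0.40, 0.50, alpha=0.01, power=0.80))
```

The required size grows quickly as the baseline rate approaches 50% or the detectable difference shrinks, which is why the assumed baseline matters; any allowance for dropout would be added on top of this figure.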
We hypothesize that elective embryo cryopreservation and frozen–thawed embryo transfer will reduce the incidence of pregnancy complications and increase the live birth rate in PCOS patients who need IVF to achieve pregnancy.
ClinicalTrials.gov Identifier: NCT01841528
Frozen–thawed embryo transfer; In vitro fertilization; Live birth; Polycystic ovarian syndrome
We aimed to identify novel, functional, replicable and genome-wide
significant risk regions specific for alcohol dependence using genome-wide
association studies (GWASs).
A discovery sample (1,409 European-American cases with alcohol
dependence and 1,518 European-American controls) and a replication sample
(6,438 European-Australian family subjects with 1,645 alcohol dependent
probands) underwent association analysis. Nineteen other cohorts with 11
different neuropsychiatric disorders served as contrast groups. Eight
additional samples underwent expression quantitative locus (eQTL) analysis.
A genome-wide significant risk gene region
(NKAIN1-SERINC2) was identified in a meta-analysis of
the discovery and replication samples. This region was enriched with 74 risk
SNPs (unimputed); half of them had significant cis-acting
regulatory effects. The distributions of -log(p) values for the SNP-disease
associations or SNP-expression associations in this region were consistent
throughout eight independent samples. Furthermore, imputing across the
NKAIN1-SERINC2 region, we found that among all 795 SNPs
in the discovery sample, 471 SNPs were nominally associated with alcohol
dependence (1.7×10−7≤p≤0.047); 53
survived region- and cohort-wide correction for multiple testing; 92 SNPs
were replicated in the replication sample (0.002≤p≤0.050).
This region was neither significantly associated with alcohol dependence in
African-Americans, nor with other non-alcoholism diseases. Finally,
transcript expression of genes in NKAIN1-SERINC2 was
significantly (p<3.4×10−7) associated
with expression of numerous genes in the neurotransmitter systems or
metabolic pathways previously associated with alcohol dependence.
NKAIN1-SERINC2 may harbor a causal variant(s) for
alcohol dependence. It may contribute to the disease risk by way of
neurotransmitter systems or metabolic pathways.
GWAS; genome-wide association studies; alcohol dependence; eQTL; risk region; replication
Alcohol and nicotine co-dependence can be considered as a more severe subtype of alcohol dependence. A portion of its risk may be attributable to genetic factors.
We searched for significant risk genomic regions specific for this disorder using a genome-wide association study (GWAS). A total of 8,847 subjects underwent gene-disease association analysis, including (i) a discovery cohort of 818 European-American cases with alcohol and nicotine co-dependence and 1,396 European-American controls, (ii) a replication cohort of 5,704 Australian family subjects with 907 affected offspring, and (iii) a replication cohort of 449 African-American cases and 480 African-American controls. Additionally, a total of 38,714 subjects of European or African descent in 18 independent cohorts with 10 other non-alcoholism neuropsychiatric disorders were analyzed as contrast. Furthermore, 90 unrelated HapMap CEU individuals, 93 European brain tissue samples and 80 European peripheral blood mononuclear cell (PBMC) samples underwent cis-acting expression quantitative locus (cis-eQTL) analysis.
We identified a significant risk region for alcohol and nicotine co-dependence between IPO11 and HTR1A on chromosome 5q that was reported to be suggestively associated with alcohol dependence previously. In the European-American discovery cohort, 381 SNPs in this region were nominally associated with alcohol and nicotine co-dependence (p<0.05); 57 associations of them survived region- and cohort-wide correction (α=3.6×10−6); and the top SNP (rs7445832) was significantly associated with alcohol and nicotine co-dependence at the genome-wide significance level (p=6.2×10−9). Furthermore, associations for 34 and 11 SNPs were replicated in the Australian and African-American replication cohorts, respectively. Among these replicable associations, 4 reached genome-wide significance level in the meta-analysis of European-Americans and European-Australians: rs7445832 (p=9.6×10−10), rs13361996 (p=8.2×10−9), rs62380518 (p=2.3×10−8) and rs7714850 (p=3.4×10−8). Cis-eQTL analysis showed that many risk SNPs in this region had nominally significant cis-acting regulatory effects on HTR1A or IPO11 mRNA expression. Finally, no markers were significantly associated with any other neuropsychiatric disorder examined.
We speculate that this IPO11-HTR1A region might harbor a causal variant for alcohol and nicotine co-dependence.
GWAS; alcohol and nicotine co-dependence; cis-eQTL; IPO11; HTR1A
The human gut microbiota consists of complex microbial communities, which possibly play crucial roles in physiological functioning and health maintenance. China has evolved into a multicultural society consisting of the major ethnic group, Han, and 55 official ethnic minority groups. Nowadays, these minority groups inhabit different Chinese provinces and some of them still keep their unique culture and lifestyle. Currently, only limited data are available on the gut microbiota of these Chinese ethnic groups. In this study, 10 major fecal bacterial groups of 314 healthy individuals from 7 Chinese ethnic origins were enumerated by quantitative polymerase chain reaction. Our data confirmed that the selected bacterial groups were common to all 7 surveyed ethnicities, but the amounts of the individual bacterial groups varied to different degrees. By principal component and canonical variate analyses of the 314 individuals or the 91 Han subjects, no distinct group clustering pattern was observed. Nevertheless, weak differences were noted between the Han and Zhuang and the other ethnic minority groups, and between the Heilongjiang Han and those of the other provinces. Thus, our results suggest that the ethnic origin may contribute to shaping the human gut microbiota.
To investigate whether prenatal exposure to nicotine has an impact on several reading skill outcomes in school age children.
Using a longitudinal sample of 5,119 school age children in the Avon Longitudinal Study of Parents and Children (ALSPAC), this study investigated specific reading skill outcomes in the area of speed, fluency, accuracy, spelling and comprehension in relation to prenatal nicotine exposure, after adjusting for potential mediators and confounders. Prenatal nicotine exposure was divided into three categories: high (>17mg per day), low (≤17mg per day) and no exposure.
We found that prenatal nicotine exposure was associated with increased risk of underperformance in specific reading skill outcomes after adjusting for potential mediators and confounders (p = .006). The effect of poor performance in decoding single words was most pronounced among children with prenatal exposure to high levels of nicotine in conjunction with a phonological deficit. Overall the results showed that maternal smoking has moderate to large associations with delayed or decreased reading skills of children in the ALSPAC.
High prenatal nicotine exposure has a negative association with reading performance in school age children. In addition, modeling showed that environmental factors significantly moderated the interaction between prenatal nicotine exposure and reading skill outcomes.
Reading performance; reading skills; prenatal exposure to nicotine; ALSPAC
To determine whether self-reported menopausal symptoms are associated with measures of subclinical atherosclerosis.
Multi-center, randomized controlled trial.
Recently menopausal women (n=868) screened for the Kronos Early Estrogen Prevention Study (KEEPS).
Cross sectional analysis.
Main Outcome Measures
Baseline menopausal symptoms (hot flashes, dyspareunia, vaginal dryness, night sweats, palpitations, mood swings, depression, insomnia, irritability), serum estradiol (E2) levels and measures of atherosclerosis were assessed. Atherosclerosis was quantified using Coronary Artery Calcium (CAC) Agatston scores (n=771) and Carotid Intima-Media Thickness (CIMT). A logistic regression model of menopausal symptoms and E2 was used to predict CAC. A linear regression model of menopausal symptoms and E2 was used to predict CIMT. Correlations of length of time in menopause with menopausal symptoms, E2, CAC, and CIMT were assessed.
In early menopausal women screened for KEEPS, neither E2 nor climacteric symptoms predicted the extent of subclinical atherosclerosis. Palpitations (p=0.09) and depression (p=0.07) approached significance as predictors of CAC. Other symptoms of insomnia, irritability, dyspareunia, hot flashes, mood swings, night sweats, and vaginal dryness were not associated with CAC. Women with significantly elevated CAC scores were excluded from further participation in KEEPS; in women meeting inclusion criteria, neither baseline menopausal symptoms nor E2 predicted CIMT. Years since menopause onset correlated with CIMT, dyspareunia, vaginal dryness and E2.
Self-reported symptoms in recently menopausal women are not strong predictors of subclinical atherosclerosis. Continued follow-up of this population will be performed to determine if baseline or persistent symptoms in the early menopause are associated with progression of cardiovascular disease.
KEEPS; estrogen; cardiovascular; menopause; CAC; CIMT; palpitations; depression
Robust variable selection procedures through penalized regression have been gaining increased attention in the literature. They can be used to perform variable selection and are expected to yield robust estimates. However, to the best of our knowledge, the robustness of those penalized regression procedures has not been well characterized. In this paper, we propose a class of penalized robust regression estimators based on exponential squared loss. The motivation for this new procedure is that it enables us to characterize its robustness, which has not been done for the existing procedures, while its performance is near optimal and superior to some recently developed methods. Specifically, under defined regularity conditions, our estimators are √n-consistent and possess the oracle property. Importantly, we show that our estimators can achieve the highest asymptotic breakdown point of 1/2 and that their influence functions are bounded with respect to the outliers in either the response or the covariate domain. We performed simulation studies to compare our proposed method with some recent methods, using the oracle method as the benchmark. We consider common sources of influential points. Our simulation studies reveal that our proposed method performs similarly to the oracle method in terms of the model error and the positive selection rate even in the presence of influential points. In contrast, other existing procedures have a much lower non-causal selection rate. Furthermore, we re-analyze the Boston Housing Price Dataset and the Plasma Beta-Carotene Level Dataset, which are commonly used examples for regression diagnostics of influential points. Our analysis reveals the discrepancies between our robust method and the other penalized regression method, underscoring the importance of developing and applying robust penalized regression methods.
Robust regression; Variable selection; Breakdown point; Influence function
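The robustness claim in the abstract above rests on the loss being bounded. The sketch below illustrates this under the assumption that the exponential squared loss has the form φ_γ(t) = 1 − exp(−t²/γ) with a tuning parameter γ > 0, which the abstract itself does not spell out. It fits only a location parameter by grid search, with no penalty term and no variable selection, so it shows the effect of the loss rather than the proposed estimator. The function names and toy data are hypothetical.

```python
import math


def exp_squared_loss(residual, gamma):
    """Assumed form of the exponential squared loss: bounded above by 1,
    so a single huge residual cannot dominate the objective."""
    return 1.0 - math.exp(-(residual ** 2) / gamma)


def robust_location(data, gamma=1.0, step=0.01):
    """Grid-search minimizer of the summed loss over a location parameter."""
    lo, hi = min(data), max(data)
    n_steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n_steps + 1)]
    return min(
        grid,
        key=lambda mu: sum(exp_squared_loss(y - mu, gamma) for y in data),
    )


if __name__ == "__main__":
    # Six observations near 1.0 plus one gross outlier (made-up data).
    sample = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 50.0]
    print("sample mean:", sum(sample) / len(sample))
    print("exp-squared-loss estimate:", robust_location(sample))
```

On this toy sample the mean is pulled far toward the outlier, while the bounded-loss estimate stays with the bulk of the data. Smaller values of γ downweight large residuals more aggressively, at some cost in efficiency when the data are clean.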
Twin and family studies establish the foundation for studying the genetic, environmental and cultural transmission effects for phenotypes. In this work, we make use of the well established statistical methods and theory for mixed models to assess cultural transmission in twin and family studies. Specifically, we address two critical yet poorly understood issues: the model identifiability in assessing cultural transmission for twin and family data and the biases in the estimates when sub-models are used. We apply our models and theory to two real data sets. A simulation is conducted to verify the bias in the estimates of genetic effects when the working model is a sub-model.
Twin and family study; Biometrical Genetic model; Cultural transmission; Identifiability; Likelihood ratio test; Mixed-effects model
Increasing evidence suggests that rare and generally deleterious genetic variants might have a strong impact on the risk of not only Mendelian diseases, but also many common diseases. However, identifying such rare variants remains challenging, and novel statistical methods and bioinformatic software must be developed. Hence, we have to extensively evaluate various methods under reasonable genetic models. While genomic data are abundant, they are not very helpful for evaluating the methods because the underlying disease mechanism is unknown. Thus, it is imperative that we simulate genomic data that mimic real data containing rare variants and that enable us to impose a known disease penetrance model. Although resampling simulation methods have shown their advantages in computational efficiency and in preserving important properties such as linkage disequilibrium (LD) and allele frequency, they still have limitations, as we demonstrate. We propose an algorithm that combines regression-based imputation with resampling to simulate genetic data with both rare and common variants. A logistic regression model was fitted to the relationship between a rare variant and its nearby common variants in the 1000 Genomes Project data, and the fitted model was then applied to the real data to fill in one rare variant at a time based on the common variants. Individuals were then simulated using the real data with imputed rare variants. We compared our method with existing simulators and demonstrated that our method performed well in retaining the properties of the real sample, such as LD and minor allele frequency, qualitatively.
resampling; logistic regression; simulation; rare SNPs
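The two steps described above, regression-based imputation followed by resampling, can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm. It treats the rare variant as binary carrier status, uses three simulated common variants in place of 1000 Genomes Project data, fits the logistic model with a hand-written Newton iteration plus a small ridge term, and resamples whole individuals with replacement. All names, sample sizes and effect sizes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25, ridge=1e-3):
    """Logistic regression by Newton's method; the ridge term only aids stability."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xd @ beta)
        grad = Xd.T @ (y - p) - ridge * beta
        hess = Xd.T @ (Xd * (p * (1 - p))[:, None]) + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(1)

# Step 0 (stand-in for a reference panel): common genotypes coded 0/1/2 and a
# rare carrier indicator that depends on the first common variant.
ref_common = rng.binomial(2, 0.3, size=(2000, 3)).astype(float)
ref_rare = rng.binomial(1, sigmoid(-4.0 + 1.0 * ref_common[:, 0]))

# Step 1: fit the rare variant on its nearby common variants in the reference.
beta = fit_logistic(ref_common, ref_rare)

# Step 2: apply the fitted model to a study sample that has only common variants,
# drawing the rare variant from its predicted carrier probability.
study_common = rng.binomial(2, 0.3, size=(1000, 3)).astype(float)
p_rare = sigmoid(np.column_stack([np.ones(1000), study_common]) @ beta)
study_rare = rng.binomial(1, p_rare)
study = np.column_stack([study_common, study_rare])

# Step 3: simulate new individuals by resampling rows of the augmented data.
idx = rng.integers(0, len(study), size=5000)
simulated = study[idx]

print("reference carrier frequency:", ref_rare.mean())
print("imputed carrier frequency:  ", study_rare.mean())
print("simulated carrier frequency:", simulated[:, 3].mean())
```

Drawing the rare variant from its predicted probability, rather than assigning the most likely value, keeps the imputed carrier frequency close to that of the reference panel. With several rare variants, the fitting and drawing steps would be repeated one variant at a time.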
Preterm (PT) subjects are at risk for developmental delay, and task-based studies suggest that developmental disorders may be due to alterations in neural connectivity. Since emerging data imply the importance of right cerebellar function for language acquisition in typical development, we hypothesized that PT subjects would have alternate areas of cerebellar connectivity, and that these areas would be responsible for differences in cognitive outcomes between PT subjects and term controls at age 20 years.
Nineteen PT and 19 term control young adults were prospectively studied using resting-state functional MRI (fMRI) to create voxel-based contrast maps reflecting the functional connectivity of each tissue element in the grey matter through analysis of the intrinsic connectivity contrast degree (ICC-d). Left cerebellar ICC-d differences between subjects identified a region of interest that was used for subsequent seed-based connectivity analyses. Subjects underwent standardized language testing, and correlations with cognitive outcomes were assessed.
There were no differences in gender, hand preference, maternal education, age at study, or Peabody Picture Vocabulary Test (PPVT) scores. Functional connectivity (FcMRI) demonstrated increased tissue connectivity in the biventer, simple and quadrangular lobules of the L cerebellum (p<0.05) in PTs compared to term controls; seed-based analyses from these regions demonstrated alterations in connectivity from L cerebellum to both R and L inferior frontal gyri (IFG) in PTs compared to term controls. For PTs but not term controls, there were significant positive correlations between these connections and PPVT scores (R IFG: r=0.555, p=0.01; L IFG: r=0.454, p=0.05), as well as Verbal Comprehension Index (VCI) scores (R IFG: r=0.472, p=0.04).
These data suggest the presence of a left cerebellar language circuit in PT subjects at young adulthood. These findings may represent either a delay in maturation or the engagement of alternative neural pathways for language in the developing PT brain.
Preterm; cerebellum; language systems; functional MRI; resting state intrinsic connectivity contrast degree