We constructed a very-high-density, whole-genome marker map (WGMM) for cotton by using 18,597 DNA markers corresponding to 48,958 loci that were aligned to both a consensus genetic map and a reference genome sequence. The WGMM has a density of one locus per 15.6 kb, or an average of 1.3 loci per gene. The WGMM was anchored by the use of colinear markers to a detailed genetic map, providing recombinational information. Mapped markers occurred at relatively greater physical densities in distal chromosomal regions and lower physical densities in the central regions, with all 1 Mb bins having at least nine markers. Hotspots for quantitative trait loci and resistance gene analog clusters were aligned to the map and DNA markers identified for targeting of these regions of high practical importance. Based on the cotton D genome reference sequence, the locations of chromosome structural rearrangements plotted on the map facilitate its translation to other Gossypium genome types. The WGMM is a versatile genetic map for marker assisted breeding, fine mapping and cloning of genes and quantitative trait loci, developing new genetic markers and maps, genome-wide association mapping, and genome evolution studies.
quantitative trait loci; resistance gene analog; simple sequence repeat; restriction fragment length polymorphism; inversions
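The bin-level density check described for the cotton WGMM above (marker counts per 1 Mb bin, and average spacing between loci) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the input format, the function names `markers_per_bin` and `mean_spacing_kb`, and the toy data are all assumptions made for the example.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # 1 Mb bins, as in the text


def markers_per_bin(loci, chrom_lengths, bin_size=BIN_SIZE):
    """Count loci per fixed-width bin, reporting empty bins as zero.

    loci: iterable of (chromosome, position_bp) pairs (hypothetical format).
    chrom_lengths: dict mapping chromosome name to its length in bp.
    """
    counts = Counter((chrom, pos // bin_size) for chrom, pos in loci)
    result = {}
    for chrom, length in chrom_lengths.items():
        n_bins = (length + bin_size - 1) // bin_size  # ceiling division
        for b in range(n_bins):
            result[(chrom, b)] = counts.get((chrom, b), 0)
    return result


def mean_spacing_kb(n_loci, span_bp):
    """Average physical spacing between loci, in kb."""
    return span_bp / n_loci / 1000


# Toy data, invented for illustration only.
toy_lengths = {"Chr01": 2_500_000}
toy_loci = [("Chr01", 10_000), ("Chr01", 950_000),
            ("Chr01", 1_200_000), ("Chr01", 2_400_000)]
bins = markers_per_bin(toy_loci, toy_lengths)
sparsest = min(bins.values())  # the text reports at least nine per 1 Mb bin
```

Taking the minimum over all bins is what lets a claim such as "all 1 Mb bins have at least nine markers" be checked directly. With 48,958 loci at one locus per 15.6 kb, the implied covered span is about 764 Mb.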
To identify genetic predictors of diabetes-associated erectile dysfunction (ED) using genome-wide and candidate gene approaches in a cohort of men with type 1 diabetes (T1D).
We examined 528 white men with T1D (125 with ED) from the DCCT and its observational follow-up, the EDIC Study. ED was defined from a single item of the IIEF. An Illumina Human1M BeadChip was used for genotyping, and 867,125 single nucleotide polymorphisms (SNPs) were subjected to analysis. Whole-genome and candidate gene approaches tested the hypothesis that genetic polymorphisms may predispose men with T1D to ED. Univariate and multivariate models were used, controlling for age, HbA1c, diabetes duration, and prior randomization to intensive or conventional insulin therapy during DCCT. A stratified false discovery rate was used to perform the candidate gene analysis.
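One common way to apply a stratified false discovery rate is to run a step-up procedure separately within each stratum of SNPs (for example, candidate-gene SNPs versus the rest). The sketch below uses the Benjamini-Hochberg procedure for this. The abstract does not state which procedure was used, so the choice of Benjamini-Hochberg, the stratum names, and the function names are assumptions made for illustration.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans, True where the test is rejected at FDR level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


def stratified_fdr(pvalues_by_stratum, q=0.05):
    """Apply Benjamini-Hochberg independently within each stratum."""
    return {name: benjamini_hochberg(p, q)
            for name, p in pvalues_by_stratum.items()}


# Toy p-values, invented for illustration only.
toy = {
    "candidate_genes": [0.001, 0.04, 0.03, 0.5],
    "other_snps": [0.01, 0.02, 0.03, 0.04],
}
decisions = stratified_fdr(toy)
```

Controlling the rate within strata means a small, well-motivated set of candidate SNPs is not penalized by the much larger number of tests in the rest of the genome.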
Two SNPs located on chromosome 3 in one genomic locus were associated with ED with p < 1 × 10⁻⁶: rs9810233 had a p-value of 7 × 10⁻⁷ and rs1920201 had a p-value of 9 × 10⁻⁷. The nearest gene to these two SNPs is ALCAM. The genetic association results at this locus were similar in univariate and multivariate analyses. No candidate genes met criteria for statistical significance.
Two SNPs, rs9810233 and rs1920201, which are 25 kb apart, are both associated with ED, albeit without meeting the standard GWAS significance criterion of p < 5 × 10⁻⁸. Other studies with larger sample sizes will be required to determine whether ALCAM represents a novel gene in the pathogenesis of diabetes-associated ED.
Erectile Dysfunction; Diabetes; Genetics
Refractive error is the most common eye disorder worldwide, and a prominent cause of blindness. Myopia affects over 30% of Western populations, and up to 80% of Asians. The CREAM consortium conducted genome-wide meta-analyses including 37,382 individuals from 27 studies of European ancestry, and 8,376 from 5 Asian cohorts. We identified 16 new loci for refractive error in subjects of European ancestry, of which 8 were shared with Asians. Combined analysis revealed 8 additional loci. The new loci include genes with functions in neurotransmission (GRIA4), ion channels (KCNQ5), retinoic acid metabolism (RDH5), extracellular matrix remodeling (LAMA2, BMP2), and eye development (SIX6, PRSS56). We also confirmed previously reported associations with GJD2 and RASGRF1. Risk score analysis using associated SNPs showed a tenfold increased risk of myopia for subjects with the highest genetic load. Our results, accumulated across independent multi-ethnic studies, considerably advance understanding of mechanisms involved in refractive error and myopia.
Haptoglobin (Hp) is an abundant serum protein which binds extracorpuscular hemoglobin (Hb). Two alleles exist in humans for the Hp gene, denoted 1 and 2. Diabetic individuals with the Hp 2-2 genotype are at increased risk of developing vascular complications including heart attack, stroke, and kidney disease. Recent evidence shows that treatment with vitamin E can reduce the risk of diabetic vascular complications by as much as 50% in Hp 2-2 individuals. We sought to develop a rapid and accurate test for Hp phenotype (which is 100% concordant with the three major Hp genotypes) to facilitate widespread diagnostic testing as well as prospective clinical trials.
A monoclonal antibody raised against human Hp was shown to distinguish between the three Hp phenotypes in an enzyme linked immunosorbent assay (ELISA). Hp phenotypes obtained in over 8000 patient samples using this ELISA method were compared with those obtained by polyacrylamide gel electrophoresis or the TaqMan PCR method.
Our analysis showed that the sensitivity and specificity of the ELISA test for the Hp 2-2 phenotype are 99.0% and 98.1%, respectively. The positive predictive value and the negative predictive value for the Hp 2-2 phenotype are 97.5% and 99.3%, respectively. Similar results were obtained for the Hp 2-1 and Hp 1-1 phenotypes. In addition, the ELISA was determined to be more sensitive and specific than the TaqMan method.
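The four accuracy measures reported above all derive from a 2 × 2 table of test calls against the reference method. A minimal sketch follows; the counts are invented for illustration and are not the study's data, and `diagnostic_metrics` is a hypothetical helper name.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute accuracy measures from a 2 x 2 table.

    tp: test positive, reference positive    fp: test positive, reference negative
    fn: test negative, reference positive    tn: test negative, reference negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true Hp 2-2 samples called Hp 2-2
        "specificity": tn / (tn + fp),  # share of non-Hp 2-2 samples called non-Hp 2-2
        "ppv": tp / (tp + fp),          # share of Hp 2-2 calls that are correct
        "npv": tn / (tn + fn),          # share of non-Hp 2-2 calls that are correct
    }


# Toy counts, invented for illustration only.
metrics = diagnostic_metrics(tp=90, fp=5, fn=10, tn=95)
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on how common the Hp 2-2 phenotype is in the tested population.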
The Hp ELISA represents a user-friendly, rapid and highly accurate diagnostic tool for determining Hp phenotypes. This test will greatly facilitate the typing of thousands of samples in ongoing clinical studies.
diabetes; ELISA; haptoglobin phenotype; pharmacogenomics; vitamin E
Percent mammographic density (PMD) is a strong and highly heritable risk factor for breast cancer. Studies of the role of PMD in familial breast cancer may require controls, such as the sisters of cases, selected from the same 'risk set' as the cases. The use of sister controls would allow control for factors that have been shown to influence risk of breast cancer such as race/ethnicity, socioeconomic status and a family history of breast cancer, but may introduce 'overmatching' and attenuate case-control differences in PMD.
To examine the potential effects of using sister controls rather than unrelated controls in a case-control study, we examined PMD in triplets, each comprised of a case with invasive breast cancer, an unaffected full sister control, and an unaffected unrelated control. Both controls were matched to cases on age at mammogram. Total breast area and dense area in the mammogram were measured in the unaffected breast of cases and a randomly selected breast in controls, and the non-dense area and PMD calculated from these measurements.
The mean difference in PMD between cases and controls, and the standard deviation (SD) of the difference, were slightly less for sister controls (4.2% (SD = 20.0)) than for unrelated controls (4.9% (SD = 25.7)). We found statistically significant correlations in PMD between cases (n = 228) and sister controls (n = 228) (r = 0.39 (95% CI: 0.28, 0.50; P < 0.0001)), but not between cases and unrelated controls (n = 228) (r = 0.04 (95% CI: -0.09, 0.17; P = 0.51)). After adjusting for other risk factors, square root transformed PMD was associated with an increased risk of breast cancer when comparing cases to sister controls (adjusted inter-quintile odds ratio (IQOR) = 2.19, 95% CI = 1.20, 4.00) or to unrelated controls (adjusted IQOR = 2.62, 95% CI = 1.62, 4.25).
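A correlation between paired PMD values, with a confidence interval, can be obtained with the Fisher z-transformation, as sketched below. The abstract does not say how its intervals were computed, so this method and the function names are assumptions; the result for r = 0.39 and n = 228 is close to, but not identical with, the reported interval.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a correlation via the Fisher z-transformation."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Case versus sister-control correlation reported in the text.
low, high = fisher_ci(0.39, 228)
```

An interval that excludes zero, as here, corresponds to the statistically significant case-sister correlation that underlies the overmatching concern.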
The use of sister controls in case-control studies of PMD resulted in a modest attenuation of case-control differences and risk estimates, but showed a statistically significant association with risk and allowed control for race/ethnicity, socioeconomic status and family history.
Mammographic density; case-control study; overmatching; case control
To investigate the underlying phenotypic constructs in autism spectrum disorders (ASD) and to identify genetic loci that are linked to these empirically derived factors.
Exploratory factor analysis was applied to two datasets with 28 selected Autism Diagnostic Interview-Revised (ADI-R) algorithm items. The first dataset was from the Autism Genome Project (AGP) phase I (1,236 ASD subjects from 618 families); the second was from the AGP phase II (804 unrelated ASD subjects). Variables derived from the factor analysis were then used as quantitative traits in genome-wide variance components linkage analyses.
Six factors, joint attention, social interaction and communication, non-verbal communication, repetitive sensory-motor behaviour, peer interaction, and compulsion/restricted interests, were retained for both datasets. There was good agreement between the factor loading patterns from the two datasets. All factors showed familial aggregation. Suggestive evidence for linkage was obtained for the joint attention factor on 11q23. Genome-wide significant evidence for linkage was obtained for the repetitive sensory-motor behaviour factor on 19q13.3.
This study demonstrates that the underlying phenotypic constructs based on the ADI-R algorithm items are replicable in independent datasets; and the empirically derived factors are suitable and informative in genetic studies of ASD.
autism; ADI-R; factor analysis; linkage analysis; quantitative trait
Urogenital prolapse can have a significant impact on quality of life. The lifetime risk of requiring surgery for urogenital prolapse is 11%. Prolift mesh has recently been introduced to reduce the repeat operation rate and provide long-term benefit.
To evaluate the outcome of the treatment of urogenital prolapse with synthetic mesh.
We carried out a retrospective review of the case notes of all women who underwent Prolift mesh insertion for prolapse between July 2004 and June 2005 at the Royal Alexandra Hospital, Paisley, UK. We looked at the presenting complaints, previous operations, intraoperative complications, and complications at six-week and six-month follow-up.
Twenty-two procedures were carried out in the twelve-month period. The age of the patients ranged from 55 to 82 years (median 64 years). Eleven patients (50%) had anterior Prolift, seven (31.8%) had posterior Prolift, and four (18%) had total Prolift. There were no intraoperative complications. All the patients had previous surgery for prolapse. Eight patients had anterior repair, six patients had posterior repair, and three patients had abdominal hysterectomy. Vaginal hysterectomy was carried out with mesh insertion as a concomitant procedure in seven cases (31.8%). All patients were seen at six weeks and six months after the surgery. Complications comprised mesh erosion in one patient and suture material protruding into the vagina in one patient; the Prolift operation failed in one patient. The remaining twenty-one patients were cured, giving a 95.4% success rate.
The use of prolene mesh in pelvic reconstructive surgery was associated with good outcome and minimal complications in this study.
Prolift; Mesh; Urogenital prolapse
We describe a recombinant inbred line (RIL) population of 161 F5 genotypes for the widest euploid cross that can be made to cultivated sorghum (Sorghum bicolor) using conventional techniques, S. bicolor × Sorghum propinquum, that segregates for many traits related to plant architecture, growth and development, reproduction, and life history. The genetic map of the S. bicolor × S. propinquum RILs contains 141 loci on 10 linkage groups collectively spanning 773.1 cM. Although the genetic map has DNA marker density well-suited to quantitative trait loci mapping and samples most of the genome, our previous observation that sorghum pericentromeric heterochromatin is recalcitrant to recombination is highlighted by the finding that the vast majority of recombination in sorghum is concentrated in small regions of euchromatin that are distal on most chromosomes. The advancement of the RIL population in an environment to which the S. bicolor parent was well adapted (indeed bred for), but the S. propinquum parent was not, largely eliminated an allele for short-day flowering that confounded many other traits, for example, permitting us to map new quantitative trait loci for flowering that previously eluded detection. Additional recombination that has accrued in the development of this RIL population also may have improved resolution of apices of heterozygote excess, accounting for their greater abundance in the F5 than the F2 generation. The S. bicolor × S. propinquum RIL population offers advantages over early-generation populations that will shed new light on genetic, environmental, and physiological/biochemical factors that regulate plant growth and development.
quantitative trait locus; simple-sequence repeat; DNA marker; recombination; segregation distortion
Genome duplication (GD) has permanently shaped the architecture and function of many higher eukaryotic genomes. The angiosperms (flowering plants) are outstanding models in which to elucidate consequences of GD for higher eukaryotes, owing to their propensity for chromosomal duplication or even triplication in a few cases. Duplicated genome structures often require both intra- and inter-genome alignments to unravel their evolutionary history, also providing the means to deduce both obvious and otherwise-cryptic orthology, paralogy and other relationships among genes. The burgeoning sets of angiosperm genome sequences provide the foundation for a host of investigations into the functional and evolutionary consequences of gene and genome duplication. To provide genome alignments from a single resource based on uniform standards that have been validated by empirical studies, we built the Plant Genome Duplication Database (PGDD; freely available at http://chibba.agtec.uga.edu/duplication/), a web service providing synteny information in terms of colinearity between chromosomes. At present, PGDD contains data for 26 plants including bryophytes and chlorophyta, as well as angiosperms with draft genome sequences. In addition to the inclusion of new genomes as they become available, we are preparing new functions to enhance PGDD.
Percent mammographic breast density (PMD) is a strong heritable risk factor for breast cancer. However, the pathways through which this risk is mediated are still unclear. To explore whether PMD and breast cancer have a shared genetic basis, we identified genetic variants most strongly associated with PMD in a published meta-analysis of five genome-wide association studies (GWAS) and used these to construct risk scores for 3628 breast cancer cases and 5190 controls from the UK2 GWAS of breast cancer. The signed per-allele effect estimates of SNPs were multiplied by the respective allele counts in the individual and summed over all SNPs to derive the risk score for an individual. These scores were included as the exposure variable in a logistic regression model with breast cancer case-control status as the outcome. This analysis was repeated using ten different cut-offs for the most significant density SNPs (1-10% representing 5,222-50,899 SNPs). Permutation analysis was also performed across all 10 cut-offs. The association between risk score and breast cancer was significant for all cut-offs from 3-10% of top density SNPs, being most significant for the 6% (2-sided P = 0.002) to 10% (P = 0.001) cut-offs (overall permutation P = 0.003). Women in the top 10% of the risk score distribution had a 31% increased risk of breast cancer [OR = 1.31 (95% CI 1.08-1.59)] compared to women in the bottom 10%. Together, our results demonstrate that PMD and breast cancer have a shared genetic basis that is mediated through a large number of common variants.
breast cancer; mammographic density; SNPs; polygenic; Mendelian Randomisation
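The risk score construction described in the abstract above (signed per-allele effect estimates multiplied by allele counts and summed, after keeping the most significant fraction of SNPs) can be sketched as follows. The SNP names, effect sizes, p-values, and the helper names `top_fraction` and `risk_score` are hypothetical and serve only to illustrate the arithmetic.

```python
def top_fraction(pvalues, fraction):
    """Return the set of SNPs in the most significant `fraction` by p-value."""
    k = max(1, round(len(pvalues) * fraction))
    ranked = sorted(pvalues, key=pvalues.get)  # smallest p-value first
    return set(ranked[:k])


def risk_score(allele_counts, effects, selected):
    """Sum of effect estimate x allele count over the selected SNPs."""
    return sum(effects[snp] * allele_counts.get(snp, 0) for snp in selected)


# Toy density GWAS results, invented for illustration only.
effects = {"snpA": 0.10, "snpB": -0.05, "snpC": 0.02, "snpD": 0.30}
pvalues = {"snpA": 0.001, "snpB": 0.004, "snpC": 0.20, "snpD": 0.60}

# One individual's allele counts (0, 1, or 2 copies of the effect allele).
individual = {"snpA": 2, "snpB": 1, "snpC": 0, "snpD": 2}

selected = top_fraction(pvalues, 0.5)               # keep the top 50% of SNPs
score = risk_score(individual, effects, selected)   # 0.10*2 + (-0.05)*1
```

Because the effects are signed, alleles associated with lower density pull the score down. The resulting scores would then enter a logistic regression as the exposure, which is not shown here.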
Background & Aims
RAC1 is a GTPase that has an evolutionarily conserved role in coordinating immune defenses, from plants to mammals. Chronic inflammatory bowel diseases (IBD) are associated with dysregulation of immune defenses. We studied the role of RAC1 in IBD using human genetic and functional studies and animal models of colitis.
We used a candidate gene approach to HapMap-Tag single nucleotide polymorphisms (SNPs) in a discovery cohort; findings were confirmed in 2 additional cohorts. RAC1 mRNA expression was examined from peripheral blood cells of patients. Colitis was induced in mice with conditional disruption of Rac1 in phagocytes by administration of dextran sulphate sodium (DSS).
We observed a genetic association of RAC1 with ulcerative colitis (UC) in a discovery cohort, 2 independent replication cohorts, and in combined analysis for the SNPs rs10951982 (Pcombined UC = 3.3 × 10⁻⁸, odds ratio [OR] = 1.43 [1.26–1.63]) and rs4720672 (Pcombined UC = 4.7 × 10⁻⁶, OR = 1.36 [1.19–1.58]). Patients with IBD who had the rs10951982 risk allele had increased expression of RAC1, compared to those without this allele. Conditional disruption of Rac1 in macrophages and neutrophils of mice protected them against DSS-induced colitis.
Studies of human tissue samples and knockout mice demonstrated a role for the GTPase RAC1 in the development of UC; increased expression of RAC1 was associated with susceptibility to colitis.
innate immunity; Crohn's disease; CD; Rac-1 knockout
Papaya is a major fruit crop in tropical and subtropical regions worldwide. It is trioecious with three sex forms: male, female, and hermaphrodite. Sex determination is controlled by a pair of nascent sex chromosomes with two slightly different Y chromosomes, Y for male and Yh for hermaphrodite. The sex chromosome genotypes are XY (male), XYh (hermaphrodite), and XX (female). The papaya hermaphrodite-specific Yh chromosome region (HSY) is pericentromeric and heterochromatic. Physical mapping of HSY and its X counterpart is essential for sequencing these regions, uncovering the early events of sex chromosome evolution, and identifying the sex determination genes for crop improvement.
A reiterate chromosome walking strategy was applied to construct the two physical maps with three bacterial artificial chromosome (BAC) libraries. The HSY physical map consists of 68 overlapped BACs on the minimum tiling path, and covers all four HSY-specific Knobs. One gap remained in the region of Knob 1, the only knob structure shared between HSY and X, due to the lack of HSY-specific sequences. This gap was filled on the physical map of the HSY corresponding region in the X chromosome. The X physical map consists of 44 BACs on the minimum tiling path with one gap remaining in the middle, due to the nature of highly repetitive sequences. This gap was filled on the HSY physical map. The borders of the non-recombining HSY were defined genetically by fine mapping using 1460 F2 individuals. The genetically defined HSY spanned approximately 8.5 Mb, whereas its X counterpart extended about 5.4 Mb including a 900 Kb region containing the Knob 1 shared by the HSY and X. The 8.5 Mb HSY corresponds to 4.5 Mb of its X counterpart, showing 4 Mb (89%) DNA sequence expansion.
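The expansion figure quoted above follows directly from the two region sizes. A short arithmetic check, using only the values stated in the text (the variable names are illustrative):

```python
# Sizes in Mb, taken from the text.
hsy_mb = 8.5
x_counterpart_mb = 4.5

expansion_mb = hsy_mb - x_counterpart_mb
expansion_pct = 100 * expansion_mb / x_counterpart_mb

print(f"{expansion_mb:.1f} Mb expansion ({expansion_pct:.0f}%)")
```

The percentage is expressed relative to the X counterpart, which is why a 4 Mb difference on a 4.5 Mb baseline rounds to 89%.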
The 89% increase of DNA sequence in HSY indicates rapid expansion of the Yh chromosome after genetic recombination was suppressed 2–3 million years ago. The genetically defined borders coincide with the common BACs on the minimum tiling paths of HSY and X. The minimum tiling paths of HSY and its X counterpart are being used for sequencing these X and Yh-specific regions.
Bacterial artificial chromosome (BAC); Carica papaya; Sex chromosomes; Sex determination; Suppression of recombination
Photoperiod-sensitive flowering is a key adaptive trait for sorghum (Sorghum bicolor) in West and Central Africa. In this study we performed an association analysis to investigate the effect of polymorphisms within the genes putatively related to variation in flowering time on photoperiod-sensitive flowering in sorghum. For this purpose a genetically characterized panel of 219 sorghum accessions from West and Central Africa was evaluated for their photoperiod response index (PRI) based on two sowing dates under field conditions.
Sorghum accessions used in our study were genotyped for single nucleotide polymorphisms (SNPs) in six genes putatively involved in the photoperiodic control of flowering time. Applying a mixed model approach and previously determined population structure parameters to these candidate genes, we found significant associations of several SNPs with PRI for the genes CRYPTOCHROME 1 (CRY1-b1) and GIGANTEA (GI).
The negative values of Tajima's D, found for the genes of our study, suggested that purifying selection has acted on genes involved in photoperiodic control of flowering time in sorghum. The SNP markers of our study that showed significant associations with PRI can be used to create functional markers to serve as important tools for marker-assisted selection of photoperiod-sensitive cultivars in sorghum.
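Tajima's D compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The sketch below computes it from aligned sequences using the standard formula; the toy alignments are invented and the function name is an assumption. It is meant to show why an excess of rare variants gives a negative value, not to reproduce the study's analysis.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for a list of equal-length aligned sequences (n >= 4)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(sequences[0])
    # S: number of segregating (polymorphic) sites.
    S = sum(1 for i in range(length) if len({s[i] for s in sequences}) > 1)
    if S == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Toy alignments, invented for illustration only.
rare_variants = ["TAAA", "ATAA", "AATA", "AAAT", "AAAA"]  # every variant is a singleton
balanced = ["AA", "AA", "TT", "TT"]                      # intermediate-frequency variants
d_rare = tajimas_d(rare_variants)
d_balanced = tajimas_d(balanced)
```

In the first toy alignment every polymorphism is carried by a single sequence, the pattern that purifying selection tends to leave, and D is negative. In the second, variants sit at intermediate frequency and D is positive.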
MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses. Applications of MCScanX to several sequenced plant genomes and gene families are shown as examples. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families. The source code and documentation of MCScanX are freely available at http://chibba.pgml.uga.edu/mcscan2/.
We summarize the contributions of Group 9 of Genetic Analysis Workshop 17. This group addressed the problems of linkage disequilibrium and other longer range forms of allelic association when evaluating the effects of genotypes on phenotypes. Issues raised by long-range associations, whether a result of selection, stratification, possible technical errors, or chance, were less expected but proved to be important. Most contributors focused on regression methods of various types to illustrate problematic issues or to develop adaptations for dealing with high-density genotype assays. Study design was also considered, as was graphical modeling. Although no method emerged as uniformly successful, most succeeded in reducing false-positive results either by considering clusters of loci within genes or by applying smoothing metrics that required results from adjacent loci to be similar. Two unexpected results that questioned our assumptions of what is required to model linkage disequilibrium were observed. The first was that correlations between loci separated by large genetic distances can greatly inflate single-locus test statistics, and, whether the result of selection, stratification, possible technical errors, or chance, these correlations seem overabundant. The second unexpected result was that applying principal components analysis to genome-wide genotype data can apparently control not only for population structure but also for linkage disequilibrium.
score tests; two-stage study designs; robust regression; higher criticism; principal components analysis; graphical modeling
Fossil records indicate that life appeared in marine environments ∼3.5 billion years ago (Gyr) and transitioned to terrestrial ecosystems nearly 2.5 Gyr. Sequence analysis suggests that “hydrobacteria” and “terrabacteria” might have diverged as early as 3 Gyr. Bacteria of the genus Azospirillum are associated with roots of terrestrial plants; however, virtually all their close relatives are aquatic. We obtained genome sequences of two Azospirillum species and analyzed their gene origins. While most Azospirillum house-keeping genes have orthologs in their close aquatic relatives, this lineage has obtained nearly half of its genome from terrestrial organisms. The majority of genes encoding functions critical for association with plants are among horizontally transferred genes. Our results show that transition of some aquatic bacteria to terrestrial habitats occurred much later than the suggested initial divergence of hydro- and terrabacterial clades. The birth of the genus Azospirillum approximately coincided with the emergence of vascular plants on land.
Genome sequencing and analysis of plant-associated beneficial soil bacteria Azospirillum spp. reveals that these organisms transitioned from aquatic to terrestrial environments significantly later than the suggested major Precambrian divergence of aquatic and terrestrial bacteria. Separation of Azospirillum from their close aquatic relatives coincided with the emergence of vascular plants on land. Nearly half of the Azospirillum genome has been acquired horizontally, from distantly related terrestrial bacteria. The majority of horizontally acquired genes encode functions that are critical for adaptation to the rhizosphere and interaction with host plants.
Mammographic breast density is a highly heritable (h² > 0.6) and strong risk factor for breast cancer. We conducted a genome-wide linkage study to identify loci influencing mammographic breast density (MD).
Epidemiological data were assembled on 1,415 families from the Australia, Northern California and Ontario sites of the Breast Cancer Family Registry, and additional families recruited in Australia and Ontario. Families consisted of sister pairs with age-matched mammograms and data on factors known to influence MD. Single nucleotide polymorphism (SNP) genotyping was performed on 3,952 individuals using the Illumina Infinium 6K linkage panel.
Using a variance components method, genome-wide linkage analysis was performed using quantitative traits obtained by adjusting MD measurements for known covariates. Our primary trait was formed by fitting a linear model to the square root of the percentage of the breast area that was dense (PMD), adjusting for age at mammogram, number of live births, menopausal status, weight, height, weight squared, and menopausal hormone therapy. The maximum logarithm of odds (LOD) score from the genome-wide scan was on chromosome 7p14.1-p13 (LOD = 2.69; 63.5 cM) for covariate-adjusted PMD, with a 1-LOD interval spanning 8.6 cM. A similar signal was seen for the covariate-adjusted area of the breast that was dense (DA) phenotype. Simulations showed that the complete sample had adequate power to detect LOD scores of 3 or 3.5 for a locus accounting for 20% of phenotypic variance. A modest peak initially seen on chromosome 7q32.3-q34 increased in strength when only the 513 families with at least two sisters below 50 years of age were included in the analysis (LOD 3.2; 140.7 cM, 1-LOD interval spanning 9.6 cM). In a subgroup analysis, we also found a LOD score of 3.3 for DA phenotype on chromosome 12p11.22-q13.11 (60.8 cM, 1-LOD interval spanning 9.3 cM), overlapping a region identified in a previous study.
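A LOD score is the base-10 logarithm of the likelihood ratio between the linkage model and the null model, and a 1-LOD interval is the region around a peak where the LOD stays within one unit of its maximum. The sketch below illustrates both. The scan values are invented (only the 2.69 peak at 63.5 cM echoes the text), the helper names are assumptions, and real analyses typically interpolate between map positions rather than using grid points.

```python
import math


def lod_from_loglik(loglik_linkage, loglik_null):
    """LOD score from natural-log likelihoods of the two models."""
    return (loglik_linkage - loglik_null) / math.log(10)


def one_lod_interval(positions_cm, lods):
    """Contiguous grid positions around the peak with LOD >= max LOD - 1."""
    peak = max(range(len(lods)), key=lods.__getitem__)
    threshold = lods[peak] - 1.0
    left = right = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions_cm[left], positions_cm[right]


# Toy scan, invented for illustration only.
positions = [60.0, 62.0, 63.5, 66.0, 70.0]
lods = [1.2, 1.9, 2.69, 1.8, 0.9]
interval = one_lod_interval(positions, lods)

# A likelihood ratio of 1000 in favour of linkage corresponds to LOD 3.
lod_example = lod_from_loglik(-100.0, -100.0 - math.log(1000))
```

The width of the 1-LOD interval gives a rough sense of how precisely the scan localizes a signal, which is why the abstract reports it alongside each peak.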
The suggestive peaks and the larger linkage signal seen in the subset of pedigrees with younger participants highlight regions of interest for further study to identify genes that determine MD, with the goal of understanding mammographic density and its involvement in susceptibility to breast cancer.
Both single gene and whole genome duplications (WGD) have recurred in angiosperm evolution. However, the evolutionary effects of different modes of gene duplication, especially regarding their contributions to genetic novelty or redundancy, have been inadequately explored.
In Arabidopsis thaliana and Oryza sativa (rice), species that deeply sample botanical diversity and for which expression data are available from a wide range of tissues and physiological conditions, we have compared expression divergence between genes duplicated by six different mechanisms (WGD, tandem, proximal, DNA based transposed, retrotransposed and dispersed), and between positional orthologs. Both neo-functionalization and genetic redundancy appear to contribute to retention of duplicate genes. Genes resulting from WGD and tandem duplications diverge slowest in both coding sequences and gene expression, and contribute most to genetic redundancy, while other duplication modes contribute more to evolutionary novelty. WGD duplicates may more frequently be retained due to dosage amplification, while inferred transposon mediated gene duplications tend to reduce gene expression levels. The extent of expression divergence between duplicates is discernibly related to duplication modes, different WGD events, amino acid divergence, and putatively neutral divergence (time), but the contribution of each factor is heterogeneous among duplication modes. Gene loss may retard inter-species expression divergence. Members of different gene families may have non-random patterns of origin that are similar in Arabidopsis and rice, suggesting the action of pan-taxon principles of molecular evolution.
Gene duplication modes differ in contribution to genetic novelty and redundancy, but show some parallels in taxa separated by hundreds of millions of years of evolution.
We describe the construction of a BAC contig and identification of a minimal tiling path that encompass the dominant and monogenically inherited downy mildew resistance locus Pp523 of Brassica oleracea L. The selection of BAC clones for construction of the physical map was carried out by screening gridded BAC libraries with DNA overgo probes derived from both genetically mapped DNA markers flanking the locus of interest and BAC-end sequences that align to Arabidopsis thaliana sequences within the previously identified syntenic region. The selected BAC clones consistently mapped to three different genomic regions of B. oleracea. Although 83 BAC clones were accurately mapped within a ∼4.6 cM region surrounding the downy mildew resistance locus Pp523, a subset of 33 BAC clones mapped to another region on chromosome C8 that was ∼60 cM away from the resistance gene, and a subset of 63 BAC clones mapped to chromosome C5. These results reflect the triplication of the Brassica genomes since their divergence from a common ancestor shared with A. thaliana, and they are consonant with recent analyses of the C genome of Brassica napus. The assembly of a minimal tiling path constituted by 13 (BoT01) BAC clones that span the Pp523 locus sets the stage for map-based cloning of this resistance gene.
genetic resistance; plant disease resistance; map-based cloning; BAC contig; genome triplication
Genetic Analysis Workshop 17 (GAW17) provided a platform for evaluating existing statistical genetic methods and for developing novel methods to analyze rare variants that modulate complex traits. In this article, we present an overview of the 1000 Genomes Project exome data and simulated phenotype data that were distributed to GAW17 participants for analyses, the different issues addressed by the participants, and the process of preparation of manuscripts resulting from the discussions during the workshop.
Pathway-based analysis has been recently used in joint tests of association between disease and a group of common genetic variants. Here we explore this idea for the joint effects analysis of rare genetic variants and their association with quantitative traits and disease. We accumulate multiple rare minor alleles in a genetic risk score for each individual in a given pathway; this score is then used to assess association with quantitative phenotypes and disease. We demonstrate that this approach may be better than studying single rare variants or a gene risk score for identifying individuals with significantly greater risk.
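The pathway-level score described above can be illustrated by counting, for each individual, the rare minor alleles carried across all variants assigned to a pathway. The minor allele frequency threshold, variant identifiers, and function name below are assumptions for the example; the abstract does not specify them.

```python
def pathway_rare_allele_score(genotypes, pathway_variants, maf, maf_threshold=0.01):
    """Count rare minor alleles per individual across one pathway.

    genotypes: dict of individual -> dict of variant -> minor allele count (0, 1, 2).
    pathway_variants: iterable of variant IDs belonging to the pathway.
    maf: dict of variant -> minor allele frequency.
    maf_threshold: cut-off defining "rare" (an assumed value).
    """
    rare = [v for v in pathway_variants if maf.get(v, 1.0) < maf_threshold]
    return {person: sum(calls.get(v, 0) for v in rare)
            for person, calls in genotypes.items()}


# Toy data, invented for illustration only.
maf = {"v1": 0.002, "v2": 0.008, "v3": 0.25}
pathway = ["v1", "v2", "v3"]
genotypes = {
    "ind1": {"v1": 1, "v2": 1, "v3": 2},
    "ind2": {"v3": 1},
}
scores = pathway_rare_allele_score(genotypes, pathway, maf)
```

The common variant `v3` is excluded, so only rare alleles contribute. The per-individual score can then be tested for association with a quantitative trait or disease status in a regression model.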
Evolution of the Brassica species has been recursively affected by polyploidy events, and comparison to their relative, Arabidopsis thaliana, provides means to explore their genomic complexity.
A genome-wide physical map of a rapid-cycling strain of B. oleracea was constructed by integrating high-information-content fingerprinting (HICF) of Bacterial Artificial Chromosome (BAC) clones with hybridization to sequence-tagged probes. Using 2907 contigs of two or more BACs, we performed several lines of comparative genomic analysis. Interspecific DNA synteny is much better preserved in euchromatin than heterochromatin, showing the qualitative difference in evolution of these respective genomic domains. About 67% of contigs can be aligned to the Arabidopsis genome, with 96.5% corresponding to euchromatic regions, and 3.5% (shown to contain repetitive sequences) to pericentromeric regions. Overgo probe hybridization data showed that contigs aligned to Arabidopsis euchromatin contain ~80% of low-copy-number genes, while genes with high copy number are much more frequently associated with pericentromeric regions. We identified 39 interchromosomal breakpoints during the diversification of B. oleracea and Arabidopsis thaliana, a relatively high level of genomic change since their divergence. Comparison of the B. oleracea physical map with Arabidopsis and other available eudicot genomes showed appreciable 'shadowing' produced by more ancient polyploidies, resulting in a web of relatedness among contigs which increased genomic complexity.
A high-resolution genetically-anchored physical map sheds light on Brassica genome organization and advances positional cloning of specific genes, and may help to validate genome sequence assembly and alignment to chromosomes.
All the physical mapping data are freely shared at a WebFPC site (http://lulu.pgml.uga.edu/fpc/WebAGCoL/brassica/WebFPC/; temporarily password-protected: account: pgml; password: 123qwe123).
Comparative genomics; polyploidy; Arabidopsis thaliana
Alternative splicing (AS) of pre-mRNA is a fundamental molecular process that generates diversity in the transcriptome and proteome of eukaryotic organisms. SR proteins, a family of splicing regulators with one or two RNA recognition motifs (RRMs) at the N-terminus and an arg/ser-rich domain at the C-terminus, function in both constitutive and alternative splicing. We identified SR proteins in 27 eukaryotic species, which include plants, animals, fungi and “basal” eukaryotes that lie outside of these lineages. Using RNA recognition motifs (RRMs) as a phylogenetic marker, we classified 272 SR genes into robust sub-families. The SR gene family can be split into five major groupings, which can be further separated into 11 distinct sub-families. Most flowering plants have double or nearly double the number of SR genes found in vertebrates. The majority of plant SR genes are under purifying selection. Moreover, in all paralogous SR genes in Arabidopsis, rice, soybean and maize, one of the two paralogs is preferentially expressed throughout plant development. We also assessed the extent of AS in SR genes based on a splice graph approach (http://combi.cs.colostate.edu/as/gmap_SRgenes). AS of SR genes is a widespread phenomenon throughout multiple lineages, with alternative 3′ or 5′ splicing events being the most prominent type of event. However, plant-enriched sub-families have 57%–88% of their SR genes experiencing some type of AS compared to the 40%–54% seen in other sub-families. The SR gene family is pervasive throughout multiple eukaryotic lineages, conserved in sequence and domain organization, but differs in gene number across lineages with an abundance of SR genes in flowering plants. The higher number of alternatively spliced SR genes in plants emphasizes the importance of AS in generating splice variants in these organisms.
High percent mammographic density adjusted for age and body mass index (BMI) is one of the strongest risk factors for breast cancer. We conducted a meta-analysis of five genome-wide association studies of percent mammographic density and report an association with rs10995190 in ZNF365 (combined P = 9.6 × 10−10). This finding might partly explain the underlying biology of the recently discovered association between common variants in ZNF365 and breast cancer risk.
Recombination in the family Coronaviridae has been well documented and is thought to be a contributing factor in the emergence and evolution of different coronaviral genotypes as well as different species of coronavirus. However, there are limited data available on the frequency and extent of recombination in coronaviruses in nature, particularly for the avian gamma-coronaviruses, where only recently the emergence of a turkey coronavirus has been attributed solely to recombination. In this study, the full-length genomes of eight avian gamma-coronavirus infectious bronchitis virus (IBV) isolates were sequenced and, along with other full-length IBV genomes available from GenBank, were analyzed for recombination. Evidence of recombination was found in every sequence analyzed and was distributed throughout the entire genome. Areas with the highest occurrence of recombination are located in regions of the genome that code for nonstructural proteins 2, 3 and 16, and the structural spike glycoprotein. The extent of the recombination observed suggests that this may be one of the principal mechanisms for generating genetic and antigenic diversity within IBV. These data indicate that reticulate evolutionary change due to recombination in IBV likely plays a major role in the origin and adaptation of the virus, leading to new genetic types and strains of the virus.
gamma coronavirus; avian coronavirus; infectious bronchitis virus; genome; recombination