Whole genome sequencing has allowed rapid progress in the application of forward genetics in model species. In this study, we demonstrated an application of next-generation sequencing for forward genetics in a complex crop genome. We sequenced an ethyl methanesulfonate-induced mutant of Sorghum bicolor defective in hydrogen cyanide release and identified the causal mutation. A workflow identified the causal polymorphism relative to the reference BTx623 genome by integrating data from single nucleotide polymorphism identification, prior information about candidate gene(s) implicated in cyanogenesis, mutation spectra, and polymorphisms likely to cause phenotypic changes. A point mutation resulting in a premature stop codon in the coding sequence of dhurrinase2, which encodes a protein involved in the dhurrin catabolic pathway, was responsible for the acyanogenic phenotype. Cyanogenic glucosides do not themselves release cyanide, but their cyanohydrin derivatives do. The mutant accumulated the glucoside, dhurrin, but failed to efficiently release cyanide upon tissue disruption. Thus, we tested the effects of cyanide release on insect herbivory in a genetic background in which accumulation of cyanogenic glucoside is unchanged. Insect preference choice experiments and herbivory measurements demonstrate a deterrent effect of cyanide release capacity, even in the presence of wild-type levels of cyanogenic glucoside accumulation. Our gene cloning method substantiates the value of (1) a sequenced genome, (2) a strongly penetrant and easily measurable phenotype, and (3) a workflow to pinpoint a causal mutation in crop genomes and accelerate the discovery of gene function in the postgenomic era.
Bioinformatics; Illumina; mutation discovery; forward genetics
The recent dramatic cost reduction of next-generation sequencing technology enables investigators to assess most variants in the human genome to identify risk variants for complex diseases. However, sequencing large samples remains very expensive. For a study sample with existing genotype data, such as array data from genome-wide association studies, a cost-effective approach is to sequence a subset of the study sample and then to impute the rest of the study sample, using the sequenced subset as a reference panel. The use of such an internal reference panel identifies population-specific variants and avoids the problem of a substantial mismatch in ancestry background between the study population and the reference population. To efficiently select an internal panel, we introduce an idea of phylogenetic diversity from mathematical phylogenetics and comparative genomics. We propose the “most diverse reference panel”, defined as the subset with the maximal “phylogenetic diversity”, thereby incorporating individuals that span a diverse range of genotypes within the sample. Using data both from simulations and from the 1000 Genomes Project, we show that the most diverse reference panel can substantially improve the imputation accuracy compared to randomly selected reference panels, especially for the imputation of rare variants. The improvement in imputation accuracy holds across different marker densities, reference panel sizes, and lengths for the imputed segments. We thus propose a novel strategy for planning sequencing studies on samples with existing genotype data.
coalescent; imputation; phylogenetic diversity; sequencing; study design
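The idea of choosing a panel that spans a diverse range of genotypes can be illustrated with a small Python sketch. This is not the phylogenetic-diversity criterion proposed above: as a stand-in, it assumes genotypes are encoded as equal-length strings, measures dissimilarity by Hamming distance, and grows the panel by greedy farthest-point selection. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of sites at which two equal-length genotype strings differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_diverse_panel(genotypes, k):
    """Pick k individuals that are spread out in genotype space.

    Assumption: a distance-based greedy proxy for diversity, not the
    tree-based phylogenetic diversity used in the study.
    """
    names = sorted(genotypes)
    if not 2 <= k <= len(names):
        raise ValueError("k must be between 2 and the number of individuals")
    # Seed the panel with the most distant pair of individuals.
    seed = max(
        combinations(names, 2),
        key=lambda pair: hamming(genotypes[pair[0]], genotypes[pair[1]]),
    )
    panel = list(seed)
    # Repeatedly add the individual farthest from its nearest panel member.
    while len(panel) < k:
        remaining = [n for n in names if n not in panel]
        farthest = max(
            remaining,
            key=lambda n: min(hamming(genotypes[n], genotypes[m]) for m in panel),
        )
        panel.append(farthest)
    return panel


# Hypothetical toy sample: two tight clusters plus one intermediate individual.
toy_genotypes = {
    "s1": "000000",
    "s2": "000001",
    "s3": "111111",
    "s4": "111110",
    "s5": "000111",
}
print(greedy_diverse_panel(toy_genotypes, 3))  # ['s1', 's3', 's5']
```

In this toy sample the selection takes one individual from each cluster and then the intermediate one, rather than two near-identical individuals, which is the behavior a random panel cannot guarantee.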
Oxidative damage to DNA constitutes a major threat to the faithful replication of DNA in all organisms and it is therefore important to understand the various mechanisms that are responsible for repair of such damage and the consequences of unrepaired damage. In these experiments, we make use of a reporter system in Saccharomyces cerevisiae that can measure the specific increase of each type of base pair mutation by measuring reversion to a Trp+ phenotype. We demonstrate that increased oxidative damage due to the absence of the superoxide dismutase gene, SOD1, increases all types of base pair mutations and that mismatch repair (MMR) reduces some, but not all, types of mutations. By analyzing various strains that can revert only via a specific CG → AT transversion in backgrounds deficient in Ogg1 (encoding an 8-oxoG glycosylase), we can study mutagenesis due to a known 8-oxoG base. We show as expected that MMR helps prevent mutagenesis due to this damaged base and that Pol η is important for its accurate replication. In addition we find that its accurate replication is facilitated by template switching, as loss of either RAD5 or MMS2 leads to a significant decrease in accurate replication. We observe that these ogg1 strains accumulate revertants during prolonged incubation on plates, in a process most likely due to retromutagenesis.
oxidative damage; mismatch repair; translesion synthesis; template switching; retromutagenesis
From population genetics theory, elevating the mutation rate of a large population should progressively reduce average fitness. If the fitness decline is large enough, the population will go extinct in a process known as lethal mutagenesis. Lethal mutagenesis has been endorsed in the virology literature as a promising approach to viral treatment, and several in vitro studies have forced viral extinction with high doses of mutagenic drugs. Yet only one empirical study has tested the genetic models underlying lethal mutagenesis, and the theory failed on even a qualitative level. Here we provide a new level of analysis of lethal mutagenesis by developing and evaluating models specifically tailored to empirical systems that may be used to test the theory. We first quantify a bias in the estimation of a critical parameter and consider whether that bias underlies the previously observed lack of concordance between theory and experiment. We then consider a seemingly ideal protocol that avoids this bias (mutagenesis of virions) but find that it is hampered by other problems. Finally, we derive results that reveal difficulties in the mere interpretation of mutations assayed from double-strand genomes. Our analyses expose unanticipated complexities in testing the theory. Nevertheless, the previous failure of the theory to predict experimental outcomes appears to reside in evolutionary mechanisms neglected by the theory (e.g., beneficial mutations) rather than in a mismatch between the empirical setup and model assumptions. This interpretation raises the specter that naive attempts at lethal mutagenesis may augment adaptation rather than retard it.
evolution; extinction; fitness; virus; theory; mutation
Dynamic regulation of chromosome structure and organization is critical for fundamental cellular processes such as gene expression and chromosome segregation. Condensins are conserved chromosome-associated proteins that regulate a variety of chromosome dynamics, including axial shortening, lateral compaction, and homolog pairing. However, how the in vivo activities of condensins are regulated and how functional interactors target condensins to chromatin are not well understood. To better understand how Drosophila melanogaster condensin is regulated, we performed a yeast two-hybrid screen and identified the chromo-barrel domain protein Mrg15 as an interactor of the Cap-H2 condensin subunit. Genetic interactions demonstrate that Mrg15 function is required for Cap-H2-mediated unpairing of polytene chromosomes in ovarian nurse cells and salivary gland cells. In diploid tissues, transvection assays demonstrate that Mrg15 inhibits transvection at Ubx and cooperates with Cap-H2 to antagonize transvection at yellow. In cultured cells, we show that levels of chromatin-bound Cap-H2 protein are partially dependent on Mrg15 and that Cap-H2-mediated homolog unpairing is suppressed by RNA interference depletion of Mrg15. Thus, maintenance of interphase chromosome compaction and homolog pairing status requires both Mrg15 and Cap-H2. We propose a model in which the Mrg15 and Cap-H2 protein–protein interaction may serve to recruit Cap-H2 to chromatin and facilitate compaction of interphase chromatin.
condensin; homolog pairing; Mrg15; chromosome structure; transvection
Whole-genome sequencing, particularly in fungi, has progressed at a tremendous rate. More difficult, however, is experimental testing of the inferences about gene function that can be drawn from comparative sequence analysis alone. We present a genome-wide functional characterization of a sequenced but experimentally understudied budding yeast, Saccharomyces bayanus var. uvarum (henceforth referred to as S. bayanus), allowing us to map changes over the 20 million years that separate this organism from S. cerevisiae. We first created a suite of genetic tools to facilitate work in S. bayanus. Next, we measured the gene-expression response of S. bayanus to a diverse set of perturbations optimized using a computational approach to cover a diverse array of functionally relevant biological responses. The resulting data set reveals that gene-expression patterns are largely conserved, but significant changes may exist in regulatory networks such as carbohydrate utilization and meiosis. In addition to regulatory changes, our approach identified gene functions that have diverged. The functions of genes in core pathways are highly conserved, but we observed many changes in which genes are involved in osmotic stress, peroxisome biogenesis, and autophagy. A surprising number of genes specific to S. bayanus respond to oxidative stress, suggesting the organism may have evolved under different selection pressures than S. cerevisiae. This work expands the scope of genome-scale evolutionary studies from sequence-based analysis to rapid experimental characterization and could be adopted for functional mapping in any lineage of interest. Furthermore, our detailed characterization of S. bayanus provides a valuable resource for comparative functional genomics studies in yeast.
comparative genomics; yeast; gene expression
Kinesin-based transport is important for synaptogenesis, neuroplasticity, and maintaining synaptic function. In an anatomical screen of neurodevelopmental mutants, we identified the exchange of a conserved residue (R561H) in the forkhead-associated domain of the kinesin-3 family member Unc-104/KIF1A as the genetic cause for defects in synaptic terminal- and dendrite morphogenesis. Previous structure-based analysis suggested that the corresponding residue in KIF1A might be involved in stabilizing the activated state of kinesin-3 dimers. Herein we provide the first in vivo evidence for the functional importance of R561. The R561H allele (unc-104bris) is not embryonic lethal, which allowed us to investigate consequences of disturbed Unc-104 function on postembryonic synapse development and larval behavior. We demonstrate that Unc-104 regulates the reliable apposition of active zones and postsynaptic densities, possibly by controlling site-specific delivery of its cargo. Next, we identified a role for Unc-104 in restraining neuromuscular junction growth and coordinating dendrite branch morphogenesis, suggesting that Unc-104 is also involved in dendritic transport. Mutations in KIF1A/unc-104 have been associated with hereditary spastic paraplegia and hereditary sensory and autonomic neuropathy type 2. However, we did not observe synapse retraction or dystonic posterior paralysis. Overall, our study demonstrates the specificity of defects caused by selective impairments of distinct molecular motors and highlights the critical importance of Unc-104 for the maturation of neuronal structures during embryonic development, larval synaptic terminal outgrowth, and dendrite morphogenesis.
synapse; kinesin-3; FHA domain; hereditary spastic paraplegia
In this commentary, Rob Kulathinal describes two articles from the Perrimon lab, each describing a new online resource that can assist geneticists with the design of their RNA interference (RNAi) experiments. Hu et al.’s “UP-TORR: online tool for accurate and up-to-date annotation of RNAi reagents” and “FlyPrimerBank: An online database for Drosophila melanogaster gene expression analysis and knockdown evaluation of RNAi reagents” are published, respectively, in this month’s issues of GENETICS and G3.
Updated Targets of RNAi Reagents (UP-TORR); FlyPrimerBank; model organism database; RNAi; online resource; genetic map; genomic roadmap
The term “transcriptional network” refers to the mechanism(s) that underlies coordinated expression of genes, typically involving transcription factors (TFs) binding to the promoters of multiple genes, and individual genes controlled by multiple TFs. A multitude of studies in the last two decades have aimed to map and characterize transcriptional networks in the yeast Saccharomyces cerevisiae. We review the methodologies and accomplishments of these studies, as well as challenges we now face. For most yeast TFs, data have been collected on their sequence preferences, in vivo promoter occupancy, and gene expression profiles in deletion mutants. These systematic studies have led to the identification of new regulators of numerous cellular functions and shed light on the overall organization of yeast gene regulation. However, many yeast TFs appear to be inactive under standard laboratory growth conditions, and many of the available data were collected using techniques that have since been improved. Perhaps as a consequence, comprehensive and accurate mapping among TF sequence preferences, promoter binding, and gene expression remains an open challenge. We propose that the time is ripe for renewed systematic efforts toward a complete mapping of yeast transcriptional regulatory mechanisms.
yeast; transcription factors; regulatory networks; gene expression; chromatin
An article by Polley and Fay in this issue of GENETICS provides an excellent opportunity to introduce or reinforce concepts of reverse genetics and RNA interference, suppressor screens, synthetic phenotypes, and phenocopy. Necessary background, explanations of these concepts, and a sample approach to classroom use of the original article, including discussion questions, are provided.
primer; lin-35; C. elegans
Some genetic phenomena originate as mutations that are initially advantageous but decline in fitness until they become distinctly deleterious. Here I give the condition for a mutation–selection balance to form and describe some of the properties of the resulting equilibrium population. A characterization is also given of the fixation probabilities for such mutations.
mutation; selection; mutational load; probability of fixation
DA (D-blood group of Palm and Agouti, also known as Dark Agouti) and F344 (Fischer) are two inbred rat strains with differences in several phenotypes, including susceptibility to autoimmune disease models and inflammatory responses. While these strains have been extensively studied, little information is available about the DA and F344 genomes, as only the Brown Norway (BN) and spontaneously hypertensive rat strains have been sequenced to date. Here we report the sequencing of the DA and F344 genomes using next-generation Illumina paired-end read technology and the first de novo assembly of a rat genome. DA and F344 were sequenced with an average depth of 32-fold, covered 98.9% of the BN reference genome, and included 97.97% of known rat ESTs. New sequences could be assigned to 59 million positions with previously unknown data in the BN reference genome. Differences between DA, F344, and BN included 19 million positions in novel scaffolds, 4.09 million single nucleotide polymorphisms (SNPs) (including 1.37 million new SNPs), 458,224 short insertions and deletions, and 58,174 structural variants. Genetic differences between DA, F344, and BN, including high-impact SNPs and short insertions and deletions affecting >2500 genes, are likely to account for most of the phenotypic variation between these strains. The new DA and F344 genome sequencing data should facilitate gene discovery efforts in rat models of human disease.
BN; DA; F344; Rattus norvegicus; whole-genome sequencing; next-generation whole-genome sequencing (NGS)
We have adapted a bacterial CRISPR RNA/Cas9 system to precisely engineer the Drosophila genome and report that Cas9-mediated genomic modifications are efficiently transmitted through the germline. This RNA-guided Cas9 system can be rapidly programmed to generate targeted alleles for probing gene function in Drosophila.
CRISPR RNA; Cas9; homologous recombination; genome engineering; Drosophila
The absence of telomerase in many eukaryotes leads to the gradual shortening of telomeres, causing replicative senescence. In humans, this proliferation barrier constitutes a tumor suppressor mechanism and may be involved in cellular aging. Yet the heterogeneity of the senescence phenotype has hindered the understanding of its onset. Here we investigated the regulation of telomere length and its control of senescence heterogeneity. Because the length of the shortest telomeres can potentially regulate cell fate, we focus on their dynamics in Saccharomyces cerevisiae. We developed a stochastic model of telomere dynamics built on the protein-counting model, where an increasing number of protein-bound telomeric repeats shift telomeres into a nonextendable state by telomerase. Using numerical simulations, we found that the length of the shortest telomere is well separated from the length of the others, suggesting a prominent role in triggering senescence. We evaluated this possibility using classical genetic analyses of tetrads, combined with a quantitative and sensitive assay for senescence. In contrast to mitosis of telomerase-negative cells, which produces two cells with identical senescence onset, meiosis is able to segregate a determinant of senescence onset among the telomerase-negative spores. The frequency of such segregation is in accordance with this determinant being the length of the shortest telomere. Taken together, our results substantiate the length of the shortest telomere as being the key genetic marker determining senescence onset in S. cerevisiae.
telomere distribution; replicative senescence; Saccharomyces cerevisiae; genetic determinism; stochastic modeling
In Schizosaccharomyces pombe, over 90% of transcription factor genes are nonessential. Moreover, the majority do not exhibit significant growth defects under optimal conditions when deleted, complicating their functional characterization and target gene identification. Here, we systematically overexpressed 99 transcription factor genes with the nmt1 promoter and found that 64 transcription factor genes exhibited reduced fitness when ectopically expressed. Cell cycle defects were also often observed. We further investigated three uncharacterized transcription factor genes (toe1+–toe3+) that displayed cell elongation when overexpressed. Ectopic expression of toe1+ resulted in a G1 delay while toe2+ and toe3+ overexpression produced an accumulation of septated cells with abnormalities in septum formation and nuclear segregation, respectively. Transcriptome profiling and ChIP-chip analysis of the transcription factor overexpression strains indicated that Toe1 activates target genes of the pyrimidine-salvage pathway, while Toe3 regulates target genes involved in polyamine synthesis. We also found that ectopic expression of the putative target genes SPBC3H7.05c, and dad5+ and SPAC11D3.06 could recapitulate the cell cycle phenotypes of toe2+ and toe3+ overexpression, respectively. Furthermore, single deletions of the putative target genes urg2+ and SPAC1399.04c, and SPBC3H7.05c, SPACUNK4.15, and rds1+, could suppress the phenotypes of toe1+ and toe2+ overexpression, respectively. This study implicates new transcription factors and metabolism genes in cell cycle regulation and demonstrates the potential of systematic overexpression analysis to elucidate the function and target genes of transcription factors in S. pombe.
Schizosaccharomyces pombe; fission yeast; transcription factor; overexpression; microarray; cell cycle
During embryogenesis, an essential process known as dosage compensation is initiated to equalize gene expression from sex chromosomes. Although much is known about how dosage compensation is established, the consequences of modulating the stability of dosage compensation postembryonically are not known. Here we define a role for the Caenorhabditis elegans dosage compensation complex (DCC) in the regulation of DAF-2 insulin-like signaling. In a screen for dauer regulatory genes that control the activity of the FoxO transcription factor DAF-16, we isolated three mutant alleles of dpy-21, which encodes a conserved DCC component. Knockdown of multiple DCC components in hermaphrodite and male animals indicates that the dauer suppression phenotype of dpy-21 mutants is due to a defect in dosage compensation per se. In dpy-21 mutants, expression of several X-linked genes that promote dauer bypass is elevated, including four genes encoding components of the DAF-2 insulin-like pathway that antagonize DAF-16/FoxO activity. Accordingly, dpy-21 mutation reduced the expression of DAF-16/FoxO target genes by promoting the exclusion of DAF-16/FoxO from nuclei. Thus, dosage compensation enhances dauer arrest by repressing X-linked genes that promote reproductive development through the inhibition of DAF-16/FoxO nuclear translocation. This work is the first to establish a specific postembryonic function for dosage compensation in any organism. The influence of dosage compensation on dauer arrest, a larval developmental fate governed by the integration of multiple environmental inputs and signaling outputs, suggests that the dosage compensation machinery may respond to external cues by modulating signaling pathways through chromosome-wide regulation of gene expression.
Caenorhabditis elegans; dosage compensation; dauer; insulin signaling; DAF-16/FoxO
Males of the guppy (Poecilia reticulata) vary tremendously in their ornamental patterns, which are thought to have evolved in response to a complex interplay between natural and sexual selection. Although the selection pressures acting on the color patterns of the guppy have been extensively studied, little is known about the genes that control their ontogeny. Over 50 years ago, two autosomal color loci, blue and golden, were described, both of which play a decisive role in the formation of the guppy color pattern. Orange pigmentation is absent in the skin of guppies with a lesion in blue, suggesting a defect in xanthophore development. In golden mutants, the development of the melanophore pattern during embryogenesis and after birth is affected. Here, we show that blue and golden correspond to guppy orthologs of colony-stimulating factor 1 receptor a (csf1ra; previously called fms) and kita. Most excitingly, we found that both genes are required for the development of the black ornaments of guppy males, which in the case of csf1ra might be mediated by xanthophore–melanophore interactions. Furthermore, we provide evidence that two temporally and genetically distinct melanophore populations contribute to the adult camouflage pattern expressed in both sexes: one early appearing and kita-dependent and the other late-developing and kita-independent. The identification of csf1ra and kita mutants provides the first molecular insights into pigment pattern formation in this important model species for ecological and evolutionary genetics.
guppy; color pattern; pigment pattern development; Kita; Csf1ra
Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, classical analytical techniques are poorly suited to quantitative genetic studies of gene expression where the number of traits assayed per individual can reach many thousands. Here, we derive a Bayesian genetic sparse factor model for estimating the genetic covariance matrix (G-matrix) of high-dimensional traits, such as gene expression, in a mixed-effects model. The key idea of our model is that we need consider only G-matrices that are biologically plausible. An organism’s entire phenotype is the result of processes that are modular and have limited complexity. This implies that the G-matrix will be highly structured. In particular, we assume that a limited number of intermediate traits (or factors, e.g., variations in development or physiology) control the variation in the high-dimensional phenotype, and that each of these intermediate traits is sparse, affecting only a few observed traits. The advantages of this approach are twofold. First, sparse factors are interpretable and provide biological insight into mechanisms underlying the genetic architecture. Second, enforcing sparsity helps prevent sampling errors from swamping out the true signal in high-dimensional data. We demonstrate the advantages of our model on simulated data and in an analysis of a published Drosophila melanogaster gene expression data set.
G matrix; factor model; sparsity; Bayesian inference; animal model
Segments of identity-by-descent (IBD) detected from high-density genetic data are useful for many applications, including long-range phase determination, phasing family data, imputation, IBD mapping, and heritability analysis in founder populations. We present Refined IBD, a new method for IBD segment detection. Refined IBD achieves both computational efficiency and highly accurate IBD segment reporting by searching for IBD in two steps. The first step (identification) uses the GERMLINE algorithm to find shared haplotypes exceeding a length threshold. The second step (refinement) evaluates candidate segments with a probabilistic approach to assess the evidence for IBD. Like GERMLINE, Refined IBD allows for IBD reporting on a haplotype level, which facilitates determination of multi-individual IBD and allows for haplotype-based downstream analyses. To investigate the properties of Refined IBD, we simulate SNP data from a model with recent superexponential population growth that is designed to match United Kingdom data. The simulation results show that Refined IBD achieves a better power/accuracy profile than fastIBD or GERMLINE. We find that a single run of Refined IBD achieves greater power than 10 runs of fastIBD. We also apply Refined IBD to SNP data for samples from the United Kingdom and from Northern Finland and describe the IBD sharing in these data sets. Refined IBD is powerful, highly accurate, and easy to use and is implemented in Beagle version 4.
identity-by-descent (IBD) segments; Beagle; shared haplotypes
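The two-step structure described above (identify long shared haplotypes, then score each candidate probabilistically) can be sketched in Python. This is only a toy illustration and is neither the GERMLINE algorithm nor the Beagle model: it assumes two phased haplotypes coded as 0/1 lists, takes exact-match runs as candidates, and scores each run with a simple log-odds in which sharing an allele of frequency p contributes log10(1/p). All function names, thresholds, and allele frequencies are hypothetical.

```python
import math


def candidate_segments(h1, h2, min_len):
    """Step 1 (identification): runs of identical alleles of at least min_len sites.

    Segments are returned as half-open (start, end) index pairs.
    """
    n = min(len(h1), len(h2))
    segments, start = [], None
    for i in range(n):
        if h1[i] == h2[i]:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i))
            start = None
    if start is not None and n - start >= min_len:
        segments.append((start, n))
    return segments


def lod_score(h1, start, end, freq_of_1):
    """Step 2 (refinement): toy log10 odds of IBD versus chance sharing.

    Assumption: under IBD the shared allele is drawn once (probability p),
    under non-IBD it is drawn twice independently (probability p * p),
    so each matching site contributes log10(1 / p).
    """
    score = 0.0
    for i in range(start, end):
        p = freq_of_1[i] if h1[i] == 1 else 1.0 - freq_of_1[i]
        score += -math.log10(p)
    return score


def refined_segments(h1, h2, freq_of_1, min_len, min_lod):
    """Keep only the candidate segments whose score reaches min_lod."""
    return [
        (s, e)
        for s, e in candidate_segments(h1, h2, min_len)
        if lod_score(h1, s, e, freq_of_1) >= min_lod
    ]


# Hypothetical data: the second shared run consists only of very common alleles.
hap1 = [0, 1, 1, 0, 1, 0, 0, 0, 0]
hap2 = [0, 1, 1, 0, 0, 0, 0, 0, 0]
freq = [0.5, 0.5, 0.5, 0.5, 0.5, 0.1, 0.1, 0.1, 0.1]
print(candidate_segments(hap1, hap2, min_len=4))
print(refined_segments(hap1, hap2, freq, min_len=4, min_lod=1.0))
```

Both runs pass the length filter, but only the first survives refinement, because sharing a stretch of common alleles is weak evidence for IBD.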
Charles Darwin’s long-term illness has been the subject of much speculation. His numerous symptoms have led to conclusions that his illness was essentially psychogenic in nature. These diagnoses have never been fully convincing, however, particularly in regard to the proposed underlying psychological causes of the illness. Similarly, two proposed somatic causes of illness, Chagas disease and arsenic poisoning, lack credibility and appear inconsistent with the lifetime history of the illness. Other physical explanations are simply too incomplete to explain the range of symptoms. Here, a very different sort of explanation is offered. We now know that mitochondrial mutations producing impaired mitochondrial function may result in a wide range of differing symptoms, including symptoms thought to be primarily psychological. Examination of Darwin’s maternal family history supports the contention that his illness was mitochondrial in nature; his mother and one maternal uncle had strange illnesses and the youngest maternal sibling died of an infirmity with symptoms characteristic of mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes (MELAS syndrome), a condition rooted in mitochondrial dysfunction. Darwin’s own symptoms are described here and are in accord with the hypothesis that he had the mtDNA mutation commonly associated with the MELAS syndrome.
Dicentric chromosomes undergo breakage in mitosis, resulting in chromosome deletions, duplications, and translocations. In this study, we map chromosome break sites of dicentrics in Saccharomyces cerevisiae by a mitotic recombination assay. The assay uses a diploid strain in which one homolog has a conditional centromere in addition to a wild-type centromere, and the other homolog has only the wild-type centromere; the conditional centromere is inactive when cells are grown in galactose and is activated when the cells are switched to glucose. In addition, the two homologs are distinguishable by multiple single-nucleotide polymorphisms (SNPs). Under conditions in which the conditional centromere is activated, the functionally dicentric chromosome undergoes double-stranded DNA breaks (DSBs) that can be repaired by mitotic recombination with the homolog. Such recombination events often lead to loss of heterozygosity (LOH) of SNPs that are centromere distal to the crossover. Using a PCR-based assay, we determined the position of LOH in multiple independent recombination events to a resolution of ∼4 kb. This analysis shows that dicentric chromosomes have recombination breakpoints that are broadly distributed between the two centromeres, although there is a clustering of breakpoints within 10 kb of the conditional centromere.
Saccharomyces cerevisiae; dicentric chromosomes; mitotic crossovers; loss of heterozygosity; break-induced replication
Genotyping-by-sequencing (GBS) approaches provide low-cost, high-density genotype information. However, GBS has unique technical considerations, including a substantial amount of missing data and a nonuniform distribution of sequence reads. The goal of this study was to characterize technical variation using this method and to develop methods to optimize read depth to obtain desired marker coverage. To empirically assess the distribution of fragments produced using GBS, ∼8.69 Gb of GBS data were generated on the Zea mays reference inbred B73, utilizing ApeKI for genome reduction and single-end reads between 75 and 81 bp in length. We observed wide variation in sequence coverage across sites. Approximately 76% of potentially observable cut site-adjacent sequence fragments had no sequencing reads whereas a portion had substantially greater read depth than expected, up to 2369 times the expected mean. The methods described in this article facilitate determination of sequencing depth in the context of empirically defined read depth to achieve desired marker density for genetic mapping studies.
genotyping-by-sequencing (GBS); coverage distribution; read depth; marker number
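The kind of coverage summary reported above (fraction of sites with no reads, maximum depth relative to the mean) can be computed from per-site read counts with a few lines of Python. The sketch below is a generic illustration, not the method of the article; the comparison value is the fraction of empty sites expected if reads were spread uniformly at random (a Poisson assumption introduced here), and the function name and counts are hypothetical.

```python
import math


def coverage_summary(read_counts):
    """Summarize per-site read depth for a list of nonnegative counts."""
    n = len(read_counts)
    total = sum(read_counts)
    if n == 0 or total == 0:
        raise ValueError("need at least one site and at least one read")
    mean_depth = total / n
    return {
        "mean_depth": mean_depth,
        # Observed share of sites that received no reads at all.
        "fraction_zero": sum(c == 0 for c in read_counts) / n,
        # Share of empty sites expected under uniform (Poisson) sampling.
        "poisson_fraction_zero": math.exp(-mean_depth),
        # Deepest site expressed as a multiple of the mean depth.
        "max_fold_over_mean": max(read_counts) / mean_depth,
    }


# Hypothetical counts for ten cut-site-adjacent fragments.
counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 16]
summary = coverage_summary(counts)
print(summary)
```

For these toy counts, 60% of sites are empty although uniform sampling at the same mean depth would leave only about 14% empty, the same qualitative skew the abstract describes.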
Mathematical models of meiosis that relate offspring to parental genotypes through parameters such as meiotic recombination frequency have been difficult to develop for polyploids. Existing models have limitations with respect to their analytic potential, their compatibility with insights into mechanistic aspects of meiosis, and their treatment of model parameters in terms of parameter dependencies. In this article I put forward a computational approach to the probabilistic modeling of meiosis. A computer program enumerates all possible paths through the phases of replication, pairing, recombination, and segregation, while keeping track of the probabilities of the paths according to the various parameters involved. Probabilities for classes of genotypes or phenotypes are added, and the resulting formulas are simplified by the symbolic-computation system Mathematica. An example application to autotetraploids results in a model that remedies the limitations of previous models mentioned above. In addition to the immediate implications, the computational approach presented here can be expected to be useful through opening avenues for modeling a host of processes, including meiosis in higher-order ploidies.
meiotic recombination frequency (MRF); tetraploidy; quadrivalent; pairing-partner switch; coefficient of double reduction
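The enumeration idea, listing every path through meiosis and accumulating exact path probabilities, can be shown on a deliberately reduced case. The Python sketch below is not the program described above and does not use Mathematica: it assumes an autotetraploid locus with random bivalent pairing, no recombination, no quadrivalents, and no double reduction, and it uses exact fractions in place of symbolic simplification. The function name is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def gamete_distribution(alleles):
    """Exact diploid-gamete probabilities for one autotetraploid locus.

    alleles: the four allele labels carried by the parent, e.g. "AAaa".
    Assumptions: the four homologs form two bivalents in one of three
    equally likely ways, and each bivalent sends one homolog to the gamete
    with probability 1/2.
    """
    if len(alleles) != 4:
        raise ValueError("expected exactly four alleles")
    pairings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    dist = defaultdict(Fraction)
    for pairing in pairings:
        # One homolog is chosen from each bivalent: four paths per pairing.
        for picks in product(*pairing):
            gamete = "".join(sorted(alleles[i] for i in picks))
            dist[gamete] += Fraction(1, 3) * Fraction(1, 4)
    return dict(dist)


duplex = gamete_distribution("AAaa")
print(duplex)
```

For the duplex genotype AAaa this enumeration recovers the classical 1:4:1 ratio of AA, Aa, and aa gametes; adding recombination or quadrivalent formation would multiply the number of paths, which is where automated enumeration becomes useful.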
Understanding genetic causes and effects of speciation in sympatric populations of sexually reproducing eukaryotes is challenging, controversial, and of practical importance for controlling rapidly evolving pests and pathogens. The major African malaria vector mosquito Anopheles gambiae sensu stricto (s.s.) is considered to contain two incipient species with strong reproductive isolation, hybrids between the M and S molecular forms being very rare. Following recent observations of higher proportions of hybrid forms at a few sites in West Africa, we conducted new surveys of 12 sites in four contiguous countries (The Gambia, Senegal, Guinea-Bissau, and Republic of Guinea). Identification and genotyping of 3499 A. gambiae s.s. revealed high frequencies of M/S hybrid forms at each site, ranging from 5 to 42%, and a large spectrum of inbreeding coefficient values from 0.11 to 0.76, spanning most of the range expected between the alternative extremes of panmixia and assortative mating. Year-round sampling over 2 years at one of the sites in The Gambia showed that M/S hybrid forms had similar relative frequencies throughout periods of marked seasonal variation in mosquito breeding and abundance. Genome-wide scans with an Affymetrix high-density single-nucleotide polymorphism (SNP) microarray enabled replicate comparisons of pools of different molecular forms, in three separate populations. These showed strong differentiation between M and S forms only in the pericentromeric region of the X chromosome that contains the molecular form-specific marker locus, with only a few other loci showing minor differences. In the X chromosome, the M/S hybrid forms were more differentiated from M than from S forms, supporting a hypothesis of asymmetric introgression and backcrossing.
Genomics; speciation; hybridization; introgression; mosquito
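The link between hybrid frequency and the inbreeding coefficient can be made concrete with a short Python sketch. The abstract does not state which estimator was used; the code below assumes the standard single-locus form F = 1 - H_obs / H_exp, treating M and S as two alleles at the diagnostic marker and M/S hybrids as heterozygotes. The function name and the counts are hypothetical.

```python
def inbreeding_coefficient(n_mm, n_ms, n_ss):
    """Single-locus inbreeding coefficient F = 1 - H_obs / H_exp.

    n_mm, n_ms, n_ss: counts of M, M/S hybrid, and S forms at one site.
    F = 0 corresponds to panmixia; F = 1 to complete assortative mating.
    """
    n = n_mm + n_ms + n_ss
    if n == 0:
        raise ValueError("no individuals sampled")
    p = (2 * n_mm + n_ms) / (2 * n)      # frequency of the M allele
    expected_het = 2 * p * (1 - p)       # Hardy-Weinberg heterozygosity
    if expected_het == 0:
        raise ValueError("only one molecular form present; F is undefined")
    observed_het = n_ms / n
    return 1 - observed_het / expected_het


# Hypothetical site with 40 M, 20 M/S, and 40 S individuals.
print(inbreeding_coefficient(40, 20, 40))
```

With equal numbers of M and S forms, a hybrid frequency of 20% gives F of about 0.6, while a hybrid frequency of 50% would give F = 0.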