A multiplexed analysis of the transcriptional regulation of yeast pseudohyphal growth recorded the binding of 28 different transcription factors with barcoded transposons. A core set of target genes is identified, and a process of DNA looping at the FLO11 locus that provides transcriptional memory for expression of the gene is described.
Pseudohyphal growth is a developmental pathway seen in some strains of yeast in which cells form multicellular filaments in response to environmental stresses. We used multiplexed transposon “Calling Cards” to record the genome-wide binding patterns of 28 transcription factors (TFs) in nitrogen-starved yeast. We identified TF targets relevant for pseudohyphal growth, producing a detailed map of its regulatory network. Using tools from graph theory, we identified 14 TFs that lie at the center of this network, including Flo8, Mss11, and Mfg1, which bind as a complex. Surprisingly, the DNA-binding preferences for these key TFs were unknown. Using Calling Card data, we predicted the in vivo DNA-binding motif for the Flo8-Mss11-Mfg1 complex and validated it using a reporter assay. We found that this complex binds several important targets, including FLO11, at both their promoter and termination sequences. We demonstrated that this binding pattern is the result of DNA looping, which regulates the transcription of these targets and is stabilized by an interaction with the nuclear pore complex. This looping provides yeast cells with a transcriptional memory, enabling them more rapidly to execute the filamentous growth program when nitrogen starved if they had been previously exposed to this condition.
Previous studies investigating a genetic basis for idiopathic pulmonary fibrosis (IPF) have focused on resequencing single genes in IPF kindreds or cohorts to determine the genetic contributions to IPF. None has investigated interactions among the candidate genes.
To compare the frequencies and interactions of mutations in six IPF-associated genes in a cohort of 132 individuals with IPF with those of a disease-control cohort of 192 individuals with chronic obstructive pulmonary disease (COPD) and the population represented in the Exome Variant Server.
We resequenced the genes encoding surfactant proteins A2 (SFTPA2) and C (SFTPC), the ATP binding cassette member A3 (ABCA3), telomerase (TERT), thyroid transcription factor (NKX2-1) and mucin 5B (MUC5B) and compared the collapsed frequencies of rare (minor allele frequency <1%), computationally predicted deleterious variants in each cohort. We also genotyped a common MUC5B promoter variant that is over-represented in individuals with IPF.
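One common way to compare collapsed carrier frequencies between two cohorts is a Fisher exact test on a 2x2 table of carriers versus non-carriers. The sketch below is an illustration of that general idea only; the text does not state which test was actually used. The IPF counts (14 carriers among 132 individuals) and the control cohort size (192) come from the text, while the number of control carriers is a hypothetical placeholder.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are cohorts, columns are carriers / non-carriers.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# IPF cohort from the text: 14 carriers among 132 individuals.
ipf_carriers, ipf_noncarriers = 14, 132 - 14
# HYPOTHETICAL control carrier count, for illustration only (not a study result).
ctrl_carriers, ctrl_noncarriers = 6, 192 - 6

p_value = fisher_exact_two_sided(
    ipf_carriers, ipf_noncarriers, ctrl_carriers, ctrl_noncarriers
)
print(f"two-sided Fisher exact p = {p_value:.4g}")
```

Collapsing all rare variants in a gene into a single carrier/non-carrier status trades per-variant resolution for statistical power, which matters when each variant is individually too rare to test.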
We found 15 mutations in 14 individuals (11%) in the IPF cohort: SFTPA2 (n=1), SFTPC (n=5), ABCA3 (n=4) and TERT (n=5). No individual with IPF had two different mutations, but one individual with IPF was homozygous for p.E292V, the most common ABCA3 disease-causing variant. We did not detect an interaction between any of the mutations and the MUC5B promoter variant.
Rare mutations in SFTPA2, SFTPC and TERT are collectively over-represented in individuals with IPF. Genetic analysis and counselling should be considered as part of the IPF evaluation.
Interstitial Fibrosis; Paediatric Lung Disease; Rare lung diseases; COPD epidemiology
Cell-cell interactions between tumor cells and constituents of their microenvironment are critical determinants of tumor tissue biology and therapeutic responses. Interactions between glioblastoma (GBM) cells and endothelial cells (ECs) establish a purported cancer stem cell niche. We hypothesized that genes regulated by these interactions would be important, particularly as therapeutic targets. Using a computational approach, we deconvoluted expression data from a mixed physical co-culture of GBM cells and ECs and identified a previously undescribed upregulation of the cAMP-specific phosphodiesterase PDE7B in GBM cells in response to direct contact with ECs. We further found that elevated PDE7B expression occurs in most GBM cases and has a negative effect on survival. PDE7B overexpression resulted in the expansion of a stem-like cell subpopulation in vitro and increased tumor growth and aggressiveness in an in vivo intracranial GBM model. Collectively, these studies illustrate a novel approach for studying cell-cell interactions and identifying new therapeutic targets like PDE7B in GBM.
Current consensus identifies four molecular subtypes of medulloblastoma (MB): WNT, sonic hedgehog (SHH), and groups “3/C” and “4/D”. Group 4 is not well characterized, but harbors the most frequently observed chromosomal abnormality in MB, i17q, whose presence may confer a worse outcome. Recent publications have identified mutations in chromatin remodeling genes that may be overrepresented in this group, suggesting a biological role for these genes in i17q. This work seeks to explore the pathology that underlies i17q in MB. Specifically, we examine the prognostic significance of the previously identified gene mutations in an independent set of MBs and examine the biological relevance of these genes and related pathways by gene expression profiling. The previously implicated p53 signaling pathway is also examined as a putative driver of i17q tumor oncogenesis. The data show that gene mutations associated with i17q tumors in previous studies (KDM6A, ZMYM3, MLL3 and GPS2) were correlated with significantly worse outcomes despite not being specific to i17q in this set. Expression of these genes did not appear to underlie the biology of the molecular variants. TP53 expression was significantly reduced in i17q/group 4 tumors; this could not be accounted for by dosage effects alone. Expression of regulators and mediators of p53 signaling was significantly altered in i17q tumors. Our findings support the conclusion that chromatin remodeling gene mutations are associated with significantly worse outcomes in MB but cannot explain outcomes or pathogenesis of i17q tumors. However, expression analyses of the p53 signaling pathway show alterations in i17q tumors that cannot be explained by dosage effects and are strongly suggestive of an oncogenic role.
Electronic supplementary material
The online version of this article (doi:10.1186/s40478-014-0074-1) contains supplementary material, which is available to authorized users.
Medulloblastoma; Group 4; i17q; Expression; Outcomes; Fluidigm; Next-generation sequencing
Clickable nanogel solutions were synthesized by using the copper catalyzed azide/alkyne cycloaddition (CuAAC) to partially polymerize solutions of azide and alkyne functionalized poly(ethylene glycol) (PEG) monomers. Coatings were fabricated using a second click reaction: a UV thiol-yne attachment of the nanogel solutions to mercaptosilanated glass. Because the CuAAC reaction was effectively halted by the addition of a copper-chelator, we were able to prevent bulk gelation and limit the coating thickness to a single monolayer of nanogels in the absence of the solution reaction. This enabled the inclusion of kosmotropic salts, which caused the PEG to phase-separate and nearly double the nanogel packing density, as confirmed by Quartz Crystal Microbalance with Dissipation (QCM-D). Protein adsorption was analyzed by single molecule counting with total internal reflection fluorescence (TIRF) microscopy and cell adhesion assays. Coatings formed from the phase-separated clickable nanogel solutions attached with salt adsorbed significantly less fibrinogen than other 100% PEG coatings tested, as well as poly-L-lysine-g-PEG (PLL-g-PEG) coatings. However, PEG/albumin nanogel coatings still outperformed the best 100% PEG clickable nanogel coatings. Additional surface crosslinking of the clickable nanogel coating in the presence of copper further reduced levels of fibrinogen adsorption closer to those of PEG/albumin nanogel coatings. However, this step negatively impacted long-term resistance to cell adhesion and dramatically altered the morphology of the coating by atomic force microscopy (AFM). The main benefit of the click strategy is that the partially polymerized solutions are stable almost indefinitely, allowing attachment in the phase-separated state without danger of bulk gelation, and thus, producing the best performing 100% PEG coating that we have studied to date.
BACKGROUND AND OBJECTIVE:
Neonatal respiratory distress syndrome (RDS) due to pulmonary surfactant deficiency is heritable, but common variants do not fully explain disease heritability.
Using next-generation, pooled sequencing of race-stratified DNA samples from infants ≥34 weeks’ gestation with and without RDS (n = 513) and from a Missouri population-based cohort (n = 1066), we scanned all exons of 5 surfactant-associated genes and used in silico algorithms to identify functional mutations. We validated each mutation with an independent genotyping platform and compared race-stratified, collapsed frequencies of rare mutations by gene to investigate disease associations and estimate attributable risk.
Single ABCA3 mutations were overrepresented among European-descent RDS infants (14.3% of RDS vs 3.7% of non-RDS; P = .002) but were not statistically overrepresented among African-descent RDS infants (4.5% of RDS vs 1.5% of non-RDS; P = .23). In the Missouri population-based cohort, 3.6% of European-descent and 1.5% of African-descent infants carried a single ABCA3 mutation. We found no mutations among the RDS infants and no evidence of contribution to population-based disease burden for SFTPC, CHPT1, LPCAT1, or PCYT1B.
In contrast to lethal neonatal RDS resulting from homozygous or compound heterozygous ABCA3 mutations, single ABCA3 mutations are overrepresented among European-descent infants ≥34 weeks’ gestation with RDS and account for ∼10.9% of the attributable risk among term and late preterm infants. Although ABCA3 mutations are individually rare, they are collectively common among European- and African-descent individuals in the general population.
genetic association studies; neonatal respiratory distress syndrome; newborn; respiratory distress syndrome
In order to facilitate understanding of pigment cell biology, we developed a method to concomitantly purify melanocytes, iridophores, and retinal pigmented epithelium from zebrafish, and analyzed their transcriptomes. Comparing expression data from these cell types and whole embryos allowed us to reveal gene expression co-enrichment in melanocytes and retinal pigmented epithelium, as well as in melanocytes and iridophores. We found 214 genes co-enriched in melanocytes and retinal pigmented epithelium, indicating the shared functions of melanin-producing cells. We found 62 genes significantly co-enriched in melanocytes and iridophores, illustrative of their shared developmental origins from the neural crest. This is also the first analysis of the iridophore transcriptome. Gene expression analysis for iridophores revealed extensive enrichment of specific enzymes to coordinate production of their guanine-based reflective pigment. We speculate the coordinated upregulation of specific enzymes from several metabolic pathways recycles the rate-limiting substrate for purine synthesis, phosphoribosyl pyrophosphate, thus constituting a guanine cycle. The purification procedure and expression analysis described here, along with the accompanying transcriptome-wide expression data, provide the first mRNA sequencing data for multiple purified zebrafish pigment cell types, and will be a useful resource for further studies of pigment cell biology.
Human leukocyte antigen (HLA) typing at the allelic level can in theory be achieved using whole exome sequencing (exome-seq) data with no added cost but has been hindered by its computational challenge. We developed ATHLATES, a program that applies assembly, allele identification and allelic pair inference to short read sequences, and applied it to data from Illumina platforms. In 15 data sets with adequate coverage for HLA-A, -B, -C, -DRB1 and -DQB1 genes, ATHLATES correctly reported 74 out of 75 allelic pairs with an overall concordance rate of 99% compared with conventional typing. This novel approach should be broadly applicable to research and clinical laboratories.
DNA methylation is a mechanism for long-term transcriptional regulation and is required for normal cellular differentiation. Failure to properly establish or maintain DNA methylation patterns leads to cell dysfunction and diseases such as cancer. Identifying DNA methylation signatures in complex tissues can be challenging owing to inaccurate cell enrichment methods and low DNA yields. We have developed a technique called laser capture microdissection-reduced representation bisulfite sequencing (LCM-RRBS) for the multiplexed interrogation of the DNA methylation status of cytosine–guanine dinucleotide islands and promoters. LCM-RRBS accurately and reproducibly profiles genome-wide methylation of DNA extracted from microdissected fresh frozen or formalin-fixed paraffin-embedded tissue samples. To demonstrate the utility of LCM-RRBS, we characterized changes in DNA methylation associated with gonadectomy-induced adrenocortical neoplasia in the mouse. Compared with adjacent normal tissue, the adrenocortical tumors showed reproducible gains and losses of DNA methylation at genes involved in cell differentiation and organ development. LCM-RRBS is a rapid, cost-effective, and sensitive technique for analyzing DNA methylation in heterogeneous tissues and will facilitate the investigation of DNA methylation in cancer and organ development.
Clinical use of next-generation sequencing requires informatics tools that support the complete workflow from sample accessioning to data analysis and reporting. To address this need, we have developed the Clinical Genomicist Workstation (CGW). CGW is a secure, n-tiered application in which a web browser submits requests to application servers that persist the data in a relational database. CGW is used by Washington University Genomic and Pathology Services for clinical genomic testing of many cancers. CGW has been used to accession, analyze and sign out over 409 cases since November 2011. There are 22 ordering oncologists and 7 clinical genomicists who use the CGW. In summary, CGW is a ‘soup-to-nuts’ solution to track, analyze, interpret, and report clinical genomic diagnostic tests.
Genome-wide association studies have identified common variation in the CHRNA5–CHRNA3–CHRNB4 and CHRNA6–CHRNB3 gene clusters that contributes to nicotine dependence. However, the role of rare variation in these nicotinic receptor genes in risk for nicotine dependence has not been studied. We undertook pooled sequencing of the coding regions and flanking sequence of the CHRNA5, CHRNA3, CHRNB4, CHRNA6 and CHRNB3 genes in African American and European American nicotine-dependent smokers and smokers without symptoms of dependence. Carrier status of individuals harboring rare missense variants at conserved sites in each of these genes was then compared in cases and controls to test for an association with nicotine dependence. Missense variants at conserved residues in CHRNB4 are associated with lower risk for nicotine dependence in African Americans and European Americans (AA P = 0.0025, odds-ratio (OR) = 0.31, 95% confidence-interval (CI) = 0.31–0.72; EA P = 0.023, OR = 0.69, 95% CI = 0.50–0.95). Furthermore, carriers of these variants were found to smoke fewer cigarettes per day than non-carriers (AA P = 6.6 × 10⁻⁵, EA P = 0.021). Given the possibility of stochastic differences in rare allele frequencies between groups, replication of this association is necessary to confirm these findings. The functional effects of the two CHRNB4 variants contributing most to this association (T375I and T91I) and a missense variant in CHRNA3 (R37H) in strong linkage disequilibrium with T91I were examined in vitro. The minor allele of each polymorphism increased cellular response to nicotine (T375I P = 0.01, T91I P = 0.02, R37H P = 0.003), but the largest effect on in vitro receptor activity was seen in the presence of both CHRNB4 T91I and CHRNA3 R37H (P = 2 × 10⁻⁶).
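For readers less familiar with the statistics quoted above, the following minimal sketch shows how an odds ratio and a Wald 95% confidence interval can be derived from carrier counts in cases and controls. The text does not state how its intervals were computed, so the Wald method is an assumption, and all counts below are hypothetical placeholders rather than data from the study.

```python
from math import exp, log, sqrt


def odds_ratio_wald_ci(carrier_cases, noncarrier_cases,
                       carrier_controls, noncarrier_controls, z=1.96):
    """Return (odds ratio, lower bound, upper bound) using a Wald interval.

    The interval is built on the log-odds scale and then exponentiated.
    """
    odds_ratio = (carrier_cases * noncarrier_controls) / (
        noncarrier_cases * carrier_controls
    )
    # Standard error of log(OR) for a 2x2 table.
    se = sqrt(
        1 / carrier_cases + 1 / noncarrier_cases
        + 1 / carrier_controls + 1 / noncarrier_controls
    )
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# HYPOTHETICAL counts, for illustration only (not taken from the study).
or_, ci_low, ci_high = odds_ratio_wald_ci(
    carrier_cases=10, noncarrier_cases=390,
    carrier_controls=30, noncarrier_controls=370,
)
print(f"OR = {or_:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

An odds ratio below 1 with an upper bound that also stays below 1 indicates that carriers are less common among cases than controls at the chosen confidence level. Because a Wald interval is symmetric around the estimate on the log scale, the point estimate always lies strictly inside it.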
As DNA sequencing technology has markedly advanced in recent years [2], it has become increasingly evident that the amount of genetic variation between any two individuals is greater than previously thought [3]. In contrast, array-based genotyping has failed to identify a significant contribution of common sequence variants to the phenotypic variability of common disease [4,5]. Taken together, these observations have led to the evolution of the Common Disease / Rare Variant hypothesis suggesting that the majority of the "missing heritability" in common and complex phenotypes is instead due to an individual's personal profile of rare or private DNA variants [6-8]. However, characterizing how rare variation impacts complex phenotypes requires the analysis of many affected individuals at many genomic loci, and is ideally compared to a similar survey in an unaffected cohort. Despite the sequencing power offered by today's platforms, a population-based survey of many genomic loci and the subsequent computational analysis required remains prohibitive for many investigators.
To address this need, we have developed a pooled sequencing approach [1,9] and a novel software package [1] for highly accurate rare variant detection from the resulting data. The ability to pool genomes from entire populations of affected individuals and survey the degree of genetic variation at multiple targeted regions in a single sequencing library provides excellent cost and time savings over traditional single-sample sequencing methodology. With a mean sequencing coverage per allele of 25-fold, our custom algorithm, SPLINTER, uses an internal variant calling control strategy to call insertions, deletions and substitutions up to four base pairs in length with high sensitivity and specificity from pools of up to 1 mutant allele in 500 individuals. Here we describe the method for preparing the pooled sequencing library followed by step-by-step instructions on how to use the SPLINTER package for pooled sequencing analysis (http://www.ibridgenetwork.org/wustl/splinter). We show a comparison between pooled sequencing of 947 individuals, all of whom also underwent genome-wide array genotyping, at over 20 kb of sequencing per person. Concordance between genotyping of tagged and novel variants called in the pooled sample was excellent. This method can be easily scaled up to any number of genomic loci and any number of individuals. By incorporating the internal positive and negative amplicon controls at ratios that mimic the population under study, the algorithm can be calibrated for optimal performance. This strategy can also be modified for use with hybridization capture or individual-specific barcodes and can be applied to the sequencing of naturally heterogeneous samples, such as tumor DNA.
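The coverage figures above imply a simple piece of arithmetic that is useful when planning a pool. The sketch below works it through, assuming diploid autosomal loci and equal representation of every individual in the pool; both are assumptions made here for illustration, and the function names are hypothetical rather than part of the SPLINTER package.

```python
def pooled_depth(n_individuals, coverage_per_allele=25, ploidy=2):
    """Total read depth per position needed to reach the per-allele coverage."""
    n_alleles = n_individuals * ploidy
    return n_alleles * coverage_per_allele


def expected_variant_reads(n_individuals, mutant_alleles=1,
                           coverage_per_allele=25, ploidy=2):
    """Expected number of reads carrying the variant at that total depth."""
    n_alleles = n_individuals * ploidy
    total_depth = pooled_depth(n_individuals, coverage_per_allele, ploidy)
    return total_depth * mutant_alleles / n_alleles


# A pool of 500 diploid individuals holds 1,000 alleles per locus.
depth = pooled_depth(500)
singleton_reads = expected_variant_reads(500)
print(f"total depth per position: {depth}")
print(f"expected reads from one mutant allele: {singleton_reads}")
```

Under these assumptions a single mutant allele in 500 individuals sits at an allele frequency of 0.1%, so the 25-fold per-allele target translates into roughly 25 supporting reads out of 25,000 at that position.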
Genetics; Issue 64; Genomics; Cancer Biology; Bioinformatics; Pooled DNA sequencing; SPLINTER; rare genetic variants; genetic screening; phenotype; high throughput; computational analysis; DNA; PCR; primers
The insertion element IS6110 is one of the main sources of genomic variability in Mycobacterium tuberculosis, the etiological agent of human tuberculosis. Although IS6110 has been used extensively as an epidemiological marker, the identification of the precise chromosomal insertion sites has been limited by technical challenges. Here, we present IS-seq, a novel method that combines high-throughput sequencing using Illumina technology with efficient combinatorial sample multiplexing to simultaneously probe 519 clinical isolates, identifying almost all the flanking regions of the element in a single experiment.
We identified a total of 6,976 IS6110 flanking regions on the different isolates. When validated using reference strains, the method had 100% specificity and 98% positive predictive value. The insertions mapped to both coding and non-coding regions, and in some cases interrupted genes thought to be essential for virulence or in vitro growth. Strains were classified into families using insertion sites, and high agreement with previous studies was observed.
This high-throughput IS-seq method, which can also be used to map insertions in other organisms, extends previous surveys of in vivo interrupted loci and provides a baseline for probing the consequences of disruptions in M. tuberculosis strains.
Pathogenic mutations in APP, PSEN1, PSEN2, MAPT and GRN have previously been linked to familial early onset forms of dementia. Mutation screening in these genes has been performed in either very small series or in single families with late onset AD (LOAD). Similarly, studies in single families have reported mutations in MAPT and GRN associated with clinical AD but no systematic screen of a large dataset has been performed to determine how frequently this occurs. We report sequence data for 439 probands from late-onset AD families with a history of four or more affected individuals. Sixty sequenced individuals (13.7%) carried a novel or pathogenic mutation. Eight pathogenic variants (one each in APP and MAPT, two in PSEN1 and four in GRN), three of which are novel, were found in 14 samples. Thirteen additional variants, present in 23 families, did not segregate with disease, but the frequency of these variants is higher in AD cases than controls, indicating that these variants may also modify risk for disease. The frequency of rare variants in these genes in this series is significantly higher than in the 1000 Genomes Project (p = 5.09×10⁻⁵; OR = 2.21; 95%CI = 1.49–3.28) or an unselected population of 12,481 samples (p = 6.82×10⁻⁵; OR = 2.19; 95%CI = 1.347–3.26). Rare coding variants in APP, PSEN1 and PSEN2 increase risk for or cause late onset AD. The presence of variants in these genes in LOAD and early-onset AD demonstrates that factors other than the mutation can impact the age at onset and penetrance of at least some variants associated with AD. MAPT and GRN mutations can be found in clinical series of AD, most likely due to misdiagnosis. This study clearly demonstrates that rare variants in these genes could explain an important proportion of genetic heritability of AD, which is not detected by GWAS.
Surfaces that resist protein adsorption are important for many bioanalytical applications. Bovine serum albumin (BSA) coatings and multi-arm poly(ethylene glycol) (PEG) coatings display low levels of non-specific protein adsorption and have enabled highly quantitative single-molecule (SM) protein studies. Recently, a method was developed for coating glass with PEG–BSA nanogels, a promising hybrid of these two low-background coatings. We characterized the nanogel coating to determine its suitability for SM protein experiments. SM adsorption counting revealed that nanogel-coated surfaces exhibit lower protein adsorption than covalently coupled BSA surfaces and monolayers of multi-arm PEG, so this surface displays one of the lowest degrees of protein adsorption yet observed. Additionally, the nanogel coating was resistant to DNA adsorption, underscoring the utility of the coating across a variety of SM experiments. The nanogel coating was found to be compatible with surfactants, whereas the BSA coating was not. Finally, applying the coating to a real-world study, we found that single ligand molecules could be tethered to this surface and detected with high sensitivity and specificity by a digital immunoassay. These results suggest that PEG–BSA nanogel coatings will be highly useful for the SM analysis of proteins.
adsorption; total internal reflection fluorescence; antibody binding; protein detection; digital immunoassay; surfactant
The human gut microbiota is a metabolic organ whose cellular composition is determined by a dynamic process of selection and competition. To identify microbial genes required for establishment of human symbionts in the gut, we developed an approach (insertion-sequencing, or INSeq) based on a mutagenic transposon that allows capture of adjacent chromosomal DNA to define its genomic location. We used massively parallel sequencing to monitor the relative abundance of tens of thousands of transposon mutants of a saccharolytic human gut bacterium, Bacteroides thetaiotaomicron, as they established themselves in wild-type and immunodeficient gnotobiotic mice, in the presence or absence of other human gut commensals. In vivo selection transforms this population, revealing functions necessary for survival in the gut: we show how this selection is influenced by community composition and competition for nutrients (vitamin B12). INSeq provides a broadly applicable platform to explore microbial adaptation to the gut and other ecosystems.
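Monitoring the relative abundance of transposon mutants before and after selection can be illustrated with a small calculation. The sketch below normalizes read counts per mutant in an input and an output population and reports a log2 fold change for each; it is a generic illustration under stated assumptions, not the INSeq analysis pipeline. The mutant names, the counts, and the pseudocount of 1 are all hypothetical.

```python
from math import log2


def log2_fold_changes(input_counts, output_counts, pseudocount=1):
    """Log2 ratio of output to input relative abundance for each mutant.

    A pseudocount is added to every count so that mutants lost during
    selection still receive a finite (strongly negative) value.
    """
    mutants = sorted(set(input_counts) | set(output_counts))
    in_adj = {m: input_counts.get(m, 0) + pseudocount for m in mutants}
    out_adj = {m: output_counts.get(m, 0) + pseudocount for m in mutants}
    in_total = sum(in_adj.values())
    out_total = sum(out_adj.values())
    return {
        m: log2((out_adj[m] / out_total) / (in_adj[m] / in_total))
        for m in mutants
    }


# HYPOTHETICAL read counts per transposon mutant, for illustration only.
input_pool = {"mutA": 100, "mutB": 100, "mutC": 100}
output_pool = {"mutA": 150, "mutB": 150, "mutC": 0}

lfc = log2_fold_changes(input_pool, output_pool)
for mutant, value in lfc.items():
    print(f"{mutant}: log2 fold change = {value:.2f}")
```

In this toy example the mutant that disappears from the output population gets a strongly negative score, which is the kind of signal that would flag the disrupted gene as a candidate fitness determinant.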
Sporadic heart failure is thought to have a genetic component, but the contributing genetic events are poorly defined. Here, we used ultra-high-throughput resequencing of pooled DNAs to identify SNPs in 4 biologically relevant cardiac signaling genes, and then examined the association between allelic variants and incidence of sporadic heart failure in 2 large Caucasian populations. Resequencing of DNA pools, each containing DNA from approximately 100 individuals, was rapid, accurate, and highly sensitive for identifying common and rare SNPs; it also had striking advantages in time and cost efficiencies over individual resequencing using conventional Sanger methods. In 2,606 individuals examined, we identified a total of 129 separate SNPs in the 4 cardiac signaling genes, including 23 nonsynonymous SNPs that we believe to be novel. Comparison of allele frequencies between 625 Caucasian nonaffected controls and 1,117 Caucasian individuals with systolic heart failure revealed 12 SNPs in the cardiovascular heat shock protein gene HSPB7 with greater proportional representation in the systolic heart failure group; all 12 SNPs were confirmed in an independent replication study. These SNPs were found to be in tight linkage disequilibrium, likely reflecting a single genetic event, but none altered amino acid sequence. These results establish the power and applicability of pooled resequencing for comparative SNP association analysis of target subgenomes in large populations and identify an association between multiple HSPB7 polymorphisms and heart failure.
Rare germline variants are difficult to identify using traditional sequencing due to relatively high cost and low throughput. Using second-generation sequencing, we report a targeted, cost-effective method to quantify rare SNPs from pooled genomic DNA. We pooled DNA from 1,111 individuals and targeted four genes. Our novel base-calling algorithm, SNPSeeker, derived from Large Deviation theory, can detect SNPs present at frequencies below the raw error rate of the sequencing platform.
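The general idea behind a large-deviation approach can be sketched as follows: compare the base frequencies observed at a position with those expected from sequencing error alone, and use the Kullback-Leibler divergence between the two to gauge how improbable the observation is. By Sanov's theorem, that probability decays roughly as exp(-n * D) for n reads, up to polynomial factors. This is an illustration of the principle only and not SNPSeeker's implementation; the error model and read counts are hypothetical.

```python
from math import exp, log


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats; terms with p_i = 0 drop out."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def large_deviation_score(observed_counts, error_model):
    """Return (depth, KL divergence, exp(-depth * KL)).

    The last value is an order-of-magnitude approximation, not an exact p-value.
    """
    depth = sum(observed_counts)
    observed_freqs = [count / depth for count in observed_counts]
    divergence = kl_divergence(observed_freqs, error_model)
    return depth, divergence, exp(-depth * divergence)


# HYPOTHETICAL error model for a reference "A" position, ordered A, C, G, T:
# 1% raw error rate spread over the three non-reference bases.
error_model = [0.99, 0.004, 0.003, 0.003]

# HYPOTHETICAL counts: pure error versus error plus a G variant at about 0.4%.
error_only = [49500, 200, 150, 150]
with_variant = [49300, 200, 350, 150]

_, d_null, score_null = large_deviation_score(error_only, error_model)
_, d_var, score_var = large_deviation_score(with_variant, error_model)
print(f"error only:   D = {d_null:.6f}, exp(-nD) = {score_null:.3g}")
print(f"with variant: D = {d_var:.6f}, exp(-nD) = {score_var:.3g}")
```

The point of the toy numbers is that a variant well below the 1% raw error rate still produces a divergence that, multiplied by a depth of tens of thousands of reads, makes the error-only explanation vanishingly unlikely.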
A single tumor may contain cells with different somatic mutations. By characterizing this genetic heterogeneity within tumors, advances have been made in the prognosis, treatment and understanding of tumorigenesis. In contrast, the extent of epigenetic intra-tumor heterogeneity and how it influences tumor biology is under-explored. We have characterized epigenetic heterogeneity within individual tumors using next-generation sequencing. We used deep single molecule bisulfite sequencing and sample-specific DNA barcodes to determine the spectrum of MLH1 promoter methylation across an average of 1000 molecules in each of 33 individual samples in parallel, including endometrial cancer, matched blood and normal endometrium. This first glimpse, deep into each tumor, revealed unexpectedly heterogeneous patterns of methylation at the MLH1 promoter within a subset of endometrial tumors. This high-resolution analysis allowed us to measure the clonality of methylation in individual tumors and gain insight into the accumulation of aberrant promoter methylation on both alleles during tumorigenesis.
We describe a strategy to analyze the impact of single nucleotide mutations on protein function. Our method utilizes a combination of yeast functional complementation, growth competition of mutant pools and polyacrylamide gel immobilized PCR. A system was constructed in which the yeast PGK1 gene was expressed from a plasmid-borne copy of the gene in a PGK1 deletion strain of Saccharomyces cerevisiae. Using this system, we demonstrated that the enrichment or depletion of PGK1 point mutants from a mixed culture was consistent with the expected results based on the isolated growth rates of the mutants. Enrichment or depletion of individual point mutants was shown to result from increases or decreases, respectively, in the specific activities of the encoded proteins. Further, we demonstrate the ability to analyze the functional effect of many individual point mutations in parallel. By functional complementation of yeast deletions with human homologs, our technique could be readily applied to the functional analysis of single nucleotide polymorphisms in human genes of medical interest.
Ohno [Ohno, S. (1970) in Evolution by Gene Duplication, Springer, New York] proposed that gene duplication with subsequent divergence of paralogs could be a major force in the evolution of new gene functions. In practice, the functional differences between closely related homologues produced by duplications can be subtle and difficult to separate experimentally. Here we show that DNA microarrays can distinguish the functions of two closely related homologues from the yeast Saccharomyces cerevisiae, Yap1p and Yap2p. Although Yap1p and Yap2p are both bZIP transcription factors involved in multiple stress responses and are 88% identical in their DNA binding domains, our work shows that these proteins activate nonoverlapping sets of genes. Yap1p controls a set of genes involved in detoxifying the effects of reactive oxygen species, whereas Yap2p controls a set of genes overrepresented for the function of stabilizing proteins. In addition, we show that the binding sites in the promoters of the Yap1p-dependent genes differ from the sites in the promoters of Yap2p-dependent genes, and we validate experimentally that these differences are important for regulation by Yap1p. We conclude that while Yap1p and Yap2p may have some overlapping functions, they are clearly not redundant and, more generally, that DNA microarray analysis will be an important tool for distinguishing the functions of the large numbers of highly conserved genes found in all eukaryotic genomes.