Recurrent chromosomal aberrations in solid tumors can reveal the genetic pathways involved in the evolution of a malignancy and in some cases predict biological behavior. However, the role of individual genetic backgrounds in shaping karyotypes of sporadic tumors is unknown. The genetic structure of purebred dog breeds coupled with their susceptibility to spontaneous cancers provides a robust model with which to address this question. We tested the hypothesis that there is an association between breed and the distribution of genomic copy number imbalances in naturally-occurring canine tumors through assessment of a cohort of Golden Retrievers and Rottweilers diagnosed with spontaneous appendicular osteosarcoma. Our findings reveal significant correlations between breed and tumor karyotypes that are independent from gender, age at diagnosis and histological classification. These data indicate for the first time that individual genetic backgrounds, as defined by breed in dogs, influence tumor karyotypes in a cancer with extensive genomic instability.
microarray; comparative genomic hybridization (CGH); canine; osteosarcoma; chromosome
To identify susceptibility loci for visceral leishmaniasis we undertook genome-wide association studies in two populations: 989 cases and 1089 controls from India, and 357 cases in 308 Brazilian families (1970 individuals). The HLA-DRB1-HLA-DQA1 locus was the only region to show strong evidence of association in both populations. Replication at this region was undertaken in a second Indian population comprising 941 cases and 990 controls, resulting in Pcombined=2.76×10−17 and OR(95%CI)=1.41(1.30-1.52) across the three cohorts at rs9271858. A conditional analysis provided evidence for multiple associations within the HLA-DRB1-HLA-DQA1 region, and a model in which risk differed between three groups of haplotypes better explained the signal and was significant in the Indian discovery and replication cohorts. In conclusion, the HLA-DRB1-HLA-DQA1 HLA class II region contributes to visceral leishmaniasis susceptibility in India and Brazil, suggesting shared genetic risk factors for visceral leishmaniasis that cross the epidemiological divides of geography and parasite species.
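As a rough consistency check on the figures reported above, an odds ratio and its 95% confidence interval can be converted back to an approximate standard error, z-score and two-sided P value. The sketch below assumes a symmetric Wald-type interval on the log-odds scale and a normal approximation; the helper name `p_from_or_ci` is hypothetical, and because the inputs are rounded the result can only approximate the published combined P value.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate (SE, z, two-sided P) from an odds ratio and its 95% CI.

    Assumption: the interval is a symmetric Wald interval on the log scale.
    """
    log_or = math.log(odds_ratio)
    # Half the CI width on the log scale equals z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal, via erfc.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values quoted in the abstract for rs9271858.
se, z, p = p_from_or_ci(1.41, 1.30, 1.52)
print(f"SE={se:.4f}, z={z:.2f}, P={p:.1e}")
```

The recovered P value should land within an order of magnitude or so of the reported one; exact agreement is not expected from rounded summary statistics.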
To gain further insight into the genetic architecture of psoriasis, we conducted a meta-analysis of three genome-wide association studies (GWAS) and two independent datasets genotyped on the Immunochip, involving 10,588 cases and 22,806 controls in total. We identified 15 new disease susceptibility regions, increasing the number of psoriasis-associated loci to 36 for Caucasians. Conditional analyses identified five independent signals within previously known loci. The newly identified shared disease regions encompassed a number of genes whose products regulate T-cell function (e.g. RUNX3, TAGAP and STAT3). The new psoriasis-specific regions were notable for candidate genes whose products are involved in innate host defense, encoding proteins with roles in interferon-mediated antiviral responses (DDX58), macrophage activation (ZC3H12C), and NF-κB signaling (CARD14 and CARM1). These results portend a better understanding of shared and distinctive genetic determinants of immune-mediated inflammatory disorders and emphasize the importance of the skin in innate and acquired host defense.
Using the Immunochip custom single nucleotide polymorphism (SNP) array, designed for dense genotyping of 186 loci confirmed by genome-wide association studies (GWAS), we analysed 11,475 rheumatoid arthritis cases of European ancestry and 15,870 controls for 129,464 markers. The data were combined in meta-analysis with GWAS data from additional independent cases (n=2,363) and controls (n=17,872). We identified 14 novel loci; nine were associated with rheumatoid arthritis overall and five specifically in anti-citrullinated peptide antibody positive disease, bringing the number of confirmed European ancestry rheumatoid arthritis loci to 46. We refined the peak of association to a single gene for 19 loci, identified secondary independent effects at six loci and association to low frequency variants (minor allele frequency <0.05) at four loci. Bioinformatic analysis of the data generated strong hypotheses for the causal SNP at seven loci. This study illustrates the advantages of dense SNP mapping analysis to inform subsequent functional investigations.
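To make the "low frequency" threshold above concrete, the following sketch computes a minor allele frequency (MAF) from genotype counts at a biallelic SNP and applies the 0.05 cut-off quoted in the abstract. The function names and the genotype counts are made up for illustration and are not taken from the study.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF from counts of the three genotypes at a biallelic SNP."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    # Each AA carrier contributes two A alleles, each AB carrier one.
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)

def is_low_frequency(maf, threshold=0.05):
    """True when the MAF falls below the (assumed strict) threshold."""
    return maf < threshold

# Hypothetical genotype counts, for illustration only.
maf = minor_allele_frequency(n_aa=9400, n_ab=590, n_bb=10)
print(maf, is_low_frequency(maf))
```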
Breast cancer is the most common cancer in women in developed countries. To identify common breast cancer susceptibility alleles, we conducted a genome-wide association study in which 582,886 SNPs were genotyped in 3,659 cases with a family history of the disease and 4,897 controls. Promising associations were evaluated in a second stage, comprising 12,576 cases and 12,223 controls. We identified five new susceptibility loci, on chromosomes 9, 10 and 11 (P = 4.6 × 10−7 to P = 3.2 × 10−15). We also identified SNPs in the 6q25.1 (rs3757318, P = 2.9 × 10−6), 8q24 (rs1562430, P = 5.8 × 10−7) and LSP1 (rs909116, P = 7.3 × 10−7) regions that showed more significant association with risk than those reported previously. Previously identified breast cancer susceptibility loci were also found to show larger effect sizes in this study of familial breast cancer cases than in previous population-based studies, consistent with polygenic susceptibility to the disease.
Trans-acting genetic variants play a substantial, albeit poorly characterized, role in the heritable determination of gene expression. Using paired purified primary monocytes and B-cells we identify novel, predominantly cell-specific, cis- and trans-eQTL (expression quantitative trait loci). These include multi-locus trans-associations to LYZ in monocytes and to KLF4 in B-cells. Additionally, we observe B-cell specific trans-association of rs11171739 at 12q13.2, a known autoimmune disease locus, to IP6K2 (pB-cell=5.8×10−15), PRIC285 (pB-cell=3.0×10−10) and an upstream region of CDKN1A (pB-cell=2×10−52; pmonocyte=1.8×10−4), suggesting roles for cell cycle regulation and PPARγ signaling in disease pathogenesis. We also find specific HLA alleles forming trans-association with the expression of AOAH and ARHGAP24 in monocytes but not in B-cells. In summary, we demonstrate that mapping gene expression in defined primary cell populations identifies new cell-specific trans-regulated networks and provides insights into the genetic basis of disease susceptibility.
Esophageal squamous cell carcinoma (ESCC) is highly prevalent in China and other Asian countries, where it is a major cause of cancer-related mortality. ESCC displays complex chromosomal abnormalities, including multiple structural and numerical aberrations. Chromosomal abnormalities, such as recurrent amplifications and homozygous deletions, directly contribute to tumorigenesis by altering the expression of key oncogenes and tumor suppressor genes.
To understand the role of genetic alterations in ESCC pathogenesis and identify critical amplification/deletion targets, we performed genome-wide 1-Mb array comparative genomic hybridization (aCGH) analysis for 10 commonly used ESCC cell lines. Recurrent chromosomal gains were frequently detected on 3q26-27, 5p15-14, 8p12, 8p22-24, 11q13, 13q21-31, 18p11 and 20q11-13, with frequent losses also found on 8p23-22, 11q22, 14q32 and 18q11-23. Gain of 11q13.3-13.4 was the most frequent alteration in ESCC. Within this region, the CCND1 oncogene was identified with a high level of amplification and overexpression in ESCC, while FGF19 and SHANK2 were also remarkably overexpressed. Moreover, a high concordance (91.5%) of gene amplification and protein overexpression of CCND1 was observed in primary ESCC tumors. CCND1 amplification/overexpression was also significantly correlated with the lymph node metastasis of ESCC.
These findings suggest that genomic gain of 11q13 is the major mechanism contributing to the amplification. Novel oncogenes identified within the 11q13 amplicon including FGF19 and SHANK2 may play important roles in ESCC tumorigenesis.
We densely genotyped, using 1000 Genomes Project pilot CEU and additional re-sequencing study variants, 183 reported immune-mediated disease non-HLA risk loci in 12,041 celiac disease cases and 12,228 controls. We identified 13 new celiac disease risk loci at genome wide significance, bringing the total number of known loci (including HLA) to 40. Multiple independent association signals are found at over a third of these loci, attributable to a combination of common, low frequency, and rare genetic variants. In comparison with previously available data such as HapMap3, our dense genotyping in a large sample size provided increased resolution of the pattern of linkage disequilibrium, and suggested localization of many signals to finer scale regions. In particular, 29 of 54 fine-mapped signals appeared localized to specific single genes - and in some instances to gene regulatory elements. We define a complex genetic architecture of risk regions, and refine risk signals, providing a next step towards elucidating causal disease mechanisms.
Numerous attributes render the domestic dog a highly pertinent model for cancer-associated gene discovery. We performed microarray-based comparative genomic hybridization analysis of 60 spontaneous canine intracranial tumors to examine the degree to which dog and human patients exhibit aberrations of ancestrally related chromosome regions, consistent with a shared pathogenesis. Canine gliomas and meningiomas both demonstrated chromosome copy number aberrations (CNAs) that share evolutionarily conserved synteny with those previously reported in their human counterparts. Interestingly, however, genomic imbalances orthologous to some of the hallmark aberrations of human intracranial tumors, including chromosome 22/NF2 deletions in meningiomas and chromosome 1p/19q deletions in oligodendrogliomas, were not major events in the dog. Furthermore, and perhaps most significantly, we identified highly recurrent CNAs in canine intracranial tumors for which the human orthologue has been reported previously at low frequency but which have not, thus far, been associated intimately with the pathogenesis of the tumor. The presence of orthologous CNAs in canine and human intracranial cancers is strongly suggestive of their biological significance in tumor development and/or progression. Moreover, the limited genetic heterogeneity within purebred dog populations, coupled with the contrasting organization of the dog and human karyotypes, offers tremendous opportunities for refining evolutionarily conserved regions of tumor-associated genomic imbalance that may harbor novel candidate genes involved in their pathogenesis. A comparative approach to the study of canine and human intracranial tumors may therefore provide new insights into their genetic etiology, towards development of more sophisticated molecular subclassification and tailored therapies in both species.
Comparative genomic hybridization; Canine; Brain tumor; Chromosome; Microarray
It is widely accepted that atherosclerosis and inflammation are intimately linked. Monocytes play a key role in both of these processes and we hypothesized that activation of inflammatory pathways in monocytes would lead to, among other effects, proatherogenic changes in the monocyte transcriptome. Such differentially expressed genes in circulating monocytes would be strong candidates for further investigation in disease association studies.
Endotoxin (lipopolysaccharide, LPS) or saline control was infused in healthy volunteers. Monocyte RNA was isolated, processed and hybridized to Hver 2.1.1 spotted cDNA microarrays. Differential expression of key genes was confirmed by RT-PCR and results were compared to in vitro data obtained by our group to identify candidate genes.
All subjects who received LPS experienced the anticipated clinical response, indicating successful stimulation. One hour after LPS infusion, 11 genes were identified as being differentially expressed: 1 downregulated and 10 upregulated. Four hours after LPS infusion, 28 genes were identified as being differentially expressed: 3 downregulated and 25 upregulated. No genes were significantly differentially expressed following saline infusion. Comparison with results obtained in in vitro experiments led to the identification of 6 strong candidate genes (BATF, BID, C3aR1, IL1RN, SEC61B and SLC43A3).
In vivo endotoxin exposure of healthy individuals resulted in the identification of several candidate genes through which systemic inflammation links to atherosclerosis.
Human; Monocytes; LPS infusion; Transcriptome; In Vivo
Metformin is the most commonly used pharmacological therapy for type 2 diabetes. We carried out a GWA study on glycaemic response to metformin in 1024 Scottish patients with type 2 diabetes. Replication was in two cohorts consisting of 1783 Scottish patients and 1113 patients from the UK Prospective Diabetes Study. In a meta-analysis (n=3920) we observed an association (P=2.9×10−9) for the SNP rs11212617 at a locus containing the ataxia telangiectasia mutated (ATM) gene with an odds ratio of 1.35 (95% CI 1.22 to 1.49) for treatment success. In a rat hepatoma cell line, inhibition of ATM with KU-55933 attenuated the phosphorylation and activation of AMPK in response to metformin. We conclude that ATM, a gene known to be involved in DNA repair and cell cycle control, plays a role in the effect of metformin upstream of AMPK, and variation in this gene alters glycaemic response to metformin.
We studied the status of chromosomes 1 and 19 in 363 astrocytic and oligodendroglial tumors. Whereas the predominant pattern of copy number abnormality was a concurrent loss of the entire 1p and 19q regions (total 1p/19q loss) among oligodendroglial tumors and partial deletions of 1p and/or 19q in astrocytic tumors, a subset of apparently astrocytic tumors also had total 1p/19q loss. The presence of total 1p/19q loss was associated with longer survival of patients with all types of adult gliomas independent of age and diagnosis (P = .041). The most commonly deleted region on 19q in astrocytic tumors spans 885 kb in 19q13.33–q13.41, which is telomeric to the previously proposed region. Novel regions of homozygous deletion, including a part of DPYD (1p21.3) or the KLK cluster (19q13.33), were observed in anaplastic oligodendrogliomas. Amplifications encompassing AKT2 (19q13.2) or CCNE1 (19q12) were identified in some glioblastomas. Deletion mapping of the centromeric regions of 1p and 19q in the tumors that had total 1p/19q loss indicated that the breakpoints lie centromeric to NOTCH2 within the pericentromeric regions of 1p and 19q. Thus, we show that the copy number abnormalities of 1p and 19q in human gliomas are complex and have distinct patterns that are prognostically predictive independent of age and pathological diagnosis. Accurate identification of total 1p/19q loss, and its discrimination from other 1p/19q changes, is, however, critical when the 1p/19q copy number status is used to stratify patients in clinical trials.
array-CGH; astrocytoma; centromere; deletion; microarray; oligodendroglioma; translocation
We performed a genome-wide association study (GWAS) in 1705 Parkinson's disease (PD) UK patients and 5175 UK controls, the largest sample size so far for a PD GWAS. Replication was attempted in an additional cohort of 1039 French PD cases and 1984 controls for the 27 regions showing the strongest evidence of association (P< 10−4). We replicated published associations in the 4q22/SNCA and 17q21/MAPT chromosome regions (P< 10−10) and found evidence for an additional independent association in 4q22/SNCA. A detailed analysis of the haplotype structure at 17q21 showed that there are three separate risk groups within this region. We found weak but consistent evidence of association for common variants located in three previously published associated regions (4p15/BST1, 4p16/GAK and 1q32/PARK16). We found no support for the previously reported SNP association in 12q12/LRRK2. We also found an association of the two SNPs in 4q22/SNCA with the age of onset of the disease.
It has recently been shown that nucleosome distribution, histone modifications and RNA polymerase II (Pol II) occupancy show preferential association with exons (“exon-intron marking”), linking chromatin structure and function to co-transcriptional splicing in a variety of eukaryotes. Previous ChIP-sequencing studies suggested that these marking patterns reflect the nucleosomal landscape. By analyzing ChIP-chip datasets across the human genome in three cell types, we have found that this marking system is far more complex than previously observed. We show here that a range of histone modifications and Pol II are preferentially associated with exons. However, there is noticeable cell-type specificity in the degree of exon marking by histone modifications and, surprisingly, this is also reflected in some histone modification patterns showing biases towards introns. Exon-intron marking is laid down in the absence of transcription on silent genes, with some marking biases changing or becoming reversed for genes expressed at different levels. Furthermore, the relationship of this marking system with splicing is not simple, with only some histone modifications reflecting exon usage/inclusion, while others mirror patterns of exon exclusion. By examining nucleosomal distributions in all three cell types, we demonstrate that these histone modification patterns cannot solely be accounted for by differences in nucleosome levels between exons and introns. In addition, because of inherent differences between ChIP-chip array and ChIP-sequencing approaches, these platforms report different nucleosome distribution patterns across the human genome. Our findings confound existing views and point to active cellular mechanisms which dynamically regulate histone modification levels and account for exon-intron marking. We believe that these histone modification patterns provide links between chromatin accessibility, Pol II movement and co-transcriptional splicing.
T-pro are tumor-infiltrating TCRαβ+CD8+ cells of reduced cytotoxic potential that promote experimental two-stage chemical cutaneous carcinogenesis. Toward understanding their mechanism of action, this study uses whole-genome expression analysis to compare T-pro with systemic CD8+ T cells from multiple groups of tumor-bearing mice. T-pro show an overt T helper 17–like profile (high retinoic acid–related orphan receptor-(ROR)γt, IL-17A, IL-17F; low T-bet and eomesodermin), regulatory potential (high FoxP3, IL-10, Tim-3), and transcripts encoding epithelial growth factors (amphiregulin, Gro-1, Gro-2). Tricolor flow cytometry subsequently confirmed the presence of TCRβ+ CD8+ IL-17+ T cells among tumor-infiltrating lymphocytes (TILs). Moreover, a time-course analysis of independent TIL isolates from papillomas versus carcinomas exposed a clear association of the “T-pro phenotype” with malignant progression. This molecular characterization of T-pro builds a foundation for elucidating the contributions of inflammation to cutaneous carcinogenesis, and may provide useful biomarkers for cancer immunotherapy in which the widely advocated use of tumor-specific CD8+ cytolytic T cells should perhaps accommodate the cells’ potential corruption toward the T-pro phenotype. The data are also likely germane to psoriasis, in which the epidermis may be infiltrated by CD8+ IL-17-producing T cells.
We report an alternative approach to transcriptome sequencing for the Illumina Genome Analyzer, in which the reverse transcription reaction takes place on the flowcell. No amplification is performed during the library preparation, so PCR biases and duplicates are avoided. Since the template is poly A+ RNA rather than cDNA, the resulting sequences are necessarily strand-specific. The method is compatible with paired- or single-ended sequencing.
Tumor suppressor genes (TSGs) are often located at chromosomal regions with frequent deletions in tumors. Loss of 16q23 occurs frequently in multiple tumors, indicating the presence of critical TSGs at this locus, such as the well-studied WWOX. Herein we found that ADAMTS18, located next to WWOX, was significantly downregulated in multiple carcinoma cell lines. No deletion of ADAMTS18 was detected with multiplex differential DNA-PCR or high-resolution 1-Mb array-based CGH analysis. Instead, methylation of the ADAMTS18 promoter CpG island was frequently detected with methylation-specific PCR and bisulfite genome sequencing in multiple carcinoma cell lines and primary carcinomas, but not in any non-tumor cell line or normal epithelial tissue. Both pharmacological and genetic demethylation dramatically induced ADAMTS18 expression, indicating that CpG methylation directly contributes to the tumor-specific silencing of ADAMTS18. Ectopic ADAMTS18 expression led to significant inhibition of both anchorage-dependent and -independent growth of carcinoma cells lacking the expression. Thus, through functional epigenetics, we identified ADAMTS18 as a novel functional tumor suppressor that is frequently inactivated epigenetically in multiple carcinomas.
ADAMTS18; methylation; tumor suppressor gene; carcinoma; promoter
By comparative DNA fingerprinting, we identified a 357-bp DNA fragment frequently amplified in esophageal squamous cell carcinomas (ESCC). This fragment overlaps with an expressed sequence tag mapped to 7q22. Further 5′ and 3′-rapid amplification of cDNA ends revealed that it is part of a novel, single-exon gene with a full-length mRNA of 2052 bp that encodes a nuclear protein of 109 amino acids (~15 kDa). This gene, designated GAEC1 (Gene Amplified in Esophageal Cancer 1), was located within a 1-2 Mb amplicon at 7q22.1 identified by high-resolution 1-Mb array-CGH in 6/10 ESCC cell lines. GAEC1 was ubiquitously expressed in normal tissues including esophageal and gastrointestinal organs, with amplification and overexpression in 6/10 (60%) ESCC cell lines and 34/99 (34%) primary tumors. Overexpression of GAEC1 in 3T3 mouse fibroblasts caused foci formation and colony formation in soft agar, comparable to H-ras, and injection of GAEC1-transfected 3T3 cells into athymic nude mice formed undifferentiated sarcoma in vivo, indicating that GAEC1 is a transforming oncogene. Although no significant correlation was observed between GAEC1 amplification and clinicopathological parameters and prognosis, our study demonstrates that overexpressed GAEC1 has tumorigenic potential and suggests that overexpressed GAEC1 may play an important role in ESCC pathogenesis.
Gene amplification; overexpression; transforming gene; esophageal cancer; 7q22
The SCL (TAL1) transcription factor is a critical regulator of haematopoiesis and its expression is tightly controlled by multiple cis-acting regulatory elements. To elaborate further the DNA elements which control its regulation, we used genomic tiling microarrays covering 256 kb of the human SCL locus to perform a concerted analysis of chromatin structure and binding of regulatory proteins in human haematopoietic cell lines. This approach allowed us to characterise further or redefine known human SCL regulatory elements and led to the identification of six novel elements with putative regulatory function both up and downstream of the SCL gene. They bind a number of haematopoietic transcription factors (GATA1, E2A, LMO2, SCL, LDB1), CTCF or components of the transcriptional machinery and are associated with relevant histone modifications, accessible chromatin and low nucleosomal density. Functional characterisation shows that these novel elements are able to enhance or repress SCL promoter activity, have endogenous promoter function or enhancer-blocking insulator function. Our analysis opens up several areas for further investigation and adds new layers of complexity to our understanding of the regulation of SCL expression.
Progressive hearing loss is common in the human population, but little is known about the molecular basis. We report a new ENU-induced mouse mutant, diminuendo, with a single base change in the seed region of Mirn96. Heterozygotes show progressive loss of hearing and hair cell anomalies, while homozygotes have no cochlear responses. Most microRNAs are believed to downregulate target genes by binding to specific sites on their mRNAs, so mutation of the seed should lead to target gene upregulation. Microarray analysis revealed 96 transcripts with significantly altered expression in homozygotes; notably, Slc26a5, oncomodulin, Gfi1, Ptprq and Pitpnm1 were downregulated. Hypergeometric p-value analysis showed hundreds of genes were upregulated in mutants. Different genes, with target sites complementary to the mutant seed, were downregulated. This is the first microRNA found associated with deafness, and diminuendo represents a model for understanding and potentially moderating progressive hair cell degeneration in hearing loss more generally.
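The abstract above does not spell out how its hypergeometric P values were computed, so the following is only a generic sketch of an upper-tail hypergeometric test of the kind commonly used to ask whether genes carrying a given sequence motif are over-represented in a selected gene set. The function name, the argument names and the numbers are hypothetical and chosen purely for illustration.

```python
from math import comb

def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` items are sampled without replacement
    from `population` items, of which `successes` are marked."""
    max_k = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(max(observed, 0), max_k + 1)
    )
    # Integer arithmetic keeps the counts exact until the final division.
    return tail / total

# Toy example: 20 genes, 5 carry the motif, 5 genes are selected,
# and 3 of the selected genes carry the motif.
p_value = hypergeom_upper_tail(population=20, successes=5, draws=5, observed=3)
print(f"{p_value:.4f}")
```

In a real analysis the population would be all genes assayed on the array and the selected set would be the differentially expressed genes; a correction for multiple testing would normally follow.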
Autism comprises a spectrum of behavioral and cognitive disturbances of childhood development and is known to be highly heritable. Although numerous approaches have been used to identify genes implicated in the development of autism, less than 10% of autism cases have been attributed to single gene disorders.
We describe the use of high-resolution genome-wide tilepath microarrays and comparative genomic hybridization to identify copy number variants within 119 probands from multiplex autism families. We next carried out DNA methylation analysis by bisulfite sequencing in a proband and his family, expanding this analysis to methylation analysis of peripheral blood and temporal cortex DNA of autism cases and matched controls from independent datasets. We also assessed oxytocin receptor (OXTR) gene expression within the temporal cortex tissue by quantitative real-time polymerase chain reaction (PCR).
Our analysis revealed that a genomic deletion containing the oxytocin receptor gene, OXTR (MIM accession no.: 167055), previously implicated in autism, was present in an autism proband and his mother, who exhibits symptoms of obsessive-compulsive disorder. The proband's affected sibling did not harbor this deletion but instead may exhibit epigenetic misregulation of this gene through aberrant gene silencing by DNA methylation. Further DNA methylation analysis of the CpG island known to regulate OXTR expression identified several CpG dinucleotides that show independent statistically significant increases in the DNA methylation status in the peripheral blood cells and temporal cortex in independent datasets of individuals with autism as compared to control samples. Associated with the increase in methylation of these CpG dinucleotides is our finding that OXTR mRNA showed decreased expression in the temporal cortex tissue of autism cases matched for age and sex compared to controls.
Together, these data provide further evidence for the role of OXTR and the oxytocin signaling pathway in the etiology of autism and, for the first time, implicate the epigenetic regulation of OXTR in the development of the disorder.
See the related commentary by Gurrieri and Neri:
DNA methylation is a major epigenetic modification important for regulating gene expression and suppressing spurious transcription. Most methods to scan the genome in different tissues for differentially methylated sites have focused on the methylation of CpGs in CpG islands, which are concentrations of CpGs often associated with gene promoters.
Here, we use a methylation profiling strategy that is predominantly responsive to methylation differences outside of CpG islands. The method compares the yield from two samples of size-selected fragments generated by a methylation-sensitive restriction enzyme. We then profile nine different normal tissues from two human donors relative to spleen using a custom array of genomic clones covering the euchromatic portion of human chromosome 1 and representing 8% of the human genome. We observe gross regional differences in methylation states across chromosome 1 between tissues from the same individual, with the most striking differences detected in the comparison of cerebellum and spleen. Profiles of the same tissue from different donors are strikingly similar, as are the profiles of different lobes of the brain. Comparing our results with published gene expression levels, we find that clones exhibiting extreme ratios reflecting low relative methylation are statistically enriched for genes with high expression ratios, and vice versa, in most pairs of tissues examined.
The varied patterns of methylation differences detected between tissues by our methylation profiling method reinforce the potential functional significance of regional differences in methylation levels outside of CpG islands.
Hematopoiesis is a carefully controlled process that is regulated by complex networks of transcription factors that are, in part, controlled by signals resulting from ligand binding to cell-surface receptors. To further understand hematopoiesis, we have compared gene expression profiles of human erythroblasts, megakaryocytes, B cells, cytotoxic and helper T cells, natural killer cells, granulocytes, and monocytes using whole genome microarrays. A bioinformatics analysis of these data was performed focusing on transcription factors, immunoglobulin superfamily members, and lineage-specific transcripts. We observed that the number of lineage-specific genes varies by 2 orders of magnitude, ranging from 5 for cytotoxic T cells to 878 for granulocytes. In addition, we have identified novel coexpression patterns for key transcription factors involved in hematopoiesis (e.g., GATA3-GFI1 and GATA2-KLF1). This study represents the most comprehensive analysis of gene expression in hematopoietic cells to date and has identified genes that play key roles in lineage commitment and cell function. The data, which are freely accessible, will be invaluable for future studies on hematopoiesis and the role of specific genes and will also aid the understanding of the recent genome-wide association studies.
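The "2 orders of magnitude" range quoted above can be checked with a one-line calculation on the two counts given in the abstract (5 and 878). The helper name below is hypothetical; the sketch simply takes the base-10 logarithm of the ratio.

```python
import math

def orders_of_magnitude(smallest, largest):
    """Base-10 logarithm of the ratio between two positive counts."""
    if smallest <= 0 or largest <= 0:
        raise ValueError("counts must be positive")
    return math.log10(largest / smallest)

# Counts of lineage-specific genes quoted in the abstract.
span = orders_of_magnitude(5, 878)
print(f"fold range = {878 / 5:.1f}, log10 = {span:.2f}")
```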