Results 1-13 (13)
 

1.  The Vertebrate Genome Annotation browser 10 years on 
Nucleic Acids Research  2013;42(Database issue):D771-D779.
The Vertebrate Genome Annotation (VEGA) database (http://vega.sanger.ac.uk), initially designed as a community resource for browsing manual annotation of the human genome project, now contains five reference genomes (human, mouse, zebrafish, pig and rat). Its introduction pages have been redesigned to enable the user to easily navigate between whole genomes and smaller multi-species haplotypic regions of interest such as the major histocompatibility complex. The VEGA browser is unique in that annotation is updated via the Human And Vertebrate Analysis aNd Annotation (HAVANA) update track every 2 weeks, allowing single gene updates to be made publicly available to the research community quickly. The user can now access different haplotypic subregions more easily, such as those from the non-obese diabetic mouse, and display them in a more intuitive way using the comparative tools. We also highlight how the user can browse manually annotated updated patches from the Genome Reference Consortium (GRC).
doi:10.1093/nar/gkt1241
PMCID: PMC3964964  PMID: 24316575
2.  Current status and new features of the Consensus Coding Sequence database  
Nucleic Acids Research  2013;42(Database issue):D865-D872.
The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.
doi:10.1093/nar/gkt1059
PMCID: PMC3965069  PMID: 24217909
3.  Fine mapping of type 1 diabetes regions Idd9.1 and Idd9.2 reveals genetic complexity 
Mammalian Genome  2013;24(9-10):358-375.
Nonobese diabetic (NOD) mice congenic for C57BL/10 (B10)-derived genes in the Idd9 region of chromosome 4 are highly protected from type 1 diabetes (T1D). Idd9 has been divided into three protective subregions (Idd9.1, 9.2, and 9.3), each of which partially prevents disease. In this study we have fine-mapped the Idd9.1 and Idd9.2 regions, revealing further genetic complexity with at least two additional subregions contributing to protection from T1D. Using the NOD sequence from bacterial artificial chromosome clones of the Idd9.1 and Idd9.2 regions, as well as recently available whole-genome sequence data, we found a high degree of sequence polymorphism between the NOD and B10 strains in the Idd9 regions. Among numerous candidate genes are several with immunological importance. The Idd9.1 region has been separated into Idd9.1 and Idd9.4, with Lck remaining a candidate gene within Idd9.1. One of the Idd9.2 regions contains the candidate genes Masp2 (encoding mannan-binding lectin serine peptidase 2) and Mtor (encoding mammalian target of rapamycin). From mRNA expression analyses, we have also identified several other differentially expressed candidate genes within the Idd9.1 and Idd9.2 regions. These findings highlight that multiple, relatively small genetic effects combine and interact to produce significant changes in immune tolerance and diabetes onset.
Electronic supplementary material
The online version of this article (doi:10.1007/s00335-013-9466-y) contains supplementary material, which is available to authorized users.
doi:10.1007/s00335-013-9466-y
PMCID: PMC3824839  PMID: 23934554
4.  The non-obese diabetic mouse sequence, annotation and variation resource: an aid for investigating type 1 diabetes 
Model organisms are becoming increasingly important for the study of complex diseases such as type 1 diabetes (T1D). The non-obese diabetic (NOD) mouse is an experimental model for T1D, having been bred to develop the disease spontaneously in a process that is similar to that in humans. Genetic analysis of the NOD mouse has identified around 50 disease loci, which have the nomenclature Idd for insulin-dependent diabetes, distributed across at least 11 different chromosomes. In total, 21 Idd regions across 6 chromosomes that are major contributors to T1D susceptibility or resistance were selected for finished sequencing and annotation at the Wellcome Trust Sanger Institute. Here we describe the generation of 40.4 megabase pairs of finished sequence from 289 bacterial artificial chromosomes for the NOD mouse. Manual annotation has identified 738 genes in the diabetes-sensitive NOD mouse and 765 genes in homologous regions of the diabetes-resistant C57BL/6J reference mouse across 19 candidate Idd regions. This has allowed us to call variation consequences between homologous exonic sequences for all annotated regions in the two mouse strains. We demonstrate the importance of this resource further by illustrating the technical difficulties that regions of inter-strain structural variation between the NOD mouse and the C57BL/6J reference mouse can cause for current next-generation sequencing and assembly techniques. Furthermore, we have established that the variation rate in the Idd regions is 2.3 times higher than the mean found for the whole genome assembly for the NOD/ShiLtJ genome, which we suggest reflects the fact that positive selection for functional variation in immune genes is beneficial in regard to host defence. In summary, we provide an important resource, which aids the analysis of potential causative genes involved in T1D susceptibility.
Database URLs: http://www.sanger.ac.uk/resources/mouse/nod/; http://vega-previous.sanger.ac.uk/info/data/mouse_regions.html
doi:10.1093/database/bat032
PMCID: PMC3668384  PMID: 23729657
5.  The B10 Idd9.3 locus mediates accumulation of functionally superior CD137pos T regulatory cells in the NOD Type 1 diabetes model 
CD137 is a T cell costimulatory molecule encoded by the prime candidate gene (designated Tnfrsf9) in NOD.B10 Idd9.3 congenic mice protected from type one diabetes (T1D). NOD T cells show decreased CD137-mediated T cell signaling compared to NOD.B10 Idd9.3 T cells, but it has been unclear how this decreased CD137 T cell signaling could mediate susceptibility to T1D. We and others have shown that a subset of T regulatory cells (Tregs) constitutively expresses CD137 (whereas T effectors do not, and only express CD137 briefly after activation). Here we show that the B10 Idd9.3 region intrinsically contributes to accumulation of CD137pos Tregs with age. NOD.B10 Idd9.3 mice showed significantly increased percentages and numbers of CD137pos peripheral Tregs compared to NOD mice. Moreover, Tregs expressing the B10 Idd9.3 region preferentially accumulated in mixed bone marrow chimeric mice reconstituted with allotypically marked NOD and NOD.B10 Idd9.3 bone marrow. We demonstrate a possible significance of increased numbers of CD137pos Tregs by showing functional superiority of FACS purified CD137pos Tregs in vitro compared to CD137neg Tregs in T cell suppression assays. Increased functional suppression was also associated with increased production of the alternatively spliced CD137 isoform, soluble CD137, which has been shown to suppress T cell proliferation. We show for the first time that CD137pos Tregs are the primary cellular source of soluble CD137. NOD.B10 Idd9.3 mice showed significantly increased serum soluble CD137 compared to NOD mice with age, consistent with their increased numbers of CD137pos Tregs with age. These studies demonstrate the importance of CD137pos Tregs in T1D and offer a new hypothesis for how the NOD Idd9.3 region could act to increase T1D susceptibility.
doi:10.4049/jimmunol.1101013
PMCID: PMC3505683  PMID: 23066155
6.  Structural and functional annotation of the porcine immunome 
BMC Genomics  2013;14:332.
Background
The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems.
Results
The Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type 1 Interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many un-annotated porcine genes. A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited an accelerated evolution as compared to 4.1% across the entire genome.
Conclusions
This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig’s adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response.
doi:10.1186/1471-2164-14-332
PMCID: PMC3658956  PMID: 23676093
Immune response; Porcine; Genome annotation; Co-expression network; Phylogenetic analysis; Accelerated evolution
7.  Mouse genomic variation and its effect on phenotypes and gene regulation 
Nature  2011;477(7364):289-294.
We report genome sequences of 17 inbred strains of laboratory mice and identify almost ten times more variants than previously known. We use these genomes to explore the phylogenetic history of the laboratory mouse and to examine the functional consequences of allele-specific variation on transcript abundance, revealing that at least 12% of transcripts show a significant tissue-specific expression bias. By identifying candidate functional variants at 718 quantitative trait loci we show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus. These sequences provide a starting point for a new era in the functional analysis of a key model organism.
doi:10.1038/nature10413
PMCID: PMC3276836  PMID: 21921910
8.  Evidence that Cd101 is an autoimmune diabetes gene in NOD mice 
We have previously proposed that sequence variation of the CD101 gene between NOD and C57BL/6 (B6) mice accounts for the protection from type 1 diabetes (T1D) provided by the Idd10 region, a <1 Mb region on mouse chromosome 3. Here, we provide further support for the hypothesis that Cd101 is Idd10 using haplotype and expression analyses of novel Idd10 congenic strains coupled to the development of a CD101 knockout mouse. Susceptibility to T1D was correlated with genotype-dependent CD101 expression on multiple cell subsets, including FoxP3+ regulatory CD4+ T cells, CD11c+ dendritic cells and Gr1+ myeloid cells. The correlation of CD101 expression on immune cells from four independent Idd10 haplotypes with the development of T1D supports the identity of Cd101 as Idd10. Since CD101 has been associated with T regulatory and antigen presentation cell functions, our results provide a further link between immune regulation and susceptibility to T1D.
doi:10.4049/jimmunol.1003523
PMCID: PMC3128927  PMID: 21613616
Rodent; Diabetes; Autoimmunity
9.  NOD congenic strain analysis of autoimmune diabetes reveals genetic complexity of the Idd18 locus and identifies Vav3 as a candidate gene 
We have used the public sequencing and annotation of the mouse genome to delimit the previously resolved type 1 diabetes (T1D) Idd18 interval to a region on chromosome 3 that includes the immunologically relevant candidate gene, Vav3. To test the candidacy of Vav3, we developed a novel congenic strain which enabled the resolution of Idd18 to a 604 kb interval, designated Idd18.1, which contains only two annotated genes: the complete sequence of Vav3, and the last exon of the gene encoding NETRIN G1, Ntng1. Targeted sequencing of Idd18.1 in the NOD mouse strain revealed that allelic variation between NOD and C57BL/6J (B6) occurs in non-coding regions with 138 single nucleotide polymorphisms (SNPs) concentrated in the introns between exons 20 and 27, and immediately after the 3′ UTR. We observed differential expression of VAV3 RNA transcripts in thymocytes when comparing congenic mouse strains with B6 or NOD alleles at Idd18.1. The T1D protection associated with B6 alleles of Idd18.1/Vav3 requires the presence of B6 protective alleles at Idd3, which are correlated with increased IL-2 production and regulatory T cell function. In the absence of B6 protective alleles at Idd3, we detected a second T1D protective B6 locus, Idd18.3, which is closely linked to, but distinct from, Idd18.1. Therefore, genetic mapping, sequencing, and gene expression evidence indicate that alteration of VAV3 expression is an etiological factor in the development of autoimmune beta-cell destruction in NOD mice. This study also demonstrates that a congenic strain mapping approach can isolate closely linked susceptibility genes.
doi:10.4049/jimmunol.0903734
PMCID: PMC2886967  PMID: 20363978
Rodent; Diabetes; Autoimmunity
10.  Genome-wide end-sequenced BAC resources for the NOD/MrkTac and NOD/ShiLtJ mouse genomes
Genomics  2010;95(2):105-110.
Non-obese diabetic (NOD) mice spontaneously develop type 1 diabetes (T1D) due to the progressive loss of insulin-secreting β-cells by an autoimmune driven process. NOD mice represent a valuable tool for studying the genetics of T1D and for evaluating therapeutic interventions. Here we describe the development and characterization by end-sequencing of bacterial artificial chromosome (BAC) libraries derived from NOD/MrkTac (DIL NOD) and NOD/ShiLtJ (CHORI-29), two commonly used NOD substrains. The DIL NOD library is composed of 196,032 BACs and the CHORI-29 library is composed of 110,976 BACs. The average depth of genome coverage of the DIL NOD library, estimated from mapping the BAC end-sequences to the reference mouse genome sequence, was 7.1-fold across the autosomes and 6.6-fold across the X chromosome. Clones from this library have an average insert size of 150 kb and map to over 95.6% of the reference mouse genome assembly (NCBIm37), covering 98.8% of Ensembl mouse genes. By the same metric, the CHORI-29 library has an average depth over the autosomes of 5.0-fold and 2.8-fold coverage of the X chromosome, the reduced X chromosome coverage being due to the use of a male donor for this library. Clones from this library have an average insert size of 205 kb and map to 93.9% of the reference mouse genome assembly, covering 95.7% of Ensembl genes. We have identified and validated 191,841 single nucleotide polymorphisms (SNPs) for DIL NOD and 114,380 SNPs for CHORI-29. In total we generated 229,736,133 bp of sequence for the DIL NOD and 121,963,211 bp for the CHORI-29. These BAC libraries represent a powerful resource for functional studies, such as gene targeting in NOD embryonic stem (ES) cell lines, and for sequencing and mapping experiments.
doi:10.1016/j.ygeno.2009.10.004
PMCID: PMC2824108  PMID: 19909804
Bacterial artificial chromosome; NOD/MrkTac; NOD/ShiLtJ; Mouse genome; Non-obese diabetic (NOD); Type 1 diabetes; T1D; Insulin-dependent diabetes; IDD
11.  The DNA sequence of the human X chromosome 
Ross, Mark T. | Grafham, Darren V. | Coffey, Alison J. | Scherer, Steven | McLay, Kirsten | Muzny, Donna | Platzer, Matthias | Howell, Gareth R. | Burrows, Christine | Bird, Christine P. | Frankish, Adam | Lovell, Frances L. | Howe, Kevin L. | Ashurst, Jennifer L. | Fulton, Robert S. | Sudbrak, Ralf | Wen, Gaiping | Jones, Matthew C. | Hurles, Matthew E. | Andrews, T. Daniel | Scott, Carol E. | Searle, Stephen | Ramser, Juliane | Whittaker, Adam | Deadman, Rebecca | Carter, Nigel P. | Hunt, Sarah E. | Chen, Rui | Cree, Andrew | Gunaratne, Preethi | Havlak, Paul | Hodgson, Anne | Metzker, Michael L. | Richards, Stephen | Scott, Graham | Steffen, David | Sodergren, Erica | Wheeler, David A. | Worley, Kim C. | Ainscough, Rachael | Ambrose, Kerrie D. | Ansari-Lari, M. Ali | Aradhya, Swaroop | Ashwell, Robert I. S. | Babbage, Anne K. | Bagguley, Claire L. | Ballabio, Andrea | Banerjee, Ruby | Barker, Gary E. | Barlow, Karen F. | Barrett, Ian P. | Bates, Karen N. | Beare, David M. | Beasley, Helen | Beasley, Oliver | Beck, Alfred | Bethel, Graeme | Blechschmidt, Karin | Brady, Nicola | Bray-Allen, Sarah | Bridgeman, Anne M. | Brown, Andrew J. | Brown, Mary J. | Bonnin, David | Bruford, Elspeth A. | Buhay, Christian | Burch, Paula | Burford, Deborah | Burgess, Joanne | Burrill, Wayne | Burton, John | Bye, Jackie M. | Carder, Carol | Carrel, Laura | Chako, Joseph | Chapman, Joanne C. | Chavez, Dean | Chen, Ellson | Chen, Guan | Chen, Yuan | Chen, Zhijian | Chinault, Craig | Ciccodicola, Alfredo | Clark, Sue Y. | Clarke, Graham | Clee, Chris M. | Clegg, Sheila | Clerc-Blankenburg, Kerstin | Clifford, Karen | Cobley, Vicky | Cole, Charlotte G. | Conquer, Jen S. | Corby, Nicole | Connor, Richard E. | David, Robert | Davies, Joy | Davis, Clay | Davis, John | Delgado, Oliver | DeShazo, Denise | Dhami, Pawandeep | Ding, Yan | Dinh, Huyen | Dodsworth, Steve | Draper, Heather | Dugan-Rocha, Shannon | Dunham, Andrew | Dunn, Matthew | Durbin, K. James | Dutta, Ireena | Eades, Tamsin | Ellwood, Matthew | Emery-Cohen, Alexandra | Errington, Helen | Evans, Kathryn L. | Faulkner, Louisa | Francis, Fiona | Frankland, John | Fraser, Audrey E. | Galgoczy, Petra | Gilbert, James | Gill, Rachel | Glöckner, Gernot | Gregory, Simon G. | Gribble, Susan | Griffiths, Coline | Grocock, Russell | Gu, Yanghong | Gwilliam, Rhian | Hamilton, Cerissa | Hart, Elizabeth A. | Hawes, Alicia | Heath, Paul D. | Heitmann, Katja | Hennig, Steffen | Hernandez, Judith | Hinzmann, Bernd | Ho, Sarah | Hoffs, Michael | Howden, Phillip J. | Huckle, Elizabeth J. | Hume, Jennifer | Hunt, Paul J. | Hunt, Adrienne R. | Isherwood, Judith | Jacob, Leni | Johnson, David | Jones, Sally | de Jong, Pieter J. | Joseph, Shirin S. | Keenan, Stephen | Kelly, Susan | Kershaw, Joanne K. | Khan, Ziad | Kioschis, Petra | Klages, Sven | Knights, Andrew J. | Kosiura, Anna | Kovar-Smith, Christie | Laird, Gavin K. | Langford, Cordelia | Lawlor, Stephanie | Leversha, Margaret | Lewis, Lora | Liu, Wen | Lloyd, Christine | Lloyd, David M. | Loulseged, Hermela | Loveland, Jane E. | Lovell, Jamieson D. | Lozado, Ryan | Lu, Jing | Lyne, Rachael | Ma, Jie | Maheshwari, Manjula | Matthews, Lucy H. | McDowall, Jennifer | McLaren, Stuart | McMurray, Amanda | Meidl, Patrick | Meitinger, Thomas | Milne, Sarah | Miner, George | Mistry, Shailesh L. | Morgan, Margaret | Morris, Sidney | Müller, Ines | Mullikin, James C. | Nguyen, Ngoc | Nordsiek, Gabriele | Nyakatura, Gerald | O’Dell, Christopher N. | Okwuonu, Geoffery | Palmer, Sophie | Pandian, Richard | Parker, David | Parrish, Julia | Pasternak, Shiran | Patel, Dina | Pearce, Alex V. | Pearson, Danita M. | Pelan, Sarah E. | Perez, Lesette | Porter, Keith M. | Ramsey, Yvonne | Reichwald, Kathrin | Rhodes, Susan | Ridler, Kerry A. | Schlessinger, David | Schueler, Mary G. | Sehra, Harminder K. | Shaw-Smith, Charles | Shen, Hua | Sheridan, Elizabeth M. | Shownkeen, Ratna | Skuce, Carl D. | Smith, Michelle L. | Sotheran, Elizabeth C. | Steingruber, Helen E. | Steward, Charles A. | Storey, Roy | Swann, R. Mark | Swarbreck, David | Tabor, Paul E. | Taudien, Stefan | Taylor, Tineace | Teague, Brian | Thomas, Karen | Thorpe, Andrea | Timms, Kirsten | Tracey, Alan | Trevanion, Steve | Tromans, Anthony C. | d’Urso, Michele | Verduzco, Daniel | Villasana, Donna | Waldron, Lenee | Wall, Melanie | Wang, Qiaoyan | Warren, James | Warry, Georgina L. | Wei, Xuehong | West, Anthony | Whitehead, Siobhan L. | Whiteley, Mathew N. | Wilkinson, Jane E. | Willey, David L. | Williams, Gabrielle | Williams, Leanne | Williamson, Angela | Williamson, Helen | Wilming, Laurens | Woodmansey, Rebecca L. | Wray, Paul W. | Yen, Jennifer | Zhang, Jingkun | Zhou, Jianling | Zoghbi, Huda | Zorilla, Sara | Buck, David | Reinhardt, Richard | Poustka, Annemarie | Rosenthal, André | Lehrach, Hans | Meindl, Alfons | Minx, Patrick J. | Hillier, LaDeana W. | Willard, Huntington F. | Wilson, Richard K. | Waterston, Robert H. | Rice, Catherine M. | Vaudin, Mark | Coulson, Alan | Nelson, David L. | Weinstock, George | Sulston, John E. | Durbin, Richard | Hubbard, Tim | Gibbs, Richard A. | Beck, Stephan | Rogers, Jane | Bentley, David R.
Nature  2005;434(7031):325-337.
The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.
doi:10.1038/nature03440
PMCID: PMC2665286  PMID: 15772651
12.  DNA sequence of human chromosome 17 and analysis of rearrangement in the human lineage 
Zody, Michael C. | Garber, Manuel | Adams, David J. | Sharpe, Ted | Harrow, Jennifer | Lupski, James R. | Nicholson, Christine | Searle, Steven M. | Wilming, Laurens | Young, Sarah K. | Abouelleil, Amr | Allen, Nicole R. | Bi, Weimin | Bloom, Toby | Borowsky, Mark L. | Bugalter, Boris E. | Butler, Jonathan | Chang, Jean L. | Chen, Chao-Kung | Cook, April | Corum, Benjamin | Cuomo, Christina A. | de Jong, Pieter J. | DeCaprio, David | Dewar, Ken | FitzGerald, Michael | Gilbert, James | Gibson, Richard | Gnerre, Sante | Goldstein, Steven | Grafham, Darren V. | Grocock, Russell | Hafez, Nabil | Hagopian, Daniel S. | Hart, Elizabeth | Norman, Catherine Hosage | Humphray, Sean | Jaffe, David B. | Jones, Matt | Kamal, Michael | Khodiyar, Varsha K. | LaButti, Kurt | Laird, Gavin | Lehoczky, Jessica | Liu, Xiaohong | Lokyitsang, Tashi | Loveland, Jane | Lui, Annie | Macdonald, Pendexter | Major, John E. | Matthews, Lucy | Mauceli, Evan | McCarroll, Steven A. | Mihalev, Atanas H. | Mudge, Jonathan | Nguyen, Cindy | Nicol, Robert | O'Leary, Sinéad B. | Osoegawa, Kazutoyo | Schwartz, David C. | Shaw-Smith, Charles | Stankiewicz, Pawel | Steward, Charles | Swarbreck, David | Venkataraman, Vijay | Whittaker, Charles A. | Yang, Xiaoping | Zimmer, Andrew R. | Bradley, Allan | Hubbard, Tim | Birren, Bruce W. | Rogers, Jane | Lander, Eric S. | Nusbaum, Chad
Nature  2006;440(7087):1045-1049.
Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome [1], mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome [2,3]. It is also enriched in segmental duplications, ranking third in density among the autosomes [4]. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution [5,6], the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.
doi:10.1038/nature04689
PMCID: PMC2610434  PMID: 16625196
13.  Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones 
Imanishi, Tadashi | Itoh, Takeshi | Suzuki, Yutaka | O'Donovan, Claire | Fukuchi, Satoshi | Koyanagi, Kanako O | Barrero, Roberto A | Tamura, Takuro | Yamaguchi-Kabata, Yumi | Tanino, Motohiko | Yura, Kei | Miyazaki, Satoru | Ikeo, Kazuho | Homma, Keiichi | Kasprzyk, Arek | Nishikawa, Tetsuo | Hirakawa, Mika | Thierry-Mieg, Jean | Thierry-Mieg, Danielle | Ashurst, Jennifer | Jia, Libin | Nakao, Mitsuteru | Thomas, Michael A | Mulder, Nicola | Karavidopoulou, Youla | Jin, Lihua | Kim, Sangsoo | Yasuda, Tomohiro | Lenhard, Boris | Eveno, Eric | Suzuki, Yoshiyuki | Yamasaki, Chisato | Takeda, Jun-ichi | Gough, Craig | Hilton, Phillip | Fujii, Yasuyuki | Sakai, Hiroaki | Tanaka, Susumu | Amid, Clara | Bellgard, Matthew | de Fatima Bonaldo, Maria | Bono, Hidemasa | Bromberg, Susan K | Brookes, Anthony J | Bruford, Elspeth | Carninci, Piero | Chelala, Claude | Couillault, Christine | de Souza, Sandro J. | Debily, Marie-Anne | Devignes, Marie-Dominique | Dubchak, Inna | Endo, Toshinori | Estreicher, Anne | Eyras, Eduardo | Fukami-Kobayashi, Kaoru | R. Gopinath, Gopal | Graudens, Esther | Hahn, Yoonsoo | Han, Michael | Han, Ze-Guang | Hanada, Kousuke | Hanaoka, Hideki | Harada, Erimi | Hashimoto, Katsuyuki | Hinz, Ursula | Hirai, Momoki | Hishiki, Teruyoshi | Hopkinson, Ian | Imbeaud, Sandrine | Inoko, Hidetoshi | Kanapin, Alexander | Kaneko, Yayoi | Kasukawa, Takeya | Kelso, Janet | Kersey, Paul | Kikuno, Reiko | Kimura, Kouichi | Korn, Bernhard | Kuryshev, Vladimir | Makalowska, Izabela | Makino, Takashi | Mano, Shuhei | Mariage-Samson, Regine | Mashima, Jun | Matsuda, Hideo | Mewes, Hans-Werner | Minoshima, Shinsei | Nagai, Keiichi | Nagasaki, Hideki | Nagata, Naoki | Nigam, Rajni | Ogasawara, Osamu | Ohara, Osamu | Ohtsubo, Masafumi | Okada, Norihiro | Okido, Toshihisa | Oota, Satoshi | Ota, Motonori | Ota, Toshio | Otsuki, Tetsuji | Piatier-Tonneau, Dominique | Poustka, Annemarie | Ren, Shuang-Xi | Saitou, Naruya | Sakai, Katsunaga | Sakamoto, Shigetaka | Sakate, Ryuichi | Schupp, Ingo | Servant, Florence | Sherry, Stephen | Shiba, Rie | Shimizu, Nobuyoshi | Shimoyama, Mary | Simpson, Andrew J | Soares, Bento | Steward, Charles | Suwa, Makiko | Suzuki, Mami | Takahashi, Aiko | Tamiya, Gen | Tanaka, Hiroshi | Taylor, Todd | Terwilliger, Joseph D | Unneberg, Per | Veeramachaneni, Vamsi | Watanabe, Shinya | Wilming, Laurens | Yasuda, Norikazu | Yoo, Hyang-Sook | Stodolsky, Marvin | Makalowski, Wojciech | Go, Mitiko | Nakai, Kenta | Takagi, Toshihisa | Kanehisa, Minoru | Sakaki, Yoshiyuki | Quackenbush, John | Okazaki, Yasushi | Hayashizaki, Yoshihide | Hide, Winston | Chakraborty, Ranajit | Nishikawa, Ken | Sugawara, Hideaki | Tateno, Yoshio | Chen, Zhu | Oishi, Michio | Tonellato, Peter | Apweiler, Rolf | Okubo, Kousaku | Wagner, Lukas | Wiemann, Stefan | Strausberg, Robert L | Isogai, Takao | Auffray, Charles | Nomura, Nobuo | Gojobori, Takashi | Sugano, Sumio
PLoS Biology  2004;2(6):e162.
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.
An international team has systematically validated and annotated just over 21,000 human genes using full-length cDNA, thereby providing a valuable new resource for the human genetics community
doi:10.1371/journal.pbio.0020162
PMCID: PMC393292  PMID: 15103394
