PMCC

Results 1-18 (18)
 

1.  WormBase 2014: new views of curated biology 
Nucleic Acids Research  2013;42(Database issue):D789-D793.
WormBase (http://www.wormbase.org/) is a highly curated resource dedicated to supporting research using the model organism Caenorhabditis elegans. With an electronic history predating the World Wide Web, WormBase contains information ranging from the sequence and phenotype of individual alleles to genome-wide studies generated using next-generation sequencing technologies. In recent years, we have expanded the contents to include data on additional nematodes of agricultural and medical significance, bringing the knowledge of C. elegans to bear on these systems and providing support for underserved research communities. Manual curation of the primary literature remains a central focus of the WormBase project, providing users with reliable, up-to-date and highly cross-linked information. In this update, we describe efforts to organize the original atomized and highly contextualized curated data into integrated syntheses of discrete biological topics. Next, we discuss our experiences coping with the vast increase in available genome sequences made possible through next-generation sequencing platforms. Finally, we describe some of the features and tools of the new WormBase Web site that help users better find and explore data of interest.
doi:10.1093/nar/gkt1063
PMCID: PMC3965043  PMID: 24194605
2.  Ensembl Genomes 2013: scaling up access to genome-wide data 
Nucleic Acids Research  2013;42(Database issue):D546-D552.
Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.
doi:10.1093/nar/gkt979
PMCID: PMC3965094  PMID: 24163254
3.  Structural analysis of the genome of breast cancer cell line ZR-75-30 identifies twelve expressed fusion genes 
BMC Genomics  2012;13:719.
Background
It has recently emerged that common epithelial cancers such as breast cancers have fusion genes like those in leukaemias. In a representative breast cancer cell line, ZR-75-30, we searched for fusion genes, by analysing genome rearrangements.
Results
We first analysed rearrangements of the ZR-75-30 genome, to around 10kb resolution, by molecular cytogenetic approaches, combining array painting and array CGH. We then compared this map with genomic junctions determined by paired-end sequencing. Most of the breakpoints found by array painting and array CGH were identified in the paired end sequencing—55% of the unamplified breakpoints and 97% of the amplified breakpoints (as these are represented by more sequence reads). From this analysis we identified 9 expressed fusion genes: APPBP2-PHF20L1, BCAS3-HOXB9, COL14A1-SKAP1, TAOK1-PCGF2, TIAM1-NRIP1, TIMM23-ARHGAP32, TRPS1-LASP1, USP32-CCDC49 and ZMYM4-OPRD1. We also determined the genomic junctions of a further three expressed fusion genes that had been described by others, BCAS3-ERBB2, DDX5-DEPDC6/DEPTOR and PLEC1-ENPP2. Of this total of 12 expressed fusion genes, 9 were in the coamplification. Due to the sensitivity of the technologies used, we estimate these 12 fusion genes to be around two-thirds of the true total. Many of the fusions seem likely to be driver mutations. For example, PHF20L1, BCAS3, TAOK1, PCGF2, and TRPS1 are fused in other breast cancers. HOXB9 and PHF20L1 are members of gene families that are fused in other neoplasms. Several of the other genes are relevant to cancer—in addition to ERBB2, SKAP1 is an adaptor for Src, DEPTOR regulates the mTOR pathway and NRIP1 is an estrogen-receptor coregulator.
Conclusions
This is the first structural analysis of a breast cancer genome that combines classical molecular cytogenetic approaches with sequencing. Paired-end sequencing was able to detect almost all breakpoints, where there was adequate read depth. It supports the view that gene breakage and gene fusion are important classes of mutation in breast cancer, with a typical breast cancer expressing many fusion genes.
doi:10.1186/1471-2164-13-719
PMCID: PMC3548764  PMID: 23260012
Breast cancer; Chromosome aberrations; Genomics; Fusion genes
4.  Discovery and Targeted LC-MS/MS of Purified Polerovirus Reveals Differences in the Virus-Host Interactome Associated with Altered Aphid Transmission 
PLoS ONE  2012;7(10):e48177.
Circulative transmission of viruses in the Luteoviridae, such as cereal yellow dwarf virus (CYDV), requires a series of precisely orchestrated interactions between virus, plant, and aphid proteins. Natural selection has favored these viruses to be retained in the phloem to facilitate acquisition and transmission by aphids. We show that treatment of infected oat tissue homogenate with sodium sulfite reduces transmission of the purified virus by aphids. Transmission electron microscopy data indicated no gross change in virion morphology due to treatments. However, treated virions were not acquired by aphids through the hindgut epithelial cells and were not transmitted when injected directly into the hemocoel. Analysis of virus preparations using nanoflow liquid chromatography coupled to tandem mass spectrometry revealed a number of host plant proteins co-purifying with viruses, some of which were lost following sodium sulfite treatment. Using targeted mass spectrometry, we show data suggesting that several of the virus-associated host plant proteins accumulated to higher levels in aphids that were fed on CYDV-infected plants compared to healthy plants. We propose two hypotheses to explain these observations, and these are not mutually exclusive: (a) that sodium sulfite treatment disrupts critical virion-host protein interactions required for aphid transmission, or (b) that host infection with CYDV modulates phloem protein expression in a way that is favorable for virus uptake by aphids. Importantly, the genes coding for the plant proteins associated with virus may be examined as targets in breeding cereal crops for new modes of virus resistance that disrupt phloem-virus or aphid-virus interactions.
doi:10.1371/journal.pone.0048177
PMCID: PMC3484124  PMID: 23118947
5.  Piwi and piRNAs Act Upstream of an Endogenous siRNA Pathway to Suppress Tc3 Transposon Mobility in the Caenorhabditis elegans Germline 
Molecular cell  2008;31(1):79-90.
The Piwi proteins of the Argonaute superfamily are required for normal germline development in Drosophila, zebrafish and mice, and associate with 24-30 nucleotide RNAs termed piRNAs. We identify a class of 21 nucleotide RNAs, previously named 21U-RNAs, as the piRNAs of C. elegans. Piwi and piRNA expression is restricted to the male and female germline and independent of many proteins in other small RNA pathways, including DCR-1. We show that Piwi is specifically required to silence Tc3, but not other Tc/mariner DNA transposons. Tc3 excision rates in the germline are increased at least 100 fold in piwi mutants as compared to wild type. We find no evidence for a Ping-Pong model for piRNA amplification in C. elegans. Instead, we demonstrate that Piwi acts upstream of an endogenous siRNA pathway in Tc3 silencing. These data might suggest a link between piRNA and siRNA function.
doi:10.1016/j.molcel.2008.06.003
PMCID: PMC3353317  PMID: 18571451
6.  WormBase 
Worm  2012;1(1):15-21.
WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase’s role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.
doi:10.4161/worm.19574
PMCID: PMC3670165  PMID: 24058818
Caenorhabditis elegans; annotation; community resource; genome; model organism database; nematode; parasitic nematode; sequence curation
7.  WormBase 2012: more genomes, more data, new website 
Nucleic Acids Research  2011;40(Database issue):D735-D741.
Since its release in 2000, WormBase (http://www.wormbase.org) has grown from a small resource focusing on a single species and serving a dedicated research community, to one now spanning 15 species essential to the broader biomedical and agricultural research fields. To enhance the rate of curation, we have automated the identification of key data in the scientific literature and use similar methodology for data extraction. To ease access to the data, we are collaborating with journals to link entities in research publications to their report pages at WormBase. To facilitate discovery, we have added new views of the data, integrated large-scale datasets and expanded descriptions of models for human disease. Finally, we have introduced a dramatic overhaul of the WormBase website for public beta testing. Designed to balance complexity and usability, the new site is species-agnostic, highly customizable, and interactive. Casual users and developers alike will be able to leverage the public RESTful application programming interface (API) to generate custom data mining solutions and extensions to the site. We report on the growth of our database and on our work in keeping pace with the growing demand for data, efforts to anticipate the requirements of users and new collaborations with the larger science community.
doi:10.1093/nar/gkr954
PMCID: PMC3245152  PMID: 22067452
8.  DNA methylation profiling of human chromosomes 6, 20 and 22 
Nature genetics  2006;38(12):1378-1385.
DNA methylation constitutes the most stable type of epigenetic modification modulating the transcriptional plasticity of mammalian genomes. Using bisulfite DNA sequencing, we report high-resolution methylation reference profiles of human chromosomes 6, 20 and 22, providing a resource of about 1.9 million CpG methylation values derived from 12 different tissues. Analysis of 6 annotation categories revealed evolutionarily conserved regions to be the predominant sites for differential DNA methylation and a core region surrounding the transcriptional start site as an informative surrogate for promoter methylation. We find 17% of the 873 analyzed genes differentially methylated in their 5′-untranslated regions (5′-UTR) and about one third of the differentially methylated 5′-UTRs to be inversely correlated with transcription. While our study was controlled for factors reported to affect DNA methylation such as sex and age, we did not find any significant attributable effects. Our data suggest DNA methylation to be ontogenetically more stable than previously thought.
doi:10.1038/ng1909
PMCID: PMC3082778  PMID: 17072317
9.  Development of stable reporter system cloning luxCDABE genes into chromosome of Salmonella enterica serotypes using Tn7 transposon 
BMC Microbiology  2010;10:197.
Background
Salmonellosis may be a food safety problem when raw food products are mishandled and not fully cooked. In previous work, we developed bioluminescent Salmonella enterica serotypes using a plasmid-based reporting system that can be used for real-time monitoring of the pathogen's growth on food products in short term studies. In this study, we report the use of a Tn7-based transposon system for subcloning of luxCDABE genes into the chromosome of eleven Salmonella enterica serotypes isolated from the broiler production continuum.
Results
We found that the lux operon is constitutively expressed from the chromosome post-transposition and the lux cassette is stable without external pressure, i.e. antibiotic selection, for all Salmonella enterica serotypes used. Bioluminescence expression is based on an active electron transport chain and is directly related to metabolic activity. This relationship was quantified by measuring bioluminescence against a temperature gradient in aqueous solution using a luminometer. In addition, bioluminescent monitoring of two serotypes confirmed that our chicken skin model has the potential to be used to evaluate pathogen mitigation strategies.
Conclusions
This study demonstrated that our new stable reporting system eliminates bioluminescence variation due to plasmid instability and provides a reliable real-time experimental system to study the application of preventive measures for Salmonella on food products in both short- and long-term studies.
doi:10.1186/1471-2180-10-197
PMCID: PMC2918591  PMID: 20653968
10.  Mining the surface proteome of tomato (Solanum lycopersicum) fruit for proteins associated with cuticle biogenesis 
Journal of Experimental Botany  2010;61(13):3759-3771.
The aerial organs of plants are covered by the cuticle, a polyester matrix of cutin and organic solvent-soluble waxes that is contiguous with the polysaccharide cell wall of the epidermis. The cuticle is an important surface barrier between a plant and its environment, providing protection against desiccation, disease, and pests. However, many aspects of the mechanisms of cuticle biosynthesis, assembly, and restructuring are entirely unknown. To identify candidate proteins with a role in cuticle biogenesis, a surface protein extract was obtained from tomato (Solanum lycopersicum) fruits by dipping in an organic solvent and the constituent proteins were identified by several complementary fractionation strategies and two mass spectrometry techniques. Of the ∼200 proteins that were identified, a subset is potentially involved in the transport, deposition, or modification of the cuticle, such as those with predicted lipid-associated protein domains. These include several lipid-transfer proteins, GDSL-motif lipase/hydrolase family proteins, and an MD-2-related lipid recognition domain-containing protein. The epidermal-specific transcript accumulation of several of these candidates was confirmed by laser-capture microdissection and quantitative reverse transcription-PCR (qRT-PCR), together with their expression during various stages of fruit development. This indicated a complex pattern of cuticle deposition, and models for cuticle biogenesis and restructuring are discussed.
doi:10.1093/jxb/erq194
PMCID: PMC2921210  PMID: 20571035
Cuticle; cutin; lipid; proteome; tomato fruit; wax
11.  The DNA sequence of the human X chromosome 
Ross, Mark T. | Grafham, Darren V. | Coffey, Alison J. | Scherer, Steven | McLay, Kirsten | Muzny, Donna | Platzer, Matthias | Howell, Gareth R. | Burrows, Christine | Bird, Christine P. | Frankish, Adam | Lovell, Frances L. | Howe, Kevin L. | Ashurst, Jennifer L. | Fulton, Robert S. | Sudbrak, Ralf | Wen, Gaiping | Jones, Matthew C. | Hurles, Matthew E. | Andrews, T. Daniel | Scott, Carol E. | Searle, Stephen | Ramser, Juliane | Whittaker, Adam | Deadman, Rebecca | Carter, Nigel P. | Hunt, Sarah E. | Chen, Rui | Cree, Andrew | Gunaratne, Preethi | Havlak, Paul | Hodgson, Anne | Metzker, Michael L. | Richards, Stephen | Scott, Graham | Steffen, David | Sodergren, Erica | Wheeler, David A. | Worley, Kim C. | Ainscough, Rachael | Ambrose, Kerrie D. | Ansari-Lari, M. Ali | Aradhya, Swaroop | Ashwell, Robert I. S. | Babbage, Anne K. | Bagguley, Claire L. | Ballabio, Andrea | Banerjee, Ruby | Barker, Gary E. | Barlow, Karen F. | Barrett, Ian P. | Bates, Karen N. | Beare, David M. | Beasley, Helen | Beasley, Oliver | Beck, Alfred | Bethel, Graeme | Blechschmidt, Karin | Brady, Nicola | Bray-Allen, Sarah | Bridgeman, Anne M. | Brown, Andrew J. | Brown, Mary J. | Bonnin, David | Bruford, Elspeth A. | Buhay, Christian | Burch, Paula | Burford, Deborah | Burgess, Joanne | Burrill, Wayne | Burton, John | Bye, Jackie M. | Carder, Carol | Carrel, Laura | Chako, Joseph | Chapman, Joanne C. | Chavez, Dean | Chen, Ellson | Chen, Guan | Chen, Yuan | Chen, Zhijian | Chinault, Craig | Ciccodicola, Alfredo | Clark, Sue Y. | Clarke, Graham | Clee, Chris M. | Clegg, Sheila | Clerc-Blankenburg, Kerstin | Clifford, Karen | Cobley, Vicky | Cole, Charlotte G. | Conquer, Jen S. | Corby, Nicole | Connor, Richard E. | David, Robert | Davies, Joy | Davis, Clay | Davis, John | Delgado, Oliver | DeShazo, Denise | Dhami, Pawandeep | Ding, Yan | Dinh, Huyen | Dodsworth, Steve | Draper, Heather | Dugan-Rocha, Shannon | Dunham, Andrew | Dunn, Matthew | Durbin, K. James | Dutta, Ireena | Eades, Tamsin | Ellwood, Matthew | Emery-Cohen, Alexandra | Errington, Helen | Evans, Kathryn L. | Faulkner, Louisa | Francis, Fiona | Frankland, John | Fraser, Audrey E. | Galgoczy, Petra | Gilbert, James | Gill, Rachel | Glöckner, Gernot | Gregory, Simon G. | Gribble, Susan | Griffiths, Coline | Grocock, Russell | Gu, Yanghong | Gwilliam, Rhian | Hamilton, Cerissa | Hart, Elizabeth A. | Hawes, Alicia | Heath, Paul D. | Heitmann, Katja | Hennig, Steffen | Hernandez, Judith | Hinzmann, Bernd | Ho, Sarah | Hoffs, Michael | Howden, Phillip J. | Huckle, Elizabeth J. | Hume, Jennifer | Hunt, Paul J. | Hunt, Adrienne R. | Isherwood, Judith | Jacob, Leni | Johnson, David | Jones, Sally | de Jong, Pieter J. | Joseph, Shirin S. | Keenan, Stephen | Kelly, Susan | Kershaw, Joanne K. | Khan, Ziad | Kioschis, Petra | Klages, Sven | Knights, Andrew J. | Kosiura, Anna | Kovar-Smith, Christie | Laird, Gavin K. | Langford, Cordelia | Lawlor, Stephanie | Leversha, Margaret | Lewis, Lora | Liu, Wen | Lloyd, Christine | Lloyd, David M. | Loulseged, Hermela | Loveland, Jane E. | Lovell, Jamieson D. | Lozado, Ryan | Lu, Jing | Lyne, Rachael | Ma, Jie | Maheshwari, Manjula | Matthews, Lucy H. | McDowall, Jennifer | McLaren, Stuart | McMurray, Amanda | Meidl, Patrick | Meitinger, Thomas | Milne, Sarah | Miner, George | Mistry, Shailesh L. | Morgan, Margaret | Morris, Sidney | Müller, Ines | Mullikin, James C. | Nguyen, Ngoc | Nordsiek, Gabriele | Nyakatura, Gerald | O’Dell, Christopher N. | Okwuonu, Geoffery | Palmer, Sophie | Pandian, Richard | Parker, David | Parrish, Julia | Pasternak, Shiran | Patel, Dina | Pearce, Alex V. | Pearson, Danita M. | Pelan, Sarah E. | Perez, Lesette | Porter, Keith M. | Ramsey, Yvonne | Reichwald, Kathrin | Rhodes, Susan | Ridler, Kerry A. | Schlessinger, David | Schueler, Mary G. | Sehra, Harminder K. | Shaw-Smith, Charles | Shen, Hua | Sheridan, Elizabeth M. | Shownkeen, Ratna | Skuce, Carl D. | Smith, Michelle L. | Sotheran, Elizabeth C. | Steingruber, Helen E. | Steward, Charles A. | Storey, Roy | Swann, R. Mark | Swarbreck, David | Tabor, Paul E. | Taudien, Stefan | Taylor, Tineace | Teague, Brian | Thomas, Karen | Thorpe, Andrea | Timms, Kirsten | Tracey, Alan | Trevanion, Steve | Tromans, Anthony C. | d’Urso, Michele | Verduzco, Daniel | Villasana, Donna | Waldron, Lenee | Wall, Melanie | Wang, Qiaoyan | Warren, James | Warry, Georgina L. | Wei, Xuehong | West, Anthony | Whitehead, Siobhan L. | Whiteley, Mathew N. | Wilkinson, Jane E. | Willey, David L. | Williams, Gabrielle | Williams, Leanne | Williamson, Angela | Williamson, Helen | Wilming, Laurens | Woodmansey, Rebecca L. | Wray, Paul W. | Yen, Jennifer | Zhang, Jingkun | Zhou, Jianling | Zoghbi, Huda | Zorilla, Sara | Buck, David | Reinhardt, Richard | Poustka, Annemarie | Rosenthal, André | Lehrach, Hans | Meindl, Alfons | Minx, Patrick J. | Hillier, LaDeana W. | Willard, Huntington F. | Wilson, Richard K. | Waterston, Robert H. | Rice, Catherine M. | Vaudin, Mark | Coulson, Alan | Nelson, David L. | Weinstock, George | Sulston, John E. | Durbin, Richard | Hubbard, Tim | Gibbs, Richard A. | Beck, Stephan | Rogers, Jane | Bentley, David R.
Nature  2005;434(7031):325-337.
The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.
doi:10.1038/nature03440
PMCID: PMC2665286  PMID: 15772651
12.  A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis 
Nature biotechnology  2008;26(7):779-785.
DNA methylation is an indispensable epigenetic modification of mammalian genomes. Consequently there is great interest in strategies for genome-wide/whole-genome DNA methylation analysis, and immunoprecipitation-based methods have proven to be a powerful option. Such methods are rapidly shifting the bottleneck from data generation to data analysis, necessitating the development of better analytical tools. Until now, a major analytical difficulty associated with immunoprecipitation-based DNA methylation profiling has been the inability to estimate absolute methylation levels. Here we report the development of a novel cross-platform algorithm – Bayesian Tool for Methylation Analysis (Batman) – for analyzing Methylated DNA Immunoprecipitation (MeDIP) profiles generated using arrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). The latter is an approach we have developed to elucidate the first high-resolution whole-genome DNA methylation profile (DNA methylome) of any mammalian genome. MeDIP-seq/MeDIP-chip combined with Batman represent robust, quantitative, and cost-effective functional genomic strategies for elucidating the function of DNA methylation.
doi:10.1038/nbt1414
PMCID: PMC2644410  PMID: 18612301
13.  Development of bioluminescent Salmonella strains for use in food safety 
BMC Microbiology  2008;8:10.
Background
Salmonella can reside in healthy animals without the manifestation of any adverse effects on the carrier. If raw products of animal origin are not handled properly during processing or cooked to a proper temperature during preparation, salmonellosis can occur. In this research, we developed bioluminescent Salmonella strains that can be used for real-time monitoring of the pathogen's growth on food products. To accomplish this, twelve Salmonella strains from the broiler production continuum were transformed with the broad host range plasmid pAKlux1, and a chicken skin attachment model was developed.
Results
Salmonella strains carrying pAKlux1 constitutively expressed the luxCDABE operon and were therefore detectable using bioluminescence. Strains were characterized in terms of bioluminescence properties and plasmid stability. To assess the usefulness of bioluminescent Salmonella strains in food safety studies, we developed an attachment model using chicken skin. The effect of washing on attachment of Salmonella strains to chicken skin was tested using bioluminescent strains, which revealed the attachment properties of each strain.
Conclusion
This study demonstrated that bioluminescence is a sensitive and effective tool to detect Salmonella on food products in real-time. Bioluminescence imaging is a promising technology that can be utilized to evaluate new food safety measures for reducing Salmonella contamination on food products.
doi:10.1186/1471-2180-8-10
PMCID: PMC2257966  PMID: 18211715
14.  Variation analysis and gene annotation of eight MHC haplotypes: The MHC Haplotype Project 
Immunogenetics  2008;60(1):1-18.
The human major histocompatibility complex (MHC) is contained within about 4 Mb on the short arm of chromosome 6 and is recognised as the most variable region in the human genome. The primary aim of the MHC Haplotype Project was to provide a comprehensively annotated reference sequence of a single, human leukocyte antigen-homozygous MHC haplotype and to use it as a basis against which variations could be assessed from seven other similarly homozygous cell lines, representative of the most common MHC haplotypes in the European population. Comparison of the haplotype sequences, including four haplotypes not previously analysed, resulted in the identification of >44,000 variations, both substitutions and indels (insertions and deletions), which have been submitted to the dbSNP database. The gene annotation uncovered haplotype-specific differences and confirmed the presence of more than 300 loci, including over 160 protein-coding genes. Combined analysis of the variation and annotation datasets revealed 122 gene loci with coding substitutions of which 97 were non-synonymous. The haplotype (A3-B7-DR15; PGF cell line) designated as the new MHC reference sequence, has been incorporated into the human genome assembly (NCBI35 and subsequent builds), and constitutes the largest single-haplotype sequence of the human genome to date. The extensive variation and annotation data derived from the analysis of seven further haplotypes have been made publicly available and provide a framework and resource for future association studies of all MHC-associated diseases and transplant medicine.
doi:10.1007/s00251-007-0262-2
PMCID: PMC2206249  PMID: 18193213
Major histocompatibility complex; Haplotype; Polymorphism; Retroelement; Genetic predisposition to disease; Population genetics
15.  A Comparison of nLC-ESI-MS/MS and nLC-MALDI-MS/MS for GeLC-Based Protein Identification and iTRAQ-Based Shotgun Quantitative Proteomics 
The use of nLC-ESI-MS/MS in shotgun proteomics experiments and GeLC-MS/MS analysis is well accepted and routinely available in most proteomics laboratories. However, the same cannot be said for nLC-MALDI-MS/MS, which has yet to experience such widespread acceptance, despite the fact that the MALDI technology offers several critical advantages over ESI. As an illustration, in an analysis of a moderately complex sample of E. coli proteins, the use of MALDI in addition to ESI in GeLC-MS/MS resulted in a 16% average increase in protein identifications, while with more complex samples the number of additional protein identifications increased by an average of 45%. The size of the unique peptides identified by MALDI was, on average, 25% larger than the unique peptides identified by ESI, and they were found to be slightly more hydrophilic. The insensitivity of MALDI to the presence of ionization suppression agents was shown to be a significant advantage, suggesting it be used as a complement to ESI when ion suppression is a possibility. Furthermore, the higher resolution of the TOF/TOF instrument improved the sensitivity, accuracy, and precision of the data over that obtained using only ESI-based iTRAQ experiments using a linear ion trap. Nevertheless, accurate data can be generated with either instrument. These results demonstrate that coupling nanoLC with both ESI and MALDI ionization interfaces improves proteome coverage, reduces the deleterious effects of ionization suppression agents, and improves quantitation, particularly in complex samples.
PMCID: PMC2062563  PMID: 17916795
nLC-ESI-MS/MS; nLC-MALDI-MS/MS; protein identification; quantitation; quadrupole linear ion trap; tandem time-of-flight; mass spectrometry
16.  The Pfam Protein Families Database 
Nucleic Acids Research  2002;30(1):276-280.
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at http://www.sanger.ac.uk/Software/Pfam/, in Sweden at http://www.cgb.ki.se/Pfam/, in France at http://pfam.jouy.inra.fr/ and in the US at http://pfam.wustl.edu/. The latest version (6.6) of Pfam contains 3071 families, which match 69% of proteins in SWISS-PROT 39 and TrEMBL 14. Structural data, where available, have been utilised to ensure that Pfam families correspond with structural domains, and to improve domain-based annotation. Predictions of non-domain regions are now also included. In addition to secondary structure, Pfam multiple sequence alignments now contain active site residue mark-up. New search tools, including taxonomy search and domain query, greatly add to the functionality and usability of the Pfam resource.
PMCID: PMC99071  PMID: 11752314
17.  The Pfam Protein Families Database 
Nucleic Acids Research  2000;28(1):263-266.
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the WWW in the UK at http://www.sanger.ac.uk/Software/Pfam/, in Sweden at http://www.cgr.ki.se/Pfam/ and in the US at http://pfam.wustl.edu/. The latest version (4.3) of Pfam contains 1815 families. These Pfam families match 63% of proteins in SWISS-PROT 37 and TrEMBL 9. For complete genomes Pfam currently matches up to half of the proteins. Genomic DNA can be directly searched against the Pfam library using the Wise2 package.
PMCID: PMC102420  PMID: 10592242
18.  DNA Methylation Profiling of the Human Major Histocompatibility Complex: A Pilot Study for the Human Epigenome Project 
PLoS Biology  2004;2(12):e405.
The Human Epigenome Project aims to identify, catalogue, and interpret genome-wide DNA methylation phenomena. Occurring naturally on cytosine bases at cytosine–guanine dinucleotides, DNA methylation is intimately involved in diverse biological processes and the aetiology of many diseases. Differentially methylated cytosines give rise to distinct profiles, thought to be specific for gene activity, tissue type, and disease state. The identification of such methylation variable positions will significantly improve our understanding of genome biology and our ability to diagnose disease. Here, we report the results of the pilot study for the Human Epigenome Project entailing the methylation analysis of the human major histocompatibility complex. This study involved the development of an integrated pipeline for high-throughput methylation analysis using bisulphite DNA sequencing, discovery of methylation variable positions, epigenotyping by matrix-assisted laser desorption/ionisation mass spectrometry, and development of an integrated public database available at http://www.epigenome.org. Our analysis of DNA methylation levels within the major histocompatibility complex, including regulatory exonic and intronic regions associated with 90 genes in multiple tissues and individuals, reveals a bimodal distribution of methylation profiles (i.e., the vast majority of the analysed regions were either hypo- or hypermethylated), tissue specificity, inter-individual variation, and correlation with independent gene expression data.
DNA is frequently modified by methylation, which can affect its function. The Human Epigenome Project aims to identify, catalog, and interpret DNA methylation throughout the genome.
doi:10.1371/journal.pbio.0020405
PMCID: PMC529316  PMID: 15550986
