PMCC PMCC

Search tips
Search criteria

Advanced
Results 1-14 (14)
 

Clipboard (0)
None

Select a Filter Below

Journals
Year of Publication
Document Types
1.  The 1000 Genomes Project: Data Management and Community Access 
Nature methods  2012;9(5):459-462.
The 1000 Genomes Project was launched as one of the largest distributed data collection and analysis projects ever undertaken in biology. In addition to the primary scientific goals of creating both a deep catalogue of human genetic variation and extensive methods to accurately discover and characterize variation using new sequencing technologies, the project makes all of its data publicly available for community use. The project data coordination center has developed and deployed several tools to enable widespread data access.
doi:10.1038/nmeth.1974
PMCID: PMC3340611  PMID: 22543379
2.  A Mathematical Approach to the Analysis of Multiplex DNA Profiles 
Bulletin of mathematical biology  2010;73(8):1909-1931.
Multiplex DNA profiles are used extensively for biomedical and forensic purposes. However, while DNA profile data generation is automated, human analysis of those data is not, and the need for speed combined with accuracy demands a computer-automated approach to sample interpretation and quality assessment. In this paper, we describe an integrated mathematical approach to modeling the data and extracting the relevant information, while rejecting noise and sample artifacts. We conclude with examples showing the effectiveness of our algorithms.
doi:10.1007/s11538-010-9598-0
PMCID: PMC3150588  PMID: 21103945
DNA; STR; multiplex; electrophoresis; cubic spline; inner product; norm; Gaussian; OSIRIS; forensics; profile; biomedical
3.  Assessing and Managing Risk when Sharing Aggregate Genetic Variant Data 
Nature Reviews. Genetics  2011;12(10):730-736.
Preface
Access to genetic data across studies is an important aspect of identifying new genetic associations through genome-wide association studies (GWAS). Meta-analysis across multiple GWAS with combined cohort sizes of tens of thousands of individuals often uncovers many more genome-wide associated loci than the original individual studies, which emphasizes the importance of tools and mechanisms for data sharing. However, even sharing summary-level data, such as allele frequencies, inherently carries some degree of privacy risk to study participants. Here we discuss mechanisms and resources for sharing data from GWAS, particularly focusing on approaches for assessing and quantifying privacy risks to participants from sharing of summary-level data.
doi:10.1038/nrg3067
PMCID: PMC3349221  PMID: 21921928
4.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2011;40(D1):D13-D25.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkr1184
PMCID: PMC3245031  PMID: 22140104
5.  The variant call format and VCFtools 
Bioinformatics  2011;27(15):2156-2158.
Summary: The variant call format (VCF) is a generic format for storing DNA polymorphism data such as SNPs, insertions, deletions and structural variants, together with rich annotations. VCF is usually stored in a compressed manner and can be indexed for fast data retrieval of variants from a range of positions on the reference genome. The format was developed for the 1000 Genomes Project, and has also been adopted by other projects such as UK10K, dbSNP and the NHLBI Exome Project. VCFtools is a software suite that implements various utilities for processing VCF files, including validation, merging, comparing and also provides a general Perl API.
Availability: http://vcftools.sourceforge.net
Contact: rd@sanger.ac.uk
doi:10.1093/bioinformatics/btr330
PMCID: PMC3137218  PMID: 21653522
6.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2010;39(Database issue):D38-D51.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkq1172
PMCID: PMC3013733  PMID: 21097890
7.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2009;38(Database issue):D5-D16.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, Reference Sequence, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Peptidome, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkp967
PMCID: PMC2808881  PMID: 19910364
10.  Supplementing High-Density SNP Microarrays for Additional Coverage of Disease-Related Genes: Addiction as a Paradigm 
PLoS ONE  2009;4(4):e5225.
Commercial SNP microarrays now provide comprehensive and affordable coverage of the human genome. However, some diseases have biologically relevant genomic regions that may require additional coverage. Addiction, for example, is thought to be influenced by complex interactions among many relevant genes and pathways. We have assembled a list of 486 biologically relevant genes nominated by a panel of experts on addiction. We then added 424 genes that showed evidence of association with addiction phenotypes through mouse QTL mappings and gene co-expression analysis. We demonstrate that there are a substantial number of SNPs in these genes that are not well represented by commercial SNP platforms. We address this problem by introducing a publicly available SNP database for addiction. The database is annotated using numeric prioritization scores indicating the extent of biological relevance. The scores incorporate a number of factors such as SNP/gene functional properties (including synonymy and promoter regions), data from mouse systems genetics and measures of human/mouse evolutionary conservation. We then used HapMap genotyping data to determine if a SNP is tagged by a commercial microarray through linkage disequilibrium. This combination of biological prioritization scores and LD tagging annotation will enable addiction researchers to supplement commercial SNP microarrays to ensure comprehensive coverage of biologically relevant regions.
doi:10.1371/journal.pone.0005225
PMCID: PMC2668711  PMID: 19381300
11.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2008;37(Database issue):D5-D15.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the web applications is custom implementation of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkn741
PMCID: PMC2686545  PMID: 18940862
13.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2005;34(Database issue):D173-D180.
In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Retroviral Genotyping Tools, HIV-1, Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page at: .
doi:10.1093/nar/gkj158
PMCID: PMC1347520  PMID: 16381840
14.  dbSNP: a database of single nucleotide polymorphisms 
Nucleic Acids Research  2000;28(1):352-355.
In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Cancer for Biotechnology Information (NCBI) has established the dbSNP database. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at website: http://www.ncbi.nlm.nih.gov/SNP . Submitted SNPs can also be downloaded via anonymous FTP at ftp://ncbi. nlm.nih.gov/snp/
PMCID: PMC102496  PMID: 10592272

Results 1-14 (14)