PMCC
Results 1-4 (4)
 

1.  Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies 
Genome Biology  2014;15(3):R59.
Background
The size and complexity of conifer genomes have, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination.
Results
We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next-generation sequence data generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp of sequence, with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome.
Conclusions
In addition to their value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrate a novel approach to sequencing the large and complex genomes of this important group of plants, an approach that can now be widely applied.
doi:10.1186/gb-2014-15-3-r59
PMCID: PMC4053751  PMID: 24647006
2.  The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color 
Genome Biology  2013;14(6):r53.
Background
Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders.
Results
We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 of Criollo, and includes 111 Mbp more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation.
Conclusions
We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits.
doi:10.1186/gb-2013-14-6-r53
PMCID: PMC4053823  PMID: 23731509
Theobroma cacao L.; genome; Matina 1-6; haplotype phasing; genetic mapping; pod color; MYB113
3.  Comments on sequence normalization of tiling array expression 
Bioinformatics  2009;25(17):2171-2173.
Motivation: Methods to improve tiling array expression signals are needed to accurately detect genome features. Royce et al. provide statistical normalizations of tile signal, based on probe sequence content, that promise improved accuracy and should be independently verified.
Results: Assessment of the sequence content normalization methods identified a problem: confounding of probe sequence content with gene structure (intron/exon) sequence content. Normalization obscured tile signal changes at gene structure boundaries. This and other evidence suggests that simple sequence normalization does not improve detection of genes from tile expression data.
Availability: http://wfleabase.org/genome-summaries/tile-expression/tileseqnorms/
Contact: gilbertd@indiana.edu
doi:10.1093/bioinformatics/btp389
PMCID: PMC2800354  PMID: 19578171
4.  wFleaBase: the Daphnia genome database 
BMC Bioinformatics  2005;6:45.
Background
wFleaBase is a database with the necessary infrastructure to curate, archive and share genetic, molecular and functional genomic data and protocols for an emerging model organism, the microcrustacean Daphnia. Commonly known as the water-flea, Daphnia's ecological merit is unequaled among metazoans, largely because of its sentinel role within freshwater ecosystems and over 200 years of biological investigations. By consequence, the Daphnia Genomics Consortium (DGC) has launched an interdisciplinary research program to create the resources needed to study genes that affect ecological and evolutionary success in natural environments.
Discussion
These tools include the genome database wFleaBase, which currently contains functions to search and extract information from expressed sequence tags, genome survey sequences and full genome sequencing projects. This new database is built primarily from core components of the Generic Model Organism Database project, and related bioinformatics tools.
Summary
Over the coming year, preliminary genetic maps and the nearly complete genomic sequence of Daphnia pulex will be integrated into wFleaBase, including gene predictions and ortholog assignments based on sequence similarities with eukaryote genes of known function. wFleaBase aims to serve a large ecological and evolutionary research community. Our challenge is to rapidly expand its content and to ultimately integrate genetic and functional genomic information with population-level responses to environmental challenges. URL: .
doi:10.1186/1471-2105-6-45
PMCID: PMC555599  PMID: 15752432
