A major concern in conservation genetics is maintaining the genetic diversity of populations. Genetic variation in livestock species is threatened by the progressive marginalisation of local breeds in favour of high-output pigs worldwide. We used high-density SNP and re-sequencing data to assess the genetic diversity of local pig breeds from Europe. In addition, we re-sequenced pigs from commercial breeds to identify potential candidate mutations responsible for phenotypic divergence among these groups of breeds.
Our results identify some local breeds with low genetic diversity, whose genomes show a high proportion of runs of homozygosity (ROHs; >50%) and harbour a large number of potentially damaging mutations. We also observed a high correlation between genetic diversity estimates based on high-density SNP data and those based on Next Generation Sequencing data (r = 0.96 at the individual level). The study of non-synonymous SNPs that were fixed in commercial breeds and also in any local breed, but for a different allele, revealed 99 non-synonymous SNPs affecting 65 genes. Candidate mutations that may underlie differences in adaptation to the environment were exemplified by the genes AZGP1 and TAS2R40. We also observed that highly productive breeds may have lost advantageous genotypes within genes involved in immune response (e.g. IL12RB2 and STAB1), probably as a result of strong artificial selection in intensive pig production systems.
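To illustrate how the proportion of a genome lying in runs of homozygosity could be computed from SNP genotypes, the following is a minimal sketch and not the procedure used in the study. It assumes ordered SNP positions on a single chromosome, a per-SNP heterozygosity flag, and a hypothetical `min_snps` threshold; dedicated tools usually work with sliding windows and tolerate occasional heterozygous calls.

```python
def homozygous_runs(is_het, min_snps):
    """Return (first_index, last_index) for each stretch of at least
    `min_snps` consecutive homozygous SNPs."""
    runs = []
    start = None
    for i, het in enumerate(is_het):
        if not het:
            if start is None:
                start = i          # a new homozygous stretch begins
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None           # a heterozygous call ends the stretch
    if start is not None and len(is_het) - start >= min_snps:
        runs.append((start, len(is_het) - 1))
    return runs


def roh_fraction(positions, is_het, min_snps):
    """Fraction of the genotyped span covered by homozygous runs."""
    covered = sum(positions[end] - positions[begin]
                  for begin, end in homozygous_runs(is_het, min_snps))
    span = positions[-1] - positions[0]
    return covered / span


# Toy data (hypothetical): 11 SNPs spaced 10 units apart.
toy_positions = list(range(0, 101, 10))
toy_is_het = [False, False, False, False, True,
              False, False, True, False, False, False]
toy_fraction = roh_fraction(toy_positions, toy_is_het, min_snps=3)
```

An individual would fall in the ">50%" category described above when this fraction, summed over all chromosomes, exceeds 0.5.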
The high correlation between genetic diversity computed with the 60K SNP and whole genome re-sequence data indicates that the Porcine 60K SNP Beadchip provides reliable estimates of genomic diversity in European pig populations despite the expected bias. Moreover, this analysis gave insights into strategies for the genetic characterization of local breeds. The comparison between re-sequenced local pigs and re-sequenced commercial pigs made it possible to report candidate mutations that may be responsible for phenotypic divergence among those groups of breeds. This study highlights the importance of low input breeds as a valuable genetic reservoir for the pig production industry. However, the high levels of ROHs, inbreeding and potentially damaging mutations emphasize the importance of the genetic characterization of local breeds to preserve their genomic variability.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-601) contains supplementary material, which is available to authorized users.
Copy number variable regions (CNVRs) can result in drastic phenotypic differences and may therefore be subject to selection during domestication. Studying copy number variation in relation to domestication is highly relevant in pigs because of their very rich natural and domestication history, which has resulted in many different phenotypes. To investigate the evolutionary dynamics of CNVRs, we applied a read depth method to next generation sequence data from 16 individuals, comprising wild boars and domestic pigs from Europe and Asia.
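The idea behind a read depth method can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline used in the study: the genome is assumed to be pre-divided into equal windows, depth is assumed to be already corrected for GC content and mappability, and the median window is assumed to be diploid. The function name and the `ploidy` parameter are hypothetical.

```python
from statistics import median


def estimate_copy_number(window_depths, ploidy=2):
    """Estimate an integer copy number per window by scaling its read
    depth to the median depth, which is taken to represent `ploidy` copies."""
    baseline = median(window_depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    return [round(ploidy * depth / baseline) for depth in window_depths]


# Toy depths (hypothetical): one duplicated window, one heterozygous
# deletion and one homozygous deletion.
toy_depths = [30, 30, 60, 15, 30, 0]
toy_copy_numbers = estimate_copy_number(toy_depths)
```

Windows whose estimated copy number differs from the baseline in at least one individual would then be candidates for a copy number variable region.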
We identified 3,118 CNVRs with an average size of 13 kilobases, comprising a total of 39.2 megabases of the pig genome and 545 overlapping genes. Functional analyses revealed that CNVRs are enriched with genes related to sensory perception, neurological process and response to stimulus, suggesting their contribution to adaptation in the wild and behavioral changes during domestication. Variations of copy number (CN) of antimicrobial related genes suggest an ongoing process of evolution of these genes to combat food-borne pathogens. Likewise, some genes related to the omnivorous lifestyle of pigs, like genes involved in detoxification, were observed to be CN variable. A small portion of CNVRs was unique to domestic pigs and may have been selected during domestication. The majority of CNVRs, however, is shared between wild and domesticated individuals, indicating that domestication had a minor effect on the overall diversity of CNVRs. Also, the excess of CNVRs in non-genic regions implies that a major part of these variations is likely to be (nearly) neutral. Comparison between different populations showed that larger populations have more CNVRs, highlighting that CNVRs, like other genetic variation such as SNPs and microsatellites, reflect demographic history rather than phenotypic diversity.
CNVRs in pigs are enriched for genes related to sensory perception, neurological process, and response to stimulus. The majority of CNVRs ascertained in domestic pigs are also variable in wild boars, suggesting that the domestication of the pig did not result in a change in CNVRs in domesticated pigs. The majority of variable regions were found to reflect demographic patterns rather than phenotypic diversity.
Structural variation; Copy number variation; Next generation sequencing data; Read depth method
Detecting genetic variation is a critical step in elucidating the molecular mechanisms underlying phenotypic diversity. Until recently, such detection has mostly focused on single nucleotide polymorphisms (SNPs) because of the ease in screening complete genomes. Another type of variant, copy number variation (CNV), is emerging as a significant contributor to phenotypic variation in many species. Here we describe a genome-wide CNV study using array comparative genomic hybridization (aCGH) in a wide variety of chicken breeds.
We identified 3,154 CNVs, grouped into 1,556 CNV regions (CNVRs). Thirty percent of the CNVs were detected in at least 2 individuals. The average size of the CNVs detected was 46.3 kb, with the largest CNV, located on GGAZ, being 4.3 Mb. Approximately 75% of the CNVs are copy number losses relative to the Red Jungle Fowl reference genome. The genome coverage of CNVRs in this study is 60 Mb, which represents almost 5.4% of the chicken genome. In particular, large gene families such as the keratin gene family and the MHC show extensive CNV.
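Grouping individual CNV calls into CNV regions is commonly done by merging calls that overlap on the same chromosome. The sketch below shows that merging step under this assumption; the abstract does not state the exact criterion used, and the function name and tuple layout are hypothetical.

```python
def merge_cnvs(cnvs):
    """Merge overlapping CNV calls into regions.

    `cnvs` is an iterable of (chromosome, start, end) tuples; calls on the
    same chromosome that overlap or touch are collapsed into one region.
    """
    regions = []
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Extend the current region if this call overlaps it.
            regions[-1][2] = max(regions[-1][2], end)
        else:
            regions.append([chrom, start, end])
    return [tuple(region) for region in regions]


# Toy calls (hypothetical coordinates) from several individuals.
toy_calls = [("1", 150, 300), ("1", 100, 200), ("1", 400, 500), ("2", 100, 200)]
toy_regions = merge_cnvs(toy_calls)
```

With this definition the number of regions is never larger than the number of calls, which is in line with 3,154 CNVs collapsing into 1,556 CNVRs.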
A relatively large group of the CNVs is line-specific, and several of these were previously shown to be related to the causative mutation for a number of phenotypic variants. The chance that inter-specific CNVs fall into CNVRs detected in chicken is related to the evolutionary distance between the species. Our results provide a valuable resource for the study of genetic and phenotypic variation in this phenotypically diverse species.
Copy number variation; Chicken; aCGH; Line-specific CNVs; Inter-specific CNVs; Genes
The application of DNA markers for the identification of biological samples from both human and non-human species is widespread and includes use in food authentication. In the food industry, the financial incentive to substitute the true name of a food product with that of a higher value alternative is driving food fraud. This applies to British pork products, where products derived from traditional pig breeds are of premium value. The objective of this study was to develop a genetic assay for regulatory authentication of traditional pig breed-labelled products in the porcine food industry in the United Kingdom.
The dataset comprised comprehensive coverage of the breed types present in Britain: 460 individuals from 7 traditional breeds, 5 commercial purebreds, 1 imported European breed and 1 imported Asian breed were genotyped using the PorcineSNP60 beadchip. Following breed-informative SNP selection, assignment power was calculated for increasing SNP panel size. A 96-plex assay created using the most informative SNPs revealed remarkably high genetic differentiation between the British pig breeds, with an average FST of 0.54, and Bayesian clustering analysis also indicated that they were distinct homogeneous populations. The posterior probability of assignment of any individual of a presumed origin actually originating from that breed given an alternative breed origin was > 99.5% in 174 out of 182 contrasts, at a test value of log(LR) > 0. Validation of the 96-plex assay using independent test samples of known origin was successful; a subsequent survey of market samples revealed a high level of breed label conformity.
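The log(LR) > 0 test value can be read as a comparison of how likely a sample's genotypes are under the claimed breed versus an alternative breed. The sketch below illustrates that comparison under strong assumptions that are not taken from the study: independent biallelic SNPs, Hardy-Weinberg proportions within each breed, known reference-allele frequencies, and a hypothetical `epsilon` that keeps fixed alleles from producing a zero likelihood.

```python
import math


def genotype_log_likelihood(genotypes, allele_freqs, epsilon=1e-3):
    """log10 likelihood of a multilocus genotype given allele frequencies.

    `genotypes` holds the number of reference alleles (0, 1 or 2) per SNP;
    `allele_freqs` holds the reference-allele frequency per SNP in one breed.
    """
    total = 0.0
    for count, freq in zip(genotypes, allele_freqs):
        p = min(max(freq, epsilon), 1.0 - epsilon)  # avoid log10(0)
        q = 1.0 - p
        probability = {0: q * q, 1: 2.0 * p * q, 2: p * p}[count]
        total += math.log10(probability)
    return total


def log_lr(genotypes, freqs_claimed, freqs_alternative, epsilon=1e-3):
    """log10 likelihood ratio of the claimed breed versus an alternative."""
    return (genotype_log_likelihood(genotypes, freqs_claimed, epsilon)
            - genotype_log_likelihood(genotypes, freqs_alternative, epsilon))


# Toy example (hypothetical frequencies for three SNPs in two breeds).
toy_genotypes = [2, 2, 0]
toy_breed_a = [0.9, 0.8, 0.1]
toy_breed_b = [0.2, 0.3, 0.7]
toy_log_lr = log_lr(toy_genotypes, toy_breed_a, toy_breed_b)
```

A positive value supports the claimed breed over the alternative; each of the 182 contrasts above corresponds to one ordered pair of breeds.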
The newly created 96-plex assay using selected markers from the PorcineSNP60 beadchip enables powerful assignment of samples to traditional breed origin and can effectively identify mislabelling, providing a highly effective tool for DNA analysis in food forensics.
The availability of a high-density SNP genotyping chip and a reference genome sequence of the pig (Sus scrofa) enabled the construction of a high-density linkage map. A high-density linkage map is an essential tool for further fine-mapping of quantitative trait loci (QTL) for a variety of traits in the pig and for a better understanding of mechanisms underlying genome evolution.
Four different pig pedigrees were genotyped using the Illumina PorcineSNP60 BeadChip. Recombination maps for the autosomes were computed for each individual pedigree using a common set of markers. The resulting genetic maps comprised 38,599 SNPs, including 928 SNPs not positioned on a chromosome in the current assembly of the pig genome (build 10.2). The total genetic length varied according to the pedigree, from 1797 to 2149 cM. Female maps were longer than male maps, with a notable exception for SSC1, where male maps are characterized by a higher recombination rate than female maps in the region between 91 and 250 Mb. The recombination rates varied among chromosomes and along individual chromosomes, with regions of high recombination rate tending to cluster close to the chromosome ends, irrespective of the position of the centromere. Correlations between main sequence features and recombination rates were investigated, and significant correlations were obtained for all the studied motifs. Regions characterized by high recombination rates were enriched for specific GC-rich sequence motifs compared to regions of low recombination. These correlations were higher in females than in males, and females were found to be more recombinant than males in regions where the GC content was greater than 0.4.
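A recombination rate along a chromosome is usually expressed in cM per Mb. As a minimal sketch of that calculation, assuming markers with both a physical position (Mb) and a genetic position (cM) and a simple marker-to-marker estimate rather than the windowed approach a full analysis would use:

```python
def recombination_rates(markers):
    """Return the recombination rate (cM/Mb) for each adjacent marker pair.

    `markers` is a list of (physical_mb, genetic_cm) tuples sorted by
    physical position.
    """
    rates = []
    for (mb0, cm0), (mb1, cm1) in zip(markers, markers[1:]):
        if mb1 <= mb0:
            raise ValueError("markers must have increasing physical positions")
        rates.append((cm1 - cm0) / (mb1 - mb0))
    return rates


# Toy map (hypothetical positions): three markers on one chromosome.
toy_markers = [(0.0, 0.0), (1.0, 0.5), (3.0, 2.5)]
toy_rates = recombination_rates(toy_markers)
```

Computing these rates separately from the male and female maps would allow the sex differences described above to be compared interval by interval.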
The analysis of the recombination rate along the pig genome highlighted that the regions exhibiting higher levels of recombination tend to cluster around the ends of the chromosomes irrespective of the location of the centromere. Major sex-differences in recombination were observed: females had a higher recombination rate within GC-rich regions and exhibited a stronger correlation between recombination rates and specific sequence features.
Pig; Recombination; Genome; SNP; Linkage; Meiosis; Telomere; Centromere; Isochore
The turkey (Meleagris gallopavo) is an important agricultural species and the second largest contributor to the world’s poultry meat production. Genetic improvement is attributed largely to selective breeding programs that rely on highly heritable phenotypic traits, such as body size and breast muscle development. Commercial breeding with small effective population sizes and epistasis can result in loss of genetic diversity, which in turn can lead to reduced individual fitness and reduced response to selection. The presence of genomic diversity in domestic livestock species is therefore of great importance and a prerequisite for rapid and accurate genetic improvement of selected breeds in various environments, as well as for rapid adaptation to potential changes in breeding goals. Genomic selection requires a large number of genetic markers, such as single nucleotide polymorphisms (SNPs), the most abundant source of genetic variation within the genome.
Alignment of next generation sequencing data of 32 individual turkeys from different populations was used for the discovery of 5.49 million SNPs, which subsequently were used for the analysis of genetic diversity among the different populations. All of the commercial lines branched from a single node relative to the heritage varieties and the South Mexican turkey population. Heterozygosity of all individuals from the different turkey populations ranged from 0.17 to 2.73 SNPs/Kb, while heterozygosity of populations ranged from 0.73 to 1.64 SNPs/Kb. The average frequency of heterozygous SNPs in individual turkeys was 1.07 SNPs/Kb. Five genomic regions with very low nucleotide variation were identified in domestic turkeys; these regions appeared to be fixed for alleles that differ from the wild alleles.
The turkey genome is much less diverse with a relatively low frequency of heterozygous SNPs as compared to other livestock species like chicken and pig. The whole genome SNP discovery study in turkey resulted in the detection of 5.49 million putative SNPs compared to the reference genome. All commercial lines appear to share a common origin. Presence of different alleles/haplotypes in the SM population highlights that specific haplotypes have been selected in the modern domesticated turkey.
In livestock species like the chicken, high throughput single nucleotide polymorphism (SNP) genotyping assays are increasingly being used for whole genome association studies and as a tool in breeding (referred to as genomic selection). To be of value in a wide variety of breeds and populations, the success rate of the SNP genotyping assay, the distribution of the SNP across the genome and the minor allele frequencies (MAF) of the SNPs used are extremely important.
We describe the design of a moderate density (60k) Illumina SNP BeadChip in chicken consisting of SNPs known to be segregating at high to medium minor allele frequencies (MAF) in the two major types of commercial chicken (broilers and layers). This was achieved by the identification of 352,303 SNPs with moderate to high MAF in 2 broiler and 2 layer lines using Illumina sequencing on reduced representation libraries. To further increase the utility of the chip, we also identified SNPs on sequences currently not covered by the chicken genome assembly (Gallus_gallus-2.1). This was achieved by 454 sequencing of the chicken genome at a depth of 12x and the identification of SNPs on 454-derived contigs not covered by the current chicken genome assembly. In total we added 790 SNPs that mapped to 454-derived contigs as well as 421 SNPs with a position on Chr_random of the current assembly. The SNP chip contains 57,636 SNPs, of which 54,293 could be genotyped and were shown to be segregating in chicken populations. Our SNP identification procedure appeared to be highly reliable and the overall validation rate of the SNPs on the chip was 94%. We were able to map 328 SNPs derived from the 454 sequence contigs on the chicken genome. The majority of these SNPs map to chromosomes that are already represented in genome build Gallus_gallus-2.1.0. Twenty-eight SNPs were used to construct two new linkage groups most likely representing two micro-chromosomes not covered by the current genome assembly.
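The reported validation rate can be checked against the counts given above. The short calculation below assumes that the rate is the number of SNPs that were genotyped and segregating divided by the number of SNPs on the chip; the helper name is hypothetical.

```python
def percentage(part, whole):
    """Return `part` as a percentage of `whole`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


SNPS_ON_CHIP = 57_636      # SNPs on the chip, from the text
SNPS_VALIDATED = 54_293    # genotyped and segregating, from the text

validation_rate = percentage(SNPS_VALIDATED, SNPS_ON_CHIP)
print(f"validation rate: {validation_rate:.1f}%")
```

Under this reading the ratio rounds to the 94% stated in the text.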
The high success rate of the SNPs on the Illumina chicken 60K Beadchip emphasizes the power of next generation sequencing (NGS) technology for the SNP identification and selection step. The identification of SNPs from sequence contigs derived from NGS resulted in improved coverage of the chicken genome and the construction of two new linkage groups most likely representing two chicken micro-chromosomes.
Next generation sequencing technologies make it possible to obtain, at low cost, the genomic sequence information that is currently lacking for most economically and ecologically important organisms. For the mallard duck, genomic data are limited. Besides being a species of large agricultural and societal importance, the mallard is also the focal species for long distance dispersal of Avian Influenza. For large scale identification of SNPs, we performed Illumina sequencing of wild mallard DNA and compared our data with ongoing genome and EST sequencing of domesticated conspecifics. This is the first study of its kind for waterfowl.
More than one billion base pairs of sequence information were generated resulting in a 16× coverage of a reduced representation library of the mallard genome. Sequence reads were aligned to a draft domesticated duck reference genome and allowed for the detection of over 122,000 SNPs within our mallard sequence dataset. In addition, almost 62,000 nucleotide positions on the domesticated duck reference showed a different nucleotide compared to wild mallard. Approximately 20,000 SNPs identified within our data were shared with SNPs identified in the sequenced domestic duck or in EST sequencing projects. The shared SNPs were considered to be highly reliable and were used to benchmark non-shared SNPs for quality. Genotyping of a representative sample of 364 SNPs resulted in a SNP conversion rate of 99.7%. The correlation of the minor allele count and observed minor allele frequency in the SNP discovery pool was 0.72.
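The correlation between the minor allele count in the discovery pool and the minor allele frequency observed by genotyping is a Pearson correlation coefficient. A minimal sketch of that statistic follows; the toy values are hypothetical and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy data (hypothetical): minor allele counts in the sequenced pool and
# the minor allele frequencies later observed by genotyping.
toy_minor_allele_counts = [2, 3, 5, 8, 4]
toy_observed_maf = [0.10, 0.12, 0.30, 0.45, 0.15]
toy_r = pearson_r(toy_minor_allele_counts, toy_observed_maf)
```

A value near 1 would mean that the pooled read counts predict the genotyped frequencies well; the reported 0.72 indicates a useful but imperfect predictor.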
We identified almost 150,000 SNPs in wild mallards that will likely yield good results in genotyping. Of these, ~101,000 SNPs were detected within our wild mallard sequences and ~49,000 were detected between wild and domesticated duck data. In the ~101,000 SNPs we found a subset of ~20,000 SNPs shared between wild mallards and the sequenced domesticated duck suggesting a low genetic divergence. Comparison of quality metrics between the total SNP set (122,000 + 62,000 = 184,000 SNPs) and the validated subset shows similar characteristics for both sets. This indicates that we have detected a large amount (~150,000) of accurately inferred mallard SNPs, which will benefit bird evolutionary studies, ecological studies (e.g. disentangling migratory connectivity) and industrial breeding programs.
Variation within individual genomes ranges from single nucleotide polymorphisms (SNPs) to kilobase, and even megabase, sized structural variants (SVs), such as deletions, insertions, inversions, and more complex rearrangements. Although much is known about the extent of SVs in humans and mice, species in which they exert significant effects on phenotypes, very little is known about the extent of SVs in the 2.5-times smaller and less repetitive genome of the chicken.
We identified hundreds of shared and divergent SVs in four commercial chicken lines relative to the reference chicken genome. The majority of SVs were found in intronic and intergenic regions, and we also found SVs in the coding regions. To identify the SVs, we combined high-throughput short read paired-end sequencing of genomic reduced representation libraries (RRLs) of pooled samples from 25 individuals and computational mapping of DNA sequences from a reference genome.
We provide a first glimpse of the high abundance of small structural genomic variations in the chicken. Extrapolating our results, we estimate that there are thousands of rearrangements in the chicken genome, the majority of which are located in non-coding regions. We observed that structural variation contributes to genetic differentiation among current domesticated chicken breeds and the Red Jungle Fowl. We expect that, because of their high abundance, SVs might explain phenotypic differences and play a role in the evolution of the chicken genome. Finally, our study exemplifies an efficient and cost-effective approach for identifying structural variation in sequenced genomes.
The turkey (Meleagris gallopavo) is an important agricultural species that is the second largest contributor to the world's poultry meat production. The genomic resources of turkey provide turkey breeders with tools needed for the genetic improvement of commercial breeds of turkey for economically important traits. A linkage map of turkey is essential not only for the mapping of quantitative trait loci, but also as a framework to enable the assignment of sequence contigs to specific chromosomes. Comparative genomics with chicken provides insight into mechanisms of genome evolution and helps in identifying rare genomic events such as genomic rearrangements and duplications/deletions.
Eighteen full sib families, comprising 1008 (35 F1 and 973 F2) birds, were genotyped for 775 single nucleotide polymorphisms (SNPs). Of the 775 SNPs, 570 were informative and used to construct a linkage map in turkey. The final map contains 531 markers in 28 linkage groups. The total genetic distance covered by these linkage groups is 2,324 centimorgans (cM) with the largest linkage group (81 loci) measuring 326 cM. Average marker interval for all markers across the 28 linkage groups is 4.6 cM. Comparative mapping of turkey and chicken revealed two inter-, and 57 intrachromosomal rearrangements between these two species.
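The average marker interval follows from the map length and the number of intervals between adjacent markers. The calculation below assumes that each linkage group with n markers contributes n - 1 intervals, so that 531 markers in 28 groups give 503 intervals; this reading is an assumption, although it reproduces the reported figure.

```python
def average_marker_interval(total_cm, n_markers, n_linkage_groups):
    """Average distance (cM) between adjacent markers across linkage groups."""
    n_intervals = n_markers - n_linkage_groups
    if n_intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return total_cm / n_intervals


# Values taken from the text above.
turkey_interval = average_marker_interval(total_cm=2324,
                                          n_markers=531,
                                          n_linkage_groups=28)
print(f"average marker interval: {turkey_interval:.1f} cM")
```

Dividing by the number of markers instead of the number of intervals would give about 4.4 cM, so the interval-based reading is the one that agrees with the 4.6 cM in the text.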
Our turkey genetic map of 531 markers reveals a genome length of 2,324 cM. Our linkage map improves on previously published maps because of the more even distribution of the markers and because the map is based entirely on SNP markers, enabling easier and faster genotyping assays than the microsatellite markers used in previous linkage maps. Turkey and chicken are shown to have a highly conserved genomic structure with a relatively low number of inter- and intrachromosomal rearrangements.
The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing.
Assemblies of the BAC clone derived genome sequence have been annotated using the Pre-Ensembl and Ensembl automated pipelines and made accessible through the Pre-Ensembl/Ensembl browsers. The current annotated genome assembly (Sscrofa9) was released with Ensembl 56 in September 2009. A revised assembly (Sscrofa10) is under construction and will incorporate whole genome shotgun sequence (WGS) data providing > 30× genome coverage. The WGS sequences, most of which are short Illumina/Solexa reads, were generated from DNA from the same single Duroc sow that was the source of the BAC library from which clones were preferentially selected for sequencing. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement, the data have been released into public sequence repositories (Genbank/EMBL, NCBI/Ensembl trace repositories) in a timely manner and in advance of publication.
In this marker paper, the Swine Genome Sequencing Consortium (SGSC) sets out its plans for analysis of the pig genome sequence, and for the application and publication of the results.
Over the past years, the relationship between gene transcription and chromosomal location has been studied in a number of different vertebrate genomes. Regional differences in gene expression have been found in several different species. The chicken genome, as the closest sequenced genome relative to mammals, is an important resource for investigating regional effects on transcription in birds and studying the regional dynamics of chromosome evolution by comparative analysis.
We used gene expression data to survey eight chicken tissues and create transcriptome maps for all chicken chromosomes. The results reveal the presence of two distinct types of chromosomal regions characterized by clusters of highly or lowly expressed genes. Furthermore, these regions correlate highly with a number of genome characteristics. Regions with clusters of highly expressed genes have higher gene densities, shorter genes, shorter average intron and higher GC content compared to regions with clusters of lowly expressed genes. A comparative analysis between the chicken and human transcriptome maps constructed using similar panels of tissues suggests that the regions with clusters of highly expressed genes are relatively conserved between the two genomes.
Our results revealed the presence of a higher order organization of the chicken genome that affects gene expression, confirming similar observations in other species. These results will aid in the further understanding of the regional dynamics of chromosome evolution.
The microarray data used in this analysis have been submitted to NCBI GEO database under accession number GSE17108. The reviewer access link is: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=tjwjpscyceqawjk&acc=GSE17108
The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set up an analysis pipeline based on a short read de novo sequence assembler and a program designed to identify variation within short reads. To illustrate the potential of this technique, we present the results obtained with a randomly sheared, enzymatically generated, 2-3 kbp genome fraction of six pooled Meleagris gallopavo (turkey) individuals.
A total of 100 million 36 bp reads were generated, representing approximately 5-6% (~62 Mbp) of the turkey genome, with an estimated sequence depth of 58. Reads consisting of bases called with less than 1% error probability were selected and assembled into contigs. Subsequently, high throughput discovery of nucleotide variation was performed using sequences with more than 90% reliability by using the assembled contigs that were 50 bp or longer as the reference sequence. We identified more than 7,500 SNPs with a high probability of representing true nucleotide variation in turkeys. Increasing the reference genome by adding publicly available turkey BAC-end sequences increased the number of SNPs to over 11,000. A comparison with the sequenced chicken genome indicated that the assembled turkey contigs were distributed uniformly across the turkey genome. Genotyping of a representative sample of 340 SNPs resulted in a SNP conversion rate of 95%. The correlation of the minor allele count (MAC) and observed minor allele frequency (MAF) for the validated SNPs was 0.69.
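The estimated sequence depth follows directly from the read count, the read length and the size of the sequenced genome fraction. The short calculation below uses the rounded figures from the text, so the result is approximate.

```python
def sequence_depth(n_reads, read_length_bp, target_size_bp):
    """Average depth: total sequenced bases divided by target size."""
    if target_size_bp <= 0:
        raise ValueError("target size must be positive")
    return n_reads * read_length_bp / target_size_bp


# Rounded values taken from the text above.
turkey_depth = sequence_depth(n_reads=100_000_000,
                              read_length_bp=36,
                              target_size_bp=62_000_000)
print(f"estimated depth: {turkey_depth:.0f}x")
```

One hundred million reads of 36 bp amount to 3.6 Gbp, and spreading that over roughly 62 Mbp gives the depth of about 58 reported above.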
We provide an efficient and cost-effective approach for the identification of thousands of high quality SNPs in species currently lacking a sequenced genome and applied this to turkey. The methodology addresses a random fraction of the genome, resulting in an even distribution of SNPs across the targeted genome.
Although the Illumina 1 G Genome Analyzer generates billions of base pairs of sequence data, challenges arise in sequence selection due to the varying sequence quality. Therefore, in the framework of the International Porcine SNP Chip Consortium, this pilot study aimed to evaluate the impact of the quality level of the sequenced bases on mapping quality and identification of true SNPs on a large scale.
DNA pooled from five animals from a commercial boar line was digested with DraI; 150–250-bp fragments were isolated and end-sequenced using the Illumina 1 G Genome Analyzer, yielding 70,348,064 sequences 36-bp long. Rules were developed to select sequences, which were then aligned to unique positions in a reference genome. Sequences were selected based on quality, and three thresholds of sequence quality (SQ) were compared. The highest threshold of SQ allowed identification of a larger number of SNPs (17,489), distributed widely across the pig genome. In total, 3,142 SNPs were validated with a success rate of 96%. The correlation between estimated minor allele frequency (MAF) and genotyped MAF was moderate, and SNPs were highly polymorphic in other pig breeds. Lowering the SQ threshold and maintaining the same criteria for SNP identification resulted in the discovery of fewer SNPs (16,768), of which 259 were not identified using higher SQ levels. Validation of SNPs found exclusively in the lower SQ threshold had a success rate of 94% and a low correlation between estimated MAF and genotyped MAF. Base change analysis suggested that the rate of transitions in the pig genome is likely to be similar to that observed in humans. Chromosome X showed reduced nucleotide diversity relative to autosomes, as observed for other species.
Large numbers of SNPs can be identified reliably by creating strict rules for sequence selection, which simultaneously decreases sequence ambiguity. Selection of sequences using a higher SQ threshold leads to more reliable identification of SNPs. Lower SQ thresholds can be used to guarantee sufficient sequence coverage, resulting in high success rate but less reliable MAF estimation. Nucleotide diversity varies between porcine chromosomes, with the X chromosome showing less variation as observed in other species.
Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited.
A total of 4.8 million whole genome shotgun sequences obtained from the NCBI trace-repository with center name "SDJVP", and project name "Sino-Danish Pig Genome Project" were analysed for the presence of SNPs. Available BAC and BAC-end sequences and their naming and mapping information, all obtained from SangerInstitute FTP site, served as a rough assembly of a reference genome. In 1.2 Gb of pig genome sequence, we identified 98,151 SNPs in which one of the sequences in the alignment represented the polymorphism and 6,374 SNPs in which two sequences represent an identical polymorphism. To benchmark the SNP identification method, 163 SNPs, in which the polymorphism was represented twice in the sequence alignment, were selected and tested on a panel of three purebred boar lines and wild boar. Of these 163 in silico identified SNPs, 134 were shown to be polymorphic in our animal panel.
This SNP identification method, which mines for SNPs in publicly available porcine shotgun sequences repositories, provides thousands of high quality SNPs. Benchmarking in an animal panel showed that more than 80% of the predicted SNPs represented true genetic variation.
One of the loci responsible for feather development in chickens is K. The K allele is partially dominant to the k+ allele and causes a delay in the emergence of flight feathers at hatch. The K locus is sex linked and located on the Z chromosome. Therefore, the locus can be utilized to produce phenotypes that identify the sexes of chicks at hatch. Previous studies on the organization of the K allele reported the integration of endogenous retrovirus 21 (ev21) into one of two large homologous segments located on the Z chromosome of late feathering chickens. In this study, a detailed molecular analysis of the K locus and a DNA test to distinguish between homozygous and heterozygous late feathering males are presented.
The K locus was investigated with quantitative PCR by examining copy number variations in a total of fourteen markers surrounding the ev21 integration site. The results showed a duplication at the K allele, and sequence analysis of the breakpoint junction indicated a tandem duplication of 176,324 base pairs. The tandem duplication of this region results in the partial duplication of two genes: the prolactin receptor and the gene encoding sperm flagellar protein 2. Sequence analysis revealed that the duplication is similar in Broiler and White Leghorn. In addition, twelve late feathering animals, including Broiler, White Leghorn, and Brown Layer lines, contained a 78 bp breakpoint junction fragment, indicating that the duplication is similar in all breeds. The breakpoint junction was used to develop a TaqMan-based quantitative PCR test to allow distinction between homozygous and heterozygous late feathering males. In total, 85.3% of the animals tested were correctly assigned, 14.7% were unassigned and no animals were incorrectly assigned.
The detailed molecular analysis presented in this study revealed the presence of a tandem duplication in the K allele. The duplication resulted in the partial duplication of two genes: the prolactin receptor and the gene encoding sperm flagellar protein 2. Furthermore, a DNA test was developed to distinguish between homozygous and heterozygous late feathering males.
Comparative genomics is a powerful means of establishing inter-specific relationships between gene function/location and allows insight into genomic rearrangements, conservation and evolutionary phylogeny. The availability of the complete sequence of the chicken genome has initiated the development of detailed genomic information in other birds including turkey, an agriculturally important species where mapping has hitherto focused on linkage with limited physical information. No molecular study has yet examined conservation of avian microchromosomes, nor differences in copy number variants (CNVs) between birds.
We present a detailed comparative cytogenetic map between chicken and turkey based on reciprocal chromosome painting and mapping of 338 chicken BACs to turkey metaphases. Two inter-chromosomal changes (both involving centromeres) and three pericentric inversions have been identified between chicken and turkey; and array CGH identified 16 inter-specific CNVs.
This is the first study to combine the modalities of zoo-FISH and array CGH between different avian species. The first insight into the conservation of microchromosomes, the first comparative cytogenetic map of any bird and the first appraisal of CNVs between birds is provided. Results suggest that avian genomes have remained relatively stable during evolution compared to mammalian equivalents.
The resolution of radiation hybrid (RH) maps is intermediate between that of the genetic and BAC (Bacterial Artificial Chromosome) contig maps. Moreover, once framework RH maps of a genome have been constructed, a quick location of markers by simple PCR on the RH panel is possible. The chicken ChickRH6 panel recently produced was used here to construct a high resolution RH map of chicken GGA5. To confirm the validity of the map and to provide valuable comparative mapping information, both markers from the genetic map and a high number of ESTs (Expressed Sequence Tags) were used. Finally, this RH map was used for testing the accuracy of the chicken genome assembly for chromosome 5.
A total of 169 markers (21 microsatellites and 148 ESTs) were typed on the ChickRH6 RH panel, of which 134 were assigned to GGA5. The final map is composed of 73 framework markers extending over a 1315.6 cR distance. The remaining 61 markers were placed alongside the framework markers within confidence intervals.
The high resolution framework map obtained in this study has markers covering the entire chicken chromosome 5 and reveals the existence of a high number of rearrangements when compared to the human genome. Only two discrepancies were observed in relation to the sequence assembly recently reported for this chromosome.
Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide modules that need advanced informatics skills to allow implementation in pipelines.
Here we present POSA, a pair of new Perl objects that describe DNA sequence traces and Phrap contig assemblies in detail. Methods included in POSA include basecalling with quality scores (by Phred), contig assembly (by Phrap), generation of primer3 input and automated SNP annotation (by PolyPhred). Although easily implemented by users with only limited programming experience, these objects considerably reduce hands-on analysis time compared to using the Staden package for extracting sequence information from raw sequencing files and for SNP discovery.
The POSA objects allow flexible and easy design, implementation and usage of Perl-based pipelines to handle and analyze DNA sequencing data, while requiring only minor programming skills.