Sequencing the KpnI library generated more than 26 million single-end reads of 120 bp length (Table ). Sequence data have been deposited in the NCBI Sequence Read Archive under submission number SRA052686. More than 99% of the reads contained the barcode, and more than 95% contained the restriction overhang, where one mismatch was allowed (Table ). After clustering reads to RAD tags and removing RAD tags with more than 200 reads, the number of RAD tags per inbred ranged from 100,420 to 127,681. Altogether, 636,179 RAD tags were assigned to 113,221 RAD clusters. RAD clusters with one to four RAD tags occurred most frequently, and the number of RAD clusters decreased gradually with larger numbers of RAD tags (Figure ). After the filtering steps to remove intergenomic polymorphisms, nearly 33,000 polymorphisms were detected (Table ). SNPs identified at read positions 7 to 88 and insertions/deletions (InDels) identified at read positions 7 to 80 occurred with equal frequency at each position (Figure ), and only those were used for all further analyses. Polymorphisms outside these read positions were more numerous because of a higher number of sequencing errors and were therefore discarded. After this step, 20,835 SNPs were detected in 9,552 RAD clusters, and 125 InDels were detected in 59 RAD clusters (Table , Additional file 1). The number of polymorphisms per RAD cluster decreased with the number of reads per RAD cluster (Figure ). Of all SNPs, 40 were triallelic. The remaining 20,795 SNPs were biallelic and consisted of 58.2% transitions and 41.8% transversions (Figure ), giving a transition/transversion ratio of 1.39. The two transition types were balanced (6,018 A/G and 6,084 C/T), and the number of transversions per type ranged from 1,480 (C/G) to 2,583 (A/T).
Restriction-site associated DNA (RAD) sequencing statistics summary of a KpnI library from eight B. napus inbred lines
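The barcode and restriction-overhang check summarized in the table can be sketched as follows. This is only a minimal illustration, not the pipeline used in the study: the barcode and overhang strings are made-up placeholders, and the sketch assumes that the barcode must match exactly while the single allowed mismatch applies to the overhang, which the text does not state explicitly.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def has_overhang(read: str, barcode: str, overhang: str,
                 max_mismatches: int = 1) -> bool:
    """True if the read starts with the barcode and the bases that follow
    match the overhang with at most `max_mismatches` substitutions."""
    if not read.startswith(barcode):
        return False
    start = len(barcode)
    segment = read[start:start + len(overhang)]
    if len(segment) < len(overhang):
        return False  # read too short to contain the full overhang
    return hamming(segment, overhang) <= max_mismatches


# Hypothetical values for illustration only (not taken from the study).
BARCODE = "ACGT"
OVERHANG = "GTACC"

reads = [
    "ACGTGTACCTTGA",  # exact overhang
    "ACGTGTAGCTTGA",  # one mismatch in the overhang
    "ACGTGAAGCTTGA",  # two mismatches in the overhang
    "TTTTGTACCTTGA",  # wrong barcode
]
flags = [has_overhang(r, BARCODE, OVERHANG) for r in reads]
print(flags)
```

Counting the `True` flags and dividing by the number of reads would give a percentage comparable to the "more than 95%" figure reported for the overhang.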
Figure 1 Restriction-site associated DNA (RAD) clusters with different numbers of RAD tags across eight B. napus inbreds.
Distribution of restriction-site associated DNA polymorphisms (single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels)) along the read length.
Figure 3 Number of polymorphisms and number of reads per restriction-site associated DNA (RAD) cluster across eight B. napus inbreds. Violin plot denotes 9,552 RAD clusters with single nucleotide polymorphisms (SNPs), and triangles denote 59 RAD ...
Figure 4 Transitions and transversions within 20,795 biallelic single nucleotide polymorphisms (SNPs) detected among eight B. napus inbreds.
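The transition/transversion ratio reported above follows directly from the counts given in the text. The short sketch below reproduces the arithmetic; the variable names are illustrative, and the transversion total is derived as the remainder because only two of the four transversion classes are listed individually.

```python
# Counts reported in the text for the 20,795 biallelic SNPs.
transitions = {"A/G": 6018, "C/T": 6084}
biallelic_snps = 20795

n_transitions = sum(transitions.values())
# Transversions are taken as the remainder of the biallelic SNPs;
# the text lists only the C/G (1,480) and A/T (2,583) classes.
n_transversions = biallelic_snps - n_transitions

ti_tv_ratio = n_transitions / n_transversions
pct_transitions = 100 * n_transitions / biallelic_snps
pct_transversions = 100 * n_transversions / biallelic_snps

print(f"transitions:   {n_transitions} ({pct_transitions:.1f}%)")
print(f"transversions: {n_transversions} ({pct_transversions:.1f}%)")
print(f"Ti/Tv ratio:   {ti_tv_ratio:.2f}")
```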
The correlation between pairwise modified Rogers' distance (MRD) estimates among all inbreds based on simple sequence repeat (SSR) data from a previous study [3] and those based on DNA polymorphism data from RAD sequencing was ρ = 0.92 (Figure ). The 16 RAD clusters selected for validation comprised 31 SNPs, of which 26 (83.9%) were verified to be polymorphic. Of the five non-polymorphic SNPs, four belonged to one specific RAD cluster. RAD and Sanger sequencing information did not agree for 13.1% of the inbred-allele combinations observed for the 31 SNPs.
Figure 5 Correlation of modified Rogers' distances (MRD) between pairs of eight B. napus inbreds. MRD estimates were determined with simple sequence repeat (SSR) markers and polymorphisms from restriction-site associated DNA (RAD) sequencing.
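For readers unfamiliar with the distance measure, the sketch below computes a modified Rogers' distance between two lines using the commonly cited definition, MRD = sqrt(Σ_i Σ_j (p_ij − q_ij)² / (2m)), where p_ij and q_ij are the frequencies of allele j at locus i and m is the number of loci. This section does not show how the authors computed MRD, so the formula choice, the function name, and the toy genotypes are assumptions made for illustration.

```python
from math import sqrt


def modified_rogers_distance(p, q):
    """MRD between two lines.

    `p` and `q` are lists with one entry per locus; each entry is a dict
    mapping allele -> frequency in that line (frequencies sum to 1).
    """
    if len(p) != len(q) or not p:
        raise ValueError("both lines need the same, non-zero number of loci")
    total = 0.0
    for locus_p, locus_q in zip(p, q):
        alleles = set(locus_p) | set(locus_q)
        total += sum(
            (locus_p.get(a, 0.0) - locus_q.get(a, 0.0)) ** 2 for a in alleles
        )
    return sqrt(total / (2 * len(p)))


# Toy example (hypothetical data): two homozygous inbreds that differ
# at one of four SNP loci.
line_a = [{"A": 1.0}, {"C": 1.0}, {"G": 1.0}, {"T": 1.0}]
line_b = [{"A": 1.0}, {"C": 1.0}, {"G": 1.0}, {"C": 1.0}]

mrd_ab = modified_rogers_distance(line_a, line_b)
mrd_aa = modified_rogers_distance(line_a, line_a)
print(mrd_ab, mrd_aa)
```

For fully homozygous lines, each differing locus contributes 2 to the inner sum, so the distance reduces to the square root of the fraction of loci at which the two lines differ.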
Of all RAD clusters and polymorphisms, 35,960 RAD clusters (31.8%), 6,042 SNPs (29.0%), and 50 InDels (40%) were found in the B. rapa sequence data (Figure (a)), 33,749 RAD clusters (29.8%), 5,687 SNPs (27.3%), and 44 InDels (35.2%) were found in the B. rapa chromosome data, and 8,873 RAD clusters (7.8%), 1,482 SNPs (7.1%), and 4 InDels (3.2%) were found in the B. rapa coding sequence (CDS) data after BLAST searches. The transition/transversion ratio was 1.45 calculated for the B. rapa sequence data and 1.60 for the B. rapa CDS data. RAD clusters and polymorphisms were distributed evenly across the ten B. rapa chromosomes (Figure ).
Figure 6 Representation of B. napus restriction-site associated DNA (RAD) information and unigenes (UG). (a) RAD clusters, single-nucleotide polymorphisms (SNPs), and insertions/deletions (InDels) in the B. rapa sequence, B. rapa chromosomes, and ...
Figure 7 Distribution of restriction-site associated DNA (RAD) information in B. rapa. Dispersal of RAD clusters, RAD clusters with single nucleotide polymorphisms (SNPs), and RAD clusters with insertions/deletions (InDels) across the ten chromosomes ...
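The percentages quoted for the B. rapa searches are shares of the overall totals (113,221 RAD clusters, 20,835 SNPs, and 125 InDels). The following sketch recomputes them from the counts in the text; the dictionary layout and names are illustrative only.

```python
# Totals reported earlier in the Results.
totals = {"RAD clusters": 113221, "SNPs": 20835, "InDels": 125}

# Hits reported for each B. rapa dataset after BLAST searches.
hits = {
    "B. rapa sequence": {"RAD clusters": 35960, "SNPs": 6042, "InDels": 50},
    "B. rapa chromosomes": {"RAD clusters": 33749, "SNPs": 5687, "InDels": 44},
    "B. rapa CDS": {"RAD clusters": 8873, "SNPs": 1482, "InDels": 4},
}

# Share of each total (in percent, one decimal) found in each dataset.
shares = {
    dataset: {kind: round(100 * n / totals[kind], 1) for kind, n in counts.items()}
    for dataset, counts in hits.items()
}

for dataset, row in shares.items():
    print(dataset, row)
```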
Altogether, 9,469 RAD clusters (8.4%), 1,245 SNPs (6.0%), and 10 InDels (8.0%) were found at least once in the Brassica UG dataset after BLAST searches (Table ). In the search against all 94,558 Brassica UG, RAD clusters were found for 6,140 UG, SNPs for 678 UG, and InDels for 6 UG (Figure (b)). Of these, 3,231 UG with RAD clusters (52.6%), 335 UG with SNPs (49.4%), and 3 UG with InDels (50.0%) could be assigned a function after BLAST searches against the UniProtKB/Swiss-Prot dataset. The GO term representation was balanced between all UG, UG with RAD clusters, and UG with SNPs, whereas GO terms for UG with InDels were generally either over- or underrepresented (Figure ).
Number of occurrences of restriction-site associated DNA (RAD) clusters, single nucleotide polymorphisms (SNPs), and insertions/deletions (InDels) in the Brassica unigene dataset
Figure 8 Gene Ontology (GO) term representation. GO term representation (%) of all B. napus unigenes (UG), UG with restriction-site associated DNA (RAD) clusters, UG with single nucleotide polymorphisms (SNPs), and UG with insertions/deletions (InDels) according ...