Here we have generated the first comprehensive genome sequence analysis of a BRCA2-deficient cell line. It also represents the first such sequence of a widely used cell line derived from a pancreatic tumour. This dataset will therefore prove useful in deciphering the genetic contribution to both BRCA2-deficient and pancreatic cancers.
Many of the most frequently studied cell lines, such as Capan-1, were derived from patient tumour samples decades ago and therefore lack matched normal controls that can be used to determine germline variation. This study demonstrates that massively parallel sequencing can be utilized effectively even in the study of such cell lines. Our novel approach to filtering the large variant dataset, which combines publicly available databases with HapMap samples prepared, sequenced, and analysed in an analogous manner to the query, will provide a strategy for the future evaluation of many more cell lines. Detailed genomic characterization of these lines will be beneficial, not only as a comparative resource for large-scale tumour sequencing studies, but also to assist in deciphering many unresolved molecular and cellular observations.
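The filtering strategy described above can be illustrated with a minimal sketch. It assumes each variant is reduced to a (chromosome, position, reference allele, alternate allele) record; the `Variant` type and the function name are hypothetical and are not taken from the pipeline used in this study.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not the study's actual format)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_candidate_variants(
    query: Iterable[Variant],
    known_db: Iterable[Variant],
    control_calls: Iterable[Variant],
) -> List[Variant]:
    """Keep query variants seen in neither the public databases nor the controls.

    Variants found in public databases are assumed to be germline polymorphisms.
    Variants also called in control samples processed through the same pipeline
    are assumed to be germline polymorphisms or systematic artefacts.
    """
    excluded = set(known_db) | set(control_calls)
    return [v for v in query if v not in excluded]
```

In practice the comparison would need normalized variant representations (for example, left-aligned indels) before exact matching is meaningful; that step is omitted here.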
Most rearrangements detected in Capan-1 are intrachromosomal, as in many breast tumours. As would be expected, the rearrangements are spread throughout the genome and therefore reflect the patterns seen in other solid tumours, rather than the hotspots of rearrangement observed recently in leukaemia. Interestingly, Capan-1 exhibits a greater overall number of rearrangements than has been previously described in other tumour types, including breast, which often show high levels of chromosome rearrangement. It seems possible that this is reflective of the genomic instability that results from the loss of BRCA2 function in HR. However, our analysis shows that, to date, there are too few complete sequences of HR-deficient genomes to be able to prove this with statistical significance.
Capan-1 is a particularly intriguing cell line as it is both BRCA2-deficient, like many familial breast cancers, yet derived from a pancreatic adenocarcinoma. Although two BRCA2-deficient tumours have previously been assessed by massively parallel DNA sequencing, this was only at low depth, sufficient for studies of chromosomal rearrangements, but not SNPs and indels. Previous genome-wide SNP analyses of pancreatic cancers using microarrays suggested that a small number of signalling pathways and cellular processes are altered in most pancreatic tumours. Many of these core processes are also affected by the novel variations detected here in Capan-1, including apoptosis (DCC), the DNA damage response (ATM), GTPase-dependent signalling (SMAP2), and Wnt signalling (FZD10), in addition to those already established, namely KRAS and TGFβ signalling (the KRAS genes respectively). Interestingly, the recent study of genomic variation in metastases from pancreatic cancers demonstrated that most homozygous mutations in the metastasis were already present in the parental tumour, and hence were most likely to represent tumour suppressors. This is an important finding with respect to Capan-1 and other cell lines that were derived from a metastasis rather than the primary tumour.
The most common origin of SNPs in primates is through deamination of methyl-cytosine causing transition of cytosine to thymine. Here we also observed that such C>T transitions constitute the most common type of base substitution in the Capan-1 genome. Base substitution frequencies have previously been analyzed in 24 advanced pancreatic adenocarcinomas and 11 breast tumours, using large-scale PCR-based resequencing studies of protein-coding exons. Whilst C>T transitions also predominated in both tumour types, the pattern of substitutions differed between pancreas and breast. In pancreatic adenocarcinomas, the vast majority of substitutions were either C>T (53.8%) or C>A (16.6%), with all other classes each accounting for only 5–10% of the total. In contrast, the spectra of breast tumour mutations comprised C>T (36.5%), C>G (28.1%), and C>A (15.1%), with far fewer substitutions at A or T bases. We observed Capan-1 to be more akin to pancreatic adenocarcinomas in terms of the pattern of exome base substitutions, although A>G transitions were the second most common class of mutation.
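A substitution spectrum of the kind compared above can be computed as in the following minimal sketch. It assumes the common convention of collapsing each substitution and its reverse complement into one of six classes named by the pyrimidine reference base, so a G>A change is counted as C>T and the A>G class mentioned in the text appears as T>C. The function names are hypothetical and the convention may differ from the one used to produce the percentages quoted above.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Return the strand-collapsed class of a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # Report purine-reference changes on the opposite strand so that
    # complementary events (e.g. G>A and C>T) fall into the same class.
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def substitution_spectrum(subs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return the percentage of substitutions falling into each class."""
    counts = Counter(substitution_class(r, a) for r, a in subs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: 100.0 * n / total for cls, n in counts.items()}
```

For example, `substitution_spectrum([("C", "T"), ("G", "A"), ("C", "A"), ("A", "G")])` assigns half of the substitutions to C>T and a quarter each to C>A and T>C.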
The observation that small indels in the context of short regions of repetitive sequence occur more frequently in Capan-1, and to some extent in the BRCA2-deficient tumours PD3689a and b, is intriguing. Such a signature may well indicate the use of alternative pathways of DNA double-strand break repair, such as non-homologous end joining or single-strand annealing, to compensate for the lack of HR. With the future sequencing of further BRCA-deficient genomes, it will be possible to decipher whether this is in fact a bona fide DNA signature representative of a cellular defect in HR, which might be used as a biomarker to identify patient populations that might benefit from targeted therapies such as PARP inhibitors.
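One simple way to flag an indel as lying in a repetitive context is sketched below. It assumes left-aligned indels, a 0-based position marking where the inserted or deleted unit begins in the reference, and an arbitrary threshold of two tandem copies. These choices and the function names are illustrative assumptions, not the criteria applied in this study.

```python
def tandem_copies(reference: str, pos: int, unit: str) -> int:
    """Count consecutive copies of `unit` in `reference` starting at `pos`."""
    if not unit:
        raise ValueError("indel unit must be non-empty")
    count = 0
    i = pos
    # Step through the reference one unit length at a time while it matches.
    while reference[i:i + len(unit)] == unit:
        count += 1
        i += len(unit)
    return count


def in_repeat_context(reference: str, pos: int, unit: str, min_copies: int = 2) -> bool:
    """Return True if the indel unit is tandemly repeated at the indel site.

    `min_copies` is an assumed threshold; for a deletion the count includes
    the deleted copy itself.
    """
    return tandem_copies(reference, pos, unit) >= min_copies
```

With the reference `"GGCACACACATT"`, a `CA` deletion at position 2 sits in a run of four `CA` copies and would be flagged, whereas the same deletion in `"GGCATT"` would not.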
This comprehensive sequence analysis of a BRCA2-deficient pancreatic cancer cell line provides a valuable resource that will, in combination with large-scale genome resequencing of patient tumour samples, facilitate the identification of new biomarkers and targets for therapy. The compilation of such genomic datasets will undoubtedly underlie a greater understanding of this complex disease, and how loss of BRCA2 contributes to tumour progression.