1.  Computational approaches to identify functional genetic variants in cancer genomes 
Nature methods  2013;10(8):723-729.
The International Cancer Genome Consortium (ICGC) aims to catalog genomic abnormalities in tumors from 50 different cancer types. Genome sequencing reveals hundreds to thousands of somatic mutations in each tumor, but only a minority drive tumor progression. We present the result of discussions within the ICGC on how to address the challenge of identifying mutations that contribute to oncogenesis, tumor maintenance or response to therapy, and recommend computational techniques to annotate somatic variants and predict their impact on cancer phenotype.
PMCID: PMC3919555  PMID: 23900255
2.  Returning individual research results for genome sequences of pancreatic cancer 
Genome Medicine  2014;6(5):42.
Disclosure of individual results to participants in genomic research is a complex and contentious issue. There are many existing commentaries and opinion pieces on the topic, but little empirical data concerning actual cases describing how individual results have been returned. Thus, the real-life risks and benefits of disclosing individual research results to participants are rarely if ever presented as part of this debate.
The Australian Pancreatic Cancer Genome Initiative (APGI) is an Australian contribution to the International Cancer Genome Consortium (ICGC) that involves prospective sequencing of tumor and normal genomes of study participants with pancreatic cancer in Australia. We present three examples that illustrate different facets of how research results may arise, and how they may be returned to individuals within an ethically defensible and clinically practical framework. This framework includes the necessary elements identified by others, including consent, determination of the significance of results and which to return, delineation of the responsibility for communication and the clinical pathway for managing the consequences of returning results.
Of 285 recruited patients, we returned results to a total of 25 with no adverse events to date. These included four that were classified as medically actionable, nine as clinically significant and eight that were returned at the request of the treating clinician. Case studies presented depict instances where research results impacted on cancer susceptibility, current treatment and diagnosis, and illustrate key practical challenges of developing an effective framework.
We suggest that return of individual results is both feasible and ethically defensible but only within the context of a robust framework that involves a close relationship between researchers and clinicians.
PMCID: PMC4067993  PMID: 24963353
3.  Somatic Point Mutation Calling in Low Cellularity Tumors 
PLoS ONE  2013;8(11):e74380.
Somatic mutation calling from next-generation sequencing data remains a challenge due to the difficulties of distinguishing true somatic events from artifacts arising from PCR, sequencing errors or mis-mapping. Tumor cellularity or purity, sub-clonality and copy number changes also confound the identification of true somatic events against a background of germline variants. We have developed a heuristic strategy and software for somatic mutation calling in samples with low tumor content and we show the superior sensitivity and precision of our approach using a previously sequenced cell line, a series of tumor/normal admixtures, and 3,253 putative somatic SNVs verified on an orthogonal platform.
PMCID: PMC3826759  PMID: 24250782
4.  Clinical and molecular characterization of HER2 amplified-pancreatic cancer 
Genome Medicine  2013;5(8):78.
Pancreatic cancer is one of the most lethal and molecularly diverse malignancies. Repurposing of therapeutics that target specific molecular mechanisms in different disease types offers potential for rapid improvements in outcome. Although HER2 amplification occurs in pancreatic cancer, it is inadequately characterized to exploit the potential of anti-HER2 therapies.
HER2 amplification was detected and further analyzed using multiple genomic sequencing approaches. Standardized reference laboratory assays defined HER2 amplification in a large cohort of patients (n = 469) with pancreatic ductal adenocarcinoma (PDAC).
An amplified inversion event (1 MB) was identified at the HER2 locus in a patient with PDAC. Using standardized laboratory assays, we established diagnostic criteria for HER2 amplification in PDAC, and observed a prevalence of 2%. Clinically, HER2-amplified PDAC was characterized by a lack of liver metastases, and a preponderance of lung and brain metastases. Excluding breast and gastric cancer, the incidence of HER2-amplified cancers in the USA is >22,000 per annum.
HER2 amplification occurs in 2% of PDAC, and has distinct features with implications for clinical practice. The molecular heterogeneity of PDAC implies that even an incidence of 2% represents an attractive target for anti-HER2 therapies, as options for PDAC are limited. Recruiting patients based on HER2 amplification, rather than organ of origin, could make trials of anti-HER2 therapies feasible in less common cancer types.
PMCID: PMC3978667  PMID: 24004612
5.  Cerebellar Output in Zebrafish: An Analysis of Spatial Patterns and Topography in Eurydendroid Cell Projections 
The cerebellum is a brain region responsible for motor coordination and for refining motor programs. While a great deal is known about the structure and connectivity of the mammalian cerebellum, fundamental questions regarding its function in behavior remain unanswered. Recently, the zebrafish has emerged as a useful model organism for cerebellar studies, owing in part to the similarity in cerebellar circuits between zebrafish and mammals. While the cell types composing their cerebellar cortical circuits are generally conserved with mammals, zebrafish lack deep cerebellar nuclei, and instead a majority of cerebellar output comes from a single type of neuron: the eurydendroid cell. To describe spatial patterns of cerebellar output in zebrafish, we have used genetic techniques to label and trace eurydendroid cells individually and en masse. We have found that cerebellar output targets the thalamus and optic tectum, and have confirmed the presence of pre-synaptic terminals from eurydendroid cells in these structures using a synaptically targeted GFP. By observing individual eurydendroid cells, we have shown that different medial-lateral regions of the cerebellum have eurydendroid cells projecting to different targets. Finally, we found topographic organization in the connectivity between the cerebellum and the optic tectum, where more medial eurydendroid cells project to the rostral tectum while lateral cells project to the caudal tectum. These findings indicate that there is spatial logic underpinning cerebellar output in zebrafish with likely implications for cerebellar function.
PMCID: PMC3612595  PMID: 23554587
zebrafish; cerebellum; eurydendroid; optic tectum; thalamus; topography; Gal4
6.  Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes 
Biankin, Andrew V. | Waddell, Nicola | Kassahn, Karin S. | Gingras, Marie-Claude | Muthuswamy, Lakshmi B. | Johns, Amber L. | Miller, David K. | Wilson, Peter J. | Patch, Ann-Marie | Wu, Jianmin | Chang, David K. | Cowley, Mark J. | Gardiner, Brooke B. | Song, Sarah | Harliwong, Ivon | Idrisoglu, Senel | Nourse, Craig | Nourbakhsh, Ehsan | Manning, Suzanne | Wani, Shivangi | Gongora, Milena | Pajic, Marina | Scarlett, Christopher J. | Gill, Anthony J. | Pinho, Andreia V. | Rooman, Ilse | Anderson, Matthew | Holmes, Oliver | Leonard, Conrad | Taylor, Darrin | Wood, Scott | Xu, Qinying | Nones, Katia | Fink, J. Lynn | Christ, Angelika | Bruxner, Tim | Cloonan, Nicole | Kolle, Gabriel | Newell, Felicity | Pinese, Mark | Mead, R. Scott | Humphris, Jeremy L. | Kaplan, Warren | Jones, Marc D. | Colvin, Emily K. | Nagrial, Adnan M. | Humphrey, Emily S. | Chou, Angela | Chin, Venessa T. | Chantrill, Lorraine A. | Mawson, Amanda | Samra, Jaswinder S. | Kench, James G. | Lovell, Jessica A. | Daly, Roger J. | Merrett, Neil D. | Toon, Christopher | Epari, Krishna | Nguyen, Nam Q. | Barbour, Andrew | Zeps, Nikolajs | Kakkar, Nipun | Zhao, Fengmei | Wu, Yuan Qing | Wang, Min | Muzny, Donna M. | Fisher, William E. | Brunicardi, F. Charles | Hodges, Sally E. | Reid, Jeffrey G. | Drummond, Jennifer | Chang, Kyle | Han, Yi | Lewis, Lora R. | Dinh, Huyen | Buhay, Christian J. | Beck, Timothy | Timms, Lee | Sam, Michelle | Begley, Kimberly | Brown, Andrew | Pai, Deepa | Panchal, Ami | Buchner, Nicholas | De Borja, Richard | Denroche, Robert E. | Yung, Christina K. | Serra, Stefano | Onetto, Nicole | Mukhopadhyay, Debabrata | Tsao, Ming-Sound | Shaw, Patricia A. | Petersen, Gloria M. | Gallinger, Steven | Hruban, Ralph H. | Maitra, Anirban | Iacobuzio-Donahue, Christine A. | Schulick, Richard D. | Wolfgang, Christopher L. | Morgan, Richard A. | Lawlor, Rita T. | Capelli, Paola | Corbo, Vincenzo | Scardoni, Maria | Tortora, Giampaolo | Tempero, Margaret A. | Mann, Karen M. 
| Jenkins, Nancy A. | Perez-Mancera, Pedro A. | Adams, David J. | Largaespada, David A. | Wessels, Lodewyk F. A. | Rust, Alistair G. | Stein, Lincoln D. | Tuveson, David A. | Copeland, Neal G. | Musgrove, Elizabeth A. | Scarpa, Aldo | Eshleman, James R. | Hudson, Thomas J. | Sutherland, Robert L. | Wheeler, David A. | Pearson, John V. | McPherson, John D. | Gibbs, Richard A. | Grimmond, Sean M.
Nature  2012;491(7424):399-405.
Pancreatic cancer is a highly lethal malignancy with few effective therapies. We performed exome sequencing and copy number analysis to define genomic aberrations in a prospectively accrued clinical cohort (n = 142) of early (stage I and II) sporadic pancreatic ductal adenocarcinoma. Detailed analysis of 99 informative tumours identified substantial heterogeneity with 2,016 non-silent mutations and 1,628 copy-number variations. We define 16 significantly mutated genes, reaffirming known mutations (KRAS, TP53, CDKN2A, SMAD4, MLL3, TGFBR2, ARID1A and SF3B1), and uncover novel mutated genes including additional genes involved in chromatin modification (EPC1 and ARID2), DNA damage repair (ATM) and other mechanisms (ZIM2, MAP2K4, NALCN, SLC16A4 and MAGEA6). Integrative analysis with in vitro functional data and animal models provided supportive evidence for potential roles for these genetic aberrations in carcinogenesis. Pathway-based analysis of recurrently mutated genes recapitulated clustering in core signalling pathways in pancreatic ductal adenocarcinoma, and identified new mutated genes in each pathway. We also identified frequent and diverse somatic aberrations in genes described traditionally as embryonic regulators of axon guidance, particularly SLIT/ROBO signalling, which was also evident in murine Sleeping Beauty transposon-mediated somatic mutagenesis models of pancreatic cancer, providing further supportive evidence for the potential involvement of axon guidance genes in pancreatic carcinogenesis.
PMCID: PMC3530898  PMID: 23103869
7.  qpure: A Tool to Estimate Tumor Cellularity from Genome-Wide Single-Nucleotide Polymorphism Profiles 
PLoS ONE  2012;7(9):e45835.
Tumour cellularity, the relative proportion of tumour and normal cells in a sample, affects the sensitivity of mutation detection, copy number analysis, cancer gene expression and methylation profiling. Tumour cellularity is traditionally estimated by pathological review of sectioned specimens; however, this method is both subjective and prone to error due to heterogeneity within lesions and cellularity differences between the sample viewed during pathological review and tissue used for research purposes. In this paper we describe a statistical model to estimate tumour cellularity from SNP array profiles of paired tumour and normal samples using shifts in SNP allele frequency at regions of loss of heterozygosity (LOH) in the tumour. We also provide qpure, a software implementation of the method. Our experiments showed that there is a medium correlation of 0.42 (p-value = 0.0001) between tumour cellularity estimated by qpure and pathology review. Interestingly, there is a high correlation of 0.87 (p-value < 2.2e-16) between cellularity estimates by qpure and deep Ion Torrent sequencing of known somatic KRAS mutations, and a weaker correlation of 0.32 (p-value = 0.004) between Ion Torrent sequencing and pathology review. This suggests that qpure may be a more accurate predictor of tumour cellularity than pathology review. qpure can be downloaded from
PMCID: PMC3457972  PMID: 23049875
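The idea in the abstract above, that allele-frequency shifts at LOH regions carry information about tumour cellularity, can be illustrated with a small sketch. This is not the qpure statistical model, which the abstract does not specify. It is a textbook-style simplification that assumes a single clonal LOH region, a diploid normal genome, and noise-free B-allele frequencies (BAF) at germline-heterozygous SNPs. The function names and the `copy_neutral` flag are hypothetical and chosen only for illustration.

```python
from statistics import median


def mirrored_baf(baf: float) -> float:
    """Fold a B-allele frequency onto [0.5, 1] so that it does not
    matter which of the two alleles the tumour retained."""
    return max(baf, 1.0 - baf)


def cellularity_from_loh(bafs, copy_neutral=True):
    """Estimate the tumour fraction p from BAFs of germline-heterozygous
    SNPs inside one LOH region (illustrative assumptions only).

    Copy-neutral LOH (tumour carries two copies of one allele):
        BAF = (1 + p) / 2   =>   p = 2 * BAF - 1
    Hemizygous deletion (tumour carries one copy of one allele):
        BAF = 1 / (2 - p)   =>   p = 2 - 1 / BAF
    """
    if not bafs:
        raise ValueError("need at least one SNP in the LOH region")
    # The median of the mirrored BAFs is a simple robust summary of the shift.
    m = median(mirrored_baf(b) for b in bafs)
    estimate = 2.0 * m - 1.0 if copy_neutral else 2.0 - 1.0 / m
    # Clamp to the valid range for a proportion.
    return min(1.0, max(0.0, estimate))
```

For example, under these assumptions a copy-neutral LOH region whose heterozygous SNPs sit at BAF 0.8 (or 0.2) corresponds to a tumour fraction of about 0.6, while unshifted SNPs at 0.5 give an estimate of 0. Real array data are noisy and tumours may be subclonal or aneuploid, which is why a proper statistical model is needed in practice.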
8.  The mammalian PYHIN gene family: Phylogeny, evolution and expression 
Proteins of the mammalian PYHIN (IFI200/HIN-200) family are involved in defence against infection through recognition of foreign DNA. The family member absent in melanoma 2 (AIM2) binds cytosolic DNA via its HIN domain and initiates inflammasome formation via its pyrin domain. AIM2 lies within a cluster of related genes, many of which are uncharacterised in mouse. To better understand the evolution, orthology and function of these genes, we have documented the range of PYHIN genes present in representative mammalian species, and undertaken phylogenetic and expression analyses.
No PYHIN genes are evident in non-mammals or monotremes, with a single member found in each of three marsupial genomes. Placental mammals show variable family expansions, from one gene in cow to four in human and 14 in mouse. A single HIN domain appears to have evolved in the common ancestor of marsupials and placental mammals, and duplicated to give rise to three distinct forms (HIN-A, -B and -C) in the placental mammal ancestor. Phylogenetic analyses showed that AIM2 HIN-C and pyrin domains clearly diverge from the rest of the family, and it is the only PYHIN protein with orthology across many species. Interestingly, although AIM2 is important in defence against some bacteria and viruses in mice, AIM2 is a pseudogene in cow, sheep, llama, dolphin, dog and elephant. The other 13 mouse genes have arisen by duplication and rearrangement within the lineage, which has allowed some diversification in expression patterns.
The role of AIM2 in forming the inflammasome is relatively well understood, but molecular interactions of other PYHIN proteins involved in defence against foreign DNA remain to be defined. The non-AIM2 PYHIN protein sequences are very distinct from AIM2, suggesting they vary in effector mechanism in response to foreign DNA, and may bind different DNA structures. The PYHIN family has highly varied gene composition between mammalian species due to lineage-specific duplication and loss, which probably indicates different adaptations for fighting infectious disease. Non-genomic DNA can indicate infection, or a mutagenic threat. We hypothesise that defence of the genome against endogenous retroelements has been an additional evolutionary driver for PYHIN proteins.
PMCID: PMC3458909  PMID: 22871040
PYHIN; HIN-200; cytosolic DNA; ALR; IFI16; AIM2
9.  MicroRNAs and their isomiRs function cooperatively to target common biological pathways 
Genome Biology  2011;12(12):R126.
Variants of microRNAs (miRNAs), called isomiRs, are commonly reported in deep-sequencing studies; however, the functional significance of these variants remains controversial. Observational studies show that isomiR patterns are non-random, hinting that these molecules could be regulated and therefore functional, although no conclusive biological role has been demonstrated for these molecules.
To assess the biological relevance of isomiRs, we have performed ultra-deep miRNA-seq on ten adult human tissues, and created an analysis pipeline called miRNA-MATE to align, annotate, and analyze miRNAs and their isomiRs. We find that isomiRs share sequence and expression characteristics with canonical miRNAs, and are generally strongly correlated with canonical miRNA expression. A large proportion of isomiRs potentially derive from AGO2 cleavage independent of Dicer. We isolated polyribosome-associated mRNA, captured the mRNA-bound miRNAs, and found that isomiRs and canonical miRNAs are equally associated with translational machinery. Finally, we transfected cells with biotinylated RNA duplexes encoding isomiRs or their canonical counterparts and directly assayed their mRNA targets. These studies allow us to experimentally determine genome-wide mRNA targets, and these experiments showed substantial overlap in functional mRNA networks suppressed by both canonical miRNAs and their isomiRs.
Together, these results find isomiRs to be biologically relevant and functionally cooperative partners of canonical miRNAs that act coordinately to target pathways of functionally related genes. This work exposes the complexity of the miRNA-transcriptome, and helps explain a major miRNA paradox: how specific regulation of biological processes can occur when the specificity of miRNA targeting is mediated by only 6 to 11 nucleotides.
PMCID: PMC3334621  PMID: 22208850
10.  PINA v2.0: mining interactome modules 
Nucleic Acids Research  2011;40(Database issue):D862-D865.
The Protein Interaction Network Analysis (PINA) platform is a comprehensive web resource, which includes a database of unified protein–protein interaction data integrated from six manually curated public databases, and a set of built-in tools for network construction, filtering, analysis and visualization. The second version of PINA enhances its utility for studies of protein interactions at a network level, by including multiple collections of interaction modules identified by different clustering approaches from the whole network of protein interactions (‘interactome’) for six model organisms. All identified modules are fully annotated by enriched Gene Ontology terms, KEGG pathways, Pfam domains and the chemical and genetic perturbations collection from MSigDB. Moreover, a new tool is provided for module enrichment analysis in addition to simple query function. The interactome data are also available on the web site for further bioinformatics analysis. PINA is freely accessible at
PMCID: PMC3244997  PMID: 22067443
11.  From transcriptome to biological function: environmental stress in an ectothermic vertebrate, the coral reef fish Pomacentrus moluccensis 
BMC Genomics  2007;8:358.
Our understanding of the importance of transcriptional regulation for biological function is continuously improving. We still know, however, comparatively little about how environmentally induced stress affects gene expression in vertebrates, and the consistency of transcriptional stress responses to different types of environmental stress. In this study, we used a multi-stressor approach to identify components of a common stress response as well as components unique to different types of environmental stress. We exposed individuals of the coral reef fish Pomacentrus moluccensis to hypoxic, hyposmotic, cold and heat shock and measured the responses of approximately 16,000 genes in liver. We also compared winter and summer responses to heat shock to examine the capacity for such responses to vary with acclimation to different ambient temperatures.
We identified a series of gene functions that were involved in all stress responses examined here, suggesting some common effects of stress on biological function. These common responses were achieved by the regulation of largely independent sets of genes; the responses of individual genes varied greatly across different stress types. In response to heat exposure over five days, a total of 324 gene loci were differentially expressed. Many heat-responsive genes had functions associated with protein turnover, metabolism, and the response to oxidative stress. We were also able to identify groups of co-regulated genes whose members shared similar functions.
This is the first environmental genomic study to measure gene regulation in response to different environmental stressors in a natural population of a warm-adapted ectothermic vertebrate. We have shown that different types of environmental stress induce expression changes in genes with similar functions, but that the responses of individual genes vary between stress types. The functions of heat-responsive genes suggest that prolonged heat exposure leads to oxidative stress and protein damage, a challenge of the immune system, and the re-allocation of energy sources. This study hence offers insight into the effects of environmental stress on biological function and sheds light on the expected sensitivity of coral reef fishes to elevated temperatures in the future.
PMCID: PMC2222645  PMID: 17916261

