Sunday, 9 October 2016

Recent papers that the Genomics Core has helped with

I like to highlight some of the really interesting work we've been involved with, or that has come out of the Institute, from time to time, and I recently updated our lab home page with links to a couple of papers. I thought I'd take the opportunity to write about them in a bit more detail here. Many of you will already know I run the Genomics Core facility at CRUK's Cambridge Institute. We do a lot of Illumina sequencing! The lab works on a huge number of projects for the research groups here in the Institute, and also across many groups in Cambridge via a long-running sequencing collaboration. We do some R&D work in my lab, but >90% of our efforts are working with, or for, other research groups.
Highlights from the last year's genomics research include work from the Caldas group, who have completed three projects over the last year that I've included here: 1) profiling of almost 2500 breast cancer patients for mutational analysis of 173 genes using a targeted pull-down (Pereira et al. Nature Communications 2016); 2) cancer exomes from Murtaza et al.; 3) PDXs from Bruna et al.; and work from the Balasubramanian group, who have shown that it is possible to capture and sequence double-strand DNA breaks (DSBs) in situ and directly map these at single-nucleotide resolution, enabling the study of DSB origin (Lensing et al. Nature Methods 2016). The rapid speed and unbiased nature of the genome-wide experiments being performed in the Institute, and often prepped and sequenced in the Genomics Core, continue to increase our understanding of cancer biology.

1) Mutational analysis of 173 genes in 2433 tumours: Bernard Pereira and Suet-Feung Chin, in Carlos Caldas's research group, published a massive breast cancer gene resequencing project, which is helping to improve our understanding of patient classification into clinically relevant subtypes. They showed that using both mutation and copy-number analysis provides the best stratification currently possible. The project analysed almost 2500 tumours from the METABRIC study (see my previous blog), sequencing the 173 most frequently mutated breast cancer genes. They found 40 mutated genes that are instrumental in breast cancer progression. There was high variation in the mutational frequency of some of these genes, e.g. TP53 was mutated in 85% of IntClust10, and around half of IntClust4/5/6 and 9, but less than 15% of IntClust3/7/8, which are good-prognosis tumours.

Analysis of the clonal distribution of mutations (accounting for CNVs) showed that most drivers were present in nearly all tumour cells and probably occurred early in the evolution of the tumour. Samples from patients with better prognosis had fewer apparent tumour clones than samples from patients with poorer outcomes. Inactivating mutations in SMAD4 were associated with worse outcomes across the IntClusts, but TP53 mutations were more strongly associated with worse outcome in ER+ disease, and TP53 DNA-binding domain mutations were associated with the worst outcomes. PIK3CA mutations were prognostic in ER-, but not in ER+ patients.

The prevalence of mutations across histological subtypes

The paper reported the finding of 10 new breast cancer driver genes that were previously known drivers in other cancer types. Hopefully this will allow the relatively quick migration of treatment from one setting to another. Cancer Research UK's chief clinician Professor Peter Johnson was quoted as saying "This study gives us more vital information about how breast cancer develops and why some types are more difficult to treat than others, and this information is a great resource for researchers all over the world" - the release of this data via cBioPortal is likely to improve breast cancer research, particularly as the METABRIC study has a large sample size and long-term clinical follow-up.

What did Genomics do: This project was a collaboration between the Caldas group, Sam Aparicio's group in Vancouver and Illumina. The sequencing for this project was done by Illumina, but Michelle Pugh from my group helped prep the rapid capture libraries. Michelle is now working at Inivata.

2) Breast cancer PDX encyclopaedia: Alejandra Bruna and Oscar Rueda in the Caldas group have done a massive amount of work in creating one of the first, and largest, series of breast cancer patient-derived xenografts (PDXs). These PDX models allow far more than a single molecular analysis to be performed from a patient sample. Tumour tissue becomes possibly limitless, and follow-up studies and even "clinical trials" can be carried out in a level of detail, and at a rate, that is very tough to achieve in a human setting.

Cell lines have very limited inter- and intra-tumour heterogeneity and are adapted to growth on plastic, so it is not difficult to see their shortcomings in the development of new treatments. But generating PDX models is hard, and until now people have usually focused on making only one or two, or a handful. The Caldas group wanted to build something larger, and create a resource that reflected the full molecular pathology they had revealed in the METABRIC study. So far 83 PDX models have been created. All have been shown to be re-established after freezing and so are a long-term resource. Both primary and metastatic models have been created; 60% are from ER+ patients. The PDXs were subjected to extensive molecular characterisation: shallow whole-genome sequencing (sWGS) of pre-capture exome libraries for CNV analysis, a method we developed with the Caldas group (see this post from 2014); exome sequencing; reduced-representation bisulfite sequencing (RRBS) for DNA methylation; and gene expression arrays (the project stuck with arrays for the best possible correlation to METABRIC).

Importantly, the project was able to show that PDXs retained the same histological and molecular pathology through passaging, and that the intra-tumour heterogeneity and clonal architecture were maintained.

Perhaps most exciting was the demonstration that PDXs could be used for high-throughput drug screening, to test drug combinations, and could predict in vivo drug response. CRUK's Science Blog covered this paper and discussed how this work is likely to be seen as a better way to discover and develop new cancer drugs.

The Bioinformatics core helped to create the data portal for this project. The Breast Cancer PDTX Encyclopaedia is an open resource that allows users to browse the data. The publication describes the project methods in detail, which will hopefully encourage others to create additional PDX models for breast and/or other cancers.

What did Genomics do: The core helped with much of the molecular characterisation. We prepped and sequenced all the sWGS and exome libraries, and we sequenced the RRBS libraries (and did some useful work tweaking the amount of PhiX needed for these libraries). Illumina HT12 arrays were processed at the Department of Pathology. The Cambridge Institute Bioinformatics, Histopathology, Flow Cytometry, Biological Resource, and Bio-repository core facilities all helped with this project.

3) Mapping double-strand breaks: Stefanie Lensing, in the group of Shankar Balasubramanian (co-founder of Solexa), developed an improved method for mapping double-strand breaks. DSBs are one of the major causes of mutations and rearrangements, and several groups have previously characterised them. However, the methods used have not been ideal: ChIP-seq captures DSB proteins rather than the breaks themselves, and BLESS involves a relatively inefficient blunt-end ligation and lots of PCR, and produces low-diversity libraries (see this paper and my post). Stefanie developed DSBCapture to capture DSBs in situ using a modified Illumina P5 adapter, such that after ligation single-end sequencing could be used to map breaks with nucleotide resolution. She directly compared BLESS with DSBCapture and showed that the new method identified 4.5-fold more DSBs in normal human epidermal keratinocytes.
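To make the single-end mapping idea a little more concrete, here is a minimal Python sketch. It is not the paper's pipeline: it assumes that the first sequenced base of each read sits directly next to the ligated adapter, so that the 5' end of the aligned read marks the break, and the AlignedRead record and function names are hypothetical, made up for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class AlignedRead(NamedTuple):
    """Hypothetical minimal alignment record (0-based, end-exclusive)."""
    chrom: str
    start: int   # leftmost aligned base
    end: int     # one past the rightmost aligned base
    strand: str  # "+" or "-"


def break_position(read: AlignedRead) -> Tuple[str, int]:
    """Return the coordinate of the read's 5' base.

    Assumption: the adapter was ligated directly to the broken end, so
    the 5' base of the read is the base adjacent to the break.
    """
    if read.strand == "+":
        return read.chrom, read.start
    if read.strand == "-":
        return read.chrom, read.end - 1
    raise ValueError(f"unknown strand: {read.strand!r}")


def count_breaks(reads: Iterable[AlignedRead]) -> Counter:
    """Count how many reads support each candidate break position."""
    return Counter(break_position(r) for r in reads)


example_reads = [
    AlignedRead("chr1", 100, 150, "+"),
    AlignedRead("chr1", 51, 101, "-"),
    AlignedRead("chr1", 100, 140, "+"),
]
break_counts = count_breaks(example_reads)
```

In this toy example all three reads point at the same base (chr1, position 100), whichever strand they map to. A real analysis would start from a BAM file and call peaks over these positions rather than count single bases.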

The paper also shows that G-quadruplex DNA secondary structures, which have previously been implicated as fragile sites in the genome, were 3-fold enriched over random within DSBCapture peaks. There was very large enrichment of DSBCapture peaks in regulatory, nucleosome-depleted regions, and many DSB sites were also sites for RNA Pol II, revealing a relationship between DSBs and elevated transcription within nucleosome-depleted chromatin.

You can get a detailed protocol on Nature's Protocol Exchange.

What did Genomics do: The core performed the sequencing for this project.

It is looking like it will be possible to combine long-read or synthetic-read phasing methods with exome targeting to sequence DNA repair genes in patients. By using PacBio and/or 10X Genomics it would be possible to definitively test patients for mutations in cis or in trans, adding clinically relevant information currently not available through short-read genomes and exomes. DSBCapture and phasing of DNA repair gene mutations are likely to be useful methods that could be translated in the next few years.
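As a rough illustration of what a cis/trans call looks like once variants have been phased, here is a small Python sketch. It assumes each variant already carries a phase-set identifier and a pair of phased alleles, in the spirit of a phased VCF genotype such as 0|1. The PhasedVariant record, the function name and the variant names are all hypothetical, and this is not tied to any particular PacBio or 10X Genomics tool.

```python
from typing import NamedTuple, Tuple


class PhasedVariant(NamedTuple):
    """Hypothetical phased call: alleles are (haplotype 1, haplotype 2)."""
    name: str
    phase_set: int            # variants are only comparable within a phase set
    alleles: Tuple[int, int]  # 0 = reference, 1 = alternate


def cis_or_trans(a: PhasedVariant, b: PhasedVariant) -> str:
    """Classify two phased variants as 'cis', 'trans' or undetermined."""
    if a.phase_set != b.phase_set:
        return "unphased"  # no phase information linking the two sites
    haps_a = {i for i, allele in enumerate(a.alleles) if allele == 1}
    haps_b = {i for i, allele in enumerate(b.alleles) if allele == 1}
    if not haps_a or not haps_b:
        return "not both present"
    if haps_a & haps_b:
        return "cis"    # at least one haplotype carries both alternate alleles
    return "trans"      # the alternate alleles sit on opposite haplotypes


variant_1 = PhasedVariant("repair_gene_variant_1", 7, (1, 0))
variant_2 = PhasedVariant("repair_gene_variant_2", 7, (0, 1))
relationship = cis_or_trans(variant_1, variant_2)
```

For two heterozygous mutations in the same DNA repair gene, a "trans" result means both copies of the gene carry a mutation. That is the distinction unphased short-read data cannot make.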

4) Understanding metastatic disease with ctDNA sequencing: Muhammed Murtaza (TGen and Rosenfeld group) and Sarah-Jane Dawson (Peter Mac and Caldas group) published a detailed analysis of a single breast cancer patient using tumour- and liquid-biopsy analysis. They collected 8 tumour biopsies and 9 plasma samples over 1,193 days of clinical follow-up for an ER+/HER2+ breast cancer patient. They exquisitely characterised the tumour evolution during therapy using exome and targeted amplicon sequencing (TAm-seq). This work is the first to really demonstrate that liquid biopsy truly recapitulates the tumour burden and metastatic heterogeneity - and is an important step along the road to using liquid biopsy in the clinic.

The paper reports the finding of over 350 candidate non-synonymous SNVs from the exome sequencing data, and just over 300 were successfully sequenced by TAm-seq to an average coverage of up to 8000x! The amount of data generated from a single patient really allowed the group to delve deep into the evolution of the tumour and the relationships between the primary and metastatic sites (see Fig. 1 above). The authors do point out that this kind of study needs to be replicated to show how well liquid biopsy can be used to track other breast cancer patients, and how it performs in other cancers, with potentially different evolutionary tracks and varying metastatic sites.

Importantly, the results show again that if actionable mutations are identified in circulating DNA then this may inform the choice of targeted therapies.

What did Genomics do: This project was not processed by the Genomics Core; however, it leads on from work we were involved with. It was also such a great paper that I wanted to highlight it here!
