COSMIC: Case Study

Catalogue Of Somatic Mutations In Cancer - a database curated by experts covering a wide variety of somatic mutation mechanisms involved in human cancer

'Human colon cancer cells in culture. Colon cancer is the third most common cancer in Britain. People are more likely to develop this cancer if they eat a diet high in animal fats.' by Annie Cavanagh. Credit: Annie Cavanagh. CC BY-NC


What is COSMIC?

COSMIC is an online catalogue of somatic mutations in cancer, a gold standard data resource of global impact in the field of cancer genomics. The database has utility for basic research into the mechanisms of cancer but is also relevant for pharmaceutical R&D activities in oncology as well as the development of products and services in cancer diagnosis and patient stratification.

The problem

How to curate and organise exponentially growing amounts of data?

In 2013, the advancing genomic revolution was increasing both the rate at which cancer mutational data became available and the cancer research community's need for an accurate, centralised database to aggregate and interrogate relevant data.

For over a decade, COSMIC had filled the role of organising and hosting quality data, but it risked failing to keep pace with the exponential growth of available data that needed curating. The danger was that, over time, this important resource could lose its relevance and deprive the cancer research community of a key source of up-to-date information.

The solution

Secure industry support to double the size of the COSMIC team

'Skin cancer cell' by Annie Cavanagh. Credit: Annie Cavanagh. CC BY-NC


To keep pace and stay relevant in an evolving landscape of genomic data generation and use, the solution was to grow the COSMIC team quickly and substantially, increasing its curation capability and developing new tools for extracting information from the data.

An additional challenge was that traditional grant funding alone could not support the required growth of COSMIC over the long term, so we needed to find new ways of ensuring its continuity.

How did we do it?

Benchmarking | Consultancy | Interviews | Testing

'Castration resistant prostate cancer, human tissue' by Mateus Crespo, The Institute of Cancer Research. Credit: Mateus Crespo, The Institute of Cancer Research. CC BY


Finding the best solution

The Enterprise & Innovation Team worked with the Head of COSMIC to propose a sustainability plan aimed at securing the relevance and utility of the database. From the outset, we decided that maximal access to a high-quality database for the entire cancer research community was a prerequisite.

We established basic principles to drive our search for a sustainable development model:

- The COSMIC website would be free for all to access, providing a window into the database and allowing anyone interested in the link between cancer and genetic mutation to access the vast array of information available;

- Academic users would have free access to the entire database in a downloadable format, allowing integration with other data sources or the development of new data visualisation tools;

- We would not consider models in which users would access different content dependent on whether they were fee paying or accessing through academic free access.

We explored a number of options, namely: collaborative development with industry, license-controlled access to the database and provision of fee for service. Ahead of recommending a strategy to support long-term growth for COSMIC, we:

- Carried out a benchmarking exercise that involved engaging with a number of academic providers of database resources who had already adopted commercial distribution models;

- Worked with a consultancy firm that had experience of managing the Kyoto Encyclopedia of Genes and Genomes' commercial licensing on behalf of Kyoto University, for advice on structuring a licensing model;

- Interviewed a large number of companies, primarily from the pharmaceutical and molecular diagnostics sectors, to gauge their future use, needs and price point sensitivity. Most of these interviews were conducted face-to-face with bioinformatics and oncology R&D leaders.

Our plan

Dual licensing & co-development with industry

'Human brain cancer stem cells treated with graphene, SEM' by Dr Khuloud T. Al-Jamal. Credit: Izzat Suffian, Pedro Costa, Stephen Pollard, David McCarthy & Khuloud T. Al-Jamal. CC BY


Following our research, we decided to focus upon a dual licensing model under which academics would still download the COSMIC database freely while industry users would pay a simple annual subscription fee commensurate with the size of company. Resultant licensing fees are entirely reinvested back into COSMIC, to further improve the website and the database for the entire cancer research community.

Also, we sought tactical or strategic collaborations with industry to help fund and develop new tools and content for the COSMIC database. This means that our industry partners are able to accelerate the development of parts of the database that they are most interested in. This benefits the whole cancer research community as these database improvements are then shared with all its users, as part of the COSMIC package.

In December 2014, the proposed strategy was approved by Genome Research Limited, subject to ongoing review. To test this model and its acceptance in the community, a two-year pilot study was launched in March 2015.

Pilot results

Licensing and collaboration exceeded expectations

The Enterprise & Innovation Team established processes for responding to commercial enquiries, as well as negotiating and managing license contracts. Negotiations with commercial users revealed the diversity of COSMIC commercial users and helped us to create a wider range of licensing contracts to suit the varied needs of our customers.

During a 2-year period, over 100 different organisations took a license to download the COSMIC database. The majority of licensees were either biotech/pharma users whose internal R&D activity includes genomic and bioinformatic platforms or smaller entities providing genomic services principally to the research community (academic, clinical and commercial). The income generated by licensing under the pilot, together with industry collaborations, enabled us to double the size of the COSMIC team to a total of 20 individuals by March 2017. The quality of curation, derived manually by PhD scientists with an oncology background, is a unique feature of COSMIC. Extra curators were therefore recruited as a priority to address the immediate challenge of the increasing volume of data in the field.

Collaborations were the source of two main innovations in COSMIC. The first is COSMIC-3D, a new feature developed with Astex Pharmaceuticals that visualises cancer mutations within 3D protein models as a support tool for drug discovery. The second is the Cancer Gene Census, developed in partnership with Open Targets, which catalogues over 700 genes implicated in causing cancer together with evidence of each gene’s role in the tumour process (Cancer Hallmarks).

The pilot phase demonstrated both the acceptability of the licensing model to a diverse set of users from industry and the value of industry collaboration in helping us maintain COSMIC as a gold standard database in cancer bioinformatics that is free to academic users.

The future?

Building long term sustainability and ongoing engagement with the cancer research community

COSMIC licensing is based on a yearly subscription model with new releases of the database made available each quarter. To support COSMIC over the long term, this model relies on motivating users to renew subscriptions. We are committed to maintaining the quality of these updates and encouraging growth in subscriptions. In particular, efforts will be made to:

1) engage with the community to ensure new content fits COSMIC users’ ongoing needs;

2) explore new markets.

Dedicated communications will engage the community of current COSMIC users (both from industry and academia), to inform the ongoing development of the database based on user feedback. The COSMIC team is also looking to develop training workshops that will provide an opportunity to interact directly with the users, creating touch-points for cross-learning and collaboration.

A Scientific Advisory Board will be appointed to help steer the development of COSMIC in a direction that matches emerging scientific needs.

The Enterprise & Innovation Team has an effective industry network and supports COSMIC by exploring new partnership and product opportunities focused on areas of industry needs and growth.

For further information on COSMIC, go to: www.sanger.ac.uk/science/tools/cosmic

We are always looking for new collaborative projects and would be happy to hear from you.

View of the Sulston Building cafeteria and meeting space. Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
