AI in drug discovery: do you know the value of your healthcare data?

Artificial intelligence (AI) is an umbrella term for technologies that can mimic traits associated with the human mind. These technologies can learn from experience and can be trained to accomplish specific tasks (such as problem-solving) by processing large amounts of data and recognising patterns in that data.


Tech and Life Sciences converge

AI remains a hot topic in the life sciences sector, because it offers the chance to expedite various stages of the drug discovery process and ultimately to decrease the time it takes to bring novel drugs to market. For example, AI can reduce the number of drug targets, help select the most appropriate pre-clinical model, better target the patient pool for a clinical trial and model drug responses to reduce the number of trial participants – bringing huge time and cost savings and helping innovators gain first-mover advantage.

However, AI-driven research is fundamentally only as useful as the datasets to which the AI is applied. Put simply, the dataset usually has to be substantial in size, as rich and diverse as possible and, perhaps even more importantly, made up of high-quality data. You truly do “get out what you put in”. Yet large high-quality datasets can be extremely difficult to obtain and, even then, the data needs to be suitably cleaned, categorised and handled before it is usable for AI research. This task becomes even more challenging when dealing with different types of data and/or data from different sources which are not necessarily compatible.

Therefore, while the use of AI in drug discovery offers revolutionary opportunities (across both the public and private sector), the key questions on everyone’s lips are how can you access and protect this data and, crucially, how can you extract value from applying AI to it?

How best to protect and extract value from healthcare data

Healthcare data used in research in the UK is predominantly pseudonymised (or sometimes anonymised) patient-level data from patient or clinical trial records. These datasets are an invaluable source of information when they are used to inform research or patient care.

When sharing data, it is crucial to ensure that the value in the data is protected. Most commonly, datasets are licensed only for a specific use and protected by appropriate confidentiality obligations. Licence agreements often limit the methods that can be applied to the data and restrict combining the licensed data with other data sources. In any event, restrictions on the use of the data may be dictated by data privacy and patient confidentiality requirements (which are not examined in this article) – but the scope of the permitted use should always be considered more holistically, in line with your organisation’s IP strategy.

Typically, in any project, the main value lies in the results generated from the data, not the raw data itself. Therefore, the ownership of new IP generated is of key importance to those involved and can be fiercely contested – including in respect of any models or improvements to AI systems that may arise, or methodology that is developed or trained on datasets.

In some cases, an organisation may have little or no interest in owning or having access to the newly-developed IP itself, even though it may have been developed off the back of its datasets – for example, a pharma company may have no desire to own any improvements to a technology company’s AI tool, nor may an NHS trust have any interest in claiming proprietary rights in a pharma company’s drug. Even so, a party making its datasets available should still consider whether it is appropriate for it to share, in some way, in the value that its contractual partner derives from commercialising the data or the resulting IP. Possible value-sharing mechanisms include obtaining:

  • licence fee(s) in return for access to the data;
  • royalties or a revenue share from the sales of products or services that are developed with the benefit of the data;
  • lump sum payments linked to product development and commercial milestones; and/or
  • an equity stake or share options in the partner (or other entity) that commercialises the data/resulting IP.

What (if any) value-sharing mechanism is appropriate will depend on the individual circumstances, taking into account factors such as the quantity and quality of the datasets provided in the first place, the extent to which the resulting products or services are derived from them, and the level of investment and risk taken by the contractual partner to develop and commercialise the results.
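To see how the cash-based mechanisms listed above might combine, the following is a minimal illustrative sketch only. Every name and figure in it is hypothetical and is not drawn from any real agreement; it assumes a flat licence fee, a royalty expressed in basis points of net sales, and equal milestone payments. An equity stake is left out because its value cannot be reduced to a simple sum.

```python
# Illustrative only: all parameter names and figures are hypothetical.

def data_provider_return(licence_fee, royalty_bps, net_sales,
                         milestones_hit, milestone_payment):
    """Total cash return to the party that made its datasets available.

    Amounts are whole pounds. The royalty rate is given in basis points
    (1 basis point = 0.01%), so integer arithmetic avoids rounding drift.
    """
    royalties = net_sales * royalty_bps // 10_000
    milestones = milestones_hit * milestone_payment
    return licence_fee + royalties + milestones


# Hypothetical deal: 50,000 licence fee, 2% royalty on 10m of net sales,
# and two milestones reached at 100,000 each.
total = data_provider_return(
    licence_fee=50_000,
    royalty_bps=200,
    net_sales=10_000_000,
    milestones_hit=2,
    milestone_payment=100_000,
)
```

In practice the balance between these elements would be a matter for negotiation, reflecting the factors described above.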

Public body perspective

The UK National Health Service (NHS) collects large amounts of data and has comprehensive longitudinal patient-level datasets. Similarly, the ambitious NHS Genomic Medicine Service aims to sequence 500,000 whole genomes by 2023/24 to help transform healthcare for the benefit of patients. From a public sector perspective, these datasets hold economic value (often untapped and underutilised) but also offer wider societal benefit and bring real possibilities of enhancing patient care. However, research with AI is a specialised task that most NHS Trusts do not have the capacity to undertake. Therefore, while the NHS does not “sell” data per se, various NHS entities engage in a range of collaborations with third parties in respect of this type of data to enhance care.

On 6 January 2021, the AI Council published its independent “AI Roadmap”, making recommendations to help the UK Government shape its strategic direction on AI. One of the AI Council’s 16 key recommendations is to build on the work of NHSX and others to lead the way in using AI to improve outcomes and create value in healthcare. The report proposes that the UK Government should focus on making more public sector data safely and securely available. It encourages developing smart strategies for data sharing and new partnership models with SMEs – and, in particular, the creation of a data strategy for healthcare and a strong industry-specific governance framework.

To be clear, the AI Council’s report is advisory only; it is not a government policy or strategy. Nonetheless, on 12 March 2021, Digital Secretary Oliver Dowden announced that the UK Government is currently developing a formal National AI Strategy (to be published later this year), which will take the UK AI Roadmap recommendations into account. However, there is currently no clear, definitive strategy for how the NHS should share in the benefits that can be gained from a third party’s use of NHS patient data. Therefore, when implementing the National AI Strategy in a healthcare context, not only will the relevant framework need to address questions on privacy, ethics and security, but it should also clearly define the NHS’s strategy on value recognition from its datasets.

Public acceptance will be vital for the success of any policy that deals with the sharing of individuals’ health data. A strategy that is hell-bent on the NHS maximising its financial returns would be untenable, but equally one in which the NHS’s contribution is undervalued, and where data is seen to be offered up on the cheap (at the cost of the taxpayer), would not gain much public support either. Therefore, even though patient benefit should still be the key reason for the NHS sharing its datasets, there is no reason why public health bodies cannot also consider value-sharing mechanisms such as the ones mentioned above. In particular, options such as giving the NHS the ability to procure products developed from its data for free, or at a discount, may offer a good solution in certain cases.

Commercial organisation perspective

As explained, large high-quality datasets create considerable opportunities for developing new drugs. Therefore, inevitably, innovators in the life sciences industry are increasingly interested in using AI for drug discovery – and most companies in the sector have AI projects afoot. However, although AI tools have now started to come into their own in drug discovery, these initiatives are far from straightforward.

Many life sciences companies lack access to sufficient data (or the right kind of data) and/or do not have the internal resources or relevant expertise to conduct AI-driven analysis. As such, they are often faced with the need to outsource the analysis of data to specialist service providers, which can create real risks. Just like an NHS trust, a life sciences company must avoid “giving away” future value from its dataset to its service providers (or, even worse, its competitors). Paradoxically, moving first could put you in a worse position: competitors following in your trail, and using the same providers, could reap the benefits of AI tools that have become more sophisticated and valuable through exposure to your data. Clearly, it is attractive to work with providers whose AI systems have already interrogated large datasets from other life sciences companies and which have been improved as a result.

Therefore, one option for early adopters may be to try to negotiate a discounted rate for the services. Alternatively, they could explore other value-sharing mechanisms (such as those suggested above) – for example, their own share in any future revenues earned by the AI providers in respect of the future “use” of their data – although this is likely to lead to complex and protracted negotiations. Another interesting development (and indeed opportunity) lies in data-sharing collaborations between life sciences companies and research institutions. Take, for example, the MELLODDY project, in which a number of pharmaceutical companies and research institutions collaborate by enabling AI to learn from their datasets without sharing the underlying data – thereby retaining the value in the IP within each dataset for its owner, while also extracting value beyond what would have been possible without pooling what the AI learns. Therefore, while it may seem counter-intuitive, collaborating with competitors may be an increasingly popular route to extract value from AI-driven drug discovery research.
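For readers curious how an AI model can learn from several datasets that are never pooled, the following is a minimal sketch of one generic technique, often called federated averaging. It is not a description of how any particular project is built: the toy one-parameter model, the partner datasets and all settings are hypothetical and chosen purely for illustration. Each partner trains on its own data locally, and only the resulting model parameter (never the data) is sent back to be averaged.

```python
# Illustrative federated averaging on a toy model y = w * x.
# All data, names and settings are hypothetical.

def local_update(w, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps on one partner's private data."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(w, partner_datasets, rounds=50):
    """Each round, partners train locally and only their weights are pooled."""
    for _ in range(rounds):
        # Each partner returns its updated weight and its dataset size.
        updates = [(local_update(w, d), len(d)) for d in partner_datasets]
        total = sum(n for _, n in updates)
        # Weighted average: larger datasets carry more influence.
        w = sum(w_i * n for w_i, n in updates) / total
    return w


# Two partners hold separate (x, y) records that are never combined.
partner_a = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
partner_b = [(4.0, 8.1), (5.0, 9.8)]

shared_w = federated_average(0.0, [partner_a, partner_b])
```

The shared parameter ends up close to the value that best fits both partners’ records, even though neither partner ever sees the other’s data. From a contractual perspective, the point of interest is that what passes between the parties is the trained model, which is why ownership of models and improvements tends to be a central term in such collaborations.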


While AI is not necessarily the “golden ticket” for drug discovery, if used properly, it undoubtedly offers extraordinary opportunities for all. Through its data, the NHS is uniquely placed to help improve patient care. Similarly, if the risks are navigated carefully, innovators can extract real value from their own (and others’) healthcare data. However, considered analysis, suitable contractual protections and potentially unique business models are all required to ensure the use of AI in drug discovery is a success.


Bristows Life Sciences Summit 2021

Following on from the success of our previous Bristows Life Sciences Summit on gene editing, we will be exploring the use of artificial intelligence in the medical sphere in another big debate in November 2021.

Keep an eye out on our events page for further details, and register your interest here.

