The rise of the embedded data scientist – Lessons from the pharmaceutical front line

Pharmaceutical companies are increasingly using predictive science as a platform for R&D. Critical to this approach is building models from non-clinical and clinical data.


What makes this tough is that you have to integrate and interrogate multiple data assets across different stages of drug development. These data assets are diverse in type, scale and quality, and in many cases the data is heterogeneous in format, even where some data standards are in place. On some projects I have seen teams spend several weeks gathering data and manipulating it into a usable form.
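To give a flavour of what that manipulation involves, here is a minimal sketch in Python. It assumes two hypothetical sources that report the same measurement under different column names and units. The source names, column names and values are all invented for illustration and are not taken from any real project or data standard.

```python
# Illustrative only: every source name, column name and value is hypothetical.

# Map each source's column names onto one shared schema.
# The clinical source reports ug/L, which is numerically equal to ng/mL,
# so no unit conversion is needed in this particular case.
COLUMN_MAPS = {
    "preclinical": {"animal_id": "subject_id", "conc_ng_ml": "concentration_ng_ml"},
    "clinical": {"SUBJID": "subject_id", "CONC_UG_L": "concentration_ng_ml"},
}


def harmonise(record, source):
    """Rename a record's columns to the shared schema and tag its origin."""
    mapping = COLUMN_MAPS[source]
    unknown = sorted(set(record) - set(mapping))
    if unknown:
        # Fail loudly rather than silently dropping data we do not understand.
        raise KeyError(f"unmapped columns from {source}: {unknown}")
    row = {mapping[name]: value for name, value in record.items()}
    # Values often arrive as text; store the concentration as a number.
    row["concentration_ng_ml"] = float(row["concentration_ng_ml"])
    row["source"] = source
    return row


preclinical_rows = [{"animal_id": "A1", "conc_ng_ml": "12.5"}]
clinical_rows = [{"SUBJID": "P001", "CONC_UG_L": "8"}]

combined = [harmonise(r, "preclinical") for r in preclinical_rows]
combined += [harmonise(r, "clinical") for r in clinical_rows]
```

Real data assets are far messier than this, but even a small explicit mapping like this one makes the integration step repeatable instead of something redone by hand for each analysis.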

There are lots of ways of organising and managing data. In some cases drug project teams have taken this on themselves through ad hoc approaches. However, the team rarely has the time and necessary skills to make this an efficient process – it’s often not what they’re paid to do. I’ve also seen more distant approaches, where a separate IT organisation does some business analysis to develop a robust solution. While this may well work eventually, the project team rarely benefits in the short term. Moreover, in these cases I worry about exactly how fit for purpose the software will be when it’s actually deployed.

What I wanted to share is a novel approach: embedding ‘Data Scientists’ within drug-project teams. So what is an embedded data scientist and what do they do? An embedded data scientist helps people out with informatics. They get hold of data, then organise, integrate and analyse it and share the results – though not always all of these things. They work closely and directly with scientists who face daily challenges. Once an embedded data scientist has tailored a solution to a particular challenge and developed prototypes that really benefit the scientists, a more structured and robust approach can be taken to developing a long-term solution. That way the solution eventually delivered has an established value and is known to work.

Some recent work was on a data pilot for an early-clinical-phase drug project within Oncology. The team were trying to predict drug behaviour in a clinical setting by analysing exposure versus effect. I was embedded in the team and worked iteratively with a range of stakeholders to find some solutions. We initially took a very ad hoc approach, taking preclinical data and evaluating their predictions. Being able to do this upfront for the study team meant they were able to make decisions on the ongoing data rather than waiting for a big chunk at the end. So the approach is certainly incremental and agile, but with a particular focus on the science, on working prototypes and on really establishing value before a big investment is made. Now, the requirements that were established in that initial phase have been refined and generalised, and we intend to roll out a standard solution across a much larger set of projects.

This approach has clearly been beneficial: some software for clinical trial monitoring that we’ve been involved in developing within an R&D team at AstraZeneca recently won the 2014 Bio-IT Best Practice award for best Clinical and Health IT project. I played a significant role in the development of that software, operating exactly as just described. I emphasised this way of working when I authored the award entry, so I guess it must have ‘struck a chord’ with the judges.

So is this something that can be applied to other industries? My feeling is that it can, but that the key to success is the ability of the data scientist to communicate with the customers, become a part of the team, earn the trust of senior stakeholders and work alongside technical software developers to translate prototypes into robust solutions. In my case the team was a drug development and pre-clinical research team, but this could equally work in another sector where data-driven, scientific decision-making is paramount. When this is done, the embedded data scientist provides an efficient approach that balances technical innovation with a practical ability to provide timely, effective analytics.

Jamie MacPherson

Dr Jamie MacPherson works as a Tessella Consultant, a position he has held for two years. Tessella ...

© Copyright 2018 Tessella
All rights reserved