Centre of Excellence for AI/ML in Life Sciences

Register for AI Community Newsletter

We are running a very active webinar program that highlights diverse use cases in AI and ML in the biotechnology and pharmaceutical industries.

We welcome suggestions for other topics and speakers. Please contact Vladimir Makarov (vladimir.makarov at pistoiaalliance dot org).

Session | Date | Speakers | Topics & Themes

23 May 2018

Slides and Talk

Prashant Natarajan 
A Brief History of AI/ML
  • Big Data/ML/DL/AI - fundamentals and concepts 
  • Data Fidelity
  • Real-life use cases in health & life sciences from their book (www.BigDataCXO.com)
  • Q & A

21 June 2018

Slides and talk

Prashant Natarajan 
Demystifying AI – Part 2
  • Considerations for Life Sciences
  • ML 102
  • TIE – Interpretability & Explainability
  • Conversational AI: Bot Basics
  • Q & A

1 Oct 2018

Launch Slides


Register for datathon

Datathon launch

Drug Repurposing Datathon

  • Datathon launch details - Rare disease drug repurposing
  • More details & register on Datathon
AI/ML Workshop

9 Oct 2018

More details

Slides and Notes

CoE AI Workshop and network meeting

Hosting an AI/ML workshop to allow our community to meet, share ideas and make progress on their AI/ML adoption, implementation planning and impact.

Speakers from across the industry and panels, plus networking


18 Oct 2018



Joint meeting with PRISME forum

Rescheduled from Sep 20

Maximizing Value from Healthcare Data Using Machine Learning

  • AI maturity model
  • CoE update and Datathon

10 Dec 2018



Webinar panel:

  • Terry Stouch
  • Jamie Powers
  • Isabella Feierberg
  • Sirarat Sarntivijai
  • Jabe Wilson

Matters in data quality: quality scores for data sets and individual data items; FAIR annotations for the methods by which data are obtained; the value of old data versus new; the value of even new data on its own, and how that can change depending on how it is developed, stored, labeled, retrieved, and interpreted; how data and its use can change with age; the different levels of data quality that different needs require; how the required level of quality, and its acceptable variation, might differ between methods of analysis; how the same data might be considered both junk and useful depending on need; plus standards for all of the above.

26 Feb 2019



Drs. Alex Tropsha and Ola Engkvist

AI/ML in Drug Design - using neural nets to generate new molecules that are synthetically accessible and fit specified properties.

AI/ML Workshop

12 March 2019


Slides and Notes

CoE AI Workshop and network meeting

Hosting an AI/ML workshop to allow our community to meet, share ideas and make progress on their AI/ML adoption, implementation planning and impact.

Speakers from across the industry and panels, plus networking


6 June 2019



Prof. John Overington

Prof. John Overington, the CIO of the Medicines Discovery Catapult, described the AssayNet project and its far-reaching implications.

20 June 2019



Webinar Panel:

  • Aleksandar Poleksic
  • Bruce Aronow
  • Finlay Maclean
  • Jabe Wilson

Pistoia Alliance and Elsevier Datathon Report Webinar on Drug Repurposing

In the late fall and winter of 2018, the Pistoia Alliance, in cooperation with Elsevier and the charitable organizations Cures Within Reach and Mission: Cure, ran a datathon aiming to find drugs suitable for the treatment of childhood chronic pancreatitis, a rare disease that causes extreme suffering. The datathon identified four candidate compounds in just under three months. In this webinar our speakers will discuss the technologies that made this leap possible.


18 September 2019



Host: Paula Matos (Pistoia Alliance)

Webinar Panel:

  • Simon Fortenbacher (GSK)
  • Gergely Szabo (Elsevier)
  • Kirk Brote (Brote Consulting)

Building Trust and Accountability: The Role User Experience Design Can Play in Artificial Intelligence

Our panelists described the principles of UX design and its importance in the context of AI, and illustrated them with case studies of how UX is being applied in AI.

FAIR, AI and ML workshop

22 October 2019


Joint Workshop by the FAIR Implementation Project team and AI/ML CoE

Hosting an AI/ML workshop to allow our community to meet, share ideas and make progress on their AI/ML adoption, implementation planning and impact.

Summary of the workshop


16 January 2020



Talk by Dr. Dennis Wang (University of Sheffield) followed by a panel discussion with Mr. Albert Wang (BMS)

Looking beyond the hype: Applied AI and machine learning in translational medicine

We will discuss possible ways to make ML methods more powerful for discovery and to reduce ambiguity within translational medicine, allowing data-informed decision-making to deliver the next generation of diagnostics and therapeutics to patients more quickly, at lower cost, and at scale.


7 April 2020



Dr. Darren Green, GSK

Automated Molecular Design and the BRADSHAW Platform

Dr. Darren Green discusses how data-driven chemoinformatics methods may automate much of what has historically been done by a medicinal chemist, considers the balance between AI approaches and human expertise, and illustrates his presentation with examples from BRADSHAW, GSK’s experimental automated design environment.


4 May 2020



  • Craig Rhodes, NVIDIA
  • Nicola Rieke, NVIDIA
  • Jennifer Goldsack, DiMe
  • Tim McCarthy, Pfizer
  • Marissa Dockendorf, Merck

How Can Federated AI/ML Learning Support Genomics and Patient Data Analysis to Enable Precision Medicine at Scale?

Organized by the Digital Medicine Program at the Pistoia Alliance and the Digital Medicine (DiMe) Society

How can federated learning help overcome some of the barriers seen in the development of AI-based solutions for pharma, genomics and healthcare? Following the presentation, the panel will debate other elements that could drive the adoption of digital approaches more widely and help answer currently intractable science and business questions.


26 June 2020


  • Andrew Prigodich, Pfizer
  • Peter Henstock, Pfizer
  • James Weatherall, AZ
  • John Overington, Medicines Discovery Catapult
  • Harvard Student Team

Putting AI into Practice

Is it possible to forecast which drug discovery projects will advance to clinical trials?

A talk on "Mining Drug-Target-Disease Trends from Public Data Sources", presented by Andrew Prigodich and Peter V. Henstock from Pfizer and a Harvard University Extension School team (Andrew Wang, Bhavani Shekhawat, Charlie Flanagan, Derek Kinzo, Gerald Ding, Ramandeep Hariai, Roman Burdakov), followed by a panel discussion with James Weatherall (AstraZeneca), John Overington (Medicines Discovery Catapult), and Peter Henstock (Pfizer).


9 December 2020


  • Prof. Atul Butte, UCSF
  • Dr. Beau Norgeot, Anthem
  • Dr. Jen Harrow, ELIXIR EU
  • Dr. Fotis Psomopoulos, ELIXIR EU
  • Prof. Tom Lenaerts, Université Libre de Bruxelles

Minimal Information Standard for an AI Model

Artificial intelligence and machine learning models are used more and more often in the development of pharmaceuticals and as software components in medical devices. However, because clear reporting standards have been lacking, many clinically relevant models have been reported with insufficient detail to properly assess their risks and benefits. Historically, this has made the science underlying these products irreproducible and the deployment and comparison of AI algorithmic solutions hard, and it may expose the users of these products to unequal or unforeseen harms. A standard for reporting biomedically relevant AI/ML models is therefore necessary. In this panel discussion we will brainstorm options for the transparent reporting of AI algorithms in biology and medicine. Participants include Prof. Atul Butte and Dr. Beau Norgeot, authors of the MI-CLAIM checklist recently published in Nature Medicine, and Drs. Jen Harrow, Fotis Psomopoulos, and Tom Lenaerts, who are actively working on the standards for AI and ML in Europe.


20 January 2021


  • Prof. Aleksandar Poleksic, UNI
  • Prof. Lei Xie, CUNY

AI for Drug Repurposing

Chemical-induced gene expression profiles provide a mechanistic signature of phenotypic response, and are thus promising for drug repurposing. However, the use of such data is limited by their sparseness, unreliability, and relatively low throughput.

Our speakers, Drs. Aleksandar Poleksic and Lei Xie, describe two new computational techniques for prediction of the differential gene expression profiles perturbed by de novo chemicals and inference of drug-disease associations.


27 January 2021


Panel discussion:

  • Jérôme Windsor, PharmD, MBA (Moderator), Advisor, Boston Digital Bio Consulting
  • Karine Seymour, MBA, CIO, Medexprim
  • Tim McCarthy, PhD, MBA, VP and Digital Medicine Head, Pfizer
  • Prof. Laure Fournier, Academic Radiologist, Hôpitaux de Paris
  • Angel Alberich-Bayarri, PhD, CEO, Quibim (Quantitative Imaging Biomarkers in Medicine)

Imaging Biomarkers

Biomarkers have become an essential part of the drug discovery and development process. A biomarker-driven approach to developing targeted therapies and patient selection strategies has the potential to increase success in the drug development process, decrease costs, and ultimately improve patient outcomes.

But what about imaging biomarkers? Usually obtained from PET, MRI, and CT scans, they comprise measurements of structural and metabolic features of the body that are used over time to assess disease progression and response to treatment. Imaging biomarkers are an ideal method for drawing evidence from retrospective data and can be used both as inclusion criteria, to select relevant cohorts of patients, and as output data, to quantify responses to treatments.

  • How can imaging be used in early clinical trials to increase confidence in the target and in the new drug?
  • From the investigator's perspective, how can standard imaging best be combined with advanced, personalized phenotypic endpoints in clinical trials?
  • Radiomics, ML and AI, digital patients, synthetic control arms: where is the future of imaging?
  • How can quality real-world data be accessed at scale to create data lakes and develop new imaging markers?

15 February 2021


Prashant Natarajan, Vice President of AI & Analytics Solutions, H2O.ai

Real-World Evidence - Leveraging AI and Analytics For Real Value and Lasting Impact

Real-world evidence is not new, but with advances in processes, technology, policy, and analytics, it is becoming more accessible and usable. RWE is being used to drive real outcomes and lasting impact for pharma, patients/subjects, and other participants in the continuum of care. At the foundation of RWE is data – behaviors, patterns, computational biomarkers, phenotypic/genomic data, imaging, outcomes, and social determinants of health.

The RWE trends in the life sciences and biological sciences are driven by:

  • Datafication, itself driven by the availability of diverse data – big, small, and everything in between
  • Competitive advantages
  • Reducing the time for regulatory approvals
  • Cost and outcomes

While data and descriptive analytics have been in vogue for years, advances in processing RWE – in combination with RCTs via data science, machine/deep learning, and advanced analytics – are creating new value for Pharma companies across the board – not just in R&D and pharmacovigilance but also extending into economic value, sales & marketing, affordable therapies, and patient outcomes.

More importantly, with the success of these analytics and AI efforts, we will see an increasing appetite for more types of RWE – beyond EMRs, all-claims, and commercial data sets – into patient-reported experiences, wearables, at-home devices, and implants.

Creating value at scale and achieving lasting impact is important, doable, and repeatable. This presentation will provide practical recommendations on how to put this tsunami of RWE and data variety to work using the IMPACT framework.

We will conclude with a discussion of representative use cases that pharma and biotechnology organizations can use to move the needle from a product focus to customized/personalized therapies, precision medicine, and population health.

Speaker: Prashant Natarajan, Vice President of AI & Analytics Solutions, H2O.ai and Pistoia Alliance AI CoE Advisory Committee Member

Please note: This presentation was originally delivered during the Qiagen Digital Insights hackathon in February 2021 and is being shared with permission. All rights reserved.


25 March 2021


The Pistoia Alliance DataFAIRy Team:

  • Isabella Feierberg, Associate Principal Scientist, AstraZeneca
  • Dana Vanderwall, Director of Biology & Preclinical IT, Bristol Myers Squibb
  • Rama Balakrishnan, Biomedical Ontology Specialist, Genentech
  • Martin Romacker, Senior Principal Scientist in Scientific Solution Engineering and Architecture, Roche
  • Samantha Jeschonek, Research Scientist, Collaborative Drug Discovery
  • Timothy Ikeda, Automation Principal Scientist, AstraZeneca
  • Gabriel Backiananthan, Novartis
  • Anosha Siripala, Technical Associate Director, Scientific Products, Novartis Institutes for BioMedical Research (NIBR)

Lessons Learned in a Pilot BioAssay Annotation Project

In 2020, a team of scientists from AstraZeneca, Bristol Myers Squibb, Novartis, and Roche set out to find a way to convert unstructured biological assay descriptions into FAIR information objects.

In this talk, we will present the lessons learned in the pilot project to annotate bioassay descriptions en masse and will chart a way to expand this effort in the future.


21 April 2021

There are two sessions: 21 April at 8-9 am PST, and a repeat for the APAC time zones on 22 April at 4-6:30 pm

Dr. Vladimir Makarov, PhD, MBA, Pistoia Alliance

Pistoia Alliance AI CoE, FAIR, and DataFAIRy

Invited short talk at the Research Data Alliance Virtual Plenary "FAIR 4 ML"; see full agenda and the direct link to our session "Defining FAIR for AI"


21 April 2021

Talk starts at 12:20 pm EDT (9:20 am PST)

  • Isabella Feierberg, Associate Principal Scientist, AstraZeneca
  • Dana Vanderwall, Director of Biology & Preclinical IT, Bristol Myers Squibb
  • Rama Balakrishnan, Biomedical Ontology Specialist, Genentech
  • Samantha Jeschonek, Product Manager, CDD

Panel Discussion: The Pistoia Alliance DataFAIRy Project

Part of the Pistoia Alliance Conference - Collaborative R&D in Action


5 May 2021



  • Helena Deus (ZS)
  • Peter Henstock (Pfizer)
  • Margi Sheth, AstraZeneca
  • Prashant Natarajan, Vice President of AI & Analytics Solutions, H2O.ai

Technical strategies against bias in AI

A growing number of reports discuss the urgent need to address bias in decision-making algorithms in healthcare. In fact, a JAMA commentary published in 2021 highlighted systemic kidney transplantation inequities for Black individuals. With AI-based and machine learning techniques increasingly playing a role in healthcare decision making, it becomes necessary to discuss not only the ethical implications but also solutions and approaches to detect and reduce the impact of computer bias in healthcare. The Pistoia Alliance is happy to announce the webinar "Technical strategies against bias in AI", which will bring together industry experts to share lessons learned and discuss possible solutions.


2 June 2021

8 am PST = 11 am EST = 4 pm London

  • Isabella Feierberg (AZ)
  • Samantha Jeschonek (CDD)

DataFAIRy Bioassay Annotation

Invited short talk at the Cambridge Cheminformatics meeting


2 June 2021

8 am PST = 11 am EST = 4 pm London


  • Matt Segall, CEO, Optibrium
  • Samar Mahmoud, Senior Scientist, Optibrium
  • Fabio Broccatelli, Senior Scientist, Genentech

Optimizing Kinase Profiling Programs with Deep Learning

Join Genentech and Optibrium for this discussion of Alchemite™, a novel deep learning approach, and its application to optimizing kinase profiling programs. Using Alchemite reduces the number of kinase assays required to accurately predict the full kinase selectivity profile, effectively accelerating experimental programs.

The team will demonstrate the method’s performance on a data set of approximately 650 kinases and 10,000 compounds, on which it significantly outperforms state-of-the-art quantitative structure-activity relationship (QSAR) approaches, including multi-target deep learning. Furthermore, we will discuss Alchemite’s unique ability to provide reliable estimates of prediction uncertainty, which enable the selection of the most informative kinase assays and of the compounds to test.


30 June 2021

8 am PST = 11 am EST = 4 pm London


Post-webinar Q&A

  • Victor Dillard, Commercial Operations Director, Owkin
  • Hugo Ceulemans, Scientific Director Discovery Data Science, Johnson & Johnson
  • Dr. Guillaume Bataillon, oncologist at Institut Curie, a partner of the HealthChain project

Building the future of collaborative research with federated learning

Federated learning is a new machine-learning paradigm in which multiple partners can collaborate on complex research questions without centralising or sharing data outside of their organizations. This ‘collaborative machine learning’ approach enables data science teams to work on larger and more diverse datasets that were previously inaccessible, boosting the predictive power of machine learning algorithms and enhancing AI capabilities. By overcoming privacy and confidentiality concerns, companies can build partnerships and consortia and retain their competitive edge. For example, the MELLODDY consortium pioneers federated learning-based drug discovery across 10 pharma companies, which benefit from the collective insights of the world’s largest cheminformatics data network while each participant retains full confidentiality and governance over their molecular libraries.

Federated learning in healthcare can also facilitate knowledge transfer between medical researchers and data scientists, bridging the gap between AI and clinical care. The HealthChain project is a successful demonstration that an algorithm can be trained on siloed histology images, distributed across different hospitals, to predict treatment responses in breast cancer. Together with clinical, research and technology partners we demonstrated improved robustness and performance of the technology over locally trained algorithms. With the platform deployed and used reliably in a production environment, the stage is set for further collaborative research projects and eventually clinical applications in cancer, heart failure and other therapeutic areas.



28 July 2021

8 am PST = 11 am EST = 4 pm London


  • Loganathan Kumarasamy, Head of Scientific Informatics, Validation and Compliance services, North America, Zifo R&D
  • Pat Baird, Regulatory Head of Global Software Standards, Philips
  • Nathan A. Carrington, Ph.D., Head of Digital Health and Innovation, Global Regulatory Policy and Intelligence, Roche Diagnostics

Challenges in the regulation of AI Software as a Medical Device

Software as a medical device (SaMD) that leverages artificial intelligence (AI) has the opportunity to reshape healthcare. It also raises unique challenges for developers and regulators. As healthcare advances and digital solutions leveraging AI become more prevalent, it is important that medical device regulatory frameworks also advance to match the speed of innovation. The panel will review key terms related to AI SaMD and describe the unique regulatory challenges associated with devices that leverage AI. Additionally, the panel will explore novel approaches to the regulation of AI SaMD currently under consideration by international regulatory authorities.


8 September 2021

8 am PST = 11 am EST = 4 pm London


Andreas Bender, Reader for Molecular Informatics at Cambridge University, and Director Digital Life Sciences at Nuvisan/ICB

Artificial Intelligence in Drug Discovery – What is Realistic, What are Illusions?

Although artificial intelligence (AI) has had a profound impact on areas such as image recognition, comparable advances in drug discovery are rare. We will discuss the stages of drug discovery in which improvements in the time taken, success rate or affordability will have the most profound overall impact on bringing new drugs to market. Changes in clinical success rates will have the most profound impact on improving success in drug discovery; in other words, the quality of decisions regarding which compound to take forward (and how to conduct clinical trials) is more important than speed or cost. Although current advances in AI focus on how to make a given compound, the question of which compound to make, using clinical efficacy and safety-related end points, has received significantly less attention. As a consequence, current proxy measures and available data cannot fully utilize the potential of AI in drug discovery, in particular when it comes to drug efficacy and safety in vivo. Thus, addressing the questions of which data to generate and which end points to model will be key to improving clinically relevant decision-making in the future.


4 November 2021

8 am PST = 11 am EST = 4 pm London


Jacob Aptekar, MD, PhD, Senior Director of Product Management, Acorn AI, a part of Dassault Systèmes

How Synthetic Data Is Unlocking a Decade's Worth of Clinical Trial Data to Power a New Era of Drug Development

Historic clinical trial data (HCT) is emerging as an important source of evidence across clinical development. Data from past trials is often superior to real-world data from EMR records etc. as it is more structured, complete, 100% traceable and contains the typical endpoints and covariates captured in a clinical trial. Regulators have lately been supportive of the use of HCT data with both the FDA and EMA approving hybrid trials: phase 3 trials where patients from the control arm have been replaced by synthetic patients from past trials. This talk will explore methodologies and use cases for Synthetic Patients - 'digital twins' of real patients that replicate their behavior to a very high degree. Synthetic Patients enable easy sharing of patient-level data without risk of subject-level or sponsor disclosure while allowing data scientists to mine deep insights on patient characteristics and behavior.


23 February 2022

8 am PST = 11 am EST = 4 pm London


Martin-Immanuel Bittner, MD, DPhil, FRSA, the Chief Executive Officer and co-founder of Arctoris

Combining Robotics and Machine Learning for Accelerated Drug Discovery

Artificial intelligence has an increasing impact on drug discovery and development, offering opportunities to identify novel targets as well as hit- and lead-like compounds in accelerated timeframes. However, the success of any AI/ML model depends on the quality of the input data and on the speed with which in silico predictions can be validated in vitro. The talk will cover laboratory automation and robotics and the benefits they offer, in terms of quality and speed of data generation, when combined with AI/ML-powered drug discovery approaches. It will also cover some of the general trends in the industry and highlight successfully implemented case studies that show how the combination of robotics and AI/ML leads to accelerated project timelines and superior research outputs.


12 May 2022

8 am PST = 11 am EST = 4 pm London

Karl Leswing, Machine Learning Tech Lead, Schrödinger

AI/ML Webinar: AI Tools for Drug Design - AutoDesigner, a De Novo Design Algorithm

The lead optimization stage of a drug discovery program generally involves the design, synthesis, and assaying of hundreds to thousands of compounds. The design phase is usually carried out via traditional medicinal chemistry approaches and/or structure-based drug design (SBDD) when suitable structural information is available. Two of the major limitations of this approach are (1) difficulty in rapidly designing potent molecules that adhere to myriad project criteria, or the multiparameter optimization (MPO) problem, and (2) the relatively small number of molecules explored compared to the vast size of chemical space. To address these limitations we have developed AutoDesigner, a de novo design algorithm.

  • Prashant Natarajan, H2O.ai
  • Peter Henstock, Pfizer
  • TBA - pharma
  • TBA - tech vendor

Valuation of AI Technology Investments

Idea

Planning

QIAGEN; TBD

Precision Medicine in Pharma

Based on the TCGA curated data 


Mika Lindvall, Novartis Institutes for BioMedical Research, Emeryville, California 94608

Email: mika.lindvall@outlook.com

RE: the paper: An Artificial Intelligence Approach to Proactively Inspire Drug Discovery with Recommendations

Idea

TBD

TBD

Exploring Chemical Information with NLP/ML etc.

Idea

TBD

TBD

Multi Parameter Optimisation and use of Machine Learning

State of the Art in AI with working examples 


AI and Chemistry synthesis and reaction planning

  • What are the challenges going forward?
  • Moderator: An experienced med chemist from industry or academia