Meeting minutes
December 2019
Present: Anne, Boris, Konrad, Vincent, Richard.
Summary:
ACTION - Vincent & Dave to provide further guidelines on the key parameters used to define PK based on table below and the challenges encountered during the analysis.
ACTION - Vincent & Dave to provide some initial views of how this data would be used in drug discovery decision-making. Refer to slides from proposal put forward by Boris and Richard.
First draft of discussion on the 3 minimum parameters used to define PK, based on Anne's report and tabulation of extracted data:
PARAMETER | DESCRIPTION | Data assessment GUIDELINES for TIER 1 | DATA VALUE RANGES (units) | EXAMPLES |
---|---|---|---|---|
Clearance (Cl) | | | | |
Volume of Distribution (V) | | | | |
Half-life (T½) | | | | |
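As a hedged illustration of why the three parameters are best reported together: under a one-compartment model with linear elimination, half-life follows from clearance and volume as T½ = ln(2) x V / Cl. This model is an assumption made here for simplicity; mAb PK is often described with two compartments, where the terminal half-life does not follow this simple relation. The function names, tolerance and example values below are hypothetical and are not taken from Anne's report.

```python
import math


def half_life_days(clearance_l_per_day: float, volume_l: float) -> float:
    """Half-life implied by Cl and V under a one-compartment linear model."""
    if clearance_l_per_day <= 0 or volume_l <= 0:
        raise ValueError("clearance and volume must be positive")
    return math.log(2) * volume_l / clearance_l_per_day


def is_consistent(clearance_l_per_day: float, volume_l: float,
                  reported_half_life_days: float, rel_tol: float = 0.25) -> bool:
    """Check whether a reported half-life lies within rel_tol of the implied one.

    rel_tol is an arbitrary illustrative threshold, not an agreed guideline.
    """
    implied = half_life_days(clearance_l_per_day, volume_l)
    return abs(implied - reported_half_life_days) <= rel_tol * implied


# Illustrative values only (Cl in L/day, V in L).
implied = half_life_days(0.2, 6.0)
```

A check like this could flag records where the three reported values do not fit together, which may point to a different model, different units, or a different definition of V or T½ in the source paper.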
Below is a list of some more general challenges encountered during the analysis which need to be addressed by establishing a set of consistent rules, at least for the purpose of this exercise. We need to consider whether these rules should be the basis for industry-wide standardisation.
- How to deal with rate constants
- How to deal with scaling factors being employed
- How to deal with data adjustments based on weight
- Lack of consistency around determination of standard error and confidence intervals around parameters
- Lack of consistency around naming conventions for some parameters
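A minimal sketch of what a set of consistent rules might look like for three of the challenges above (naming conventions, rate constants and weight-based adjustments). The alias map, canonical names and function names are hypothetical placeholders; the actual rules would need to be agreed by the group. The rate-constant conversion assumes first-order elimination.

```python
import math

# Hypothetical alias map: label as reported in a paper -> canonical name.
ALIASES = {
    "cl": "clearance",
    "clearance": "clearance",
    "cl/f": "apparent_clearance",
    "v": "volume",
    "vd": "volume",
    "vss": "volume_steady_state",
    "t1/2": "half_life",
    "t½": "half_life",
    "half-life": "half_life",
    "kel": "elimination_rate_constant",
    "ke": "elimination_rate_constant",
}


def canonical_name(label: str) -> str:
    """Map a reported label to a canonical name, or mark it as unmapped."""
    key = label.strip().lower().replace(" ", "")
    return ALIASES.get(key, "unmapped:" + label.strip())


def half_life_from_rate_constant(k_per_day: float) -> float:
    """Convert a first-order elimination rate constant (1/day) to half-life (days)."""
    if k_per_day <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k_per_day


def absolute_clearance(cl_per_kg: float, body_weight_kg: float) -> float:
    """Convert weight-normalised clearance (L/day/kg) to absolute clearance (L/day)."""
    return cl_per_kg * body_weight_kg
```

Keeping unmapped labels visible, rather than silently dropping them, would let a curator decide case by case whether a new alias is justified.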
Additional considerations:
- Need to identify a number of papers across various therapy areas containing data which meets the tabulated criteria (above), otherwise Tier 1 becomes very narrow.
Data extraction will be done manually in the first instance, as is currently done for ChEMBL, with attempts to automate the process coming later.
- Anne reminded us of the importance of the context in which this 'quality' PK data will be used.
November 2019
Present: Anne, Anna, Boris, Friedrich, Konrad, Vincent, Richard.
Summary:
ACTION - Anne to share analysis: here
ACTION - Dave/Vincent to provide feedback on analysis by Anne, highlighting the relevant parameters, additional context and why this is important to define 'quality' (December)
ACTION - Dave/Vincent/Boris/Konrad/Anne to meet and define the next steps and iterations required to define and build the Minimum Viable Product - see below.
ACTION - Experts (Dave/Vincent/Yves/Terry/Friedrich) to provide insights into points 1-4 in slides, focusing on which human PK data is used in early discovery, how it is used and relevant sources.
Boris and Richard proposal slides
Key discussion points:
- General agreement with way forward described in the slides.
- Analysis from Anne on papers highlighted by Dave shows that PK parameters are described in many different ways. Extracting the data is not difficult; the difficulty is reporting it in a standard way without people having to re-read the papers. At this stage the sense is that we should not aim to standardise the terminology but rather report the selected data as published. Naming conventions are a challenge and perhaps not for us to solve (more community input needed here), but we will need to find a way to work with them.
- This initial (prototyping) stage is key in understanding how much data is out there and how much fits with what is in ChEMBL. We need to understand complexity and amount of data before we decide to put in ChEMBL.
- Choosing the PK parameters is fairly easy; the quality and analysis are the challenging part. A value for e.g. t-half on its own will not tell you anything; understanding the overall experimental setup and context, inc. Cl and VoD, is key. Getting the right PK in the right place is the biggest difficulty. This is where the proposed paper will discuss different cases and how the data can be used (pitfalls to look out for).
- We aim to deliver a prototype as defined in slide 4 but given the available resource we should initially focus on a Minimum Viable Product defined as having a series of quality parameters and associated data in one place which permit 'others' to contribute similar data.
- Best approach is to start small and think about including biophysical/developability properties as a second pass. However, if not too complex, we should explore inclusion of IC50 and sequence (from sequence, researchers can derive 3D structural models, overall charge and charge in particular areas of the molecule).
- Applicability: creating Abs with a long half-life (associated with the right Cl) is a key motivation; other interests/motivations still need to be identified.
June 2019
Present: Dave, Boris, Andrew, Yves, Richard.
Summary:
ACTION - Dave and Boris to upload a couple of further papers which have been identified.
ACTION - Dave to upload proposed 'search terms'.
ACTION - Friedrich to update sources of Biophysical data to cross-reference
ACTION - Andrew to let us know when it would be feasible to have resource to establish a working prototype (PoC) and hence inform the timelines for running the workshop
Update on search term capabilities
>Dave has shared additional publications with high quality PK (likely Tier 1) and a starting set of search terms (need to be added to the wiki space)
>Andrew and Anna are planning data extraction from papers based on the above but this has to be manageable and needs to be resourced by existing funds. In this context some Q's still need to be addressed:
- What is the feasibility of extracting this kind of data vs SM data?
- Will ChEMBL data model match onto this kind of data?
- Looking ahead: what is the broader scope (see below)? Does more automated identification of literature fit into the workflows which have already been developed for ChEMBL? What funding model will support this?
>We need a consensus on what is a meaningful, manageable and affordable pilot study. For this to be useful to as many people as possible, without sacrificing quality and relevance, there should be 2 components: PK data described above and more biophysical data as described in the Jain paper. These should be included in the list on the wiki.
>A first pass pilot study (PoC) needs to successfully identify relevant data based on search terms and will not necessarily give insight into confounding factors, issues etc. which would help the users/searchers of the information.
>Way forward proposal from Boris (see slides below)
- Data on a particular molecule would be qualified by its completeness and by how much we trust it: completeness + quality tier -> overall score
- First we need to define completeness profiles for each molecule based on each publication (automation involved)
- Completeness involves 3 groups of variables being used to display the parameters (popPK) we want to capture in DB: independent, measured, estimated. It also includes other information (covariates, formulation, etc.)
- Second we need to define the quality of this data and this is based on the 3 quality Tiers we have defined. This step involves curation and the relevant expertise.
- Initial assessment carried out for 2 papers (Sirukumab and Benralizumab) based on current Tier definitions.
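The framework above can be sketched roughly as follows. This is an illustration under stated assumptions, not the scoring Boris proposed: the expected fields per variable group, the tier weights, and the choice to multiply completeness by a tier weight are all hypothetical, since the minutes only state that completeness and quality tier combine into an overall score.

```python
from dataclasses import dataclass, field

# Hypothetical expected fields for each of the 3 groups of variables.
EXPECTED = {
    "independent": {"dose", "route", "sampling_times"},
    "measured": {"concentration"},
    "estimated": {"clearance", "volume", "half_life"},
}

# Hypothetical weights for quality Tiers 1-3 (Tier 1 = highest quality).
TIER_WEIGHT = {1: 1.0, 2: 0.6, 3: 0.3}


@dataclass
class PublicationRecord:
    """Data reported for one molecule in one publication."""
    molecule: str
    quality_tier: int
    reported: dict = field(default_factory=dict)  # group name -> set of field names

    def completeness(self) -> float:
        """Fraction of expected fields that the publication reports."""
        total = sum(len(fields) for fields in EXPECTED.values())
        found = sum(
            len(EXPECTED[group] & set(self.reported.get(group, ())))
            for group in EXPECTED
        )
        return found / total

    def overall_score(self) -> float:
        """Combine completeness and quality tier (multiplication is an assumption)."""
        return self.completeness() * TIER_WEIGHT[self.quality_tier]


# Illustrative record only; not one of the papers assessed by the group.
rec = PublicationRecord(
    molecule="example-mAb",
    quality_tier=2,
    reported={"independent": {"dose", "route"}, "estimated": {"clearance", "half_life"}},
)
```

In this sketch the completeness step could be automated, while assigning `quality_tier` remains a curation step that needs the relevant expertise.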
>Taking Mepolizumab as an example, we can see how this framework can be used to define a PoC: an initial pass of the literature may yield one value for popPK analysis. As we expand the DB we may have 10 values for popPK, with each value having an estimate of the quality. As the DB incorporates more data we can incorporate orthogonal searches taking into consideration e.g. biophysical data.
Workshop
>Proposed timelines for pre-work and workshop below may be optimistic and need confirmation by Andrew based on resource availability
'August-September': conduct the preliminary work on: refining Boris' framework and parameters, defining search terms, defining the PoC to achieve during the workshop
'October-November': hold workshop in 2 parts - Part 1 uses pre-work to do the searching and delivers a workable prototype (PoC); Part 2 defines the route(s) forward (future plans) inc. funding options, publication, further work, etc.
Funding opportunity
>Slide 2 in the below set contains information on the CZI funding opportunity
- Our application would fall under the category of domain-specific software for analyzing, visualizing, and otherwise working with the specific data types that arise in biomedical science (e.g., genomic sequences, microscopy images, molecular structures).
- We would likely aim for the 1 February 2020 deadline
- Funds ($50-250k) are for 1 year and cover a range of uses including salaries, workshops, operational needs, etc.
April 2019
Present: Dave, Boris, Andrew, Yves, Richard.
Summary:
March 2019
Present: Vincent, Dave, Boris, Richard.
Summary:
Proposed way forward and actions
>Half to one day face-to-face workshop attended by the above focused on defining an unbiased approach to identify relevant PK data publications resulting in a curated set of publications and data. Proposed location is Cambridge area (UK) hosted by GSK or AstraZeneca.
>The aims of the workshop would be to: (i) understand the challenge of extracting such data; (ii) develop a reproducible process to do so which is agnostic of which tool(s) are used; (iii) lay the foundation for a peer-reviewed publication or white paper, and eventually placing the information in a data repository.
>During the workshop we would work with an expert (e.g. from Elsevier) in automated methods for searching and extraction of information from text.
>This proposal keeps the focus on PK within a small group and would be a feasibility assessment for doing the same thing for other important properties of interest to us (immunogenicity, physicochemical, biophysical) based on a similar framework.
ACTION: Richard to identify contact within Pistoia Alliance with relevant expertise in NLP-type search methods who can help us with searching and extracting relevant information from various sources.
ACTION: Boris and Vincent to identify additional publications describing molecules which are considered ‘must have’ outputs from a structured search and place them in the appropriate quality Tier.
ACTION: Boris to share view of current sources of PK data, inc. presentation from PK workshop 2018.
Discussion on proposed outputs
>The proposed output from the workshop is a peer-reviewed publication or white paper containing the following:
- An ‘antibody clinical PK quality framework’ under which to capture data from key sources, inc. scientific papers, FDA drug approval packages, etc.
- Guidelines on how to classify data from these sources.
- A highlight and demonstration of common pitfalls and confounding factors.
- Some practical examples of how the framework is used and adds value in real antibody therapeutic discovery scenarios.
>The peer-reviewed publication or white paper would provide an output that is measurable and can be used as a platform to arrive at a valuable database faster by:
- Providing a structure for the initial database
- Identifying the high quality data to start populating the database
- Guiding the future sourcing and analysis of data, esp. in the context of drug discovery projects
- Providing a snapshot of the current state and relevant references in the field
>The peer-reviewed publication or white paper will provide value to specialists in the field of PK/PD analysis and prediction but will also be targeted towards non-specialists to ensure that data is not used at face value i.e. PK values are dependent on many parameters and how these are measured.
Discussion on required inputs
>Information requirements to put together peer-reviewed publication or white paper:
- List of molecules with associated data from various sources. Molecules should be limited to intact mAbs (no domain Ab fragments) without FcRn function disabled or enhanced e.g. PD-1 and PD-L1 groups.
- List and definition of key parameters which define PK (and PD) and their units. Focus on good quality data for linear portion of PK such that influence of target (TMDD) is removed (this will facilitate comparison between molecules and understanding dose predictions).
- List of main data sources which include: literature, FDA approval packages and existing databases, in which the data is not well curated and is difficult to extract.
- List of key words and perceived challenges to conduct a structured search which will extract the right PK data and associated parameters (one of the outputs could be to highlight the difficulty in finding this information).
- Further development of existing ‘antibody clinical PK quality framework’ based on 3 Tiers before it can be applied to the output from the structured search.
- Awareness of any non-proprietary existing guidelines on PK classification, e.g. FDA
February 2019
Present: Vincent, Terry, Boris, Andrew, Yves, Friedrich, Richard.
Summary:
Options for an open access "biotherapeutics ChEMBL" resource - Andrew and Friedrich:
>Presentation by Andrew and Friedrich at the next EMBL-EBI Industry Programme meeting on Monday 4th March.
>Purpose of the engagement is to understand and gather support for an open access and free to use ChEMBL containing biologics data. This model is in contrast to a closed, consortium-owned database.
>Proposal is to build on the existing data in ChEMBL on marketed Ab drugs (mostly) with core information on the biologic i.e. sequence, mass, glycosylation pattern, etc. and add primary assays data in the first instance. User community can shape assay descriptions at a later date.
>Key questions to answer through the engagement include: would such a resource be of value to the wider community? How to create and evolve such a database in the biotherapeutic space? How to fund it? This cannot be done as an add-on to what is already being done for ChEMBL.
>Feedback from group discussion:
- Consider the importance of data quality, which is what this group has been concerned with from the start. As such, as well as using existing ChEMBL data quality indicators, input from this group and perhaps other industry experts will be key.
- Need an indication of funds required and strategy to be employed based on total number of approved Abs (60-70) and those currently in clinical trials (200), e.g. use of an initial pilot study based on a number of Tier 1 Abs and associated core parameters identified by this group vs. data for approx. 400 molecules already in ChEMBL.
- Need to map out what is already available in the database, the minimum information required for each Ab and what is required to achieve our needs. Important not to lose sight of our interest in early PK trends.
- Important not to duplicate other efforts, e.g. IEDB (https://www.iedb.org/)
Commitment to initiative - All:
>Overall commitment to this initiative with a feeling that we need to start operationalising and committing resources.
>Two main approaches being proposed which are not mutually exclusive:
- Focus on depositing assay data based on existing framework provided by ChEMBL and build quality and curation at later stage.
- Focus on establishing framework based on data quality. Ensure that we have the right type of data and a good understanding of elements needed in database to meet our drivers. This approach is not about starting with a high number of compounds in the database but more about the right quality of information. This type of repository should allow us to understand which data is required (i.e. where are the gaps) and what the quality needs to be when developing a new product.
>To build further momentum, members of this group should encourage other members of the Industry Programme who have expertise in this area to join the initiative.
Further discussion on PK quality Tiers:
<Deferred to next meeting>
Next steps:
>Discuss feedback from Industry Programme community engagement at next meeting. Current thinking is that based on gaining support from Industry Programme community, ChEMBL expansion will be driven through a small group of experts, perhaps through one-to-one conversations, to understand data types and construct a more detailed proposal which can be presented to the wider community. The idea is that this is further driven through an expanded group working in the biotherapeutics space.
>Dedicate next meeting to further developing the PK quality tiers and classification of selected molecules based on these.
ACTION: Andrew and Friedrich to share Industry Programme engagement feedback by email ahead of the next meeting.
ACTION: All to let Richard know if there are other colleagues who will join the next meeting.
ACTION: All to look at proposed publications in each Tier and comment on proposed Tier definitions (https://pistoiaalliance.atlassian.net/wiki/spaces/ANT/pages/982482964/Initial+PK+quality+Tiers+and+publications)
ACTION: Richard to work with all to set up next meeting agenda and slot, aiming for 1.5h.
December 2018
Present: Dave, Terry, Boris, Yves, Richard.
Summary:
Data and availability:
>Estimated #molecules with associated data in each Tier: 10-20 (T1), +10-20 (T2), ~100 (T3). Currently ~80 mAbs approved by FDA.
>Data will be limited to actual numbers of molecules in clinic
>Non-public data will be important to elevate a particular molecule from a lower to a higher tier. For some organisations (GSK) all data should be published 18m post-CSR
>Diversity of data (Abs vs particular targets) important
>Comparability of data important, even if only for e.g. T1
Drivers:
>Dave - main drive is to have all clinical PK data in one place and move away from comparing the same parameter across different molecules irrespective of the way this has been measured
>Terry - main drive is to collect enough data to derive meaningful PK predictions; risk is there is not enough public data available so companies must be open to sharing non-public data
>Yves - main drive is to derive biophysical data from suitably classified mAbs (open to sharing this data once generated) and understand potential correlations with clinical PK
Challenges:
>Low numbers of molecules with corresponding PK data in the higher tiers to draw meaningful comparisons between them and correlate to earlier (biophysical) data
ACTION: all to look at proposed publications in each Tier and comment on proposed Tier definitions (https://pistoiaalliance.atlassian.net/wiki/spaces/ANT/pages/982482964/Initial+PK+quality+Tiers+and+publications)
ACTION: all to have a first attempt at classifying specific molecules into each Tier
ACTION: all to add name and further populate wiki table re. Drivers, value proposition and contribution (https://pistoiaalliance.atlassian.net/wiki/spaces/ANT/pages/919732437/Biologics+database+collaboration)
ACTION: all to share feedback on AbVance II proposal with Richard by email (https://pistoiaalliance.atlassian.net/wiki/download/attachments/92307462/AbVance%20II%20partner%20proposal%20October%202018%20FINAL.pdf?version=1&modificationDate=1541770378626&cacheVersion=1&api=v2)
October 2018
Present: Vincent, Dave, Anna, Terry, Boris, Andrew, Yves, Friedrich, Richard.
Summary: summarised in Biologics DB Collaboration landing page