Competency Questions in Target Discovery

Reference papers: https://www.sciencedirect.com/science/article/pii/S1359644613001542 and https://www.nature.com/articles/s41573-020-0087-3 list many possible competency questions. We need to refine the list and gather 20 to 100 questions (no more, ideally fewer) that can be answered with knowledge graphs and LLM RAG.

Sub-Team: Lee Harland, John Wise, Bruce Press

Link to the working Google doc: https://docs.google.com/document/d/17-fwEYe1BKiGzZ4rzV7oKEJ-pWGrbN9WV1r-2INwTRI/edit?usp=sharing

Link to the spreadsheet with questions: Pistoia-LLM_Questions_v0.01.xlsx

Link to the document with “ground truth” examples of business questions with answers, derived from the Open Targets database and developed courtesy of ZS Consulting: 180624_OT_Questions.docx. We can use this small collection of questions for testing the future target discovery LLM system.

2024.02.12 sub-team call:

  • Recording: Zoom recording (passcode: =8T7nf=e)

  • This sub-team is done.

  • Questions are defined. Some can be readily answered with the Open Targets KG. Good-enough questions are acceptable; not everything needs to be answerable right away.

  • Let us now take one or more of the easy-to-address (“green”) questions and feed them to the KG, as a POC of technical and procedural feasibility.

    • No need to invest in harder questions now, no need to expand the Open Targets KG

    • Success criterion #1: compare RAG answer to the opinion from a human scientist who is an expert on the topic of the question

    • Success criterion #2: compare RAG answer (with a plain language question asked of LLM) to the KG-derived answer produced by an expert data scientist

  • Strategy: at the next large team call, ask the team to validate the approach of encapsulating Open Targets KG questions as modules for a RAG system, one module with a tuned prompt per question. But suppose we succeed in the POC: then what? We need a better vision of success, one that is exciting and ambitious.

    • Creating many open-source APIs to public data sources is not exciting. Perhaps define a standard for such APIs?

    • The need for an API standard is one lesson learned

    • This is valuable for KG software vendors: ease of “wiring in” additional data, including proprietary data sources

    • What is the volatility of the data sets we eventually want to use? Rapidly developing data sources may need continuous updates to accommodate ongoing changes in their structure. So we need a data standard for rapidly evolving data sources, covering incremental data additions (the easier case) as well as completely new data dimensions and variables (the harder case). Could a data source describe itself to a RAG query system, so that data updates are automated?

    • Possible new risk: will LLMs be confused by similar data types coming from a multitude of sources?

    • A provider/vendor/expert ecosystem is needed, not just OpenAI ChatGPT with Medline in it
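To make the “module” idea from the call concrete, a minimal sketch of one easy (“green”) question module might pair a tuned prompt template with a fixed GraphQL query against the public Open Targets Platform API. The endpoint URL and query fields below follow the published Open Targets GraphQL schema, but the module dictionary, prompt text, and function names are illustrative assumptions, not an agreed design:

```python
import json
import urllib.request

OPEN_TARGETS_GRAPHQL = "https://api.platform.opentargets.org/api/v4/graphql"

# One hypothetical module per competency question: a prompt template for
# the LLM and a parameterised KG query that produces the grounding data.
ASSOCIATED_DISEASES_MODULE = {
    "question": "Which diseases are associated with target {symbol}?",
    "prompt": (
        "Using only the JSON evidence below, list the diseases associated "
        "with the target and summarise the evidence.\n{evidence}"
    ),
    "query": """
        query associatedDiseases($ensemblId: String!) {
          target(ensemblId: $ensemblId) {
            approvedSymbol
            associatedDiseases {
              rows { disease { name } score }
            }
          }
        }
    """,
}

def build_request(module: dict, ensembl_id: str) -> urllib.request.Request:
    """Package the module's GraphQL query as an HTTP POST request."""
    payload = json.dumps(
        {"query": module["query"], "variables": {"ensemblId": ensembl_id}}
    ).encode("utf-8")
    return urllib.request.Request(
        OPEN_TARGETS_GRAPHQL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
```

The response JSON would be pasted into the module's prompt as evidence, which is one plausible reading of “tuned prompts for a RAG system, one for each of the questions.”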

 

Notes by Lee:

Sorry, I wrote this summary during the meeting and never sent it out; in case it's still helpful, Vladimir:

 

  • The core principle we were working from is that the initial aim of the Pistoia project is to create a suite of “modules” that can be plugged into LLM chatbots to answer specific questions from public APIs/knowledge graphs. The benefits of this are (i) modules readily available for all companies to simply plug into their LLMs and (ii) emerging standards, approaches and learnings around doing this.

    • !! We should validate this is indeed the case with the steering group!

 

  • The subteam was tasked with generating Target ID/Validation questions that could be mapped to data/queries solvable by the Open Targets knowledge graph

  • So we specifically limited ourselves to fairly generic, high-level questions (not therapy-area specific and not detailed) AND to the kinds of data available in Open Targets

 

  • The Hyve have mapped which questions can be answered by the OTKG and how easily that could be done

  • Our recommendation is to pick just one or two questions that already have all the data needed and demonstrate that the approach generally works [i.e. create LLM modules]

  • “Validation” should be that “a bench scientist, asking a natural language question through a Chat LLM should obtain the same data as a data-scientist querying the knowledge graph correctly”

  • If this works (and it should), we should then look at next steps, and resourcing, for any combination of:

    • Building more modules from existing questions

    • Expanding questions beyond the generic to something deeper

    • Generating more/new data to support the two options above
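The “validation” criterion above, that a bench scientist asking in natural language should obtain the same data as a data scientist querying the knowledge graph, could be sketched as a simple data-equivalence check. This is an illustrative assumption about how the comparison might work (function names are hypothetical), not a defined acceptance test:

```python
def normalise(entities):
    """Reduce entity names to a case- and whitespace-insensitive set."""
    return {e.strip().lower() for e in entities}

def answers_match(llm_entities, kg_entities):
    """True when the chat-derived answer and the direct KG query
    surfaced the same set of entities (e.g. disease names)."""
    return normalise(llm_entities) == normalise(kg_entities)
```

A stricter version might also compare association scores or evidence counts, but set equality on the returned entities is a reasonable first bar for the POC.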

 

 

Of course, if this is not what the potential funders think is worthwhile, we should stop until we establish the next step.