...

  • This call was short and not recorded

  • The remaining items in the LLM comparison table are the costs for the Llama models (Brian to look up) and the performance figures on BioCypher (here we depend on Sebastian and may have to wait)

  • There is an expectation, based on team members' work experiences on other projects, that fine-tuning of open-source models may be heavily dependent on use case and may not be cost-effective

  • In that case, GPT-4 would win

Notes from the March 21st small team call:

Notes from the March 28th small team call:

Notes from the April 11th small team call:

  • This call was not recorded, but slides with extensive notes and a file with code captured from a Jupyter notebook (VM) are available:

    • 2024.04.11 LLMs meeting.pptx
    • Chatting with the SEC Knowledge Graph.txt

  • Vladimir shared observations on LLM behavior when generating Cypher queries and when answering questions in English based on structured input, all corroborated by Jon and Brian (and by Rob via email earlier)

  • The highest-risk step is Cypher code generation

  • Agreed to delegate the LLM testing to the BioCypher team and, in the meantime, pick two LLMs for the POC (GPT-4 and Mistral)

  • Officially close this work stream: we have gathered all the information we could and now need to learn more by doing, i.e., by actually prototyping a POC

  • The new team will be composed of the members of the Thursday (LLM choice) and Friday (Open Targets and architecture) sub-teams, and will meet on Fridays

  • The matter of Cypher query generation from plain-English questions is discussed here: Query generation with LLM (a minimal sketch of the pattern appears after this list)
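
For reference, the sketch below illustrates the three-step pattern behind the observations above: an LLM translates a plain-English question into a Cypher query, the query runs against Neo4j, and the LLM then phrases the returned rows as an English answer. It assumes an OpenAI GPT-4 endpoint and a local Neo4j instance; the graph schema, connection details, prompts, and example question are illustrative placeholders, not the team's actual POC code.

    # Minimal sketch: plain-English question -> Cypher -> Neo4j -> English answer.
    # Assumes the openai and neo4j Python packages and OPENAI_API_KEY in the environment.

    from neo4j import GraphDatabase
    from openai import OpenAI

    client = OpenAI()

    # Illustrative schema summary given to the model as context (not the real SEC graph schema).
    GRAPH_SCHEMA = """
    (:Company {name, ticker})-[:FILED]->(:Filing {form, date})
    (:Company)-[:HAS_OFFICER]->(:Person {name, title})
    """

    def question_to_cypher(question: str) -> str:
        # Step 1 (the highest-risk step per the notes): generate a Cypher query.
        resp = client.chat.completions.create(
            model="gpt-4",
            temperature=0,
            messages=[
                {"role": "system",
                 "content": "Translate the user's question into a single Cypher query "
                            "for this Neo4j schema. Return only the query.\n" + GRAPH_SCHEMA},
                {"role": "user", "content": question},
            ],
        )
        text = resp.choices[0].message.content.strip()
        if text.startswith("```"):  # strip a Markdown code fence if the model adds one
            text = text.strip("`").removeprefix("cypher").strip()
        return text

    def run_cypher(cypher: str) -> list[dict]:
        # Step 2: execute the generated query and collect plain-dict rows.
        with GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password")) as driver:
            with driver.session() as session:
                return [record.data() for record in session.run(cypher)]

    def answer_in_english(question: str, rows: list[dict]) -> str:
        # Step 3: have the LLM phrase the structured result as an English answer.
        resp = client.chat.completions.create(
            model="gpt-4",
            temperature=0,
            messages=[
                {"role": "system",
                 "content": "Answer the question in plain English using only these rows: " + str(rows)},
                {"role": "user", "content": question},
            ],
        )
        return resp.choices[0].message.content

    if __name__ == "__main__":
        q = "Which companies filed a 10-K in 2023?"
        print(answer_in_english(q, run_cypher(question_to_cypher(q))))

The same three-step shape holds for either of the two LLMs chosen for the POC; swapping GPT-4 for Mistral changes only the client and model name, not the flow.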