...
Recording: https://pistoiaalliance-org.zoom.us/rec/share/oxGWla7rTcksvYfxMt0NrepIGltJxS6aYo-UUeN5dYQ21F8rNr8IW9LLNQCO-T-Y.OyMU-x9CAyXjcieU Passcode: uwwr&H5A
Transcript: https://pistoiaalliance-org.zoom.us/rec/share/mQT2t0Z0mbcIr8Yq0y1sqsoeo_nByZoTWPw8EwubZDxihARk5mgT8D-Gk_1IYG0a.AjRl0-VIFJyQKNWw Passcode: uwwr&H5A
Prompt size may be important, and we increased its weight in the comparison table
Preferred architecture would allow for swapping of LLMs
Censorship is most likely already reflected in the performance scores, which reduces the weight of censorship as a separate risk
Given that not all scores are available, we may end up having to do our own evaluation
Consider hosting platforms for open-source models (e.g. Amazon Bedrock) instead of renting servers on AWS
Preference for hosted models with pay-per-token
Add this dimension to the spreadsheet - ACTION for Jon Stevens
Review rankings - ACTION for Brian and Etzard
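Illustration of the "swappable LLM" preference noted above: a minimal sketch, assuming the application code depends only on a small interface and each model sits behind its own adapter. All names here (LLMClient, EchoLLM, answer, the model labels) are hypothetical and not taken from the team's code; EchoLLM is a stand-in that does not call any real model.

```python
from dataclasses import dataclass
from typing import Protocol


class LLMClient(Protocol):
    """Hypothetical interface: anything that turns a prompt into text."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoLLM:
    """Stand-in backend for illustration only.

    A real adapter would call a hosted, pay-per-token model here instead.
    """

    name: str

    def complete(self, prompt: str) -> str:
        # Echo the prompt back, tagged with the backend name.
        return f"[{self.name}] {prompt}"


def answer(question: str, llm: LLMClient) -> str:
    """Application code sees only the interface, never a concrete model."""
    return llm.complete(f"Question: {question}")


# Swapping models is a one-line change at the call site (labels are made up).
backends = {"model-a": EchoLLM("model-a"), "model-b": EchoLLM("model-b")}
reply_a = answer("Which companies filed in 2023?", backends["model-a"])
reply_b = answer("Which companies filed in 2023?", backends["model-b"])
```

With this shape, comparing two candidate models on the same questions only requires adding one more adapter to the dictionary.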
Notes from the April 11th small team call:
This call was not recorded, but slides with extensive notes and a file with code captured from a Jupyter notebook (VM) are available:
Files: 2024.04.11 LLMs meeting.pptx, Chatting with the SEC Knowledge Graph.txt
Vladimir shared observations on LLM behavior in generating Cypher queries and in answering questions in English based on structured input, all corroborated by Jon and Brian (and by Rob via email earlier)
The highest risk step is Cypher code generation
Agreed to delegate the LLM testing to the BioCypher team, and meanwhile pick two LLMs for POC (GPT4 and Mistral)
Officially close this work stream: we have gathered all the information we could and now need to learn by doing, that is, by building a POC
The new team will be composed of the members of the Thursday (LLM choice) and Friday (Open Targets and architecture) sub-teams, and will meet on Fridays
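Illustration for the Cypher generation risk noted above: a minimal sketch of one possible guard for a POC, assuming LLM-generated Cypher should be read-only and is checked before it is sent to the database. This guard was not part of the call; the clause list and the function name is_read_only are assumptions, and the check is deliberately conservative and not exhaustive (for example, it does not inspect CALL procedures or LOAD CSV).

```python
import re

# Assumed list of Cypher keywords that modify the graph or schema.
WRITE_CLAUSES = ("CREATE", "MERGE", "DELETE", "DETACH", "SET", "REMOVE", "DROP")
_WRITE_RE = re.compile(r"\b(" + "|".join(WRITE_CLAUSES) + r")\b", re.IGNORECASE)

# Matches single- or double-quoted string literals (no escape handling).
_STRING_RE = re.compile(r"'[^']*'|\"[^\"]*\"")


def is_read_only(cypher: str) -> bool:
    """Return True if no write keyword appears outside string literals."""
    # Drop quoted values first so a literal like 'SET' is not flagged.
    without_strings = _STRING_RE.sub("", cypher)
    return _WRITE_RE.search(without_strings) is None


generated = "MATCH (c:Company) RETURN c.name LIMIT 5"
if is_read_only(generated):
    print("query accepted")
else:
    print("query rejected")
```

A guard like this only addresses unintended writes; whether a generated query is correct still has to be established by testing against known questions and answers.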