...
Recording: https://pistoiaalliance-org.zoom.us/rec/share/3vT1H30cX_zFgEUtJ828Spj158rN2oqOssBNDc6hC1mti2TQ5G-uxdoBZkN7I8GQ.xtvwbqf---G-qnut?startTime=1706799805000
Passcode: vW*uB7^2
Transcript: https://pistoiaalliance-org.zoom.us/rec/share/dcrTHAezwaqAQtBPJMUSgGAb5LUS8lTiqJquis3yeyo6U6SgTvPVk6dDZ0K6oNIU.7s_B5JDf98QIHXz1
Passcode: vW*uB7^2
The main action item is to add information to the list of candidate LLMs: https://docs.google.com/spreadsheets/d/1muOE2zweNl9LvW1yIsUcJRy3gGTDe2C_/edit?usp=sharing&ouid=111803761008578493760&rtpof=true&sd=true
Notes from February 15th small team call:
Warning: BioCypher may not be W3C compliant. The large team needs to discuss this before adoption, or consider alternatives. So far this is the most important question.
This team cannot make progress until the decision about BioCypher is made.
Focus on smaller, cheaper models first? Pick a handful of models at various size points and look up their performance on general benchmarks.
What is the task → that dictates the choice of the benchmarks
Verify that BioChatter has benchmarks for writing Cypher queries
How important is each benchmark? Perhaps create a linear model that combines multiple scores into a single score
Helena: This benchmark answers the question “what are the best embeddings” across a variety of tasks: https://huggingface.co/spaces/mteb/leaderboard
Convert this call into a weekly call at the same time on Thursdays for the next six weeks
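
Sketch for the "combine multiple scores into a single score" idea above: one possible reading is a weighted average of per-benchmark scores, with the weights expressing how important each benchmark is for the task. This is only an illustration, not something agreed on the call. The model names, benchmark names, weights, and score values below are hypothetical placeholders, not real results, and the sketch assumes all scores are already on a common 0 to 1 scale.

```python
from typing import Dict


def combined_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of benchmark scores; weights are normalised to sum to 1."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Every benchmark named in `weights` must have a score for the model.
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Hypothetical weights: the task-specific benchmark counts three times as much
# as the general one. These values are assumptions for illustration only.
weights = {"general": 1.0, "cypher_generation": 3.0}

# Hypothetical placeholder scores, NOT real benchmark results.
candidates = {
    "model_a": {"general": 0.80, "cypher_generation": 0.40},
    "model_b": {"general": 0.60, "cypher_generation": 0.70},
}

# Rank candidate models from best to worst combined score.
ranking = sorted(
    candidates,
    key=lambda model: combined_score(candidates[model], weights),
    reverse=True,
)
print(ranking)
```

The weights are where the "how important is each benchmark" question gets answered, so they would need to be chosen once the task is fixed. Scores reported on different scales would have to be normalised before being combined this way.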