Notes from February 15th small team call:
Recording: https://pistoiaalliance-org.zoom.us/rec/share/kwJaJdaq_HSBP9ZrVL1DPWME3ufPxeyl3I2CBkr-A9cWwSgounfMg8NlaGjrGkry.HbzA8n0Oxo88oXQi
Passcode: @6UEvs7^
Transcript: https://pistoiaalliance-org.zoom.us/rec/share/1Ihp3towQE8NXeHp4-EANWwHeDhNHCIjaU7RVSv-uZRd0XWL1YY3nrGHssIEOwR2.d4qyWVgOutxa68Bt
Passcode: @6UEvs7^
Warning: BioCypher may not be W3C compliant. This needs discussion in the large team before adoption, or we should consider alternatives. So far this is the most important question.
This team cannot make progress until we make the decision about BioCypher
Focus on smaller, cheaper models first? Pick a handful of models at various size points and look up their performance on general benchmarks
What is the task → that dictates the choice of the benchmarks
Verify that BioChatter has benchmarks for writing Cypher queries
How important is each benchmark? Perhaps create a linear model that combines multiple scores into a single score
Helena: This benchmark answers the question “what are the best embeddings” across a variety of tasks: https://huggingface.co/spaces/mteb/leaderboard
Convert this call into a weekly call at the same time on Thursdays for the next six weeks
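A minimal sketch of the idea raised above of combining multiple benchmark scores into a single score with a linear model. The benchmark names, weights, model names, and score values are all hypothetical placeholders for illustration, not results or decisions from the call. The weights express how important each benchmark is and would need to be agreed by the team.

```python
from typing import Dict


def combined_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of benchmark scores (a simple linear model).

    Weights are normalized by their sum, so they do not need to add up to 1.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for benchmarks: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Hypothetical importance weights per benchmark (assumption, not from the call).
weights = {
    "cypher_generation": 0.5,
    "general_reasoning": 0.3,
    "embedding_retrieval": 0.2,
}

# Hypothetical per-model scores on a 0..1 scale (placeholder values only).
models = {
    "model_a": {"cypher_generation": 0.8, "general_reasoning": 0.6, "embedding_retrieval": 0.5},
    "model_b": {"cypher_generation": 0.6, "general_reasoning": 0.9, "embedding_retrieval": 0.7},
}

# Rank models from best to worst combined score.
ranking = sorted(models, key=lambda m: combined_score(models[m], weights), reverse=True)
```

Scores should be put on a common scale before combining them, since benchmarks that report values in different ranges would otherwise dominate or vanish regardless of their weight.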