...
2024.02.07 Recording (Passcode: L58@v7Dg) Slides | Slides from the talk by Sebastian Lobentanzer
2024.03.20 Recording (Passcode: LZ!jZT4z) Slides | Architecture diagram in Draw.io | Architecture diagram PNG file
Lessons Learned
The highest-risk item is generating the structured query (Cypher or SPARQL) from a plain-English request. Some publications estimate a success rate of about 48% on the first attempt.
A practically useful system requires filtering or secondary mining of the output, in addition to natural-language narration.
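One common way to work around a low first-attempt success rate is to validate each generated query and retry with feedback. The sketch below is illustrative only and is not the design discussed in the recordings. The names `generate_with_retries`, `looks_like_read_query`, and `success_within` are hypothetical, the `generate` callable stands in for whatever LLM client is used, and the regular-expression check is a crude placeholder for asking the database to parse the query. The arithmetic helper assumes attempts are independent, which is optimistic when retries reuse the same model and prompt.

```python
import re
from typing import Callable, Optional

# Hypothetical, deliberately crude check for a read-only Cypher query.
# A real system would let the database parse or EXPLAIN the query instead.
_READ_QUERY = re.compile(r"\s*MATCH\b.*\bRETURN\b", re.IGNORECASE | re.DOTALL)


def looks_like_read_query(query: str) -> bool:
    """Return True if the text starts with MATCH and later contains RETURN."""
    return _READ_QUERY.match(query) is not None


def generate_with_retries(
    question: str,
    generate: Callable[[str, str], str],
    validate: Callable[[str], bool] = looks_like_read_query,
    max_attempts: int = 3,
) -> Optional[str]:
    """Call generate(question, feedback) until validate accepts the result.

    Returns the first accepted query, or None if every attempt is rejected.
    """
    feedback = ""  # empty on the first attempt
    for _ in range(max_attempts):
        candidate = generate(question, feedback)
        if validate(candidate):
            return candidate
        # Pass the rejected query back so the generator can try to correct it.
        feedback = f"Previous attempt was rejected: {candidate!r}"
    return None


def success_within(attempts: int, p_first: float = 0.48) -> float:
    """Chance of at least one success in `attempts` tries.

    ASSUMPTION: attempts are independent and each succeeds with p_first.
    """
    return 1.0 - (1.0 - p_first) ** attempts
```

Under the independence assumption, three attempts at 48% each would succeed about 86% of the time (1 - 0.52^3). Real retries are correlated, so treat that figure as an upper bound rather than a prediction.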
References
https://www.sciencedirect.com/science/article/pii/S1359644613001542
Open LLM Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
Chatbot Arena: https://chat.lmsys.org/?arena
Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning
Knowledge-Consistent Dialogue Generation with Language Models and Knowledge Graphs
BioChatter Benchmark Results: https://biochatter.org/benchmark-results/#biochatter-query-generation
MTEB Benchmark (embeddings): https://huggingface.co/spaces/mteb/leaderboard
LoRA Land and LoRAX: https://predibase.com/lora-land
A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases: https://arxiv.org/abs/2311.07509. Summary: GPT-4 queries over a knowledge graph are much more accurate than GPT-4 queries over a SQL database.
https://towardsdatascience.com/evaluating-llms-in-cypher-statement-generation-c570884089b3