Lessons Learned

  • The highest-risk item is generating the structured query (Cypher or SPARQL) from a plain English request. Some publications estimate a success rate of about 48% on the first attempt.

  • The structure of the queried database matters. LLMs produce meaningful structured queries more easily for databases with a flat, simple structure.

  • A practically useful system requires filtering or secondary mining of the output in addition to natural-language narration.

  • It is extremely important to implement a reliable named entity recognition system. The same acronym can refer to completely different entities, which can be differentiated either from the context (hard) or by asking clarifying questions. The system must also map synonyms to a common term. Without these measures, naïve queries in a RAG environment will fail.
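
The entity-resolution point above can be illustrated with a minimal Python sketch. It is not the implementation used in this project: the lookup tables, the cue words, and the names `SYNONYMS`, `ACRONYMS`, `Resolution`, and `resolve_entity` are all hypothetical, and a real system would draw on a curated terminology source and a proper NER model. The sketch only shows the control flow: map synonyms to one canonical term, try to disambiguate an acronym from context, and fall back to a clarifying question when context is not enough.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative tables only (assumption): a real system would load these
# from a curated terminology source instead of hard-coding them.
SYNONYMS = {
    "acetylsalicylic acid": "aspirin",
    "vitamin c": "ascorbic acid",
}

# Each ambiguous acronym maps candidate entities to context cue words.
ACRONYMS = {
    "ra": {
        "rheumatoid arthritis": {"disease", "joint", "patients", "autoimmune"},
        "retinoic acid": {"compound", "receptor", "vitamin", "signaling"},
    },
}


@dataclass
class Resolution:
    canonical: Optional[str]
    candidates: List[str] = field(default_factory=list)

    @property
    def needs_clarification(self) -> bool:
        # True when context could not single out one candidate.
        return self.canonical is None and len(self.candidates) > 1


def resolve_entity(mention: str, context: str = "") -> Resolution:
    """Map a mention to a canonical name, or return candidates to ask about."""
    key = mention.strip().lower()
    if key in SYNONYMS:
        return Resolution(canonical=SYNONYMS[key])
    if key in ACRONYMS:
        words = set(context.lower().split())
        # Keep candidates whose cue words appear in the surrounding text.
        matches = [name for name, cues in ACRONYMS[key].items() if cues & words]
        if len(matches) == 1:
            return Resolution(canonical=matches[0])
        # Zero or several matches: the caller should ask the user.
        return Resolution(canonical=None, candidates=list(ACRONYMS[key]))
    # Unknown mention: pass it through in normalized form.
    return Resolution(canonical=key)
```

In this sketch, `resolve_entity("RA", "drugs for joint pain in patients")` resolves from context, while `resolve_entity("RA")` alone returns both candidates with `needs_clarification` set, which is the point at which the chat front end would ask the user which entity is meant.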

References

  1. https://www.sciencedirect.com/science/article/pii/S1359644613001542

  2. https://www.nature.com/articles/s41573-020-0087-3

  3. https://www.epam.com/about/newsroom/press-releases/2023/epam-launches-dial-a-unified-generative-ai-orchestration-platform

  4. https://epam-rail.com/open-source

  5. Open LLM Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

  6. Chatbot Arena: https://chat.lmsys.org/?arena

  7. Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning

    https://arxiv.org/abs/2310.01061

  8. Knowledge-Consistent Dialogue Generation with Language Models and Knowledge Graphs

    https://openreview.net/forum?id=WhWlYzUTJfP

  9. BioChatter Benchmark Results: https://biochatter.org/benchmark-results/#biochatter-query-generation

  10. MTEB Benchmark (embeddings): https://huggingface.co/spaces/mteb/leaderboard

  11. LoRA Land and LoRAX: https://predibase.com/lora-land

  12. A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases. Summary: queries over a KG with GPT-4 are much more accurate than queries over a SQL database with GPT-4. https://arxiv.org/abs/2311.07509

  13. https://towardsdatascience.com/evaluating-llms-in-cypher-statement-generation-c570884089b3

  14. https://medium.com/neo4j/enhancing-the-accuracy-of-rag-applications-with-knowledge-graphs-ad5e2ffab663

  15. linkedlifedata.com

  16. Kazu - Biomedical NLP Framework: https://github.com/AstraZeneca/KAZU

  17. https://github.com/f/awesome-chatgpt-prompts/tree/main