ITRT (IT Research Trends)

Improving the Accuracy of Text-to-SQL Tools Based on Large Language Models for Real-World Relational Databases

Research field: Databases

Paper keywords: #challenging #complex #databases #tools #sql

Conference: International Conference on Database and Expert Systems Applications

Abstract

Real-world relational databases (RW-RDBs) have large, complex schemas that are often expressed in terms alien to end users. This scenario is challenging for LLM-based text-to-SQL tools, that is, tools that translate Natural Language (NL) sentences into SQL queries using a Large Language Model (LLM). Indeed, their accuracy on RW-RDBs is considerably lower than that reported for well-known synthetic benchmarks. This paper introduces a technique to improve the accuracy of LLM-based text-to-SQL tools on RW-RDBs using Retrieval-Augmented Generation. The technique consists of two steps. Using the RW-RDB schema, the first step generates a synthetic dataset E of pairs, each consisting of an NL sentence and its corresponding SQL translation. The core contribution of the paper is an algorithm that implements this first step. Given an input NL sentence, the second step retrieves pairs from E based on the similarity between the input sentence and the NL sentence of each pair, and includes the retrieved pairs in the prompt passed to the LLM to improve accuracy. To argue in favor of the proposed technique, the paper includes experiments with an RW-RDB that is in production at an energy company, combined with a well-known text-to-SQL prompt strategy. It repeats the experiments with Mondial, an openly available database with a large schema. These experiments constitute a second contribution of the paper.
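
The second step described in the abstract (retrieve similar pairs from E, then place them in the prompt) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which similarity measure or prompt template the paper uses, so the sketch substitutes a plain bag-of-words cosine similarity and a generic few-shot layout. The function names (`retrieve`, `build_prompt`), the sample pairs in `E`, and the table and column names are hypothetical and are not taken from the paper or from the Mondial schema.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity (an assumed stand-in for the paper's measure)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, dataset, k=2):
    """Return the k (NL sentence, SQL) pairs whose NL sentence is most similar to the question."""
    ranked = sorted(dataset, key=lambda pair: cosine(question, pair[0]), reverse=True)
    return ranked[:k]


def build_prompt(question, examples):
    """Assemble a few-shot prompt from the retrieved pairs (hypothetical template)."""
    lines = ["Translate the question into SQL.", ""]
    for nl, sql in examples:
        lines += [f"Question: {nl}", f"SQL: {sql}", ""]
    lines += [f"Question: {question}", "SQL:"]
    return "\n".join(lines)


# Hypothetical synthetic dataset E; in the paper, E is generated from the database schema.
E = [
    ("List the names of all countries", "SELECT name FROM country"),
    ("How many cities are in each country",
     "SELECT country, COUNT(*) FROM city GROUP BY country"),
    ("Show the length of every river", "SELECT name, length FROM river"),
]

question = "How many cities does each country have"
examples = retrieve(question, E, k=2)
prompt = build_prompt(question, examples)
print(prompt)
```

In practice the token-overlap similarity would likely be replaced by an embedding-based one, and the resulting prompt would be sent to the LLM of choice; both are left out to keep the sketch self-contained.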

📄 Paper information

Publication year	2024
Citations	0
Country of publication	Brazil
Site	Springer

Gustavo M. C. Coelho

Eduardo R. S. Nascimento

Yenier T. Izquierdo

Grettel M. García

Lucas Feijó

Melissa Lemos

Robinson L. S. Garcia

Aiko R. de Oliveira

João P. Pinheiro

Marco A. Casanova
