Research field: Databases
Conference: International Conference on Conceptual Modeling
This paper investigates how model size affects the ability of a Generative AI Language Model (GLM) to support the text-to-SQL task for databases with large schemas, as is typical of real-world applications. The paper first introduces a text-to-SQL framework that combines a prompt strategy with a Retrieval-Augmented Generation (RAG) technique and leaves the GLM and the database as points of flexibility. It then describes a benchmark based on an open-source database whose schema is much larger than those of most databases in familiar text-to-SQL benchmarks. Next, it reports experiments that assess the performance of the framework when instantiated with the benchmark database and GLMs of different sizes. The paper concludes with recommendations for selecting a GLM size appropriate to a given text-to-SQL scenario, characterized by, among other things, the difficulty of the expected natural-language (NL) questions and the data privacy requirements.
| Publication year | 2024 |
|---|---|
| Citations | 0 |
| Country of publication | Brazil |
| Site | Springer |