Enhancing Interaction Graph of Data Schema and Syntactic Structure with Pre-trained Language Model for Text-to-SQL


Research field: Databases



Conference: CCF Conference on Big Data


Abstract

Text-to-SQL generation is an important area of natural language processing. It helps non-specialists interact with databases in natural language, which simplifies the query process, improves efficiency, and enhances the user experience. Existing work on Text-to-SQL mainly relies on large-scale pre-trained models to improve generation performance. Despite this progress, Text-to-SQL still has shortcomings, such as lexical discrepancies in natural language, inaccurate schema linking, and inadequate domain generalization. In this paper, we present an SQL generation framework, SGIS, which enhances the interaction graph of data schema and syntactic structure with a pre-trained language model for Text-to-SQL, aimed at improving the domain generalization of models and their ability to handle cross-domain questions. Specifically, we first introduce a schema linking method based on a pre-trained model, which extracts the relational structure between the input NL question and the database schema to address schema linking between NL questions and database schemas. On this basis, syntactic dependency information is extracted from the input NL question and integrated into the structured graph data to address the incompleteness of the embedded relational features. Meanwhile, to prevent overfitting of the edge embeddings during training, we use a type encoding method that helps the model distinguish the types of dependency relations when embedding edges, thereby reducing unnecessary entanglement. Extensive experiments show that SGIS outperforms all compared baselines on both the Spider and Spider-SYN datasets under standard settings.


Author Profile
Wenbin Zhao

School of Information Science and Technology Shijiazhuang Tiedao University Shijiazhuang Hebei China

Andorra
Author Profile
Long Zhao

School of Information Science and Technology Shijiazhuang Tiedao University Shijiazhuang Hebei China

Andorra
Author Profile
Feng Wu

Hebei Science and Technology Information Processing Laboratory Hebei Institute of Science and Technology Information Shijiazhuang Hebei China

Andorra

Paper information

Publication year: 2025
Citations: 0
Country of publication: Andorra
Site: Springer