Powered by OpenAIRE graph
CAAI Transactions on Intelligence Technology
Article . 2025 . Peer-reviewed
License: CC BY NC
Data sources: Crossref
https://dx.doi.org/10.48550/ar...
Article . 2024
License: CC BY NC ND
Data sources: Datacite
DBLP
Article
Data sources: DBLP

TCSR‐SQL: Towards Table Content‐Aware Text‐to‐SQL With Self‐Retrieval

Authors: Wenbo Xu; Liang Yan; Chuanyi Liu; Peiyi Han; Haifeng Zhu; Yong Xu; Yingwei Liang; +1 Authors

Abstract

Large language model‐based (LLM‐based) text‐to‐SQL methods have made substantial progress in generating SQL queries for real‐world applications. However, existing methods perform poorly on table content‐aware questions, in which the question contains ambiguous data content keywords and refers to column names that do not exist in the database schema. To solve this problem, we propose TCSR‐SQL, a novel approach towards table content‐aware text‐to‐SQL with self‐retrieval. It leverages the LLM's in‐context learning capability to extract data content keywords from the question and infer the possibly related database schema, which are used to generate a Seed SQL that fuzz searches the database. The search results are then used to confirm the encoding knowledge with the designed encoding knowledge table, including the column names and exact stored content values used in the SQL. The encoding knowledge is then used to obtain the final Precise SQL through multiple rounds of a generation‐execution‐revision process. To validate our approach, we introduce a table content‐aware, question‐related benchmark dataset containing 2115 question‐SQL pairs. Comprehensive experiments on this benchmark show that TCSR‐SQL improves execution accuracy by at least 27.8% compared to other state‐of‐the‐art methods.
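
The retrieval idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the table, its contents, the helper name `fuzzy_lookup`, and the candidate column list are all hypothetical, and the LLM steps (keyword extraction, schema inference, revision) are replaced by hard-coded values. It only shows how a fuzzy search over candidate columns can recover the exact stored value, which is then used in a precise query.

```python
import sqlite3


def fuzzy_lookup(conn, candidates, keyword):
    """Return (table, column, stored_value) triples whose value contains keyword.

    Hypothetical stand-in for a seed-SQL fuzzy search step.
    """
    hits = []
    for table, column in candidates:
        # Identifiers cannot be bound as parameters, so they are quoted inline;
        # only the search pattern is passed as a bound value.
        sql = f'SELECT DISTINCT "{column}" FROM "{table}" WHERE "{column}" LIKE ?'
        for (value,) in conn.execute(sql, (f"%{keyword}%",)):
            hits.append((table, column, value))
    return hits


# Toy database (assumed data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, bank_name TEXT, branch TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "ICBC Shenzhen Branch", "Nanshan"), (2, "Bank of China", "Futian")],
)

# Assumed output of the keyword-extraction step: one keyword, two guessed columns.
candidates = [("accounts", "bank_name"), ("accounts", "branch")]
hits = fuzzy_lookup(conn, candidates, "Shenzhen")

# Use the confirmed column and exact stored value in the final query.
table, column, value = hits[0]
rows = conn.execute(
    f'SELECT id FROM "{table}" WHERE "{column}" = ?', (value,)
).fetchall()
```

In this sketch the question keyword "Shenzhen" never appears verbatim as a stored value, so an equality filter on the keyword would return nothing; the fuzzy search first resolves it to the stored string before the final query is written.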

Keywords

FOS: Computer and information sciences, Databases, Databases (cs.DB)

  • BIP! impact indicators
    selected citations: 0
    popularity: Average
    influence: Average
    impulse: Average