A Comprehensive Exploration on WikiSQL with Table-Aware Word Contextualization

We present SQLova, the first Natural-language-to-SQL (NL2SQL) model to achieve human performance on the WikiSQL dataset. We revisit and discuss diverse popular methods in the NL2SQL literature, take full advantage of BERT (Devlin et al., 2018) through an effective table contextualization method, and coherently combine them, outperforming the previous state of the art by 8.2% and 2.5% in logical form and execution accuracy, respectively. We particularly note that BERT with a seq2seq decoder leads to poor performance on the task, indicating the importance of careful design when using such large pretrained models. We also provide a comprehensive analysis of the dataset and our model, which can be helpful for designing future NL2SQL datasets and models. In particular, we show that our model's performance is near the upper bound on WikiSQL, where we observe that a large portion of the evaluation errors are due to wrong annotations, and that our model already exceeds human performance by 1.3% in execution accuracy.

KR2ML Workshop at NeurIPS 2019, 11 pages, 4 figures

Keywords

FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)

8 Research products

Relation Aware Semi-autoregressive Semantic Parsing for NL2SQL (2021; among top similar documents)
Auto-conversion from Natural Language to Structured Query Language using Neural Networks Embedded with Pre-training and Fine-tuning Mechanism (2020; among top similar documents)
Natural Language to SQL Generation for Observational Study Designs: Current Challenges and Possible Directions (Preprint) (2020; among top similar documents)
WikiSQL software on GitHub (related)
transformer software on GitHub (related)
sqlova software on GitHub (related)
SQLNet software on GitHub (related)
bert software on GitHub (related)

Impact by BIP!

selected citations: 3. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Fields of Science

natural sciences
