Semantic Parsing with Syntax- and Table-Aware SQL Generation

Publication: Article, Preprint, Conference object
Date: 01 Jan 2018
Embargo end date: 01 Jan 2018
Publisher: Association for Computational Linguistics (ACL)
Journal: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Authors: Yibo Sun; Duyu Tang; Nan Duan; Jianshu Ji; Guihong Cao; Xiaocheng Feng; Bing Qin; +2 Authors

doi: 10.18653/v1/p18-1034, 10.48550/arxiv.1804.08338

arXiv: 1804.08338

Abstract

We present a generative model that maps natural language questions to SQL queries. Existing neural network based approaches typically generate a SQL query word by word; however, a large portion of the generated results are incorrect or not executable because of mismatches between question words and table contents. Our approach addresses this problem by taking into account the structure of the table and the syntax of the SQL language. The quality of the generated SQL query is significantly improved through (1) learning to replicate content from column names, cells, or SQL keywords; and (2) improving the generation of the WHERE clause by leveraging the column-cell relation. Experiments are conducted on WikiSQL, a recently released dataset with the largest number of question-SQL pairs. Our approach significantly improves the state-of-the-art execution accuracy from 69.0% to 74.4%.
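
The two ideas in the abstract can be illustrated with a small sketch. It is not the model from the paper: the table, the function names, and the hard filtering rule are all assumptions made for illustration. It shows output tokens drawn from three sources (SQL keywords, column names, cells) and a WHERE value restricted to cells of the column already chosen.

```python
# Illustrative sketch only: the table, the names, and the hard filtering rule
# are assumptions, not the model described in the paper.
SQL_KEYWORDS = ["SELECT", "FROM", "WHERE", "AND", "COUNT", "MAX", "MIN", "=", ">", "<"]

# Hypothetical table: column name -> cell values in that column.
TABLE = {
    "Player": ["Alice", "Bob"],
    "Team": ["Red", "Blue"],
}


def candidates(channel, table, where_column=None):
    """Return the tokens that one copy channel may emit at the current step."""
    if channel == "keyword":
        return list(SQL_KEYWORDS)
    if channel == "column":
        return list(table)
    if channel == "cell":
        if where_column is None:
            # No column chosen yet: any cell in the table is a candidate.
            return [cell for cells in table.values() for cell in cells]
        # Column-cell relation: only cells of the chosen WHERE column remain.
        return list(table[where_column])
    raise ValueError(f"unknown channel: {channel}")


def is_consistent(where_column, cell, table):
    """Check that a WHERE value actually occurs in the chosen column."""
    return cell in table.get(where_column, [])
```

In this sketch, `candidates("cell", TABLE, "Team")` leaves only the cells of `Team`, so a condition such as `WHERE Team = Alice` cannot be produced. A learned model would more plausibly weight candidates than filter them outright; the hard filter is used here only to keep the example short.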

Related Organizations

Microsoft Research Asia (China), China (People's Republic of)
Microsoft (United States), United States
Harbin Institute of Technology, China (People's Republic of)

Keywords

FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)

Related research

WikiSQL software on GitHub (IsRelatedTo)

Impact by BIP!

selected citations: 23. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.