
A Pilot Study for Chinese SQL Semantic Parsing

The study explores Chinese semantic parsing for translating natural language to SQL. It compares character- and word-based encoders and finds that word-based parsing is subject to segmentation errors, while cross-lingual word embeddings are useful for text-to-SQL.

Year
2019
Venue
a-pilot-study-for-chinese-sql-semantic-1
Authors
3
Hosting
Abstract only

Attribution

Abstract & full text
arxiv.org/abs/1909.13293v2
TL;DR
Semantic Scholar

Abstract

The task of semantic parsing is highly useful for dialogue and question answering systems. Many datasets have been proposed to map natural language text into SQL, among which the recent Spider dataset provides cross-domain samples with multiple tables and complex queries. We build a Spider dataset for Chinese, which is currently a low-resource language in this task area. Interesting research questions arise from the uniqueness of the language, which requires word segmentation, and also from the fact that SQL keywords and columns of DB tables are typically written in English. We compare character- and word-based encoders for a semantic parser, as well as different embedding schemes. Results show that a word-based semantic parser is subject to segmentation errors and that cross-lingual word embeddings are useful for text-to-SQL.
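
To make the character- versus word-based distinction concrete, here is a minimal sketch of the two input granularities. It is not the paper's pipeline: the example sentences, the toy vocabulary, and the helper names `char_tokens` and `word_tokens` are all assumptions made for illustration, and the segmenter is a plain greedy forward maximum matcher rather than whatever segmenter the authors used.

```python
def char_tokens(question: str) -> list[str]:
    """Character-level input: each non-space character is one token."""
    return [ch for ch in question if not ch.isspace()]


def word_tokens(question: str, vocab: set[str], max_len: int = 4) -> list[str]:
    """Word-level input via greedy forward maximum matching.

    At each position, take the longest substring (up to max_len characters)
    found in the toy vocabulary; fall back to a single character otherwise.
    """
    text = "".join(question.split())
    tokens: list[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical question ("What are the names of all singers?"), not taken
# from the dataset.
question = "所有歌手的名字是什么"
vocab = {"所有", "歌手", "名字", "什么"}
print(char_tokens(question))
print(word_tokens(question, vocab))

# A classic ambiguous string: greedy matching picks "研究生" (graduate
# student) first and mis-segments "研究 / 生命 / 起源".
ambiguous_vocab = {"研究", "研究生", "生命", "起源"}
print(word_tokens("研究生命起源", ambiguous_vocab))
```

The last call shows the kind of segmentation error a word-based encoder can inherit from its segmenter, whereas the character-level view avoids segmentation altogether.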
