RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL

Jiexing Qi¹, Jingyao Tang¹, Ziwei He¹, Xiangpeng Wan², Yu Cheng³, Chenghu Zhou⁴, Xinbing Wang¹, Quanshi Zhang¹, Zhouhan Lin¹

¹ Shanghai Jiao Tong University, Shanghai, China
² NetMind.AI and ProtagoLabs, Virginia, USA
³ Microsoft Research, Redmond, Washington, USA
⁴ IGSNRR, Chinese Academy of Sciences, Beijing, China
{qi_jiexing, monstar, ziwei.he, zqs1022, xwang8}@sjtu.edu.cn
lin.zhouhan@gmail.com

Zhouhan Lin is the corresponding author. Our implementation is available at https://github.com/LUMIA-group/rasat.
Abstract

Relational structures such as schema linking and schema encoding have been validated as a key component for high-quality translation of natural language into SQL queries. However, introducing these structural relations comes at a price: they often result in a specialized model structure, which largely prohibits the use of large pretrained models in text-to-SQL. To address this problem, we propose RASAT: a Transformer seq2seq architecture augmented with relation-aware self-attention that can leverage a variety of relational structures while effectively inheriting the pretrained parameters from the T5 model. Our model can incorporate almost all types of existing relations in the literature, and in addition, we propose introducing co-reference relations for the multi-turn scenario. Experimental results on three widely used text-to-SQL datasets, covering both single-turn and multi-turn scenarios, show that RASAT achieves state-of-the-art results across all three benchmarks (75.5% EX on Spider, 52.6% IEX on SParC, and 37.4% IEX on CoSQL).
1 Introduction
Text-to-SQL is the task of translating natural language questions into SQL queries. Since it can significantly lower the barrier for non-expert users to interact with databases, it is among the semantic parsing tasks of greatest practical importance (Kamath and Das, 2018; Deng et al., 2021).
Various types of relations have been introduced for this task since Zhong et al. (2017) collected the first large-scale text-to-SQL dataset, and they have yielded significant performance gains in recent years. For example, Bogin et al. (2019b) introduced schema encoding to represent the schema structure of the database, and the resulting augmented LSTM encoder-decoder architecture was able to generalize better to unseen database schemas. Lin et al. (2020a) introduced relations between entities mentioned in the question and the matched entries in the database to utilize database content effectively. Their BERT-based encoder is followed by an LSTM-based pointer network as the decoder, which generalizes better across natural language variations and captures the corresponding schema columns more precisely. RAT-SQL (Wang et al., 2020a) introduced schema linking, which aligns mentions of entity names in the question to the corresponding schema columns or tables (a minimal sketch of such matching is given below). Their augmented Transformer encoder is coupled with a specific tree-decoder. SADGA (Cai et al., 2021) introduced the dependency structure of the natural language question and designed a graph neural network-based encoder with a tree-decoder. On the other hand, a tree-decoder that can generate grammatically correct SQL queries is usually needed to better decode the encoder output, among which Yin and Neubig (2017) is one of the most widely used.
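To make the schema-linking relation concrete, the following is a minimal, hypothetical sketch of name-based matching (it is not the exact heuristic of RAT-SQL or RASAT): question n-grams are compared against column and table names, yielding exact-match or partial-match relation labels between question tokens and schema items.

# Minimal sketch of name-based schema linking (illustrative only).
def schema_linking(question_tokens, schema_items, max_ngram=3):
    """Return {(token_index, schema_index): relation_label}."""
    relations = {}
    for size in range(max_ngram, 0, -1):          # longer n-grams take priority
        for start in range(len(question_tokens) - size + 1):
            phrase = " ".join(question_tokens[start:start + size]).lower()
            for j, name in enumerate(schema_items):
                name = name.lower()
                if phrase == name:
                    label = "exact_match"
                elif phrase in name:
                    label = "partial_match"
                else:
                    continue
                for i in range(start, start + size):
                    relations.setdefault((i, j), label)
    return relations

# e.g. links "age" to "singer.age" (partial) and "singer" to "singer" (exact).
print(schema_linking(["show", "the", "age", "of", "each", "singer"],
                     ["singer", "singer.age", "singer.name"]))

Each matched (token, schema item) pair then becomes one of the pairwise relations that the encoder can condition on.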
Although integrating various relational structures as well as using a tree-decoder have been shown to be vital for generating high-quality SQL queries and generalizing better to unseen database schemas, the development of these specifically designed model architectures deviates significantly from the general sequential form, which makes it hard to leverage large pre-trained models for this task. Existing methods either use BERT output as the input embedding of a specifically designed model (Cao et al., 2021; Choi et al., 2021; Wang et al., 2020a; Guo et al., 2019), or stack a specific decoder on top of BERT (Lin et al., 2020a).
In another thread, pretrained seq2seq models have only recently unveiled their powerful potential for this task. Recent attempts by Shaw et al. (2021) show that directly fine-tuning a T5 model (Raffel et al., 2020) on this task, without explicitly representing any relational structures, can achieve satisfying results. Moreover, PICARD (Scholak et al., 2021) presents a way to prune invalid beam search results at inference time, drastically improving the grammatical correctness of the SQL queries generated by the autoregressive decoder that comes with T5.
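As a rough picture of this kind of pruning, the snippet below sketches one beam-search step with prefix filtering. It is not PICARD's actual implementation: is_valid_sql_prefix is a hypothetical stand-in for its incremental parser, and step_log_probs / decode are assumed wrappers around the seq2seq decoder and tokenizer.

import torch

def is_valid_sql_prefix(text):
    # Toy check: only keep strings that are prefixes of, or start with, SELECT.
    # A real check would incrementally parse the partial SQL instead.
    t = text.strip().upper()
    return "SELECT".startswith(t) or t.startswith("SELECT")

def constrained_beam_step(step_log_probs, decode, beams, beam_size):
    """One beam-search step over beams = [(token_id_list, cumulative_log_prob)]."""
    candidates = []
    for tokens, score in beams:
        log_probs = step_log_probs(tokens)               # 1-D tensor over the vocabulary
        top = torch.topk(log_probs, k=2 * beam_size)     # over-generate, then filter
        for log_p, tok in zip(top.values.tolist(), top.indices.tolist()):
            new_tokens = tokens + [tok]
            if is_valid_sql_prefix(decode(new_tokens)):  # reject invalid prefixes
                candidates.append((new_tokens, score + log_p))
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:beam_size]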
In this work, different from the more common approach of fine-tuning the original pretrained model or using prompt tuning, we propose to augment the self-attention modules in the encoder, introducing new parameters to the model while still being able to leverage the pre-trained weights. We call the proposed model RASAT (Relation-Aware Self-Attention-augmented T5). Our model can incorporate almost all existing types of relations in the literature, including schema encoding, schema linking, syntactic dependency of the question, etc., into a unified relation representation. In addition, we also introduce coreference relations to our model for multi-turn text-to-SQL tasks. Experimental results show that RASAT can effectively leverage the advantages of T5. It achieves state-of-the-art performance in question execution accuracy (EX/IEX) on both multi-turn (SParC and CoSQL) and single-turn (Spider) text-to-SQL benchmarks. On SParC, RASAT surpasses all previous methods in interaction execution accuracy (IEX) and improves state-of-the-art performance from 21.6% to 52.6%, a 31% absolute improvement. On CoSQL, we improve state-of-the-art IEX performance from 8.4% to 37.4%, a 29% absolute improvement. Moreover, on Spider, we improve state-of-the-art execution accuracy from 75.1% to 75.5%, a 0.4% absolute improvement.
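For intuition, the following single-head sketch shows relation-aware self-attention in the style of Shaw et al. (2018) and RAT-SQL, where pairwise relation embeddings are added to the keys and values. It is an illustrative simplification rather than the exact RASAT layer (which is the multi-head variant grafted onto T5's pretrained attention); all names in the snippet are ours.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationAwareSelfAttention(nn.Module):
    def __init__(self, d_model, num_relations):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # The only new parameters: one key/value embedding per relation type.
        self.rel_k = nn.Embedding(num_relations, d_model)
        self.rel_v = nn.Embedding(num_relations, d_model)

    def forward(self, x, relation_ids):
        # x: (seq_len, d_model); relation_ids: (seq_len, seq_len) integer matrix
        # giving the relation type between every pair of input items.
        q, k, v = self.q(x), self.k(x), self.v(x)
        rk, rv = self.rel_k(relation_ids), self.rel_v(relation_ids)   # (L, L, d)
        # score_ij = q_i . (k_j + r_ij^K), scaled by sqrt(d)
        scores = (q @ k.T + torch.einsum("id,ijd->ij", q, rk)) / x.shape[-1] ** 0.5
        attn = F.softmax(scores, dim=-1)
        # output_i = sum_j attn_ij * (v_j + r_ij^V)
        return attn @ v + torch.einsum("ij,ijd->id", attn, rv)

In this view, only rel_k and rel_v are newly initialized; the query/key/value projections keep their original shapes, which is what allows the pretrained T5 attention weights to be inherited.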
2 Related Work
Early works usually exploit a sketch-based slot-filling method that uses different modules to predict the corresponding parts of the SQL query. These methods decompose the SQL generation task into several independent sketches and use different classifiers to predict the corresponding parts, e.g., SQLNet (Xu et al., 2017), SQLOVA (Hwang et al., 2019), X-SQL (He et al., 2019), and RYANSQL (Choi et al., 2021). However, most of these methods can only handle simple queries and fail to generate correct SQL in complex settings such as Spider.
Faced with the multi-table and complex SQL setting, using graph structures to encode various complex relationships has become a major trend in the text-to-SQL task. For example, Global-GNN (Bogin et al., 2019a) represents the complex database schema as a graph; RAT-SQL (Wang et al., 2020a) introduces schema encoding and schema linking and assigns a relation to every pair of input items; LGESQL (Cao et al., 2021) further distinguishes local and non-local relations by exploiting a line-graph-enhanced hidden module; SADGA (Cai et al., 2021) uses contextual structure and dependency structure to encode the question graph, while database schema relations are used in the schema graph; and S²SQL (Hui et al., 2022) adds syntactic dependency information to a relational graph attention network (RGAT) (Wang et al., 2020b).
For the conversational, context-dependent text-to-SQL task that includes multiple turns of interaction, such as SParC and CoSQL, the key challenge is how to take advantage of the historical interaction context. Edit-SQL (Zhang et al., 2019) edits the last turn's predicted SQL at the token level to generate the new prediction. IGSQL (Cai and Wan, 2020) uses cross-turn and intra-turn schema graph layers to model database schema items in a conversational scenario. Tree-SQL (Wang et al., 2021b) uses a tree-structured intermediate representation and assigns a probability of reusing subtrees of historical Tree-SQLs. IST-SQL (Wang et al., 2021a) proposes an interaction state tracking method to predict the SQL query. RAT-SQL-TC (Li et al., 2021) adds two auxiliary training tasks to explicitly model the semantic changes at both the turn grain and the conversation grain. R²SQL (Hui et al., 2021) and HIE-SQL (Zheng et al., 2022) introduce a dynamic schema-linking graph that incorporates the current utterance, the interaction history utterances, the database schema, and the last predicted SQL query.
Recently, Shaw et al. (2021) showed that fine-tuning a pre-trained T5-3B model could yield results competitive with the then state-of-the-art. Based on this discovery, Scholak et al. (2021) proposed to constrain the autoregressive decoder through incremental parsing at inference time, effectively filtering out grammatically incorrect sequences on the fly during beam search, which significantly improved the quality of the generated SQL.