暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

TiDB DELETE Statement,懂你不容易

2021-07-20
275

作者:RogueJin

原文来源:https://tidb.net/blog/223ac6a0

【是否原创】是\ 【首发渠道】TiDB 社区

【目录】\ 闲话少叙\ 语法定义\ Yacc实现\ 验证Yacc\ 修改Yacc\ Conflicts问题\ 【正文】

闲话少叙

书接上文,自从团队对外公布了TiDB for PostgreSQL项目之后,团队成员正在紧锣密鼓的为下一个里程碑做努力,目前我们主要的工作集中在以下几个方面:

  1. 完善PgSQL的通信协议
  2. 实现PgSQL关键字和语法
  3. 修复单元测试
  4. 数据库Chaos测试平台

我们先回顾一下往期内容,再进入本文主题:

TiDB for PostgreSQL—牛刀小试 - 知乎 (zhihu.com)

TiDB for PostgreSQL 学习指南 - 知乎 (zhihu.com)

TiDB Parser模块的简单解读与改造方法 - 知乎 (zhihu.com)

SQL只是CRUD? - 知乎 (zhihu.com)

除此之外,我们的项目也有幸入选PingCap DevCon 2021大会论坛话题,同时还进入了第二期TiDB Hacking Camp,欢迎大家多多关注我们,更欢迎大家加入我们。

Hacking Camp 第二期开启,六大开源项目重磅来袭! - 知乎 (zhihu.com)

语法定义

书归正文,这期文章主要介绍Data Manipulation Language中的Delete Statement,也就是CRUD中的最后一个字母。让我们先看看Mysql,PostgreSQL,SQL-2003 Standard都是怎么定义Delete的语法。

MySQL

Single-Table Syntax

DELETE [LOW_PRIORITY] [QUICK] [IGNORE] FROM tbl_name [[AS] tbl_alias] [PARTITION (partition_name [, partition_name] ...)] [WHERE where_condition] [ORDER BY ...] [LIMIT row_count]

From https://dev.mysql.com/doc/refman/8.0/en/delete.html

Multiple-Table Syntax

``` DELETE [LOW_PRIORITY] [QUICK] [IGNORE] tbl_name[.] [, tbl_name[.]] ... FROM table_references [WHERE where_condition]

DELETE [LOW_PRIORITY] [QUICK] [IGNORE] FROM tbl_name[.] [, tbl_name[.]] ... USING table_references [WHERE where_condition] ```

From https://dev.mysql.com/doc/refman/8.0/en/delete.html

PostgreSQL

[ WITH [ RECURSIVE ] with_query [, ...] ] DELETE FROM [ ONLY ] table_name [ * ] [ [ AS ] alias ] [ USING from_item [, ...] ] [ WHERE condition | WHERE CURRENT OF cursor_name ] [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]

This command conforms to the SQL standard, except that the USING and RETURNING clauses are PostgreSQL extensions, as is the ability to use WITH with DELETE.

From https://www.postgresql.org/docs/current/sql-delete.html

SQL-2003 BNF

``` ::= DELETE FROM WHERE CURRENT OF ::= DELETE FROM [ WHERE ]

::=

| ONLY

::= ```

从这三个数据库的定义是不是觉得只有SQL-2003跟常用的Delete From Table删表跑路的语法差不多,其他两个数据库加入了一些自定义的语法功能,尤其是MySQL,长得像个远房亲戚。这里不对各个自定义语法功能做解释,想了解的读者可以点击对应的官方链接查阅。

Yacc实现

了解语法定义是远远不够的,我们还需要看看parser.y对Delete Statement的MySQL实现,从Delete Statement开始,分成了DeleteWithoutUsingStmt和DeleteWithUsingStmt两个分支,每个分支中又定义了对应的语法,可以看出每一个语法定义都基本符合MySQL的定义,整个过程就是把SQL解析成AST,下一步就是拿现有的parser做个demo,看看是否能正常解析我们的Delete SQL。\ DCParser/parser.y at main · DigitalChinaOpenSource/DCParser (github.com)

DELETE | PingCAP Docs\ /*******************************************************************\ *\ * Delete Statement\ *\ *******************************************************************/\ DeleteWithoutUsingStmt:\ “DELETE” TableOptimizerHints PriorityOpt QuickOptional IgnoreOptional “FROM” TableName PartitionNameListOpt TableAsNameOpt IndexHintListOpt WhereClauseOptional OrderByOptional LimitClause\ {\ // Single Table\ tn := $7.(*ast.TableName)\ tn.IndexHints = $10.([]*ast.IndexHint)\ tn.PartitionNames = $8.([]model.CIStr)\ join := \&ast.Join{Left: \&ast.TableSource{Source: tn, AsName: $9.(model.CIStr)}, Right: nil}\ x := \&ast.DeleteStmt{\ TableRefs: \&ast.TableRefsClause{TableRefs: join},\ Priority: $3.(mysql.PriorityEnum),\ Quick: $4.(bool),\ IgnoreErr: $5.(bool),\ }\ if $2 != nil {\ x.TableHints = $2.([]*ast.TableOptimizerHint)\ }\ if $11 != nil {\ x.Where = $11.(ast.ExprNode)\ }\ if $12 != nil {\ x.Order = $12.(*ast.OrderByClause)\ }\ if $13 != nil {\ x.Limit = $13.(*ast.Limit)\ }\ $$ = x\ }\ \| “DELETE” TableOptimizerHints PriorityOpt QuickOptional IgnoreOptional TableAliasRefList “FROM” TableRefs WhereClauseOptional\ {\ // Multiple Table\ x := \&ast.DeleteStmt{\ Priority: $3.(mysql.PriorityEnum),\ Quick: $4.(bool),\ IgnoreErr: $5.(bool),\ IsMultiTable: true,\ BeforeFrom: true,\ Tables: \&ast.DeleteTableList{Tables: $6.([]*ast.TableName)},\ TableRefs: \&ast.TableRefsClause{TableRefs: $8.(*ast.Join)},\ }\ if $2 != nil {\ x.TableHints = $2.([]*ast.TableOptimizerHint)\ }\ if $9 != nil {\ x.Where = $9.(ast.ExprNode)\ }\ $$ = x\ }\ DeleteWithUsingStmt:\ “DELETE” TableOptimizerHints PriorityOpt QuickOptional IgnoreOptional “FROM” TableAliasRefList “USING” TableRefs WhereClauseOptional\ {\ // Multiple Table\ x := \&ast.DeleteStmt{\ Priority: $3.(mysql.PriorityEnum),\ Quick: $4.(bool),\ IgnoreErr: $5.(bool),\ IsMultiTable: true,\ Tables: \&ast.DeleteTableList{Tables: $7.([]*ast.TableName)},\ TableRefs: \&ast.TableRefsClause{TableRefs: $9.(*ast.Join)},\ }\ if $2 != nil {\ x.TableHints = $2.([]*ast.TableOptimizerHint)\ }\ if $10 != nil {\ x.Where = $10.(ast.ExprNode)\ }\ $$ = x\ }\ DeleteFromStmt:\ DeleteWithoutUsingStmt\ \| DeleteWithUsingStmt

验证Yacc\ 我们新建一个golang项目,如下。 这个demo解析了两个Delete SQL,第一个比较简单,第二个几乎把所有的参数都填上了,我们看看执行后的效果。

``` package main import ( "fmt" parser "github.com/DigitalChinaOpenSource/DCParser" "github.com/DigitalChinaOpenSource/DCParser/ast" _ "github.com/DigitalChinaOpenSource/DCParser/test_driver" )

func parse(sql string) (*ast.StmtNode, error) { p := parser.New() stmtNodes, _, err := p.Parse(sql, "", "") if err != nil { return nil, err } return &stmtNodes[0], nil }

func main() { astNode, err := parse("DELETE FROM somelog;") if err != nil { fmt.Printf("parse error: %v\"n", err.Error()) return } fmt.Printf("%v\"n", astNode) astNode, err = parse("DELETE /+ BKA(somelog) / LOW_PRIORITY QUICK IGNORE FROM somelog PARTITION (p1,p2) AS t FORCE INDEX FOR ORDER BY(timestamp_column) WHERE user = 'jcole' ORDER BY timestamp_column LIMIT 1;") if err != nil { fmt.Printf("parse error: %v\"n", err.Error()) return } fmt.Printf("%v\"n", astNode) } ```

项目设置断点后运行,对"DELETE FROM somelog;"的解析呈现AST结构,可以看到TableName正确获取到了,其他的参数where, order, limit,priority, ignore, quick等等都是空值。

image

对二个SQL解析时"DELETE /*+ BKA(somelog) */ LOW_PRIORITY QUICK IGNORE FROM somelog PARTITION (p1,p2) AS t FORCE INDEX FOR ORDER BY(timestamp_column) WHERE user = ‘jcole’ ORDER BY timestamp_column LIMIT 1;",真个AST就变得庞大很多,包括where里的相等操作也能正确解析,这个parser工作正常,接下来我们开始删除不需要的语法参数。

image

修改Yacc\ 我们开始删除PostgreSQL中没有的参数,如以下代码所示,对应的代码解析部分也要做相应的调整,不然会编译报错,当然中间也确实遇到了一个很麻烦的问题,如果你也遇到了类似conflict的问题,可以看看文末的trouble shooting。\ /*******************************************************************\ *\ * Delete Statement\ *\ *******************************************************************/\ DeleteWithoutUsingStmt:\ “DELETE” TableOptimizerHints PriorityOpt QuickOptional “FROM” TableName TableAsNameOpt WhereClauseOptional\ {\ // Single Table\ tn := $6.(*ast.TableName)\ join := \&ast.Join{Left: \&ast.TableSource{Source: tn, AsName: $7.(model.CIStr)}, Right: nil}\ x := \&ast.DeleteStmt{\ TableRefs: \&ast.TableRefsClause{TableRefs: join},\ Priority: $3.(mysql.PriorityEnum),\ Quick: $4.(bool),\ }\ if $2 != nil {\ x.TableHints = $2.([]*ast.TableOptimizerHint)\ }\ if $8 != nil {\ x.Where = $8.(ast.ExprNode)\ }\ $$ = x\ }\ \| “DELETE” TableOptimizerHints PriorityOpt QuickOptional TableAliasRefList “FROM” TableRefs WhereClauseOptional\ {\ // Multiple Table\ x := \&ast.DeleteStmt{\ Priority: $3.(mysql.PriorityEnum),\ Quick: $4.(bool),\ IsMultiTable: true,\ BeforeFrom: true,\ Tables: \&ast.DeleteTableList{Tables: $5.([]*ast.TableName)},\ TableRefs: \&ast.TableRefsClause{TableRefs: $7.(*ast.Join)},\ }\ if $2 != nil {\ x.TableHints = $2.([]*ast.TableOptimizerHint)\ }\ if $8 != nil {\ x.Where = $8.(ast.ExprNode)\ }\ $$ = x\ }\ DeleteWithUsingStmt:\ “DELETE” TableOptimizerHints PriorityOpt QuickOptional “FROM” TableAliasRefList “USING” TableRefs WhereClauseOptional\ {\ // Multiple Table\ x := \&ast.DeleteStmt{\ Priority: $3.(mysql.PriorityEnum),\ Quick: $4.(bool),\ IsMultiTable: true,\ Tables: \&ast.DeleteTableList{Tables: $6.([]*ast.TableName)},\ TableRefs: \&ast.TableRefsClause{TableRefs: $8.(*ast.Join)},\ }\ if $2 != nil {\ x.TableHints = $2.([]*ast.TableOptimizerHint)\ }\ if $9 != nil {\ x.Where = $9.(ast.ExprNode)\ }\ $$ = x\ }\ DeleteFromStmt:\ DeleteWithoutUsingStmt\ \| DeleteWithUsingStmt

利用项目自带的goyacc进行编译,记住,一定是用项目再带的goyacc编译,因为tidb对yyLexer接口一些改动,如果用go自带的goyacc编译会报没有实现方法的错误。好了,执行完之后就会生成一个新的parser.go文件。\ image

最后还需要把yacc修改导致的UT错误给修复了:)

image

本文删除MySQL自定义语法的目标已经接近完成,剩下的几个语法功能就留给读者自己尝试删除,关于如何加入PostgreSQL语法功能,我们会在下次分解。

Conflicts问题

当你在编译y文件时出现conflicts问题的时候,需要添加-v来查阅解析步骤。\ image

可以通过正确和错误文件对比提高排查效率,当然这次问题属于第一次修改没经验造成的,我还特地翻阅了Bison官方文档Shift/Reduce (Bison 3.7.6) (gnu.org)。最后根据Yacc的工作原理和output文件,把问题修复的,原因其实也非常简单,就是只改了一个delete分支里的语法,导致编译之后Delete AST解析路劲不一致导致的冲突。

image

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论