作者:RogueJin
【是否原创】是\ 【首发渠道】TiDB 社区
【目录】\ 闲话少叙\ 语法定义\ Yacc实现\ 验证Yacc\ 修改Yacc\ Conflicts问题\ 【正文】
闲话少叙
书接上文,自从团队对外公布了TiDB for PostgreSQL项目之后,团队成员正在紧锣密鼓的为下一个里程碑做努力,目前我们主要的工作集中在以下几个方面:
- 完善PgSQL的通信协议
- 实现PgSQL关键字和语法
- 修复单元测试
- 数据库Chaos测试平台
我们先回顾一下往期内容,再进入本文主题:
TiDB for PostgreSQL—牛刀小试 - 知乎 (zhihu.com)
TiDB for PostgreSQL 学习指南 - 知乎 (zhihu.com)
TiDB Parser模块的简单解读与改造方法 - 知乎 (zhihu.com)
除此之外,我们的项目也有幸入选PingCap DevCon 2021大会论坛话题,同时还进入了第二期TiDB Hacking Camp,欢迎大家多多关注我们,更欢迎大家加入我们。
Hacking Camp 第二期开启,六大开源项目重磅来袭! - 知乎 (zhihu.com)
语法定义
书归正文,这期文章主要介绍Data Manipulation Language中的Delete Statement,也就是CRUD中的最后一个字母。让我们先看看Mysql,PostgreSQL,SQL-2003 Standard都是怎么定义Delete的语法。
MySQL
Single-Table Syntax
DELETE [LOW_PRIORITY] [QUICK] [IGNORE] FROM tbl_name [[AS] tbl_alias]
[PARTITION (partition_name [, partition_name] ...)]
[WHERE where_condition]
[ORDER BY ...]
[LIMIT row_count]
From https://dev.mysql.com/doc/refman/8.0/en/delete.html
Multiple-Table Syntax
``` DELETE [LOW_PRIORITY] [QUICK] [IGNORE] tbl_name[.] [, tbl_name[.]] ... FROM table_references [WHERE where_condition]
DELETE [LOW_PRIORITY] [QUICK] [IGNORE] FROM tbl_name[.] [, tbl_name[.]] ... USING table_references [WHERE where_condition] ```
From https://dev.mysql.com/doc/refman/8.0/en/delete.html
PostgreSQL
[ WITH [ RECURSIVE ] with_query [, ...] ]
DELETE FROM [ ONLY ] table_name [ * ] [ [ AS ] alias ]
[ USING from_item [, ...] ]
[ WHERE condition | WHERE CURRENT OF cursor_name ]
[ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]
This command conforms to the SQL standard, except that the USING and RETURNING clauses are PostgreSQL extensions, as is the ability to use WITH with DELETE.
From https://www.postgresql.org/docs/current/sql-delete.html
SQL-2003 BNF
```
从这三个数据库的定义是不是觉得只有SQL-2003跟常用的Delete From Table删表跑路的语法差不多,其他两个数据库加入了一些自定义的语法功能,尤其是MySQL,长得像个远房亲戚。这里不对各个自定义语法功能做解释,想了解的读者可以点击对应的官方链接查阅。 Yacc实现 了解语法定义是远远不够的,我们还需要看看parser.y对Delete Statement的MySQL实现,从Delete Statement开始,分成了DeleteWithoutUsingStmt和DeleteWithUsingStmt两个分支,每个分支中又定义了对应的语法,可以看出每一个语法定义都基本符合MySQL的定义,整个过程就是把SQL解析成AST,下一步就是拿现有的parser做个demo,看看是否能正常解析我们的Delete SQL。\
DCParser/parser.y at main · DigitalChinaOpenSource/DCParser (github.com) DELETE | PingCAP Docs\
/*******************************************************************\
*\
* Delete Statement\
*\
*******************************************************************/\
DeleteWithoutUsingStmt:\
“DELETE” TableOptimizerHints PriorityOpt QuickOptional IgnoreOptional “FROM” TableName PartitionNameListOpt TableAsNameOpt IndexHintListOpt WhereClauseOptional OrderByOptional LimitClause\
{\
// Single Table\
tn := $7.(*ast.TableName)\
tn.IndexHints = $10.([]*ast.IndexHint)\
tn.PartitionNames = $8.([]model.CIStr)\
join := \&ast.Join{Left: \&ast.TableSource{Source: tn, AsName: $9.(model.CIStr)}, Right: nil}\
x := \&ast.DeleteStmt{\
TableRefs: \&ast.TableRefsClause{TableRefs: join},\
Priority: $3.(mysql.PriorityEnum),\
Quick: $4.(bool),\
IgnoreErr: $5.(bool),\
}\
if $2 != nil {\
x.TableHints = $2.([]*ast.TableOptimizerHint)\
}\
if $11 != nil {\
x.Where = $11.(ast.ExprNode)\
}\
if $12 != nil {\
x.Order = $12.(*ast.OrderByClause)\
}\
if $13 != nil {\
x.Limit = $13.(*ast.Limit)\
}\
$$ = x\
}\
\| “DELETE” TableOptimizerHints PriorityOpt QuickOptional IgnoreOptional TableAliasRefList “FROM” TableRefs WhereClauseOptional\
{\
// Multiple Table\
x := \&ast.DeleteStmt{\
Priority: $3.(mysql.PriorityEnum),\
Quick: $4.(bool),\
IgnoreErr: $5.(bool),\
IsMultiTable: true,\
BeforeFrom: true,\
Tables: \&ast.DeleteTableList{Tables: $6.([]*ast.TableName)},\
TableRefs: \&ast.TableRefsClause{TableRefs: $8.(*ast.Join)},\
}\
if $2 != nil {\
x.TableHints = $2.([]*ast.TableOptimizerHint)\
}\
if $9 != nil {\
x.Where = $9.(ast.ExprNode)\
}\
$$ = x\
}\
DeleteWithUsingStmt:\
“DELETE” TableOptimizerHints PriorityOpt QuickOptional IgnoreOptional “FROM” TableAliasRefList “USING” TableRefs WhereClauseOptional\
{\
// Multiple Table\
x := \&ast.DeleteStmt{\
Priority: $3.(mysql.PriorityEnum),\
Quick: $4.(bool),\
IgnoreErr: $5.(bool),\
IsMultiTable: true,\
Tables: \&ast.DeleteTableList{Tables: $7.([]*ast.TableName)},\
TableRefs: \&ast.TableRefsClause{TableRefs: $9.(*ast.Join)},\
}\
if $2 != nil {\
x.TableHints = $2.([]*ast.TableOptimizerHint)\
}\
if $10 != nil {\
x.Where = $10.(ast.ExprNode)\
}\
$$ = x\
}\
DeleteFromStmt:\
DeleteWithoutUsingStmt\
\| DeleteWithUsingStmt 验证Yacc\
我们新建一个golang项目,如下。 这个demo解析了两个Delete SQL,第一个比较简单,第二个几乎把所有的参数都填上了,我们看看执行后的效果。 ```
package main
import (
"fmt"
parser "github.com/DigitalChinaOpenSource/DCParser"
"github.com/DigitalChinaOpenSource/DCParser/ast"
_ "github.com/DigitalChinaOpenSource/DCParser/test_driver"
) func parse(sql string) (*ast.StmtNode, error) {
p := parser.New()
stmtNodes, _, err := p.Parse(sql, "", "")
if err != nil {
return nil, err
}
return &stmtNodes[0], nil
} func main() {
astNode, err := parse("DELETE FROM somelog;")
if err != nil {
fmt.Printf("parse error: %v\"n", err.Error())
return
}
fmt.Printf("%v\"n", astNode)
astNode, err = parse("DELETE /+ BKA(somelog) / LOW_PRIORITY QUICK IGNORE FROM somelog PARTITION (p1,p2) AS t FORCE INDEX FOR ORDER BY(timestamp_column) WHERE user = 'jcole' ORDER BY timestamp_column LIMIT 1;")
if err != nil {
fmt.Printf("parse error: %v\"n", err.Error())
return
}
fmt.Printf("%v\"n", astNode)
}
``` 项目设置断点后运行,对"DELETE FROM somelog;"的解析呈现AST结构,可以看到TableName正确获取到了,其他的参数where, order, limit,priority, ignore, quick等等都是空值。 对二个SQL解析时"DELETE /*+ BKA(somelog) */ LOW_PRIORITY QUICK IGNORE FROM somelog PARTITION (p1,p2) AS t FORCE INDEX FOR ORDER BY(timestamp_column) WHERE user = ‘jcole’ ORDER BY timestamp_column LIMIT 1;",真个AST就变得庞大很多,包括where里的相等操作也能正确解析,这个parser工作正常,接下来我们开始删除不需要的语法参数。 修改Yacc\
我们开始删除PostgreSQL中没有的参数,如以下代码所示,对应的代码解析部分也要做相应的调整,不然会编译报错,当然中间也确实遇到了一个很麻烦的问题,如果你也遇到了类似conflict的问题,可以看看文末的trouble shooting。\
/*******************************************************************\
*\
* Delete Statement\
*\
*******************************************************************/\
DeleteWithoutUsingStmt:\
“DELETE” TableOptimizerHints PriorityOpt QuickOptional “FROM” TableName TableAsNameOpt WhereClauseOptional\
{\
// Single Table\
tn := $6.(*ast.TableName)\
join := \&ast.Join{Left: \&ast.TableSource{Source: tn, AsName: $7.(model.CIStr)}, Right: nil}\
x := \&ast.DeleteStmt{\
TableRefs: \&ast.TableRefsClause{TableRefs: join},\
Priority: $3.(mysql.PriorityEnum),\
Quick: $4.(bool),\
}\
if $2 != nil {\
x.TableHints = $2.([]*ast.TableOptimizerHint)\
}\
if $8 != nil {\
x.Where = $8.(ast.ExprNode)\
}\
$$ = x\
}\
\| “DELETE” TableOptimizerHints PriorityOpt QuickOptional TableAliasRefList “FROM” TableRefs WhereClauseOptional\
{\
// Multiple Table\
x := \&ast.DeleteStmt{\
Priority: $3.(mysql.PriorityEnum),\
Quick: $4.(bool),\
IsMultiTable: true,\
BeforeFrom: true,\
Tables: \&ast.DeleteTableList{Tables: $5.([]*ast.TableName)},\
TableRefs: \&ast.TableRefsClause{TableRefs: $7.(*ast.Join)},\
}\
if $2 != nil {\
x.TableHints = $2.([]*ast.TableOptimizerHint)\
}\
if $8 != nil {\
x.Where = $8.(ast.ExprNode)\
}\
$$ = x\
}\
DeleteWithUsingStmt:\
“DELETE” TableOptimizerHints PriorityOpt QuickOptional “FROM” TableAliasRefList “USING” TableRefs WhereClauseOptional\
{\
// Multiple Table\
x := \&ast.DeleteStmt{\
Priority: $3.(mysql.PriorityEnum),\
Quick: $4.(bool),\
IsMultiTable: true,\
Tables: \&ast.DeleteTableList{Tables: $6.([]*ast.TableName)},\
TableRefs: \&ast.TableRefsClause{TableRefs: $8.(*ast.Join)},\
}\
if $2 != nil {\
x.TableHints = $2.([]*ast.TableOptimizerHint)\
}\
if $9 != nil {\
x.Where = $9.(ast.ExprNode)\
}\
$$ = x\
}\
DeleteFromStmt:\
DeleteWithoutUsingStmt\
\| DeleteWithUsingStmt 利用项目自带的goyacc进行编译,记住,一定是用项目再带的goyacc编译,因为tidb对yyLexer接口一些改动,如果用go自带的goyacc编译会报没有实现方法的错误。好了,执行完之后就会生成一个新的parser.go文件。\
最后还需要把yacc修改导致的UT错误给修复了:) 本文删除MySQL自定义语法的目标已经接近完成,剩下的几个语法功能就留给读者自己尝试删除,关于如何加入PostgreSQL语法功能,我们会在下次分解。 当你在编译y文件时出现conflicts问题的时候,需要添加-v来查阅解析步骤。\
可以通过正确和错误文件对比提高排查效率,当然这次问题属于第一次修改没经验造成的,我还特地翻阅了Bison官方文档Shift/Reduce (Bison 3.7.6) (gnu.org)。最后根据Yacc的工作原理和output文件,把问题修复的,原因其实也非常简单,就是只改了一个delete分支里的语法,导致编译之后Delete AST解析路劲不一致导致的冲突。
| ONLY

Conflicts问题

评论








