GBase 8a的SELECT语句支持使用START WITH...CONNECT BY语句实现分级查询。
1.相关概念
在使用START WITH...CONNECT BY语句前,请先了解以下概念。
数据源:单物理表、单逻辑表或单个视图。
迭代种子:START WITH指定的过滤条件,基于数据源过滤出来的数据称为迭代种子,迭代种子都是根节点,如果省略START WITH,那么所有数据源都是根节点。
迭代关系:CONNECT BY指定的关联条件,父子节点之间通过迭代关系进行关联。
分级查询:基于迭代种子和数据源,根据迭代关系遍历数据,称为分级查询,也称树形查询或层次查询。即,基于START WITH指定的根节点和原始表数据,根据CONNECT BY指定的关系遍历数据
环路:如果一个节点的子孙节点同时又是该节点的祖先节点,称为存在环路。
兄弟间排序:按KEY值顺序遍历数据,遍历规则是树的深度优先遍历中的先序遍历。
伪列:伪列并不存储在表中,但是它的行为却类似于表中的字段,可以基于伪列做查询操作,但不能更新伪列;伪列也可以理解为没有参数的函数,区别是:基于不同的行,伪列的值一般是不同的。
集:某些特定数据集合,如父节点集是指当前层的上层节点数据集合,子节点集是指当前层的数据集合,祖先节点集是指当前层的所有祖先节点组成的数据集合。
2. 执行原理
遍历表中的每条记录,对比是否满足START WITH后的条件,如果不满足则继续下一条,如果满足则以该记录为根节点,然后递归寻找该节点下的子节点,查找条件是CONNECT BY后面指定的条件。
3. 语法格式
SELECT column_list|[LEVEL]
FROM single_table | view
[WHERE ...]
[hierarchical_clause]
[GROUP BY ...]
[ORDER [SIBLINGS] BY ...]
hierarchical_clause:
[START WITH <conditions>] CONNECT BY <connect_conditions>
| CONNECT
BY <connect_conditions> [START WITH <conditions>]
[ORDER
SIBLINGS BY {col_name | expr | position} [ASC | DESC] , ...]
PRIOR expr1 op expr2
|
expr1 op PRIOR expr2
|
expr op connect_condition
|
expr
LEVEL:可选关键字,表示等级。即该节点在树中的层数,根节点为第一层,level为1
WHERE:根据CONNECT BY
<connect_conditions> START WITH<connect_conditions>选择出来的记录进行过滤,是针对单条记录的过滤。
WHERE条件限制查询返回的行。
hierarchical_clause中参数说明如下:
START WITH <conditions>:START WITH表示开始的记录,限定作为搜索起始点的条件,如果是自上而下的搜索则是限定作为根节点的条件,如果是自下而上的搜索则是限定作为叶子节点的条件。
CONNECT BY <conditions>:这里执行的过滤会把符合条件的记录及其下所有子节点都过滤掉。
CONNECT BY PRIOR指定与当前记录关联时的字段关系。
START WITH和CONNECT
BY的先后顺序不影响查询结果。
ORDER SIBLINGS BY:使用此关键字时可以进行层次查询
connect_condition中参数说明如下:
PRIOR:表示上一条记录。可以指定与当前记录关联时的字段关系。包含此参数时表示不进行递归查询。
4. 使用约束
START WITH...CONNECT BY语句在使用时有以下约束:
1、分级查询只能基于单物理表或单隐式表(derived
table)。
以下语句写法错误:
SELECT LEVEL FROM t1, t1 t2 WHERE t1.i =
t2.i CONNECT BY t1.i =PRIOR t1.j GROUP BY LEVEL ORDER BY LEVEL;
2、CONNECT BY关联条件不能包含OR操作,并且必须包含父子节点间的等值条件,等号的两边必须是不同的维度(一边包含PRIOR,一边不包含PRIOR)。
以下语句写法正确:
SELECT * FROM t1 START WITH i = 1 CONNECT BY
LEVEL < 4 and PRIOR j = i ORDER SIBLINGS BY j;
以下语句写法错误:
SELECT * FROM t1 START WITH i = 1 CONNECT BY
LEVEL < 4 and PRIOR j > i ORDER SIBLINGS BY j;
SELECT * FROM t1 START WITH i = 1 CONNECT BY
PRIOR j = i or PRIOR i = j ORDER SIBLINGS BY j;
3、PRIOR是一元操作符,优先级同正负号,只能用在CONNECT BY子句中;
PRIOR后面不可以接伪列(LEVEL、ROWID等)、不可以接包含伪列的表达式、不可以嵌套使用PRIOR,不可以接聚合函数。
以下语句写法错误:
SELECT 1 FROM t1 START WITH i = 1 CONNECT BY
j = PRIOR (ROWID+1);
SELECT 1 FROM t1 START WITH i = 1 CONNECT BY
j = PRIOR LEVEL;
SELECT 1 FROM t1 START WITH i = 1 CONNECT BY
j = PRIOR SUM(i);
SELECT 1 FROM t1 START WITH i = 1 CONNECT BY
j = PRIOR (i + PRIOR 1);
4、ORDER SIBLINGS BY只能用在分级查询语句中,并且不能同聚合同时存在
以下语句写法错误:
SELECT * FROM t1 START WITH i = 1 CONNECT BY
PRIOR j = i GROUP BY i,j ORDER SIBLINGS BY j;
5、LEVEL伪列只能用于分级查询语句中。
以下语句写法错误:
SELECT LEVEL FROM t1;
6、不支持实时环路判断,只能保证最终能够检测出环路,如果数据量太大(如超过十万),会耗时很长才能检测出。
7、最大节点数为MAX_INT(2147483647)
8、不能基于做过DLETE的表进行分级查询,但可以修改为隐式表的形式。
DELETE FROM t1 WHERE…;
以下语句写法错误:
SELECT … FROM t1 CONNECT BY…;
以下语句写法正确:
SELECT … FROM (SELECT * FROM t1) t1 CONNECT
BY…;
9、在配置文件“gbase_8a_gbase8a.cnf”中增加参数gbase_max_allowed_level,用于控制允许出现的最大level数,默认值为1024,最大2147483647,最小1。
5. 分级查询语句示例
5.1. 表
建表语句如下:
DROP TABLE t1;
CREATE TABLE t1(i int, j int, l varchar(10),
k double);
CREATE TABLE t2 LIKE t1;
INSERT INTO t1 VALUES
(1,2,'2013-1-2',1.2),(1,5,'2013-1-5',1.5),(1,8,'2013-1-8',1.8);
INSERT INTO t1 VALUES
(5,3,'2013-5-3',5.3),(5,4,'2013-5-4',5.4);
INSERT INTO t1 VALUES
(2,7,'2013-2-7',2.7),(2,6,'2013-2-6',2.6);
INSERT INTO t1 VALUES
(8,9,'2013-8-9',8.9),(8,0,'2013-8-1',8.0);
INSERT INTO t1 VALUES (0,10,'2011-1-10',0.1);
INSERT INTO t2 SELECT * FROM t1;
示例1:START
WITH...CONNECT BY PRIOR...
gbase> SELECT * FROM t1 START WITH i = 1 CONNECT BY
PRIOR j = i ORDER BY j, l DESC, k;
+------+------+-----------+------+
| i
| j | l | k
|
+------+------+-----------+------+
| 8
| 0 | 2013-8-1 | 8
|
| 1
| 2 | 2013-1-2 | 1.2
|
| 5
| 3 | 2013-5-3 | 5.3
|
| 5
| 4 | 2013-5-4 | 5.4
|
| 1
| 5 | 2013-1-5 | 1.5
|
| 2
| 6 | 2013-2-6 | 2.6
|
| 2
| 7 | 2013-2-7 | 2.7
|
| 1
| 8 | 2013-1-8 | 1.8
|
| 8
| 9 | 2013-8-9 | 8.9
|
| 0
| 10 | 2011-1-10 | 0.1 |
+------+------+-----------+------+
10 rows in set
示例2:START
WITH...CONNECT BY PRIOR...ORDER SIBLINGS BY...
gbase> SELECT * FROM t1 START WITH i = 1 CONNECT BY
PRIOR j = i AND (1 OR k = i) > 0 ORDER SIBLINGS BY j, l DESC, k;
+------+------+-----------+------+
| i
| j | l | k
|
+------+------+-----------+------+
| 8
| 0 | 2013-8-1 | 8
|
| 1
| 2 | 2013-1-2 | 1.2
|
| 5
| 3 | 2013-5-3 | 5.3
|
| 5 | 4
| 2013-5-4 | 5.4 |
| 1
| 5 | 2013-1-5 | 1.5
|
| 2
| 6 | 2013-2-6 | 2.6
|
| 2
| 7 | 2013-2-7 | 2.7
|
| 1
| 8 | 2013-1-8 | 1.8
|
| 8
| 9 | 2013-8-9 | 8.9
|
| 0
| 10 | 2011-1-10 | 0.1 |
+------+------+-----------+------+
10 rows in set
示例3:CONNECT BY PRIOR与CONNECT BY。
gbase> SELECT * FROM t1 CONNECT BY PRIOR j = i;
+------+------+-----------+------+
| i
| j | l | k
|
+------+------+-----------+------+
| 1
| 2 | 2013-1-2 | 1.2
|
| 1
| 5 | 2013-1-5 | 1.5
|
| 1
| 8 | 2013-1-8 | 1.8
|
| 5
| 3 | 2013-5-3 | 5.3
|
| 5
| 4 | 2013-5-4 | 5.4
|
| 2
| 7 | 2013-2-7 | 2.7
|
| 2
| 6 | 2013-2-6 | 2.6
|
| 8
| 9 | 2013-8-9 | 8.9
|
| 8
| 0 | 2013-8-1
| 8 |
| 0
| 10 | 2011-1-10 | 0.1 |
+------+------+-----------+------+
10 rows in set
gbase>
SELECT * FROM t1 CONNECT BY j = i;
+------+------+-----------+------+
| i
| j | l | k
|
+------+------+-----------+------+
| 1
| 2 | 2013-1-2 | 1.2
|
| 1
| 5 | 2013-1-5 | 1.5
|
| 1
| 8 | 2013-1-8 | 1.8
|
| 5
| 3 | 2013-5-3 | 5.3
|
| 5
| 4 | 2013-5-4 | 5.4
|
| 2
| 7 | 2013-2-7 | 2.7
|
| 2
| 6 | 2013-2-6 | 2.6
|
| 8
| 9 | 2013-8-9 | 8.9
|
| 8
| 0 | 2013-8-1 | 8
|
| 0
| 10 | 2011-1-10 | 0.1 |
+------+------+-----------+------+
10 rows in set
示例4:SELECT * FROM
table WHERE ...START WITH...CONNECT BY PRIOR
gbase> SELECT * FROM t1 WHERE i < 1 START WITH i =
1 CONNECT BY PRIOR j = i ORDER BY j, l DESC, k;
+------+------+-----------+------+
| i
| j | l | k
|
+------+------+-----------+------+
| 0
| 10 | 2011-1-10 | 0.1 |
+------+------+-----------+------+
1 row in set
5.2. 视图
视图支持分级查询,即START WITH...CONNECT BY PRIOR的使用。
示例1:START
WITH...CONNECT BY PRIOR...
gbase>
DROP VIEW IF EXISTS v1;
Query OK, 0 rows affected
gbase>
CREATE VIEW v1 AS SELECT * FROM t1 START WITH i = 1 CONNECT BY PRIOR j = i
ORDER BY j, l DESC, k;
Query OK, 0 rows affected
gbase>
SELECT * FROM v1;
+------+------+-----------+------+
| i
| j | l | k
|
+------+------+-----------+------+
| 8
| 0 | 2013-8-1 | 8
|
| 1
| 2 | 2013-1-2 | 1.2
|
| 5
| 3 | 2013-5-3 | 5.3
|
| 5
| 4 | 2013-5-4 | 5.4
|
| 1
| 5 | 2013-1-5 | 1.5
|
| 2
| 6 | 2013-2-6 | 2.6
|
| 2
| 7 | 2013-2-7 | 2.7
|
| 1
| 8 | 2013-1-8 | 1.8
|
| 8
| 9 | 2013-8-9 | 8.9
|
| 0
| 10 | 2011-1-10 | 0.1 |
+------+------+-----------+------+
10 rows in set
gbase>
SHOW CREATE VIEW v1;
+------+--------------------+----------------------+----------------------+
| View | Create View | character_set_client |
collation_connection |
+------+--------------------+----------------------+----------------------+
| v1
| CREATE ALGORITHM=UNDEFINED DEFINER="root"@"%" SQL
SECURITY DEFINER VIEW "v1" AS select "t1"."i" AS
"i","t1"."j" AS "j","t1"."l"
AS "l","t1"."k" AS "k" from
"t1" start with ("t1"."i" = 1) connect by (prior
"t1"."j" = "t1"."i") order by
"t1"."j","t1"."l"
desc,"t1"."k" | utf8 | utf8_general_ci |
+------+--------------------+----------------------+----------------------+
1 row in set
示例2:START
WITH...CONNECT BY PRIOR...ORDER SIBLINGS BY...
gbase>
DROP VIEW v1;
Query OK, 0 rows affected
gbase>
CREATE VIEW v1 AS SELECT * FROM t1 START WITH i = 1 CONNECT BY PRIOR j = i AND
(1 OR k = i) > 0 ORDER SIBLINGS BY j, l DESC, k;
Query OK, 0 rows affected
gbase>
SELECT * FROM v1;
+------+------+-----------+------+
| i
| j | l | k
|
+------+------+-----------+------+
| 1
| 2 | 2013-1-2 | 1.2
|
| 2
| 6 | 2013-2-6 | 2.6
|
| 2
| 7 | 2013-2-7 | 2.7
|
| 1
| 5 | 2013-1-5 | 1.5
|
| 5
| 3 | 2013-5-3 | 5.3
|
| 5
| 4 | 2013-5-4
| 5.4 |
| 1
| 8 | 2013-1-8 | 1.8
|
| 8
| 0 | 2013-8-1 | 8
|
| 0
| 10 | 2011-1-10 | 0.1 |
| 8
| 9 | 2013-8-9 | 8.9
|
+------+------+-----------+------+
10 rows in set
gbase>
SHOW CREATE VIEW v1;
+------+--------------------+----------------------+----------------------+
| View | Create View | character_set_client |
collation_connection |
+------+--------------------+----------------------+----------------------+
| v1
| CREATE ALGORITHM=UNDEFINED DEFINER="root"@"%" SQL
SECURITY DEFINER VIEW "v1" AS select "t1"."i" AS
"i","t1"."j" AS
"j","t1"."l" AS
"l","t1"."k" AS "k" from "t1"
start with ("t1"."i" = 1) connect by ((prior
"t1"."j" = "t1"."i") and ((1 or
("t1"."k" = "t1"."i")) > 0)) order
siblings by "t1"."j","t1"."l"
desc,"t1"."k" | utf8 | utf8_general_ci |
+------+--------------------+----------------------+----------------------+
1 row in set
示例3:WHERE...START
WITH...CONNECT BY PRIOR...
gbase>
DROP VIEW v1;
Query OK, 0 rows affected
gbase>
CREATE VIEW v1 AS SELECT * FROM t1 WHERE i < 1 START WITH i = 1 CONNECT BY
PRIOR j = i ORDER BY j, l DESC, k;
Query OK, 0 rows affected
gbase>
SELECT * FROM v1;
+------+------+-----------+------+
| i
| j | l | k
|
+------+------+-----------+------+
| 0
| 10 | 2011-1-10 | 0.1 |
+------+------+-----------+------+
1 row in set
gbase>
SHOW CREATE VIEW v1;
+------+---------------+----------------------+----------------------+
| View | Create View | character_set_client | collation_connection
|
+------+---------------+----------------------+----------------------+
| v1
| CREATE ALGORITHM=UNDEFINED DEFINER="root"@"%" SQL
SECURITY DEFINER VIEW "v1" AS select "t1"."i" AS
"i","t1"."j" AS "j","t1"."l"
AS "l","t1"."k" AS "k" from
"t1" where ("t1"."i" < 1) start with
("t1"."i" = 1) connect by (prior
"t1"."j" = "t1"."i") order by
"t1"."j","t1"."l" desc,"t1"."k"
| utf8 |
utf8_general_ci |
+------+----------------+----------------------+----------------------+
1 row in set




