暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

openGauss存储模型-行存和列存

原创 落笔花生 2022-12-14
306

学习目标

学习openGauss存储模型-行存和列存

行存储是指将表按行存储到硬盘分区上,列存储是指将表按列存储到硬盘分区上。默认情况下,创建的表为行存储。

行、列存储模型各有优劣,通常用于TP场景的数据库,默认使用行存储,仅对执行复杂查询且数据量大的AP场景时,才使用列存储

课程学习

连接数据库

#第一次进入等待15秒
#数据库启动中...
root@modb:~# su - omm omm@modb:~$ gsql -r gsql ((openGauss 3.0.0 build 02c14696) compiled at 2022-04-01 18:12:00 commit 0 last mr ) Non-SSL connection (SSL connection is recommended when requiring high-security) Type "help" for help.

1.创建行存表

omm=# create table test_t1
omm-# (
omm(# col1 CHAR(2),
omm(# col2 VARCHAR2(40),
omm(# col3 NUMBER
omm(# );
CREATE TABLE
–压缩属性为no
omm=# \d+ test_t1 Table "public.test_t1" Column | Type | Modifiers | Storage | Stats target | Description --------+-----------------------+-----------+----------+--------------+------------- col1 | character(2) | | extended | | col2 | character varying(40) | | extended | | col3 | numeric | | main | | Has OIDs: no Options: orientation=row, compression=no omm=# insert into test_t1 select col1, col2, col3 from (select generate_series(1, 100000) as key, repeat(chr(int4(random() * 26) + 65), 2) as col1, repeat(chr(int4(random() * 26) + 65), 30) as col2, (random() * (10^4))::integer as col3); omm=# INSERT 0 100000

2.创建列存表

omm=# CREATE TABLE test_t2
omm-# (
omm(# omm(# col1 CHAR(2),
col2 VARCHAR2(40),
omm(# col3 NUMBER
omm(# )
omm-# WITH (ORIENTATION = COLUMN);
CREATE TABLE
–压缩属性为low
omm=# \d+ test_t2; Table "public.test_t2" Column | Type | Modifiers | Storage | Stats target | Description --------+-----------------------+-----------+----------+--------------+------------- col1 | character(2) | | extended | | col2 | character varying(40) | | extended | | col3 | numeric | | main | | Has OIDs: no Options: orientation=column, compression=low
–插入和行存表相同的数据omm=# insert into test_t2 select * from test_t1; INSERT 0 100000

3.占用空间对比

omm=# \d+
                                        List of relations
 Schema |  Name   | Type  | Owner |  Size   |               Storage                | Description 
--------+---------+-------+-------+---------+--------------------------------------+-------------
 public | test_t1 | table | omm   | 6760 kB | {orientation=row,compression=no}     | 
 public | test_t2 | table | omm   | 1112 kB | {orientation=column,compression=low} | 
(2 rows)

4.对比读取一列的速度

omm=# analyze VERBOSE test_t1;
INFO:  analyzing "public.test_t1"(gaussdb pid=1)
INFO:  ANALYZE INFO : "test_t1": scanned 841 of 841 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows(gaussdb pid=1)
ANALYZE
omm=# analyze VERBOSE test_t2;
INFO:  analyzing "public.test_t2"(gaussdb pid=1)
INFO:  ANALYZE INFO : estimate total rows of "pg_delta_16395": scanned 0 pages of total 0 pages with 1 retry times, containing 0 live rows and 0 dead rows,  estimated 0 total rows(gaussdb pid=1)
INFO:  ANALYZE INFO : "test_t2": scanned 2 of 2 cus, sample 30000 rows, estimated total 100000 rows(gaussdb pid=1)
ANALYZE
–列存表时间少于行存表
omm=# explain analyze select distinct col1 from test_t1; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=2091.00..2091.27 rows=27 width=3) (actual time=43.805..43.809 rows=27 loops=1) Group By Key: col1 -> Seq Scan on test_t1 (cost=0.00..1841.00 rows=100000 width=3) (actual time=0.012..19.588 rows=100000 loops=1) Total runtime: 43.900 ms (4 rows) omm=# explain analyze select distinct col1 from test_t2; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Row Adapter (cost=1008.27..1008.27 rows=27 width=3) (actual time=4.291..4.295 rows=27 loops=1) -> Vector Sonic Hash Aggregate (cost=1008.00..1008.27 rows=27 width=3) (actual time=4.289..4.289 rows=27 loops=1) Group By Key: col1 -> CStore Scan on test_t2 (cost=0.00..758.00 rows=100000 width=3) (actual time=0.027..0.273 rows=100000 loops=1) Total runtime: 4.474 ms (5 rows)

5.对比插入一行的速度

–行存表时间少于列存表
omm=# explain analyze insert into test_t1 values('x', 'xxxx', '123');
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 [Bypass]
 Insert on test_t1  (cost=0.00..0.01 rows=1 width=0) (actual time=0.068..0.069 rows=1 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
 Total runtime: 0.173 ms
(4 rows)

omm=# explain analyze insert into test_t2 values('x', 'xxxx', '123');
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Insert on test_t2  (cost=0.00..0.01 rows=1 width=0) (actual time=2.550..2.551 rows=1 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
 Total runtime: 2.655 ms
(3 rows)

6.清理数据

omm=# drop table test_t1;
DROP TABLE
omm=# drop table test_t2;
DROP TABLE

课程作业

1.创建行存表和列存表,并批量插入10万条数据(行存表和列存表数据相同)

root@modb:~# su - omm
omm@modb:~$ gsql -r
gsql ((openGauss 3.0.0 build 02c14696) compiled at 2022-04-01 18:12:00 commit 0 last mr  )
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.
omm=# create table test1(t_id int,t_name char(10));
CREATE TABLE
omm=# insert into test1 values (generate_series(1,10000)); INSERT 0 10000
omm=# create table test2(t_id int,t_name char(10)) with (ORIENTATION = COLUMN); CREATE TABLE
omm=# insert into test2 select * from test1 ; INSERT 0 10000

2.对比行存表和列存表空间大小

omm=# \d+
                                      List of relations
 Schema | Name  | Type  | Owner |  Size  |               Storage                | Description 
--------+-------+-------+-------+--------+--------------------------------------+-------------
 public | test1 | table | omm   | 384 kB | {orientation=row,compression=no}     | 
 public | test2 | table | omm   | 56 kB  | {orientation=column,compression=low} | 
(2 rows)

3.对比查询一列和插入一行的速度

omm=# analyze VERBOSE test1;
INFO:  analyzing "public.test1"(gaussdb pid=1)
INFO:  ANALYZE INFO : "test1": scanned 45 of 45 pages, containing 10000 live rows and 0 dead rows; 10000 rows in sample, 10000 estimated total rows(gaussdb pid=1)
ANALYZE
omm=# analyze VERBOSE test2;
INFO:  analyzing "public.test2"(gaussdb pid=1)
INFO:  ANALYZE INFO : estimate total rows of "pg_delta_16392": scanned 0 pages of total 0 pages with 1 retry times, containing 0 live rows and 0 dead rows,  estimated 0 total rows(gaussdb pid=1)
INFO:  ANALYZE INFO : "test2": scanned 1 of 1 cus, sample 10000 rows, estimated total 10000 rows(gaussdb pid=1)
ANALYZE
omm=# explain analyze insert into test1 values (10001,'a');
                                         QUERY PLAN                                          
---------------------------------------------------------------------------------------------
 [Bypass]
 Insert on test1  (cost=0.00..0.01 rows=1 width=0) (actual time=0.062..0.062 rows=1 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1)
 Total runtime: 0.168 ms
(4 rows)

omm=# explain analyze insert into test2 values (10001,'a');
                                         QUERY PLAN                                          
---------------------------------------------------------------------------------------------
 Insert on test2  (cost=0.00..0.01 rows=1 width=0) (actual time=0.121..0.122 rows=1 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
 Total runtime: 0.230 ms
(3 rows)

4.清理数据

omm=# drop table test1;
DROP TABLE
omm=# drop table test2;
DROP TABLE
最后修改时间:2022-12-14 14:49:04
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论