openGauss每日一练第20天课程笔记和作业

原创云散月明 All China Database Union 2021-12-22

659

数据库环境

openGauss:2.0.0 - 数据库实训平台

学习目标

openGauss提供了两种数据类型用于支持全文检索。tsvector类型表示为文本搜索优化的文件格式，tsquery类型表示文本查询

学习笔记

tsvector、tsquery
- tsvector 拥有位置的词汇甚至可以用一个权来标记，反映文档结构，这个权可以是A，B，C或D。默认的是D，因此输出中不会出现

omm=# SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;
          tsvector          
----------------------------
 'a':1A 'cat':5 'fat':2B,4C
(1 row)

omm=# SELECT 'fat & rat'::tsquery;
    tsquery    
---------------
 'fat' & 'rat'
(1 row)

omm=# SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector @@ 'cat & rat'::tsquery AS RESULT;
 result 
--------
 t
(1 row)

omm=# SELECT 'fat & cow'::tsquery @@ 'a fat cat sat on a mat and ate a fat rat'::tsvector AS RESULT;
 result 
--------
 f
(1 row)

分词器

omm=# \dF
               List of text search configurations
   Schema   |    Name    |              Description              
------------+------------+---------------------------------------
 pg_catalog | danish     | configuration for danish language
 pg_catalog | dutch      | configuration for dutch language
 pg_catalog | english    | configuration for english language
 pg_catalog | finnish    | configuration for finnish language
 pg_catalog | french     | configuration for french language
 pg_catalog | german     | configuration for german language
 pg_catalog | hungarian  | configuration for hungarian language
 pg_catalog | italian    | configuration for italian language
 pg_catalog | ngram      | ngram configuration
 pg_catalog | norwegian  | configuration for norwegian language
 pg_catalog | romanian   | configuration for romanian language
 pg_catalog | russian    | configuration for russian language
 pg_catalog | simple     | simple configuration
 pg_catalog | spanish    | configuration for spanish language
 pg_catalog | portuguese | configuration for portuguese language
 pg_catalog | pound      | pound configuration
 pg_catalog | zhparser   | zhparser configuration
 pg_catalog | swedish    | configuration for swedish language
 pg_catalog | turkish    | configuration for turkish language
(19 rows)

omm=# 
omm=# show default_text_search_config;
 default_text_search_config 
----------------------------
 pg_catalog.english
(1 row)

GIN索引
- 连接列的索引

omm=# CREATE INDEX pgweb_idx_1 ON tsearch.pgweb USING gin(to_tsvector('english', body));
CREATE INDEX
omm=# CREATE INDEX pgweb_idx_3 ON tsearch.pgweb USING gin(to_tsvector('english', title || ' ' || body));
CREATE INDEX
omm=# \d+ tsearch.pgweb
                            Table "tsearch.pgweb"
    Column     |  Type   | Modifiers | Storage  | Stats target | Description 
---------------+---------+-----------+----------+--------------+-------------
 id            | integer |           | plain    |              | 
 body          | text    |           | extended |              | 
 title         | text    |           | extended |              | 
 last_mod_date | date    |           | plain    |              | 
Indexes:
    "pgweb_idx_1" gin (to_tsvector('english'::regconfig, body)) TABLESPACE pg_default
    "pgweb_idx_3" gin (to_tsvector('english'::regconfig, (title || ' '::text) || body)) TABLESPACE pg_default
Has OIDs: no
Options: orientation=row, compression=no

课后作业

1.用tsvector @@ tsquery和tsquery @@ tsvector完成两个基本文本匹配

omm=# SELECT 'good good study day day up'::tsvector;                   
         tsvector          
---------------------------
 'day' 'good' 'study' 'up'
(1 row)

omm=# SELECT 'you & live'::tsquery;
    tsquery     
----------------
 'you' & 'live'
(1 row)

omm=# SELECT 'good good study day day up'::tsvector @@ 'study & day'::tsquery AS RESULT;
 result 
--------
 t
(1 row)

omm=# SELECT 'you & live'::tsquery @@ 'Cease to struggle and you cease to live'::tsvector AS RESULT;
 result 
--------
 f
(1 row)

2.创建表且至少有两个字段的类型为 text类型，在创建索引前进行全文检索

omm=# create table c
omm-# (
omm(#     id integer,
omm(#     content1 text,
omm(#     content2 text
omm(# );
CREATE TABLE

omm=# insert into c values
omm-# (1,'good good study day day up','hello'),
omm-# (2,'Cease to struggle and you cease to live','world');
INSERT 0 2

omm=# select * from c;
 id |                content1                 | content2 
----+-----------------------------------------+----------
  1 | good good study day day up              | hello
  2 | Cease to struggle and you cease to live | world
(2 rows)

omm=# select * from c where to_tsvector(content1) @@ to_tsquery('good');
 id |          content1          | content2 
----+----------------------------+----------
  1 | good good study day day up | hello
(1 row)

omm=# select * from c where to_tsvector(content1 || ' ' || content2) @@ to_tsquery('cease & world');
 id |                content1                 | content2 
----+-----------------------------------------+----------
  2 | Cease to struggle and you cease to live | world
(1 row)

3.创建GIN索引

omm=# create index content1_idx on c using gin(to_tsvector('english', content1));
CREATE INDEX
omm=# \d c
        Table "public.c"
  Column  |  Type   | Modifiers 
----------+---------+-----------
 id       | integer | 
 content1 | text    | 
 content2 | text    | 
Indexes:
    "content1_idx" gin (to_tsvector('english'::regconfig, content1)) TABLESPACE pg_default

4.清理数据

omm=# drop table c;
DROP TABLE

学习资源

欢迎各位同学一起来交流学习心得！

墨力计划 opengauss

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

文章被以下合辑收录

课程笔记每日一练：openGauss数据库在线实训课程（共24篇）

《每日一练：openGauss数据库在线实训课程》的课程笔记（2021年12月）

openGauss每日一练第20天 课程笔记和作业

数据库环境

学习目标

学习笔记

课后作业

学习资源

文章被以下合辑收录

评论

openGauss每日一练第20天课程笔记和作业