暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

openGauss每日一练第20天 课程笔记和作业

数据库环境

openGauss:2.0.0 - 数据库实训平台

学习目标

openGauss提供了两种数据类型用于支持全文检索。tsvector类型表示为文本搜索优化的文件格式,tsquery类型表示文本查询

学习笔记

  • tsvector、tsquery
    • tsvector 拥有位置的词汇甚至可以用一个权来标记,反映文档结构,这个权可以是A,B,C或D。默认的是D,因此输出中不会出现
omm=# SELECT 'a:1A fat:2B,4C cat:5D'::tsvector; tsvector ---------------------------- 'a':1A 'cat':5 'fat':2B,4C (1 row) omm=# SELECT 'fat & rat'::tsquery; tsquery --------------- 'fat' & 'rat' (1 row) omm=# SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector @@ 'cat & rat'::tsquery AS RESULT; result -------- t (1 row) omm=# SELECT 'fat & cow'::tsquery @@ 'a fat cat sat on a mat and ate a fat rat'::tsvector AS RESULT; result -------- f (1 row)
  • 分词器
omm=# \dF List of text search configurations Schema | Name | Description ------------+------------+--------------------------------------- pg_catalog | danish | configuration for danish language pg_catalog | dutch | configuration for dutch language pg_catalog | english | configuration for english language pg_catalog | finnish | configuration for finnish language pg_catalog | french | configuration for french language pg_catalog | german | configuration for german language pg_catalog | hungarian | configuration for hungarian language pg_catalog | italian | configuration for italian language pg_catalog | ngram | ngram configuration pg_catalog | norwegian | configuration for norwegian language pg_catalog | romanian | configuration for romanian language pg_catalog | russian | configuration for russian language pg_catalog | simple | simple configuration pg_catalog | spanish | configuration for spanish language pg_catalog | portuguese | configuration for portuguese language pg_catalog | pound | pound configuration pg_catalog | zhparser | zhparser configuration pg_catalog | swedish | configuration for swedish language pg_catalog | turkish | configuration for turkish language (19 rows) omm=# omm=# show default_text_search_config; default_text_search_config ---------------------------- pg_catalog.english (1 row)
  • GIN索引
    • 连接列的索引
omm=# CREATE INDEX pgweb_idx_1 ON tsearch.pgweb USING gin(to_tsvector('english', body)); CREATE INDEX omm=# CREATE INDEX pgweb_idx_3 ON tsearch.pgweb USING gin(to_tsvector('english', title || ' ' || body)); CREATE INDEX omm=# \d+ tsearch.pgweb Table "tsearch.pgweb" Column | Type | Modifiers | Storage | Stats target | Description ---------------+---------+-----------+----------+--------------+------------- id | integer | | plain | | body | text | | extended | | title | text | | extended | | last_mod_date | date | | plain | | Indexes: "pgweb_idx_1" gin (to_tsvector('english'::regconfig, body)) TABLESPACE pg_default "pgweb_idx_3" gin (to_tsvector('english'::regconfig, (title || ' '::text) || body)) TABLESPACE pg_default Has OIDs: no Options: orientation=row, compression=no

课后作业

1.用tsvector @@ tsquery和tsquery @@ tsvector完成两个基本文本匹配

omm=# SELECT 'good good study day day up'::tsvector; tsvector --------------------------- 'day' 'good' 'study' 'up' (1 row) omm=# SELECT 'you & live'::tsquery; tsquery ---------------- 'you' & 'live' (1 row) omm=# SELECT 'good good study day day up'::tsvector @@ 'study & day'::tsquery AS RESULT; result -------- t (1 row) omm=# SELECT 'you & live'::tsquery @@ 'Cease to struggle and you cease to live'::tsvector AS RESULT; result -------- f (1 row)

2.创建表且至少有两个字段的类型为 text类型,在创建索引前进行全文检索

omm=# create table c omm-# ( omm(# id integer, omm(# content1 text, omm(# content2 text omm(# ); CREATE TABLE omm=# insert into c values omm-# (1,'good good study day day up','hello'), omm-# (2,'Cease to struggle and you cease to live','world'); INSERT 0 2 omm=# select * from c; id | content1 | content2 ----+-----------------------------------------+---------- 1 | good good study day day up | hello 2 | Cease to struggle and you cease to live | world (2 rows) omm=# select * from c where to_tsvector(content1) @@ to_tsquery('good'); id | content1 | content2 ----+----------------------------+---------- 1 | good good study day day up | hello (1 row) omm=# select * from c where to_tsvector(content1 || ' ' || content2) @@ to_tsquery('cease & world'); id | content1 | content2 ----+-----------------------------------------+---------- 2 | Cease to struggle and you cease to live | world (1 row)

3.创建GIN索引

omm=# create index content1_idx on c using gin(to_tsvector('english', content1)); CREATE INDEX omm=# \d c Table "public.c" Column | Type | Modifiers ----------+---------+----------- id | integer | content1 | text | content2 | text | Indexes: "content1_idx" gin (to_tsvector('english'::regconfig, content1)) TABLESPACE pg_default

4.清理数据

omm=# drop table c; DROP TABLE

学习资源


欢迎各位同学一起来交流学习心得!

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论