暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

PostgreSQL/openGauss/MogDB中时间戳带时区的类型简单介绍

原创 Maleah 2024-05-20
1063

前言:通过两个小案例来初步了解带时区的时间戳类型的利与弊,实际在进行数据库的设计时,可以根据情况来自行选择

1、时间戳类型

Name Storage Size Description Example
timestamp [ (p) ] [ without time zone ] 8 bytes both date and time (no time zone) 2024-05-08 15:15:34.154392
timestamp [ (p) ] with time zone 8 bytes both date and time, with time zone 2024-05-08 15:15:38.34098+08

2024-05-08 15:15:38.34098+08:中+08代表时区,以UTC为偏移。即本地时间等于 UTC(协调世界时)加上时区偏移量

  • 对于timestamp with time zone,在存储timestamptz时,会自动将输入的时间转换为UTC时区来存储,查询时会自动从 UTC 转换成当前的timezone时区,并且显示为该时区的本地时间来展示;

  • timestamp without time zone:不受时区的影响

    -- timestamptz受时区影响
    # SELECT '2024-05-08 12:00:00+00'::timestamptz,'2024-05-08 12:00:00+08'::timestamptz ;
    +------------------------+------------------------+
    |      timestamptz       |      timestamptz       |
    +------------------------+------------------------+
    | 2024-05-08 20:00:00+08 | 2024-05-08 12:00:00+08 |
    +------------------------+------------------------+
    (1 row)
    
    -- timestamp不受时区影响
    # SELECT '2024-05-08 12:00:00+00'::timestamp,'2024-05-08 12:00:00+08'::timestamp ;
    +---------------------+---------------------+
    |      timestamp      |      timestamp      |
    +---------------------+---------------------+
    | 2024-05-08 12:00:00 | 2024-05-08 12:00:00 |
    +---------------------+---------------------+
    (1 row)
    

UTC(Coordinated Universal Time -协调世界时)和GMT(Greenwich Mean Time - 格林尼治标准时间)常常被提到,GMT是前世界标准时,UTC是现世界标准时。GMT和UTC在不需要精确到秒的情况下,二者可以视为等同。但在科技和国际标准中,UTC因其更高的精确度和稳定性而更为常用。

2、夏令时

1)简单介绍

DST(Daylight Saving Time),夏令时又称夏季时间,或者夏时制。

它是为节约能源而人为规定地方时间的制度。一般在天亮早的夏季人为将时钟拨快一小时

在PostgreSQL数据库中处理时间类型数据时,尤其是涉及到时区的情况,夏令时(Daylight Saving Time, DST)的影响是非常关键的。PostgreSQL提供了对时区敏感的数据类型,主要是timestamp with time zone。这种数据类型可以在存储时自动考虑时区变化,包括夏令时的调整。

2)诡异案例

看到夏令时的影响如何体现在时间数据的存储和查询中:

CREATE TABLE dst_test(id serial, event_name varchar(100), insr_time timestamp without time zone, event_tt timestamp without time zone) ;
INSERT INTO dst_test(event_name,insr_time,event_tt) 
VALUES ('Before DST','1940-05-31 23:59:59','1940-05-31 23:59:59'::timestamptz),
	('After DST','1940-06-01 00:00:00','1940-06-01 00:00:00'::timestamptz);

# SELECT * from dst_test;
+----+------------+---------------------+---------------------+
| id | event_name |      insr_time      |      event_tt       |
+----+------------+---------------------+---------------------+
|  1 | Before DST | 1940-05-31 23:59:59 | 1940-05-31 23:59:59 |
|  2 | After DST  | 1940-06-01 00:00:00 | 1940-06-01 01:00:00 |
+----+------------+---------------------+---------------------+
(2 rows)

揭秘:

ALTER TABLE dst_test ADD COLUMN event_tz timestamp with time zone ;
UPDATE dst_test SET event_tz = '1940-05-31 23:59:59'::timestamptz where insr_time = '1940-05-31 23:59:59';
UPDATE dst_test SET event_tz = '1940-06-01 00:00:00'::timestamptz where insr_time = '1940-06-01 00:00:00';

# SELECT * from dst_test;
+----+------------+---------------------+---------------------+------------------------+
| id | event_name |      insr_time      |      event_tt       |        event_tz        |
+----+------------+---------------------+---------------------+------------------------+
|  1 | Before DST | 1940-05-31 23:59:59 | 1940-05-31 23:59:59 | 1940-05-31 23:59:59+08 |
|  2 | After DST  | 1940-06-01 00:00:00 | 1940-06-01 01:00:00 | 1940-06-01 01:00:00+09 |
+----+------------+---------------------+---------------------+------------------------+
(2 rows)

# \d dst_test
                                         Table "public.dst_test"
+------------+-----------------------------+-----------+----------+--------------------------------------+
|   Column   |            Type             | Collation | Nullable |               Default                |
+------------+-----------------------------+-----------+----------+--------------------------------------+
| id         | integer                     |           | not null | nextval('dst_test_id_seq'::regclass) |
| event_name | character varying(100)      |           |          |                                      |
| insr_time  | timestamp without time zone |           |          |                                      |
| event_tt   | timestamp without time zone |           |          |                                      |
| event_tz   | timestamp with time zone    |           |          |                                      |
+------------+-----------------------------+-----------+----------+--------------------------------------+

不要使用timestamp来存储带时区的时间类型的数据

timestamp with time zone类型的输出是以该时区的时间,如果要输出其它时区的时间,要么修改timezone,要么使用AT TIME ZONE

3)AT TIME ZONE

AT TIME ZONE 运算符将不带时区的时间戳与带时区的时间戳相互转换,以及将带时区值的时间转换为不同时区

# SELECT '2024-05-01 14:00:00+08'::timestamptz at time zone 'UTC' AS tz2ts,  --本地时间14:00,查看UTC时区的时间
	'2024-05-01 14:00:00'::timestamp at time zone 'UTC' as ts2tz ;   --UTC时间14:00,显示本地时间
+---------------------+------------------------+
|        tz2ts        |         ts2tz          |
+---------------------+------------------------+
| 2024-05-01 06:00:00 | 2024-05-01 22:00:00+08 |
+---------------------+------------------------+
(1 row)

4)进阶版

感兴趣的可以自行研究下

# show timezone;
+----------+
| TimeZone |
+----------+
| PRC      |
+----------+
(1 row)

# SELECT * from pg_timezone_names where name ~ 'PRC' ;
+------+--------+------------+--------+
| name | abbrev | utc_offset | is_dst |
+------+--------+------------+--------+
| PRC  | CST    | 08:00:00   | f      |
+------+--------+------------+--------+
(1 row)

# SELECT '1940-06-01 00:00:00'::timestamp, 
	'1940-06-01 00:00:00'::timestamp at time zone 'UTC' , 
	'1940-06-01 00:00:00'::timestamptz, 
	'1940-06-01 00:00:00'::timestamptz at time zone 'UTC' ;
+---------------------+------------------------+------------------------+---------------------+
|      timestamp      |        timezone        |      timestamptz       |      timezone       |
+---------------------+------------------------+------------------------+---------------------+
| 1940-06-01 00:00:00 | 1940-06-01 09:00:00+09 | 1940-06-01 01:00:00+09 | 1940-05-31 16:00:00 |
+---------------------+------------------------+------------------------+---------------------+
(1 row)

3、时区类型的数据对于全表扫和索引扫的案例

1)基础知识

函数稳定性:

在PostgreSQL/MogDB/openGauss中,函数稳定性分为三档:

  • immutable,超级稳定,任何时候调用,只要函数的参数不变结果就不变。如果参数为常量或者无参数,在生成执行计划时,直接将这类函数替换为常量。

  • stable,稳定,在一个事务中调用时,只要函数的参数不变结果就不变。pg中,只有stable和immutable的函数才可以被执行计划选择作为索引的比较条件

  • volatile,不稳定。同一个QUERY中,如果需要返回该函数的结果,那么每一行都会运算一遍这个函数

函数索引需要immutable函数

函数索引一定要immutable的函数。如果一个时间字段上要创建to_char(timestamp without time zone, text)的函数索引。 是没有办法创建的,报错:

ERROR:  functions in index expression must be marked IMMUTABLE  

# select distinct(proname),provolatile from pg_proc where proname='to_char';
+---------+-------------+
| proname | provolatile |
+---------+-------------+
| to_char | s           |
+---------+-------------+
(1 row)

2)时区不同导致的走索引和走全表扫描得到结果不一致 案例

来源于digoal德哥案例

1、创建测试表及测试数据

# show TimeZone;  
+----------+
| TimeZone |
+----------+
| PRC      |
+----------+
(1 row)

create table test (id int,crt_time timestamp(0) with time zone); 
# create index idx_test on test (to_char(crt_time,'yyyymmddhh24')); 
ERROR:  42P17: functions in index expression must be marked IMMUTABLE
LOCATION:  ComputeIndexAttrs, indexcmds.c:1946
Time: 10.656 ms
-- 两种办法:一种是直接修改该函数的稳定性,另一种是创建一个IMMUTABLE类型的函数
-- 两种方法对于该实验的结果是相同的,以下采用修改函数稳定性的方法。另一种方法读者可以自行验证
alter function to_char(timestamp with time zone,text) strict immutable;  

create index idx_test on test (to_char(crt_time,'yyyymmddhh24')); 
insert into test values (1,'2012-01-01 12:00:00');  

-- PRC时区
# select to_char(crt_time,'yyyymmddhh24') from test; 
+------------+
|  to_char   |
+------------+
| 2012010112 |
+------------+
(1 row)

-- GMT时区查看
# set TimeZone='GMT'; 
SET

# select to_char(crt_time,'yyyymmddhh24') from test;
+------------+
|  to_char   |
+------------+
| 2012010104 |
+------------+
(1 row)

2、索引扫描

# explain select * from test where to_char(crt_time,'yyyymmddhh24')='2012010112';  
+------------------------------------------------------------------------------------+
|                                     QUERY PLAN                                     |
+------------------------------------------------------------------------------------+
| Bitmap Heap Scan on test  (cost=4.23..14.81 rows=10 width=12)                      |
|   Recheck Cond: (to_char(crt_time, 'yyyymmddhh24'::text) = '2012010112'::text)     |
|   ->  Bitmap Index Scan on idx_test  (cost=0.00..4.23 rows=10 width=0)             |
|         Index Cond: (to_char(crt_time, 'yyyymmddhh24'::text) = '2012010112'::text) |
+------------------------------------------------------------------------------------+
(4 rows)

# select * from test where to_char(crt_time,'yyyymmddhh24')='2012010112';
+----+------------------------+
| id |        crt_time        |
+----+------------------------+
|  1 | 2012-01-01 04:00:00+00 |
+----+------------------------+
(1 row)

3、顺序扫描

相同的where条件走全表扫得到的结果为空

set enable_indexscan=off;  
set enable_bitmapscan=off;  

# select * from test where to_char(crt_time,'yyyymmddhh24')='2012010112';
+----+----------+
| id | crt_time |
+----+----------+
+----+----------+
(0 rows)


# explain analyze select * from test where to_char(crt_time,'yyyymmddhh24')='2012010112';
+-------------------------------------------------------------------------------------------------+
|                                           QUERY PLAN                                            |
+-------------------------------------------------------------------------------------------------+
| Seq Scan on test  (cost=0.00..40.60 rows=10 width=12) (actual time=0.025..0.026 rows=0 loops=1) |
|   Filter: (to_char(crt_time, 'yyyymmddhh24'::text) = '2012010112'::text)                        |
|   Rows Removed by Filter: 1                                                                     |
| Planning Time: 0.092 ms                                                                         |
| Execution Time: 0.042 ms                                                                        |
+-------------------------------------------------------------------------------------------------+
(5 rows)

-- 查询2012010104则有结果, 返回一条记录  
# select * from test where to_char(crt_time,'yyyymmddhh24')='2012010104';
+----+------------------------+
| id |        crt_time        |
+----+------------------------+
|  1 | 2012-01-01 04:00:00+00 |
+----+------------------------+
(1 row)

4、原因

因为建立索引时, 这个to_char(crt_time,'yyyymmddhh24')是等于2012010112的,在索引中存储的是2012010112的结果;走全表扫描的话, 这个值 to_char(crt_time,'yyyymmddhh24')将被重新运算,结果是2012010104

索引中存储:

# select * from bt_page_items('idx_test',1) ;
+------------+-------+---------+-------+------+-------------------------------------------------+------+-------+--------+
| itemoffset | ctid  | itemlen | nulls | vars |                      data                       | dead | htid  |  tids  |
+------------+-------+---------+-------+------+-------------------------------------------------+------+-------+--------+
|          1 | (0,1) |      24 | f     | t    | 17 32 30 31 32 30 31 30 31 31 32 00 00 00 00 00 | f    | (0,1) | (null) |
+------------+-------+---------+-------+------+-------------------------------------------------+------+-------+--------+
(1 row)

ASCII转字符:`\u0017\u0032\u0030\u0031\u0032\u0030\u0031\u0030\u0031\u0031\u0032` -> '2012010112'

5、解决办法

法一:使用timestamp without time zone不带时区的时间戳类型

法二:增加一列, 用来存储需要建立的函数索引to_char()的结果值. 这样就不需要使用函数索引了, 也不需要考虑函数的strict问题

最后修改时间:2024-05-20 10:27:32
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论