前言:通过两个小案例来初步了解带时区的时间戳类型的利与弊,实际在进行数据库的设计时,可以根据情况来自行选择
1、时间戳类型
| Name | Storage Size | Description | Example |
|---|---|---|---|
timestamp [ (p) ] [ without time zone ] |
8 bytes | both date and time (no time zone) | 2024-05-08 15:15:34.154392 |
timestamp [ (p) ] with time zone |
8 bytes | both date and time, with time zone | 2024-05-08 15:15:38.34098+08 |
2024-05-08 15:15:38.34098+08:中+08代表时区,以UTC为偏移。即本地时间等于 UTC(协调世界时)加上时区偏移量
-
对于
timestamp with time zone,在存储timestamptz时,会自动将输入的时间转换为UTC时区来存储,查询时会自动从 UTC 转换成当前的timezone时区,并且显示为该时区的本地时间来展示; -
timestamp without time zone:不受时区的影响-- timestamptz受时区影响 # SELECT '2024-05-08 12:00:00+00'::timestamptz,'2024-05-08 12:00:00+08'::timestamptz ; +------------------------+------------------------+ | timestamptz | timestamptz | +------------------------+------------------------+ | 2024-05-08 20:00:00+08 | 2024-05-08 12:00:00+08 | +------------------------+------------------------+ (1 row) -- timestamp不受时区影响 # SELECT '2024-05-08 12:00:00+00'::timestamp,'2024-05-08 12:00:00+08'::timestamp ; +---------------------+---------------------+ | timestamp | timestamp | +---------------------+---------------------+ | 2024-05-08 12:00:00 | 2024-05-08 12:00:00 | +---------------------+---------------------+ (1 row)
UTC(Coordinated Universal Time -协调世界时)和GMT(Greenwich Mean Time - 格林尼治标准时间)常常被提到,GMT是前世界标准时,UTC是现世界标准时。GMT和UTC在不需要精确到秒的情况下,二者可以视为等同。但在科技和国际标准中,UTC因其更高的精确度和稳定性而更为常用。
2、夏令时
1)简单介绍
DST(Daylight Saving Time),夏令时又称夏季时间,或者夏时制。
它是为节约能源而人为规定地方时间的制度。一般在天亮早的夏季人为将时钟拨快一小时
在PostgreSQL数据库中处理时间类型数据时,尤其是涉及到时区的情况,夏令时(Daylight Saving Time, DST)的影响是非常关键的。PostgreSQL提供了对时区敏感的数据类型,主要是timestamp with time zone。这种数据类型可以在存储时自动考虑时区变化,包括夏令时的调整。
2)诡异案例
看到夏令时的影响如何体现在时间数据的存储和查询中:
CREATE TABLE dst_test(id serial, event_name varchar(100), insr_time timestamp without time zone, event_tt timestamp without time zone) ;
INSERT INTO dst_test(event_name,insr_time,event_tt)
VALUES ('Before DST','1940-05-31 23:59:59','1940-05-31 23:59:59'::timestamptz),
('After DST','1940-06-01 00:00:00','1940-06-01 00:00:00'::timestamptz);
# SELECT * from dst_test;
+----+------------+---------------------+---------------------+
| id | event_name | insr_time | event_tt |
+----+------------+---------------------+---------------------+
| 1 | Before DST | 1940-05-31 23:59:59 | 1940-05-31 23:59:59 |
| 2 | After DST | 1940-06-01 00:00:00 | 1940-06-01 01:00:00 |
+----+------------+---------------------+---------------------+
(2 rows)
揭秘:
ALTER TABLE dst_test ADD COLUMN event_tz timestamp with time zone ;
UPDATE dst_test SET event_tz = '1940-05-31 23:59:59'::timestamptz where insr_time = '1940-05-31 23:59:59';
UPDATE dst_test SET event_tz = '1940-06-01 00:00:00'::timestamptz where insr_time = '1940-06-01 00:00:00';
# SELECT * from dst_test;
+----+------------+---------------------+---------------------+------------------------+
| id | event_name | insr_time | event_tt | event_tz |
+----+------------+---------------------+---------------------+------------------------+
| 1 | Before DST | 1940-05-31 23:59:59 | 1940-05-31 23:59:59 | 1940-05-31 23:59:59+08 |
| 2 | After DST | 1940-06-01 00:00:00 | 1940-06-01 01:00:00 | 1940-06-01 01:00:00+09 |
+----+------------+---------------------+---------------------+------------------------+
(2 rows)
# \d dst_test
Table "public.dst_test"
+------------+-----------------------------+-----------+----------+--------------------------------------+
| Column | Type | Collation | Nullable | Default |
+------------+-----------------------------+-----------+----------+--------------------------------------+
| id | integer | | not null | nextval('dst_test_id_seq'::regclass) |
| event_name | character varying(100) | | | |
| insr_time | timestamp without time zone | | | |
| event_tt | timestamp without time zone | | | |
| event_tz | timestamp with time zone | | | |
+------------+-----------------------------+-----------+----------+--------------------------------------+
不要使用timestamp来存储带时区的时间类型的数据
timestamp with time zone类型的输出是以该时区的时间,如果要输出其它时区的时间,要么修改timezone,要么使用AT TIME ZONE
3)AT TIME ZONE
AT TIME ZONE 运算符将不带时区的时间戳与带时区的时间戳相互转换,以及将带时区值的时间转换为不同时区
# SELECT '2024-05-01 14:00:00+08'::timestamptz at time zone 'UTC' AS tz2ts, --本地时间14:00,查看UTC时区的时间
'2024-05-01 14:00:00'::timestamp at time zone 'UTC' as ts2tz ; --UTC时间14:00,显示本地时间
+---------------------+------------------------+
| tz2ts | ts2tz |
+---------------------+------------------------+
| 2024-05-01 06:00:00 | 2024-05-01 22:00:00+08 |
+---------------------+------------------------+
(1 row)
4)进阶版
感兴趣的可以自行研究下
# show timezone;
+----------+
| TimeZone |
+----------+
| PRC |
+----------+
(1 row)
# SELECT * from pg_timezone_names where name ~ 'PRC' ;
+------+--------+------------+--------+
| name | abbrev | utc_offset | is_dst |
+------+--------+------------+--------+
| PRC | CST | 08:00:00 | f |
+------+--------+------------+--------+
(1 row)
# SELECT '1940-06-01 00:00:00'::timestamp,
'1940-06-01 00:00:00'::timestamp at time zone 'UTC' ,
'1940-06-01 00:00:00'::timestamptz,
'1940-06-01 00:00:00'::timestamptz at time zone 'UTC' ;
+---------------------+------------------------+------------------------+---------------------+
| timestamp | timezone | timestamptz | timezone |
+---------------------+------------------------+------------------------+---------------------+
| 1940-06-01 00:00:00 | 1940-06-01 09:00:00+09 | 1940-06-01 01:00:00+09 | 1940-05-31 16:00:00 |
+---------------------+------------------------+------------------------+---------------------+
(1 row)
3、时区类型的数据对于全表扫和索引扫的案例
1)基础知识
函数稳定性:
在PostgreSQL/MogDB/openGauss中,函数稳定性分为三档:
-
immutable,超级稳定,任何时候调用,只要函数的参数不变结果就不变。如果参数为常量或者无参数,在生成执行计划时,直接将这类函数替换为常量。 -
stable,稳定,在一个事务中调用时,只要函数的参数不变结果就不变。pg中,只有stable和immutable的函数才可以被执行计划选择作为索引的比较条件 -
volatile,不稳定。同一个QUERY中,如果需要返回该函数的结果,那么每一行都会运算一遍这个函数
函数索引需要immutable函数
函数索引一定要immutable的函数。如果一个时间字段上要创建to_char(timestamp without time zone, text)的函数索引。 是没有办法创建的,报错:
ERROR: functions in index expression must be marked IMMUTABLE
# select distinct(proname),provolatile from pg_proc where proname='to_char';
+---------+-------------+
| proname | provolatile |
+---------+-------------+
| to_char | s |
+---------+-------------+
(1 row)
2)时区不同导致的走索引和走全表扫描得到结果不一致 案例
来源于digoal德哥案例
1、创建测试表及测试数据
# show TimeZone;
+----------+
| TimeZone |
+----------+
| PRC |
+----------+
(1 row)
create table test (id int,crt_time timestamp(0) with time zone);
# create index idx_test on test (to_char(crt_time,'yyyymmddhh24'));
ERROR: 42P17: functions in index expression must be marked IMMUTABLE
LOCATION: ComputeIndexAttrs, indexcmds.c:1946
Time: 10.656 ms
-- 两种办法:一种是直接修改该函数的稳定性,另一种是创建一个IMMUTABLE类型的函数
-- 两种方法对于该实验的结果是相同的,以下采用修改函数稳定性的方法。另一种方法读者可以自行验证
alter function to_char(timestamp with time zone,text) strict immutable;
create index idx_test on test (to_char(crt_time,'yyyymmddhh24'));
insert into test values (1,'2012-01-01 12:00:00');
-- PRC时区
# select to_char(crt_time,'yyyymmddhh24') from test;
+------------+
| to_char |
+------------+
| 2012010112 |
+------------+
(1 row)
-- GMT时区查看
# set TimeZone='GMT';
SET
# select to_char(crt_time,'yyyymmddhh24') from test;
+------------+
| to_char |
+------------+
| 2012010104 |
+------------+
(1 row)
2、索引扫描
# explain select * from test where to_char(crt_time,'yyyymmddhh24')='2012010112';
+------------------------------------------------------------------------------------+
| QUERY PLAN |
+------------------------------------------------------------------------------------+
| Bitmap Heap Scan on test (cost=4.23..14.81 rows=10 width=12) |
| Recheck Cond: (to_char(crt_time, 'yyyymmddhh24'::text) = '2012010112'::text) |
| -> Bitmap Index Scan on idx_test (cost=0.00..4.23 rows=10 width=0) |
| Index Cond: (to_char(crt_time, 'yyyymmddhh24'::text) = '2012010112'::text) |
+------------------------------------------------------------------------------------+
(4 rows)
# select * from test where to_char(crt_time,'yyyymmddhh24')='2012010112';
+----+------------------------+
| id | crt_time |
+----+------------------------+
| 1 | 2012-01-01 04:00:00+00 |
+----+------------------------+
(1 row)
3、顺序扫描
相同的where条件走全表扫得到的结果为空
set enable_indexscan=off;
set enable_bitmapscan=off;
# select * from test where to_char(crt_time,'yyyymmddhh24')='2012010112';
+----+----------+
| id | crt_time |
+----+----------+
+----+----------+
(0 rows)
# explain analyze select * from test where to_char(crt_time,'yyyymmddhh24')='2012010112';
+-------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+-------------------------------------------------------------------------------------------------+
| Seq Scan on test (cost=0.00..40.60 rows=10 width=12) (actual time=0.025..0.026 rows=0 loops=1) |
| Filter: (to_char(crt_time, 'yyyymmddhh24'::text) = '2012010112'::text) |
| Rows Removed by Filter: 1 |
| Planning Time: 0.092 ms |
| Execution Time: 0.042 ms |
+-------------------------------------------------------------------------------------------------+
(5 rows)
-- 查询2012010104则有结果, 返回一条记录
# select * from test where to_char(crt_time,'yyyymmddhh24')='2012010104';
+----+------------------------+
| id | crt_time |
+----+------------------------+
| 1 | 2012-01-01 04:00:00+00 |
+----+------------------------+
(1 row)
4、原因
因为建立索引时, 这个to_char(crt_time,'yyyymmddhh24')是等于2012010112的,在索引中存储的是2012010112的结果;走全表扫描的话, 这个值 to_char(crt_time,'yyyymmddhh24')将被重新运算,结果是2012010104
索引中存储:
# select * from bt_page_items('idx_test',1) ;
+------------+-------+---------+-------+------+-------------------------------------------------+------+-------+--------+
| itemoffset | ctid | itemlen | nulls | vars | data | dead | htid | tids |
+------------+-------+---------+-------+------+-------------------------------------------------+------+-------+--------+
| 1 | (0,1) | 24 | f | t | 17 32 30 31 32 30 31 30 31 31 32 00 00 00 00 00 | f | (0,1) | (null) |
+------------+-------+---------+-------+------+-------------------------------------------------+------+-------+--------+
(1 row)
ASCII转字符:`\u0017\u0032\u0030\u0031\u0032\u0030\u0031\u0030\u0031\u0031\u0032` -> '2012010112'
5、解决办法
法一:使用timestamp without time zone不带时区的时间戳类型
法二:增加一列, 用来存储需要建立的函数索引to_char()的结果值. 这样就不需要使用函数索引了, 也不需要考虑函数的strict问题




