暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL jsonpath使用实践

晟数学苑 2021-06-29
416

点击蓝字|关注我们

  jsonpath是用来解析json数据的工具,类似于xpath,jsonpath可以解析十分复杂的json数据。


PostgreSQL json发展历史:

  PostgreSQL从9.2开始就支持json数据类型,但是由于解析json数据的性能很差,导致并不受大家青睐,而是选择使用nosql数据库代替。于是从pg9.4开始支持了jsonb数据类型,相较于json类型,jsonb由于并不需要每次使用时都去进行解析,因此性能提升很多,都是还支持索引查询等。


  而从pg12开始对于json的支持更加强大:sql 2016的sql/json标准有15条, PG 12 实现了14条, 远远超过oracle(18c 11/15), mysql(8.0.4 5/15), sqlserver(2017 2/15)最新版本。


  同时在pg12中引入了jsonpath类型,以及一系列相关的函数,使得json数据的查询性能更进一步,功能也愈发强大。


JSONPATH语法:

JSONpath 函数表达式语法如下:


.点号 . 表示引用 Json 数据的元素

.方括号 [] 表示引用数组元素

.Json 数据中的数组元素下标从0开始

JSONpath中的变量如下:


.$ 符号表示要查询的Json文本的变量

.$varname 表示指定变量

.@ 指在 filter 表达式中表示当前路径元素的变量

JSONPATH使用举例:

简单查询:

    bill@bill=>SELECT jsonb_path_query_array('[1,2,3,4,5]', '$[*] ? (@ > 3)');
    jsonb_path_query_array
    ------------------------
    [4, 5]
    (1 row)

    创建测试表:

      CREATE TABLE house(js jsonb);
      INSERT INTO house VALUES
      ('{
      "address": {
      "city":"Moscow",
      "street": "Ulyanova, 7A"
      },
      "lift": false,
      "floor": [
      {
      "level": 1,
      "apt": [
      {"no": 1, "area": 40, "rooms": 1},
      {"no": 2, "area": 80, "rooms": 3},
      {"no": 3, "area": 50, "rooms": 2}
      ]
      },
      {
      "level": 2,
      "apt": [
      {"no": 4, "area": 100, "rooms": 3},
      {"no": 5, "area": 60, "rooms": 2}
      ]
      }
      ]
      }');

      查询:

        bill@bill=>select jsonb_pretty(js) from house ;
        jsonb_pretty
        ----------------------------------
        { +
        "lift": false, +
        "floor": [ +
        { +
        "apt": [ +
        { +
        "no": 1, +
        "area": 40, +
        "rooms": 1 +
        }, +
        { +
        "no": 2, +
        "area": 80, +
        "rooms": 3 +
        }, +
        { +
        "no": 3, +
        "area": 50, +
        "rooms": 2 +
        } +
        ], +
        "level": 1 +
        }, +
        { +
        "apt": [ +
        { +
        "no": 4, +
        "area": 100,+
        "rooms": 3 +
        }, +
        { +
        "no": 5, +
        "area": 60, +
        "rooms": 2 +
        } +
        ], +
        "level": 2 +
        } +
        ], +
        "address": { +
        "city": "Moscow", +
        "street": "Ulyanova, 7A"+
        } +
        }
        (1 row)

        该数据的层次结构如下图:

          看上去该数据层次挺复杂的,但是实际中可能数据层次远比这个复杂的多,那么我们看看如何使用jsonpath来进行查询的:

          bill@bill=>SELECT jsonb_path_query_array(js, '$.floor[0, 1].apt[1 to last]')  FROM house;
          jsonb_path_query_array
          -----------------------------------------------------------------------------------------------------------
          [{"no": 2, "area": 80, "rooms": 3}, {"no": 3, "area": 50, "rooms": 2}, {"no": 5, "area": 60, "rooms": 2}]
          (1 row)

          而如果不用jsonpath的话,我们可能需要这么写:

            bill@bill=>SELECT jsonb_agg(apt) FROM (
            bill(# SELECT apt->generate_series(1, jsonb_array_length(apt) - 1) FROM (
            bill(# SELECT js->'floor'->unnest(array[0, 1])->'apt' FROM house
            bill(# ) apts(apt)
            bill(# ) apts(apt);
            jsonb_agg
            -----------------------------------------------------------------------------------------------------------
            [{"no": 2, "area": 80, "rooms": 3}, {"no": 3, "area": 50, "rooms": 2}, {"no": 5, "area": 60, "rooms": 2}]
            (1 row)

            相比之下,使用jsonpath相关的函数查询简便太多了。

            又比如,我们需要判断json数据中是否包含某个值,可以这样:

              bill@bill=>SELECT jsonb_path_exists(js, '$.** ? (@ == "Moscow")') FROM house;
              jsonb_path_exists
              -------------------
              t
              (1 row)

              而如果不使用jsonpath呢?

                bill@bill=>WITH RECURSIVE t(value) AS (
                bill(# SELECT * FROM house UNION ALL (
                bill(# SELECT COALESCE(kv.value, e.value) AS value
                bill(# FROM t
                bill(# LEFT JOIN LATERAL jsonb_each (
                bill(# CASE WHEN jsonb_typeof(t.value) = 'object' THEN t.value
                bill(# ELSE NULL END
                bill(# ) kv ON true
                bill(# LEFT JOIN LATERAL jsonb_array_elements (
                bill(# CASE WHEN jsonb_typeof(t.value) = 'array' THEN t.value
                bill(# ELSE NULL END
                bill(# ) e ON true
                bill(# WHERE kv.value IS NOT NULL OR e.value IS NOT NULL
                bill(# )
                bill(# ) SELECT EXISTS (SELECT 1 FROM t WHERE value = '"Moscow"');
                exists
                --------
                t
                (1 row)

                  那么如何使用jsonpath进行数据过滤呢?已上面这张表为例,我们查询apt.no大于3的数据:

                  bill@bill=>SELECT jsonb_path_query(js, '$.floor.apt.no ? (@>3)') FROM house;
                  jsonb_path_query
                  ------------------
                  4
                  5
                  (2 rows)

                  同样,jsonpath也支持索引的使用:

                    bill@bill=>CREATE INDEX ON house USING gin (js);
                    CREATE INDEX
                    bill@bill=>SET ENABLE_SEQSCAN TO OFF;
                    SET
                    bill@bill=>EXPLAIN (COSTS OFF) SELECT * FROM house
                    bill-# WHERE js @? '$.floor[*].apt[*] ? (@.rooms == 3)';
                    QUERY PLAN
                    --------------------------------------------------------------------------------
                    Bitmap Heap Scan on house
                    Recheck Cond: (js @? '$."floor"[*]."apt"[*]?(@."rooms" == 3)'::jsonpath)
                    -> Bitmap Index Scan on house_js_idx
                    Index Cond: (js @? '$."floor"[*]."apt"[*]?(@."rooms" == 3)'::jsonpath)
                    (4 rows)

                    除此之外,jsonpath支持了20多种相关的函数,是不是十分强大,赶快用起来吧!

                      postgres=# \df *.*json*path*
                      List of functions
                      Schema | Name | Result data type | Argument data types
                      | Type
                      ------------+------------------------------+------------------+------------------------------------------------------------------------------------------
                      -+------
                      pg_catalog | gin_consistent_jsonb_path | boolean | internal, smallint, jsonb, integer, internal, internal, internal, internal
                      | func
                      pg_catalog | gin_extract_jsonb_path | internal | jsonb, internal, internal
                      | func
                      pg_catalog | gin_extract_jsonb_query_path | internal | jsonb, internal, smallint, internal, internal, internal, internal
                      | func
                      pg_catalog | gin_triconsistent_jsonb_path | "char" | internal, smallint, jsonb, integer, internal, internal, internal
                      | func
                      pg_catalog | json_extract_path | json | from_json json, VARIADIC path_elems text[]
                      | func
                      pg_catalog | json_extract_path_text | text | from_json json, VARIADIC path_elems text[]
                      | func
                      pg_catalog | jsonb_delete_path | jsonb | jsonb, text[]
                      | func
                      pg_catalog | jsonb_extract_path | jsonb | from_json jsonb, VARIADIC path_elems text[]
                      | func
                      pg_catalog | jsonb_extract_path_text | text | from_json jsonb, VARIADIC path_elems text[]
                      | func
                      pg_catalog | jsonb_path_exists | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
                      | func
                      pg_catalog | jsonb_path_exists_opr | boolean | jsonb, jsonpath
                      | func
                      pg_catalog | jsonb_path_exists_tz | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
                      | func
                      pg_catalog | jsonb_path_match | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
                      | func
                      pg_catalog | jsonb_path_match_opr | boolean | jsonb, jsonpath
                      | func
                      pg_catalog | jsonb_path_match_tz | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
                      | func
                      pg_catalog | jsonb_path_query | SETOF jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
                      | func
                      pg_catalog | jsonb_path_query_array | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
                      | func
                      pg_catalog | jsonb_path_query_array_tz | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
                      | func
                      pg_catalog | jsonb_path_query_first | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
                      | func
                      pg_catalog | jsonb_path_query_first_tz | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
                      | func
                      pg_catalog | jsonb_path_query_tz | SETOF jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
                      | func
                      pg_catalog | jsonpath_in | jsonpath | cstring
                      | func
                      pg_catalog | jsonpath_out | cstring | jsonpath
                      | func
                      pg_catalog | jsonpath_recv | jsonpath | internal
                      | func
                      pg_catalog | jsonpath_send | bytea | jsonpath
                      | func
                      (25 rows)

                      参考链接:

                      https://github.com/digoal/blog/blob/master/202010/20201013_01.md

                      http://www.postgres.cn/docs/12/functions-json.html

                      http://www.postgres.cn/docs/12/datatype-json.html#DATATYPE-JSONPATH



                      推荐阅读

                      Oracle12c连接pdb并创建用户

                       2021-06-15

                      使用Oracle数据泵导出时运行异常缓慢

                      2021-05-14


                      这里“阅读原文”,查看更多

                      文章转载自晟数学苑,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

                      评论