作者
digoal
日期
2021-01-22
标签
PostgreSQL , BzTree , 非易失内存
背景
Bztree: a high-performance latch-free range index for non-volatile memory
Storing a database (rows and indexes) entirely in non-volatile memory (NVM) potentially enables both high performance and fast recovery. To fully exploit parallelism on modern CPUs, modern main-memory databases use latch-free (lock-free) index structures, e.g. Bw-tree or skip lists. To achieve high performance NVM-resident indexes also need to be latch-free. This paper describes the design of the BzTree, a latch-free B-tree index designed for NVM. The BzTree uses a persistent multi-word compare-and-swap operation (PMwCAS) as a core building block, enabling an index design that has several important advantages compared with competing index structures such as the Bw-tree. First, the BzTree is latch-free yet simple to implement. Second, the BzTree is fast - showing up to 2x higher throughput than the Bw-tree in our experiments. Third, the BzTree does not require any special-purpose recovery code. Recovery is near-instantaneous and only involves rolling back (or forward) any PMwCAS operations that were in-flight during failure. Our end-to-end recovery experiments of BzTree report an average recovery time of 145 μs. Finally, the same BzTree implementation runs seamlessly on both volatile RAM and NVM, which greatly reduces the cost of code maintenance.
postgresql bztree索引接口
https://github.com/postgrespro/bztree
```
[BzTree] (git@github.com:sfu-dis/bztree.git) BzTree FDW for Postgres.
Usage:
create extension bztree;
create index on t using bztree(pk);
```
bztree库实现
https://github.com/sfu-dis/bztree
BzTree
An open-source BzTree implementation, used in our VLDB paper:
Lucas Lersch, Xiangpeng Hao, Ismail Oukid, Tianzheng Wang, Thomas Willhalm:
Evaluating Persistent Memory Range Indexes. PVLDB 13(4): 574-587 (2019)
Build
Use PMDK
bash
mkdir build & cd build
cmake -DPMEM_BACKEND=PMDK ..
Volatile only
bash
mkdir build & cd build
cmake -DPMEM_BACKEND=VOLATILE ..
Other build options
-DPMEM_BACKEND=EMU to emulate persistent memory using DRAM
-DGOOGLE_FRAMEWORK=0 if you're not comfortable with google frameworks (gtest/glog/gflags)
-DBUILD_TESTS=0 to build shared library only (without tests)
-DMAX_FREEZE_RETRY=n to set max freeze retry times, default to 1, check the original paper for details
-DENABLE_MERGE=1 to enable merge after delete, this is disabled by default, check the original paper for details.
Benchmark on PiBench
We officially support bztree wrapper for pibench:
bash
make bztree_pibench_wrapper -j
Checkout PiBench here: https://github.com/wangtzh/pibench
Build/Create BzTree shared lib
bash
mkdir Release & cd Release
cmake -DCMAKE_BUILD_TYPE=Release -DPMEM_BACKEND=${BACKEND} -DGOOGLE_FRAMEWORK=0 -DBUILD_TESTS=0 ..
论文
http://www.vldb.org/pvldb/vol13/p574-lersch.pdf
https://dl.acm.org/doi/10.1145/3164135.3164147
PostgreSQL 许愿链接
您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.
9.9元购买3个月阿里云RDS PostgreSQL实例
PostgreSQL 解决方案集合
德哥 / digoal's github - 公益是一辈子的事.





