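In a distributed database system, the ordering of transactions across different database nodes must be resolved, because both MVCC and ACID guarantees depend on a well-defined transaction order. Common schemes for establishing that order include: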
Hybrid Logical Clock
TSO - Timestamp Oracle
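Here we discuss how TSO is implemented.
With TSO (Timestamp Oracle), timestamps are issued by a single central authority. Central allocation guarantees that logical clock values are handed out in increasing order and that no two requests ever receive the same value, so transaction version numbers increase monotonically and the order of distributed transactions is well defined.
TiDB, a prominent open-source distributed database from China, uses a centralized TSO service to obtain globally consistent version numbers. The TSO module lives in PD, TiDB's global central control node. PD embeds etcd, which provides strong consistency for persisted data and automatic failover, addressing the single point of failure that a centralized service would otherwise introduce.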
The timestamp oracle plays a significant role in the Percolator transaction model. It is a server that hands out timestamps in strictly increasing order, a property required for correct operation of the snapshot isolation protocol.
Since every transaction requires contacting the timestamp oracle twice, this service must scale well. The timestamp oracle periodically allocates a range of timestamps by writing the highest allocated timestamp to stable storage; then with that allocated range of timestamps, it can satisfy future requests strictly from memory. If the timestamp oracle restarts, the timestamps will jump forward to the maximum allocated timestamp. Timestamps never go "backwards".
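The range-allocation scheme described above can be sketched as follows. This is a minimal illustration, not PD's or Percolator's actual code: `StableStorage`, `TimestampOracle`, and the `window` parameter are hypothetical names, and the "stable storage" is an in-memory stand-in for a durable, replicated store.

```python
import threading


class StableStorage:
    """Stand-in for durable storage (hypothetical; a real oracle would
    persist to a replicated store)."""

    def __init__(self):
        self.max_allocated = 0
        self.writes = 0  # counts how often we had to touch "disk"

    def save(self, value):
        self.max_allocated = value
        self.writes += 1

    def load(self):
        return self.max_allocated


class TimestampOracle:
    def __init__(self, storage, window=1000):
        self._storage = storage
        self._window = window
        self._lock = threading.Lock()
        # On (re)start, resume just past the persisted high-water mark so
        # no timestamp issued before a crash can ever be issued again.
        self._limit = storage.load()
        self._next = self._limit + 1

    def get_timestamp(self):
        with self._lock:
            if self._next > self._limit:
                # Reserve a new range by persisting its upper bound first;
                # later requests in this range are served from memory only.
                self._limit = self._next + self._window - 1
                self._storage.save(self._limit)
            ts = self._next
            self._next += 1
            return ts


storage = StableStorage()
oracle = TimestampOracle(storage)
print([oracle.get_timestamp() for _ in range(3)])  # [1, 2, 3]

# Simulate a crash and restart: the unused part of the range is skipped.
restarted = TimestampOracle(storage)
print(restarted.get_timestamp())  # 1001
```

Only one storage write is needed per `window` timestamps, which is what lets the oracle answer most requests from memory. The price is a gap in the sequence after a restart, which is harmless because the protocol needs only ordering, not density.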
To save RPC overhead (at the cost of increasing transaction latency) each timestamp requester batches timestamp requests across transactions by maintaining only one pending RPC to the oracle. As the oracle becomes more loaded, the batching naturally increases to compensate. Batching increases the scalability of the oracle but does not affect the timestamp guarantees.
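The batching idea can be illustrated with a small, single-threaded sketch. All names here (`RangeOracle`, `BatchingRequester`, `flush`) are assumptions made for the example; in a real client, `flush` would be triggered whenever the previous RPC to the oracle completes, rather than being called by hand.

```python
class RangeOracle:
    """Toy oracle that serves several timestamps per call (one call = one RPC)."""

    def __init__(self):
        self._next = 1
        self.rpc_count = 0

    def get_timestamps(self, count):
        self.rpc_count += 1
        start = self._next
        self._next += count
        return list(range(start, start + count))


class BatchingRequester:
    """Queues timestamp requests and serves them with one RPC per flush."""

    def __init__(self, oracle):
        self._oracle = oracle
        self._pending = []  # callbacks waiting for a timestamp

    def request(self, callback):
        # Requests arriving while an RPC is outstanding simply queue up.
        self._pending.append(callback)

    def flush(self):
        # Take the whole queue and satisfy it with a single RPC.
        batch, self._pending = self._pending, []
        if not batch:
            return
        timestamps = self._oracle.get_timestamps(len(batch))
        for callback, ts in zip(batch, timestamps):
            callback(ts)


oracle = RangeOracle()
requester = BatchingRequester(oracle)
received = []

for _ in range(3):
    requester.request(received.append)
requester.flush()

print(received)          # [1, 2, 3]
print(oracle.rpc_count)  # 1
```

The slower the oracle responds, the more requests accumulate between flushes, so the batch size grows with load without any explicit tuning. Each transaction still receives a distinct timestamp from a strictly increasing sequence.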
The transaction protocol uses strictly increasing timestamps to guarantee that Get() returns all committed writes before the transaction's start timestamp. To see how it provides this guarantee, consider a transaction R reading at timestamp TR and a transaction W that committed at timestamp TW < TR; we will show that R sees W's writes. Since TW < TR, we know that the timestamp oracle gave out TW before or in the same batch as TR; hence, W requested TW before R received TR. We know that R can't do reads before receiving its start timestamp TR and that W wrote locks before requesting its commit timestamp TW. Therefore, the above property guarantees that W must have at least written all its locks before R did any reads; R's Get() will see either the fully committed write record or the lock, in which case R will block until the lock is released. Either way, W's write is visible to R's Get().
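The two outcomes in this argument (R sees the lock, or R sees the committed write) can be shown with a toy single-key model. This is a simplified sketch, not Percolator's data layout: `Cell` and `KeyLocked` are hypothetical names, and raising an exception stands in for the reader blocking until the lock is released.

```python
class KeyLocked(Exception):
    """Raised where a real reader would wait for the lock to be released."""


class Cell:
    """State of one key: committed versions plus at most one pending lock."""

    def __init__(self):
        self.writes = {}     # commit_ts -> value
        self.lock_ts = None  # start_ts of the transaction holding the lock

    def get(self, start_ts):
        # A lock at or below start_ts may belong to a transaction whose
        # commit timestamp is below start_ts, so the reader must wait.
        if self.lock_ts is not None and self.lock_ts <= start_ts:
            raise KeyLocked(self.lock_ts)
        visible = [ts for ts in self.writes if ts <= start_ts]
        return self.writes[max(visible)] if visible else None


cell = Cell()

# W (start_ts = 1) has written its lock and obtained TW = 2, but has not
# yet written the commit record. R reads at TR = 3 and finds the lock.
cell.lock_ts = 1
try:
    cell.get(3)
except KeyLocked:
    print("R must wait for W's lock")

# W finishes committing at TW = 2 and releases the lock.
cell.writes[2] = "value written by W"
cell.lock_ts = None
print(cell.get(3))  # value written by W
```

In neither case can R return a result that silently omits W's write, which is the guarantee the paragraph above establishes.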
In our system, the timestamp oracle has been embedded into the Placement Driver (PD). PD is the management component with a "God view" of the cluster and is responsible for storing metadata and conducting load balancing.
Google's 2010 paper "Large-scale Incremental Processing Using Distributed Transactions and Notifications" describes the implementation of the Percolator system in detail; that system also uses a centralized TSO to allocate timestamps.