This repository has been archived by the owner on Jun 12, 2020. It is now read-only.

Two Phase Commit with TokuDB

prohaska edited this page Dec 26, 2014 · 2 revisions

Step 1: TokuDB Prepare

TokuDB writes a prepare event to its recovery log and fsyncs the recovery log using a group fsync algorithm. With a group fsync, a single fsync can cover the prepare events of many concurrent transactions, so throughput scales with the number of threads.
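The effect of a group fsync can be illustrated with a small toy model. This is only a sketch of the batching idea, not TokuDB's implementation: the class `GroupFsyncLog` and its methods are hypothetical names, and the "fsync" is simulated by moving records between two in-memory lists.

```python
class GroupFsyncLog:
    """Toy model of a log whose fsync covers every record queued so far.

    Hypothetical illustration only; no real I/O is performed.
    """

    def __init__(self):
        self.pending = []      # records written but not yet durable
        self.durable = []      # records covered by a completed fsync
        self.fsync_count = 0   # number of simulated fsync calls

    def append(self, record):
        # A committing thread adds its record to the shared log buffer.
        self.pending.append(record)

    def group_fsync(self):
        # One fsync makes all queued records durable at once.
        if not self.pending:
            return
        self.durable.extend(self.pending)
        self.pending.clear()
        self.fsync_count += 1


log = GroupFsyncLog()
for txn_id in range(8):            # eight transactions reach prepare together
    log.append(("prepare", txn_id))
log.group_fsync()
print(len(log.durable), log.fsync_count)
```

In this model eight prepare events become durable at the cost of one fsync, which is why adding threads can raise throughput instead of adding one fsync per transaction.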

If the fsync of TokuDB's recovery log is skipped, crash recovery cannot guarantee that the TokuDB fractal trees and the binlog are consistent.

Step 2: Binlog Write

MySQL writes the transaction's write events, along with the transaction identifier, to the binlog and fsyncs the binlog files using a group fsync algorithm (in MySQL versions later than 5.5). Because MySQL also uses a group fsync, throughput scales with the number of threads.

Step 3: TokuDB Commit

TokuDB writes a commit event to its recovery log. An fsync of the TokuDB recovery log during the commit is NOT necessary.
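The three steps can be sketched as a short sequence of log writes. This is a minimal model under stated assumptions: `ToyLog` and `two_phase_commit` are hypothetical names, the logs are in-memory lists, and `fsync` only records how many events are considered durable.

```python
class ToyLog:
    """In-memory stand-in for a recovery log or binlog (hypothetical)."""

    def __init__(self):
        self.events = []   # everything written so far
        self.synced = 0    # number of leading events that are durable

    def write(self, event):
        self.events.append(event)

    def fsync(self):
        # Simulated fsync: everything written so far becomes durable.
        self.synced = len(self.events)


def two_phase_commit(txn_id, write_events, tokudb_log, binlog):
    # Step 1: prepare in TokuDB, fsync required.
    tokudb_log.write(("prepare", txn_id))
    tokudb_log.fsync()
    # Step 2: write the transaction to the binlog, fsync required.
    binlog.write((txn_id, write_events))
    binlog.fsync()
    # Step 3: commit in TokuDB, no fsync needed.
    tokudb_log.write(("commit", txn_id))


tokudb_log, binlog = ToyLog(), ToyLog()
two_phase_commit(7, ["insert row"], tokudb_log, binlog)
print(tokudb_log.events, tokudb_log.synced, binlog.synced)
```

After the call, the TokuDB log holds a prepare and a commit event but only the prepare is marked durable. That is acceptable because the durable prepare plus the durable binlog entry are enough for recovery to redo the commit.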

Recovery from crash between steps 1 and 2

The transaction is prepared in TokuDB but is not in the binlog. MySQL asks TokuDB for the identity of all of its prepared but not committed transactions. TokuDB returns this transaction since it was prepared. MySQL then looks up the transaction in its last binlog file. Since the binlog was not yet written for this transaction, MySQL does not find it there. In this case, MySQL will ROLLBACK the transaction in TokuDB.

Recovery from crash between steps 2 and 3

The transaction is prepared in TokuDB but may or may not exist in the binlog, depending on whether the fsync of the binlog files completed.

If the prepared transaction exists in the binlog, MySQL will rerun the TokuDB commit. Otherwise, MySQL will roll back the transaction.

Recovery from crash after step 3

Since the transaction has been committed in TokuDB, it must exist in the binlog so there is nothing to do during recovery.
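The recovery rules above reduce to one decision per prepared transaction: commit it if it appears in the binlog, roll it back otherwise. The sketch below shows that decision only. The function `recover` and its parameters are hypothetical names, and the real server works with XA transaction identifiers read from the last binlog file, not plain integers.

```python
def recover(prepared_txn_ids, binlog_txn_ids):
    """Return {txn_id: "commit" | "rollback"} for each prepared transaction.

    prepared_txn_ids: transactions the engine reports as prepared but not committed.
    binlog_txn_ids:   transactions found in the last binlog file.
    """
    in_binlog = set(binlog_txn_ids)
    return {
        txn_id: ("commit" if txn_id in in_binlog else "rollback")
        for txn_id in prepared_txn_ids
    }


# 100 committed before the crash, so the engine does not report it as prepared.
# 101 crashed between steps 1 and 2; 102 crashed between steps 2 and 3.
decisions = recover(prepared_txn_ids=[101, 102], binlog_txn_ids=[100, 102])
print(decisions)  # {101: 'rollback', 102: 'commit'}
```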
