WriteUnPrepared: Enable WAL during crash recovery (facebook#6418)
Summary:
Unfortunately, it seems that mysqld reuses xids across machine restarts. When that happens, a sequence like the following is possible:

```
BEGIN_PREPARE(unprepared) Put(a) END_PREPARE(xid = 1)
-- crash and recover with Put(a) rolled back as it was not prepared
BEGIN_PREPARE(prepared) Put(b) END_PREPARE(xid = 1)
COMMIT(xid = 1)
-- crash and recover with both a, b
```

To solve this, we will have to log the rollback batch into the WAL during recovery.

WritePrepared already logs the rollback batch into the WAL if a rollback happens after prepare, so there is no problem there.
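To make the failure mode concrete, here is a toy model of WAL replay. It is not RocksDB code: `Record`, `Pending`, `Recover`, and `RunScenario` are hypothetical names, and a single `kRollback` record stands in for the logged rollback batch. It only assumes that replay groups writes by xid and applies them when a commit for that xid is seen.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy WAL record kinds (hypothetical, for illustration only).
enum class Kind { kUnpreparedPut, kPreparedPut, kCommit, kRollback };

struct Record {
  Kind kind;
  int xid;
  std::string key;  // empty for kCommit and kRollback
};

struct Pending {
  std::vector<std::string> keys;
  bool prepared = false;
};

// Replays the log and returns the keys visible after recovery.
// Transactions left unprepared at the end of the log are rolled back;
// the rollback is appended to the log only if log_rollback is true.
std::set<std::string> Recover(std::vector<Record>& wal, bool log_rollback) {
  std::map<int, Pending> pending;
  std::set<std::string> db;
  for (const Record& r : wal) {
    switch (r.kind) {
      case Kind::kUnpreparedPut:
        pending[r.xid].keys.push_back(r.key);
        break;
      case Kind::kPreparedPut:
        pending[r.xid].keys.push_back(r.key);
        pending[r.xid].prepared = true;
        break;
      case Kind::kCommit:
        for (const std::string& k : pending[r.xid].keys) db.insert(k);
        pending.erase(r.xid);
        break;
      case Kind::kRollback:
        pending.erase(r.xid);
        break;
    }
  }
  for (const auto& entry : pending) {
    if (!entry.second.prepared && log_rollback) {
      wal.push_back({Kind::kRollback, entry.first, ""});
    }
  }
  return db;
}

// Reproduces the sequence above: xid 1 is used before and after a crash.
std::set<std::string> RunScenario(bool log_rollback) {
  std::vector<Record> wal;
  wal.push_back({Kind::kUnpreparedPut, 1, "a"});
  Recover(wal, log_rollback);  // first crash: Put(a) is rolled back
  wal.push_back({Kind::kPreparedPut, 1, "b"});
  wal.push_back({Kind::kCommit, 1, ""});
  return Recover(wal, log_rollback);  // second crash
}
```

In this model, `RunScenario(false)` recovers both `a` and `b`, because the second replay cannot tell that `Put(a)` was already rolled back. `RunScenario(true)` recovers only `b`.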
Pull Request resolved: facebook#6418

Differential Revision: D19896151

Pulled By: lth

fbshipit-source-id: 2ff65ddc5fe75efd57736fed4b7cd7a109d26609
lth authored and facebook-github-bot committed Feb 14, 2020
1 parent ac8e89a commit fb57150
Showing 1 changed file with 16 additions and 3 deletions.
19 changes: 16 additions & 3 deletions utilities/transactions/write_unprepared_txn_db.cc
@@ -21,10 +21,23 @@ Status WriteUnpreparedTxnDB::RollbackRecoveredTransaction(
   assert(rtxn->unprepared_);
   auto cf_map_shared_ptr = WritePreparedTxnDB::GetCFHandleMap();
   auto cf_comp_map_shared_ptr = WritePreparedTxnDB::GetCFComparatorMap();
+  // In theory we could write with disableWAL = true during recovery, and
+  // assume that if we crash again during recovery, we can just replay from
+  // the very beginning. Unfortunately, the XIDs from the application may not
+  // necessarily be unique across restarts, potentially leading to situations
+  // like this:
+  //
+  // BEGIN_PREPARE(unprepared) Put(a) END_PREPARE(xid = 1)
+  // -- crash and recover with Put(a) rolled back as it was not prepared
+  // BEGIN_PREPARE(prepared) Put(b) END_PREPARE(xid = 1)
+  // COMMIT(xid = 1)
+  // -- crash and recover with both a, b
+  //
+  // We could just write the rollback marker, but then we would have to extend
+  // MemTableInserter during recovery to actually do writes into the DB
+  // instead of just dropping the in-memory write batch.
+  //
   WriteOptions w_options;
-  // If we crash during recovery, we can just recalculate and rewrite the
-  // rollback batch.
-  w_options.disableWAL = true;
 
   class InvalidSnapshotReadCallback : public ReadCallback {
    public: