op-conductor is an auxiliary service designed to enhance the reliability and availability of a sequencer in high-availability setups, thereby minimizing the risk of a single point of failure. Note, however, that this design does not incorporate Byzantine fault tolerance: it operates under the assumption that all participating nodes are honest.
The design provides the following guarantees:
- No unsafe reorgs
- No unsafe head stall during a network partition
- 100% uptime with no more than 1 node failure (for a standard 3-node setup)
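
The "no more than 1 node failure" figure for a 3-node setup follows from standard raft majority arithmetic. The sketch below only illustrates that arithmetic; the function names `quorum` and `toleratedFailures` are made up for this example and are not part of op-conductor.

```go
package main

import "fmt"

// quorum returns the number of voters that must agree in a raft cluster
// of n voters (a strict majority).
func quorum(n int) int {
	return n/2 + 1
}

// toleratedFailures returns how many voters can fail while the remaining
// voters still form a quorum. Hypothetical helper, for illustration only.
func toleratedFailures(n int) int {
	if n < 1 {
		return 0
	}
	return n - quorum(n)
}

func main() {
	for _, n := range []int{1, 3, 5} {
		fmt.Printf("nodes=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), toleratedFailures(n))
	}
}
```

Note that a 4-node cluster tolerates no more failures than a 3-node one, which is why odd cluster sizes are the usual choice.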
At a high level, op-conductor serves the following functions:
- serves as a (raft) consensus layer participant to
  - determine the leader of the sequencers
  - store the latest unsafe block within its state machine
- serves RPC requests for
  - admin RPC, for manual recovery scenarios such as stopping the leadership vote or removing itself from the cluster
  - health RPC, for op-node to determine whether it should allow publishing txs / unsafe blocks
- monitors sequencer (op-node) health
- runs a control loop that controls sequencer (op-node) status (start / stop) based on different scenarios
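
To make the control loop idea more concrete, here is a minimal sketch of how one iteration might map observed state to actions. It is an assumption-laden illustration, not the actual op-conductor implementation: the `State` fields, the `Action` values, and the `decide` function are all hypothetical names, and the specific rules (for example, handing off leadership when the local sequencer is unhealthy) are a plausible reading of the functions listed above rather than confirmed behavior.

```go
package main

import "fmt"

// Action is a hypothetical label for a step the control loop could take.
type Action string

const (
	StartSequencer     Action = "start-sequencer"
	StopSequencer      Action = "stop-sequencer"
	TransferLeadership Action = "transfer-leadership"
)

// State is a hypothetical snapshot of what one conductor observes.
type State struct {
	Leader  bool // this node currently holds raft leadership
	Healthy bool // the local sequencer (op-node) passed its health check
	Active  bool // the local sequencer is currently sequencing
}

// decide maps a snapshot to the actions that would move the node toward a
// consistent state. An empty result means nothing needs to change.
// The rules here are assumptions made for illustration.
func decide(s State) []Action {
	switch {
	case !s.Leader && s.Active:
		// Only the leader may sequence, so a follower must stop.
		return []Action{StopSequencer}
	case !s.Leader:
		// Follower that is not sequencing: nothing to do.
		return nil
	case !s.Healthy && s.Active:
		// Unhealthy leader: stop sequencing, then hand off leadership.
		return []Action{StopSequencer, TransferLeadership}
	case !s.Healthy:
		// Unhealthy leader that is already stopped: hand off leadership.
		return []Action{TransferLeadership}
	case !s.Active:
		// Healthy leader that is not sequencing yet: start.
		return []Action{StartSequencer}
	default:
		// Healthy, active leader: normal operation.
		return nil
	}
}

func main() {
	states := []State{
		{Leader: true, Healthy: true, Active: false},
		{Leader: true, Healthy: false, Active: true},
		{Leader: false, Healthy: true, Active: true},
	}
	for _, s := range states {
		fmt.Printf("%+v -> %v\n", s, decide(s))
	}
}
```

Keeping the decision a pure function of the observed snapshot makes each case easy to test in isolation, separately from the RPC calls that would carry out the actions.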
Helpful tips: To better understand the graph, focus on one node at a time: work out which states can transition into it and which states it can transition to. This makes it easier to see how the state transitions are handled.
This is the initial version of the README; more details will be added later.