bpf, docs: Generate nicer tables for instruction encodings
Use RST tables that are nicely readable both in plain ascii as well as
in html to render the instruction encodings, and add a few subheadings
to better structure the text.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Christoph Hellwig authored and Alexei Starovoitov committed Dec 31, 2021
1 parent 41db511 commit 5e4dd19
Showing 1 changed file with 95 additions and 63 deletions.
158 changes: 95 additions & 63 deletions Documentation/bpf/instruction-set.rst
@@ -19,19 +19,10 @@ The eBPF calling convention is defined as:
R0 - R5 are scratch registers and eBPF programs need to spill/fill them if
necessary across calls.

eBPF opcode encoding
====================

For arithmetic and jump instructions the 8-bit 'opcode' field is divided into
three parts::

+----------------+--------+--------------------+
| 4 bits | 1 bit | 3 bits |
| operation code | source | instruction class |
+----------------+--------+--------------------+
(MSB) (LSB)
Instruction classes
===================

Three LSB bits store instruction class which is one of:
The three LSB bits of the 'opcode' field store the instruction class:

========= =====
class value
@@ -46,17 +37,34 @@ Three LSB bits store instruction class which is one of:
BPF_ALU64 0x07
========= =====
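
As a minimal sketch of the rule above, the class can be recovered by masking
the three least significant bits of the opcode byte. The helper names and
``CLASS_MASK`` are hypothetical; the class values not visible in this hunk are
assumed to be the ones in the kernel's UAPI headers.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction classes (values assumed from the kernel's UAPI headers). */
enum {
    BPF_LD  = 0x00, BPF_LDX = 0x01, BPF_ST    = 0x02, BPF_STX   = 0x03,
    BPF_ALU = 0x04, BPF_JMP = 0x05, BPF_JMP32 = 0x06, BPF_ALU64 = 0x07,
};

#define CLASS_MASK 0x07 /* hypothetical name for the three-LSB mask */

static_assert((BPF_ALU64 & ~CLASS_MASK) == 0, "largest class fits in three bits");

/* Hypothetical helper: return the instruction class of an opcode byte. */
static inline int insn_class(uint8_t opcode)
{
    return opcode & CLASS_MASK;
}

/* Hypothetical helper: true for classes that use the op/source/class layout. */
static inline int uses_op_source_layout(uint8_t opcode)
{
    int cls = insn_class(opcode);

    return cls == BPF_ALU || cls == BPF_ALU64 ||
           cls == BPF_JMP || cls == BPF_JMP32;
}
```

A decoder would typically branch on ``insn_class()`` first, because the
meaning of the remaining five bits depends on the class.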

When BPF_CLASS(code) == BPF_ALU or BPF_JMP, 4th bit encodes source operand ...
Arithmetic and jump instructions
================================

For arithmetic and jump instructions (BPF_ALU, BPF_ALU64, BPF_JMP and
BPF_JMP32), the 8-bit 'opcode' field is divided into three parts:

::
==============  ======  =================
4 bits (MSB)    1 bit   3 bits (LSB)
==============  ======  =================
operation code  source  instruction class
==============  ======  =================

BPF_K 0x00 /* use 32-bit immediate as source operand */
BPF_X 0x08 /* use 'src_reg' register as source operand */
The 4th bit encodes the source operand:

... and four MSB bits store operation code.
======  =====  ========================================
source  value  description
======  =====  ========================================
BPF_K   0x00   use 32-bit immediate as source operand
BPF_X   0x08   use 'src_reg' register as source operand
======  =====  ========================================

If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 BPF_OP(code) is one of::
The four MSB bits store the operation code.

For class BPF_ALU or BPF_ALU64:

======== ===== =========================
code value description
======== ===== =========================
BPF_ADD 0x00
BPF_SUB 0x10
BPF_MUL 0x20
@@ -68,26 +76,31 @@ If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 BPF_OP(code) is one of::
BPF_NEG 0x80
BPF_MOD 0x90
BPF_XOR 0xa0
BPF_MOV 0xb0 /* mov reg to reg */
BPF_ARSH 0xc0 /* sign extending shift right */
BPF_END 0xd0 /* endianness conversion */
BPF_MOV 0xb0 mov reg to reg
BPF_ARSH 0xc0 sign extending shift right
BPF_END 0xd0 endianness conversion
======== ===== =========================
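
The three-part split for arithmetic and jump opcodes can be sketched with
plain bit masks. This is an illustration only: the struct, the mask names and
``split_opcode()`` are hypothetical and not part of any kernel API.

```c
#include <assert.h>
#include <stdint.h>

#define OP_MASK    0xf0 /* four MSBs: operation code  */
#define SRC_MASK   0x08 /* 4th bit: BPF_K or BPF_X    */
#define CLASS_MASK 0x07 /* three LSBs: class          */

/* Hypothetical container for the three fields, kept at their bit positions. */
struct alu_jmp_fields {
    uint8_t op;
    uint8_t src;
    uint8_t cls;
};

/* Hypothetical helper: split an arithmetic or jump opcode into its fields. */
static inline struct alu_jmp_fields split_opcode(uint8_t opcode)
{
    struct alu_jmp_fields f = {
        .op  = (uint8_t)(opcode & OP_MASK),
        .src = (uint8_t)(opcode & SRC_MASK),
        .cls = (uint8_t)(opcode & CLASS_MASK),
    };

    /* The masks are disjoint and together cover all eight bits. */
    assert((uint8_t)(f.op | f.src | f.cls) == opcode);
    return f;
}
```

For instance, ``0xb7`` splits into ``0xb0`` (BPF_MOV), ``0x00`` (BPF_K) and
``0x07`` (BPF_ALU64). Because the fields stay at their original bit positions,
they compare directly against the values in the tables.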

If BPF_CLASS(code) == BPF_JMP or BPF_JMP32 BPF_OP(code) is one of::
For class BPF_JMP or BPF_JMP32:

BPF_JA 0x00 /* BPF_JMP only */
======== ===== =========================
code value description
======== ===== =========================
BPF_JA 0x00 BPF_JMP only
BPF_JEQ 0x10
BPF_JGT 0x20
BPF_JGE 0x30
BPF_JSET 0x40
BPF_JNE 0x50 /* jump != */
BPF_JSGT 0x60 /* signed '>' */
BPF_JSGE 0x70 /* signed '>=' */
BPF_CALL 0x80 /* function call */
BPF_EXIT 0x90 /* function return */
BPF_JLT 0xa0 /* unsigned '<' */
BPF_JLE 0xb0 /* unsigned '<=' */
BPF_JSLT 0xc0 /* signed '<' */
BPF_JSLE 0xd0 /* signed '<=' */
BPF_JNE 0x50 jump '!='
BPF_JSGT 0x60 signed '>'
BPF_JSGE 0x70 signed '>='
BPF_CALL 0x80 function call
BPF_EXIT 0x90 function return
BPF_JLT 0xa0 unsigned '<'
BPF_JLE 0xb0 unsigned '<='
BPF_JSLT 0xc0 signed '<'
BPF_JSLE 0xd0 signed '<='
======== ===== =========================

So BPF_ADD | BPF_X | BPF_ALU means::

@@ -108,37 +121,58 @@ the return value into register R0 before doing a BPF_EXIT. Class 6 is used as
BPF_JMP32 to mean exactly the same operations as BPF_JMP, but with 32-bit wide
operands for the comparisons instead.
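
Going the other way, an opcode is built by OR-ing one value from each field.
The sketch below assumes the class values 0x04 (BPF_ALU) and 0x05 (BPF_JMP)
from the kernel's UAPI headers; ``make_opcode()`` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

enum { BPF_ALU = 0x04, BPF_JMP = 0x05, BPF_JMP32 = 0x06 }; /* classes    */
enum { BPF_K = 0x00, BPF_X = 0x08 };                       /* source bit */
enum { BPF_ADD = 0x00, BPF_JEQ = 0x10, BPF_EXIT = 0x90 };  /* operations */

/* Hypothetical helper: combine the three fields into one opcode byte. */
static inline uint8_t make_opcode(unsigned op, unsigned src, unsigned cls)
{
    /* Each argument must stay inside its own bit field. */
    assert((op & ~0xf0u) == 0);
    assert((src & ~0x08u) == 0);
    assert((cls & ~0x07u) == 0);
    return (uint8_t)(op | src | cls);
}
```

With these values, ``BPF_ADD | BPF_X | BPF_ALU`` is ``0x0c``, and the same
comparison encoded for the two jump classes differs only in the class bits:
``BPF_JEQ | BPF_K | BPF_JMP`` is ``0x15`` while ``BPF_JEQ | BPF_K | BPF_JMP32``
is ``0x16``.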

For load and store instructions the 8-bit 'code' field is divided as::

+--------+--------+-------------------+
| 3 bits | 2 bits | 3 bits |
| mode | size | instruction class |
+--------+--------+-------------------+
(MSB) (LSB)
Load and store instructions
===========================

For load and store instructions (BPF_LD, BPF_LDX, BPF_ST and BPF_STX), the
8-bit 'opcode' field is divided as:

============  ======  =================
3 bits (MSB)  2 bits  3 bits (LSB)
============  ======  =================
mode          size    instruction class
============  ======  =================

The size modifier is one of:

Size modifier is one of ...
=============  =====  =====================
size modifier  value  description
=============  =====  =====================
BPF_W          0x00   word (4 bytes)
BPF_H          0x08   half word (2 bytes)
BPF_B          0x10   byte
BPF_DW         0x18   double word (8 bytes)
=============  =====  =====================
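
The size values do not increase with the access width (the word comes first,
the byte third), so a lookup is clearer than arithmetic on the field. A
minimal sketch, with ``size_in_bytes()`` as a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

enum { BPF_W = 0x00, BPF_H = 0x08, BPF_B = 0x10, BPF_DW = 0x18 };

#define SIZE_MASK 0x18 /* hypothetical name for the two size bits */

/* Hypothetical helper: access width in bytes of a load/store opcode. */
static inline unsigned size_in_bytes(uint8_t opcode)
{
    switch (opcode & SIZE_MASK) {
    case BPF_W:  return 4;
    case BPF_H:  return 2;
    case BPF_B:  return 1;
    case BPF_DW: return 8;
    }
    /* A two-bit field has only the four values handled above. */
    assert(!"unreachable");
    return 0;
}
```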

::
The mode modifier is one of:

BPF_W 0x00 /* word */
BPF_H 0x08 /* half word */
BPF_B 0x10 /* byte */
BPF_DW 0x18 /* double word */
=============  =====  =====================
mode modifier  value  description
=============  =====  =====================
BPF_IMM        0x00   used for 64-bit mov
BPF_ABS        0x20
BPF_IND        0x40
BPF_MEM        0x60
BPF_ATOMIC     0xc0   atomic operations
=============  =====  =====================
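
For load and store opcodes the upper five bits split into a mode and a size
rather than an operation and a source bit. The following sketch extracts both;
the mask and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    BPF_IMM = 0x00, BPF_ABS = 0x20, BPF_IND = 0x40,
    BPF_MEM = 0x60, BPF_ATOMIC = 0xc0,
};

#define MODE_MASK  0xe0 /* three MSBs */
#define SIZE_MASK  0x18 /* next two bits */
#define CLASS_MASK 0x07 /* three LSBs */

static_assert((MODE_MASK | SIZE_MASK | CLASS_MASK) == 0xff, "fields cover the byte");
static_assert((MODE_MASK & SIZE_MASK) == 0, "mode and size do not overlap");

/* Hypothetical helpers: return each field at its original bit position. */
static inline int ldst_mode(uint8_t opcode) { return opcode & MODE_MASK; }
static inline int ldst_size(uint8_t opcode) { return opcode & SIZE_MASK; }
```

As an example, ``0x79`` carries mode ``0x60`` (BPF_MEM) and size ``0x18``
(BPF_DW), and its class bits are ``0x01``.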

... which encodes size of load/store operation::
BPF_MEM | <size> | BPF_STX means::

B - 1 byte
H - 2 byte
W - 4 byte
DW - 8 byte
*(size *) (dst_reg + off) = src_reg

Mode modifier is one of::
BPF_MEM | <size> | BPF_ST means::

BPF_IMM 0x00 /* used for 64-bit mov */
BPF_ABS 0x20
BPF_IND 0x40
BPF_MEM 0x60
BPF_ATOMIC 0xc0 /* atomic operations */
*(size *) (dst_reg + off) = imm32

BPF_MEM | <size> | BPF_LDX means::

dst_reg = *(size *) (src_reg + off)

Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW.
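
The three generic forms can be modelled on a plain byte buffer. This is a
sketch under assumptions, not how the kernel interpreter is written:
``mem_store()`` and ``mem_load()`` are hypothetical, the width is passed in
bytes, and loads are assumed to zero-extend into the 64-bit destination.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Models BPF_STX and BPF_ST: *(size *) (base + off) = val, truncated to size. */
static inline void mem_store(uint8_t *base, int16_t off, uint64_t val, unsigned size)
{
    uint8_t  b = (uint8_t)val;
    uint16_t h = (uint16_t)val;
    uint32_t w = (uint32_t)val;

    switch (size) {
    case 1: memcpy(base + off, &b, 1); break;
    case 2: memcpy(base + off, &h, 2); break;
    case 4: memcpy(base + off, &w, 4); break;
    case 8: memcpy(base + off, &val, 8); break;
    default: assert(!"size must be 1, 2, 4 or 8");
    }
}

/* Models BPF_LDX: dst = *(size *) (base + off), zero-extended (assumption). */
static inline uint64_t mem_load(const uint8_t *base, int16_t off, unsigned size)
{
    uint8_t b; uint16_t h; uint32_t w; uint64_t dw;

    switch (size) {
    case 1: memcpy(&b, base + off, 1); return b;
    case 2: memcpy(&h, base + off, 2); return h;
    case 4: memcpy(&w, base + off, 4); return w;
    case 8: memcpy(&dw, base + off, 8); return dw;
    }
    assert(!"size must be 1, 2, 4 or 8");
    return 0;
}
```

The only difference between BPF_STX and BPF_ST in this model is where ``val``
comes from: ``src_reg`` for the former, ``imm32`` for the latter.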

Packet access instructions
--------------------------

eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
(BPF_IND | <size> | BPF_LD) which are used to access packet data.
@@ -165,15 +199,10 @@ For example::
R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
and R1 - R5 were scratched.
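
The data movement in the expression above can be sketched as a big-endian
32-bit read from a byte buffer. ``ld_ind_w()`` is a hypothetical helper that
takes the packet data pointer directly instead of going through R6; it does
not model the implicit bounds checking or the scratching of R1 - R5.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the value computed by BPF_IND | BPF_W | BPF_LD:
 * read four bytes at data + src_reg + imm32 in network (big-endian) order,
 * which is what ntohl() applied to the raw word yields.
 */
static inline uint32_t ld_ind_w(const uint8_t *data, uint32_t src_reg, int32_t imm32)
{
    assert(data);
    const uint8_t *p = data + src_reg + imm32;

    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```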

eBPF has generic load/store operations::
Atomic operations
-----------------

BPF_MEM | <size> | BPF_STX: *(size *) (dst_reg + off) = src_reg
BPF_MEM | <size> | BPF_ST: *(size *) (dst_reg + off) = imm32
BPF_MEM | <size> | BPF_LDX: dst_reg = *(size *) (src_reg + off)

Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW.

It also includes atomic operations, which use the immediate field for extra
eBPF includes atomic operations, which use the immediate field for extra
encoding::

.imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
@@ -217,6 +246,9 @@ You may encounter ``BPF_XADD`` - this is a legacy name for ``BPF_ATOMIC``,
referring to the exclusive-add operation encoded when the immediate field is
zero.
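
As an illustration of how the immediate field selects the atomic operation,
the sketch below builds an atomic add. ``struct insn`` is a stand-in assumed
to mirror the layout of ``struct bpf_insn``, ``atomic_add()`` is a hypothetical
helper, and restricting the size to BPF_W or BPF_DW is an assumption of this
sketch.

```c
#include <assert.h>
#include <stdint.h>

enum { BPF_STX = 0x03 };
enum { BPF_W = 0x00, BPF_DW = 0x18 };
enum { BPF_ATOMIC = 0xc0, BPF_XADD = BPF_ATOMIC }; /* legacy alias */
enum { BPF_ADD = 0x00 };

/* Stand-in for struct bpf_insn (layout assumed, not included from a header). */
struct insn {
    uint8_t code;
    uint8_t dst_reg:4;
    uint8_t src_reg:4;
    int16_t off;
    int32_t imm;
};

/* Hypothetical helper: encode "lock *(size *)(dst + off) += src". */
static inline struct insn atomic_add(unsigned size, unsigned dst, unsigned src, int16_t off)
{
    assert(size == BPF_W || size == BPF_DW);
    assert(dst < 16 && src < 16);
    return (struct insn){
        .code = (uint8_t)(BPF_ATOMIC | size | BPF_STX),
        .dst_reg = dst,
        .src_reg = src,
        .off = off,
        .imm = BPF_ADD, /* zero immediate: the operation BPF_XADD used to name */
    };
}
```

The opcode byte comes out as ``0xc3`` for the 32-bit form and ``0xdb`` for the
64-bit form; only ``imm`` distinguishes an add from the other atomic
operations.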

16-byte instructions
--------------------

eBPF has one 16-byte instruction: ``BPF_LD | BPF_DW | BPF_IMM``, which consists
of two consecutive ``struct bpf_insn`` 8-byte blocks and is interpreted as a single
instruction that loads a 64-bit immediate value into dst_reg.
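
A sketch of how such a pair might be built and read back follows. It assumes
the low 32 bits of the immediate live in the first block and the high 32 bits
in the second, and uses a stand-in ``struct insn`` assumed to mirror
``struct bpf_insn``; both helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct bpf_insn (layout assumed, not included from a header). */
struct insn {
    uint8_t code;
    uint8_t dst_reg:4;
    uint8_t src_reg:4;
    int16_t off;
    int32_t imm;
};

static_assert(sizeof(struct insn) == 8, "one instruction block is 8 bytes");

#define LD_IMM64 0x18 /* BPF_LD (0x00) | BPF_DW (0x18) | BPF_IMM (0x00) */

/* Hypothetical helper: fill two consecutive blocks with one 64-bit load. */
static inline void emit_ld_imm64(struct insn out[2], unsigned dst, uint64_t imm64)
{
    assert(dst < 16);
    out[0] = (struct insn){
        .code = LD_IMM64,
        .dst_reg = dst,
        .imm = (int32_t)(uint32_t)imm64,          /* low half (assumption) */
    };
    out[1] = (struct insn){
        .imm = (int32_t)(uint32_t)(imm64 >> 32),  /* high half (assumption) */
    };
}

/* Hypothetical helper: reassemble the immediate from the two blocks. */
static inline uint64_t read_ld_imm64(const struct insn in[2])
{
    return (uint64_t)(uint32_t)in[0].imm |
           ((uint64_t)(uint32_t)in[1].imm << 32);
}
```

A decoder that walks an instruction array has to advance by two blocks after
seeing this opcode, since the second block is not an instruction on its own.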
