

 ====================
 eBPF Instruction Set
 ====================

 Registers and calling convention
 ================================

 eBPF has 10 general purpose registers and a read-only frame pointer register,
 all of which are 64 bits wide.

 The eBPF calling convention is defined as:

  * R0: return value from function calls, and exit value for eBPF programs
  * R1 - R5: arguments for function calls
  * R6 - R9: callee saved registers that function calls will preserve
  * R10: read-only frame pointer to access stack

 R0 - R5 are scratch registers, and eBPF programs need to spill/fill them if
 necessary across calls.

 Instruction encoding
 ====================

 eBPF uses 64-bit instructions with the following encoding:

  =============  =======  ===============  ====================  ============
  32 bits (MSB)  16 bits  4 bits           4 bits                8 bits (LSB)
  =============  =======  ===============  ====================  ============
  immediate      offset   source register  destination register  opcode
  =============  =======  ===============  ====================  ============

 Note that most instructions do not use all of the fields.
 Unused fields shall be cleared to zero.
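
 As a rough illustration of the layout above, the following C sketch packs the
 five fields into a single 64-bit value. The ``pack_insn`` helper and its
 parameter names are hypothetical and exist only for this example. It models
 the table as a plain integer and says nothing about how an instruction is
 laid out in memory on a particular host.

```c
#include <stdint.h>

/* Hypothetical helper: packs the fields in the order given by the table,
 * with the opcode in the 8 least significant bits and the immediate in
 * the 32 most significant bits. Register numbers are masked to 4 bits. */
static uint64_t pack_insn(uint8_t opcode, uint8_t dst_reg, uint8_t src_reg,
                          int16_t off, int32_t imm)
{
    return ((uint64_t)(uint32_t)imm << 32) |
           ((uint64_t)(uint16_t)off << 16) |
           ((uint64_t)(src_reg & 0x0f) << 12) |
           ((uint64_t)(dst_reg & 0x0f) << 8) |
           (uint64_t)opcode;
}
```

 For instance, ``pack_insn(0xb7, 1, 0, 0, 42)`` places 42 in the immediate
 field, 1 in the destination register field and 0xb7 in the opcode field,
 leaving the unused fields at zero.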

 Instruction classes
 -------------------

 The three LSB bits of the 'opcode' field store the instruction class:

   =========  =====  ===============================
   class      value  description
   =========  =====  ===============================
   BPF_LD     0x00   non-standard load operations
   BPF_LDX    0x01   load into register operations
   BPF_ST     0x02   store from immediate operations
   BPF_STX    0x03   store from register operations
   BPF_ALU    0x04   32-bit arithmetic operations
   BPF_JMP    0x05   64-bit jump operations
   BPF_JMP32  0x06   32-bit jump operations
   BPF_ALU64  0x07   64-bit arithmetic operations
   =========  =====  ===============================
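
 A minimal sketch of how the class could be extracted from an opcode byte,
 assuming the values in the table above. The ``insn_class`` helper is a
 hypothetical name used only here.

```c
#include <stdint.h>

/* Class values copied from the table above. */
enum {
    BPF_LD    = 0x00,
    BPF_LDX   = 0x01,
    BPF_ST    = 0x02,
    BPF_STX   = 0x03,
    BPF_ALU   = 0x04,
    BPF_JMP   = 0x05,
    BPF_JMP32 = 0x06,
    BPF_ALU64 = 0x07,
};

/* Hypothetical helper: keep only the three least significant bits. */
static uint8_t insn_class(uint8_t opcode)
{
    return opcode & 0x07;
}
```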

 Arithmetic and jump instructions
 ================================

 For arithmetic and jump instructions (BPF_ALU, BPF_ALU64, BPF_JMP and
 BPF_JMP32), the 8-bit 'opcode' field is divided into three parts:

   ==============  ======  =================
   4 bits (MSB)    1 bit   3 bits (LSB)
   ==============  ======  =================
   operation code  source  instruction class
   ==============  ======  =================

 The 4th bit (mask 0x08) encodes the source operand:

   ======  =====  ========================================
   source  value  description
   ======  =====  ========================================
   BPF_K   0x00   use 32-bit immediate as source operand
   BPF_X   0x08   use 'src_reg' register as source operand
   ======  =====  ========================================

 The four MSB bits store the operation code.
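
 The three-way split described above can be sketched with simple bit masks.
 The struct and function names below are hypothetical. The masks follow from
 the field widths in the table (4 + 1 + 3 bits).

```c
#include <stdint.h>

#define BPF_K 0x00 /* 32-bit immediate as source operand */
#define BPF_X 0x08 /* src_reg register as source operand */

/* Hypothetical container for the three parts of the opcode byte. */
struct alu_jmp_opcode {
    uint8_t op;         /* operation code, still in its 0xN0 position */
    uint8_t source;     /* BPF_K or BPF_X */
    uint8_t insn_class; /* three least significant bits */
};

static struct alu_jmp_opcode split_alu_jmp_opcode(uint8_t opcode)
{
    struct alu_jmp_opcode parts = {
        .op = opcode & 0xf0,
        .source = opcode & 0x08,
        .insn_class = opcode & 0x07,
    };
    return parts;
}
```

 The operation code is left unshifted so that it can be compared directly
 with the values listed in the operation tables of the following sections.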


 Arithmetic instructions
 -----------------------

 BPF_ALU uses 32-bit wide operands while BPF_ALU64 uses 64-bit wide operands for
 otherwise identical operations.
 The code field encodes the operation as below:

   ========  =====  ==========================
   code      value  description
   ========  =====  ==========================
   BPF_ADD   0x00   dst += src
   BPF_SUB   0x10   dst -= src
   BPF_MUL   0x20   dst \*= src
   BPF_DIV   0x30   dst /= src
   BPF_OR    0x40   dst \|= src
   BPF_AND   0x50   dst &= src
   BPF_LSH   0x60   dst <<= src
   BPF_RSH   0x70   dst >>= src
   BPF_NEG   0x80   dst = -dst
   BPF_MOD   0x90   dst %= src
   BPF_XOR   0xa0   dst ^= src
   BPF_MOV   0xb0   dst = src
   BPF_ARSH  0xc0   sign extending shift right
   BPF_END   0xd0   endianness conversion
   ========  =====  ==========================

 BPF_ADD | BPF_X | BPF_ALU means::

   dst_reg = (u32) dst_reg + (u32) src_reg

 BPF_ADD | BPF_X | BPF_ALU64 means::

   dst_reg = dst_reg + src_reg

 BPF_XOR | BPF_K | BPF_ALU means::

   dst_reg = (u32) dst_reg ^ (u32) imm32

 BPF_XOR | BPF_K | BPF_ALU64 means::

   dst_reg = dst_reg ^ imm32
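
 To make the difference between the two operand widths concrete, here is a
 small C model of the two ``BPF_ADD`` forms. The function names are
 hypothetical, and registers are modelled as plain ``uint64_t`` values. In
 this model the 32-bit result is zero-extended when it is stored back into
 the 64-bit register, which is what the ``(u32)`` casts imply in C.

```c
#include <stdint.h>

/* Model of BPF_ADD | BPF_X | BPF_ALU: operands truncated to 32 bits,
 * the sum wraps modulo 2^32 and is zero-extended on return. */
static uint64_t alu32_add(uint64_t dst_reg, uint64_t src_reg)
{
    return (uint32_t)((uint32_t)dst_reg + (uint32_t)src_reg);
}

/* Model of BPF_ADD | BPF_X | BPF_ALU64: full 64-bit addition. */
static uint64_t alu64_add(uint64_t dst_reg, uint64_t src_reg)
{
    return dst_reg + src_reg;
}
```

 With ``dst_reg = 0xffffffff`` and ``src_reg = 1``, the 32-bit form wraps to
 0 while the 64-bit form carries into bit 32.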


 Jump instructions
 -----------------

 BPF_JMP32 uses 32-bit wide operands while BPF_JMP uses 64-bit wide operands for
 otherwise identical operations.
 The code field encodes the operation as below:

   ========  =====  =========================  ============
   code      value  description                notes
   ========  =====  =========================  ============
   BPF_JA    0x00   PC += off                  BPF_JMP only
   BPF_JEQ   0x10   PC += off if dst == src
   BPF_JGT   0x20   PC += off if dst > src     unsigned
   BPF_JGE   0x30   PC += off if dst >= src    unsigned
   BPF_JSET  0x40   PC += off if dst & src
   BPF_JNE   0x50   PC += off if dst != src
   BPF_JSGT  0x60   PC += off if dst > src     signed
   BPF_JSGE  0x70   PC += off if dst >= src    signed
   BPF_CALL  0x80   function call
   BPF_EXIT  0x90   function / program return  BPF_JMP only
   BPF_JLT   0xa0   PC += off if dst < src     unsigned
   BPF_JLE   0xb0   PC += off if dst <= src    unsigned
   BPF_JSLT  0xc0   PC += off if dst < src     signed
   BPF_JSLE  0xd0   PC += off if dst <= src    signed
   ========  =====  =========================  ============

 The eBPF program needs to store the return value into register R0 before doing a
 BPF_EXIT.
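
 The "signed" and "unsigned" notes in the table matter whenever the top bit
 of an operand is set. The following sketch models three of the 64-bit
 conditions. The function names are hypothetical, and the signed comparison
 assumes the usual two's complement conversion from ``uint64_t`` to
 ``int64_t``.

```c
#include <stdbool.h>
#include <stdint.h>

/* BPF_JGT: unsigned comparison of the raw register values. */
static bool cond_jgt(uint64_t dst, uint64_t src)
{
    return dst > src;
}

/* BPF_JSGT: the same bits, reinterpreted as signed values. */
static bool cond_jsgt(uint64_t dst, uint64_t src)
{
    return (int64_t)dst > (int64_t)src;
}

/* BPF_JSET: taken when dst and src share at least one set bit. */
static bool cond_jset(uint64_t dst, uint64_t src)
{
    return (dst & src) != 0;
}
```

 A register holding all ones is the largest unsigned value but equals -1 when
 read as signed, so ``cond_jgt`` and ``cond_jsgt`` disagree when it is
 compared against 1.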


 Load and store instructions
 ===========================

 For load and store instructions (BPF_LD, BPF_LDX, BPF_ST and BPF_STX), the
 8-bit 'opcode' field is divided as:

   ============  ======  =================
   3 bits (MSB)  2 bits  3 bits (LSB)
   ============  ======  =================
   mode          size    instruction class
   ============  ======  =================

 The size modifier is one of:

   =============  =====  =====================
   size modifier  value  description
   =============  =====  =====================
   BPF_W          0x00   word        (4 bytes)
   BPF_H          0x08   half word   (2 bytes)
   BPF_B          0x10   byte
   BPF_DW         0x18   double word (8 bytes)
   =============  =====  =====================

 The mode modifier is one of:

   =============  =====  ====================================
   mode modifier  value  description
   =============  =====  ====================================
   BPF_IMM        0x00   used for 64-bit mov
   BPF_ABS        0x20   legacy BPF packet access
   BPF_IND        0x40   legacy BPF packet access
   BPF_MEM        0x60   all normal load and store operations
   BPF_ATOMIC     0xc0   atomic operations
   =============  =====  ====================================

 BPF_MEM | <size> | BPF_STX means::

   *(size *) (dst_reg + off) = src_reg

 BPF_MEM | <size> | BPF_ST means::

   *(size *) (dst_reg + off) = imm32

 BPF_MEM | <size> | BPF_LDX means::

   dst_reg = *(size *) (src_reg + off)

 Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW.
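
 As a sketch of how the mode and size modifiers could be read back from an
 opcode byte, assuming the values in the two tables above (the helper names
 are hypothetical):

```c
#include <stdint.h>

/* Size modifiers, copied from the table above. */
#define BPF_W  0x00
#define BPF_H  0x08
#define BPF_B  0x10
#define BPF_DW 0x18

/* Hypothetical helper: the mode occupies the three most significant bits. */
static uint8_t mode_of(uint8_t opcode)
{
    return opcode & 0xe0;
}

/* Hypothetical helper: access width in bytes, from the two size bits. */
static int size_in_bytes(uint8_t opcode)
{
    switch (opcode & 0x18) {
    case BPF_W:
        return 4;
    case BPF_H:
        return 2;
    case BPF_B:
        return 1;
    default: /* BPF_DW is the only remaining value */
        return 8;
    }
}
```

 For example, ``BPF_MEM | BPF_W | BPF_LDX`` is 0x60 | 0x00 | 0x01 = 0x61,
 which decodes to mode 0x60 and a 4-byte access.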

 Atomic operations
 -----------------

 eBPF includes atomic operations, which use the immediate field for extra
 encoding::

    .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W  | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
    .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg

 The basic atomic operations supported are::

     BPF_ADD
     BPF_AND
     BPF_OR
     BPF_XOR

 Each has semantics equivalent to the ``BPF_ADD`` example; that is, the
 memory location addressed by ``dst_reg + off`` is atomically modified, with
 ``src_reg`` as the other operand. If the ``BPF_FETCH`` flag is set in the
 immediate, then these operations also overwrite ``src_reg`` with the
 value that was in memory before it was modified.

 The more special operations are::

     BPF_XCHG

 This atomically exchanges ``src_reg`` with the value addressed by ``dst_reg +
 off``. ::

     BPF_CMPXCHG

 This atomically compares the value addressed by ``dst_reg + off`` with
 ``R0``. If they match it is replaced with ``src_reg``. In either case, the
 value that was there before is zero-extended and loaded back to ``R0``.

 Note that 1 and 2 byte atomic operations are not supported.
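
 The semantics described above can be approximated with C11 atomics. The
 following is only a model of the 64-bit case under that assumption, not how
 the kernel implements these instructions, and the function names are
 hypothetical. Each function returns the value that the instruction would
 leave in the register named in its comment.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Model of BPF_ADD with BPF_FETCH: memory += src_reg, and the return
 * value is what src_reg would hold afterwards (the old memory value). */
static uint64_t model_fetch_add(_Atomic uint64_t *addr, uint64_t src_reg)
{
    return atomic_fetch_add(addr, src_reg);
}

/* Model of BPF_XCHG: memory is replaced by src_reg, and the return
 * value is what src_reg would hold afterwards (the old memory value). */
static uint64_t model_xchg(_Atomic uint64_t *addr, uint64_t src_reg)
{
    return atomic_exchange(addr, src_reg);
}

/* Model of BPF_CMPXCHG: memory is replaced by src_reg only if it equals
 * r0. The return value is what R0 would hold afterwards: the value that
 * was in memory before the operation, whether or not it matched. */
static uint64_t model_cmpxchg(_Atomic uint64_t *addr, uint64_t r0,
                              uint64_t src_reg)
{
    uint64_t expected = r0;

    /* On failure, expected is overwritten with the current memory value;
     * on success it already equals the old memory value. */
    atomic_compare_exchange_strong(addr, &expected, src_reg);
    return expected;
}
```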

 Clang can generate atomic instructions by default when ``-mcpu=v3`` is
 enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction
 Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable
 the atomics features, while keeping a lower ``-mcpu`` version, you can use
 ``-Xclang -target-feature -Xclang +alu32``.

 You may encounter ``BPF_XADD`` - this is a legacy name for ``BPF_ATOMIC``,
 referring to the exclusive-add operation encoded when the immediate field is
 zero.

 16-byte instructions
 --------------------

 eBPF has one 16-byte instruction: ``BPF_LD | BPF_DW | BPF_IMM``, which consists
 of two consecutive ``struct bpf_insn`` 8-byte blocks and is interpreted as a
 single instruction that loads a 64-bit immediate value into dst_reg.
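
 The text above does not say how the 64-bit value is divided between the two
 blocks. The sketch below assumes the convention used by the kernel's
 ``BPF_LD_IMM64`` macro, where the immediate of the first block holds the low
 32 bits and the immediate of the second block holds the high 32 bits. The
 struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical pair of 32-bit immediates, one per 8-byte block. */
struct imm64_pair {
    int32_t first_imm;  /* assumed: low 32 bits of the value */
    int32_t second_imm; /* assumed: high 32 bits of the value */
};

static struct imm64_pair split_imm64(uint64_t value)
{
    struct imm64_pair pair = {
        .first_imm = (int32_t)(uint32_t)value,
        .second_imm = (int32_t)(uint32_t)(value >> 32),
    };
    return pair;
}

/* Rebuild the 64-bit value; the casts through uint32_t prevent sign
 * extension of the low half from leaking into the high half. */
static uint64_t join_imm64(struct imm64_pair pair)
{
    return ((uint64_t)(uint32_t)pair.second_imm << 32) |
           (uint64_t)(uint32_t)pair.first_imm;
}
```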

 Packet access instructions
 --------------------------

 eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
 (BPF_IND | <size> | BPF_LD) which are used to access packet data.

 They were carried over from classic BPF to preserve the performance of
 socket filters running in the eBPF interpreter. These instructions can only
 be used when the interpreter context is a pointer to ``struct sk_buff``, and
 they have seven implicit operands. Register R6 is an implicit input that must
 contain a pointer to the sk_buff. Register R0 is an implicit output that
 contains the data fetched from the packet. Registers R1-R5 are scratch
 registers and must not be used to store data across BPF_ABS | BPF_LD or
 BPF_IND | BPF_LD instructions.

 These instructions have an implicit program exit condition as well. When
 an eBPF program tries to access data beyond the packet boundary,
 the interpreter aborts the execution of the program. JIT compilers
 therefore must preserve this property. The src_reg and imm32 fields are
 explicit inputs to these instructions.

 For example, BPF_IND | BPF_W | BPF_LD means::

   R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))

 and R1 - R5 are clobbered.
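
 The combination of an indirect offset, a bounds check and a network-to-host
 conversion can be modelled in plain C as follows. This is a simplified
 sketch: the packet is a byte array rather than a ``struct sk_buff``, only the
 low 32 bits of ``src_reg`` are used, negative offsets are simply rejected,
 and a ``false`` return stands in for the implicit program exit. The function
 name and signature are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of BPF_IND | BPF_W | BPF_LD on a flat packet buffer.
 * Returns false when the 4-byte access would fall outside the packet,
 * otherwise stores the big-endian word at data[src_reg + imm32] in *r0. */
static bool load_ind_w(const uint8_t *data, size_t len,
                       uint32_t src_reg, int32_t imm32, uint32_t *r0)
{
    int64_t pos = (int64_t)src_reg + imm32;

    if (pos < 0 || (uint64_t)pos + 4 > len)
        return false; /* models the implicit program exit */

    /* Assemble the bytes in network order, as ntohl() would yield. */
    *r0 = ((uint32_t)data[pos] << 24) |
          ((uint32_t)data[pos + 1] << 16) |
          ((uint32_t)data[pos + 2] << 8) |
          (uint32_t)data[pos + 3];
    return true;
}
```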
