EOF - Data section access instructions
Abstract
Four new instructions are introduced that allow reading the EOF container's data section: DATALOAD loads a 32-byte word onto the stack, DATALOADN loads a 32-byte word onto the stack where the word is addressed by a static immediate argument, DATASIZE loads the data section size, and DATACOPY copies a segment of the data section to memory.
Motivation
Clear separation between code and data is one of the main features of EOF1. The data section may contain anything, e.g. compiler metadata, but to make it useful for smart contracts, the EVM must provide instructions that allow reading from the data section. The previously existing instructions for bytecode inspection (CODECOPY, CODESIZE etc.) are deprecated in EOF1 and cannot be used for this purpose.
The DATALOAD, DATASIZE, DATACOPY instruction pattern follows the design of existing instructions for reading other kinds of data (i.e. returndata and calldata).
DATALOADN is an optimized version of DATALOAD, where the data offset to read is set at compilation time and therefore need not be validated at run-time, which makes the instruction cheaper.
Specification
We introduce four new instructions on the same block number EIP-3540 is activated on:
- DATALOAD (0xd0)
- DATALOADN (0xd1)
- DATASIZE (0xd2)
- DATACOPY (0xd3)
If the code is legacy bytecode, all of these instructions result in an exceptional halt. (Note: This means no change to behaviour.)
If the code is valid EOF1, the following execution rules apply:
DATALOAD
- Pops one value, offset, from the stack.
- Reads the [offset:offset+32] segment from the data section and pushes it as a 32-byte value to the stack.
- If offset + 32 is greater than the data section size, bytes after the end of the data section are set to 0.
- Deducts 4 gas.
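The zero-padding rule for DATALOAD can be illustrated with a minimal sketch. The function name dataload and the modelling of the data section as a Python bytes object are assumptions made for illustration only; gas accounting and stack handling are left out.

```python
def dataload(data: bytes, offset: int) -> int:
    """Return the 32-byte word at data[offset:offset+32] as an integer.

    Hypothetical helper: bytes past the end of the data section read as 0.
    """
    # Slicing past the end of `data` simply yields fewer (or zero) bytes.
    chunk = data[offset:offset + 32]
    # Right-pad with zero bytes so the word is always 32 bytes long.
    word = chunk.ljust(32, b"\x00")
    return int.from_bytes(word, "big")
```

For example, with a 40-byte data section, reading at offset 32 returns the last 8 bytes followed by 24 zero bytes, and reading at any offset beyond the section returns 0.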
DATALOADN
- Has one immediate argument, offset, encoded as a 16-bit unsigned big-endian value.
- Pops nothing from the stack.
- Reads the [offset:offset+32] segment from the data section and pushes it as a 32-byte value to the stack.
- Deducts 3 gas.
[offset:offset+32] is guaranteed to be within data bounds by code validation.
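As a rough sketch of how an interpreter might decode and execute DATALOADN, the following assumes the code and data sections are plain bytes objects and that code validation has already run. The function name dataloadn and its return shape (word, next program counter) are hypothetical.

```python
DATALOADN = 0xD1  # opcode value from the specification above


def dataloadn(code: bytes, pc: int, data: bytes) -> tuple[int, int]:
    """Execute DATALOADN at position `pc`; return (word, next_pc).

    Hypothetical helper: assumes validation already guaranteed that
    offset + 32 does not exceed the data section size.
    """
    assert code[pc] == DATALOADN
    # The immediate is a 16-bit unsigned big-endian value after the opcode.
    offset = int.from_bytes(code[pc + 1:pc + 3], "big")
    # No run-time bounds check or padding is needed here.
    word = int.from_bytes(data[offset:offset + 32], "big")
    # Skip the opcode byte plus the two immediate bytes.
    return word, pc + 3
```

Because the offset is checked once at validation time, the run-time path contains no bounds check, which is the reason given for the lower gas cost.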
DATASIZE
- Pops nothing from the stack.
- Pushes the size of the data section of the active container to the stack.
- Deducts 2 gas.
DATACOPY
- Pops three values from the stack: mem_offset, offset, size.
- Performs memory expansion to mem_offset + size and deducts memory expansion cost.
- Deducts 3 + 3 * ((size + 31) // 32) gas for copying.
- Reads the [offset:offset+size] segment from the data section and writes it to memory starting at offset mem_offset.
- If offset + size is greater than the data section size, 0 bytes will be copied for bytes after the end of the data section.
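The copy cost formula and the zero-fill behaviour of DATACOPY can be sketched as follows. The function names, the use of a bytearray for memory, and the word-aligned growth are assumptions for illustration; the memory expansion cost itself is not modelled, and skipping expansion for a zero-size copy is assumed by analogy with other EVM copy instructions.

```python
def datacopy_gas(size: int) -> int:
    """Copy cost from the specification: 3 + 3 gas per 32-byte word (rounded up)."""
    return 3 + 3 * ((size + 31) // 32)


def datacopy(memory: bytearray, data: bytes,
             mem_offset: int, offset: int, size: int) -> None:
    """Copy data[offset:offset+size] into memory at mem_offset, zero-filling past the end."""
    if size == 0:
        return  # assumption: a zero-size copy does not touch memory
    end = mem_offset + size
    if len(memory) < end:
        # Simplified expansion: grow to a multiple of 32 bytes (cost not modelled).
        new_len = (end + 31) // 32 * 32
        memory.extend(bytes(new_len - len(memory)))
    # Bytes requested beyond the data section are written as zeros.
    chunk = data[offset:offset + size].ljust(size, b"\x00")
    memory[mem_offset:end] = chunk
```

For instance, copying 33 bytes costs 3 + 3 * 2 = 9 gas for the copy itself, before any memory expansion cost.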
Code Validation
We extend code section validation rules (as defined in EIP-3670).
- Code section is invalid in case an immediate argument offset of any DATALOADN is such that offset + 32 is greater than data section size, as indicated in the container header before deployment.
- RJUMP, RJUMPI and RJUMPV immediate argument value (jump destination relative offset) validation: code section is invalid in case offset points to one of two bytes directly following DATALOADN instruction.
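The two validation rules above can be expressed as small predicates. This is only a sketch: the function names are hypothetical, and it assumes a surrounding validator has already decoded each DATALOADN immediate, recorded the position of each DATALOADN opcode, and resolved each relative jump to an absolute target position within the code section.

```python
def dataloadn_immediate_ok(offset: int, data_size: int) -> bool:
    """Rule 1: the whole 32-byte read must fit in the declared data section size."""
    return offset + 32 <= data_size


def jump_target_ok(target: int, dataloadn_positions: list[int]) -> bool:
    """Rule 2: a jump must not land on either immediate byte of a DATALOADN.

    `dataloadn_positions` holds the code positions of DATALOADN opcodes;
    the two bytes after each position are its immediate argument.
    """
    return all(target not in (pos + 1, pos + 2) for pos in dataloadn_positions)
```

With a 32-byte data section, only offset 0 passes the first check; with a DATALOADN at position 0, jump targets 1 and 2 are rejected while target 3 is accepted.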
Rationale
Zero-padding on out of bounds access
Existing instructions for reading other kinds of data implicitly pad with zeroes on out of bounds access, with the only exception of return data copying.
It is beneficial to avoid exceptional failures, because compilers can employ optimizations such as removing code that copies data but never accesses the copy afterwards. Such an optimization is possible only if the instruction never has other side effects like an exceptional abort.
Lack of EXTDATACOPY
The EXTCODECOPY instruction is deprecated and rejected in EOF contracts, and it does not copy contract code when called from legacy code with an EOF contract as target. A replacement instruction, EXTDATACOPY, has been considered but decided against in order to reduce the scope of changes.
Data-only contracts which previously relied on EXTCODECOPY are thereby discouraged, but if there is a strong need, support for them can easily be brought back by introducing EXTDATACOPY in a future upgrade.
Backwards Compatibility
This change poses no risk to backwards compatibility, as it is introduced only for EOF1 contracts, for which deploying undefined instructions is not allowed, therefore there are no existing contracts using these instructions. The new instructions are not introduced for legacy bytecode (code which is not EOF formatted).
Security Considerations
TBA
Copyright
Copyright and related rights waived via CC0.