Transaction data gas cost reduction

Status: Final
Standards Track: Core
Created: 2019-05-03
Alexey Akhunov (@AlexeyAkhunov), Eli Ben Sasson <eli@starkware.co>, Tom Brand <tom@starkware.co>, Louis Guthmann <louis@starkware.co>, Avihu Levy <avihu@starkware.co>



Simple Summary

We propose to reduce the gas cost of Calldata (GTXDATANONZERO) from its current value of 68 gas per byte to 16 gas per byte, to be backed by mathematical modeling and empirical estimates. The mathematical model is the one used in the works of Sompolinsky and Zohar [1] and Pass, Seeman and Shelat [2], which relates network security to network delay. We shall (1) evaluate the theoretical impact of lower Calldata gas cost on network delay using this model, (2) validate the model empirically, and (3) base the proposed gas cost on our findings.


Motivation

There are a couple of main benefits to accepting this proposal and lowering the gas cost of Calldata:

  • On-chain scalability: Generally speaking, higher bandwidth of Calldata improves scalability, as more data can fit within a single block.

  • Layer two scalability: Layer two scaling solutions can improve scalability by moving storage and computation off-chain, but often introduce data transmission instead.
    • Proof systems such as STARKs and SNARKs use a single proof that attests to the computational integrity of a large computation, say, one that processes a large batch of transactions.
    • Some solutions use fraud proofs, which require the transmission of Merkle proofs.
    • Moreover, one possible data availability solution for layer two is to place data on the main chain, via Calldata.
  • Stateless clients: The same model will be used to determine the price of state access for the stateless client regime, which will be proposed in the State Rent proposal (from version 4). There, it is expected that the gas cost of state-accessing operations will increase roughly in proportion to the extra bandwidth required to transmit the “block proofs”, as well as the extra processing required to verify those block proofs.


Specification

The gas per non-zero byte is reduced from 68 to 16. Gas cost of zero bytes is unchanged.
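As a minimal sketch of what this change means for a single payload, the snippet below prices the same Calldata under both schedules. The function name and the example payload are hypothetical. The zero-byte price of 4 gas is not stated in this section; it is the existing GTXDATAZERO value and is included here as an assumption so that the totals are complete.

```python
# Gas per calldata byte. The non-zero values (68 and 16) come from the text.
# The zero-byte value (4) is the existing GTXDATAZERO constant; it is an
# assumption added here, since the text only says it is unchanged.
GAS_PER_ZERO_BYTE = 4
GAS_PER_NONZERO_BYTE_OLD = 68
GAS_PER_NONZERO_BYTE_NEW = 16


def calldata_gas(data: bytes, gas_per_nonzero_byte: int) -> int:
    """Return the gas charged for the data payload alone (no base transaction cost)."""
    zero_bytes = data.count(0)
    nonzero_bytes = len(data) - zero_bytes
    return zero_bytes * GAS_PER_ZERO_BYTE + nonzero_bytes * gas_per_nonzero_byte


# Hypothetical payload: a 4-byte selector followed by one 32-byte word
# holding the value 1 (31 zero bytes and a single non-zero byte).
payload = bytes.fromhex("a9059cbb") + (1).to_bytes(32, "big")

old_cost = calldata_gas(payload, GAS_PER_NONZERO_BYTE_OLD)
new_cost = calldata_gas(payload, GAS_PER_NONZERO_BYTE_NEW)
print(old_cost, new_cost)  # 464 204
```

Because zero bytes keep their price, the saving depends on how many of the payload bytes are non-zero; densely packed data such as hashes and proofs benefits the most.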


Rationale

Roughly speaking, reducing the gas cost of Calldata leads to potentially larger blocks, which increases the network delay associated with data transmission over the network. This is only part of the full network delay; the other factor is block processing time (which includes storage access). Increasing network delay affects security by lowering the cost of attacking the network, because at any given point in time fewer nodes are up to date with the latest state of the blockchain.
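To give a feel for how much larger blocks could become, the following sketch computes the largest amount of non-zero Calldata that fits in one block if a single transaction spends the entire block gas limit on data. The 8,000,000 block gas limit and the 21,000 gas base transaction cost are assumptions brought in for illustration; neither figure appears in this section.

```python
# Assumptions (not taken from the text above):
TX_BASE_GAS = 21_000         # base cost charged to every transaction
BLOCK_GAS_LIMIT = 8_000_000  # illustrative block gas limit


def max_nonzero_calldata_bytes(block_gas_limit: int, gas_per_nonzero_byte: int) -> int:
    """Upper bound on non-zero calldata bytes when one transaction fills the block."""
    return (block_gas_limit - TX_BASE_GAS) // gas_per_nonzero_byte


old_bytes = max_nonzero_calldata_bytes(BLOCK_GAS_LIMIT, 68)
new_bytes = max_nonzero_calldata_bytes(BLOCK_GAS_LIMIT, 16)
print(old_bytes, new_bytes)  # 117338 498687
```

Under these assumptions the worst-case data payload grows by a factor of 68 / 16 = 4.25, which is the increase in transmission load that the delay model has to account for.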

Yonatan Sompolinsky and Aviv Zohar suggested in [1] an elegant model to relate network delay to network security, and this model is also used in the work of Rafael Pass, Lior Seeman and Abhi Shelat [2]. We briefly explain this model below, because we shall study it theoretically and validate it by empirical measurements to reach the suggested lower gas cost for Calldata.

The model uses the following natural parameters:

  • lambda - block creation rate [1/s]: we treat the process of finding a PoW solution as a Poisson process with rate lambda.
  • beta - chain growth rate [1/s]: the rate at which new blocks are added to the heaviest chain.
  • D - block delay [s]: the time that elapses between the mining of a new block and its acceptance by all the miners (that is, until all miners have switched to mining on top of that block).

Beta Lower Bound

Notice that lambda >= beta, because not all blocks that are found will enter the main chain (as is the case with uncles). In [1] it was shown that for a blockchain using the longest chain rule, one may bound beta from below by lambda / (1 + D * lambda). This lower bound holds in the extremal case where the topology of the network is a clique in which the delay between each pair of nodes is D, the maximal possible delay. Recording both the lower and upper bounds on beta we get

lambda >= beta >= lambda / (1 + D * lambda)               (*)

Notice, as a sanity check, that when there is no delay (D=0) then beta equals lambda, as expected.
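The two-sided bound (*) can be evaluated numerically with a short sketch such as the one below. The function name and the example values (one block every 13 seconds on average, a 2 second delay) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
from typing import Tuple


def beta_bounds(lam: float, delay: float) -> Tuple[float, float]:
    """Return (lower, upper) bounds on the chain growth rate beta, as in (*)."""
    if lam <= 0 or delay < 0:
        raise ValueError("lam must be positive and delay must be non-negative")
    return lam / (1.0 + delay * lam), lam


lam = 1.0 / 13.0  # hypothetical block creation rate [1/s]

# With D = 2 s the lower bound is (1/13) / (1 + 2/13) = 1/15 blocks per second.
lower, upper = beta_bounds(lam, delay=2.0)

# Sanity check from the text: with no delay the two bounds coincide.
lower_no_delay, upper_no_delay = beta_bounds(lam, delay=0.0)
```

Here `math.isclose(lower, 1 / 15)` holds, and with `delay=0.0` both bounds equal lambda.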

Security of the network

An attacker attempting to reorganize the main chain needs to generate blocks at a rate that is greater than beta. Fixing the difficulty level of the PoW puzzle, the total hash rate in the system is correlated to lambda. Thus, beta / lambda is defined as the efficiency of the system, as it measures the fraction of total hash power that is used to generate the main chain of the network.

Rearranging (*) gives the following lower bound on efficiency in terms of delay:

beta / lambda >= 1 / (1 + D * lambda)                 (**)
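Inequality (**) can also be read in the other direction: requiring 1 / (1 + D * lambda) >= e for a target efficiency e gives D <= (1/e - 1) / lambda, so any delay below that threshold guarantees an efficiency of at least e. The sketch below works this out; the function names, the block rate and the 90% target are hypothetical values used only for illustration.

```python
import math


def efficiency_lower_bound(lam: float, delay: float) -> float:
    """Right-hand side of (**): guaranteed fraction of hash power on the main chain."""
    return 1.0 / (1.0 + delay * lam)


def max_delay_for_efficiency(lam: float, target: float) -> float:
    """Largest delay D [s] for which (**) still guarantees efficiency >= target."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target efficiency must be in (0, 1]")
    return (1.0 / target - 1.0) / lam


lam = 1.0 / 13.0  # hypothetical block creation rate [1/s]
d_max = max_delay_for_efficiency(lam, target=0.9)

# Plugging the threshold back into (**) recovers the target efficiency.
recovered = efficiency_lower_bound(lam, d_max)
print(round(d_max, 3), round(recovered, 3))
```

Since (**) is only a lower bound, the real network may tolerate more delay than this threshold; the calculation gives a conservative budget rather than a prediction.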

The delay parameter D

The network delay depends on the location of the mining node within the network and on the current network topology (which changes dynamically), and consequently is somewhat difficult to measure directly. Previously, Christian Decker and Roger Wattenhofer [3] showed that propagation time scales with block size, and Vitalik Buterin showed that the uncle rate, which is tightly related to the efficiency measure (**), also scales with block size [4].

However, the delay function can be decomposed into two parts, D = D_t + D_p, where D_t is the delay caused by the transmission of the block and D_p is the delay caused by the processing of the block by the node. Our model and tests will examine the effect of Calldata on each of D_t and D_p, postulating that the effect on each is different. This may be particularly relevant for Layer 2 Scalability and for Stateless Clients (the second and third benefits listed above), because most of the Calldata associated with these goals consists of Merkle authentication paths, which have a large D_t component but relatively small D_p values.
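One simple way to picture the decomposition is a linear model in which both components grow with the number of bytes in the block, but at different rates. This is only an illustrative sketch: the linear form, the class name and every number below are assumptions, not measurements or part of the proposal.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DelayModel:
    """Hypothetical linear model of D = D_t + D_p as a function of block size."""
    bandwidth_bytes_per_s: float   # assumed effective propagation bandwidth
    processing_s_per_byte: float   # assumed per-byte processing time

    def transmission_delay(self, block_bytes: int) -> float:
        """D_t: time to transmit the block."""
        return block_bytes / self.bandwidth_bytes_per_s

    def processing_delay(self, block_bytes: int) -> float:
        """D_p: time for a node to process the block."""
        return block_bytes * self.processing_s_per_byte

    def total_delay(self, block_bytes: int) -> float:
        """D = D_t + D_p."""
        return self.transmission_delay(block_bytes) + self.processing_delay(block_bytes)


# Placeholder parameters for data that is cheap to process but costly to send,
# such as Merkle authentication paths.
model = DelayModel(bandwidth_bytes_per_s=1_000_000.0, processing_s_per_byte=1e-7)
d_t = model.transmission_delay(100_000)  # 0.1 s
d_p = model.processing_delay(100_000)    # 0.01 s
d = model.total_delay(100_000)           # 0.11 s
```

With parameters like these, D_t dominates D_p, which matches the postulate that this kind of Calldata mainly affects the transmission component.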

Test Cases

To suggest the gas cost of calldata we shall conduct two types of tests:

  1. Network tests, conducted on the Ethereum mainnet, used to estimate the effect of increasing block size on D_p and D_t, on the overall network delay D and on the efficiency ratio (**), as well as on delays between different mining pools. Those tests will include regression tests on existing data, and stress tests to introduce extreme scenarios.
  2. Local tests, conducted on a single node and measuring the processing time as a function of Calldata amount and general computation limits.
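A regression of the kind mentioned in the first test type could, in its simplest form, be an ordinary least-squares fit of observed delay against block size. The sketch below shows only the fitting step. The function name is hypothetical, and the input points are toy placeholders chosen so that the fit is exact; they are not measurements from any network.

```python
import math
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy placeholder points (block size in kB, delay in seconds), NOT real data.
toy_sizes_kb = [0.0, 100.0, 200.0]
toy_delays_s = [1.0, 1.5, 2.0]

slope, intercept = fit_line(toy_sizes_kb, toy_delays_s)
```

In this framing the slope estimates the extra delay per unit of block size and the intercept estimates the size-independent part of the delay; a real analysis would also need to report the fit quality and handle outliers.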

Reference Implementation

  • Parity
  • Geth


References

[1] Yonatan Sompolinsky, Aviv Zohar: Secure High-Rate Transaction Processing in Bitcoin. Financial Cryptography 2015: 507-527

[2] Rafael Pass, Lior Seeman, Abhi Shelat: Analysis of the Blockchain Protocol in Asynchronous Networks, ePrint report 2016/454

[3] Christian Decker, Roger Wattenhofer: Information propagation in the Bitcoin network. P2P 2013: 1-10

[4] Vitalik Buterin: Uncle Rate and Transaction Fee Analysis

Copyright and related rights waived via CC0.
