Two dimensional log filter data structure
相关视频
正文
Abstract
Replace the fixed 2048 bit log event bloom filters in block headers with a new data structure that can adapt to the changing number of events per block and consistently guarantee a sufficiently low false positive ratio. Unlike the per-block bloom filters, the proposed structure allows searching for specific events by accessing only a small portion of the entire dataset which can also be proven with a Merkle proof, making the search both efficient and light client friendly.
As an additional benefit, the new structure provides more precise position information on the potential hits (block number and transaction index) allowing the searcher to only look up a single receipt for each potential hit.
Motivation
Bloom filters are only useful as long as they are sufficiently sparse. False positive ratio rises rapidly with the number of events per filter and the density of 1
bits in the filter bit vector. In the currently existing filter each log address and topic sets 3 out of a fixed length of 2048 bits which resulted in sufficiently sparse filters in the beginning but with the increase of the block gas limits the false positive ratio soon became practically useless.
Another issue with per-block filters in headers is that searching for an event requires access to the entire header chain of the searched section (5-600 bytes per block). The main purpose of making logs a part of consensus was to be able to prove these events with a low amount of data and making trustless interaction with smart contracts possible without operating a full node. While a receipt with a Merkle proof can prove the existence of such events (also worth noting here that since the merge the historical state roots structure of the beacon chain also allows proving the canonicalness of the host blocks without possessing the entire header chain), a log filter structure can prove the non-existence of more such events. The proposed structure using the proposed constants allows a trustless proof of all potential hits within a block range with at least two orders of magnitude less data than the bloom filters currently embedded in headers.
Specification
Terms and definitions
- log value: either an address value or a topic value. Each
LOG
opcode adds one address value and 0..4 topic values. A log value is represented by a 32 byte hash which is calculated asSHA2("A" + address)
orSHA2("T" + topic)
- log value index: values are globally mapped to a linear index space, with a monotonically increasing log value index assigned to each added log value. The log values are added in the order of EVM execution (address value first, then the topic values) so the logs generated in each block and in each transaction of the block occupy a continuous range in the index space, with boundary markers added to each block header and transaction receipt.
- log value pointer: a global pointer to the index space that always points to the next available log value index and is increased by one each time a log value is added. The actual value of log value pointer is stored in each block receipt and each block header after adding the log values generated in the given transaction/block.
- filter map: a
MAP_WIDTH
byMAP_HEIGHT
map intended to help searching for log values in a fixedVALUES_PER_MAP
length section of the log value index space. A mark is placed on the map for each log value. Multiple marks can be placed at the same position if multiple log values are mapped onto the same position. Rows are sparsely encoded as a list of column indices where marks are placed. Filter maps are indexed, the log value index range covered by each map index ismap_index * VALUES_PER_MAP
to(map_index+1) * VALUES_PER_MAP - 1
. The sub-position of a log value inside the range covered by a certain map is called value subindex and is calculated aslog_value_index % VALUES_PER_MAP
and used for calculating the column index where a mark should be placed. Filter maps are tree hashed and referenced in each block header aslog_filter_root
. Note that after processing a block the last filter map usually covers an index range that's not fully populated with values yet and therefore it can contain less thanVALUES_PER_MAP
marks and can be further populated in subsequent blocks. - filter epoch: a group of filter maps stored in the hash tree in a way so that multiple rows of adjacent filter maps with the same row index can be efficiently retrieved in a single Merkle multiproof. The map index range covered by each epoch index is
epoch_index * MAPS_PER_EPOCH
to(epoch_index+1) * MAPS_PER_EPOCH - 1
.
Log value mapping
The log value mapping functions get_row_index
and get_column_index
take a log value and its position in the linear index space and return a position of the filter map where a mark for the given log value should be placed.
def get_row_index(log_value_index, log_value):
map_index = log_value_index // VALUES_PER_MAP
epoch_index = map_index // MAPS_PER_EPOCH
return SHA2(log_value + uint32_to_littleEndian(epoch_index)) % MAP_HEIGHT
def get_column_index(log_value_index, log_value):
value_subindex = log_value_index % VALUES_PER_MAP
map_index = log_value_index // VALUES_PER_MAP
transform_hash = SHA2(log_value + uint32_to_littleEndian(map_index))
return column_transform(value_subindex, transform_hash)
The function column_transform
performs a reversible quasi-random transformation on the range 0 .. MAP_WIDTH-1
. It can be realized with a series of reversible operations (modulo multiplication with odd numbers, modulo addition, binary XOR) where the second arguments of these operations are taken from different sections of the transform_hash
.
def column_transform(value_subindex, transform_hash):
x = value_subindex
x = (x + littleEndian_to_uint32(transform_hash[0:4])) % 2**32
x = (x * (littleEndian_to_uint32(transform_hash[4:8])*2+1)) % 2**32
x = x ^ littleEndian_to_uint32(transform_hash[8:12]
x = (x * (littleEndian_to_uint32(transform_hash[12:16])*2+1)) % 2**32
x = (x + littleEndian_to_uint32(transform_hash[16:20])) % 2**32
x = (x * (littleEndian_to_uint32(transform_hash[20:24])*2+1)) % 2**32
x = x ^ littleEndian_to_uint32(transform_hash[24:28]
x = (x * (littleEndian_to_uint32(transform_hash[28:32])*2+1)) % 2**32
return x
def reverse_transform(column_index, transform_hash):
x = column_index
x = (x + 2**32 - littleEndian_to_uint32(transform_hash[0:4])) % 2**32
x = (x * modInverse(littleEndian_to_uint32(transform_hash[4:8])*2+1, 2**32)) % 2**32
x = x ^ littleEndian_to_uint32(transform_hash[8:12]
x = (x * modInverse(littleEndian_to_uint32(transform_hash[12:16])*2+1, 2**32)) % 2**32
x = (x + 2**32 - littleEndian_to_uint32(transform_hash[16:20])) % 2**32
x = (x * modInverse(littleEndian_to_uint32(transform_hash[20:24])*2+1, 2**32)) % 2**32
x = x ^ littleEndian_to_uint32(transform_hash[24:28]
x = (x * modInverse(littleEndian_to_uint32(transform_hash[28:32])*2+1, 2**32)) % 2**32
return x
Note that with a non-reversible column_transform
the searcher would need to iterate through the entire VALUES_PER_MAP
range for each map row and searched log value and check whether there is a mark in the resulting column_index
, indicating a potential hit. Having a matching reverse_transform
function makes it possible to just iterate through the actual marks found in the map row and see whether they can be transformed back into a potentially valid value_subindex
:
def potential_match_index(map_index, column_index, log_value):
transform_hash = SHA2(log_value + uint32_to_littleEndian(map_index))
potential_value_subindex = reverse_transform(column_index, transform_hash)
if potential_value_subindex < VALUES_PER_MAP:
return map_index * VALUES_PER_MAP + potential_value_subindex
return -1
Adding and searching for values
For pseudocode readability the filter map rows are represented here as filter_map_rows[map_index][row_index]
where each row is a list of column indices. Note that this does not match the consensus hash tree representation which is different for efficiency reasons (see below).
Adding log values generated by a block:
def add_block_logs(block):
for receipt in block.receipts:
for log in receipt.logs:
add_log_value(address_value(log.address))
for topic in log.topics:
add_log_value(topic_value(topic))
receipt.log_value_pointer = log_value_pointer
block.log_value_pointer = log_value_pointer
def add_log_value(log_value):
map_index = log_value_pointer // VALUES_PER_MAP
row_index = get_row_index(map_index // MAPS_PER_EPOCH, log_value)
column_index = get_column_index(log_value_pointer, log_value)
filter_map_rows[map_index][row_index].append(column_index)
log_value_pointer += 1
def address_value(address):
return sha2("A" + address)
def topic_value(topic):
return sha2("T" + topic)
Searching for a log value in a range of blocks:
def search_block_range(first_block_number, last_block_number, log_value):
# convert block number range to log value index range
first_index = 0
if first_block_number > 0:
first_index = canonical_blocks(first_block_number - 1).log_value_pointer
last_index = canonical_blocks(last_block_number).log_value_pointer - 1
return search_index_range(first_index, last_index, log_value)
def search_index_range(first_index, last_index, log_value):
results = []
# iterate through the filter map range where relevant logs might be marked
for map_index in range(first_index // VALUES_PER_MAP, last_index // VALUES_PER_MAP):
# select relevant row and iterate through column indices
row_index = get_row_index(map_index // MAPS_PER_EPOCH, log_value)
row = filter_map_rows[map_index][row_index]
for column_index in row:
# reverse transform column index and check if it is a potential match
match_index = potential_match_index(map_index, column_index, log_value)
if match_index >= first_index and match_index <= last_index:
# find receipt that generated the reverse transformed log value index
receipt = receipt_with_index(match_index)
# check if it actually lists the searched log value
if receipt != null and receipt_has_value(receipt, log_value):
results.append(receipt)
return results
def receipt_with_index(log_value_index):
# return first canonical block receipt where receipt.log_value_pointer > log_value_index
def receipt_has_value(receipt, log_value):
for log in receipt.logs:
if log_value == address_value(log.address):
return true
for topic in log.topics:
if log_value == topic_value(topic):
return true
return false
Note that the example pseudocode shown above realizes a search for a single log value but it is also possible to efficiently filter for specific address/topic combinations and patterns.
Consensus data format
Block headers
Beginning at the execution timestamp FORK_TIMESTAMP
, execution clients MUST replace the logs_bloom
field of the header schema with two new fields: the log_filter_root
(the root hash of the filter map hash tree structure) and log_value_pointer
(the first unoccupied position in the log value index space after processing the block).
The resulting RLP encoding of the header is therefore:
rlp([
parent_hash,
0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347, # ommers hash
coinbase,
state_root,
txs_root,
receipts_root,
log_filter_root,
log_value_pointer,
0, # difficulty
number,
gas_limit,
gas_used,
timestamp,
extradata,
prev_randao,
0x0000000000000000, # nonce
base_fee_per_gas,
withdrawals_root,
blob_gas_used,
excess_blob_gas,
parent_beacon_block_root,
])
Receipts
Beginning at the execution timestamp FORK_TIMESTAMP
, execution clients MUST replace the bloom
field of the transaction receipt schema with a new field, the log_value_pointer
(the first unoccupied position in the log value index space after executing the transaction).
The resulting RLP encoding of the receipt is therefore:
rlp([
type,
post_state,
status,
cumulative_gas_used,
log_value_pointer,
logs,
])
Log filter map
Each row of the filter map is encoded as a series of little endian binary encoded column indices. With the proposed value of MAP_WIDTH = 2**32
this results in a simple and efficient encoding as a series of 4 byte values. Since the order of entries in this list should not affect the filter algorithm, for simplicity the indices are not sorted but simply appended to the list in the original order of the log events. In the rare case of column index collision (two events generating the same column index in the same row) the index is also added twice. This simplifies the in-memory maintenance of the tree and also allows clients to maintain only a single version of the tree and easily roll back events if necessary.
Note that the number of indices in a row of a single map may vary, the global average number of entries per row is VALUES_PER_MAP / MAP_HEIGHT
while the upper limit is VALUES_PER_MAP
. Though such a list could be represented for consensus hashing purposes as List[Bytes4, VALUES_PER_MAP]
it would just add an unnecessary layer of complexity to tree hash this list since typically the entire row is required for any meaningful search operation. Therefore the rows regardless of their length are simply SHA2
hashed and the hashes are stored in the consensus tree format.
Log filter tree
The SHA2
hashes of encoded map rows are stored in log_filter_tree
which is defined as an SSZ binary Merkle tree structured as a list of vector of vectors:
log_filter_tree: List[Vector[Vector[Bytes32, MAPS_PER_EPOCH], MAP_HEIGHT], MAX_EPOCH_HISTORY]
The hash of row row_index
of map map_index
is stored in log_filter_tree[map_index // MAPS_PER_EPOCH][row_index][map_index % MAPS_PER_EPOCH]
. The length of the list of epoch subtrees is always in direct relationship with log_value_pointer
:
epoch_count = (log_value_pointer + VALUES_PER_MAP * MAPS_PER_EPOCH - 1) // VALUES_PER_MAP // MAPS_PER_EPOCH
When a new epoch subtree is added, it should be initialized by filling it with the SHA2
hash of an empty string which is the canonical encoding of a map row with no marks in it.
Proposed constants
MAP_WIDTH
:2**32
MAP_HEIGHT
:2**12
VALUES_PER_MAP
:2**16
MAPS_PER_EPOCH
:2**6
MAX_EPOCH_HISTORY
:2**24
Rationale
Two dimensional mapping
The currently existing bloom filter maps the log events of each block onto a one dimensional bit vector. Searching in such vectors requires accessing and processing the entire vectors from each block in the search range while only a few bits of each of those vectors are actually required in order to search for a specific value. To achieve higher bandwidth efficiency, log data from multiple blocks is collected and mapped in two dimensions where the filter data generated by different log values are sorted into a fixed number of rows. The same log value is mapped onto the same row index during a sufficiently long period (filter epoch), making it possible to access only the interesting rows and making a typical search for a small number of values in a long range of blocks more efficient by multiple orders of magnitude.
Log value index space
In each block a varying number of log values are emitted. In addition to inefficient search, another drawback of per-block fixed size bloom filters is the varying filter utilization leading to over-utilized filters giving many false positives in some blocks and/or wastefully under-utilized filters in some blocks. Block gas limits also tend to change significantly over the long term so any future-proof solution has to be able to adapt to the varying number of log values per block.
Since filter maps of the proposed design are no longer associated with individual blocks but contain filter data from multiple blocks, it is possible to detach them from block boundaries and put a fixed number of log values into each filter map. The horizontal mapping realized by column_transform
and reverse_transform
ensures that potential hits returned by a search give not only the index of the filter map but the exact log value index where the searched log value was probably added. Since a log_value_pointer
field is added to each header and transaction receipt, it is always possible to find the exact transaction and receipt where the log value has probably been added and a canonical receipt covering the log value index of the potential hit always proves whether it was a real hit or a false positive.
Mapping log values on their own linear index space ensures uniform filter utilization of identically structured filter maps. Compared to the alternative of constantly changing adaptive filter size, this approach greatly simplifies the storage and tree hashing scheme and the construction of Merkle multiproofs covering longer block ranges. It also allows mapping each address and topic value separately to consecutive indices and implementing specific address/topic pattern filters.
Sparse horizontal mapping
False positive rate depends on map density and number of marks placed on the map per log value. It can be estimated as FALSE_POSITIVE_RATE = (VALUES_PER_MAP * MARKS_PER_VALUE / MAP_WIDTH / MAP_HEIGHT) ** MARKS_PER_VALUE
. The proposed data structure achieves a low false positive rate by choosing a high MAP_WIDTH
while only adding one mark per value. An alternative would be using multiple marks per log value and a lower MAP_WIDTH
. With a fixed target FALSE_POSITIVE_RATE
the necessary MAP_WIDTH
can be calculated as follows:
AVG_MARKS_PER_ROW = VALUES_PER_MAP * MARKS_PER_VALUE / MAP_HEIGHT
MAP_WIDTH = AVG_MARKS_PER_ROW / (FALSE_POSITIVE_RATE ** (1 / MARKS_PER_VALUE))
One important factor that depends on map size and marks per value is data efficiency. A wider map requires more bits per mark to encode but also allows less marks per value. A simple way of encoding rows is a binary encoding of each mark column index, requiring ceil(log2(MAP_WIDTH))
bits per mark. A close to optimal compression of sparse bit vectors is also considered for comparison (formula not detailed here).
AVG_BITS_PER_ROW = ceil(log2(MAP_WIDTH)) * AVG_MARKS_PER_ROW
COMPRESSED_AVG_BITS_PER_ROW = (log2(MAP_WIDTH / AVG_MARKS_PER_ROW) + 1.5) * AVG_MARKS_PER_ROW
With the following values considered constant, data efficiency depends on MARKS_PER_VALUE
as follows:
VALUES_PER_MAP = 2**16
MAP_HEIGHT = 2**12
FALSE_POSITIVE_RATE = 1 / 2**28
MARKS_PER_VALUE | MAP_WIDTH | AVG_BITS_PER_ROW | COMPRESSED_AVG_BITS_PER_ROW |
---|---|---|---|
1 | 2**32 | 512 | 472 |
2 | 2**19 | 608 | 496 |
3 | 30964 | 720 | 520 |
4 | 2**13 | 832 | 544 |
It shows that the smaller encoding of individual mark indices can almost offset the larger number of marks but still, MARKS_PER_VALUE = 1
results in the best data efficiency when the total size of the filter data set is considered. The advantage of very sparse rows is even greater if we consider the data amount to be accessed when searching for a specific value which requires MARKS_PER_VALUE
rows to be accessed and matched to each other.
Hash tree structure
The hash tree structure and parameters are determined as a tradeoff between two kinds of costs: the bandwidth cost of searching for a log value in a long range of filter maps and the processing/memory cost of updating the canonical hash tree of this structure after processing a block and adding new log values. This tradeoff exists because either proving or modifying multiple leaf nodes of a Merkle tree is more efficient if those nodes are located close to each other.
Searching for a certain log value requires access to a given row index that is calculated as a function of the searched value and the epoch, meaning that in each epoch the same row index of consecutive filter maps are accessed. A row-major indexing of the two-dimensional tree (log_filter_tree[row_index][map_index]
) places these rows next to each other, allowing an efficient Merkle multiproof of an entire epoch that only contains 12 sibling nodes on the row_index
path, 24 on the map_index
path and the actual filter rows.
On the other hand, log values added during block processing typically placed into many different rows but occupies a continuous log value index range which in most cases falls into the range of a single filter map. A column-major indexing (log_filter_tree[map_index][row_index]
) places all rows of the same filter map next to each other, ensuring that during block processing typically only the subtree belonging to a single filter map and the map_index
path leading up to it needs to be re-hashed.
In order to achieve good performance in both aspects, a third, mixed index ordering approach has been chosen for log_filter_tree
that splits map_index
into two parts, epoch_index = map_index // MAPS_PER_EPOCH
and map_subindex = map_index % MAPS_PER_EPOCH
and indexes the tree as log_filter_tree[epoch_index][row_index][map_subindex]
. This approach allows efficient Merkle proofs like row-major ordering for entire epochs and actually performs even better when multiple epochs are proven because the expensive epoch_index
path needs to be proven only once instead of once per epoch. Tree update hashing cost is higher than in case of column-major ordering because it hash to also update a map_subindex
path for every updated row. As calculations below show though, the extra cost is significantly smaller than in case of row-major ordering which has to update an entire map_index
path for every updated row.
The calculations below are based on the proposed constants:
Name | Formula | Value |
---|---|---|
epoch_index path length | log2(MAX_EPOCH_HISTORY) | 24 hashes |
map_subindex path length | log2(MAPS_PER_EPOCH) | 6 hashes |
map_index path length | log2(MAX_EPOCH_HISTORY * MAPS_PER_EPOCH) | 30 hashes |
row_index path length | log2(MAP_HEIGHT) | 12 hashes |
average row length | AVG_BITS_PER_ROW / 8 | 64 bytes |
Merkle proof size is estimated for three different search range lengths: single map, one epoch and 64 epochs. Note that the leaves of the binary Merkle tree are hashes of actual rows but the proof contains the actual row data which has a varying size. For these calculations the average row length is used.
indexing order | 1 map | 1 epoch | 64 epochs |
---|---|---|---|
column-major | (30+12)*32+64 = 1408 | 24*32+64*(12*32+64) = 29440 | 18*32+4096*(12*32+64) = 1835584 |
row-major | (30+12)*32+64 = 1408 | (12+24)*32+64*64 = 5248 | 64*((12+24)*32+64*64) = 335872 |
mixed order | (30+12)*32+64 = 1408 | (24+12)*32+64*64 = 5248 | 18*32+64*(12*32+64*64) = 287296 |
Tree update costs are estimated as a worst-case marginal cost of each additional log value in terms of updated tree nodes and amount of hashed data including actual row data. Note that it is assumed here that all updated rows belong to one or two filter maps and therefore the updated tree nodes leading up to these filter maps are a negligible one time cost. Also note that just like in the calculations above the average encoded row data length is used for hashed data amount estimation.
indexing order | updated nodes | hashed bytes |
---|---|---|
column-major | 12 | 12*64+64 = 832 |
row-major | 30+12 = 42 | 42*64+64 = 2752 |
mixed order | 12+6 = 18 | 18*64+64 = 1216 |
Backwards Compatibility
The existing log filter API (eth_getLogs
, eth_newFilter
, eth_getFilterLogs
, eth_getFilterChanges
) can be implemented with the new filter data structure. Applications relying on this API can operate without any change, with a higher performance. The EVM is not affected in any way.
Test Cases
Reference Implementation
Security Considerations
Safe access with a remote prover
In order to trustlessly access relevant data with the help of a remote prover, the prover needs to
- prove canonical execution headers by number in order to determine the log value index range belonging to the searched block number range
- prove the relevant rows of filter maps based on map index and row index
- prove an individual transaction receipt with a Merkle proof from a canonical execution header, based on the log value index of potential matches
The relevant parts of filter maps can be proved with SSZ multiproofs from the log_filter_root
of the latest execution header. Proving canonical execution headers requires a canonical beacon header from which it is possible to prove any beacon state root through the state_roots
and historical_summary
structures, from which it is possible to prove the belonging execution block hash. In conclusion, any search with a remove prover is as safe as the client's knowledge about a recent canonical beacon header.
False positives
From the filter maps a set of potential matches can be derived for any block range and log value or pattern of log values. A canonical transaction receipt that covers the log value index for the potential match can prove whether it was actual match or a false positive. The design guarantees that the set of potential matches includes all actual matches but it should also be ensured that an excessive amount false positives will not make the bandwidth and processing costs of the search prohibitively high.
False positives can happen when reverse_transform(column_index, transform_hash)
gives a valid potential_value_subindex
that is less than VALUES_PER_MAP
while the column_index
was actually generated by column_transform(value_subindex, another_transform_hash)
where another_transform_hash
belongs to a different log value. By assuming a random uniform distribution of column indices the chance of a false positive can be estimated as VALUES_PER_MAP / MAP_WIDTH = 1 / 2**16
for each mark in the map row. Since each map contains a maximum of VALUES_PER_MAP
marks, by assuming a random uniform distribution of row indices the rate of false positives can be estimated as VALUES_PER_MAP^2 / MAP_WIDTH / MAP_HEIGHT = 1 / 2**12
per map or VALUES_PER_MAP / MAP_WIDTH / MAP_HEIGHT = 1 / 2**28
per individual log value. Assuming 2**11
log values per block this gives a rate of one false positive per 2**17
blocks.
Though certain log values might be used a lot more than others and therefore the row index distribution might not be entirely uniform, changing the row mapping per filter epoch ensures that even very often used log values will not raise the false positive rate of certain other log values over the long term. A deliberate attack on a certain important log value (address value in this case, since topic values can be freely used by anyone) in order to raise its false positive rate can not be ruled out entirely since with a low amount of filter data generated per log value it is always possible to "mine" another value that generates colliding filter data. The sparse quasi-randomized horizontal mapping makes this attack a lot harder though, since the column index depends on both the log value and the exact log value index, making this attack only possible for block creators who are probably offered MEV rewards for much more lucrative manipulations of the transaction set.
Copyright
Copyright and related rights waived via CC0.