Two dimensional log filter data structure
Abstract
Replace the fixed 2048 bit log event bloom filters in block headers with a new data structure that can adapt to the changing number of events per block and consistently guarantee a sufficiently low false positive ratio.
The proposed structure maps all log entries onto a global linear log index space and hashes them into a Merkle tree based on that index. It also contains a filter map for every fixed length section of the log index space, hashed into the same Merkle tree. These are two dimensional sparse bit maps similar to bloom filters that allow searching for log address/topic patterns. Unlike the per-block bloom filters, they allow searching for specific events by accessing only a small portion of the entire dataset which can also be proven with a Merkle proof, making the search both efficient and light client friendly. They also provide exact position information for potential hits. Instead of signaling probable presence in a given block, potential hits derived from a filter map are pointers to the log index space and can be directly used to look up the log at the given index from the Merkle tree of log entries.
The proposed design uses the same structure for efficient local search and remote proof generation/verification, thereby simplifying implementation of provers and verifiers. It also allows validators that are not interested in either searching or proving logs to generate the log index root hash by maintaining a minimal log index state with a relatively small (hard capped) size.
Motivation
Bloom filters are only useful as long as they are sufficiently sparse. The false positive ratio rises rapidly with the number of events per filter and the density of 1 bits in the filter bit vector. In the currently existing bloom filter each log address and topic sets 3 out of a fixed length of 2048 bits, which resulted in sufficiently sparse filters in the beginning, but as block gas limits increased the false positive ratio soon made the filter practically useless. Mainnet blocks currently add over 1000 log addresses and topics on average, so the bloom filter size would need to increase about tenfold in order to achieve acceptable false positive rates again, allowing log search with a proof size of about 3 kilobytes per block, which is still relatively expensive.
The main purpose of making logs a part of consensus was to be able to prove these events with a low amount of data, making trustless interaction with smart contracts possible without operating a full node. The filter maps achieve this by sorting events into rows, allowing log range proofs that are smaller by about 3 orders of magnitude.
Specification
Terms and definitions
- log value: either an address value or a topic value. Each LOG opcode adds one address value and 0..4 topic values. A log value is represented by a 32 byte hash which is calculated as SHA2(address) or SHA2(topic).
- log index: log values are globally mapped to a linear index space, with a monotonically increasing log index assigned to each added log value. The log values are added in the order of EVM execution (address value first, then the topic values) so the logs generated in each block and in each transaction of the block occupy a continuous range in the index space (see the illustrative sketch after this list). A block delimiter is also added between blocks which has its own log index and is added to the Merkle tree of log entries but not to the filter maps.
- log entry: an SSZ encoded log event with position metadata (a LogEntry container) is added to the log_entries Merkle tree at the first log index assigned to the event (the one assigned to the address value). The entries at indices assigned to topic values are left empty (a LogEntry with all zero fields).
- block delimiter: uses the same encoding as regular LogEntry containers but is always distinguishable. It is added to the log_entries tree before each block and has its own log index assigned.
- filter map: a MAP_WIDTH by MAP_HEIGHT sized sparse map intended to help searching for log values in a fixed VALUES_PER_MAP length section of the log index space. Each log value is marked on the map at a row and column that depend on the log index and the log value itself. Rows are sparsely encoded as a list of marked column indices (in the order of occurrence, duplicates not filtered out for simplicity). Each map contains at most VALUES_PER_MAP marks and therefore the chance of false positives is kept at a constant low level.
- filter epoch: a MAPS_PER_EPOCH sized group of consecutive filter maps stored in the hash tree in a way that multiple rows of adjacent filter maps with the same row index can be efficiently retrieved in a single Merkle multiproof. The log value to row index mapping is constant during a single epoch but changes between epochs.
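As an illustration of the definitions above (not part of the specification), a single log event with two topics occupies three consecutive log indices, and only the first of them carries the LogEntry:

```python
# Illustrative sketch only: log indices consumed by one log event starting at log index i.
def log_value_count(log):
    # one address value plus one value per topic
    return 1 + len(log.topics)

# For a LOG2 event (address A, topics T1, T2) starting at log index i:
#   i     -> SHA2(A)   (the LogEntry container is stored in log_entries at this index)
#   i + 1 -> SHA2(T1)  (log_entries slot left empty)
#   i + 2 -> SHA2(T2)  (log_entries slot left empty)
```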
Consensus data format
Block headers
Beginning at the execution timestamp FORK_TIMESTAMP, execution clients MUST replace the logs_bloom field of the header schema with log_index_root, which is the root hash of the LogIndex structure after adding the logs emitted in the given block.
The resulting RLP encoding of the header is therefore:
rlp([
parent_hash,
0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347, # ommers hash
coinbase,
state_root,
txs_root,
receipts_root,
log_index_root,
0, # difficulty
number,
gas_limit,
gas_used,
timestamp,
extradata,
prev_randao,
0x0000000000000000, # nonce
base_fee_per_gas,
withdrawals_root,
blob_gas_used,
excess_blob_gas,
parent_beacon_block_root,
])
Container types
class LogIndex(Container):
    epochs: Vector[LogIndexEpoch, MAX_EPOCH_HISTORY]
    next_index: uint64  # next log index to be added

class LogIndexEpoch(Container):
    filter_maps: Vector[Vector[Bytes32, MAPS_PER_EPOCH], MAP_HEIGHT]  # linear SHA2 hashes of filter map rows, zero for empty/non-existent rows
    log_entries: Vector[LogEntry, MAPS_PER_EPOCH * VALUES_PER_MAP]  # LogEntry containers at the first index of each log event, empty otherwise

class LogEntry(Container):
    log: Log
    meta: LogMeta

class LogMeta(Container):
    block_number: uint64
    transaction_hash: Root
    transaction_index: uint64
    log_in_tx_index: uint64

class Log(Container):
    address: ExecutionAddress
    topics: List[Bytes32, MAX_TOPICS_PER_LOG]
    data: ByteList[MAX_LOG_DATA_SIZE]

class BlockDelimiterEntry(Container):
    dummyLog: Log  # zero address and empty lists
    meta: BlockDelimiterMeta

class BlockDelimiterMeta(Container):
    block_number: uint64
    block_hash: Root
    timestamp: uint64
    dummy_value: uint64  # 2**64-1
Log entries and block delimiters
Log events with position metadata are added to the log_entries tree at the log index assigned to the address value of the log. This allows a simple Merkle proof of all fields of the eth_getLogs JSON-RPC response except for blockHash, which cannot be hashed into the log_entries tree at the time of block processing because the hash of the processed block is not known yet. In order to make blockHash provable too, a BlockDelimiterEntry with the parent block's number and hash is added before adding the entries of the next block. This allows the prover to either prove the block delimiter with the same block number as the found log entry or, in case of the current head block, prove that the last log entry in the tree belongs to the same block number.
Filter map row encoding
Each row of the filter map is encoded as a series of little endian binary encoded column indices. With the proposed value of MAP_WIDTH = 2**32 this results in a simple and efficient encoding as a series of 4 byte values. Since the order of entries in this list should not affect the filter algorithm, for simplicity the indices are not sorted but simply appended to the list in the original order of the log events. In the rare case of a column index collision (two events generating the same column index in the same row) the index is added twice. This simplifies the in-memory maintenance of the tree.
Note that the number of indices in a row may vary. The total number of indices in a fully populated filter map is VALUES_PER_MAP minus the number of block delimiters, which are not marked on the map. While the average row size is about VALUES_PER_MAP // MAP_HEIGHT, the upper limit of an individual row length is VALUES_PER_MAP. Such a list could be represented for consensus hashing purposes as List[Bytes4, VALUES_PER_MAP], but this would just add an unnecessary layer of complexity and extra processing time to tree hash the list, since typically the entire row is required for any meaningful search operation. Therefore the rows, regardless of their length, are simply SHA2 hashed and the hashes are stored in the consensus tree format.
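A minimal sketch of the row encoding and hashing described above; using hashlib.sha256 for SHA2 is an assumption, the specification only names the hash function:

```python
import hashlib

def encode_row(column_indices):
    # Concatenate 4 byte little endian column indices in their original order,
    # keeping duplicates, as described above.
    return b"".join(c.to_bytes(4, "little") for c in column_indices)

def hash_row(column_indices):
    # Rows are hashed as flat byte strings; the resulting hashes are what the
    # filter_maps vectors of the consensus tree store.
    return hashlib.sha256(encode_row(column_indices)).digest()
```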
Proposed constants
| Name | Value |
|---|---|
| MAP_WIDTH | 2**32 |
| MAP_HEIGHT | 2**12 |
| VALUES_PER_MAP | 2**16 |
| MAPS_PER_EPOCH | 2**6 |
| MAX_EPOCH_HISTORY | 2**24 |
Initialization and minimal state
Since the Merkle trees are updated with increasing keys, they can be initialized with a Merkle branch of the next leaf to be updated, plus the previous leaf value itself, except in the case of log_entries_branch where this is always zero. Providing LogIndexMinimalState on request needs to be added to the sync protocol in order to allow freshly synced nodes to bootstrap. A validator that wants to keep its state minimal and does not want to prove historical log index data can also discard old Merkle tree nodes after each update and use this structure as its log index state.
class LogIndexMinimalState:
    next_index: uint64  # next log index to be added
    epoch_branch: Vector[Bytes32, floorlog2(MAX_EPOCH_HISTORY)]  # Merkle branch of the epoch where next_index points
    filter_map_rows: Vector[List[uint32, VALUES_PER_MAP], MAP_HEIGHT]  # rows of the filter map where next_index points
    filter_map_branches: Vector[Vector[Bytes32, floorlog2(MAPS_PER_EPOCH)], MAP_HEIGHT]  # Merkle branches of each row of the filter map where next_index points
    log_entries_branch: Vector[Bytes32, floorlog2(MAPS_PER_EPOCH * VALUES_PER_MAP) + 3]  # Merkle branch of the log entry where next_index points
This structure consists of floorlog2(MAX_EPOCH_HISTORY) + MAP_HEIGHT * floorlog2(MAPS_PER_EPOCH) + floorlog2(MAPS_PER_EPOCH * VALUES_PER_MAP) branch hashes and at most VALUES_PER_MAP column indices in total. With the proposed constants this amounts to 32*(24+4096*6+6+16) + 4*65536 = 1050048 bytes plus encoding overhead.
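The byte count above can be reproduced directly from the proposed constants; a quick check matching the arithmetic in the previous paragraph:

```python
from math import log2

MAP_HEIGHT = 2**12
VALUES_PER_MAP = 2**16
MAPS_PER_EPOCH = 2**6
MAX_EPOCH_HISTORY = 2**24

branch_hashes = (
    int(log2(MAX_EPOCH_HISTORY))                    # epoch_branch: 24
    + MAP_HEIGHT * int(log2(MAPS_PER_EPOCH))        # filter_map_branches: 4096 * 6
    + int(log2(MAPS_PER_EPOCH * VALUES_PER_MAP))    # log_entries_branch: 6 + 16
)
max_column_indices = VALUES_PER_MAP                 # 4 bytes each

print(32 * branch_hashes + 4 * max_column_indices)  # 1050048
```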
Updating the log index
Filter map row and column mapping
The log value mapping functions get_row_index and get_column_index take a log value and its position in the linear index space and return the position on the filter map where a mark for the given log value should be placed.
def get_row_index(epoch_index, log_value):
    # epoch_index = log_value_index // VALUES_PER_MAP // MAPS_PER_EPOCH
    return SHA2(log_value + uint32_to_littleEndian(epoch_index)) % MAP_HEIGHT

def get_column_index(log_value_index, log_value):
    value_subindex = log_value_index % VALUES_PER_MAP
    map_index = log_value_index // VALUES_PER_MAP
    transform_hash = SHA2(log_value + uint32_to_littleEndian(map_index))
    return column_transform(value_subindex, transform_hash)
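The helper functions used by the pseudocode in this section (SHA2, uint32_to_littleEndian, littleEndian_to_uint32 and, below, modInverse) are not formally defined in this document. One possible Python interpretation, offered purely as an assumption for illustration:

```python
import hashlib

def SHA2(data):
    # SHA-256 digest as raw bytes
    return hashlib.sha256(data).digest()

def uint32_to_littleEndian(value):
    return value.to_bytes(4, "little")

def littleEndian_to_uint32(data):
    return int.from_bytes(data, "little")

def modInverse(a, m):
    # modular multiplicative inverse; a is always odd here, so it is invertible mod 2**32
    return pow(a, -1, m)
```

Note that get_row_index above applies % MAP_HEIGHT directly to the SHA2 output; a concrete implementation would have to interpret the digest as an integer first (for example by taking littleEndian_to_uint32 of its first 4 bytes), which this document does not pin down.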
The function column_transform and its inverse reverse_transform realize a quasi-random bijective mapping on the set of integers 0 .. MAP_WIDTH-1. They consist of a series of reversible operations (modulo multiplication with odd numbers, modulo addition, binary XOR) where the second arguments of these operations are taken from different sections of the transform_hash. When adding a mark on the filter map, the sub-index of the log value inside the map (value_subindex = log_index % VALUES_PER_MAP) is used as the input of column_transform.
def column_transform(value_subindex, transform_hash):
    x = value_subindex
    x = (x + littleEndian_to_uint32(transform_hash[0:4])) % 2**32
    x = (x * (littleEndian_to_uint32(transform_hash[4:8])*2+1)) % 2**32
    x = x ^ littleEndian_to_uint32(transform_hash[8:12])
    x = (x * (littleEndian_to_uint32(transform_hash[12:16])*2+1)) % 2**32
    x = (x + littleEndian_to_uint32(transform_hash[16:20])) % 2**32
    x = (x * (littleEndian_to_uint32(transform_hash[20:24])*2+1)) % 2**32
    x = x ^ littleEndian_to_uint32(transform_hash[24:28])
    x = (x * (littleEndian_to_uint32(transform_hash[28:32])*2+1)) % 2**32
    return x
def reverse_transform(column_index, transform_hash):
    x = column_index
    x = (x * modInverse(littleEndian_to_uint32(transform_hash[28:32])*2+1, 2**32)) % 2**32
    x = x ^ littleEndian_to_uint32(transform_hash[24:28])
    x = (x * modInverse(littleEndian_to_uint32(transform_hash[20:24])*2+1, 2**32)) % 2**32
    x = (x + 2**32 - littleEndian_to_uint32(transform_hash[16:20])) % 2**32
    x = (x * modInverse(littleEndian_to_uint32(transform_hash[12:16])*2+1, 2**32)) % 2**32
    x = x ^ littleEndian_to_uint32(transform_hash[8:12])
    x = (x * modInverse(littleEndian_to_uint32(transform_hash[4:8])*2+1, 2**32)) % 2**32
    x = (x + 2**32 - littleEndian_to_uint32(transform_hash[0:4])) % 2**32
    return x
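Since column_transform is built only from reversible 32 bit operations, reverse_transform recovers its input exactly. A minimal sanity check of this round-trip property, assuming the helper sketches above and the two functions as defined:

```python
import os

# Round-trip property: reverse_transform inverts column_transform for any
# transform_hash and any 32 bit input value.
for _ in range(1000):
    transform_hash = os.urandom(32)
    value_subindex = littleEndian_to_uint32(os.urandom(4))
    column_index = column_transform(value_subindex, transform_hash)
    assert reverse_transform(column_index, transform_hash) == value_subindex
```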
Note that with a non-reversible column_transform the searcher would need to iterate through the entire VALUES_PER_MAP range for each map row and searched log value and check whether there is a mark at the resulting column_index, indicating a potential hit. Having a matching reverse_transform function makes it possible to iterate only through the marked column indices actually found in the map row and check whether they can be transformed back into a potentially valid value_subindex that is less than VALUES_PER_MAP:
def potential_match_index(map_index, column_index, log_value):
    transform_hash = SHA2(log_value + uint32_to_littleEndian(map_index))
    potential_value_subindex = reverse_transform(column_index, transform_hash)
    if potential_value_subindex < VALUES_PER_MAP:
        return map_index * VALUES_PER_MAP + potential_value_subindex
    return -1
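For illustration, a searcher holding the decoded rows of a single filter map could collect the potential matches for one log value as sketched below; search_single_map and rows (a mapping from row index to the list of marked column indices of that map) are hypothetical names, not part of the specification:

```python
def search_single_map(rows, map_index, log_value):
    # Returns the log indices at which log_value was potentially marked on this map.
    epoch_index = map_index // MAPS_PER_EPOCH
    row_index = get_row_index(epoch_index, log_value)
    matches = []
    for column_index in rows[row_index]:
        log_value_index = potential_match_index(map_index, column_index, log_value)
        if log_value_index >= 0:
            matches.append(log_value_index)
    # Potential matches only; the corresponding log_entries still have to be
    # checked in order to filter out false positives.
    return matches
```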
Block processing
# Add all log values emitted in the block to the log index; should be called even if the block is empty
def add_block_logs(log_index, block):
    if block.number > 0:
        # add block delimiter entry for the parent block
        block_delimiter_meta = BlockDelimiterMeta(block_hash = block.parent_hash, block_number = block.number-1, timestamp = block.parent.timestamp, dummy_value = 2**64-1)
        block_delimiter_entry = BlockDelimiterEntry(meta = block_delimiter_meta)
        log_index.epochs[log_index.next_index // (VALUES_PER_MAP*MAPS_PER_EPOCH)].log_entries[log_index.next_index % (VALUES_PER_MAP*MAPS_PER_EPOCH)] = block_delimiter_entry
        log_index.next_index += 1
    # add log entries and mark log values on filter maps
    for tx_index, receipt in enumerate(block.receipts):
        tx_hash = SHA3(block.transactions[tx_index])
        for log_in_tx_index, log in enumerate(receipt.logs):
            log_meta = LogMeta(transaction_hash = tx_hash, block_number = block.number, transaction_index = tx_index, log_in_tx_index = log_in_tx_index)
            log_entry = LogEntry(meta = log_meta, log = log)
            log_index.epochs[log_index.next_index // (VALUES_PER_MAP*MAPS_PER_EPOCH)].log_entries[log_index.next_index % (VALUES_PER_MAP*MAPS_PER_EPOCH)] = log_entry
            add_log_value(log_index, address_value(log.address))
            for topic in log.topics:
                add_log_value(log_index, topic_value(topic))
    block.log_index_root = hash_tree_root(log_index)

# Mark a single log value on the filter maps
def add_log_value(log_index, log_value):
    map_index = log_index.next_index // VALUES_PER_MAP
    epoch_index = map_index // MAPS_PER_EPOCH
    map_subindex = map_index % MAPS_PER_EPOCH
    row_index = get_row_index(epoch_index, log_value)
    column_index = get_column_index(log_index.next_index, log_value)
    log_index.epochs[epoch_index].filter_maps[row_index][map_subindex].append(column_index)
    log_index.next_index += 1

def address_value(address):
    return SHA2(address)

def topic_value(topic):
    return SHA2(topic)
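For illustration, a potential match returned by the filter maps (a log index) can be resolved to its LogEntry directly from the structure defined above; get_log_entry is a hypothetical helper shown only to demonstrate the lookup:

```python
def get_log_entry(log_index_struct, i):
    # i is a log index, e.g. a potential match returned by potential_match_index;
    # the entry is empty unless i is the first index of a log event.
    epoch_index = i // (VALUES_PER_MAP * MAPS_PER_EPOCH)
    return log_index_struct.epochs[epoch_index].log_entries[i % (VALUES_PER_MAP * MAPS_PER_EPOCH)]
```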
Rationale
Log index space
In each block a varying number of log values are emitted. In addition to inefficient search, another drawback of per-block fixed size bloom filters is the varying filter utilization, leading to over-utilized filters that give many false positives in some blocks and wastefully under-utilized filters in others. Block gas limits also tend to change significantly over the long term, so any future-proof solution has to be able to adapt to the varying number of log values per block.
Since the filter maps of the proposed design are no longer associated with individual blocks but contain filter data from multiple blocks, it is possible to detach them from block boundaries and put a fixed number of log values into each filter map. The horizontal mapping realized by column_transform and reverse_transform ensures that potential hits returned by a search give not only the index of the filter map but the exact log index where the searched log value was probably added, which can be directly used to look up the single log entry from log_entries.
Mapping log values on their own linear index space ensures uniform filter utilization of identically structured filter maps. Compared to the alternative of constantly changing adaptive filter size, this approach greatly simplifies the storage and tree hashing scheme and the construction of Merkle multiproofs covering longer block ranges. It also allows mapping each address and topic value separately to consecutive indices and implementing specific address/topic pattern filters.
Sparse horizontal mapping
The false positive rate depends on map density and the number of marks placed on the map per log value. It can be estimated as FALSE_POSITIVE_RATE = (VALUES_PER_MAP * MARKS_PER_VALUE / MAP_WIDTH / MAP_HEIGHT) ** MARKS_PER_VALUE. The proposed data structure achieves a low false positive rate by choosing a high MAP_WIDTH while only adding one mark per value. An alternative would be using multiple marks per log value and a lower MAP_WIDTH. With a fixed target FALSE_POSITIVE_RATE the necessary MAP_WIDTH can be calculated as follows:
AVG_MARKS_PER_ROW = VALUES_PER_MAP * MARKS_PER_VALUE / MAP_HEIGHT
MAP_WIDTH = AVG_MARKS_PER_ROW / (FALSE_POSITIVE_RATE ** (1 / MARKS_PER_VALUE))
One important factor that depends on map size and marks per value is data efficiency. A wider map requires more bits per mark to encode but also allows fewer marks per value. A simple way of encoding rows is a binary encoding of each mark column index, requiring ceil(log2(MAP_WIDTH)) bits per mark. A close to optimal compression of sparse bit vectors is also considered for comparison (the derivation of its size estimate is not detailed here):
AVG_BITS_PER_ROW = ceil(log2(MAP_WIDTH)) * AVG_MARKS_PER_ROW
COMPRESSED_AVG_BITS_PER_ROW = (log2(MAP_WIDTH / AVG_MARKS_PER_ROW) + 1.5) * AVG_MARKS_PER_ROW
With the following values considered constant, data efficiency depends on MARKS_PER_VALUE as follows:
VALUES_PER_MAP = 2**16
MAP_HEIGHT = 2**12
FALSE_POSITIVE_RATE = 1 / 2**28
| MARKS_PER_VALUE | MAP_WIDTH | AVG_BITS_PER_ROW | COMPRESSED_AVG_BITS_PER_ROW |
|---|---|---|---|
| 1 | 2**32 | 512 | 472 |
| 2 | 2**19 | 608 | 496 |
| 3 | 30964 | 720 | 520 |
| 4 | 2**13 | 832 | 544 |
This shows that the smaller encoding of individual mark indices can almost offset the larger number of marks, but MARKS_PER_VALUE = 1 still results in the best data efficiency when the total size of the filter data set is considered. The advantage of very sparse rows is even greater if we consider the amount of data that has to be accessed when searching for a specific value, which requires MARKS_PER_VALUE rows to be accessed and matched against each other.
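For reference, the table above can be reproduced from the formulas in this section:

```python
from math import ceil, log2

VALUES_PER_MAP = 2**16
MAP_HEIGHT = 2**12
FALSE_POSITIVE_RATE = 1 / 2**28

for MARKS_PER_VALUE in (1, 2, 3, 4):
    avg_marks_per_row = VALUES_PER_MAP * MARKS_PER_VALUE / MAP_HEIGHT
    # MAP_WIDTH rounded to the nearest integer, as in the table
    map_width = round(avg_marks_per_row / (FALSE_POSITIVE_RATE ** (1 / MARKS_PER_VALUE)))
    avg_bits_per_row = ceil(log2(map_width)) * avg_marks_per_row
    compressed = (log2(map_width / avg_marks_per_row) + 1.5) * avg_marks_per_row
    print(MARKS_PER_VALUE, map_width, round(avg_bits_per_row), round(compressed))
```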
Log entries tree
Hashing entire logs along with position info into a tree greatly simplifies the remote proving/verifying process. There is no need to separately prove the canonicalness of block headers and the receipts referenced in them; everything can be proven with Merkle proofs of a single LogIndex structure.
Storing the log_entries subtrees directly in their proposed merkleized format is not really efficient though. This is not an issue for validators that want to maintain a minimal state; for them, updating the log_entries subtrees is really cheap as they only need to maintain a single Merkle branch pointing to the next log index. Provers can implement log_entries efficiently by storing a subset of the Merkle tree nodes (the ones close to the root of each epoch's subtree) and generating the rest on demand based on the receipts that encode the same data in a more compact form. This also requires storing the log indices of block delimiters along with reverse pointers to the block numbers at the beginning of on-demand generated log_entries subtrees.
Backwards Compatibility
The existing log filter API (eth_getLogs, eth_newFilter, eth_getFilterLogs, eth_getFilterChanges) can be implemented with the new filter data structure. Applications relying on this API can operate without any change, with higher performance. The EVM is not affected in any way.
Test Cases
Reference Implementation
Security Considerations
Safe access with a remote prover
In order to prove the complete set of matches for a given search pattern in a given block range, the prover needs to
- prove the log index range that corresponds to the searched block number range by proving the block delimiter entries of first_block - 1 and last_block
- prove the relevant rows of filter maps based on map index and row index (the verifier can determine the relevant rows in advance based on the log values in the search pattern and the relevant log index range)
- prove the actual log entry belonging to any potentially matching log index and also the block delimiter entry with the same block number if the blockHash of the log is needed
Since all three steps can be realized with Merkle proofs of the same LogIndex structure referenced in the block headers, any search with a remote prover is as safe as the client's knowledge about the most recent available canonical block header.
False positives
From the filter maps a set of potential matches can be derived for any block range and log value or pattern of log values. These matches can then be looked up in the log_entries trees and the actually matching logs added to the set of results. The design guarantees that the set of potential matches includes all actual matches, but it should also be ensured that an excessive amount of false positives will not make the bandwidth and processing costs of the search prohibitively high.
False positives can happen when reverse_transform(column_index, transform_hash) gives a valid potential_value_subindex that is less than VALUES_PER_MAP while the column_index was actually generated by column_transform(value_subindex, another_transform_hash) where another_transform_hash belongs to a different log value. Assuming a random uniform distribution of column indices, the chance of a false positive can be estimated as VALUES_PER_MAP / MAP_WIDTH = 1 / 2**16 for each mark in the map row. Since each map contains a maximum of VALUES_PER_MAP marks, assuming a random uniform distribution of row indices the rate of false positives can be estimated as VALUES_PER_MAP^2 / MAP_WIDTH / MAP_HEIGHT = 1 / 2**12 per map, or VALUES_PER_MAP / MAP_WIDTH / MAP_HEIGHT = 1 / 2**28 per individual log value. Assuming 2**11 log values per block this gives a rate of one false positive per 2**17 blocks.
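The estimates above follow directly from the proposed constants; a quick check:

```python
MAP_WIDTH = 2**32
MAP_HEIGHT = 2**12
VALUES_PER_MAP = 2**16

per_mark = VALUES_PER_MAP / MAP_WIDTH                 # 1 / 2**16 per mark in a matching row
per_map = VALUES_PER_MAP**2 / MAP_WIDTH / MAP_HEIGHT  # 1 / 2**12 per fully populated map
per_value = VALUES_PER_MAP / MAP_WIDTH / MAP_HEIGHT   # 1 / 2**28 per individual log value

log_values_per_block = 2**11                          # assumption used in the text above
print(1 / (per_value * log_values_per_block))         # 131072.0 blocks per false positive (= 2**17)
```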
Though certain log values might be emitted a lot more often than others and therefore the row index distribution might not be entirely uniform, changing the row mapping per filter epoch ensures that even very frequently used log values will not raise the false positive rate of certain other log values over the long term. A deliberate attack on a certain important log value in order to raise its false positive rate cannot be ruled out entirely, since with a low amount of filter data generated per log value it is always possible to "mine" another value that generates colliding filter data. The sparse quasi-randomized horizontal mapping makes this attack a lot harder though, since the column index depends on both the log value and the exact log index, making the attack only possible for block creators, who are probably offered MEV rewards for much more lucrative manipulations of the transaction set.
Copyright
Copyright and related rights waived via CC0.