Decoding Decentralization
An Ethereum Blockchain Data Primer: Part II
In Part I of Decoding Decentralization, we framed Ethereum (and all blockchains really) as a distributed decentralized database.
We then used the change-data capture design pattern to replay all transfer events of a given smart contract.
By capturing those events through the log entries they create, we could aggregate them and determine how many Murakami Flower NFTs a given wallet address has, or which holder wallets constitute a whale, etc.
In Part II, I want to dive deeper into the interaction between blocks, transactions, and the log entries they create, as well as the process of log ABI translation which is critical for any chain-based analytics.
The State of the Chain Address
If we think of Ethereum as a gigantic database that manages state, then, just as in traditional databases, the transaction is the primary mechanism for initiating a state change.
For example, if you buy one of those beautiful Murakami Flowers NFTs, then its smart contract has to update its internal mapping between wallet addresses and tokens.
Also, since you're spending ETH, Ethereum needs to keep track of your wallet's new account balance post-purchase.
Put simply, multiple state changes can occur throughout a transaction - account-based ones, which are global to the entire chain, and more localized ones, which live inside the smart contract's on-chain storage.
Let's uncover where all of this state is buried by taking a known block that contains Murakami NFT transactions and decoding it.
Say hello to my little friend, block 0x1055D65:
curl -s -X POST -H 'Content-Type: application/json' http://localhost:8545 --data '{"id": 1, "jsonrpc": "2.0", "method": "eth_getBlockByNumber", "params": ["0x1055D65", false]}' | jq .result
{
"baseFeePerGas": "0x8ae26672e",
"difficulty": "0x0",
"extraData": "0x7273796e632d6275696c6465722e78797a",
"gasLimit": "0x1c9c380",
"gasUsed": "0x99253e",
"hash": "0xec5437bf2ff64483deac4a619b06c32ff01524a77d6c5f10efff2418d39bcd77",
"logsBloom": "0x5fed4420c61d14aec4076260e36040b944994068e0848d26e4e36da63e9034a452075f8f9e683b681c29038a530983303281847c895360d282184625123f6cc87049519a2f3098682b65c80f4436f87955ee815c9347ca434994c8b1903baec47b1000003b26000205466ca0045889c1b211f84ca195d49c82d303d17bac2779284012d90db00cc861dd19c840c20912652c0e834f4fa468c9a124c6207d08b2eacd6584d993628bcbe2d8c030891c0a852444086cd2a002dea01a7a82a2d457158b126ac041c309002c5fda10859950c3c6bc38e4ed22528c34e726168a6f4600fdf6a8254420b349f4b0cc38014861c8f6232db8ead14e8c006802ea63345a",
"miner": "0x1f9090aae28b8a3dceadf281b0f12828e676c326",
"mixHash": "0x671783b16a35ac54e3066fb8aae0962bae26d67d68b5f25866914194d4075ae7",
"nonce": "0x0000000000000000",
"number": "0x1055d65",
"parentHash": "0x2b0d3aec64ffdaf54125bd2b8ce38ff8746ce94047de6bad1d7de9cec21e1745",
"receiptsRoot": "0x4e5fd686c8e4c3630b818be3f13f88ee31c932f09bb4f8393bb8f9ebd7f733de",
"sha3Uncles": "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
"size": "0x1112e",
"stateRoot": "0x137765da079742438704e164e6ced030c938161242cd6a54532417e8b5db58f4",
"timestamp": "0x6448cd3f",
"totalDifficulty": "0xc70d815d562d3cfa955",
"transactions": [
"0x63049d0b5ca39822fe1595f59a300b80e11fea4083a0c32ae82f31999add96fa",
"0x78d95bc09e45e50e65eeabf752a3e54aea825ec41edf5cf9867d4211af463300",
"0x8d3a673c67adbc7a7356163fe4cf39bf4e904d70f82913b3f14f2c33ecb4fd89",
...
],
"transactionsRoot": "0x749d382646b7d044ce88d02bedfc3d2010bb55b2bfc69071941a2179267fd48e",
"uncles": [],
"withdrawals": [
{
"index": "0x16ee8d",
"validatorIndex": "0x4aef6",
"address": "0xb9d7934878b5fb9610b3fe8a5e441e8fad7e293f",
"amount": "0xbb8d7c"
},
...
]
}
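Every numeric field in this header is a hex-encoded quantity, which makes the raw output hard to read. As a minimal sketch (the helper names hexToBigInt and hexToText are my own, and treating extraData as UTF-8 text is an assumption that happens to hold for this block), a few of the fields can be decoded in plain Node.js:

```javascript
// Selected header fields copied from the eth_getBlockByNumber output above.
const header = {
  number: "0x1055d65",
  timestamp: "0x6448cd3f",
  gasLimit: "0x1c9c380",
  gasUsed: "0x99253e",
  extraData: "0x7273796e632d6275696c6465722e78797a",
};

// JSON-RPC quantities are 0x-prefixed hex strings, which BigInt parses directly.
const hexToBigInt = (hex) => BigInt(hex);

// extraData is arbitrary bytes; this helper ASSUMES they are UTF-8 text.
const hexToText = (hex) => Buffer.from(hex.slice(2), "hex").toString("utf8");

const decoded = {
  number: hexToBigInt(header.number),
  // The timestamp is in seconds since the Unix epoch; Date expects milliseconds.
  time: new Date(Number(hexToBigInt(header.timestamp)) * 1000).toISOString(),
  gasLimit: hexToBigInt(header.gasLimit),
  gasUsed: hexToBigInt(header.gasUsed),
  extraData: hexToText(header.extraData),
};

console.log(decoded);
```

Decoded this way, 0x1055D65 is block 17128805, produced at 2023-04-26T07:05:35Z, and it used roughly a third of its 30,000,000 gas limit.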
Notice this block has a stateRoot, a hash of the global state table's root node. The state is stored as a trie.
You can think of this table as a mapping from an account (a wallet or smart contract address) to its current state (its balance, nonce, and, if it is a contract address, its storageRoot and codeHash).
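To make the two kinds of state concrete, here is a toy model only. It uses a plain Map instead of a trie, stores contract storage inline rather than behind a storageRoot hash, ignores gas fees, and every address and amount in it is hypothetical. It shows one purchase touching both the global account state and the contract's local storage:

```javascript
// Toy model: account address -> account state. Real clients keep this in a trie
// and expose only its root hash (stateRoot) in the block header.
const worldState = new Map([
  ["0xbuyer", { balance: 5000n, nonce: 7n }],
  ["0xseller", { balance: 100n, nonce: 2n }],
  // A contract account also carries storage; here, tokenId -> owner address.
  ["0xnft", { balance: 0n, nonce: 1n, storage: new Map([[3343n, "0xseller"]]) }],
]);

// Hypothetical purchase, heavily simplified (no gas, no marketplace, no checks).
function buyToken(state, buyer, contract, tokenId, price) {
  const buyerAccount = state.get(buyer);
  const nft = state.get(contract);
  const seller = nft.storage.get(tokenId);

  // Global, account-based state changes.
  buyerAccount.balance -= price;
  buyerAccount.nonce += 1n; // each transaction an account sends bumps its nonce
  state.get(seller).balance += price;

  // Local state change inside the contract's own storage.
  nft.storage.set(tokenId, buyer);
}

buyToken(worldState, "0xbuyer", "0xnft", 3343n, 1200n);
console.log(worldState.get("0xbuyer"), worldState.get("0xnft").storage.get(3343n));
```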
The transactionsRoot is the root hash of another trie, this one built from every transaction included in the block, whether it succeeded or reverted.
When a transaction is executed, its outcomes are recorded in a receipt stored in a third trie, whose root hash is the receiptsRoot: how much gas was used, any events emitted as log entries, etc.
Thus, most high-value data we want to extract lies in the transaction's receipt.
Remember, the transaction represents a completed smart contract call, and the receipt represents an audit trail of what happened during that call's execution.
Let's dump all of this block's transaction receipts via the eth_getBlockReceipts API:
curl -s -X POST -H 'Content-Type: application/json' http://localhost:8545 --data '{"id": 1, "jsonrpc": "2.0", "method": "eth_getBlockReceipts", "params": ["0x1055D65"]}' | jq .result
[
{
"blockHash": "0xec5437bf2ff64483deac4a619b06c32ff01524a77d6c5f10efff2418d39bcd77",
"blockNumber": "0x1055d65",
"contractAddress": null,
"cumulativeGasUsed": "0x234f2",
"effectiveGasPrice": "0x8ae26672e",
"from": "0xae2fc483527b8ef99eb5d9b44875f005ba1fae13",
"gasUsed": "0x234f2",
"logs": [
{
"address": "0x7cb683151a83c2b10a30cbb003cda9996228a2ba",
"topics": [
"0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef",
"0x0000000000000000000000006b75d8af000000e20b7a7ddf000ba900b4009a80",
"0x000000000000000000000000c1cb7f41cc17077eb05e22801ec636709f4fc0ca"
],
"data": "0x0000000000000000000000000000000000000008176d9b000000000000000000",
"blockNumber": "0x1055d65",
"transactionHash": "0x63049d0b5ca39822fe1595f59a300b80e11fea4083a0c32ae82f31999add96fa",
"transactionIndex": "0x0",
"blockHash": "0xec5437bf2ff64483deac4a619b06c32ff01524a77d6c5f10efff2418d39bcd77",
"logIndex": "0x0",
"removed": false
},
...
}
]
For every transaction contained within this block, we get an output of all their log entries, providing our audit trail.
But we are only interested in log entries from the Murakami Flowers contract. Let's filter on the address field to reduce our output substantially:
curl -s -X POST -H 'Content-Type: application/json' http://localhost:8545 --data '{"id": 1, "jsonrpc": "2.0", "method": "eth_getBlockReceipts", "params": ["0x1055D65"]}' | jq '.result[].logs[] | select(.address == "0x7d8820fa92eb1584636f4f5b8515b5476b75171a")'
{
"address": "0x7d8820fa92eb1584636f4f5b8515b5476b75171a",
"topics": [
"0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925",
"0x0000000000000000000000008ae57a027c63fca8070d1bf38622321de8004c67",
"0x0000000000000000000000000000000000000000000000000000000000000000",
"0x0000000000000000000000000000000000000000000000000000000000000d0f"
],
"data": "0x",
"blockNumber": "0x1055d65",
"transactionHash": "0x990c2068f85da8e99b0b782376f37206e6a56bde3dcd9ed3cb10236a93ee5735",
"transactionIndex": "0x71",
"blockHash": "0xec5437bf2ff64483deac4a619b06c32ff01524a77d6c5f10efff2418d39bcd77",
"logIndex": "0xf6",
"removed": false
}
{
"address": "0x7d8820fa92eb1584636f4f5b8515b5476b75171a",
"topics": [
"0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef",
"0x0000000000000000000000008ae57a027c63fca8070d1bf38622321de8004c67",
"0x000000000000000000000000bfabfedaf252e68f0e8ac94197716be416a9dc17",
"0x0000000000000000000000000000000000000000000000000000000000000d0f"
],
"data": "0x",
"blockNumber": "0x1055d65",
"transactionHash": "0x990c2068f85da8e99b0b782376f37206e6a56bde3dcd9ed3cb10236a93ee5735",
"transactionIndex": "0x71",
"blockHash": "0xec5437bf2ff64483deac4a619b06c32ff01524a77d6c5f10efff2418d39bcd77",
"logIndex": "0xf7",
"removed": false
}
We have one transaction involving the Murakami Flowers contract in this block, with the hash 0x990c2.... But during execution, the contract emitted two different events. We know from Part I that topics[0] contains the hash of the event signature. Above we have two different hashes, 0x8c5be... and 0xddf25... respectively, ergo two different events.
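The same filtering and the "two hashes, two events" observation can be reproduced in JavaScript once the receipts are in memory. This is a sketch under assumptions: logsForContract and countBySignature are names I made up, and the receipts array is trimmed down from the eth_getBlockReceipts output to just the fields used here.

```javascript
const MURAKAMI = "0x7d8820fa92eb1584636f4f5b8515b5476b75171a";
const TRANSFER = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";
const APPROVAL = "0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925";

// Trimmed stand-in for the eth_getBlockReceipts result (address and topics[0] only).
const receipts = [
  { logs: [{ address: "0x7cb683151a83c2b10a30cbb003cda9996228a2ba", topics: [TRANSFER] }] },
  { logs: [
    { address: MURAKAMI, topics: [APPROVAL] },
    { address: MURAKAMI, topics: [TRANSFER] },
  ] },
];

// Flatten every receipt's logs and keep those emitted by one contract.
function logsForContract(allReceipts, contractAddress) {
  const wanted = contractAddress.toLowerCase();
  return allReceipts
    .flatMap((receipt) => receipt.logs)
    .filter((log) => log.address.toLowerCase() === wanted);
}

// Count log entries per event signature hash (topics[0]).
function countBySignature(logs) {
  const counts = new Map();
  for (const log of logs) {
    const signatureHash = log.topics[0];
    counts.set(signatureHash, (counts.get(signatureHash) ?? 0) + 1);
  }
  return counts;
}

const murakamiLogs = logsForContract(receipts, MURAKAMI);
console.log(countBySignature(murakamiLogs));
```

Lower-casing both addresses before comparing avoids mismatches between the checksummed form used in scripts and the lower-case form nodes return.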
Now we can see what the call to eth_getLogs essentially does to filter events of interest: conceptually, it looks at all of the transaction receipts in each block between fromBlock and toBlock, matches on the list of addresses, and then applies topic filtering to the log entries that remain.
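That matching logic can be sketched as a single predicate. This is an illustration of the filter semantics only, not how a node implements it (real clients rely on indexes such as the logsBloom field rather than scanning every receipt), and matchesFilter is a hypothetical name. It assumes the usual JSON-RPC conventions: address may be one value or a list, and each topics position is either null (wildcard), one hash, or a list of alternatives.

```javascript
function matchesFilter(log, filter) {
  const { fromBlock, toBlock, address, topics = [] } = filter;

  // 1. Block range check (both bounds inclusive, hex quantities).
  const block = BigInt(log.blockNumber);
  if (block < BigInt(fromBlock) || block > BigInt(toBlock)) return false;

  // 2. Address check: a single address or a list of candidates.
  if (address !== undefined) {
    const addresses = [].concat(address).map((a) => a.toLowerCase());
    if (!addresses.includes(log.address.toLowerCase())) return false;
  }

  // 3. Positional topic check: null is a wildcard, an array means "any of these".
  return topics.every((wanted, position) => {
    if (wanted === null) return true;
    return [].concat(wanted).includes(log.topics[position]);
  });
}

const TRANSFER = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";
const APPROVAL = "0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925";
const MURAKAMI = "0x7d8820fa92eb1584636f4f5b8515b5476b75171a";

// The two Murakami Flowers log entries from above, trimmed to the filtered fields.
const approvalLog = { address: MURAKAMI, blockNumber: "0x1055d65", topics: [APPROVAL] };
const transferLog = { address: MURAKAMI, blockNumber: "0x1055d65", topics: [TRANSFER] };

const transfersOnly = {
  fromBlock: "0x1055d65",
  toBlock: "0x1055d65",
  address: MURAKAMI,
  topics: [TRANSFER],
};

console.log(matchesFilter(transferLog, transfersOnly), matchesFilter(approvalLog, transfersOnly));
```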
Apple of My ABI
We should now understand how a transaction causes a smart contract to execute code that emits events and how those events are stored on-chain.
We also know how to filter for them via the eth_getLogs and eth_getBlockReceipts JSON-RPC calls.
Note: There is also eth_getTransactionReceipt, which returns a single receipt given a transaction hash.
But the log entry results are still ABI-encoded hex, so the onus is on us to decode them and determine that, say, the hash in topics[0] is, in fact, a Transfer event.
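To see what that manual decoding involves, here is a hand-rolled sketch with no libraries. It hard-codes the two signature hashes seen above instead of computing them, takes the parameter names from the ERC-721 Transfer and Approval events, and only handles logs where all three parameters are indexed (four topics). The names KNOWN_EVENTS and decodeLog are hypothetical.

```javascript
// Signature hash -> event name and parameter names (hard-coded, not computed).
const KNOWN_EVENTS = {
  "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef":
    { name: "Transfer", params: ["from", "to", "tokenId"] },
  "0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925":
    { name: "Approval", params: ["owner", "approved", "tokenId"] },
};

// An indexed address is left-padded to 32 bytes; the last 20 bytes are the address.
// The result is lower-case: producing the checksummed form needs a keccak-256 hash.
const topicToAddress = (topic) => "0x" + topic.slice(-40);

// An indexed uint256 is simply the 32-byte big-endian number.
const topicToUint = (topic) => BigInt(topic);

function decodeLog(log) {
  const event = KNOWN_EVENTS[log.topics[0]];
  // Unknown event, or a layout this sketch does not handle (e.g. non-indexed params).
  if (!event || log.topics.length !== 4) return null;
  const [first, second, third] = event.params;
  return {
    name: event.name,
    [first]: topicToAddress(log.topics[1]),
    [second]: topicToAddress(log.topics[2]),
    [third]: topicToUint(log.topics[3]),
  };
}

// The Transfer log entry from block 0x1055D65, trimmed to its topics.
const transferLog = {
  topics: [
    "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef",
    "0x0000000000000000000000008ae57a027c63fca8070d1bf38622321de8004c67",
    "0x000000000000000000000000bfabfedaf252e68f0e8ac94197716be416a9dc17",
    "0x0000000000000000000000000000000000000000000000000000000000000d0f",
  ],
};

console.log(decodeLog(transferLog));
```

This approach breaks down quickly: every new event needs another hand-written table entry, and non-indexed parameters packed into the data field need real ABI decoding.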
What if we could decode these events programmatically?
The good news is that we can, by leveraging popular client-side libraries like Ethers.js and web3.py, which use the contract's JSON ABI document as their Rosetta Stone.
Here is a simple JavaScript program to do it:
// Assumes ethers v5 (for ethers.utils.Interface) and node-fetch v2 (for require())
const ethers = require("ethers");
const fetch = require("node-fetch");
// Murakami Flowers NFT contract address
const contractAddr = "0x7D8820FA92EB1584636f4F5b8515B5476B75171a";
// Etherscan endpoint that returns the contract's ABI as raw JSON
const etherscanAPI = `https://api.etherscan.io/api?module=contract&action=getabi&address=${contractAddr}&format=raw`;
// The Murakami Flowers log entries from the
// transaction receipts of block 0x1055D65
const events = JSON.parse(`
[
{
"address": "0x7d8820fa92eb1584636f4f5b8515b5476b75171a",
"topics": [
"0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925",
"0x0000000000000000000000008ae57a027c63fca8070d1bf38622321de8004c67",
"0x0000000000000000000000000000000000000000000000000000000000000000",
"0x0000000000000000000000000000000000000000000000000000000000000d0f"
],
"data": "0x",
"blockNumber": "0x1055d65",
"transactionHash": "0x990c2068f85da8e99b0b782376f37206e6a56bde3dcd9ed3cb10236a93ee5735",
"transactionIndex": "0x71",
"blockHash": "0xec5437bf2ff64483deac4a619b06c32ff01524a77d6c5f10efff2418d39bcd77",
"logIndex": "0xf6",
"removed": false
},
{
"address": "0x7d8820fa92eb1584636f4f5b8515b5476b75171a",
"topics": [
"0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef",
"0x0000000000000000000000008ae57a027c63fca8070d1bf38622321de8004c67",
"0x000000000000000000000000bfabfedaf252e68f0e8ac94197716be416a9dc17",
"0x0000000000000000000000000000000000000000000000000000000000000d0f"
],
"data": "0x",
"blockNumber": "0x1055d65",
"transactionHash": "0x990c2068f85da8e99b0b782376f37206e6a56bde3dcd9ed3cb10236a93ee5735",
"transactionIndex": "0x71",
"blockHash": "0xec5437bf2ff64483deac4a619b06c32ff01524a77d6c5f10efff2418d39bcd77",
"logIndex": "0xf7",
"removed": false
}
]`);
async function main() {
  // Fetch the Murakami Flowers contract ABI document
  const resp = await fetch(etherscanAPI);
  // Parse JSON
  const abi = await resp.json();
  // Instantiate an interface based on that document
  const iface = new ethers.utils.Interface(abi);
  // Decode each event and print its LogDescription
  events.forEach((evt) => {
    console.log(iface.parseLog(evt));
  });
}
// Report fetch or parse failures instead of leaving an unhandled rejection
main().catch(console.error);
The first few lines import various libraries and define a few globals. This script uses the venerable Ethers.js library.
In our main() function, the first order of business is to download the Murakami Flowers NFT contract's ABI directly from Etherscan.io.
Note: At the time of writing, you don't need an Etherscan developer token to do so, but you can incur rate-limiting errors depending on how many times you execute this script in succession.
After the download is complete, we parse the JSON output into an ABI document. We then instantiate an ethers.utils.Interface object using the ABI JSON.
Finally, for each event JSON fragment, we use parseLog() to translate it.
When you run this script, you should see the following output:
LogDescription {
eventFragment: {
name: 'Approval',
anonymous: false,
inputs: [ [ParamType], [ParamType], [ParamType] ],
type: 'event',
_isFragment: true,
constructor: [Function: EventFragment] {
from: [Function (anonymous)],
fromObject: [Function (anonymous)],
fromString: [Function (anonymous)],
isEventFragment: [Function (anonymous)]
},
format: [Function (anonymous)]
},
name: 'Approval',
signature: 'Approval(address,address,uint256)',
topic: '0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925',
args: [
'0x8aE57A027c63fcA8070D1Bf38622321dE8004c67',
'0x0000000000000000000000000000000000000000',
BigNumber { _hex: '0x0d0f', _isBigNumber: true },
owner: '0x8aE57A027c63fcA8070D1Bf38622321dE8004c67',
approved: '0x0000000000000000000000000000000000000000',
tokenId: BigNumber { _hex: '0x0d0f', _isBigNumber: true }
]
}
LogDescription {
eventFragment: {
name: 'Transfer',
anonymous: false,
inputs: [ [ParamType], [ParamType], [ParamType] ],
type: 'event',
_isFragment: true,
constructor: [Function: EventFragment] {
from: [Function (anonymous)],
fromObject: [Function (anonymous)],
fromString: [Function (anonymous)],
isEventFragment: [Function (anonymous)]
},
format: [Function (anonymous)]
},
name: 'Transfer',
signature: 'Transfer(address,address,uint256)',
topic: '0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef',
args: [
'0x8aE57A027c63fcA8070D1Bf38622321dE8004c67',
'0xBFAbFEDAf252E68f0E8AC94197716bE416A9dc17',
BigNumber { _hex: '0x0d0f', _isBigNumber: true },
from: '0x8aE57A027c63fcA8070D1Bf38622321dE8004c67',
to: '0xBFAbFEDAf252E68f0E8AC94197716bE416A9dc17',
tokenId: BigNumber { _hex: '0x0d0f', _isBigNumber: true }
]
}
Each event is translated into a LogDescription object that has the event name as well as a signature.
Notice that topic is actually topics[0] from our original log entry and is again the keccak-256 hash of the event's signature. The args array holds the decoded values of topics[1] through topics[3] twice: the first three elements are accessible by position, and the following three are the same values keyed by parameter name.
We now know that Approval takes an owner, approved, and tokenId, while the Transfer event takes from, to, and tokenId, respectively.
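For analytics, those named arguments are usually more convenient as a flat record. The sketch below is one possible way to do that, assuming the ethers v5 LogDescription shape shown in the output above. The toRecord helper is a hypothetical name, and the sample object is a hand-built stand-in (with a BigInt where ethers would supply a BigNumber) so the snippet runs without any dependencies.

```javascript
// Flatten a LogDescription-like object into a plain row keyed by parameter name.
function toRecord(description) {
  const record = { event: description.name };
  for (const input of description.eventFragment.inputs) {
    // String() renders an ethers v5 BigNumber (or a BigInt) as decimal text
    // and leaves address strings unchanged.
    record[input.name] = String(description.args[input.name]);
  }
  return record;
}

// Hand-built stand-in for the Transfer LogDescription printed above.
const args = [
  "0x8aE57A027c63fcA8070D1Bf38622321dE8004c67",
  "0xBFAbFEDAf252E68f0E8AC94197716bE416A9dc17",
  3343n,
];
// Mirror the positional values under their parameter names, as ethers does.
Object.assign(args, { from: args[0], to: args[1], tokenId: args[2] });

const sample = {
  name: "Transfer",
  eventFragment: { inputs: [{ name: "from" }, { name: "to" }, { name: "tokenId" }] },
  args,
};

console.log(toRecord(sample));
```

Reading the names from eventFragment.inputs rather than hard-coding them means the same helper should work for both Approval and Transfer.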
TL;DR
- A transaction calls a smart contract function.
- The smart contract function executes and may change local and global states.
- When a transaction completes, a transaction receipt is generated.
- The transaction receipt contains a log of all events that resulted from that transaction.
- The log entries are encoded in binary (for space reasons mainly).
- We can translate the binary back into human-readable form for analytics using the smart contract's ABI JSON file.
Blockchain data can be intimidating for the uninitiated, but hopefully, this two-part series gave you a gentle but thorough introduction to how to get started working with it.