
EVM Processor

This section applies to squids indexing EVM chains. See the supported networks page for a full list.

Overview and the data model

A squid processor is a Node.js process that fetches historical on-chain data from an Archive and/or a chain node RPC endpoint, performs arbitrary transformations and saves the result. EvmBatchProcessor is the central class that handles EVM data extraction, transformation and persistence. By convention, the processor entry point is src/main.ts; the processor is started there by calling the run() method of an EvmBatchProcessor instance. A single batch handler function supplied to that call is responsible for transforming the data of multiple blocks within one in-memory batch.
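A minimal entry point might look as follows. This is a sketch, not a complete squid: the gateway URL, RPC endpoint and the USDC contract address are illustrative, and the batch handler does nothing with the data yet.

```ts
// src/main.ts
import {EvmBatchProcessor} from '@subsquid/evm-processor'
import {TypeormDatabase} from '@subsquid/typeorm-store'

const processor = new EvmBatchProcessor()
  // illustrative data sources; replace with your network's
  .setGateway('https://v2.archive.subsquid.io/network/ethereum-mainnet')
  .setRpcEndpoint('https://rpc.ankr.com/eth')
  .setFinalityConfirmation(75)
  .addLog({
    // illustrative request: Transfer(address,address,uint256) logs of the USDC contract
    address: ['0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48'],
    topic0: ['0xddf252ad1be2c89b69c2b068fc378daa952ba7f15c454a2df28e95a7d5f04de5']
  })

processor.run(new TypeormDatabase(), async (ctx) => {
  // the batch handler: called once per in-memory batch of blocks
  // transform and persist the requested data here
})
```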

A batch provides iterables for accessing all the items requested in the processor configuration, which may include logs, transactions, traces and contract state diffs; see the batch context and block data pages for details. The processor can also fetch additional data by querying the historical chain state or, indeed, any external API.
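Continuing the sketch above, the batch handler might iterate the requested items like this (field names follow the ArrowSquid batch context; the decoding and persistence logic is omitted):

```ts
processor.run(new TypeormDatabase(), async (ctx) => {
  for (let block of ctx.blocks) {
    for (let log of block.logs) {
      // logs requested via addLog() arrive here
    }
    for (let txn of block.transactions) {
      // transactions requested via addTransaction() arrive here
    }
    for (let trace of block.traces) {
      // traces requested via addTrace() arrive here
    }
    for (let diff of block.stateDiffs) {
      // state diffs requested via addStateDiff() arrive here
    }
  }
})
```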

Results of the ETL process can be stored in any Postgres-compatible database or in filesystem-based datasets in CSV and Parquet formats.
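Switching the storage target amounts to passing a different Database implementation to run(). Here is a sketch of a CSV dataset target, assuming the @subsquid/file-store and @subsquid/file-store-csv packages; the table schema, file name and the ./data destination are illustrative:

```ts
import {Database, LocalDest} from '@subsquid/file-store'
import {Column, Table, Types} from '@subsquid/file-store-csv'

const db = new Database({
  tables: {
    // an illustrative single-table dataset
    transfers: new Table('transfers.csv', {
      block: Column(Types.Numeric()),
      from: Column(Types.String()),
      to: Column(Types.String()),
      value: Column(Types.String())
    })
  },
  dest: new LocalDest('./data')
})

processor.run(db, async (ctx) => {
  for (let block of ctx.blocks) {
    for (let log of block.logs) {
      // decode the log, then append rows with ctx.store.transfers.write({...})
    }
  }
})
```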

RPC ingestion

Starting with the ArrowSquid release, the processor can ingest data either from an Archive or directly from an RPC endpoint. If both an Archive and an RPC endpoint are provided, the processor will use the Archive until it reaches the highest block available there, then index the few remaining blocks using the RPC endpoint. This allows squids to combine low sync times with near real-time chain data access. It is, however, possible to use either just the Archive (e.g. for analytics) or just the RPC endpoint (e.g. for local development).
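In configuration terms, the choice of data sources reduces to which of the two setters are called on the processor. A sketch with illustrative endpoints:

```ts
import {EvmBatchProcessor} from '@subsquid/evm-processor'

// Archive + RPC: fast historical sync, then near real-time follow-up
const processor = new EvmBatchProcessor()
  .setGateway('https://v2.archive.subsquid.io/network/ethereum-mainnet')
  .setRpcEndpoint('https://rpc.ankr.com/eth')
  .setFinalityConfirmation(75)

// Archive only (e.g. for analytics): omit the RPC endpoint
// const processor = new EvmBatchProcessor()
//   .setGateway('https://v2.archive.subsquid.io/network/ethereum-mainnet')

// RPC only (e.g. for local development): omit the gateway
// const processor = new EvmBatchProcessor()
//   .setRpcEndpoint('http://localhost:8545')
//   .setFinalityConfirmation(75)
```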

What's next?