Version: Old ArrowSquid docs

EVM Processor

This section applies to squids indexing EVM chains. See the supported networks page for a full list.

Overview and the data model

A squid processor is a Node.js process that fetches historical on-chain data from an Archive and/or a chain node RPC endpoint, performs arbitrary transformations and saves the result. EvmBatchProcessor is the central class that handles EVM data extraction, transformation and persistence. By convention, the processor entry point is src/main.ts; the processor is started there by calling its run() method. A single batch handler function supplied to that method transforms data from multiple blocks in a single in-memory batch.
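To make the batch handler idea concrete, here is a minimal sketch of a handler that walks every block of a batch, transforms its logs in memory and persists the result once. The types Log, Block, BatchContext and DecodedEvent, as well as the saved array, are simplified stand-ins invented for this illustration; they are not the types exported by the processor package, which carry many more fields.

```typescript
// Simplified, hypothetical stand-ins for the batch context types.
interface Log { address: string; topics: string[]; data: string }
interface Block { height: number; logs: Log[] }
interface BatchContext { blocks: Block[] }

// Shape of the transformed record (an assumption made for this sketch).
interface DecodedEvent { block: number; contract: string; topic0: string }

// Hypothetical in-memory sink standing in for a database or file store.
const saved: DecodedEvent[] = []

// One handler call processes all blocks of the batch, then saves once.
function handleBatch(ctx: BatchContext): void {
  const events: DecodedEvent[] = []
  for (const block of ctx.blocks) {
    for (const log of block.logs) {
      const topic0 = log.topics[0]
      if (topic0 === undefined) continue // skip logs without topics
      events.push({ block: block.height, contract: log.address, topic0 })
    }
  }
  // A single write per batch instead of one write per block or per log.
  saved.push(...events)
}
```

Accumulating records in memory and writing them once per batch keeps the number of storage round trips proportional to the number of batches rather than to the number of blocks.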

A batch provides iterables to access all items requested in processor configuration, which may include logs, transactions, traces and contract state diffs; see the batch context and block data pages for details. Further, the processor can extract additional data by querying the historical chain state and indeed any external API.

Results of the ETL process can be stored in any Postgres-compatible database or in filesystem-based datasets in CSV and Parquet formats.

RPC ingestion

Starting with the ArrowSquid release, the processor can ingest data either from an Archive or directly from an RPC endpoint. If both an Archive and an RPC endpoint are provided, the processor will use the Archive until it reaches the highest block available there, then index the few remaining blocks using the RPC endpoint. This allows squids to combine low sync times with near real-time chain data access. It is, however, possible to use either just the Archive (e.g. for analytics) or just the RPC endpoint (e.g. for local development).
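The hand-off between the two sources can be pictured with a small model. The sketch below is not the processor's actual implementation; planIngestion, RangePlan and their parameters are hypothetical names used only to show how a block range might be split between an Archive and an RPC endpoint under the rules described above.

```typescript
// Hypothetical model of splitting a block range between data sources.
type Source = 'archive' | 'rpc'

interface RangePlan {
  source: Source
  from: number
  to: number // inclusive
}

function planIngestion(
  from: number,
  chainHead: number,
  archiveHeight?: number, // undefined when no Archive is configured
  hasRpc: boolean = true  // false when no RPC endpoint is configured
): RangePlan[] {
  const plan: RangePlan[] = []
  let next = from
  // Use the Archive up to the highest block it has.
  if (archiveHeight !== undefined && archiveHeight >= next) {
    const to = Math.min(archiveHeight, chainHead)
    plan.push({ source: 'archive', from: next, to })
    next = to + 1
  }
  // Index whatever remains from the RPC endpoint, if there is one.
  if (hasRpc && next <= chainHead) {
    plan.push({ source: 'rpc', from: next, to: chainHead })
  }
  return plan
}
```

For instance, with the chain head at block 1000 and an Archive that has reached block 990, the model assigns blocks 0 to 990 to the Archive and blocks 991 to 1000 to the RPC endpoint. With only one source configured, that source covers whatever part of the range it can serve.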

RPC ingestion can create a heavy load on node endpoints. When an Archive is also used, the load is typically short-lived and the total number of requests is low, but their frequency may still be sufficient to trigger HTTP 429 responses. Use private endpoints and rate-limit your requests with the rateLimit chain source option.
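As a rough illustration of what a rate-limited chain source might look like, the sketch below declares local stand-in types rather than importing the processor package. Only the rateLimit option name comes from the text above; the ChainSource and DataSource interfaces, the url and archive fields, the placeholder URLs, and the reading of rateLimit as requests per second are all assumptions to be checked against the processor reference.

```typescript
// Local stand-ins; field names other than `rateLimit` are assumptions.
interface ChainSource {
  url: string
  rateLimit?: number // assumed unit: requests per second
}

interface DataSource {
  archive?: string
  chain?: ChainSource
}

// Placeholder endpoints, not real services.
const dataSource: DataSource = {
  archive: 'https://archive.example.invalid/eth-mainnet',
  chain: {
    url: 'https://private-rpc.example.invalid',
    rateLimit: 10
  }
}

// Smallest average gap between requests implied by a per-second limit.
function minIntervalMs(rateLimit: number): number {
  if (rateLimit <= 0) throw new Error('rateLimit must be positive')
  return 1000 / rateLimit
}
```

Under the requests-per-second assumption, a limit of 10 spaces requests at least 100 ms apart on average, which can be compared against the quota of the endpoint provider when choosing a value.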

What's next?