Substrate Processor
This section applies to squid processors indexing Substrate-based chains, including:
- Polkadot
- Kusama
- Acala
See Supported networks for a full list.
If you are building on one of the networks implementing EVM on Substrate, such as
- Astar
- Moonbeam
- Moonriver
and only require EVM data, consider using the EVM processor.
Overview and the data model
A squid processor is a Node.js process that fetches historical on-chain data from an Archive and/or a chain node RPC endpoint, performs arbitrary transformations and saves the result. SubstrateBatchProcessor is the central class that handles Substrate data extraction, transformation and persistence. By convention, the processor entry point is src/main.ts; it is started by calling SubstrateBatchProcessor.run() there. A single batch handler function supplied to that method is responsible for transforming data from multiple blocks in a single in-memory batch.
A batch provides iterables to access all items requested in processor configuration, which may include
- Events, corresponding to matching Substrate runtime events.
- Calls, corresponding to matching calls executed by the Substrate runtime.
See the batch context and block data pages for details.
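To make the batch idea concrete, here is a minimal sketch of a handler-style function that walks every block in a batch and aggregates its events and calls in memory. The BlockData, EventItem and CallItem shapes and the summarizeBatch function are hypothetical simplifications written for this illustration; they are not the SDK's actual types, which carry more fields and are described on the batch context and block data pages.

```typescript
// Hypothetical, simplified shapes for illustration only; the SDK's real
// block, event and call types carry more fields.
interface EventItem {
  name: string // e.g. 'Balances.Transfer'
}

interface CallItem {
  name: string // e.g. 'Balances.transfer'
  success: boolean
}

interface BlockData {
  height: number
  events: EventItem[]
  calls: CallItem[]
}

interface BatchSummary {
  blocks: number
  eventCounts: Record<string, number>
  failedCalls: number
}

// A batch handler sees many blocks at once, so it can accumulate results
// in memory and persist them in one step after the loop.
function summarizeBatch(blocks: BlockData[]): BatchSummary {
  const eventCounts: Record<string, number> = {}
  let failedCalls = 0
  for (const block of blocks) {
    for (const event of block.events) {
      eventCounts[event.name] = (eventCounts[event.name] ?? 0) + 1
    }
    for (const call of block.calls) {
      if (!call.success) failedCalls += 1
    }
  }
  return {blocks: blocks.length, eventCounts, failedCalls}
}

// Sample data made up for the example.
const sampleBatch: BlockData[] = [
  {
    height: 1,
    events: [{name: 'Balances.Transfer'}, {name: 'Balances.Transfer'}],
    calls: [{name: 'Balances.transfer', success: true}],
  },
  {
    height: 2,
    events: [{name: 'System.Remarked'}],
    calls: [{name: 'System.remark', success: false}],
  },
]

console.log(summarizeBatch(sampleBatch))
```

Processing a whole batch in one pass like this is what lets a handler replace many small per-item writes with a single bulk write at the end.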
Additional support is available for log items produced by the Frontier EVM pallet (see EVM support), the Contracts pallet (see ink! support) and the Gear Messages pallet. Further, the processor can extract additional data by querying the historical runtime state or any external API.
Results of the ETL process can be stored in any Postgres-compatible database or in filesystem-based datasets in CSV and Parquet formats.
RPC ingestion
Starting with the ArrowSquid release, the processor can ingest data either from an Archive or directly from an RPC endpoint. If both an Archive and an RPC endpoint are provided, the processor will use the Archive until it reaches the highest block available there, then index the remaining blocks using the RPC endpoint. This allows squids to combine low sync times with near real-time chain data access. It is, however, possible to use just the RPC endpoint.
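The handover between the two sources can be pictured as splitting the requested block range into an Archive segment followed by an RPC segment. The sketch below is a conceptual model only: planIngestion, Segment and the height arguments are hypothetical names introduced here, not part of the SDK, and the real processor discovers these heights at run time rather than receiving them as arguments.

```typescript
// Conceptual model of the Archive-then-RPC handover; all names are hypothetical.
type Source = 'archive' | 'rpc'

interface Segment {
  source: Source
  from: number // first block of the segment, inclusive
  to: number // last block of the segment, inclusive
}

// Split [from, chainHead] between the Archive and the RPC endpoint.
// archiveHead is the highest block the Archive has; leave it undefined
// to model an RPC-only setup.
function planIngestion(from: number, chainHead: number, archiveHead?: number): Segment[] {
  const segments: Segment[] = []
  let next = from
  if (archiveHead !== undefined && archiveHead >= next) {
    // Use the Archive for as long as it has data.
    const to = Math.min(archiveHead, chainHead)
    segments.push({source: 'archive', from: next, to})
    next = to + 1
  }
  if (next <= chainHead) {
    // Index the remaining blocks over RPC.
    segments.push({source: 'rpc', from: next, to: chainHead})
  }
  return segments
}

// Archive covers blocks up to 90, chain head is at 100.
console.log(planIngestion(0, 100, 90))
// No Archive configured: everything comes from RPC.
console.log(planIngestion(0, 100))
```

In this model the Archive contributes the bulk of the historical range, which is where the low sync times come from, while the RPC segment covers the most recent blocks.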
What's next?
- Move forward to the SubstrateBatchProcessor configuration page
- Follow the tutorial to build a squid indexing the Crust parachain step by step
- Take a look at the examples