Version: Firesquid

Overview

Design

The Subsquid services stack separates data ingestion (Archives) from data transformation and presentation (squids). Archives ingest and store raw blockchain data in a normalized way. They can be thought of as specialized data lakes optimized for storing and filtering large volumes of raw on-chain data.

Squid projects (or simply squids) are Extract-Transform-Load-Query (ETLQ) projects that ingest historical on-chain data from Archives and transform it according to user-defined data mappers. Squids are typically configured to present this data as a GraphQL API. Squids are built using the open-source Squid SDK.

The separation of the data extraction layer (Archives) from the data transformation and presentation layers (squids) makes squids lightweight while achieving indexing speeds of up to 50,000 blocks per second. Because the on-chain data is consumed from Archives, there is no need to set up high-throughput node infrastructure. Squids can be run locally, on-premises, or deployed to the Aquarium hosted service.
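The division of labour described above can be illustrated with a toy, in-memory model. This is only a sketch: `RawLog`, `ToyArchive`, `Transfer`, `mapLog` and `runSquid` are hypothetical names invented for this example and are not part of the Archive API or the Squid SDK. The "archive" side only stores and filters raw data, while the "squid" side only transforms what it receives.

```typescript
// Toy model of the Archive/squid split. All names are hypothetical and
// do not correspond to the real Archive or Squid SDK APIs.

interface RawLog {
  block: number
  address: string
  data: string // raw payload; here simply a decimal string
}

// "Archive" side: holds raw logs and answers filtered range queries in batches.
class ToyArchive {
  private logs: RawLog[]

  constructor(logs: RawLog[]) {
    this.logs = logs
  }

  // Logs emitted by `address` in blocks [from, to], grouped into
  // batches that each span at most `batchSize` blocks.
  query(address: string, from: number, to: number, batchSize: number): RawLog[][] {
    if (batchSize < 1) throw new Error('batchSize must be at least 1')
    const batches: RawLog[][] = []
    for (let start = from; start <= to; start += batchSize) {
      const end = Math.min(start + batchSize - 1, to)
      const batch = this.logs.filter(
        l => l.address === address && l.block >= start && l.block <= end
      )
      if (batch.length > 0) batches.push(batch)
    }
    return batches
  }
}

// "Squid" side: a user-defined mapper turns raw logs into entities.
interface Transfer {
  id: string
  block: number
  amount: number
}

function mapLog(log: RawLog, indexInBatch: number): Transfer {
  return {id: `${log.block}-${indexInBatch}`, block: log.block, amount: Number(log.data)}
}

function runSquid(archive: ToyArchive, address: string, from: number, to: number): Transfer[] {
  const out: Transfer[] = []
  for (const batch of archive.query(address, from, to, 2)) {
    batch.forEach((log, i) => out.push(mapLog(log, i)))
  }
  return out
}

const archive = new ToyArchive([
  {block: 1, address: '0xaaa', data: '10'},
  {block: 2, address: '0xbbb', data: '99'},
  {block: 3, address: '0xaaa', data: '25'},
  {block: 5, address: '0xaaa', data: '7'},
])

// Only the logs of 0xaaa in blocks 1..4 reach the mapper.
const transfers = runSquid(archive, '0xaaa', 1, 4)
```

Because filtering happens on the archive side, the squid never sees the log from `0xbbb` or the one in block 5. This is the property that lets a squid stay lightweight.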

Archives

See the Archives section for more information on how to use public Archives or to learn how to run an Archive locally (local Archives are currently supported only for Substrate chains).

At the moment, Subsquid maintains Archives for the following networks:

  • Major EVM chains, including Ethereum, Polygon, Binance Smart Chain, Fantom, Arbitrum. See the full list here.
  • Substrate chains, including parachains on Kusama and Polkadot. Additionally, Substrate Archives support EVM and Ink! smart contracts deployed to Moonbeam/Moonriver, Acala, Astar/Shiden, Gear, Aleph Zero.

Squids

Squids follow a defined project structure and are developed as regular Node.js packages. Use the sqd init command to scaffold a new squid project from a suitable template.

Normally, a squid project consists of a long-running processor service that fetches and transforms data from an Archive, and an api service that exposes the transformed data through a GraphQL API generated from schema.graphql.

The processor service is defined in src/processor.ts by default. The target data sink for the processor can be a Postgres-compatible database, S3 buckets, BigQuery or a custom store. The api service is an independent Node.js process and is optional.
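The idea of a processor writing to an interchangeable data sink can be sketched as follows. This is a simplified illustration, not the SDK's store interface: `Store`, `InMemoryStore`, `BlockStat` and `runProcessor` are hypothetical names, and the real processor fetches its batches from an Archive rather than receiving them as an array.

```typescript
// Hypothetical sketch of a processor with a pluggable data sink.
// None of these names come from the Squid SDK.

interface Entity {
  id: string
}

// Minimal contract a data sink has to fulfil in this sketch.
interface Store<E extends Entity> {
  // Insert new entities or overwrite existing ones with the same id.
  upsert(entities: E[]): void
}

// A trivial sink that keeps everything in memory, keyed by entity id.
class InMemoryStore<E extends Entity> implements Store<E> {
  readonly rows: Record<string, E> = {}

  upsert(entities: E[]): void {
    for (const e of entities) {
      this.rows[e.id] = e
    }
  }
}

interface Block {
  height: number
  txCount: number
}

interface BlockStat extends Entity {
  txCount: number
}

// Transform each batch of blocks and issue one write per batch.
function runProcessor(batches: Block[][], store: Store<BlockStat>): void {
  for (const batch of batches) {
    const stats: BlockStat[] = batch.map(b => ({id: String(b.height), txCount: b.txCount}))
    store.upsert(stats)
  }
}

const store = new InMemoryStore<BlockStat>()
runProcessor(
  [
    [{height: 1, txCount: 3}, {height: 2, txCount: 0}],
    [{height: 2, txCount: 5}], // block 2 seen again: the later value wins
  ],
  store
)
```

Since `runProcessor` depends only on the `Store` interface, swapping the in-memory sink for a database-backed or file-backed one would not require changing the transformation logic.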

The deployment of the squid services to the Aquarium (see below) is managed by the squid.yaml manifest.

The open-source Squid SDK offers an extensive set of tools for developing squids:

  • Core classes for the processor service: EvmBatchProcessor for EVM chains and SubstrateBatchProcessor for Substrate-based chains.
  • evm-typegen, substrate-typegen and ink-typegen tools for generating TypeScript facade classes for type-safe decoding of EVM, Substrate and Ink! smart contract data.
  • typeorm-codegen generates entity classes from a declarative schema file defined by the squid developer. The entity classes define the schema of the target database and the GraphQL API.
  • graphql-server is the backend for the GraphQL API served by the api service. The GraphQL schema is auto-generated from schema.graphql. The resulting API loosely follows the OpenCRUD standard and supports the most common query filters and selectors out of the box. See the GraphQL API section for more details and the configuration options.
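To give a feel for OpenCRUD-style query filters, the sketch below evaluates suffix-based `where` conditions such as `amount_gt` against in-memory rows. It is an illustration of the filtering convention only: `Row`, `Where` and `matches` are hypothetical names, the set of supported suffixes is an assumption made for this example, and the real graphql-server translates filters into database queries rather than filtering in memory.

```typescript
// Illustrative evaluator for suffix-style `where` filters, e.g. {amount_gt: 10}.
// Not the graphql-server implementation; the supported suffixes are assumed.

type Value = string | number
type Row = Record<string, Value>
type Where = Record<string, Value>

// A row matches when every condition in `where` holds (logical AND).
// The operator is the part of the key after the last underscore, so in this
// sketch field names containing underscores must carry an explicit suffix.
function matches(row: Row, where: Where): boolean {
  return Object.keys(where).every(key => {
    const expected = where[key]
    const sep = key.lastIndexOf('_')
    const field = sep < 0 ? key : key.slice(0, sep)
    const op = sep < 0 ? 'eq' : key.slice(sep + 1)
    const actual = row[field]
    switch (op) {
      case 'eq':
        return actual === expected
      case 'gt':
        return typeof actual === 'number' && typeof expected === 'number' && actual > expected
      case 'lt':
        return typeof actual === 'number' && typeof expected === 'number' && actual < expected
      case 'contains':
        return String(actual).indexOf(String(expected)) >= 0
      default:
        throw new Error(`unsupported filter: ${key}`)
    }
  })
}

const rows: Row[] = [
  {id: '1', from: '0xaaa', amount: 10},
  {id: '2', from: '0xabc', amount: 25},
  {id: '3', from: '0xbbb', amount: 40},
]

// Rows with amount > 10 whose sender contains "0xa".
const selected = rows.filter(r => matches(r, {amount_gt: 10, from_contains: '0xa'}))
```

Encoding the operator in the field name keeps the filter a flat object, which is what makes this style convenient to express as GraphQL input types.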

The Aquarium

Squids can be deployed to the Subsquid cloud service, called the Aquarium, free of charge. Go to the Deploy Squid section for more information.

What's next?