The Subsquid services stack separates data ingestion (Archives) from data transformation and presentation (squids). Archives ingest and store raw blockchain data in a normalized way. Archives can be thought of as specialized data lakes optimized for storing and filtering large volumes of raw on-chain data.
Squid projects (or simply squids) are Extract-Transform-Load-Query (ETLQ) projects that ingest historical on-chain data from Archives, transforming it according to user-defined data mappers. Squids are typically configured to present this data as a GraphQL API. Squids are built using the open-source Squid SDK.
The separation of the data extraction layer (Archives) from the data transformation and presentation layer (squids) makes squids lightweight, while achieving indexing speeds of up to 50000 blocks per second. Indeed, since the on-chain data is consumed from Archives, there is no need to set up high-throughput node infrastructure. Squids can be run locally, on-premises, or deployed to the Aquarium hosted service.
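To illustrate why this model is fast, here is a conceptual sketch in plain TypeScript (this is not Squid SDK code; `archiveBatches` and `countBlocks` are invented names): the archive side serves pre-filtered blocks in large batches, so the processor side spends its time transforming data rather than fetching it block-by-block over RPC.

```typescript
// Conceptual sketch of batch-oriented indexing (not actual Squid SDK code)

type Block = { height: number; logs: string[] }

// Hypothetical stand-in for an Archive: yields blocks in large batches.
function* archiveBatches(from: number, to: number, batchSize: number): Generator<Block[]> {
  for (let start = from; start <= to; start += batchSize) {
    const batch: Block[] = []
    for (let h = start; h <= Math.min(start + batchSize - 1, to); h++) {
      batch.push({ height: h, logs: [] })
    }
    yield batch
  }
}

// The "processor" handles each batch in a single pass.
function countBlocks(from: number, to: number, batchSize: number): number {
  let processed = 0
  for (const batch of archiveBatches(from, to, batchSize)) {
    processed += batch.length // a real squid would transform and store data here
  }
  return processed
}
```

The key design point is that each round trip to the data source moves thousands of blocks, amortizing network overhead across the whole batch.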
See the Archives section for more information on how to use public Archives or to learn how to run an Archive locally (local Archives are currently supported only for Substrate chains).
At the moment, Subsquid maintains Archives for the following networks:
- Major EVM chains, including Ethereum, Polygon, Binance Smart Chain, Fantom and Arbitrum. See the full list here.
- Substrate chains, including parachains on Kusama and Polkadot. Additionally, Substrate Archives support EVM and Ink! smart contracts deployed to Moonbeam/Moonriver, Acala, Astar/Shiden, Gear, Aleph Zero.
Squids have a certain structure and are supposed to be developed as regular node.js packages. Use the `sqd init` command to scaffold a new squid project from a suitable template.
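For example, a typical scaffolding flow looks like this (the template and project names below are illustrative; run `sqd init --help` for the current list of templates):

```shell
# Install the Squid CLI, then scaffold and set up a project
npm install -g @subsquid/cli
sqd init my-squid --template evm
cd my-squid
npm ci
```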
Normally a squid project consists of a long-running `processor` service fetching and transforming the data from an Archive, and an `api` service exposing the transformed data with a GraphQL API generated from `schema.graphql`. The `processor` service is defined in `src/processor.ts` by default. Target data sinks for the processor may include a Postgres-compatible database, S3 buckets, BigQuery or a custom store. The `api` service is an independent node.js process and is optional.
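For illustration, a minimal `src/processor.ts` might look roughly as follows. This is a sketch, not a complete implementation: it assumes the `@subsquid/evm-processor` and `@subsquid/typeorm-store` packages, and the archive URL, contract address and topic are placeholders.

```typescript
import {EvmBatchProcessor} from '@subsquid/evm-processor'
import {TypeormDatabase} from '@subsquid/typeorm-store'

const processor = new EvmBatchProcessor()
  // Consume pre-indexed data from an Archive instead of a node RPC endpoint
  .setDataSource({archive: 'https://eth.archive.subsquid.io'}) // placeholder URL
  // Subscribe only to the on-chain data this squid actually needs
  .addLog('0x0000000000000000000000000000000000000000', { // placeholder address
    filter: [['0x...']], // topic0 of the event of interest (placeholder)
  })

processor.run(new TypeormDatabase(), async (ctx) => {
  for (const block of ctx.blocks) {
    for (const item of block.items) {
      // Transform the items and persist entities via ctx.store here
    }
  }
})
```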
The deployment of the squid services to the Aquarium (see below) is managed by the Squid CLI (`sqd`).
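For context, an Aquarium deployment is described by a `squid.yaml` manifest along the following lines. The exact schema is defined in the Aquarium documentation, and all field values here are placeholders:

```yaml
manifestVersion: subsquid.io/v0.1
name: my-squid        # placeholder squid name
version: 1
deploy:
  addons:
    postgres:
  processor:
    cmd: ["node", "lib/processor.js"]
  api:
    cmd: ["npx", "squid-graphql-server"]
```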
The open-source Squid SDK offers an extensive set of tools for developing squids:
- Core classes for the squid processor: `EvmBatchProcessor` for EVM chains and `SubstrateBatchProcessor` for Substrate-based chains.
- `evm-typegen`, `substrate-typegen` and `ink-typegen` tools for generating TypeScript facade classes for type-safe decoding of EVM, Substrate and Ink! smart contract data.
- `typeorm-codegen` generates entity classes from a declarative schema file defined by the squid developer. The entity classes define the schema of the target database and the GraphQL API.
- `graphql-server` is the backend for the GraphQL API served by the `api` service. The GraphQL schema is auto-generated from `schema.graphql`. The resulting API loosely follows the OpenCRUD standard and supports the most common query filters and selectors out of the box. See the GraphQL API section for more details and the configuration options.
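As a sketch of how these pieces fit together, a schema file and the kind of OpenCRUD-style query generated from it might look like this (the `Transfer` entity, its fields and the filter values are invented for illustration):

```graphql
# schema.graphql — defines both the database entities and the GraphQL API
type Transfer @entity {
  id: ID!
  from: String! @index
  to: String!
  amount: BigInt!
}
```

The generated API would then answer queries with filters, ordering and pagination, along these lines:

```graphql
query {
  transfers(where: {from_eq: "0x..."}, orderBy: amount_DESC, limit: 10) {
    id
    to
    amount
  }
}
```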