Multichain indexing

Squids can extract data from multiple chains into a shared data sink. If the data is stored in Postgres, it can then be served as a unified multichain GraphQL API.

To do this, run one processor per source network:

Make a separate entry point (main.ts or equivalent) for each processor. The resulting folder structure may look like this:

├── src
│   ├── bsc
│   │   ├── main.ts
│   │   └── processor.ts
│   ├── eth
│   │   ├── main.ts
│   │   └── processor.ts

Alternatively, parameterize your processor using environment variables: you can set these on a per-processor basis if you use a deployment manifest to run your squid.
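The environment-variable approach can be sketched as follows. This is only an illustration: the `CHAIN` variable, the `ChainSettings` shape, the `settingsFromEnv` helper and the RPC variable names are all hypothetical and not part of any Subsquid API.

```typescript
// Hypothetical per-chain settings; field names and values are illustrative only.
interface ChainSettings {
  stateSchema: string // unique Postgres schema for this processor's state
  rpcEnvVar: string   // name of the variable expected to hold the chain's RPC URL
}

const CHAINS = new Map<string, ChainSettings>([
  ['eth', { stateSchema: 'eth_processor', rpcEnvVar: 'RPC_ETH_HTTP' }],
  ['bsc', { stateSchema: 'bsc_processor', rpcEnvVar: 'RPC_BSC_HTTP' }],
])

// Picks the settings for the chain named in CHAIN (an assumed variable name).
// Throws on a missing or unknown value so that a misconfigured processor
// fails at startup instead of indexing the wrong network.
function settingsFromEnv(env: Record<string, string | undefined>): ChainSettings {
  const chain = env['CHAIN']
  const settings = chain === undefined ? undefined : CHAINS.get(chain)
  if (settings === undefined) {
    throw new Error(`CHAIN must be one of: ${Array.from(CHAINS.keys()).join(', ')}`)
  }
  return settings
}
```

With this in place a single main.ts could call `settingsFromEnv(process.env)` and each processor would differ only in the value of `CHAIN` it is started with.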

Arrange for running the processors alongside each other conveniently:

Add sqd commands for running each processor to commands.json, e.g.

commands.json
...
  "process:eth": {
    "deps": ["build", "migration:apply"],
    "cmd": ["node", "lib/eth/main.js"]
  },
  "process:bsc": {
    "deps": ["build", "migration:apply"],
    "cmd": ["node", "lib/bsc/main.js"]
  },
...

Full example

If you are going to use sqd run for local runs or deploy your squid to Subsquid Cloud, list your processors in the deploy.processor section of your deployment manifest:
```
deploy:
  processor:
    - name: eth-processor
      cmd: [ "sqd", "process:prod:eth" ]
    - name: bsc-processor
      cmd: [ "sqd", "process:prod:bsc" ]
```
Make sure to give each processor a unique name!

On Postgres

Also ensure that the state schema name is unique for each processor:

src/bsc/main.ts
processor.run(
  new TypeormDatabase({
    stateSchema: 'bsc_processor'
  }),
  async (ctx) => { // ...

src/eth/main.ts
processor.run(
  new TypeormDatabase({
    stateSchema: 'eth_processor'
  }),
  async (ctx) => { // ...

Schema and GraphQL API are shared among the processors.

Handling concurrency

Cross-chain data dependencies are to be avoided. With the default isolation level used by TypeormDatabase, SERIALIZABLE, one of the processors will crash with an error whenever two cross-dependent transactions are submitted to the database simultaneously. It will write the correct data when restarted, but such restarts can impact performance, especially in squids that use many (>5) processors.
The alternative isolation level is READ COMMITTED. At this level data dependencies will not crash the processors, but execution is guaranteed to be deterministic only if the sets of records that different processors read and write do not overlap.
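A minimal sketch of how the isolation level could be selected per processor is shown here. It assumes that TypeormDatabase accepts an `isolationLevel` option alongside `stateSchema`; verify this against the typeorm-store reference for your version. The `databaseOptions` helper and the local types are hypothetical and exist only to keep the example self-contained.

```typescript
// Local stand-ins for the option types; the literal names are an assumption
// based on the isolation levels discussed above.
type IsolationLevel = 'SERIALIZABLE' | 'READ COMMITTED'

interface DatabaseOptions {
  stateSchema: string
  isolationLevel?: IsolationLevel
}

// Builds the options for one processor. The state schema is derived from the
// chain name so that it stays unique; the isolation level is left unset
// unless given, so that the database default applies.
function databaseOptions(chain: string, isolationLevel?: IsolationLevel): DatabaseOptions {
  const options: DatabaseOptions = { stateSchema: `${chain}_processor` }
  if (isolationLevel !== undefined) {
    options.isolationLevel = isolationLevel
  }
  return options
}
```

Under that assumption, a processor would be started with something like `new TypeormDatabase(databaseOptions('eth', 'READ COMMITTED'))`.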
To avoid cross-chain data dependencies, use per-chain records for volatile data. E.g. if you track account balances across multiple chains you can avoid overlaps by storing the balance for each chain in a different table row.

When you need to combine the records (e.g. to get the total of all balances across chains), use a custom resolver to do it on the GraphQL server side.

It is OK to use cross-chain entities to simplify aggregation. Just don't store any data in them:

type Account @entity {
  id: ID! # evm address
  balances: [Balance!]! @derivedFrom(field: "account")
}

type Balance @entity {
  id: ID! # chainId + evm address
  account: Account!
  value: BigInt!
}
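The per-chain layout above can be illustrated with a short sketch. It is not the code of an actual resolver: the `BalanceRow` shape, the `balanceId` and `totalBalance` helpers and the `-` separator in the ID are assumptions made for illustration, and a real custom resolver would more likely run the aggregation as a database query.

```typescript
// In-memory stand-in for a Balance row; `account` holds the Account id.
interface BalanceRow {
  id: string
  account: string
  value: bigint
}

// Builds a row id that is unique per (chain, address) pair, so processors
// for different chains never write to the same row.
function balanceId(chainId: number, address: string): string {
  return `${chainId}-${address.toLowerCase()}`
}

// Sums the per-chain balances of one account; this is the aggregation
// that a custom resolver would perform at query time.
function totalBalance(rows: BalanceRow[], account: string): bigint {
  let total = BigInt(0)
  for (const row of rows) {
    if (row.account === account) {
      total += row.value
    }
  }
  return total
}
```

Because each processor only touches rows whose IDs carry its own chain ID, the write sets of the processors stay disjoint, and the total is computed only when it is read.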

On file-store

Ensure that you use a unique target folder for each processor.

Example

A complete example is available here.
