Data Flow

SoraChain AI enables decentralized, privacy-preserving federated learning using subnets of Trainer Nodes, Aggregator Nodes, and Validators. Within each subnet, model updates are trained locally, aggregated, validated, and stored, with every step recorded for transparency.


πŸ” Step-by-Step Data Flow


🔹 1. Subnet Initialization

A Subnet consists of:

  • Multiple Trainer Nodes (edge devices)

  • One or more Aggregator Nodes

  • Distributed set of Validator Nodes

Each participant joins a subnet using cryptographic registration and stakes tokens to participate and be eligible for rewards.
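The composition and staking rule above can be sketched as a small data structure. This is an illustration only, not SoraChain AI's registration protocol: the `Subnet` class, the `min_stake` parameter, and the readiness rule (at least one node of every role) are assumptions, and the cryptographic registration step is reduced to recording a public key string.

```python
from dataclasses import dataclass, field

ROLES = {"trainer", "aggregator", "validator"}


@dataclass
class Subnet:
    min_stake: int  # hypothetical minimum stake, in token units
    members: dict = field(default_factory=dict)  # public key -> (role, stake)

    def register(self, public_key, role, stake):
        """Admit a participant if the role is known and the stake is sufficient."""
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        if stake < self.min_stake:
            raise ValueError("stake below subnet minimum")
        if public_key in self.members:
            raise ValueError("participant already registered")
        self.members[public_key] = (role, stake)

    def is_ready(self):
        """Assumed rule: every role must be present before training starts."""
        present = {role for role, _ in self.members.values()}
        return present == ROLES


subnet = Subnet(min_stake=100)
subnet.register("pk-trainer-1", "trainer", 150)
subnet.register("pk-aggregator-1", "aggregator", 200)
subnet.register("pk-validator-1", "validator", 100)
```

A real deployment would verify a signature over the registration request and lock the stake on-chain; both are outside the scope of this sketch.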


🔹 2. Local Training at Trainer Nodes

Each Trainer Node:

  • Loads the latest Global Model from IPFS.

  • Trains locally on private data using a compatible ML framework (e.g., PyTorch, TensorFlow).

  • Computes model updates (not raw data).

  • Encrypts model update & uploads to IPFS (Parameter Store).

  • Sends metadata + IPFS CID to Aggregator Node.

Local Metadata Includes:

  • Epoch ID

  • Device ID (hashed)

  • Update size & timestamp

  • Encrypted loss/accuracy scores
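The Trainer-side steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not SoraChain AI's client: the function names (`compute_update`, `build_metadata`) and the field names are hypothetical, and a SHA-256 digest stands in for the CID that IPFS would return. Encryption, the IPFS upload, and the encrypted loss/accuracy scores are omitted.

```python
import hashlib
import json
import time


def compute_update(global_weights, local_weights):
    """Return the per-parameter delta (local - global); no raw data is involved."""
    return [l - g for g, l in zip(global_weights, local_weights)]


def build_metadata(epoch_id, device_id, update, timestamp=None):
    """Build the metadata record sent to the Aggregator (hypothetical schema)."""
    payload = json.dumps(update).encode("utf-8")
    return {
        "epoch_id": epoch_id,
        # Only a hash of the device ID leaves the node.
        "device_id_hash": hashlib.sha256(device_id.encode("utf-8")).hexdigest(),
        "update_size_bytes": len(payload),
        "timestamp": timestamp if timestamp is not None else int(time.time()),
        # Placeholder content hash; a real deployment would use the IPFS CID.
        "update_digest": hashlib.sha256(payload).hexdigest(),
    }


update = compute_update([0.10, 0.20, 0.30], [0.15, 0.18, 0.33])
metadata = build_metadata(
    epoch_id=7, device_id="edge-device-001", update=update, timestamp=1700000000
)
```

Sending only the delta and a hashed device identifier is what keeps private training data and the device's identity on the Trainer Node.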


🔹 3. Aggregation at Aggregator Node

The Aggregator Node:

  • Collects encrypted updates from Trainer Nodes.

  • Decrypts & verifies structure + signature.

  • Validates update format & cryptographic integrity.

  • Aggregates updates securely (FedAvg, FedProx, or custom logic).

  • Builds a new Global Model.

  • Uploads global model parameters to IPFS.

  • Generates ranking metadata for each Trainer Node.

  • Sends all metadata + rankings to Validator Nodes for verification.
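As an illustration of the aggregation step, the sketch below implements plain FedAvg: a weighted average of client updates, with each update weighted by the number of local training samples. It assumes updates arrive as already-decrypted, equal-length lists of floats and that each Trainer Node reports a sample count; neither detail is specified above, and the function names are hypothetical.

```python
def fed_avg(updates, sample_counts):
    """Weighted average of client updates, weighted by local sample count."""
    if not updates or len(updates) != len(sample_counts):
        raise ValueError("need one sample count per update")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all updates must have the same length")
    return [
        sum(u[i] * n for u, n in zip(updates, sample_counts)) / total
        for i in range(dim)
    ]


def apply_update(global_weights, aggregated_update):
    """Add the aggregated delta to the previous global model."""
    return [w + d for w, d in zip(global_weights, aggregated_update)]


updates = [[1.0, 2.0], [3.0, 4.0]]
aggregated = fed_avg(updates, sample_counts=[1, 3])  # [2.5, 3.5]
new_global = apply_update([0.0, 10.0], aggregated)   # [2.5, 13.5]
```

FedProx differs on the Trainer side (it adds a proximal term to the local objective), so the same averaging step could in principle be reused for it.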


🔹 4. Validation by Validator Nodes

Each Validator Node:

  • Fetches Trainer Node metadata & rankings from Aggregator.

  • Uses smart contract-based rules to:

    • Validate contribution quality (based on accuracy/loss deltas).

    • Cross-check update time consistency.

    • Compare model gradients or hashes if needed.

  • Computes a reputation score for each Trainer Node.

  • Confirms or disputes rankings.

  • Posts validation proof + reputation updates to blockchain.
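The validation rules above might look like the following when written as deterministic functions. Everything here is a hypothetical sketch: the record fields (`loss_before`, `loss_after`, `timestamp`), the thresholds, and the exponential-moving-average reputation update are assumptions chosen for illustration, and the loss values are assumed to be available to the Validator in plaintext.

```python
def validate_contribution(record, round_start, round_end, min_loss_improvement=0.0):
    """Apply deterministic checks to one Trainer record (hypothetical rules)."""
    loss_delta = record["loss_before"] - record["loss_after"]
    in_window = round_start <= record["timestamp"] <= round_end
    return {
        "loss_delta": loss_delta,
        "quality_ok": loss_delta > min_loss_improvement,
        "time_ok": in_window,
    }


def update_reputation(previous, checks, alpha=0.2):
    """Exponential moving average of pass/fail outcomes, kept in [0, 1]."""
    outcome = 1.0 if checks["quality_ok"] and checks["time_ok"] else 0.0
    return (1 - alpha) * previous + alpha * outcome


def confirm_ranking(aggregator_ranking, loss_deltas):
    """Confirm the ranking only if it matches the order of validated loss deltas."""
    expected = sorted(loss_deltas, key=lambda node: loss_deltas[node], reverse=True)
    return aggregator_ranking == expected


records = {
    "node-a": {"loss_before": 0.9, "loss_after": 0.6, "timestamp": 105},
    "node-b": {"loss_before": 0.9, "loss_after": 0.8, "timestamp": 250},
}
checks = {n: validate_contribution(r, 100, 200) for n, r in records.items()}
reputation = {n: update_reputation(0.5, c) for n, c in checks.items()}
ranking_ok = confirm_ranking(
    ["node-a", "node-b"], {n: c["loss_delta"] for n, c in checks.items()}
)
```

Because each function depends only on its inputs, every Validator that receives the same metadata reaches the same result, which is the property a smart-contract rule needs.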


🔹 5. Blockchain Metadata Store

Stores training metadata immutably:

  • 🔗 Hashed model update references

  • ⏰ Timestamps of training rounds

  • 🆔 Trainer Node identity hashes

  • 🏷️ Reputation & ranking data from Validators

  • 📋 Audit logs of model version history

  • ✅ Smart contract validation outcomes
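To show why records stored this way are tamper-evident, here is a minimal hash-chained log. It is not a blockchain and not SoraChain AI's storage layer: there is no consensus, signing, or networking, and the `AuditLog` class and record fields are hypothetical. It only demonstrates the underlying idea that each entry commits to the hash of the one before it.

```python
import hashlib
import json


class AuditLog:
    """Append-only log where each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, record):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        self.entries.append({"prev_hash": prev_hash, "record": record, "hash": entry_hash})
        return entry_hash

    def verify(self):
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            body = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append({"round": 1, "trainer_hash": "ab12", "reputation": 0.6})
log.append({"round": 2, "trainer_hash": "ab12", "reputation": 0.68})
intact = log.verify()
```

Changing any stored record after the fact causes `verify()` to return `False`, because the recomputed hash no longer matches the one later entries were built on.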


🔹 6. Parameter Store

Stores actual model parameters and update data:

  • 🌐 Global model weights (latest and historical)

  • 🔄 Aggregated model updates per round

  • 📊 Statistical summaries of updates

  • Hashes of each parameter state for integrity

  • 📚 Versioned model artifacts for rollback or re-training
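The integrity and rollback properties listed above can be illustrated with a small content-addressed store. This is an in-memory sketch under assumptions, not the IPFS-backed Parameter Store itself: the `ParameterStore` class and its methods are hypothetical, and a SHA-256 digest plays the role that a CID plays in IPFS.

```python
import hashlib
import json


class ParameterStore:
    """In-memory, content-addressed store with a version history (illustrative only)."""

    def __init__(self):
        self._blobs = {}   # digest -> serialized weights
        self.history = []  # digests in the order rounds were published

    def put(self, weights):
        blob = json.dumps(weights).encode("utf-8")
        digest = hashlib.sha256(blob).hexdigest()
        self._blobs[digest] = blob
        self.history.append(digest)
        return digest

    def get(self, digest):
        blob = self._blobs[digest]
        # Re-hash on read so corruption is detected rather than silently returned.
        if hashlib.sha256(blob).hexdigest() != digest:
            raise ValueError("stored parameters do not match their digest")
        return json.loads(blob)

    def rollback(self, steps=1):
        """Return the weights published `steps` rounds before the latest."""
        if steps < 0 or steps >= len(self.history):
            raise IndexError("not enough history to roll back that far")
        return self.get(self.history[-1 - steps])


store = ParameterStore()
v1 = store.put([0.1, 0.2])
v2 = store.put([0.3, 0.4])
latest = store.get(v2)
previous = store.rollback()
```

Because the address of each artifact is derived from its content, the digest recorded on-chain is enough to check that the parameters fetched later are the ones that were published.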


🧠 Developer Insight: Trust but Verify

  • No raw data ever leaves a Trainer Node

  • All updates are encrypted, hashed, and signed

  • Validators use deterministic smart contracts to resolve disputes

  • Model lineage is traceable via blockchain audit logs

  • Incentives are tied to reputation and contribution quality

In essence, SoraChain AI's framework facilitates the collaborative improvement of machine learning models by enabling decentralized hosting, transparent data handling, and continuous model updates through blockchain technology, ultimately aiming to make AI more accessible and inclusive.
