Data Flow
SoraChain AI enables decentralized, privacy-preserving federated learning using subnets of Trainer Nodes, Aggregator Nodes, and Validator Nodes. The data flow across the system ensures that model updates are produced locally, then securely aggregated, validated, and stored with full transparency.


Step-by-Step Data Flow
1. Subnet Initialization
A Subnet consists of:
Multiple Trainer Nodes (edge devices)
One or more Aggregator Nodes
Distributed set of Validator Nodes
Each participant joins a subnet using cryptographic registration and stakes tokens to participate and be eligible for rewards.
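The membership rules above can be sketched as a small registry. This is only an illustration: the `Subnet` class, the `MIN_STAKE` value, and the readiness thresholds are assumptions made for the example, and cryptographic registration is left out entirely.

```python
from dataclasses import dataclass, field

ROLES = ("trainer", "aggregator", "validator")  # roles named in the text
MIN_STAKE = 100  # hypothetical minimum stake; the source gives no figure


@dataclass
class Subnet:
    """Hypothetical in-memory registry of staked subnet participants."""
    members: dict = field(default_factory=dict)  # node_id -> (role, stake)

    def register(self, node_id: str, role: str, stake: int) -> None:
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        if stake < MIN_STAKE:
            raise ValueError("stake below the minimum required to join")
        if node_id in self.members:
            raise ValueError(f"{node_id} is already registered")
        self.members[node_id] = (role, stake)

    def count(self, role: str) -> int:
        return sum(1 for r, _ in self.members.values() if r == role)

    def is_ready(self) -> bool:
        # Assumed thresholds: at least two trainers ("multiple"),
        # one aggregator, and one validator.
        return (self.count("trainer") >= 2
                and self.count("aggregator") >= 1
                and self.count("validator") >= 1)
```

For example, a subnet with two trainers, one aggregator, and one validator, each staking at least `MIN_STAKE`, would report `is_ready()` as true.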
2. Local Training at Trainer Nodes
Each Trainer Node:
Loads the latest Global Model from IPFS.
Trains locally on private data using a compatible ML framework (e.g., PyTorch, TensorFlow).
Computes model updates (not raw data).
Encrypts model update & uploads to IPFS (Parameter Store).
Sends metadata + IPFS CID to Aggregator Node.
Local Metadata Includes:
Epoch ID
Device ID (hashed)
Update size & timestamp
Encrypted loss/accuracy scores
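A minimal sketch of how a Trainer Node might assemble this metadata before sending it to the Aggregator Node. The function name, the field names, and the choice of SHA-256 for hashing the device ID are assumptions for illustration. The scores are assumed to be encrypted already, and the CID is a placeholder string rather than a real IPFS identifier.

```python
import hashlib
import json
import time


def build_update_metadata(epoch_id, device_id, update_bytes, cid, encrypted_scores):
    """Assemble the metadata a Trainer Node sends alongside its IPFS CID."""
    return {
        "epoch_id": epoch_id,
        # Only a hash of the device ID is shared, never the raw ID.
        "device_id_hash": hashlib.sha256(device_id.encode("utf-8")).hexdigest(),
        "update_size": len(update_bytes),      # size of the encrypted update in bytes
        "timestamp": int(time.time()),         # seconds since the Unix epoch
        # Assumed to be ciphertext already; encryption is out of scope here.
        "encrypted_scores": encrypted_scores,
        "cid": cid,
    }


meta = build_update_metadata(
    epoch_id=7,
    device_id="device-42",
    update_bytes=b"\x00" * 128,   # stand-in for an encrypted update
    cid="example-cid",            # placeholder, not a real CID
    encrypted_scores="deadbeef",  # placeholder ciphertext
)
print(json.dumps(meta, indent=2))
```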
3. Aggregation at Aggregator Node
The Aggregator Node:
Collects metadata and IPFS CIDs from Trainer Nodes and fetches the corresponding encrypted updates from IPFS.
Decrypts & verifies structure + signature.
Validates update format & cryptographic integrity.
Aggregates updates securely (FedAvg, FedProx, or custom logic).
Builds a new Global Model.
Uploads global model parameters to IPFS.
Generates ranking metadata for each Trainer Node.
Sends all metadata + rankings to Validator Nodes for verification.
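To make the aggregation step concrete, here is a minimal FedAvg sketch over already decrypted and verified updates. It assumes each update arrives as a flat list of floats together with the number of local training samples. The sample count is an assumption of this example, since the metadata listed earlier does not include it.

```python
def fed_avg(updates):
    """Sample-weighted average of parameter vectors (FedAvg).

    `updates` is a list of (weights, num_samples) pairs, where `weights`
    is a flat list of floats and `num_samples` is a positive integer.
    """
    if not updates:
        raise ValueError("no updates to aggregate")
    dim = len(updates[0][0])
    if any(len(w) != dim for w, _ in updates):
        raise ValueError("all updates must have the same length")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    # Each coordinate is the sum of (weight * samples) divided by all samples.
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


new_global = fed_avg([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
print(new_global)  # [2.5, 3.5]
```

Weighting by sample count lets Trainer Nodes with more data influence the global model proportionally. FedProx or custom logic would replace this function.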
4. Validation by Validator Nodes
Each Validator Node:
Fetches Trainer Node metadata & rankings from the Aggregator Node.
Uses smart contract-based rules to:
Validate contribution quality (based on accuracy/loss deltas).
Cross-check update time consistency.
Compare model gradients or hashes if needed.
Computes a reputation score for each Trainer Node.
Confirms or disputes rankings.
Posts validation proof + reputation updates to blockchain.
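The validation rules above could look roughly like the following. This is a sketch under stated assumptions, not the actual smart contract logic: the scoring formula (an exponential moving average of relative loss improvement), the `alpha` parameter, and both function names are hypothetical.

```python
def reputation_score(prev_score, loss_before, loss_after, alpha=0.2):
    """Blend the previous reputation with this round's relative loss improvement."""
    if loss_before <= 0:
        raise ValueError("loss_before must be positive")
    improvement = (loss_before - loss_after) / loss_before
    quality = max(0.0, min(1.0, improvement))  # clamp to [0, 1]
    return (1 - alpha) * prev_score + alpha * quality


def check_ranking(claimed_ranking, scores):
    """Return 'confirm' if the claimed ranking matches the order implied by scores.

    Ties are broken by node ID so that every Validator Node reaches the
    same result from the same inputs.
    """
    expected = sorted(scores, key=lambda node: (-scores[node], node))
    return "confirm" if list(claimed_ranking) == expected else "dispute"


scores = {"node-a": 0.9, "node-b": 0.4}
print(check_ranking(["node-a", "node-b"], scores))  # confirm
print(check_ranking(["node-b", "node-a"], scores))  # dispute
```

Keeping both functions free of randomness and wall-clock input is what would let independent validators agree on the outcome.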
5. Blockchain Metadata Store
Stores training metadata immutably:
Hashed model update references
Timestamps of training rounds
Trainer Node identity hashes
Reputation & ranking data from Validators
Audit logs of model version history
Smart contract validation outcomes
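The immutability property can be illustrated with a hash-chained, append-only log. This is a simplified stand-in for an actual blockchain: the `AuditLog` class and the record fields are hypothetical, and there is no consensus or networking here.

```python
import hashlib
import json


class AuditLog:
    """Append-only log where each entry commits to the previous one by hash."""

    GENESIS = "0" * 64  # placeholder hash used before the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev_hash, record):
        payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, record):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry = {"prev": prev_hash, "record": record,
                 "hash": self._digest(prev_hash, record)}
        self.entries.append(entry)
        return entry["hash"]

    def verify(self):
        """Recompute every hash; any edited record breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append({"round": 1, "update_ref": "hash-of-update-1"})  # placeholder values
log.append({"round": 2, "update_ref": "hash-of-update-2"})
print(log.verify())  # True
```

Changing any stored record after the fact makes `verify()` return false, which is the property the audit logs rely on.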
6. Parameter Store
Stores actual model parameters and update data:
Global model weights (latest and historical)
Aggregated model updates per round
Statistical summaries of updates
Hashes of each parameter state for integrity
Versioned model artifacts for rollback or re-training
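A rough sketch of a versioned, hash-addressed store with rollback. It is an in-memory stand-in for the IPFS-backed Parameter Store: the `ParameterStore` class is hypothetical, and its keys are plain SHA-256 digests, not real IPFS CIDs.

```python
import hashlib
import json


class ParameterStore:
    """In-memory stand-in for the Parameter Store, keyed by content hash."""

    def __init__(self):
        self._blobs = {}     # digest -> serialized weights
        self._versions = []  # digests in publication order

    def put(self, weights):
        blob = json.dumps(weights).encode("utf-8")
        digest = hashlib.sha256(blob).hexdigest()
        self._blobs[digest] = blob
        self._versions.append(digest)
        return digest

    def get(self, digest):
        blob = self._blobs[digest]
        # Integrity check: the content must still match its address.
        if hashlib.sha256(blob).hexdigest() != digest:
            raise ValueError("stored parameters do not match their hash")
        return json.loads(blob)

    def latest(self):
        return self.get(self._versions[-1])

    def rollback(self, steps=1):
        """Move the 'current' pointer back by `steps` versions.

        Older blobs stay retrievable by digest; only the version list shrinks.
        """
        if steps < 1 or steps >= len(self._versions):
            raise ValueError("cannot roll back past the first version")
        del self._versions[-steps:]
        return self.latest()


store = ParameterStore()
v1 = store.put([0.1, 0.2])
v2 = store.put([0.3, 0.4])
print(store.latest())    # [0.3, 0.4]
print(store.rollback())  # [0.1, 0.2]
```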
Developer Insight: Trust but Verify
No raw data ever leaves a Trainer Node
All updates are encrypted, hashed, and signed
Validators use deterministic smart contracts to resolve disputes
Model lineage is traceable via blockchain audit logs
Incentives are tied to reputation and contribution quality
In essence, SoraChain AI's framework facilitates the collaborative improvement of machine learning models by enabling decentralized hosting, transparent data handling, and continuous model updates through blockchain technology, ultimately aiming to make AI more accessible and inclusive.