Aggregator Node Guide

How to start the aggregator node and automate server configuration

This guide provides step-by-step instructions for the entire Aggregator workflow. By the end, you will have created configuration files, requested task assignment(s) from the admin, assigned tasks to the executor/trainer nodes, saved the metadata of the training workflow, updated the global model, uploaded loss score(s), and claimed your staking rewards.

1. Connect wallet with SoraChain AI Dashboard.

Connect your MetaMask wallet and log in to the dashboard.

Stake the necessary tokens for the Aggregator node role.

You will see the API keys for the authenticated user. You will pass these API keys as arguments to the Aggregator/Server automation module.

At the moment, API key usage is disabled for testing purposes, so you can proceed directly to the next step.

2. Set up your environment

Windows

For Windows users, we suggest installing WSL. Follow the WSL installation guide.

Mac/Linux

You can install Anaconda from the official Anaconda download page.

Set up a container using SoraChain AI's Docker image.

3. Set up Aggregator/Server repository

Before proceeding, make sure you have cloned the client repository described in the "AI Layer Repo" section.

After setting up the environment, download the test AI Layer repository to work with different nodes.

# Clone dev version of AI Layer of SoraEngine
git clone https://github.com/SoraChain-AI/testnet-AILayer-Node.git
cd testnet-AILayer-Node/
git checkout prepareProvision

# Set PYTHONPATH environment variable
export PYTHONPATH=${PYTHONPATH}:${PWD}

# Install required Python packages
pip install --no-cache-dir -r requirements.txt
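
Before running the automation, it can help to confirm that the setup steps above took effect. The following is a minimal sketch of such a check, not part of the SoraEngine repository: the function name preflight and the two conditions it tests (requirements.txt is present, and the repository root appears on PYTHONPATH) are assumptions chosen for illustration.

```python
import os
from pathlib import Path


def preflight(repo_root, pythonpath):
    """Return a list of setup problems (empty when everything looks fine).

    Hypothetical helper: it only checks the two things this guide sets up,
    the requirements file and the PYTHONPATH entry for the repository root.
    """
    root = Path(repo_root).resolve()
    problems = []

    # The guide installs packages from requirements.txt in the repository root.
    if not (root / "requirements.txt").is_file():
        problems.append(f"requirements.txt not found in {root}")

    # The guide appends the repository root (${PWD}) to PYTHONPATH.
    entries = [Path(p).resolve() for p in pythonpath.split(os.pathsep) if p]
    if root not in entries:
        problems.append(f"{root} is not on PYTHONPATH")

    return problems


# Run from inside the cloned repository.
for problem in preflight(Path.cwd(), os.environ.get("PYTHONPATH", "")):
    print("setup issue:", problem)
```

An empty result means both conditions hold; any printed line points at the setup step to repeat.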

We have developed an automation script that handles the complete setup process, including configuring the environment, uploading configuration files to the database for trainer nodes, and initializing the aggregator node. The aggregator assigns tasks to trainer nodes and ultimately builds a global model.

The script requires the following parameters as arguments:

Client IDs – Unique identifiers for trainers participating in the network
Model name or path – Specifies the model to be used for training
Data path – Location of the dataset
Workspace directory – Where configuration files are generated before being uploaded to the database
Training mode – Defined by the task creator; SoraEngine supports standard SFT training as well as efficient LoRA PEFT training and quantization
SoraAccess keys – Used for authentication
SoraBucketName – Specifies the directory where trainer node configuration files will be uploaded
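
To make the shape of these arguments concrete, here is a minimal argparse sketch that accepts the same flag names used in the example command in this guide. It is not the actual parser in AutoMateServer.py: which flags are required, the default for --min_clients, and treating --save_global_state as a boolean switch are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Illustrative parser only; the real script may define its flags differently."""
    parser = argparse.ArgumentParser(description="Aggregator automation (sketch)")
    # One or more trainer identifiers, e.g. --client_ids Client1 Client2
    parser.add_argument("--client_ids", nargs="+", required=True)
    parser.add_argument("--model_name_or_path", required=True)
    parser.add_argument("--data_path", required=True)
    parser.add_argument("--workspace_dir", required=True)
    parser.add_argument("--train_mode", required=True)
    # Assumed default; the guide's example passes 1 explicitly.
    parser.add_argument("--min_clients", type=int, default=1)
    # Assumed to be a plain on/off switch, since the example passes no value.
    parser.add_argument("--save_global_state", action="store_true")
    # Optional here because the guide says keys can be omitted in dev/test.
    parser.add_argument("--SORA_ACCESS_KEY_ID", default=None)
    parser.add_argument("--SORA_SECRET_ACCESS_KEY", default=None)
    parser.add_argument("--SORA_BUCKET_NAME", default=None)
    return parser


args = build_parser().parse_args([
    "--client_ids", "Client1", "Client2",
    "--model_name_or_path", "crumb/nano-mistral",
    "--data_path", "data/Output",
    "--workspace_dir", "workspace/SoraWorkspace",
    "--train_mode", "PEFT",
    "--save_global_state",
])
print(args.client_ids, args.train_mode, args.min_clients)
```

Using nargs="+" for --client_ids is what lets several trainer IDs follow a single flag, as in the example command.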

This automation simplifies the deployment of trainer and aggregator nodes within the SoraEngine ecosystem, ensuring seamless task assignment and global model aggregation. 🚀

python AutoMateServer.py --client_ids Client1 Client2 \
--model_name_or_path crumb/nano-mistral \
--data_path ${PWD}/data/Output --min_clients 1 \
--workspace_dir ${PWD}/workspace/SoraWorkspace \
--train_mode PEFT --save_global_state \
--SORA_ACCESS_KEY_ID "*****" --SORA_SECRET_ACCESS_KEY "******" \
--SORA_BUCKET_NAME "sorachaintestnode"

In the dev/test environment, the automation can be run without defining access keys. By default, configuration files are stored in the workspace/SoraWorkspace directory.
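
One way to handle both cases from a single launcher is to add the key flags only when credentials are available. The sketch below assembles the command as a list suitable for subprocess.run. The helper name build_command and the convention of reading credentials from environment variables named after the flags are assumptions for illustration, not behavior of AutoMateServer.py itself.

```python
import os
import shlex

# Assumption: credentials live in environment variables named after the flags.
OPTIONAL_KEYS = ("SORA_ACCESS_KEY_ID", "SORA_SECRET_ACCESS_KEY", "SORA_BUCKET_NAME")


def build_command(client_ids, model, data_path, workspace_dir, train_mode, env=None):
    """Build the AutoMateServer.py command line, omitting unset key flags."""
    env = os.environ if env is None else env
    cmd = [
        "python", "AutoMateServer.py",
        "--client_ids", *client_ids,
        "--model_name_or_path", model,
        "--data_path", data_path,
        "--min_clients", "1",
        "--workspace_dir", workspace_dir,
        "--train_mode", train_mode,
        "--save_global_state",
    ]
    # Append each credential flag only when a non-empty value is present.
    for name in OPTIONAL_KEYS:
        value = env.get(name)
        if value:
            cmd += [f"--{name}", value]
    return cmd


# Dev/test run: an empty environment means no key flags are added.
dev_cmd = build_command(
    ["Client1", "Client2"], "crumb/nano-mistral",
    "data/Output", "workspace/SoraWorkspace", "PEFT", env={},
)
print(shlex.join(dev_cmd))
# To launch it: subprocess.run(dev_cmd, check=True)
```

Passing the command as a list avoids shell quoting problems, and keeping secrets in the environment keeps them out of shell history.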
