SLURM User Management
Adding Users to Accounts
Via AWS Secrets Manager (Recommended for ODIN)
Users are defined in odin/terraform/users.yaml and automatically synchronized to AWS Secrets Manager, which provisions them on the cluster.
1. Edit users.yaml
# odin/terraform/users.yaml
users:
  john_doe:
    name: "John Doe"
    email: "john.doe@roche.com"
    project: "odin"   # Project/account assignment
  jane_smith:
    name: "Jane Smith"
    email: "jane.smith@roche.com"
    project: "albus"  # Another project
  bob_jones:
    name: "Bob Jones"
    email: "bob.jones@roche.com"
    project: "qcs"    # CPU project
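Before committing, it can help to confirm which project each user will land in. The following is a minimal sketch, assuming the flat layout shown above (one key per line, usernames as bare keys under users). The helper name list_user_projects is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch: print "<username> <project>" for every entry in a users.yaml-style file.
# Assumes the flat layout shown above; list_user_projects is a hypothetical helper.
list_user_projects() {
  awk '
    { sub(/^[ \t]+/, ""); sub(/[ \t]*#.*$/, "") }  # drop indentation and trailing comments
    /^users:$/ { next }                            # skip the top-level key
    /^[A-Za-z0-9_]+:$/ {                           # a bare key starts a new user entry
      user = substr($0, 1, length($0) - 1)
      next
    }
    /^project:/ {
      value = $0
      sub(/^project:[ \t]*/, "", value)
      gsub(/"/, "", value)
      if (user != "") print user, value
    }
  ' "$1"
}

# Usage: list_user_projects odin/terraform/users.yaml
```

This is a line-oriented check, not a YAML parser, so it will not cope with nested lists or multi-line values.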
2. Commit and Push
cd odin/terraform
git add users.yaml
git commit -m "Add new users to cluster"
git push origin main
3. GitHub Action Triggers Automatically
The .github/workflows/update-users.yml workflow:
- Detects changes to odin/terraform/users.yaml
- Extracts user information and project assignments
- Creates/updates AWS Secrets Manager secrets with user config
- Cluster reads secrets and provisions users with correct account assignment
4. Verify User Creation
ssh headnode
# List all users
getent passwd | grep -v "root\|slurm\|daemon" | head -20
# Check the user's account associations
/opt/slurm/bin/sacctmgr show user john_doe WithAssoc
Manual User Creation (Temporary)
To provision a user immediately, before the next Terraform apply:
ssh headnode
# Add user to system
sudo useradd -m -s /bin/bash john_doe
sudo passwd john_doe
# Add user to SLURM account
sudo /opt/slurm/bin/sacctmgr add user john_doe account=odin
# Set as default account for user
sudo /opt/slurm/bin/sacctmgr modify user john_doe set DefaultAccount=odin
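The manual steps above can be wrapped in a small function so they are reviewed before anything changes on the headnode. This is a sketch only: provision_user, run, and DRY_RUN are hypothetical names, the username pattern is an assumed convention, and the interactive passwd step is left out. sacctmgr's -i flag applies changes without the confirmation prompt.

```shell
#!/usr/bin/env bash
# Sketch: print (default) or execute the manual provisioning steps for one user.
# provision_user, run and DRY_RUN are hypothetical names used for illustration.
run() {
  # With DRY_RUN=1 (the default) only print the command; otherwise execute it.
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi
}

provision_user() {
  local user="$1" account="$2"
  # Assumed convention: lowercase POSIX-style usernames only.
  if ! [[ "$user" =~ ^[a-z_][a-z0-9_-]*$ ]]; then
    echo "invalid username: $user" >&2
    return 1
  fi
  run sudo useradd -m -s /bin/bash "$user"
  run sudo /opt/slurm/bin/sacctmgr -i add user "$user" account="$account"
  run sudo /opt/slurm/bin/sacctmgr -i modify user "$user" set DefaultAccount="$account"
}

# Usage (prints the commands): provision_user john_doe odin
# Usage (executes them):       DRY_RUN=0 provision_user john_doe odin
```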
Managing User Account Assignments
List User Account Memberships
/opt/slurm/bin/sacctmgr show user john_doe WithAssoc
Example output (simplified; sacctmgr prints these fields as table columns):
User=john_doe
Admin=None
Account=albus (default), genius, odin
Coords=
Add User to Additional Account
sudo /opt/slurm/bin/sacctmgr add user john_doe account=bali
Remove User from Account
sudo /opt/slurm/bin/sacctmgr delete user john_doe account=qcs
Change Default Account
sudo /opt/slurm/bin/sacctmgr modify user john_doe set DefaultAccount=odin
User Resource Limits
Set per-user limits within an account:
CPU Limits
sudo /opt/slurm/bin/sacctmgr modify user john_doe account=odin \
set MaxCpus=96
Job Limits
sudo /opt/slurm/bin/sacctmgr modify user john_doe account=odin \
set MaxJobs=20
Memory Limits
sudo /opt/slurm/bin/sacctmgr modify user john_doe account=odin \
set MaxTRES=mem=500000 # In MB
Viewing Limits
/opt/slurm/bin/sacctmgr show user john_doe WithAssoc format=User,Account,MaxCpus,MaxJobs
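To spot associations that still have no job limit, the same listing can be requested in machine-readable form and filtered. A minimal sketch, assuming sacctmgr's --parsable2 output, where an unset limit appears as an empty field. find_unlimited is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: list associations that have no MaxJobs limit set.
# Reads "User|Account|MaxJobs" lines on stdin; find_unlimited is hypothetical.
find_unlimited() {
  awk -F'|' 'NF >= 2 && $3 == "" { print $1, $2 }'
}

# Usage on the headnode:
#   sacctmgr --noheader --parsable2 show user john_doe WithAssoc \
#     format=User,Account,MaxJobs | find_unlimited
```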
User Job Submission
Check Available Accounts
# As the user:
sacctmgr show assoc user=$(whoami) format=User,Account
# or, including the default account
sacctmgr show user $(whoami) WithAssoc format=User,DefaultAccount,Account
Submit Job to Specific Account
# Use default account
sbatch my_script.sh
# Use specific account
sbatch --account=odin my_script.sh
# Verify job was submitted to correct account
squeue -u $(whoami)
Listing Users by Account
All Users in Account
/opt/slurm/bin/sacctmgr show user account=odin
Users with Job History
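# Jobs and CPU-seconds per user (sacct defaults to jobs since midnight; add --starttime for a longer window)
sacct --allusers --allocations --noheader --parsable2 \
--accounts=odin \
--format="User,Account,CPUTimeRAW" \
| awk -F'|' '{ jobs[$1]++; cpu[$1] += $3 } END { for (u in jobs) print u, jobs[u], cpu[u] }' | sort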
Active Users (With Recent Jobs)
sacct --allusers --allocations --noheader \
--starttime "$(date -d '7 days ago' '+%Y-%m-%d')" \
--accounts=odin \
--format="User,Account,State" \
--parsable2 | awk -F'|' '!seen[$1]++' | sort
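Comparing the full membership list with the list of recently active users shows who has not run anything lately. The sketch below assumes both lists have been saved as one username per line. inactive_users and the file names all.txt and active.txt are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print users that appear in ALL_FILE but not in ACTIVE_FILE.
# inactive_users is a hypothetical helper; both files hold one username per line.
inactive_users() {
  comm -23 <(sort -u "$1") <(sort -u "$2")
}

# One way to produce the inputs on the headnode (file names are placeholders):
#   sacctmgr --noheader --parsable2 show assoc account=odin format=User \
#     | grep -v '^$' > all.txt
#   sacct --allusers --allocations --noheader --parsable2 --accounts=odin \
#     --starttime "$(date -d '7 days ago' '+%Y-%m-%d')" --format=User \
#     | sort -u > active.txt
#   inactive_users all.txt active.txt
```

The grep removes the account's own association row, which has an empty User field.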
User Onboarding Checklist
- User account created in AWS Secrets Manager (via users.yaml)
- User SSH key uploaded to headnode
- User added to SLURM account
- User’s default account set correctly
- User resource limits configured (if needed)
- User can SSH to headnode
- User can submit test job
- User can view job in squeue
- Job completes and appears in sacct
User Offboarding
Remove User from All Accounts
ssh headnode
# Delete from SLURM (removes from all accounts)
sudo /opt/slurm/bin/sacctmgr delete user john_doe
# Remove from system (optional, may want to keep for historical job records)
# sudo userdel -r john_doe
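Deleting a user while their jobs are still queued or running can leave work behind, so a guard before the delete may be useful. This is a sketch under that assumption: offboard_user is a hypothetical helper that reads the user's job IDs on stdin and only prints the delete command for review.

```shell
#!/usr/bin/env bash
# Sketch: refuse to offboard a user who still has jobs in the queue.
# offboard_user is hypothetical; it reads job IDs on stdin, one per line.
offboard_user() {
  local user="$1" jobs
  jobs=$(grep -c . || true)   # number of non-empty input lines
  if [ "$jobs" -gt 0 ]; then
    echo "$user still has $jobs job(s); cancel them first: scancel -u $user" >&2
    return 1
  fi
  # Print the command for review instead of running it.
  echo sudo /opt/slurm/bin/sacctmgr -i delete user "$user"
}

# Usage on the headnode:
#   squeue -u john_doe -h -o %i | offboard_user john_doe
```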
Remove from Specific Account
sudo /opt/slurm/bin/sacctmgr delete user john_doe account=odin
Delegated Account Administration
Make a user a coordinator of an account so they can manage it without full sudo access. (AdminLevel only accepts None, Operator, or Admin and applies cluster-wide; account-scoped delegation uses coordinators.)
# Grant coordinator rights to user for a specific account
sudo /opt/slurm/bin/sacctmgr add coordinator account=odin names=john_doe
# Verify
/opt/slurm/bin/sacctmgr show account odin WithCoord
Coordinators can then:
# Add users to their account
sacctmgr add user new_user account=odin
# View account users
sacctmgr show user account=odin
# Modify user limits
sacctmgr modify user john_doe account=odin set MaxCpus=256
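When a delegated administrator needs to add several users at once, generating the commands first keeps the change reviewable. A minimal sketch: bulk_add_cmds is a hypothetical helper that reads usernames from stdin and prints one sacctmgr command per user without running it.

```shell
#!/usr/bin/env bash
# Sketch: generate the sacctmgr commands to add several users to one account.
# bulk_add_cmds is hypothetical; usernames are read from stdin, one per line.
bulk_add_cmds() {
  local account="$1" user
  while IFS= read -r user || [ -n "$user" ]; do
    [ -z "$user" ] && continue   # skip blank lines
    printf 'sacctmgr -i add user %s account=%s\n' "$user" "$account"
  done
}

# Usage: printf 'alice\nbob\n' | bulk_add_cmds odin
# Pipe the output to "bash" once it has been reviewed.
```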
Troubleshooting
User cannot submit jobs
# Check if user exists in SLURM
sacctmgr show user john_doe
# Check user's accounts
sacctmgr show user john_doe WithAssoc
# Check the limits configured on each account
sacctmgr show account WithAssoc format=Account,MaxCpus,MaxJobs
Job rejected - account not found
# Verify account exists
sacctmgr show account my_account
# Verify user is in account
sacctmgr show user john_doe WithAssoc
SSH key not working
See SSH Setup documentation.
Next Steps
- Account Management - Manage accounts
- Usage Tracking - Monitor user and account usage