SLURM Account Management
Creating a New Account
Via Terraform (Recommended)
Edit odin/terraform/main.tf and add to the slurm_accounts map:
slurm_accounts = {
# ... existing accounts ...
my_project = {
name = "my_project"
description = "My research project"
max_cpus_per_user = 192 # Max CPUs per user in this account
max_jobs_per_user = 50 # Max concurrent jobs per user
}
}
# Also add the account to the slurm_projects list to ensure its partition is created
slurm_projects = ["qcs", "albus", "bali", "genius", "odin", "my_project"]
Then deploy:
cd odin/terraform
terraform plan
terraform apply
Manual Creation on Cluster (Temporary)
SSH to headnode and use sacctmgr:
ssh headnode
# Create new account
sudo /opt/slurm/bin/sacctmgr add account my_project description="My research project"
# Verify creation
/opt/slurm/bin/sacctmgr show account my_project
# Set account limits (optional)
sudo /opt/slurm/bin/sacctmgr modify account my_project \
set MaxCpus=192 MaxJobs=50
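Running the add command twice for the same account reports that there is nothing new to add, so a scripted version may want to check first. The following is a minimal sketch, not part of the existing tooling: the function names account_in_list and ensure_account are hypothetical, and it assumes sacctmgr lives at the same path used above.

```shell
#!/bin/sh
# Sketch: create a SLURM account only if it is not already present.
# SACCTMGR defaults to the path used elsewhere in this document.
SACCTMGR="${SACCTMGR:-/opt/slurm/bin/sacctmgr}"

# Return 0 if the account name ($1) matches a whole line on stdin.
account_in_list() {
    grep -Fxq -- "$1"
}

# Create account $1 with description $2 unless it already exists.
ensure_account() {
    if "$SACCTMGR" --noheader --parsable2 show account format=Account \
        | account_in_list "$1"; then
        echo "account '$1' already exists"
    else
        # --immediate commits without the interactive confirmation prompt
        sudo "$SACCTMGR" --immediate add account "$1" description="$2"
    fi
}
```

Usage on the headnode would look like: ensure_account my_project "My research project"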
Modifying Account Settings
CPU Limits
sudo /opt/slurm/bin/sacctmgr modify account my_project \
set MaxCpus=256
Job Limits
sudo /opt/slurm/bin/sacctmgr modify account my_project \
set MaxJobs=75
Memory Limits
sudo /opt/slurm/bin/sacctmgr modify account my_project \
set MaxTRES=mem=1000000 # Per-job memory limit, in MB
Listing Accounts
List all accounts
/opt/slurm/bin/sacctmgr show account
Example output:
Account Descr Org
---------- ---------- ----------
qcs General c CPU
albus GPU res
bali GPU res
genius GPU res
odin ODIN pr
List with resource limits
/opt/slurm/bin/sacctmgr show account WithAssoc format=Account,Description,MaxCpus,MaxJobs
List account usage
sacct --allusers --allocations --noheader \
--starttime "$(date -d '30 days ago' '+%Y-%m-%d')" \
--accounts=my_project \
--format="Account,User,JobID,CPUTimeRAW" \
--parsable2 | awk -F'|' '
{
# One row per job (--allocations); CPUTimeRAW is already in seconds
acc=$1;
total_jobs[acc]++;
total_cpu_hours[acc] += $4/3600;
}
END {
for (a in total_jobs)
printf "%s: %d jobs, %.1f CPU-hours\n", a, total_jobs[a], total_cpu_hours[a];
}'
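The summary above is aggregated per account. For a per-user breakdown, a small filter over sacct output can do the grouping. This is a sketch under stated assumptions: usage_by_user is a hypothetical helper name, and it expects rows of the form User|CPUTimeRAW (one per job, no header) on stdin.

```shell
#!/bin/sh
# Sketch: summarise "User|CPUTimeRAW" rows as "<user> <jobs> <cpu-hours>".
usage_by_user() {
    awk -F'|' '
        # CPUTimeRAW is in seconds; accumulate per user
        { jobs[$1]++; secs[$1] += $2 }
        END {
            for (u in jobs)
                printf "%s %d %.1f\n", u, jobs[u], secs[u] / 3600
        }' | sort
}
```

One way to feed it: sacct --allusers --allocations --noheader --parsable2 --starttime "$(date -d '30 days ago' '+%Y-%m-%d')" --accounts=my_project --format=User,CPUTimeRAW | usage_by_user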
Deleting an Account
Via Terraform
Remove the account's entry from the slurm_accounts map and its name from the slurm_projects list in odin/terraform/main.tf, then run terraform plan and terraform apply.
Manual Deletion
ssh headnode
# Delete account (also removes its user associations; the users themselves are not deleted)
sudo /opt/slurm/bin/sacctmgr delete account my_project
# Confirm deletion
/opt/slurm/bin/sacctmgr show account my_project
Cost Estimation
CPU-Hours Calculation
sacct --allusers --allocations --noheader \
--starttime "$(date -d '30 days ago' '+%Y-%m-%d')" \
--accounts=my_project \
--format="Account,CPUTimeRAW" \
--parsable2 | awk -F'|' '
{
# CPUTimeRAW is in seconds, so jobs longer than a day need no special parsing
cpu_hours += $2/3600;
}
END {
printf "Total CPU-hours: %.1f\n", cpu_hours;
printf "Est. cost @ $0.40/CPU-hour: $%.2f\n", cpu_hours * 0.40;
}'
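sacct prints the formatted CPUTime field with a leading day count (DD-HH:MM:SS) once a job exceeds 24 CPU-hours, which a plain split on ":" does not handle. When only formatted values are available (for example in a saved report), a conversion along these lines may help. It is a sketch: cputime_to_seconds is a hypothetical helper, and it assumes whole-second values in MM:SS, HH:MM:SS, or DD-HH:MM:SS form.

```shell
#!/bin/sh
# Sketch: convert a sacct-style time string to seconds.
cputime_to_seconds() {
    echo "$1" | awk '{
        days = 0; t = $0
        # Split off an optional leading "DD-" day count
        dash = index(t, "-")
        if (dash > 0) {
            days = substr(t, 1, dash - 1)
            t = substr(t, dash + 1)
        }
        n = split(t, f, ":")
        if (n == 2) secs = f[1] * 60 + f[2]
        else        secs = f[1] * 3600 + f[2] * 60 + f[3]
        print days * 86400 + secs
    }'
}
```

For example, cputime_to_seconds "1-02:03:04" prints 93784 (one day, two hours, three minutes, four seconds).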
GPU Usage Estimation
# List recent GPU jobs for the account (first 10)
sacct --allusers --allocations \
--starttime "$(date -d '30 days ago' '+%Y-%m-%d')" \
--accounts=my_project \
--format="Account,AllocTRES,CPUTime" \
--parsable2 | grep gres/gpu | head -10
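The listing above shows which jobs used GPUs but does not total the GPU-hours. A rough total can be derived by multiplying each job's GPU count by its elapsed time. The sketch below assumes rows of the form AllocTRES|ElapsedRaw on stdin, with GPUs recorded as a gres/gpu=N entry in AllocTRES; gpu_hours is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: sum GPU-hours from "AllocTRES|ElapsedRaw" rows (ElapsedRaw in seconds).
gpu_hours() {
    awk -F'|' '
        {
            n = split($1, tres, ",")
            for (i = 1; i <= n; i++)
                # Count only the generic "gres/gpu=N" entry so that typed
                # entries such as "gres/gpu:a100=N" are not counted twice
                if (tres[i] ~ /^gres\/gpu=/) {
                    sub(/^gres\/gpu=/, "", tres[i])
                    total += tres[i] * $2 / 3600
                }
        }
        END { printf "%.1f\n", total }'
}
```

One way to feed it: sacct --allusers --allocations --noheader --parsable2 --starttime "$(date -d '30 days ago' '+%Y-%m-%d')" --accounts=my_project --format=AllocTRES,ElapsedRaw | gpu_hours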
Account Hierarchy
SLURM accounts support parent-child relationships:
# Create parent account
sudo /opt/slurm/bin/sacctmgr add account parent_account
# Create child account
sudo /opt/slurm/bin/sacctmgr add account child_account parent=parent_account
# Set parent limits that apply to all children
sudo /opt/slurm/bin/sacctmgr modify account parent_account \
set MaxCpus=500
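To see which parents a given account sits under, the parent links can be followed upward. The sketch below is illustrative only: account_lineage is a hypothetical helper, and it assumes input rows of the form Account|ParentName on stdin, where rows with an empty parent (such as user associations) are ignored.

```shell
#!/bin/sh
# Sketch: print the chain of parents for account $1 from "account|parent" rows.
account_lineage() {
    awk -F'|' -v start="$1" '
        $2 != "" { parent[$1] = $2 }   # keep only rows that name a parent
        END {
            path = start
            a = start
            # Walk upward until no parent is recorded; "seen" guards against cycles
            while ((a in parent) && !(parent[a] in seen)) {
                seen[a] = 1
                a = parent[a]
                path = path " -> " a
            }
            print path
        }'
}
```

One way to feed it: /opt/slurm/bin/sacctmgr --noheader --parsable2 show assoc format=Account,ParentName | account_lineage child_account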
Troubleshooting
Cannot create account
# Verify slurmdbd is running
systemctl is-active slurmdbd
# Check database connectivity
mysql -u root slurm_acct_db -e "SELECT 1;" 2>&1
Account created but users can’t submit jobs
- Verify users are associated with the account (see User Management)
- Check account MaxCpus limit:
sacctmgr show account WithAssoc
- Check partition is accepting jobs:
sinfo
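The first check in the list above (whether a user is associated with the account) can be scripted. This is a minimal sketch, assuming rows of the form Account|User on stdin; user_in_account is a hypothetical helper name, and alice is a placeholder user.

```shell
#!/bin/sh
# Sketch: exit 0 if user $1 has an association with account $2.
user_in_account() {
    awk -F'|' -v u="$1" -v a="$2" '
        # Require an exact match on both the account and user fields
        $1 == a && $2 == u { found = 1 }
        END { exit !found }'
}
```

One way to feed it: /opt/slurm/bin/sacctmgr --noheader --parsable2 show assoc format=Account,User | user_in_account alice my_project && echo "associated"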
Next Steps
- User Management - Add users to accounts
- Usage Tracking - Monitor account usage