Job Description
Position Overview (Job Summary):
The role is for an HPC Engineer responsible for designing, deploying, managing, and optimizing an on-premises High Performance Computing (HPC) environment.
The environment includes SLURM-managed CPU and GPU clusters.
The role places strong emphasis on HPC architecture, Linux administration, job scheduling, and cluster operations.
Experience with parallel/distributed storage (Weka FS, Scality) is preferred but not required.
Primary Skills:
HPC Operations & Cluster Management (CPU & GPU)
SLURM Workload Manager (Mandatory): Install, configure, and manage SLURM across multiple clusters
Partitions/queues, fairshare, job priority, scheduling policies
Upgrades, migrations, automation via API/integrations
Linux System Administration (RHEL focus): OS patching, hardening, tuning, package management
Troubleshooting & Performance Optimization: Cluster health, node/job failures, bottlenecks, utilization optimization
Parallel Computing Knowledge: MPI, OpenMP, ...
Apply for this Position
Ready to join HCLTech? Submit your application.