Job Description
Job Title: HPC (High-Performance Computing) System Admin
Location: 100% REMOTE - India
Employment Type: Contract-to-hire role - 6 to 9 month contract
Capacity and Duration
This role is not full-time. We are looking for a candidate with 50% capacity (20 hours per week) for a duration of six months.
Daily Schedule and Expectations
We are looking for a consistent daily presence rather than fragmented hours. The ideal breakdown of the 4-hour workday is as follows:
Monitoring system health and addressing urgent incidents, covering:
Bright Cluster Manager
Slurm queue status and stuck jobs
Authentication system status
Hardware alerts and BMC notifications
Ticket Review: Triage, prioritization, and initial response to the service desk queue.
Project & Escalation (Remaining daily time): Addressing escalated issues and planning future system improvements.
Time Zone and Communication
The candidate must have significant overlap with US Eastern Time (EST). The team's technical expert is based in EST, and frequent interaction will be required.
We are open to candidates in the European time zone, provided they can maintain the necessary EST overlap.
Because this role involves critical system stability and coordination, strong English communication skills are required.
Job Overview:
We are seeking an experienced HPC System Administrator with hands-on expertise in Bright Cluster Manager, Slurm, Linux environments, and HPC command-line operations. This role involves supporting and maintaining existing production HPC clusters, ensuring stable performance, resolving hardware issues, and assisting users to keep computational workflows running efficiently.
Key Responsibilities & Skills:
Bright Cluster Manager: Proficient in administering and monitoring clusters through CMSH, managing system images, and maintaining cluster configurations.
Slurm: Solid understanding of scheduler configuration, handling job prioritization, creating policy exceptions, and managing reservations.
Linux Administration: Strong background in system management, troubleshooting, and providing technical support to users.
Hardware Diagnostics: Ability to identify hardware faults, perform basic server-level troubleshooting, and pinpoint failing components.
BMC/Remote Management: Familiarity with Dell iDRAC, HPE iLO, and Supermicro management interfaces.
Thanks & Best Regards
Piyush Sharma
Recruitment
eMail: [email protected] | www.ishift.net
7014 East Camelback Road, Suite 1452
Scottsdale, Arizona 85251