Job Description

Responsibilities

  • Manage the full OS lifecycle including installation, configuration, patching, and upgrades across HPC environments
  • Monitor system performance across compute nodes, login nodes, and HCI infrastructure, ensuring reliability and uptime
  • Perform troubleshooting and root cause analysis for system-level issues and incidents
  • Support configuration management and automation using scripting and relevant tools
  • Work with internal and external teams to resolve incidents and maintain service continuity
  • Participate in the on-call rotation to support critical escalations
  • Ensure adherence to operational standards, documentation practices, and ITIL processes

Requirements

  • At least 3 years of experience in system administration or infrastructure operations
  • Strong hands-on experience with Linux (preferably Red Hat Enterprise Linux)
