Job Description

NVIDIA is looking for an experienced HPC DevOps and Network Engineer to help build the supercomputers and HPC clusters of the future. As a Senior HPC DevOps Engineer, you will drive breakthroughs in artificial intelligence and GPU computing, working with accelerated computing and deep learning platforms to craft improved workflows and develop leading solutions for scientific researchers, developers, and customers.

What You’ll Be Doing
  • Design, implement, and maintain large‑scale HPC/AI clusters with state‑of‑the‑art monitoring, logging, and alerting systems.
  • Utilize and develop tools to manage infrastructure as code, ensuring scalable and repeatable deployments.
  • Develop and maintain CI/CD pipelines to automate deployment processes.
  • Build automation scripts and tools for deployment, configuration management, and operational monitoring.
  • Develop complex networking automations.
  • Perform comprehensive troubleshooting f...

Apply for this Position

Ready to join NVIDIA? Click the button below to submit your application.

Submit Application