Job Description

Our client is seeking an MLOps Engineer with a strong background in systems programming and infrastructure engineering. This role is focused on owning and evolving the on-premise infrastructure that powers their advanced PyTorch-based training workloads.

This position is a perfect fit for an engineer who is focused not just on model outcomes but also on the quality and robustness of the underlying systems. You will be responsible for building high-quality, maintainable training pipelines, solving low-level systems and networking challenges, and ensuring the training codebase is clean, scalable, and built to last.

Key Responsibilities
  • Architect, build, and maintain end-to-end training and inference pipelines using PyTorch.
  • Develop and maintain high-quality, robust tooling in both Python and C++ to support the entire model training lifecycle.
  • Take full ownership of the core training codeb...
