Job Description

Responsibilities

  • Manage the global AI training environment (servers and data center), which dynamically allocates compute and GPU resources based on model training requirements.
  • Support the MLOps platform for model tracking, cataloging, and deployment using tools such as MLflow and KServe.
  • Develop dashboards that show the servers and environment.
  • Administer AWS cloud and on-premises systems, including usage tracking and billing through a chargeback model.
  • Provide technical support for servers.

Mandatory Skills

  • Unix/Linux and Windows Server administration experience in data center and server environments.
  • Familiarity with containerized environments, Kubernetes/Docker, and Rancher.
  • Experience with virtual machines (VMs), containerized systems, and cloud infrastructure basics (AWS).
  • Scripting experience in a server administration role.

Good to Have

  • Python, SQL, Node.js, and web services design and development.

Apply for this Position

Ready to join YASH Technologies? Submit your application.