Job Description

Key responsibilities
  • Design, build, and operate OCI-based Kubernetes platforms for AI/ML/LLM services with strong security, observability, and reliability.
  • Implement and manage IaC/GitOps for repeatable environments, model/inference deployments, and traffic policies.
  • Enable progressive delivery (blue-green, canary, A/B) with metric-gated rollouts and fast rollback.
  • Stand up and optimize LLM serving stacks, vector search, and RAG pipelines; enforce guardrails and monitor quality/cost SLOs.
  • Integrate Oracle Databases and OCI services securely; manage secrets, credentials, and network segmentation.
  • Establish SLOs, dashboards, runbooks, and incident/DR procedures; lead operational readiness reviews and postmortems.
