Job Description

  • We are looking for a highly skilled Site Reliability Engineer (SRE) to own and evolve our enterprise observability and reliability platforms.
  • This role is responsible for ensuring the availability, performance, scalability, and reliability of large-scale, cloud-native applications running on Kubernetes and OpenShift.
  • The SRE will partner closely with application and platform teams to embed reliability engineering, SLO-driven operations, and automation-first practices.

Key Responsibilities

  • Reliability Engineering & SRE Practices: Define, implement, and continuously improve SLIs, SLOs, and error budgets for enterprise applications.
  • Drive reliability-focused decision-making using error budgets, MTTD, MTTR, and service health metrics.
  • Proactively identify reliability risks and performance bottlenecks and drive remediation.
  • Lead incident response, post-incident reviews (blamel...

Apply for this Position

Ready to join ELLIOTT MOSS CONSULTING PTE. LTD.? Submit your application.