Job Description
- We are looking for a highly skilled Site Reliability Engineer (SRE) to own and evolve our enterprise observability and reliability platforms.
- This role is responsible for ensuring availability, performance, scalability, and reliability of large-scale, cloud-native applications running on Kubernetes and OpenShift.
- The SRE will partner closely with application and platform teams to embed reliability engineering, SLO-driven operations, and automation-first practices.
Key Responsibilities
- Reliability Engineering & SRE Practices: Define, implement, and continuously improve SLIs, SLOs, and error budgets for enterprise applications.
- Drive reliability-focused decision-making using error budgets, MTTD, MTTR, and service health metrics.
- Proactively identify reliability risks and performance bottlenecks and drive remediation.
- Lead incident response, post-incident reviews (blamel...
Apply for this Position
Ready to join ELLIOTT MOSS CONSULTING PTE. LTD.? Click the button below to submit your application.