Job Description
An international technology company is looking for a:
Senior Site Reliability Engineer (SRE)
Responsible for ensuring the reliability, resilience, and operational excellence of a cloud-native product ecosystem by implementing advanced Site Reliability Engineering (SRE) practices. This role has a direct impact on business stability, critical incident management, and SLO compliance.
- Design and maintain end-to-end observability systems (metrics, logs, traces, alerts, and dashboards).
- Define and manage SLIs, SLOs, and error budgets.
- Lead critical incident response and Root Cause Analysis (RCA) processes.
- Manage and mature alerting and on-call systems (OpsGenie).
- Drive continuous improvement in reliability and performance.
- Perform capacity planning and provide operational support for cloud teams.
Requirements
- Solid professional experience in SRE / Cloud Operations.
- Strong expertise in observability tools (Grafana, Prometheus...