Job Description
As a Principal/Chief Site Reliability Engineer, you will play a critical role in designing, developing, and maintaining scalable, highly reliable systems. You'll work closely with development teams to improve system reliability, monitor critical applications, and design fault-tolerant infrastructure.
Responsibilities
- Design and implement scalable, highly available infrastructure and automation solutions.
- Drive adoption of SRE principles, SLAs, SLOs, and error budgets across teams.
- Proactively identify, debug, and resolve complex system reliability issues.
- Build tooling for observability, alerting, and performance monitoring.
- Collaborate with developers and architects on cloud-native design and service resilience.
- Conduct failure analysis, system audits, and root cause investigations.
- Contrib...