Job Description

As a Principal/Chief Site Reliability Engineer, you will play a critical role in designing, developing, and maintaining scalable, highly reliable systems. You'll work closely with development teams to improve system reliability, monitor critical applications, and design fault-tolerant infrastructure.

Responsibilities

  • Design and implement scalable, highly available infrastructure and automation solutions.
  • Drive adoption of SRE principles, SLAs, SLOs, and error budgets across teams.
  • Proactively identify, debug, and resolve complex system reliability issues.
  • Build tooling for observability, alerting, and performance monitoring.
  • Collaborate with developers and architects on cloud-native design and service resilience.
  • Conduct failure analysis, system audits, and root cause investigations.
  • Contrib...

Apply for this Position

Ready to join Collabera? Click the button below to submit your application.