Job Description

Staff-level Site Reliability Engineer focused on scaling reliability maturity across the organization. This role blends deep operational experience, coaching, and system-level thinking, and operates either within platform squads or through rotating enablement engagements.
 
Software or systems engineers with advanced cloud experience, capable of deploying, operating, scaling, and maintaining:

  • EKS and existing AWS infrastructure
  • Third-party systems deployed within the environment
  • CI/CD infrastructure (COTS and internally developed)
     
What they'll do

  • Advance org-wide reliability practices (SLOs, SLIs, error budgets)
  • Lead Tier-0 incident response and operational improvements
  • Identify systemic scaling and reliability gaps
  • Embed with teams to implement SLOs and uplift reliability
  • Improve on-call quality, tooling, and operational hygiene
  • Teach reliability as a repeatable engineering discipline

Require...

Apply for this Position

Ready to join Hre Group? Click the button below to submit your application.
