Job Description

Staff-level Site Reliability Engineer focused on scaling reliability maturity across the organization. This role blends deep operational experience, coaching, and system-level thinking, working either within platform squads or on rotating enablement engagements.

Software/systems engineers with advanced cloud experience, capable of deploying, operating, scaling, and maintaining:

  • EKS and existing AWS infrastructure
  • Third-party systems deployed within the environment
  • CI/CD infrastructure (COTS and internally developed)

What they'll do
- Advance org-wide reliability practices (SLOs, SLIs, error budgets)
- Lead Tier-0 incident response and operational improvements
- Identify systemic scaling and reliability gaps
- Embed with teams to implement SLOs and uplift reliability
- Improve on-call quality, tooling, and operational hygiene
- Teach reliability as a repeatable engineering discipline

Requirements:

  • 10+ yea...

Apply for this Position

Ready to join HRE GROUP? Click the button below to submit your application.