Job Description

What You’ll Do

You’ll operate at the intersection of software engineering and systems engineering, building resilient systems that scale, self-heal, and empower developers to ship safely.

Reliability Engineering

  • Define and manage SLIs, SLOs, and error budgets
  • Reduce MTTD, MTTA, and MTTR through structured incident response
  • Conduct blameless postmortems and drive preventative improvements
  • Champion reliability in architectural reviews and production readiness

Observability & Monitoring

  • Design actionable, symptom-based alerts (not noise)
  • Build dashboards and tracing systems using tools such as CloudWatch, Prometheus, Grafana, New Relic, X-Ray, and ADOT
  • Implement synthetic monitoring to simulate real user journeys (URLs, clickpaths, APIs)
  • Ensure full observability coverage across critical paths

Cloud & Infrastructure

  • Operate and optimize AWS environment...

Apply for this Position

Ready to join Devopie Inc.? Submit your application.