Job Description

Job Title: AWS Site Reliability Engineer (Data Platform)

Role Summary

We are looking for an AWS Site Reliability Engineer (SRE) to support and scale a cloud-native data platform built on AWS, Snowflake, and Databricks. The role focuses on driving reliability through automation, disaster recovery (DR) testing, resiliency engineering, observability, and proactive SLO/SLI/SLA management.

Key Responsibilities

  • Design, build, and maintain automation for infrastructure provisioning, platform operations, and incident response using IaC and CI/CD.
  • Lead resiliency and disaster recovery planning, including regular DR drills, failure testing, and recovery validation across AWS and data platform components.
  • Define, implement, and manage SLIs, SLOs, and SLAs for critical data pipelines and platform services; use error budgets to guide reliability improvements.
  • Build and operate rob...

Apply for this Position

Ready to join Pyramid Consulting, Inc? Click the button below to submit your application.