Job Description
Job Title: AWS Site Reliability Engineer (Data Platform)
Role Summary
We are looking for an AWS Site Reliability Engineer (SRE) to support and scale a cloud-native data platform built on AWS, Snowflake, and Databricks. The role focuses on driving reliability through automation, disaster recovery (DR) testing, resiliency engineering, observability, and proactive SLO/SLI/SLA management.
Key Responsibilities
- Design, build, and maintain automation for infrastructure provisioning, platform operations, and incident response using IaC and CI/CD.
- Lead resiliency and disaster recovery planning, including regular DR drills, failure testing, and recovery validation across AWS and data platform components.
- Define, implement, and manage SLIs, SLOs, and SLAs for critical data pipelines and platform services; use error budgets to guide reliability improvements.
- Build and operate rob...
Apply for this Position
Ready to join Pyramid Consulting, Inc? Click the button below to submit your application.