Job Description
Key Responsibilities
- Architect for Resilience: Design systems with redundancy, fault tolerance, and graceful degradation.
- Observability & Monitoring: Implement full-stack observability including monitoring, logging, tracing, and alerting.
- Automation First: Build workflows to automate deployments, incident response, and routine tasks.
- Incident Management: Enable blameless postmortems and continuous improvement.
- Release Planning: Collaborate with DevOps and engineering teams to manage lifecycle work items and release cycles.
- Global Collaboration: Work in a shared responsibility model with 50-60% overlap with onshore teams for effective communication.
Required Skills & Experience
- Cloud Platforms: Azure (preferred), AWS (acceptable with upskilling plan)
- Infrastructure as Code: Terraform, Helm, GitHub Actions
- Containerization & Orchestration: Docker, Kubernetes, Argo CD, Flux
- DevOps Tools: CI/CD pipelines, GitOps, REST APIs
- Programming: Bash, Python (moderate proficiency)
- Data Ecosystems: Azure Data Factory, Databricks, Fabric (optional but preferred)
Team Integration & Expectations
- Work closely with technical leads on support tasks and playbook development.
- Participate in onboarding and training programs outlined in internal documentation.
- Contribute to offshore delivery excellence and maintain high standards of reliability and performance.
Required Technical Skills
- Strong Azure Infrastructure and Networking skills.
- Strong Terraform IaC experience and skills.
- Bicep knowledge a plus.
- Strong previous experience in troubleshooting complex issues on an unfamiliar tech stack.
- Strong GitHub Actions/Azure DevOps Pipelines skills.
- Moderate knowledge of operations and infrastructure patterns for working in data ecosystems.
- Data Factory, Databricks, Fabric knowledge a plus.
- Moderate knowledge of SRE, observability, and related maintenance practices.
- Moderate Bash skills.
- Moderate Python skills.
- Moderate experience in CI/CD and Git operations for software releases.
- Moderate AKS/Helm/Kustomize skills.
- Flux/Argo/GitOps experience a plus.
- Moderate Docker operations knowledge.
- Moderate REST API knowledge.
Skills Required
Docker, REST API, Python, SRE, GitHub