Job Description
Technical Summary
You will support the reliability and scalability of services across AWS, Azure, GCP, and Oracle by executing automation, CI/CD, observability, and container orchestration tasks. You will work closely with senior engineers to ensure production systems are stable, well-monitored, and continuously improving.
Responsibilities
- Implement and maintain monitoring, alerting, and logging systems (Prometheus, Grafana, ELK, OpenTelemetry)
- Build and maintain CI/CD pipelines and automation for deployments and testing
- Support containerized workloads using Docker and Kubernetes; manage Helm charts and deployments
- Contribute to incident response, troubleshooting, and postmortem documentation
- Implement infrastructure-as-code (IaC) patterns (Terraform, CloudFormation, ARM templates) under guidance
- Collaborate with developers to improve service reliability and operational readiness
- Participate in continuous platform improve...
Apply for this Position
Ready to join Datavail? Submit your application.