Job Description

Technical Summary

You will support the reliability and scalability of services across AWS, Azure, GCP, and Oracle Cloud through automation, CI/CD, observability, and container orchestration work. You will work closely with senior engineers to keep production systems stable, well monitored, and continuously improving.

Responsibilities

  • Implement and maintain monitoring, alerting, and logging systems (Prometheus, Grafana, ELK, OpenTelemetry)
  • Build and maintain CI/CD pipelines and automation for deployments and testing
  • Support containerized workloads using Docker and Kubernetes; manage Helm charts and deployments
  • Contribute to incident response, troubleshooting, and postmortem documentation
  • Implement infrastructure-as-code (IaC) patterns (Terraform, CloudFormation, ARM templates) under the guidance of senior engineers
  • Collaborate with developers to improve service reliability and operational readiness
  • Participate in continuous platform improve...
