Job Description

Overview

As a Senior DevOps Engineer, you will own the AWS infrastructure and DevOps toolchain for a high-scale ad serving system composed of asynchronous Java microservices (Akka framework).

Targets include <50ms response time and up to 5M concurrent users with 99.99% uptime.

Responsibilities

Design and stand up AWS environments end-to-end (landing zone, VPCs, networking, security, automation).

Build immutable infrastructure and CI/CD for Java microservices (Maven/Gradle), including blue/green and canary releases and automated rollbacks.

Implement observability: metrics, logs, traces, SLOs/SLIs, alerting, on-call runbooks.

Engineer reliability and performance: autoscaling, caching layers, multi-AZ/region DR, and capacity planning to support 5M+ concurrent users and p95/p99 latency goals.

Establish security-by-design: IAM least privilege, KMS/Secrets Manager, WAF/Shield, image/signing policies, CIS benchmarks.

Partner with EY developers and the Performance Test Engineer to tune JVM/Akka settings, thread pools, GC, and infrastructure limits based on load-testing feedback.

Champion cost governance and tagging; produce dashboards and weekly reports.

Tech you’ll use (you don’t need every single one, but you know most)

AWS: EKS/ECS, EC2, ALB/NLB, API Gateway/Lambda, S3/CloudFront, DynamoDB/ElastiCache (Redis), Aurora/RDS, MSK/Kinesis, OpenSearch, Route 53, VPC, NAT/GW, WAF/Shield, CloudWatch/X-Ray, IAM, KMS, Secrets Manager.

IaC & CI/CD: Terraform/CloudFormation, Helm, Argo CD or Flux, GitHub Actions/Jenkins/GitLab CI, Docker.

Observability: CloudWatch, OpenTelemetry, Prometheus/Grafana, log pipelines.

Languages/Build: Bash/Python for automation; familiarity with Java build/release workflows.

What makes you a great fit

3–5+ years total experience; Senior/Manager-level depth in AWS platform engineering for high-throughput, low-latency services.

Proven ownership of production systems at 10k–1M+ concurrent users (or comparable high RPS) with 99.9x SLOs.

Hands-on experience with Akka/Java microservice delivery pipelines (a plus if you’ve tuned JVM, GC, or Akka dispatchers).

Strong grounding in scaling patterns (event-driven, async IO, caching, backpressure, rate limiting) and resilience patterns (circuit breakers, retries, chaos engineering).

Excellent collaboration, documentation, and stakeholder communication.

Logistics

Location: Remote (candidates based in India preferred).

Schedule: Must join US morning calls (Eastern Time) as needed.

Start: 1–3 weeks from offer.

Term: Through end of January (extension likely).
