Job Description

  • Manage, monitor, and improve application reliability, scalability, and performance.
  • Implement and maintain monitoring, alerting, and observability tools (Dynatrace, Kibana, CloudWatch).
  • Troubleshoot production issues and drive root cause analysis (RCA) for incidents.
  • Automate operational processes using scripting (Python, Shell, or similar).
  • Collaborate with development and DevOps teams to improve CI/CD and infrastructure reliability.
  • Ensure high system uptime through proactive performance tuning and incident management.
  • Work with AWS services (EC2, ECS, EKS, Lambda, S3, CloudWatch, etc.) for deployment and monitoring.
  • Participate in on-call rotation and production support as required.
  • Support Java and microservices-based environments, ensuring efficient scaling and health monitoring.
  • Maintain documentation for SRE processes, runbooks, and automation workflows.

Skills Required
Monitoring Tools...

Employer: Pathfinders Global P Ltd