Job Description
- Manage, monitor, and improve application reliability, scalability, and performance.
- Implement and maintain monitoring, alerting, and observability tools (Dynatrace, Kibana, CloudWatch).
- Troubleshoot production issues and drive root cause analysis (RCA) for incidents.
- Automate operational processes using scripting (Python, Shell, or similar).
- Collaborate with development and DevOps teams to improve CI/CD and infrastructure reliability.
- Ensure high system uptime through proactive performance tuning and incident management.
- Work with AWS services (EC2, ECS, EKS, Lambda, S3, CloudWatch, etc.) for deployment and monitoring.
- Participate in on-call rotation and production support as required.
- Support Java and microservices-based environments, ensuring efficient scaling and health monitoring.
- Maintain documentation for SRE processes, runbooks, and automation workflows.
Skills Required
Kubernetes, Dock...
Apply for this Position
Ready to join Hanker Systems India? Submit your application.