Job Description
- Continuously monitor system performance and identify potential issues before they impact users.
- Experience working with industry-leading monitoring tools.
- Respond to incidents related to monitoring systems, troubleshoot Level 1 issues, and resolve them promptly.
- Analyze monitoring data to identify trends, anomalies, and potential issues.
- Collaborate with development and platform teams to enhance application performance and reliability.
- Participate in the on-call rotation to support production systems.
- Automate monitoring processes to improve efficiency and reduce manual overhead.
- Create dashboards.
- Recommend best practices in observability, alerting, and incident management.
- Understanding of distributed systems and microservices architecture.
- Conduct post-incident reviews and refine monitoring practices based on findings.
- Provide support during major incidents, coordinating response efforts and communication.
Quali...
Apply for this Position
Ready to join ACL Digital? Submit your application.