Job Description

Responsibilities:
Observability and Monitoring:
Design, implement, and manage comprehensive observability solutions, including monitoring, logging, and tracing, using industry-standard tools (e.g., Prometheus, Grafana, ELK Stack, Datadog, New Relic, Splunk)
Develop and maintain dashboards, alerts, and reports to provide real-time insights into system health, application performance, and key business metrics
Instrument applications and infrastructure to collect relevant data for monitoring and analysis
Application Support and Troubleshooting:
Provide expert-level support for production applications, including troubleshooting and resolving complex incidents, performance bottlenecks, and system outages
Collaborate with development teams to ensure applications are designed for observability and supportability
Participate in on-call rotations to ensure timely resolution of production issues
Trend Analysis and Performance Optimization:
Conduct proactive trend an...

Employer: Optimum Solutions Pte Ltd