Job Description

About the Company



We are a forward-thinking technology company dedicated to delivering innovative solutions that enhance operational efficiency and drive business success. Our mission is to empower organizations through advanced monitoring and analytics tools, fostering a culture of collaboration, inclusivity, and continuous improvement.



About the Role



The Monitoring Analyst implements proactive monitoring strategies to keep applications and services performing well and available.



Responsibilities



  • Implement proactive monitoring strategies for early anomaly detection and incident prevention
  • Design, configure, and maintain real-time dashboards for application, infrastructure, and service observability
  • Perform end-to-end infrastructure monitoring, including CPU utilization, memory consumption, disk I/O, network latency, and throughput
  • Monitor service availability, uptime, and adherence to SLA/SLO targets
  • Define, tune, and manage thresholds, dynamic baselines, health rules, and alerting policies
  • Conduct monitoring health checks to ensure data accuracy, coverage, and telemetry reliability
  • Perform Root Cause Analysis (RCA) for performance degradation, service outages, and availability issues
  • Provide real-time incident monitoring, alert triage, and operational support during critical events
  • Extract, transform, and analyze monitoring, alert, and incident data for operational insights
  • Identify incident trends, recurring issues, and systemic performance patterns
  • Prepare and deliver monthly operational and performance reports for technical and business stakeholders
  • Develop interactive dashboards and visualizations to track KPIs, MTTR, incident volumes, and reliability metrics



Qualifications



Experience required: 3+ years



Required Skills



  • AppDynamics: Application Performance Monitoring (APM), Business Transactions, Service Endpoints, Health Rules, Dynamic Baselines, Alerting and Event Correlation
  • Splunk: Centralized log management, advanced search and correlation, real-time dashboards, alerting, and incident investigation
  • Grafana: Time-series data visualization, custom metric dashboards, multi-source observability, and trend analysis
  • Power BI: Incident and operational data modeling, trend analysis, KPI tracking, and executive-level reporting



Preferred Skills



  • Experience with cloud monitoring tools
  • Knowledge of scripting languages for automation
  • Familiarity with ITIL processes and frameworks


Equal Opportunity Statement


We are committed to creating a diverse environment and are proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status.
