Job Description

About the Role


We are seeking three Observability Engineers for a fixed-term engagement to help standardize and enhance our observability practices following a major platform consolidation. You will work closely with internal engineering teams and our vendor migration partner to unify monitoring, logging, and alerting across the organization. This role is critical to building a consistent, enterprise-wide observability experience that improves reliability, performance, and proactive detection.


  • Role: Observability Engineers
  • Location: All Persistent Locations
  • Experience: 10+ Years
  • Job Type: Full Time Employment


What You’ll Do:


  • Design and implement a Universal Dashboard in Grafana for leadership and engineering visibility.
  • Ensure a consistent look and feel across all observability views.
  • Define and implement SLIs, SLOs, and error budgets for critical services.
  • Establish alerting thresholds and escalation workflows aligned with reliability goals.
  • Integrate anomaly detection and AI-assisted insights into the observability platform.
  • Contribute to self-healing workflows and automated remediation strategies.
  • Partner with engineering teams to instrument services with metrics, logs, and traces.
  • Provide documentation and best practices for observability adoption across teams.


Expertise You’ll Bring:


  • 5+ years in observability, monitoring, or reliability engineering roles.
  • Hands-on experience with common observability tools such as Prometheus, Grafana, Splunk, and Coralogix, as well as external monitoring tools (e.g., Catchpoint, ThousandEyes).
  • Strong scripting skills in Python, plus Bash or PowerShell for automation.
  • Experience with Terraform and Ansible for infrastructure automation.
  • Solid understanding of SLIs, SLOs, error budgets, and reliability engineering principles.
  • Familiarity with Linux environments and distributed systems.
  • Knowledge of AI/ML-based anomaly detection and AIOps platforms.
  • Experience with log ingestion pipelines (OpenTelemetry, Fluentd).
  • Programming experience in general-purpose languages (Python, Go, Java).
  • Hands-on CI/CD pipeline experience with GitHub Actions.


Benefits:


  • Competitive salary and benefits package
  • Culture focused on talent development with quarterly growth opportunities and company-sponsored higher education and certifications
  • Opportunity to work with cutting-edge technologies
  • Employee engagement initiatives such as project parties, flexible work hours, and Long Service awards
  • Annual health check-ups
  • Insurance coverage: group term life, personal accident, and Mediclaim hospitalization for self, spouse, two children, and parents


Values-Driven, People-Centric & Inclusive Work Environment:


Persistent is dedicated to fostering diversity and inclusion in the workplace. We invite applications from all qualified individuals, including those with disabilities, and regardless of gender or gender preference. We welcome diverse candidates from all backgrounds.


  • We support hybrid work and flexible hours to fit diverse lifestyles.
  • Our office is accessibility-friendly, with ergonomic setups and assistive technologies to support employees with physical disabilities.
  • If you are a person with disabilities and have specific requirements, please inform us during the application process or at any time during your employment.


Let’s unleash your full potential at Persistent - persistent.com/careers


“Persistent is an Equal Opportunity Employer and prohibits discrimination and harassment of any kind.”
