Site Reliability Engineer

📍 Bangalore, India

Full-time Other-General Posted January 19, 2026

Job Description

Requirements

5+ years in observability, monitoring, or reliability engineering roles.
Hands-on experience with common observability tools such as Prometheus, Grafana, Splunk, and Coralogix, and with external monitoring tools (e.g., Catchpoint, ThousandEyes).
Strong scripting skills in Python, plus Bash or PowerShell for automation.
Experience with Terraform and Ansible for infrastructure automation.
Solid understanding of SLIs, SLOs, error budgets, and reliability engineering principles.
Familiarity with Linux environments and distributed systems.

Responsibilities

Design and implement a Universal Dashboard in Grafana for leadership and engineering visibility.
Ensure a consistent look and feel across all observability views.
Define and implement SLIs, SLOs, and error budgets for critical services.
Establish alerting thresholds and escalation workflows aligned with reliability goals.
Integrate anomaly detection and AI-assisted insights into the observability platform.
Contribute to self-healing workflows and automated remediation strategies.
Partner with engineering teams to instrument services with metrics, logs, and traces.
Provide documentation and best practices for observability adoption across teams.
