Site Reliability Engineer (SRE)

📍 Bengaluru, India
Full-time | Posted January 20, 2026
Job Description

**Description**
 
**Platform & System Reliability (SRE)**
 
+ Build and maintain highly available, scalable, and fault-tolerant systems in GCP and other cloud environments.
 
+ Design and implement automated solutions to eliminate toil and improve operational efficiency.
 
+ Develop, refine, and maintain monitoring, observability, and alerting systems across infrastructure and services.
 
+ Instrument platforms with OpenTelemetry for metrics, logs, and traces.
 
+ Own incident response processes, including on-call participation, root-cause analysis, and post-incident improvement actions.
 
+ Build and support CI/CD pipelines, GitOps workflows, and infrastructure-as-code deployments (e.g., Terraform).

**Data Reliability Engineering (DRE)**
 
+ Ensure reliability, accuracy, and availability of batch, streaming, and real-time data pipelines.
 
+ Instrument data flows with data observability patterns, including lineage (OpenLineage), freshness, completeness, and quality checks.
 
+ Monitor data systems end-to-end using automated alerting and anomaly detection.
 
+ Contribute to data SLOs, SLIs, and error budgets that measure reliability and drive continuous improvement.
 
+ Improve performance, scalability, and resilience across data storage systems (SQL, NoSQL, lakehouse, analytics services).
 
**Qualifications**
 
+ 5–7 years in Site Reliability Engineering, Data Engineering, Platform Engineering, or similar roles.
 
+ Strong experience in GCP (preferred) plus exposure to OCI/Azure.
 
+ Proficiency in Python, Go, Bash, or similar languages for automation and tooling.
 
+ Hands-on experience with containerization, service mesh, and distributed systems design.
 
+ Expertise with observability platforms and telemetry standards (Prometheus, Grafana, Cloud Monitoring, OpenTelemetry).
 
+ Solid understanding of networking, Linux fundamentals, and scalable system design.
 
+ Familiarity with modern data platforms (BigQuery, Kafka, Spark, data lakes) and data reliability concepts.
 
+ Experience with IaC practices (Terraform, Ansible) and CI/CD systems.
 
+ Excellent communication skills for partnering with platform, data, and application teams.
 
+ Ability to work with team members and clients to assess needs, provide assistance, and resolve problems.
 
+ Strong problem-solving and analytical skills.
 
+ Desire to understand why things work the way they do.
 
+ Ability to present and explain technical concepts to business audiences.
 
+ All other duties as assigned.
 
This job posting will remain open for a minimum of 72 hours and on an ongoing basis until filled.
**Job:** Information Technology
**Primary Location:** India-Karnataka-Bengaluru
**Schedule:** Full-time
**Travel:** No
**Req ID:** 254849
**Job Hire Type:** Experienced