Job Description

Role: Site Reliability Engineer (SRE) – Core IT Infrastructure
Location: Pune
Work mode: On-site (full-time)
Experience: 6+ years
Key Responsibilities
Infrastructure Reliability & Operations
• Design, implement, and maintain highly available and fault-tolerant infrastructure
• Ensure reliability, performance, scalability, and security of core IT systems
• Monitor system health, capacity, and performance using proactive observability practices
• Lead incident response, root cause analysis (RCA), and post-incident reviews
Automation & SRE Development
• Develop and maintain automation tools, scripts, and frameworks to reduce manual operations
• Apply Infrastructure as Code (IaC) principles using tools such as Terraform, Ansible, or CloudFormation
• Build self-healing systems and automate repetitive operational tasks
• Improve deployment pipelines and operational workflows through engineering solutions
DevOps & Platform Engineering
• Collaborate with DevOps, development, and security teams to support CI/CD pipelines
• Enable seamless application deployments with minimal downtime
• Support containerization and orchestration platforms (Docker, Kubernetes, OpenShift)
• Implement best practices for configuration management and environment consistency
Monitoring, Observability & Performance
• Design and maintain monitoring, logging, and alerting systems
• Define and track SLIs, SLOs, and SLAs
• Optimize system performance and cost efficiency, and carry out capacity planning
• Enhance observability using tools such as Prometheus, Grafana, ELK, Datadog, or similar
Security & Compliance
• Implement infrastructure security best practices
• Collaborate with security teams on vulnerability management and compliance requirements
• Ensure secure access, identity management, and audit readiness

Required Skills & Qualifications
Technical Skills
• Strong experience in Linux/Unix system administration
• Proficiency in programming/scripting (Python, Go, Bash or other shell scripting, or similar)
• Experience with cloud platforms (AWS, Azure, or GCP)
• Hands-on experience with containerization and orchestration
• Knowledge of networking concepts (DNS, TCP/IP, load balancing, firewalls)
• Experience with monitoring, logging, and alerting tools
