Job Description

Job Summary: As a Site Reliability Engineer, you are expected to take ownership of platform reliability, monitoring, logging, incident response, and operational excellence. This role requires strong accountability, calm decision-making during incidents, and the ability to fix and restore systems under pressure.


Key Responsibilities
Reliability & Operations:

  • Own availability, performance, and reliability of production systems
  • Participate in on-call rotations and lead incident resolution
  • Perform root cause analysis (RCA) and implement preventive fixes
  • Drive reliability improvements through automation and observability

Monitoring, Logging & Observability:

  • Implement and maintain observability using Prometheus, Grafana, Grafana Alloy, Loki, and Datadog
  • Build dashboards, logs, and actionable alerts
  • Correlate metrics, logs, and alerts to reduce MTTR
  • Reduce alert fatigue through signal-driven alerting

Kubernetes & Platform Monitoring: 

  • Monitor and troubleshoot Kubernetes clusters in production
  • Diagnose pod failures, resource exhaustion, networking, and scaling issues
  • Ensure end-to-end observability across workloads and infrastructure

Databases & Data Stores: 

  • Monitor and support MongoDB, MySQL, and Azure SQL
  • Identify performance bottlenecks, replication issues, and availability risks


Required Skills & Experience:

  • 3+ years of experience in SRE / DevOps / Production Support roles
  • Strong hands-on experience with Prometheus, Grafana, Grafana Alloy, Loki, and Datadog
  • Solid understanding of monitoring, logging, and alerting best practices
  • Experience troubleshooting Kubernetes-based systems
  • Strong Linux fundamentals and incident-handling experience


Mindset:

  • Ownership and accountability
  • Calm, structured, and decisive during outages
  • Bias toward action and long-term fixes rather than temporary workarounds


About:

CoreStack provides a NextGen Cloud Governance platform that empowers enterprises to increase top-line revenues and gain a competitive edge through AI-powered real-time cloud governance on autopilot.
CoreStack is successfully deployed with companies across multiple industries such as Healthcare, Financial Services, Retail, Education, Technology, and Government. CoreStack has a stellar leadership team and creative investors, and is backed by industry-leading advisors. Gartner recognized CoreStack in the 2020 Cloud Computing Platforms Magic Quadrant. CoreStack is also a recipient of the 2021 Gold Stevie American Business Award in the Cloud Infrastructure category, the 2021 Gold Globee for Most Innovative Company of the Year in IT Cloud/SaaS, the Tech Ascension Award 2022, and the DataCloud Global Award 2022, and was a Silver Globee® winner in the 19th Annual 2023 Cybersecurity Awards.

Why Join CoreStack?

  • Be part of a fast-growing, innovative company and the leader in cloud governance. Work with a talented and passionate team of professionals.
  • High-impact role with significant influence over company growth trajectory and expanding market presence.
  • A mission-driven culture that values and rewards collaboration and innovation.
  • Competitive compensation package, including performance bonus and stock options.
  • Comprehensive benefits package, including health, dental, vision, and life insurance.
  • Retirement savings plan.
  • Generous Paid Time Off and Holidays.
  • Real ownership of production systems
  • Exposure to real-world scale, incidents, and reliability challenges
  • A team that values competence, responsibility, and learning from failures

Location: India
