Job Description
Overview:
Terralogic is a leading IT services provider based in the Greater Bengaluru Area. We specialize in technology services for the connected world, covering the full product lifecycle from conceptualization through maturity, decline, and sustenance. With a strong focus on stability and increased productivity, we offer long-lasting partnerships and far-reaching solutions that meet our customers’ roadmap and business needs.
Total Experience:
3+ years
Job Skills:
Bachelor’s degree in Computer Science, Information Technology, or related field.
3+ years of experience in IT operations, production support, or application management.
Good understanding of ITIL framework, incident/change/problem management.
Excellent leadership, communication, and stakeholder management skills.
Ability to work under pressure and manage multiple priorities in a fast-paced environment.
Good to Have Skills:
Exposure to CI/CD pipelines, DevOps, and automation tools.
Proficiency in SQL, Linux/Unix commands, and scripting for troubleshooting.
Responsibilities:
Incident & Problem Management:
Lead the triage and resolution of high-impact incidents and service disruptions.
Manage root cause analysis (RCA) and implement corrective and preventive actions.
Establish escalation procedures and ensure adherence to SLAs.
Team Management:
Manage and mentor a team of production support engineers and analysts.
Ensure effective 24×7 coverage through shift rotations and on-call schedules.
Build team capability through training and knowledge sharing.
Operational Excellence:
Oversee system monitoring, alert management, and performance optimization.
Define and track key operational metrics (uptime, MTTR, ticket volume, etc.).
Continuously improve processes and automation to reduce manual intervention.
Stakeholder Coordination:
Collaborate with development, QA, infrastructure, and DevOps teams to ensure smooth release transitions to production.
Serve as the primary contact for production-related escalations from business or clients.
Communicate status, risks, and recovery plans during incidents and post-incident reviews.
Change & Release Management:
Participate in change advisory board (CAB) meetings to assess production risks.
Ensure deployment readiness, rollback strategies, and proper documentation.
Verify that post-release validations are completed and track issues arising from new deployments.
Documentation & Compliance:
Maintain accurate runbooks, standard operating procedures (SOPs), and knowledge base articles.
Ensure compliance with internal security, audit, and regulatory standards.