Job Description

About the Job


We are looking for an experienced L2 IT Infrastructure Operations Engineer to provide advanced technical support for our enterprise server and network infrastructure. This mid-level position bridges the gap between frontline support and expert-level engineering, handling escalated incidents, performing complex troubleshooting, and contributing to operational excellence. The ideal candidate will possess hands-on experience with Dell PowerEdge servers, Cisco networking equipment, and enterprise monitoring solutions. You will mentor L1 engineers, participate in change management activities, and collaborate with cross-functional teams to ensure high availability and performance of critical infrastructure in a 24x7 global environment.


Key Responsibilities-

 Provide advanced troubleshooting and fault isolation for escalated server and network incidents, utilizing iDRAC, Redfish, and Cisco CLI tools to diagnose and resolve complex issues.

 Execute firmware, BIOS, and driver updates on Dell PowerEdge servers following standardized procedures, ensuring minimal service disruption and maintaining system stability.

 Perform IOS/NX-OS firmware and software updates on Cisco routers and switches, adhering to change management protocols and conducting post-update validation.

 Manage hardware break/fix procedures for server infrastructure, coordinating with Dell support for warranty claims, parts ordering, and scheduling on-site technician dispatch.

 Conduct regular network health audits and performance analysis, identifying potential bottlenecks and recommending optimization measures to prevent service degradation.

 Collaborate with the SRE team to enhance monitoring dashboards and refine alerting thresholds, ensuring proactive detection of infrastructure instability or security events.

 Mentor and provide technical guidance to L1 engineers, conducting knowledge transfer sessions and assisting with complex ticket resolution to build team capability.

 Participate in blameless post-mortems following major incidents, contributing to root cause analysis and implementing preventative actions to improve system reliability.

 Maintain and update operational runbooks, network diagrams, and technical documentation to reflect current configurations and best practices.

 Support hardware lifecycle management activities, including equipment provisioning, asset tracking, and coordination with vendors for hardware returns and repairs.

 Provide 24x7 on-call support for critical escalations, ensuring rapid response to high-priority incidents affecting production systems.

 Collaborate with the FTE IT Team Lead on capacity planning activities, providing data-driven insights on infrastructure utilization trends and growth projections.


Required Skills-

 5+ years of hands-on experience in enterprise IT infrastructure operations or a related field.

 Strong proficiency with Dell PowerEdge server administration, including hardware troubleshooting, iDRAC/Redfish management, and firmware lifecycle management.

 Solid experience with Cisco networking equipment (routers, switches), including IOS/NX-OS configuration, troubleshooting, and upgrade procedures.

 Working knowledge of monitoring and logging tools, with the ability to create dashboards, configure alerts, and analyze performance metrics for proactive issue detection.

 Excellent problem-solving abilities with demonstrated experience in incident management, root cause analysis, and implementing corrective actions in production environments.

 Industry certifications such as Dell Server certifications or ITIL Foundation; ability to work rotating shifts in a 24x7 global support model.


Tools Required-

 Server & Hardware Tools: Dell iDRAC, Lifecycle Controller, OpenManage, and RAID/PERC utilities for server provisioning, firmware baselining, and remote management.

 OS Deployment Tools: PXE boot infrastructure, iDRAC Virtual Media, and Windows Server and Linux ISOs with hardening and automation scripts.

 Network Tools: Cisco IOS CLI, PoE management, VLAN/QoS configuration tools, network monitoring, and bandwidth/latency testing utilities.

 Automation & Operations Tools: Ansible, Python, CMDB systems, configuration backup tools, and documentation/diagramming platforms.
