Job Description
Key Responsibilities:
- Execute the Incident Management, Change Management, and Problem Management processes
- Build and support a reliable application suite that meets the development and maintenance requirements of systems/platforms
- Provide consultation and direct technical support in life cycle planning, problem management, integration, and systems programming
- Ensure platform performance and availability meet enterprise objectives through monitoring, timely service restoration, and tuning
- Continuously improve and implement automation of application tasks
- Provide technical support for systems/platforms according to application SLAs
- Design and develop resiliency in the application code, troubleshoot incidents, engage with squads to address failure patterns, and participate in incident management
- Demonstrate strong troubleshooting ability
- Lead calls or contribute in a logical fashion
- Focus on resolving issues before they become incidents
- Identify and articulate severity of impacts using provided monitoring tools and escalate as needed
- Understand the architecture and design of applications and narrow the focus of an incident based on symptoms
- Perform root cause analysis to recover quickly from service interruptions and to prevent recurring problems
- Monitor, manage, and tune platforms to ensure expected availability and performance levels are achieved
- Identify gaps in monitoring or documentation and reach out to the appropriate teams to fill those gaps
- Implement changes to platforms with minimal impact to the business by following enterprise standards and procedures
- Design and document enterprise standards and procedures
Minimum Qualifications:
- Bachelor's degree or industry certification in an applicable IT field, in addition to 3 years of applicable experience in the design/administration/support of one or more platforms; or
- Bachelor's degree in an IT field, in addition to 2 years of applicable experience in the design/administration/support of one or more platforms
- 3 or more years of experience as a Systems Engineer or Site Reliability Engineer
- 2 or more years of experience with ops automation using a scripting language or automation tool such as Python or Ansible
- Site Reliability Engineering: Knowledge of the theories and methodologies of reliability engineering; ability to design, develop and support various tools, services and applications to maintain a reliable site environment.
- Performance Measurement and Tuning: Knowledge of system performance, testing and programming; ability to monitor, measure, and optimize system performance and network communication.
- CI/CD Pipeline: Knowledge of concepts, values and tools applied in building Continuous Integration (CI) and Continuous Delivery/Continuous Deployment (CD) pipelines; ability to design, build, implement and maintain CI/CD pipelines to automate the software delivery process.
- Software Release Management: Knowledge of strategies, practices and tools for managing versions and distribution of software products and enhancements; ability to evaluate and improve release management practices and tools.
- Application Maintenance: Knowledge of production applications; ability to monitor application functions and resolve issues to maintain optimal conditions for system applications.
- Software Engineering: Knowledge of software engineering; ability to deliver new or enhanced software products.
- Agile Development: Knowledge of agile methodologies and the agile development lifecycle; ability to utilize formal agile methodologies, disciplines, practices and techniques for the delivery of new and enhanced applications.
- Embraces diverse people, thinking and styles
Behavioral Competencies:
- Ability to produce high-quality results and work collaboratively, embracing diverse perspectives and taking a solution-based approach.
- Ability to adapt communication to team dynamics and to express thoughts and ideas clearly, concisely, and effectively.
- Ability to engage effectively with peers and stakeholders to build trust and reliable working relationships.
- Ability to understand business processes, implement innovative solutions, and guide junior team members on continuous improvement while staying current with technology and trends.
- Curiosity to understand customer and business expectations while adding value through technical solutions.
Preferred Qualifications:
- Master's degree in Computer Science, Information Technology, or a related field
- Experience with VMware VDI implementations is a strong plus
- Experience with Dynatrace APM and synthetic monitoring
- Experience with airline applications and infrastructure technology is a plus
ID: DELNN01