Job Description
Key Responsibilities:
- Ensure the production reliability of the firm's Linux-based research and trading platform as part of a globally distributed engineering team.
- Provide rapid emergency response to production infrastructure issues.
- Proactively understand internal clients' needs and effectively communicate them to leadership at both regional and global levels.
- Identify risks, develop contingency plans, and implement solutions to mitigate them.
- Develop and enhance the observability platform to monitor the performance and health of critical computing environments.
- Participate in occasional (monthly) on-call rotations and support on-call staff during their shifts.
- Contribute to organizational knowledge through documentation, education, and writing maintainable code.
Qualifications/Skills:
- 2+ years of experience in SRE, DevOps, or other infrastructure engineering roles, preferably within the financial industry.
- Strong understanding of Linux system internals, including kernel operations, memory management, and performance optimization.
- In-depth knowledge of storage technologies, particularly those used in high-performance computing (GPFS experience is a plus).
- Broad understanding of IT infrastructure components, such as networking, DNS, NTP/PTP, and NIS.
- Proficiency in system automation, monitoring, and self-healing (experience with Salt is a plus).
- Experience with container orchestration and virtualization technologies (e.g., Kubernetes, Nomad, VMware).
- Familiarity with on-premises and cloud-based HPC infrastructure (operational knowledge of Slurm and GPUs is a plus).
- Understanding of AI technologies and their applications in infrastructure automation and management. Experience with, or a strong interest in, implementing AI/ML solutions for infrastructure optimization, anomaly detection, or predictive analytics.
- A passion for technology and automation, with a deep sense of curiosity and ownership.
- A hands-on approach to problem-solving and a demonstrable enthusiasm for technology.
- Excellent verbal and written communication skills.
Skills Required
DevOps, Kubernetes, Linux, Networking