Job Description
Responsibilities
- Design and implement efficient parallel computing strategies and memory management mechanisms to improve end-to-end throughput and latency
- Develop and optimize high-performance training and inference frameworks, maximizing hardware compute and memory bandwidth utilization
Qualifications
- Proficiency in Python and C++, with strong foundations in data structures, algorithms, and systems programming
- Solid experience with PyTorch, including a deep understanding of model execution workflows, operator invocation, and computation graph mechanisms
- Familiarity with high-performance computing (HPC) concepts such as parallel computing, memory hierarchy, and operator fusion
- Basic understanding of accelerator architectures (e.g., GPU, NPU), including compute units, memory systems, and communication mechanisms
Preferred Qualifications
- <...
Apply for this Position
Ready to join hpc ai technology pte. ltd.? Click the button below to submit your application.
Submit Application