Job Description
About the Team
The Workload Networking team is responsible for the collective communication stack used in our largest training jobs. Using a combination of C++ and CUDA, we work on novel collective communication techniques that enable efficient training of our flagship models on our largest custom-built supercomputers.
The models we train are key to AI research progress at OpenAI and in the field as a whole, and we continually incorporate learnings from our entire research org into our training platform.
About the Role
As a Software Engineer, Networking, you will design and implement custom networking collectives that are tightly integrated into our training stack.
We’re looking for people who have a background in low-level, performance-critical software. Experience with collective communication is a bonus.
This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer reloca...