Job Description

About the Team

We are the Online Storage team powering ChatGPT, Sora, and the OpenAI APIs. We’re a growing team set up to own the databases and online‑storage infrastructure that serve all our products.

About the Role

As OpenAI scales, we’re seeking experienced, problem‑solving engineers to build robust, high‑performance, and scalable database systems. Our ability to rapidly iterate on products while ensuring reliability and speed is key to our success.

You’ll work in a fast‑paced, collaborative environment, building systems that serve hundreds of millions of users globally, with a strong emphasis on safety, reliability, and performance.

We’re hiring skilled software engineers to join the Online Storage team. You’ll help design and build a large‑scale database, collaborate with various product teams to scale it to meet their needs, and own operational excellence by defining SLAs and KPIs that directly satisfy stakeholder expectations. This is a critical role for engineers who thrive on solving complex, large‑scale challenges and are passionate about building resilient systems that perform under load.

 In this role, you will:

  • Design and build a highly scalable, reliable, and performant database

  • Design and build simple, intuitive APIs for the underlying database

  • Analyze and resolve performance and scalability bottlenecks to improve overall system efficiency

  • Debug, instrument, and fix system issues — from pinpointing root causes to delivering long-term solutions

  • Define technical strategy and guide the development of robust infrastructure that supports high-scale production systems and evolving business needs

  • Collaborate closely with product teams to deeply understand requirements and deliver impactful solutions

  • Boost engineering productivity by building intuitive tools and systems that empower fellow developers

  • Own the reliability of the systems you build, including participating in an on-call rotation to address critical incidents

 You might thrive in this role if you:

  • Have experience building (and rebuilding) production systems to support new product capabilities and growing scale

  • Care deeply about the end-user experience and take pride in solving real customer needs

  • Embrace a humble, collaborative mindset and go the extra mile to support your teammates and the broader mission

  • Own problems end-to-end — you're comfortable learning on the fly to fill gaps and get things done

  • Build internal tools that improve workflows when off-the-shelf solutions fall short

  • Have hands-on experience with distributed systems such as data storage, caching, search, or other backend infrastructure components

  • Prioritize the reliability, scalability, and performance of large-scale systems

  • Thrive in ambiguous, fast-paced environments and enjoy iterating rapidly on product and research initiatives

 Qualifications:

  • 4+ years of industry experience, including 2+ years leading large-scale, complex projects or technical initiatives as an engineer or tech lead

  • Strong passion for building distributed systems at scale, with a focus on reliability, scalability, security, and continuous improvement

  • Expertise in systems programming, with hands-on experience in multi-threading and concurrency; proficiency in C++ and/or Python is highly preferred

  • Preferably, domain experience in areas such as databases, large-scale data systems, storage, caching, search, or other core components of distributed infrastructure

  • Excellent communication skills, with the ability to build consensus across diverse technical and non-technical stakeholders

