Job Description

Reinforcement Learning Environment Engineer

RL Environments; MLE; LLM Tasks; Difficulty Distribution; Remote Contractor; PST Overlap (≥4h); Advanced English (C1/C2);

We’re hiring RL Environments Engineers to design and build MLE/SWE environments that deliver high-quality, diverse tasks with minimal supervision. You will target a specific language model, meet a defined difficulty distribution, and deliver about one task every 10 hours. This is a remote contractor role with ≥4 hours overlap to PST and advanced English (C1/C2) required.

About the company

Preference Model is building the next generation of training data to power the future of AI. Today's models are powerful but fail to reach their potential across diverse use cases because so many of the tasks that we want to use these models for are outside of their training data distribution. Preference Model creates reinforcement learning environments that encapsulate real-world use cases...

Apply for this Position

Ready to join Open Data Science? Click the button below to submit your application.

Submit Application