Job Description
Overview
This is a contract engagement, initially 6 months, with the potential for a longer-term arrangement.
Location: Paris-based preferred; remote within Europe considered for strong candidates.
What You’ll Do
- Create high-quality coding prompts and reference answers in a benchmark style (e.g., SWE-Bench-like problems).
- Evaluate LLM outputs for code generation, refactoring, debugging, and implementation tasks.
- Identify and document model failures, edge cases, and reasoning gaps.
- Run head-to-head evaluations between private Mistral-based LLMs and leading external models.
- Build or configure coding environments to support evaluation and reinforcement learning (RL).
- Follow detailed annotation and evaluation guidelines with high consistency.
What We’re Looking For
- 10+ years of professional software development experience.
- Strong Python skills (required).