Agent Quality / Evals Engineer 1754

SOFTGIC

📍 Colombia, ANT, Colombia

Full-time | other-general | Posted June 25, 2026

Job Description

This is a remote position.

Owns the eval harness and quality gate from the beginning. This role replaces the old late-stage Evals Specialist model with a standing owner for measurable agent quality.

Key Responsibilities

- Build and maintain the MVP eval harness: golden tasks, exception tasks, scorecard metrics, and regression packs.
- Wire evals into CI so quality regressions fail builds and releases.
- Define and maintain release-gate thresholds with Product and the Tech Lead.
- Lay the path for later adversarial and drift-testing expansion without overbuilding MVP scope.

Requirements

Must-Have Qualifications

- Experience evaluating ML, LLM, or non-deterministic systems.
- Strong test and benchmark design capability.
- Comfort working with noisy metrics, thresholds, and probabilistic behavior.
- Good scripting and automation skills.

AI-First Expectations

- Uses AI to generate candidate eval cases and failure hypotheses, but never confuses generated tests with validated quality.
- Approaches AI ...
