Job Description

Job Title: AI Evaluation Researcher - Remote

Role Overview

In this role, you will help fine-tune and evaluate large language models (e.g., ChatGPT) by designing advanced, concept-driven Computer Science problems and analyzing how effectively AI systems reason through them. You will work closely with researchers to develop robust benchmarks spanning undergraduate to PhD-level Computer Science topics.

Key Responsibilities

  • Design advanced, non-coding Computer Science problems to test AI reasoning capabilities (e.g., algorithms, systems design, security, theory)
  • Develop clear, step-by-step solutions with rigorous logical reasoning
  • Evaluate AI-generated responses for:
      • Correctness
      • Depth of reasoning
      • Logical consistency
  • Identify reasoning gaps, incorrect assumptions, and failure modes in AI outputs
  • Collaborate with research teams to refine and improve AI evalua...

Apply for this Position

Ready to join Highbrow Technology Inc? Click the button below to submit your application.