Job Description
Job Title: AI Evaluation Researcher - Remote
Role Overview
In this role, you will help fine-tune and evaluate large language models (e.g., ChatGPT) by designing advanced, concept-driven Computer Science problems and analyzing how effectively AI systems reason through them. You will work closely with researchers to develop robust benchmarks spanning undergraduate to PhD-level computer science topics.
Key Responsibilities
- Design advanced, non-coding Computer Science problems to test AI reasoning capabilities (e.g., algorithms, systems design, security, theory)
- Develop clear, step-by-step solutions with rigorous logical reasoning
- Evaluate AI-generated responses for:
  - Correctness
  - Depth of reasoning
  - Logical consistency
- Identify reasoning gaps, incorrect assumptions, and failure modes in AI outputs
- Collaborate with research teams to refine and improve AI evalua...
Apply for this Position
Ready to join Highbrow Technology Inc? Submit your application.