Job Description
Job Title: AI Evaluation Researcher - Remote
Role Overview
In this role, you will help fine-tune and evaluate large language models (e.g., ChatGPT) by designing advanced, concept-driven Computer Science problems and analyzing how effectively AI systems reason through them. You will work closely with researchers to develop robust benchmarks spanning undergraduate to PhD-level computer science topics.
Key Responsibilities
Design advanced, non-coding Computer Science problems to test AI reasoning capabilities (e.g., algorithms, systems design, security, theory)
Develop clear, step-by-step solutions with rigorous logical reasoning
Evaluate AI-generated responses for correctness, depth of reasoning, and logical consistency
Identify reasoning gaps, incorrect assumptions, and failure modes in AI outputs
Collaborate with research teams to refine and improve AI evaluation benchmarks
Provide structured, high-quality written feedback to support model improve...
Apply for this Position
Ready to join Highbrow Technology Inc? Click the button below to submit your application.