Job Description

Overview


As a Member of Technical Staff, LLM Evaluation, you will develop and implement cutting-edge methodologies to help us evaluate how well Copilot performs in real-world usage scenarios. Users turn to Copilot for all types of endeavors, making it critical that we ensure our AI systems effectively help them meet their needs. Our vision for meeting user needs is expansive and includes not only task completion, but also affective aspects of the experience. You will be responsible for developing new methods to evaluate LLMs, train classifiers, experimenting with data collection techniques, and implementing methodologies to provide real-time signals on Copilot performance. We're looking for outstanding individuals with experience in the social sciences, machine learning, and analysis of natural language. The right candidate is a creative problem solver who will work closely with user researchers and product leaders to build automated evaluation frameworks that ...

Apply for this Position

Ready to join Microsoft? Click the button below to submit your application.

Submit Application