Graduate student or postdoctoral researcher

Project
PREP0003676
Overview

This project will conduct a pilot study and develop a demonstration of reproducible AI evaluations. The researcher will explore the different ways AI evaluations are conducted and the reproducibility challenges that arise in these contexts. The project aims to produce a study report and a working demonstration of reproducible AI evaluations, supporting broader work at NIST on the measurement of AI.

Developing a Demonstration of Reproducible AI Evaluations

Qualifications
  • Background in Computer Science, Software Engineering, Systems Engineering, Data Science, or related field.
  • Education level: graduate student or higher.
  • Strong interest in software development, AI measurement, and reproducibility.
  • Experience with software development in Python, version control systems, AI models, and the shell, as well as scientific reading and technical writing.
  • Experience conducting AI evaluations and designing reproducible software experiments preferred.
Research Proposal

Key Responsibilities

  • Conduct a literature survey on the state of the art in reproducible evaluations of software systems
  • Gain familiarity with existing AI evaluation frameworks
  • Contribute to a plan detailing a demonstration of reproducible AI evaluations
  • Design, implement, test, and document the software and systems used for the demonstration
  • Document the overall demonstration, including current limitations and challenges

Deliverables

  • Survey briefly describing key research on software experiment reproducibility
  • Summary report of existing AI evaluation frameworks
  • Working demonstration of reproducible AI evaluations
  • Report describing the demonstration and discussing the challenges in AI evaluation reproducibility
NIST Sponsor
Mark A. Przybocki
Group
Information Access - HQ
Schedule of Appointment
Full time
Start Date
Sponsor email
Work Location
Onsite NIST
Salary / Hourly rate {Max}
$120,000.00
Total Hours per week
40
End Date