Graduate student or postdoctoral researcher
Project
PREP0003676
Overview
This project will conduct a pilot study of reproducible AI evaluations and develop a working demonstration of them. The researcher will explore the different ways AI evaluations are conducted and the reproducibility challenges that arise in each. The project aims to produce a study report and a working demonstration of reproducible AI evaluations, supporting broader work at NIST on the measurement of AI.
Developing a Demonstration of Reproducible AI Evaluations
Qualifications
- Background in Computer Science, Software Engineering, Systems Engineering, Data Science, or related field.
- Education level: graduate student or higher.
- Strong interest in software development, AI measurement, and reproducibility.
- Experience with software development in Python, version control systems, AI models, and the shell, as well as scientific reading and technical writing.
- Experience conducting AI evaluations and designing reproducible software experiments preferred.
Research Proposal
Key Responsibilities
- Conduct a literature survey on the state of the art in reproducible evaluations of software systems
- Gain familiarity with existing AI evaluation frameworks
- Contribute to a plan detailing a demonstration of reproducible AI evaluations
- Design, implement, test, and document the software and systems used for the demonstration
- Document the overall demonstration, including current limitations and challenges
Deliverables
- Survey briefly describing key research on software experiment reproducibility
- Summary report of existing AI evaluation frameworks
- Working demonstration of reproducible AI evaluations
- Report describing the demonstration and discussing the challenges in AI evaluation reproducibility
NIST Sponsor
Mark A. Przybocki
Group
Information Access - HQ
Salary / Hourly Rate {Min}
$50,000.00
Schedule of Appointment
Full time
Start Date
Sponsor email
Work Location
Onsite NIST
Salary / Hourly Rate {Max}
$120,000.00
Total Hours per week
40
End Date