Undergraduate or Graduate Student
We are seeking a senior undergraduate or graduate student with strong software engineering skills to
support the Guardians of Forensic Evidence Initiative—an effort to strengthen the scientific reliability of
AI-based deepfake detection systems used in forensic and judicial contexts.
This role emphasizes AI-related software development, evaluation pipeline implementation, AI system
benchmarking infrastructure, and web platform development. The selected candidate will contribute
directly to the development of the Deepfake Challenge Kit, including dataset management systems,
scoring packages, and a secure evaluation website with user authentication and leaderboard
functionality.
Software Engineer - Guardians of Forensic Evidence Project
This individual must have the following minimum knowledge, skills, and abilities:
- Senior undergraduate or graduate student in Computer Science, Software Engineering, or a related
field
- Strong proficiency in Python to support timely project delivery
- Experience working in a Linux environment and with shell scripting (Bash)
- Background in media (audio, image, or video) processing and analysis
- Ability to work independently as well as in collaborative research environments
Furthermore, the following knowledge, skills, and abilities are preferred:
- GPU programming or AI model development experience
- Experience with web development (HTML, CSS, JavaScript)
- Experience developing backend services (Flask, Django, FastAPI, Node.js, etc.)
- Previous experience with generative AI tools, including deepfake technologies and large language
models
- Experience in cross-platform software development (Linux, macOS, Windows)
- Experience or interest in machine learning and AI system testing and evaluation
- Experience with database management (e.g., PostgreSQL)
- Experience with Jupyter Notebooks, R, Shiny, and interactive data visualization
- U.S. citizenship preferred
Key responsibilities will include but are not limited to:
- Deepfake and Synthetic Data Generation and Automated Benchmarking Dataset Pipeline
Development, including but not limited to: developing automated or semi-automated
state-of-the-art deepfake image/audio generation pipelines; implementing metadata handling and
dataset validation tools; and building infrastructure for deepfake media benchmarking.
- Deepfake Analytic AI System Implementation: Implement baseline deepfake detection
algorithms; design modular, well-documented, and maintainable codebases; deploy deepfake
detection tools on Linux servers and GPU clusters; ensure reproducibility and performance
optimization; maintain cross-platform compatibility (Linux, macOS, Windows); implement
containerized solutions (Docker-based workflows) as needed.
- Scoring Package & Evaluation Infrastructure: Implement evaluation metrics (ROC curves, AUC,
confusion matrices, robustness analysis); develop reproducible evaluation pipelines; conduct
quantitative performance analysis across different data subsets.
- Web Platform Development: Design, implement, and maintain a secure, scalable evaluation
web platform that includes a user authentication system (login, registration, role-based access
control), a dataset release portal with controlled access, and an admin dashboard for dataset
and user management. The technology stack may include (but is not limited to) Python, React or
modern JavaScript frameworks, PostgreSQL, Docker, and Linux-based deployment
environments.