As a Freelance Agent Evaluation Engineer, you will build virtual companies and design challenging tasks and evaluation criteria for AI coding agents operating in simulated environments. This is a part-time, remote role that requires experience in software development, test automation, and Python programming.

Requirements

  • Degree in Computer Science, Software Engineering, or a related field
  • 5+ years of software development experience, primarily in Python
  • Background in full-stack development, with experience building React-based interfaces and robust back-end systems
  • Experience writing tests (functional, integration)
  • Experience with Docker containers and familiarity with infrastructure tools
  • Understanding of CI/CD (user-level experience with GitHub Actions)
  • English proficiency: B2 level

Benefits

  • Part-time opportunity
  • Remote work
  • Estimated pay equivalent to $30 per hour