Freelance Agent Evaluation Engineer needed to evaluate AI coding agents, create challenging tasks, and assess AI systems for leading tech companies.

Requirements

  • Degree in Computer Science, Software Engineering, or a related field
  • 5+ years of software development experience, primarily in Python
  • Background in full-stack development, with experience building React-based interfaces and robust back-end systems
  • Experience writing tests (functional, integration)
  • Experience with Docker containers and familiarity with infrastructure tools (Postgres, Kafka, Redis)
  • Understanding of CI/CD (experience using GitHub Actions)

Benefits

  • Part-time, project-based work