We are looking for an AI QA Engineer to ensure the reliability, accuracy, security, and stability of AI agents in production by implementing automated evaluation frameworks, auditable metrics, and continuous validation processes.
Responsibilities
- Design, validate, and improve evaluation frameworks for AI agents.
- Implement automated testing and regression suites for generative models.
- Define and monitor quality metrics such as relevance, faithfulness, coherence, precision, and hallucination rate.
- Build LLM-as-a-Judge evaluation systems.
- Establish performance benchmarks for new models and existing agents.
- Validate updates to prompts, models, and RAG pipelines.
- Collaborate with AI and development teams to define acceptance criteria (pass/fail).
- Analyze evaluation results and propose continuous improvements.
- Produce metrics reports and maintain traceability of agent quality.
Benefits
- Modality: 100% Remote
- Excellent work environment with a young, dynamic, and committed team; growth opportunities; and participation in innovative projects.