We are looking for German-speaking AI Content Analysts to support the evaluation of a new personalization capability within a leading AI assistant platform. The role involves designing realistic conversational prompts, evaluating AI responses, and assessing integration quality.

Responsibilities

  • Design and run short multi-turn conversations (typically 1–5 turns) intended to test AI personalization behavior
  • Create prompts grounded in realistic personal scenarios to evaluate contextual understanding
  • Review AI responses to determine whether personalization is correctly applied
  • Check grounding quality to ensure the model does not invent unsupported claims about the user
  • Evaluate integration quality, confirming that personal signals are used naturally (not forced or robotic)
  • Compare two responses side-by-side and determine which is more helpful, natural, and relevant
  • Write clear, structured rationales explaining rankings and referencing specific conversation turns
  • Verify debug information showing whether the correct data sources were used
  • Maintain strict workflow hygiene (including deleting evaluation conversations when required)

Compensation & Commitment

  • Paid for hours logged and approved
  • 30–40 hours/week commitment