As a GenAI Data Scientist, you ensure that the agents we build actually perform well.
You work at the intersection of data, engineering, and product thinking, analyzing where agent outputs fail and how to fix them.
This is not a classical model training or ML research role. Models are off the shelf — the challenge is context engineering, system evaluation, and debugging agent behavior.
You help answer:
“Why is this agent not doing what we expect — and what context or structure does it need?”
Responsibilities
- Analyze agent outputs (PR comments, generated code, chat interactions, summaries)
- Diagnose context gaps and determine what additional information agents need
- Work with PMs and engineers to define evaluation criteria for agent performance
- Improve retrieval, data structuring, and onboarding of contextual sources
- Design metrics or lightweight tests to validate improvements
- Perform technical integrations (APIs, batch syncs, data pipelines)
- Translate functional needs into actionable technical improvements
Requirements
1. Strong understanding of agentic GenAI systems
- Experience with LLM-based applications
- Comfortable reasoning about prompts, context windows, retrieval, grounding, etc.
- Able to evaluate whether an output is “good” or “bad” — and why
2. Technical integration skills
- Familiar with API integrations, batch syncs, and structured/unstructured data
- Comfortable working closely with engineers
- Can debug data flow issues and context ingestion problems
3. Analytical & evaluative mindset
- Able to design evaluation frameworks beyond classical ML metrics
- Understands functional context: why does the agent exist and what user behavior matters?
- Can move comfortably between engineering and product perspectives
4. Clear communicator
- Explain context gaps and reasoning to engineers, PMs, and stakeholders
- Write clear proposals for improvements
Nice to Have
- Experience with retrieval-augmented systems
- Exposure to generative code-analysis tools or DevOps automation
- Background in human-in-the-loop evaluation
- A strong GitHub profile, hobby projects, or experiments with agent frameworks
About the role
- Contract: 12 months (with possible extension)
- Location: Utrecht area (hybrid)
- Salary: Competitive
- Start: ASAP
- Language: English