Team Introduction:
Huawei Amsterdam Research Center is at the forefront of innovation in artificial intelligence and machine learning. We are dedicated to developing cutting-edge technologies that revolutionize how people interact with information and enhance productivity across various sectors. Our team of experts is committed to pushing the boundaries of what's possible in AI, and we are looking for talented individuals to join us on this exciting journey.
Position Overview:
We are seeking a highly skilled and motivated Research Engineer to join our team, focusing on building advanced AI systems based on Large Language Models (LLMs) enhanced with Reinforcement Learning (RL). The ideal candidate has hands-on experience applying RL techniques to train and align large language models, with a particular emphasis on agentic applications such as deep research and automated code generation.
In this role, you will be responsible for designing and developing cutting-edge LLM-powered Multi-Agent systems trained and aligned using Reinforcement Learning, targeting complex agentic scenarios including deep research and automated code generation. You will work closely with our team of researchers, data scientists, and engineers to push the frontier of what LLM-based agents can achieve.
Key Responsibilities:
- Design and develop LLM-based multi-agent AI systems, applying Reinforcement Learning (e.g., RLHF, RLAIF, PPO, GRPO) to train, align, and optimize agents for complex agentic tasks such as deep research and automated code generation.
- Utilize RL training frameworks (e.g., verl, OpenRLHF, TRL) and agent orchestration frameworks (e.g., LangChain, LlamaIndex) to build, train, test, and deploy autonomous AI agents at scale.
- Research and implement RL-based training pipelines and reward modeling strategies to improve LLM agent performance on long-horizon tasks including deep research, multi-step reasoning, and code generation.
- Develop and integrate advanced knowledge management and Retrieval-Augmented Generation (RAG) systems to provide AI agents with accurate and timely information, reducing model hallucinations.
- Stay up to date with, and contribute to, the latest research in Large Language Models, Reinforcement Learning for LLMs (RLHF, RLAIF, process reward models), agentic AI systems, and domain-specific applications such as deep research automation and AI-driven code generation.
- Collaborate with cross-functional teams to integrate the AI solutions you develop into existing products and services.
- Evaluate and benchmark AI agent systems and iteratively optimize their performance.
Qualifications:
- Master's or Ph.D. in Computer Science with a focus on NLP, ML, or AI.
- Hands-on experience applying Reinforcement Learning to train and align Large Language Models (e.g., RLHF, RLAIF, PPO, GRPO) and building LLM-based agent or multi-agent systems; this is a core requirement for the position.
- Strong programming skills in Python and familiarity with at least one major machine learning framework such as PyTorch.
- Solid understanding of deep learning architectures, in particular the Transformer and Transformer-based models such as BERT and GPT.
- Proven track record of research excellence, demonstrated through publications in relevant conferences and journals.
- Ability to work independently and as part of a collaborative team.
- Excellent problem-solving skills and attention to detail.
Preferred Qualifications:
- Proficiency with RL training frameworks such as verl, OpenRLHF, TRL, or similar; experience with reward modeling, process reward models (PRM), and RLVR is a strong plus.
- Practical experience building deep research systems or AI coding agents (e.g., search-augmented reasoning, code execution agents, test-driven code generation); familiarity with evaluation benchmarks such as SWE-bench, HumanEval, GAIA, or BrowseComp.
- In-depth experience with agent orchestration frameworks such as LangChain or LlamaIndex, combined with advanced RAG techniques for grounding agents in factual knowledge.
- Background in software development and engineering best practices.