Job Description
Advance how agents and LLMs learn from feedback in realistic environments.
If you've been working at the intersection of reinforcement learning and large language models, this is an opportunity to work on the foundations of how AI systems are trained, evaluated, and supervised, with your research shipping into production.
You will work hands-on across fundamental problems in LLM post-training, RL simulation environments, and agentic evaluation, shaping core methods and benchmarks used by leading AI labs and enterprises around the world.
The team actively publishes and collaborates with external research labs, with recent work appearing at ACL and NeurIPS. You'll see your ideas move from concept to deployed systems, working alongside engineers who build fast and take research seriously.
This is a research-driven company growing quickly on the strength of real demand for what it's building. If you want your work to matter, both in the literature and in production, this is where to do it.
You'll bring hands-on applied research experience across RL, LLM post-training, or agent-based systems, with a strong understanding of transformer architectures and fine-tuning. As important as the theory is the ability to ship: you can translate research ideas into production-ready systems that actually work. A track record of publishing at top-tier venues such as NeurIPS, ICML, ACL, or EMNLP is a plus, but what matters most is the quality of your thinking and your ability to execute.
What you'll do