Software Engineer: ML Optimization

  2026-05-28     Seer     San Francisco, CA
Description:

ML Systems Engineer — Training & Inference Optimization (MBMB)

We are building large-scale embodied intelligence systems designed to operate in complex real-world environments. Our work spans robot foundation models, high-performance training infrastructure, and on-device inference systems that run directly on robotic hardware.

We are seeking ML Systems Engineers to optimize both training and on-robot inference stacks. This role is focused on pushing performance boundaries across hardware, software, and model design — where improvements are still step-function rather than incremental.

Internally, this team is known as MBMB (More Big More Better).

What You'll Do

Push Training and Inference Performance to the Limit

  • Optimize both large-scale training systems and on-robot inference stacks
  • Deliver meaningful, step-function improvements in throughput, latency, and efficiency
  • Improve end-to-end system performance across distributed training and deployment environments

Make GPUs Perform at Maximum Efficiency

  • Identify and remove bottlenecks across the full compute stack
  • Optimize GPU utilization across training and inference workloads
  • Improve performance of transformer and diffusion-based architectures under real-world constraints

Engineer Across the Full Stack

  • Implement model-level, hardware-aware, and software-level optimizations that materially improve system performance
  • Work across:
      • CUDA kernels and low-level GPU execution
      • ML model architecture and compute efficiency
      • CPU bottlenecks and data pipelines
      • Network and distributed systems performance (NVLink, interconnects, and cluster communication)
      • Python, NumPy, and PyTorch-level inefficiencies

Drive System-Level Improvements

  • Evaluate and implement changes that lead to measurable gains in training and inference efficiency
  • Collaborate with ML researchers and systems engineers to identify high-leverage optimization opportunities
  • Continuously profile, benchmark, and improve system performance across evolving workloads

What We're Looking For

  • Strong experience with performance optimization in ML systems
  • Up-to-date knowledge of modern training and inference techniques for transformer and diffusion models
  • Ability to reason across the full stack, including:
      • GPU and CUDA-level optimization
      • Model architecture efficiency
      • CPU, memory, and I/O bottlenecks
      • Distributed networking and communication overhead
      • Framework-level performance (PyTorch, NumPy, Python)
  • Strong systems intuition and ability to identify bottlenecks quickly
  • Comfort operating in fast-moving environments where large performance gains are still available

Preferred Experience

  • Experience optimizing large-scale training or inference systems
  • Deep familiarity with GPU programming and kernel optimization
  • Experience working with distributed ML systems at scale
  • Exposure to model architecture-level efficiency improvements
  • Background spanning both systems engineering and machine learning

Why This Role Matters

  • Direct impact on both training speed and real-time robot performance
  • Work on problems where improvements are still large and measurable
  • Shape the efficiency and scalability of next-generation embodied intelligence systems
  • Operate across the full stack — from hardware execution to model design

About the Company

We are a research-driven AI and robotics company focused on building scalable embodied intelligence systems. By combining advances in machine learning, systems engineering, and robotics, we aim to push the frontier of efficient, real-world AI.

We are committed to building an inclusive and diverse workplace and encourage applicants from all backgrounds to apply.

