Senior Infrastructure Engineer

  2025-10-20     AimHire     San Francisco, CA
Description:

About the Role

Our client is looking for a Senior Infrastructure Engineer with 6+ years of experience scaling large, reliable systems at startups that have grown to hyperscale. The ideal candidate is deeply technical, high-agency, and thrives in fast-moving environments. You'll build and maintain auto-scaling infrastructure for RL training systems, working closely with AI researchers to ensure reliability and performance.

If you have hands-on expertise with Kubernetes or HashiCorp tooling such as Nomad, and a strong appreciation for AI research and infrastructure, this is a rare opportunity to own critical systems at the frontier of AI.

Bonus points if you've worked with code sandboxes, function execution in the cloud, or provisioning compute for large-scale training runs.

What You'll Do

  • Build and maintain auto-scaling infrastructure underpinning RL training systems for major hyperscalers.
  • Own and drive scalability/reliability of critical infrastructure components.
  • Partner with AI researchers and engineers to align infra with research needs.
  • Design and implement systems for large-scale training runs.
  • Mentor junior engineers and bring your experience (“battle scars”) to operationalize the stack.

What We're Looking For

  • 6+ years of experience scaling large, reliable systems.
  • Proven background at startups that have grown to hyperscale.
  • Strong skills with Kubernetes or HashiCorp tooling such as Nomad.
  • Comfort building auto-scaling infra for RL training systems.
  • Deep technical expertise, fast execution, and ability to delegate to fleets of coding agents.
  • Appreciation for AI research and low-level infra.
  • Bonus: experience with code sandboxes, function execution in the cloud, or provisioning compute for large-scale runs.

About Our Client

Our client partners with frontier labs, hyperscalers, and enterprises to develop and deploy the next generation of embodied agents. They see creating evals and environments—codifying human goals for agents—as the highest-leverage human activity on the road to ASI.

  • Raised $14.5M from BCV, Sequoia Capital, Menlo Ventures, and SV Angel
  • Experiencing rapid growth, having surpassed revenue benchmarks
  • Mission: Build the core infrastructure to unlock double-digit economic growth with computer-use agents