AI Performance Engineer – CUDA & PyTorch Focus
Location: San Fransisco, CA
Compensation: $200,000-$300,000
A stealth-mode AI systems company is reimagining how large-scale inference is done. With generative AI workloads scaling rapidly, inference efficiency has become a critical bottleneck. We're building an integrated hardware-software platform that brings breakthrough performance and usability to production-scale LLM applications.
This is an opportunity to work on a highly technical team spun out of top-tier academic research, focused on the cutting edge of AI, distributed systems, and performance optimization.
What You'll Do:
What We're Looking For:
If you've spent your time making large models faster, leaner, and more efficient—and want to solve hard technical problems at the core of GenAI infrastructure—this role is for you.
Reach out to learn more.