Job Details

Staff GenAI Inference Engineer: Optimize LLM Serving Latency

2026-06-30 Menlo Ventures San Francisco, CA

Description:

A leading data and AI company is seeking a Staff Software Engineer for GenAI inference to lead the architecture and optimization of the inference engine. The role requires expertise in CUDA, GPU programming, and distributed systems design. Ideal candidates will have a strong software engineering background and a proven ability to collaborate with researchers and drive architectural decisions. Competitive compensation is offered, with a salary range of $190,900 to $232,800 USD.
