Job Details

Staff Engineer - ML Inference & Model Efficiency

  2026-05-04     Cohere     San Francisco, CA
Description:

A leading AI research firm in San Francisco is seeking a Member of Technical Staff specializing in Model Efficiency. In this role, you will improve LLM inference systems by resolving performance issues and collaborating with cross-functional teams. Ideal candidates have more than 5 years of coding experience in C++ or Python and a solid understanding of the LLM inference environment. This position offers a remote-friendly work model, a competitive salary, and extensive benefits, including a generous vacation policy.

