ML/AI Research Engineer — Agentic AI Lab (Founding Team)

  2025-10-06     Fabrion     San Francisco, CA
Description:

Overview

ML/AI Research Engineer — Agentic AI Lab (Founding Team) at Fabrion. Location: San Francisco Bay Area. Type: Full-Time. Compensation: Competitive salary + meaningful equity (founding tier). Backed by 8VC, we are building a world-class team to tackle one of the industry's most critical infrastructure problems.

About the Role

We're designing the future of enterprise AI infrastructure — grounded in agents, retrieval-augmented generation (RAG), knowledge graphs, and multi-tenant governance. We're looking for an ML/AI Research Engineer to join our AI Lab and lead the design, training, evaluation, and optimization of agent-native AI models. You'll work at the intersection of LLMs, vector search, graph reasoning, and reinforcement learning — building the intelligence layer that sits on top of our enterprise data fabric.

This isn't a prompt engineering role. It's full-cycle ML: from data curation and fine-tuning to evaluation, interpretability, and deployment, with cost-awareness, alignment, and agent coordination all in scope.

Core Responsibilities

  • Fine-tune and evaluate open-source LLMs (e.g. Llama 3, Mistral, Falcon, Mixtral) for enterprise use cases with both structured and unstructured data
  • Build and optimize RAG pipelines using LangChain, LangGraph, LlamaIndex, or Dust — integrated with our vector DBs and internal knowledge graph
  • Train agent architectures (ReAct, AutoGPT, BabyAGI, OpenAgents) using enterprise task data
  • Develop embedding-based memory and retrieval chains with token-efficient chunking strategies
  • Create reinforcement learning and preference-optimization pipelines to optimize agent behaviors (e.g. RLHF with PPO, DPO)
  • Establish scalable evaluation harnesses for LLM and agent performance, including synthetic evals, trace capture, and explainability tools
  • Contribute to model observability, drift detection, error classification, and alignment
  • Optimize inference latency and GPU resource utilization across cloud and on-prem environments

Desired Experience

Model Training

  • Deep experience fine-tuning open-source LLMs using Hugging Face Transformers, DeepSpeed, vLLM, FSDP, LoRA/QLoRA
  • Worked with both base and instruction-tuned models; familiar with SFT, RLHF, DPO pipelines
  • Comfortable building and maintaining custom training datasets, filters, and eval splits
  • Understand tradeoffs in batch size, token window, optimizer, precision (FP16, bfloat16), and quantization

RAG + Knowledge Graphs

  • Experience building enterprise-grade RAG pipelines integrated with real-time or contextual data
  • Familiar with LangChain, LangGraph, LlamaIndex, and open-source vector DBs (Weaviate, Qdrant, FAISS)
  • Experience grounding models with structured data (SQL, graph, metadata) + unstructured sources
  • Bonus: Worked with Neo4j, PuppyGraph, RDF, OWL, or other semantic modeling systems

Agent Intelligence

  • Experience training or customizing agent frameworks with multi-step reasoning and memory
  • Understand common agent loop patterns (e.g. Plan→Act→Reflect), memory recall, and tools
  • Familiar with self-correction, multi-agent communication, and agent ops logging

Optimization

  • Strong background in token cost optimization, chunking strategies, reranking (e.g. Cohere, Jina), compression, and retrieval latency tuning
  • Experience running models under quantized (int4/int8) or multi-GPU settings with inference tuning (vLLM, TGI)

Preferred Tech Stack

  • LLM Training & Inference: Hugging Face Transformers, DeepSpeed, vLLM, FlashAttention, FSDP, LoRA
  • Agent Orchestration: LangChain, LangGraph, ReAct, OpenAgents, LlamaIndex
  • Vector DBs: Weaviate, Qdrant, FAISS, Pinecone, Chroma
  • Graph Knowledge Systems: Neo4j, PuppyGraph, RDF, Gremlin, JSON-LD
  • Storage & Access: Iceberg, DuckDB, Postgres, Parquet, Delta Lake
  • Evaluation: OpenLLM Evals, TruLens, Ragas, LangSmith, Weights & Biases
  • Compute: Ray, Kubernetes, TGI, SageMaker, Lambda Labs, Modal
  • Languages: Python (core), optionally Rust (for inference layers) or JS (for UX experimentation)

Soft Skills & Mindset

  • Startup DNA: resourceful, fast-moving, and capable of working in ambiguity
  • Deep curiosity about agent-based architectures and real-world enterprise complexity
  • Comfortable owning model performance end-to-end: from dataset to deployment
  • Strong instincts around explainability, safety, and continuous improvement
  • Enjoy pair-designing with product and UX to shape capabilities, not just APIs

Why This Role Matters

This role is foundational to our thesis: that agents + enterprise data + knowledge modeling can create intelligent infrastructure for real-world, multi-billion-dollar workflows. Your work won't be buried in research reports; it will be productionized, used by hundreds of users, and applied to hundreds of thousands of decisions. If this is your dream role, we would love to hear from you.

Seniority level

  • Entry level

Employment type

  • Full-time

Job function

  • Engineering and Information Technology

Industries

  • Technology, Information and Internet
