Top 3 must-have skills
• AI/ML engineering with hands-on experience in multimodal models (e.g., CLIP, BLIP, Whisper)
• Python
• Vector databases (e.g., FAISS, Milvus, Weaviate) and embedding pipelines
Job Description
• Analyze the current multimodal indexing pipeline to identify performance bottlenecks (latency, scalability, and throughput).
• Design and implement GenAI-driven optimizations for data ingestion, preprocessing, embedding generation, vector storage, indexing, and retrieval.
• Improve embedding quality and efficiency for diverse modalities (text, image, audio, video).
• Integrate and optimize vector databases / retrieval systems (e.g., Weaviate, FAISS, Milvus).
• Build scalable microservices/APIs for multimodal embedding and retrieval workflows.
• Collaborate with data scientists, ML engineers, and platform teams to streamline ETL and orchestration pipelines.
• Develop monitoring, logging, and alerting for indexing pipeline health and performance.
• Stay current with emerging GenAI frameworks and platforms (OpenAI, Hugging Face, LangChain, LlamaIndex, etc.) and apply them to pipeline improvements.