Senior Generative AI Engineer

📍 Vapi, Gujarat, India
Full-time | Computer Occupations | Posted January 17, 2026
Job Description

Job Title: Senior Generative AI Engineer (Drafting & RAG Systems)
Role Overview
We are looking for a Senior Generative AI Engineer to lead the development and deployment of our next-generation Automated Drafting Tool. You will be responsible for the entire lifecycle of the AI features, from local prototyping using Ollama to scaling globally via OpenAI APIs.
The ideal candidate has a "Full-Stack AI" mindset: you understand how to retrieve context using RAG, manage high-dimensional data in Vector Databases, and ensure the final drafted output is coherent, accurate, and contextually aware.

Key Responsibilities
1. AI Architecture & Drafting Logic
Design and implement end-to-end Retrieval-Augmented Generation (RAG) pipelines specifically optimized for document drafting.
Develop advanced Prompt Engineering strategies to handle complex drafting constraints (tone, legal/technical compliance, and formatting).
Implement hybrid model strategies, utilizing Ollama for local development, testing, and privacy-sensitive tasks, while orchestrating OpenAI (GPT-4o/o1) for production-level reasoning.
2. Data & Vector Engineering
Build and maintain scalable Vector Databases (e.g., Pinecone, Weaviate, Milvus, or FAISS).
Optimize document ingestion pipelines: chunking strategies, embedding model selection, and metadata filtering to improve retrieval precision.
Implement "Agentic RAG", where the system can self-correct or reason through a draft in multiple steps.
3. Deployment & MLOps (Local to Cloud)
Bridge the gap between local ideation (running models on Ollama or local GPUs) and cloud production environments.
Deploy AI services using containerization (Docker/Kubernetes) and manage API latency, rate limits, and token costs.
Establish monitoring for AI performance, including hallucination detection and "groundedness" metrics.

Required Skills & Qualifications
Mandatory Experience
Experience: 3+ years of professional experience in AI/Machine Learning or Backend Engineering with a heavy GenAI focus.
LLM Orchestration: Deep hands-on experience with LangChain or LlamaIndex.
Model Proficiency: Expert knowledge of the OpenAI API ecosystem and local model runners like Ollama.
Vector Expertise: Proven track record of implementing and optimizing Vector Databases and RAG workflows.
Programming: Mastery of Python (FastAPI/Flask) and asynchronous programming.
Exposure to JIRA and Confluence is a must-have.
Technical Stack
Models: OpenAI (GPT-4), Ollama (Llama 3, Mistral, Mixtral).
Tools: LangChain, LlamaIndex, LangSmith (for tracing).
Database: Pinecone, ChromaDB, or pgvector.
Infrastructure: Docker, AWS/GCP/Azure, GitHub Actions for CI/CD.

What We Look For (The "Hacker" Mindset)
Production Proven: You have moved at least one GenAI product from a Jupyter notebook or local script to a live environment with real users.
Problem Solver: You know how to handle the "stochastic" nature of LLMs and can build guardrails to prevent hallucinations in drafting.
Architecture First: You care about token optimization and latency just as much as you care about the quality of the text generated.