Gen AI Engineer

📍 Pune, Maharashtra, India

Full-time | IT Services and IT Consulting | Posted January 28, 2026

Job Description

Job Title: Gen AI Engineer
Experience: 5+ Years
Location: Onsite (Pune)
Duration: 3 to 6 Months


We are seeking a GenAI Engineer to design, build, and deploy Generative AI solutions that enhance business workflows and user experiences. The ideal candidate will have strong hands-on expertise in Large Language Models (LLMs), prompt engineering, and Retrieval-Augmented Generation (RAG), with the ability to integrate AI capabilities into scalable, production-grade applications. 

Responsibilities:  

Model Integration & APIs 
Integrate and fine-tune LLMs (OpenAI / Azure OpenAI / Open-source models) 
Build scalable APIs and microservices for GenAI-powered features 
Deploy GenAI solutions using Docker and Kubernetes 

Prompt Engineering 
Design, optimize, and evaluate prompts for accuracy, safety, and performance 
Implement guardrails, prompt versioning, and evaluation strategies 

RAG (Retrieval-Augmented Generation) 
Build end-to-end RAG pipelines: document ingestion, chunking strategies, vector embeddings, and semantic search 
Implement Top-K retrieval and re-ranking techniques 

Application Development 
Integrate GenAI features into web/mobile applications 
Develop APIs using FastAPI / Flask / Django 
Build interactive GenAI apps using Streamlit or React 

Real-Time AI & Streaming 
Implement streaming responses using Server-Sent Events (SSE) or WebSockets 
Optimize perceived latency for LLM-based applications 

Optimization & Monitoring 
Monitor token usage, latency, and inference costs 
Implement cost-optimization strategies and performance tuning 
Track model quality and drift 

AI Safety & Governance 
Implement moderation, bias detection, and responsible AI guidelines 
Ensure compliance with AI safety best practices 

Required Skills 
Programming: Python (FastAPI, Flask, Django) 
LLMs & GenAI: OpenAI APIs, Azure OpenAI, Prompt Engineering 
RAG & Search: Vector embeddings, semantic search 
Vector Databases: Pinecone, Weaviate, FAISS 
Cloud Platforms: AWS / Azure / GCP 
DevOps: Docker, Kubernetes, Microservices 
ML Fundamentals: Embeddings, tokenization, similarity metrics 
Real-Time AI: SSE, WebSockets 

Preferred / Good-to-Have Skills  
LangChain, LlamaIndex 
Image generation models (Stable Diffusion) 
MLOps, CI/CD pipelines 
Model evaluation techniques (LLM-as-a-Judge) 
Cost monitoring and inference optimization 
Experience with Matryoshka embeddings or vector compression techniques 
