Job Description
Job Title: GenAI Engineer
Experience: 5+ Years
Location: Onsite (Pune)
Duration: 3 to 6 Months
We are seeking a GenAI Engineer to design, build, and deploy Generative AI solutions that enhance business workflows and user experiences. The ideal candidate will have strong hands-on expertise in Large Language Models (LLMs), prompt engineering, and Retrieval-Augmented Generation (RAG), with the ability to integrate AI capabilities into scalable, production-grade applications.
Responsibilities:
Model Integration & APIs
- Integrate and fine-tune LLMs (OpenAI / Azure OpenAI / Open-source models)
- Build scalable APIs and microservices for GenAI-powered features
- Deploy GenAI solutions using Docker and Kubernetes
Prompt Engineering
- Design, optimize, and evaluate prompts for accuracy, safety, and performance
- Implement guardrails, prompt versioning, and evaluation strategies
RAG (Retrieval-Augmented Generation)
- Build end-to-end RAG pipelines:
  - Document ingestion
  - Chunking strategies
  - Vector embeddings
  - Semantic search
- Implement Top-K retrieval and re-ranking techniques
Application Development
- Integrate GenAI features into web/mobile applications
- Develop APIs using FastAPI / Flask / Django
- Build interactive GenAI apps using Streamlit or React
Real-Time AI & Streaming
- Implement streaming responses using Server-Sent Events (SSE) or WebSockets
- Optimize perceived latency for LLM-based applications
Optimization & Monitoring
- Monitor token usage, latency, and inference costs
- Implement cost-optimization strategies and performance tuning
- Track model quality and drift
AI Safety & Governance
- Implement moderation, bias detection, and responsible AI guidelines
- Ensure compliance with AI safety best practices
Required Skills
- Programming: Python (FastAPI, Flask, Django)
- LLMs & GenAI: OpenAI APIs, Azure OpenAI, Prompt Engineering
- RAG & Search: Vector embeddings, semantic search
- Vector Databases: Pinecone, Weaviate, FAISS
- Cloud Platforms: AWS / Azure / GCP
- DevOps: Docker, Kubernetes, Microservices
- ML Fundamentals: Embeddings, tokenization, similarity metrics
- Real-Time AI: SSE, WebSockets
Preferred / Good-to-Have Skills
- LangChain, LlamaIndex
- Image generation models (Stable Diffusion)
- MLOps, CI/CD pipelines
- Model evaluation techniques (LLM-as-a-Judge)
- Cost monitoring and inference optimization
- Experience with Matryoshka embeddings or vector compression techniques