Job Description

Job Title: GenAI Engineer

Experience: 5+ Years

Location: Onsite (Pune)

Duration: 3 to 6 Months



We are seeking a GenAI Engineer to design, build, and deploy Generative AI solutions that enhance business workflows and user experiences. The ideal candidate will have strong hands-on expertise in Large Language Models (LLMs), prompt engineering, and Retrieval-Augmented Generation (RAG), with the ability to integrate AI capabilities into scalable, production-grade applications.


Responsibilities:


Model Integration & APIs

  • Integrate and fine-tune LLMs (OpenAI, Azure OpenAI, or open-source models)
  • Build scalable APIs and microservices for GenAI-powered features
  • Deploy GenAI solutions using Docker and Kubernetes


Prompt Engineering

  • Design, optimize, and evaluate prompts for accuracy, safety, and performance
  • Implement guardrails, prompt versioning, and evaluation strategies


RAG (Retrieval-Augmented Generation)

  • Build end-to-end RAG pipelines:
      • Document ingestion
      • Chunking strategies
      • Vector embeddings
      • Semantic search
  • Implement Top-K retrieval and re-ranking techniques


Application Development

  • Integrate GenAI features into web/mobile applications
  • Develop APIs using FastAPI / Flask / Django
  • Build interactive GenAI apps using Streamlit or React


Real-Time AI & Streaming

  • Implement streaming responses using Server-Sent Events (SSE) or WebSockets
  • Optimize perceived latency for LLM-based applications


Optimization & Monitoring

  • Monitor token usage, latency, and inference costs
  • Implement cost-optimization strategies and performance tuning
  • Track model quality and drift


AI Safety & Governance

  • Implement moderation, bias detection, and responsible AI guidelines
  • Ensure compliance with AI safety best practices


Required Skills

  • Programming: Python (FastAPI, Flask, Django)
  • LLMs & GenAI: OpenAI APIs, Azure OpenAI, Prompt Engineering
  • RAG & Search: Vector embeddings, semantic search
  • Vector Databases: Pinecone, Weaviate, FAISS
  • Cloud Platforms: AWS / Azure / GCP
  • DevOps: Docker, Kubernetes, Microservices
  • ML Fundamentals: Embeddings, tokenization, similarity metrics
  • Real-Time AI: SSE, WebSockets


Preferred / Good-to-Have Skills

  • LangChain, LlamaIndex
  • Image generation models (Stable Diffusion)
  • MLOps, CI/CD pipelines
  • Model evaluation techniques (LLM-as-a-Judge)
  • Cost monitoring and inference optimization
  • Experience with Matryoshka embeddings or vector compression techniques
