Job Description

Dear All,


We are seeking a highly capable Machine Learning Engineer with deep expertise in fine-tuning Large Language Models (LLMs) and Vision-Language Models (VLMs) for intelligent document processing. This role requires strong knowledge of PEFT techniques (LoRA, QLoRA), quantization, transformer architectures, prompt engineering, and orchestration frameworks like LangChain. You’ll work on building and scaling end-to-end document processing workflows using both open-source and commercial models (OpenAI, Google, etc.), with an emphasis on performance, reliability, and observability.

Key Responsibilities:
  • Fine-tune and optimize open-source and commercial LLMs/VLMs (e.g., LLaMA, Cohere, Gemini, GPT-4) for structured and unstructured document processing tasks.
  • Apply advanced PEFT techniques (LoRA, QLoRA) and model quantization to enable efficient deployment and experimentation.
  • Design LLM-based document intelligence pipelines for tasks like OCR extraction, entity recognition, key-value pairing, summarization, and layout understanding.
  • Develop and manage prompting techniques (zero-shot, few-shot, chain-of-thought, self-consistency) tailored to document use-cases.
  • Implement LangChain-based workflows integrating tools, agents, and vector stores for RAG-style processing.
  • Monitor experiments and production models using Weights & Biases (W&B) or similar ML observability tools.
  • Work with OpenAI (GPT series), Google PaLM/Gemini, and other LLM/VLM APIs for hybrid system design.
  • Collaborate with cross-functional teams to deliver scalable, production-ready ML systems and continuously improve model performance.
  • Build reusable, well-documented code and maintain a high standard of reproducibility and traceability.

Required Skills & Experience:
  • Hands-on experience with transformer architectures and libraries like HuggingFace Transformers.
  • Deep knowledge of fine-tuning strategies for large models, including LoRA, QLoRA, and other PEFT approaches.
  • Experience in prompt engineering and developing advanced prompting strategies.
  • Familiarity with LangChain, vector databases (e.g., FAISS, Pinecone), and tool/agent orchestration.
  • Strong applied knowledge of OpenAI, Google (Gemini/PaLM), and other foundational LLM/VLM APIs.
  • Proficiency in model training, tracking, and monitoring using tools like Weights & Biases (W&B).
  • Solid understanding of deep learning, machine learning, natural language processing, and computer vision concepts.
  • Experience working with document AI models (e.g., LayoutLM, Donut, Pix2Struct) and OCR tools (Tesseract, EasyOCR, etc.).
  • Proficient in Python, PyTorch, and related ML tooling.

Nice-to-Have:
  • Experience with multi-modal architectures for document + image/text processing.
  • Knowledge of RAG systems, embedding models, and custom vector store integrations.
  • Experience in deploying ML models via FastAPI, Triton, or similar frameworks.
  • Contributions to open-source AI tools or model repositories.
  • Exposure to MLOps, CI/CD pipelines, and data versioning.

Qualifications:
  • Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field.

Why Join Us?
  • Work on cutting-edge GenAI and Document AI use-cases.
  • Collaborate in a fast-paced, research-driven environment.
  • Flexible work arrangements and growth-focused culture.
  • Opportunity to shape real-world applications of LLMs and VLMs.


Apply for this Position

Ready to join? Submit your application.
