Job Description
Dear All,
We are seeking a highly capable Machine Learning Engineer with deep expertise in fine-tuning Large Language Models (LLMs) and Vision-Language Models (VLMs) for intelligent document processing. This role requires strong knowledge of PEFT techniques (LoRA, QLoRA), quantization, transformer architectures, prompt engineering, and orchestration frameworks like LangChain. You’ll work on building and scaling end-to-end document processing workflows using both open-source and commercial models (OpenAI, Google, etc.), with an emphasis on performance, reliability, and observability.
Key Responsibilities:
- Fine-tune and optimize open-source and commercial LLMs/VLMs (e.g., LLaMA, Cohere, Gemini, GPT-4) for structured and unstructured document processing tasks.
- Apply advanced PEFT techniques (LoRA, QLoRA) and model quantization to enable efficient deployment and experimentation.
- Design LLM-based document intelligence pipelines for tasks like OCR extraction, entity recognition, key-value pairing, summarization, and layout understanding.
- Develop and manage prompting techniques (zero-shot, few-shot, chain-of-thought, self-consistency) tailored to document use-cases.
- Implement LangChain-based workflows integrating tools, agents, and vector stores for RAG-style processing.
- Monitor experiments and production models using Weights & Biases (W&B) or similar ML observability tools.
- Work with OpenAI (GPT series), Google PaLM/Gemini, and other LLM/VLM APIs for hybrid system design.
- Collaborate with cross-functional teams to deliver scalable, production-ready ML systems and continuously improve model performance.
- Build reusable, well-documented code and maintain a high standard of reproducibility and traceability.
Qualifications:
- Hands-on experience with transformer architectures and libraries like HuggingFace Transformers.
- Deep knowledge of fine-tuning strategies for large models, including LoRA, QLoRA, and other PEFT approaches.
- Experience in prompt engineering and developing advanced prompting strategies.
- Familiarity with LangChain, vector databases (e.g., FAISS, Pinecone), and tool/agent orchestration.
- Strong applied knowledge of OpenAI, Google (Gemini/PaLM), and other foundational LLM/VLM APIs.
- Proficiency in model training, tracking, and monitoring using tools like Weights & Biases (W&B).
- Solid understanding of deep learning, machine learning, natural language processing, and computer vision concepts.
- Experience working with document AI models (e.g., LayoutLM, Donut, Pix2Struct) and OCR tools (Tesseract, EasyOCR, etc.).
- Proficient in Python, PyTorch, and related ML tooling.
- Experience with multi-modal architectures for document + image/text processing.
- Knowledge of RAG systems, embedding models, and custom vector store integrations.
- Experience in deploying ML models via FastAPI, Triton, or similar frameworks.
- Contributions to open-source AI tools or model repositories.
- Exposure to MLOps, CI/CD pipelines, and data versioning.
- Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field.
What We Offer:
- Work on cutting-edge GenAI and Document AI use-cases.
- Collaborate in a fast-paced, research-driven environment.
- Flexible work arrangements and growth-focused culture.
- Opportunity to shape real-world applications of LLMs and VLMs.