Job Description

IndiaAI is building India’s next-gen foundational LLMs. We’re looking for a hands-on Senior ML Engineer experienced in large-scale pre-training, distributed GPU systems, and data creation pipelines. You will work with Megatron-LM, NVIDIA NeMo, DeepSpeed, PyTorch Distributed, and SLURM to train 7B–70B+ models on multi-node GPU clusters.


What You’ll Do

  • Build & optimize LLM pre-training pipelines (7B–70B+).
  • Implement distributed training using PyTorch Distributed (DDP/FSDP), DeepSpeed (ZeRO), Megatron-LM, and NVIDIA NeMo.
  • Manage multi-node GPU jobs via SLURM and optimize NCCL communication.
  • Lead large-scale data creation, cleaning, deduplication, tokenization & sharding for multilingual datasets (with a focus on Indian languages).
  • Build high-throughput dataloaders, monitoring dashboards & training workflows.
  • Collaborate with infra teams to optimize GPU utilization, networking, and storage systems.
...
