Job Description
Machine Learning Engineer III – LLM Training (RL + PEFT)
On-site, Bangalore
LatentForce
About the Role
We are building specialized LLMs that understand and reason over massive enterprise codebases. This is real model training — RL loops, PEFT, verifiable rewards, long-context modeling — not API integration. You’ll own end-to-end experimentation and work directly with founders.
Responsibilities
- Train LLMs using RL (PPO/GRPO/RLHF/RLVR) and PEFT (LoRA, QLoRA, DoRA, IA3).
- Build custom training loops with PyTorch, Hugging Face, TRL, Unsloth.
- Design reward functions and verifiers for code-understanding tasks.
- Run full-stack ML experiments: data → training → eval → iteration.
- Develop scalable training infra (FSDP/DeepSpeed, distributed training).
- Build evaluation suites for reasoning and code comprehension.
Minimum Qualifications
- 3+ years of real deep learning experience (actual model training).
- Strong fundamentals: linear a...