Job Description
Total Experience: 5 to 10 years
Location: Bangalore
Role: Lead development and optimization of speech‑based machine learning models (STT/TTS) using modern open‑source frameworks.
Responsibilities:
- Drive end‑to‑end development of speech‑related ML components and model pipelines.
- Implement fine‑tuning, optimization, and evaluation workflows for AI models.
- Ensure model efficiency, scalability, and readiness for production environments.
- Work closely with data, engineering, and platform teams to align training and deployment workflows.
- Provide technical leadership on model performance, experimentation, and best practices.
Required Skills:
- Deep expertise in PyTorch: training loops, fine‑tuning, distributed training (DDP, Accelerate).
- Experience with ONNX Runtime and TensorRT for model conversion, optimization, and accelerated inference.
- Understanding of speech/audio processing: spectrograms, MFCCs, VAD, noise reduction.
- Hands‑on experience with at least one speech framework: Whisper, ESPnet, Coqui TTS, NVIDIA NeMo, or similar.
- Experience training ML models on GPU clusters (e.g., A100, V100) and managing long‑duration jobs.
- Familiarity with Hugging Face ecosystem: datasets, tokenizers, training utilities.
Preferred Skills:
- Experience with multilingual or multi‑accent datasets.
- Basic proficiency in C++ for inference optimization or runtime improvements.
- Exposure to deploying real‑time STT/TTS or low‑latency ML systems.