Job Description

About Pocket FM


Pocket FM is on a mission to deliver personalized and immersive audio experiences to listeners worldwide. We are revolutionizing the audio entertainment industry through long-form storytelling, supported by our cutting-edge platform that serves millions of listeners and generates billions of minutes of engagement monthly. We leverage Generative AI in producing content and streamlining operations, developing innovative solutions for cutting-edge challenges in the AI landscape across all modalities—text, audio, and images. With strong backing and rapid user base growth, Pocket FM is an exciting and dynamic place to join.


About the Role


We are seeking an experienced research scientist to drive innovation in long-form content generation and localization. Your work will focus on creating seamless, culturally tailored storytelling experiences, evaluating content quality through user engagement metrics, and transforming research breakthroughs into tangible solutions. You will lead the development of state-of-the-art TTS systems to create highly natural and expressive voices for our immersive audio storytelling platform. Your focus will be on building low-latency, end-to-end neural speech models that can accurately capture emotion and cultural nuances in multiple languages. This role offers the opportunity to contribute to cutting-edge research while also having a direct and measurable impact on the company’s success.

The team is open to the candidate being located anywhere in North America or India, with occasional travel to meet the team in person a few times a year.


Key Responsibilities

  • Model Development: Design, implement, and optimize modern neural TTS systems, including diffusion- and flow-based architectures, neural codec–based speech generation, and LLM-conditioned or hybrid speech synthesis models for expressive, long-form audio.
  • Speech Controllability: Develop methods for fine-grained control over speech attributes like pitch, rhythm, emotion, and speaker style to enhance storytelling quality.
  • Efficiency & Latency: Optimize models for real-time inference and high-scale production, utilizing techniques like knowledge distillation and model quantization.
  • Multilingual Synthesis: Spearhead research into cross-lingual and multilingual TTS to support global content localization.
  • Quality Evaluation: Design and implement robust evaluation frameworks, including MOS (Mean Opinion Score) and objective metrics, to assess the naturalness and intelligibility of generated speech.



Qualifications

  • Domain Expertise: Demonstrated experience in speech synthesis, digital signal processing (DSP), and audio analysis.
  • TTS Tooling: Proficiency with speech-specific frameworks and libraries such as Coqui TTS, ESPnet, or NVIDIA NeMo.
  • Advanced Architectures: Hands-on experience with sequence-to-sequence models, GANs, Variational Autoencoders (VAEs), and Diffusion models for audio.
  • Data Processing: Experience in building high-quality audio datasets, including voice cloning, speaker verification, and handling prosody.
  • Master’s or PhD degree in Computer Science, Machine Learning, or a related field
  • Significant Python and applied research experience in industrial settings
  • Proficiency in frameworks such as PyTorch or TensorFlow
  • Demonstrated experience in deep learning, especially language modeling with transformers and machine translation
  • Prior experience working with vector databases, search indices, or other data stores for search and retrieval use cases
  • Preference for fast-paced, collaborative projects with concrete goals, quantitatively tested through A/B experiments
  • Published research in peer-reviewed journals and conferences on relevant topics


Join us in this exciting opportunity to contribute to groundbreaking research in Generative AI technologies that will impact millions of users globally.
