Introduction to the position
As a Data Scientist – R&D (Speech-focused) at ToumAI, your mission will be to research, develop and optimize speech and voice intelligence models that power our Voice AI platforms, SDKs and on-device solutions.
You will work on end-to-end speech pipelines, from data and annotation strategies to model training, evaluation and optimization, under tight constraints on accuracy, latency, robustness and memory footprint, particularly for low-resource languages and dialects such as Moroccan Darija.
Your work will directly impact production systems used in voicebots, QMS, VoC analytics, APIs and on-device SDKs, bridging applied research and real-world deployment.
Your role
• Research, design and train speech-related models (STT components, language identification, diarization, emotion recognition, speech segmentation, code-switching detection).
• Improve transcription accuracy and robustness in noisy, conversational and real-world audio conditions.
• Work on multilingual and dialectal speech data with a focus on low-resource settings.
• Define and refine data collection, annotation strategies and quality control processes for speech datasets.
• Design evaluation protocols and metrics aligned with product and business requirements.
• Collaborate with platform and ML engineers to integrate models into real-time and batch pipelines.
• Optimize models for latency, memory footprint and inference efficiency (quantization, pruning, distillation).
• Analyze model errors and failure modes, propose corrective strategies and iterate.
• Document experiments, architectures and learnings to support long-term R&D.
• Stay up to date with state-of-the-art research in speech processing and applied Voice AI.
Your qualifications
• Strong curiosity for speech processing and applied AI research.
• Solid experience in machine learning and data science, with a focus on speech or audio-based models.
• Hands-on experience with PyTorch (or equivalent deep learning frameworks).
• Understanding of speech tasks such as ASR/STT, diarization, VAD, language identification or emotion recognition.
• Experience working with audio data pipelines and annotation workflows.
• Familiarity with model evaluation, error analysis and benchmarking in speech systems.
• Interest in low-latency and on-device inference constraints is a strong plus.
• Experience with multilingual, low-resource or dialectal speech is highly valued.
• Familiarity with LLM-based speech pipelines or speech-to-text post-processing is a plus.
• Strong analytical mindset and ability to collaborate with engineering teams.
• Proficiency in French or Arabic is a plus; English is required for technical work.
Benefits
At ToumAI, you will work at the intersection of applied research and production Voice AI.
You will:
• Contribute to core R&D on speech and voice intelligence
• Work on real datasets and deployed systems
• Influence model choices, architectures and optimization strategies
• Help shape the future of Voice AI for low-resource languages
If you enjoy pushing speech models from research to real-world deployment under tight constraints, this role is for you.
Recruitment process
• First conversation with the HR team to get to know you better and introduce you to ToumAI
• Technical test or applied research case (speech-focused)
• Role-specific interview with your future manager / R&D lead
• Final meeting with top management (if needed)