Welcome to the VideoSDK Monthly Updates, your all-in-one recap of our latest releases and platform enhancements! This month was all about making our AI agents smarter and more responsive, giving you deeper control over your media streams, and improving stability across all our SDKs. We've shipped a major set of updates to the VideoSDK Agents SDK, published a brand-new WhatsApp Agent Quickstart, and rolled out powerful new features for our Android and React SDKs. Let's get into the details!
A Major Leap Forward for AI Agents 
This month, our open-source VideoSDK Agents SDK received a series of transformative updates, focusing on making AI-powered voice interactions more natural, stable, and efficient. At the heart of this evolution is Namo, our new proprietary multilingual turn detection model.
Introducing Namo: Our In-House Model for Perfect Turn-Taking 
A key part of making AI interactions feel human is knowing exactly when to speak. To solve this complex challenge, we're proud to introduce Namo-Turn-Detector, our in-house turn detection model.
While standard Voice Activity Detection (VAD) can tell you whether someone is talking, Namo goes a step further. It's trained to understand the nuances of conversation, so it can determine when a user has finished their thought and is ready for the agent to respond.
This leads to:
- Seamless Turn-Taking: Eliminates awkward pauses and prevents the agent from interrupting the user mid-sentence.
 - Higher Accuracy: Reduces errors in detecting the end of speech, making the conversation more reliable.
 - A Truly Natural Flow: Creates interactions that feel less robotic and more like talking to a real person.
 - Explore Docs
 - Check out all the supported languages and upvote/like if you find it worthwhile
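
To make the difference between VAD and turn detection concrete, here is a minimal Python sketch of the general idea. It is not Namo's implementation: the function names, the thresholds, and the toy `completion_score` heuristic are all assumptions made for illustration, and a real turn detector would use a trained model in place of the word list.

```python
# Illustrative only: every name and threshold here is hypothetical,
# and completion_score is a toy stand-in for a trained model.

TRAILING_FILLERS = {"and", "but", "so", "because", "um", "uh", "the", "to"}


def completion_score(transcript: str) -> float:
    """Toy heuristic: low score if the text ends on a word that usually continues."""
    words = transcript.lower().rstrip(".?!").split()
    if not words:
        return 0.0
    if words[-1] in TRAILING_FILLERS:
        return 0.1
    return 0.9


def vad_only_end_of_turn(silence_ms: int, silence_threshold_ms: int = 500) -> bool:
    """Silence-only rule: any pause longer than the threshold ends the turn."""
    return silence_ms >= silence_threshold_ms


def semantic_end_of_turn(
    transcript: str,
    silence_ms: int,
    min_silence_ms: int = 200,
    max_silence_ms: int = 2000,
    score_threshold: float = 0.5,
) -> bool:
    """End the turn after a short pause only if the utterance sounds complete."""
    if silence_ms >= max_silence_ms:
        return True  # fall back to silence alone after a long pause
    if silence_ms < min_silence_ms:
        return False  # too early to decide
    return completion_score(transcript) >= score_threshold
```

With a 600 ms pause after "I would like to book a flight to", the silence-only rule would let the agent cut in, while the semantic rule keeps waiting because the sentence is clearly unfinished.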
 
🚀 SDK Releases & Updates 
Here’s a full breakdown of all the SDK releases and enhancements from the past month.
Agents SDK (v0.0.37 - v0.0.41) 
Beyond the conversational improvements above, we've packed the Agents SDK with powerful new tools for developers.
- UtteranceHandle for Lifecycle Management: Gain granular control over the lifecycle of an agent's speech. You can now track completion, handle user interruptions, and await utterances to prevent overlapping TTS.
 - Enhanced Background Audio: Create more immersive agent experiences by playing background audio (e.g., thinking sounds, music) during agent interactions with new methods like play_background_audio().
 - CometAPI Plugin Integration: Simplify your AI stack by using multiple STT, LLM, and TTS services from different providers with a single API key.
 - Improved: We've also enhanced our plugin ecosystem with support for the Namo TurnDetector model, Deepgram TTS, a new ElevenLabs TTS plugin, and full integration with Azure's real-time voice services.
 - View full Agents SDK changelog
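
The idea behind awaiting an utterance can be sketched with plain `asyncio`. This is not the Agents SDK API: `SketchUtteranceHandle`, `speak`, and `demo` below are hypothetical stand-ins that only show how an awaitable handle lets you track completion, react to an interruption, and keep two utterances from overlapping.

```python
import asyncio


class SketchUtteranceHandle:
    """Hypothetical handle for one spoken utterance (not the SDK class)."""

    def __init__(self, text: str, task: asyncio.Task) -> None:
        self.text = text
        self._task = task
        self.interrupted = False

    @property
    def done(self) -> bool:
        return self._task.done()

    def interrupt(self) -> None:
        """Simulate the user barging in: stop playback early."""
        if not self._task.done():
            self.interrupted = True
            self._task.cancel()

    def __await__(self):
        return self._wait().__await__()

    async def _wait(self) -> "SketchUtteranceHandle":
        try:
            await self._task
        except asyncio.CancelledError:
            if not self.interrupted:
                raise  # the caller was cancelled, not the utterance
        return self


async def _play(text: str, log: list[str], seconds: float) -> None:
    log.append(f"start:{text}")
    await asyncio.sleep(seconds)  # stand-in for TTS playback time
    log.append(f"end:{text}")


def speak(text: str, log: list[str], seconds: float = 0.01) -> SketchUtteranceHandle:
    """Start 'speaking' and return a handle immediately."""
    task = asyncio.create_task(_play(text, log, seconds))
    return SketchUtteranceHandle(text, task)


async def demo() -> list[str]:
    log: list[str] = []
    first = speak("Hello!", log)
    await first  # wait, so the next utterance cannot overlap
    second = speak("Let me check that for you.", log, seconds=1.0)
    await asyncio.sleep(0.01)  # let playback begin
    second.interrupt()  # the user starts talking
    await second
    log.append(f"interrupted:{second.interrupted}")
    return log
```

Running `asyncio.run(demo())` plays the first utterance to completion before the second starts, then records that the second one was interrupted rather than finished.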
 
Android SDK (v0.6.0) 
This release introduces advanced video track optimization features, giving you greater control over quality and bandwidth.
- Video Bitrate Control (BitrateMode): Easily manage video quality by choosing between three modes: BANDWIDTH_OPTIMIZED, BALANCED (default), and HIGH_QUALITY.
 - Simulcast Layer Control (maxLayer): Specify the maximum number of simulcast layers to publish for a video track, allowing for fine-tuned performance.
 - Improved: The getVideoStats() method now returns a JsonArray, providing detailed statistics for all produced video layers.
 - View full Android SDK changelog
 
React SDK (v0.4.3 - v0.4.9) 
Our React SDK received a host of new features focused on real-time monitoring and easier stream management.
- onQualityLimitation Event: Proactively monitor local call quality by detecting bandwidth limits, network congestion, or CPU limitations in real time.
 - useStream Hook: Get direct access to all methods and properties of media streams within your components for simplified stream management.
 - onStreamStateChanged Event: Better monitor the stream health of remote participants by detecting freeze, stuck, or recovery events.
 - Improved: The VideoPlayer component now includes a muted attribute and supports passing a custom ref.
 - Fixed: We resolved a memory leak in EventEmitter and fixed a video orientation issue on iOS browsers.
 - View full React SDK changelog
 
JavaScript SDK (v0.3.6 - v0.3.7) 
- Improved: Enabled simulcast layers for all custom tracks.
 - Fixed: Resolved a default camera issue in React Native, fixed a bug that created multiple webcam producers, and corrected a video orientation issue on iOS browsers.
 - ⚠ Deprecated: The getNetworkStats method has been deprecated.
 - View full JS SDK changelog
 
React Native SDK (v0.4.1 - v0.4.2) 
- Improved: The SDK is now compatible with React Native 0.82+ and Expo SDK 54+.
 - Fixed: Resolved issues with the defaultCamera parameter and changeWebcam() method behavior.
 - View full React Native SDK changelog
 
Flutter SDK (v3.1.0)  
- Fixed: Addressed an issue where the microphone would stop working after a device was removed, and fixed a bug that prevented stats from displaying correctly on the dashboard.
 - View full Flutter SDK changelog
 
🔧 Platform & Dashboard Updates 
A Brand New Dashboard Experience! 
We've rolled out a redesigned developer dashboard! The new interface is cleaner, more intuitive, and makes it easier than ever to navigate your projects, monitor usage, and access your API keys. Log in to check it out!
📚 New Content & Resources 
New Platform-Specific Quickstart Guides for Agents 
- Getting started with VideoSDK Agents has never been easier. We've published a full suite of new agent integration quickstart guides tailored to your favourite platform. Whether you're building for web, mobile, or even IoT, we've got you covered.
 
Find your guide here:
| Name | Link | 
|---|---|
| Web | |
| JavaScript Quickstart | [Doc] | 
| React Quickstart | [Doc] | 
| Mobile | |
| React Native Quickstart | [Doc] | 
| iOS Quickstart | [Doc] | 
| Flutter Quickstart | [Doc] | 
| Unity Quickstart | [Doc] | 
| Physical | |
| IoT Quickstart | [Doc] | 
Guide: Build an AI Voice Agent with a RAG Pipeline 
- Take your AI agents to the next level by connecting them to your own knowledge base. Our latest guide walks you through building an AI voice agent with a Retrieval-Augmented Generation (RAG) pipeline, so it can answer questions based on your custom documents and data.
 - 📖 Read the RAG pipeline guide
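
As a rough picture of what the retrieval step in a RAG pipeline does, here is a dependency-free Python sketch. It is not the code from the guide: the `DOCUMENTS` list and all function names are hypothetical, and it swaps in simple bag-of-words cosine similarity where a production pipeline would typically use embeddings and a vector store.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; in practice these would be chunks of your documents.
DOCUMENTS = [
    "Refunds are processed within 5 business days of the request.",
    "Our support team is available Monday to Friday, 9am to 6pm.",
    "Premium plans include priority support and call recording.",
]


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 1) -> str:
    """Augment the user's question with the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

In a voice agent, the query would be the transcribed user speech, and the prompt returned by `build_prompt` would be sent to the LLM before its answer is spoken back.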
 
Guide: Handle Speech with the Namo Turn Detection Model 
- Ready to put our powerful new Namo model into action? This practical guide provides the code and step-by-step instructions you need to implement Namo in your own AI voice agents for seamless, human-like turn-taking.
 - 📖 Read the Namo implementation guide
 
The WhatsApp AI Voice Agent Quickstart 
- You can now build and deploy AI voice agents that answer WhatsApp Business calls instantly. Our new Quickstart Guide leverages direct SIP integration with the Meta Business Platform, removing the need for third-party telephony and enabling fast, seamless conversational automation.
 - 📖 Get Started with the WhatsApp Quickstart
 
🔮 What's Next? 
This is just the beginning! We're already hard at work on our next set of features, including expanding our AI capabilities and improving SDK performance across the board. Stay tuned for more updates next month!
✨ Community Spotlight 
Hear how the team at Fi money is enhancing their customer experience using VideoSDK.
SDK Sketches 
This month's sketch: The difference between a frustrating, robotic conversation and one powered by Namo.
That’s a wrap for this month! Upgrade your SDKs to the latest versions to take advantage of all these new features and improvements.
We'd love to hear your feedback! If you have any questions, suggestions, or issues, please don't hesitate to contact our support team.
➡️ New to VideoSDK? Sign up now and get 10,000 free minutes to start building amazing audio & video experiences!
