Announcing VideoSDK Inference: One Magic API for Every Voice AI Model
We’re thrilled to announce Inferencing in VideoSDK AI Voice Agents, a unified way to run STT, LLM, TTS, and Realtime models directly inside your voice pipeline without managing multiple accounts. It is available through the Agent Runtime Dashboard and the Python Agents SDK.
Introducing VideoSDK Phone Numbers: Build AI Call Agents in 60 seconds
Today, we’re launching VideoSDK Phone Numbers, a first-party telephony capability that lets you connect AI voice agents directly to the phone network.
Introducing the Ultravox Realtime Plugin in VideoSDK
Learn more about building real-time voice agents with Ultravox and VideoSDK Agents, where listening, reasoning, and speaking happen together for ultra-low latency, natural conversations.
Introducing xAI Grok Real-Time Speech-to-Speech Plugin for VideoSDK Agents
Build real-time voice and text agents with xAI’s Grok, now natively integrated into VideoSDK Agents for multimodal, context-aware AI experiences.
Introducing the NVIDIA Speech to Text Plugin in VideoSDK
Learn how to integrate NVIDIA STT with the VideoSDK Agents SDK to generate fast, accurate, and production-ready transcriptions.
Introducing the Murf AI Text to Speech Plugin in VideoSDK
Learn how to integrate Murf AI Text-to-Speech with VideoSDK Agents to generate natural, expressive, and low-latency voice output for AI agents.
Introducing the NVIDIA Text to Speech Plugin in VideoSDK
Learn how to integrate NVIDIA Riva TTS with the VideoSDK Agents SDK to deliver real-time, low-latency speech that makes AI voice agents sound natural, responsive, and production-ready.
Introducing the Gladia Speech to Text Plugin in VideoSDK
We’re introducing the Gladia Speech-to-Text plugin for VideoSDK. With multilingual support, instant partial results, and handling of mixed languages, it provides a reliable speech input layer for voice-driven applications.
Introducing Testing and Evaluation in AI Voice Agents
Learn how to run testing and evaluation for AI voice agents using the VideoSDK Agents SDK, including STT, LLM, and TTS benchmarking, latency metrics, and LLM-based response judging.