Technology Partner
Stream provides developer-friendly APIs and SDKs for real-time chat, video, audio, feeds, and AI-powered moderation, powering in-app communication for 1B+ end users. Vision Agents is Stream's open-source framework for adding real-time vision and voice AI to live communication. Deepgram plugs in natively as the speech-to-text (STT) provider, delivering fast, accurate real-time transcription and diarization inside Vision Agents workflows.
For product managers and feature teams, this means adding live captions, voice search, and AI-powered conversation summaries to a video-call or in-app messaging product becomes a configuration change rather than a six-week build. Vision Agents handles the orchestration; Deepgram provides speech recognition and speech synthesis designed for real-time production use.
Vision Agents v0.2 ships with broad model support out of the box, including Deepgram, OpenAI Realtime, and Gemini integrations, along with ongoing improvements to latency, audio handling, and video handling. Recent launches in the Stream + Deepgram ecosystem include real-time AI character chat experiences (Lemon Slice Live) that combine streaming transcription with character voices driven by text-to-speech (TTS).
If you are building communication features on Stream and want voice intelligence without standing up a separate transcription pipeline, Vision Agents ships with Deepgram already wired in as the speech provider. The framework is open source and the developer docs are linked below.

Media Transcription
Contact Centers
Conversational AI