Prem AI added real-time voice to its sovereign AI ecosystem by self-hosting Deepgram Nova-3 inside Trusted Execution Environments, giving regulated enterprises low-latency, multilingual STT that never leaves their trusted infrastructure.
Prem AI builds a private, sovereign AI ecosystem for organizations that need complete control over their data and model weights, with infrastructure primarily hosted in Switzerland. The platform lets customers build AI-native applications and multi-agent workflows on sovereign infrastructure (on-prem, VPC, air-gapped) with customer-held keys and strict residency controls.
Visit: Prem AI
Industry
Private / Sovereign AI Infrastructure
Business Needs
Operationalize AI digital employees under strict sovereignty, security, and compliance constraints; add voice as a first-class interface to agents without leaking data outside customer-trusted environments.
Solution
Prem AI is building a private, sovereign AI ecosystem for organizations that need complete control over their data and model weights, with infrastructure primarily hosted in Switzerland.
To add real-time conversation into that stack, Prem turned to Deepgram and self-hosted Nova-3, giving its customers low-latency, multilingual speech-to-text that never leaves their trusted environment.
Prem AI exists to solve a structural gap. Most enterprises can't safely operationalize AI in production when they're bound by strict sovereignty, security, and compliance requirements.
Most AI platforms assume cloud-first, multi-tenant architectures. That model fails for Prem's customers: regulators, healthcare providers, financial institutions, and other sensitive environments. They need AI that runs on sovereign infrastructure (on-prem, VPC, air-gapped), with customer-held keys, full data control, and clearly enforceable residency and jurisdictional boundaries.
Prem's platform lets customers build AI-native applications and multi-agent workflows directly on that kind of infrastructure, turning proprietary data into digital employees that can operate safely at scale.
As those agents mature, interaction can't stay text-only. Many of Prem's most important workflows (support, operations, field work, internal productivity) are conversational and real-time by nature. Voice is increasingly the most natural interface, especially in secure networks where browser-only chat is limited, in mobile environments, and in partially connected contexts.
Prem needed a way to add real-time voice to that ecosystem without compromising its core promise of sovereign AI that never leaks data. Cloud-only STT providers weren't an option. Self-hosting open-source models like Whisper helped, but diarization quality and EU language accuracy fell short of Prem's bar, especially in noisy real-world enterprise audio.
Prem selected Deepgram Nova-3 Base as its primary speech-to-text engine for voice workloads.
Four factors drove the choice. First, Nova-3 delivered strong accuracy on English and EU languages, including in noisy enterprise environments, which mattered for Prem's Europe-heavy customer base. Second, Deepgram's native speaker diarization outperformed the self-hosted Whisper plus pyannote combinations Prem had tested, especially on multi-speaker conversations in regulated settings. Third, Nova-3's streaming performance met Prem's turn-level latency targets for voice agents, which range from under 300 ms to 500 ms. Finally, Deepgram's deployment model lets Prem run Nova-3 entirely inside its own private inference stack, in line with Prem's stateless-by-design and hold-your-own-keys principles.
In internal benchmarks, Nova-3 proved more accurate and more reliable on real-world calls than Prem's prior self-hosted stack, particularly for diarization and multilingual EU audio. The combination of accuracy, diarization, latency, and self-hosting made Deepgram the clear fit.
Prem deploys Deepgram as a self-hosted, on-prem solution inside its Trusted Execution Environments. Data exists only in encrypted memory during inference, and nothing is logged or stored by default. Deepgram services are containerized and orchestrated via Kubernetes across Prem's infrastructure, matching Deepgram's own high-availability guidance. Nova-3 runs within the Prem Compute Stack alongside Prem's LLM orchestration, tool execution, and logging layers.
Customers never call Deepgram directly. They use a proprietary Prem SDK that sends audio to private REST endpoints inside Prem's stack. The SDK is implemented in Node.js and also supports browser runtimes, so Prem's customers can add secure audio capture to their applications without exposing traffic to third-party clouds.
The result: Deepgram looks and feels like a native capability of Prem's platform, a sovereign voice layer that lives wherever Prem is deployed.
A typical Deepgram-powered workflow inside Prem starts with audio in. Users upload files or stream audio via WebSocket from voice mode features embedded in Prem-powered applications. Prem's SDK sends the audio to private inference APIs hosted on the Prem Compute Stack, where Deepgram Nova-3 transcribes in real time or batch depending on the use case. Features in active use include speaker diarization, PII redaction, and language detection.
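Prem's SDK is proprietary, so the request it sends is not public. As a rough illustration only, the sketch below builds the kind of transcription URL a self-hosted Deepgram deployment would accept for the features listed above. The query parameter names (model, diarize, redact, detect_language, language) follow Deepgram's public /v1/listen API; the hostname, the function name, and the idea that Prem's private endpoints expose these parameters unchanged are all assumptions.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical private endpoint; the real Prem hostnames and routes are not public.
PRIVATE_STT_BASE = "https://stt.prem.internal.example/v1/listen"


def build_listen_url(base: str = PRIVATE_STT_BASE,
                     language_hint: Optional[str] = None) -> str:
    """Build a Deepgram-style transcription URL with the features in active use."""
    params = {
        "model": "nova-3",   # speech model named in the case study
        "diarize": "true",   # label each word with a speaker index
        "redact": "pii",     # mask personally identifiable information
    }
    if language_hint is None:
        # Assumption: no hint means the service should detect the language.
        params["detect_language"] = "true"
    else:
        # The caller already knows the language, so pin it explicitly.
        params["language"] = language_hint
    return f"{base}?{urlencode(params)}"


url_auto = build_listen_url()
url_german = build_listen_url(language_hint="de")
```

Keeping the feature flags in one builder means a batch upload and a streaming session could share the same configuration, with only the transport differing.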
The output is a diarized conversation transcript that flows into Prem's broader agent graphs: LLM orchestration, tool calls, audit, and logging. Voice becomes a first-class modality alongside text and documents. Builders wire up voice-enabled agents for regulated environments. Compliance teams get structured, searchable transcripts for review and analytics. All of it runs inside Prem's sovereign AI ecosystem with no compromise on privacy.
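To make "diarized conversation transcript" concrete, here is a minimal sketch of how word-level diarization output could be collapsed into speaker turns before it reaches an agent graph. It assumes each word is a dictionary with word, start, end, and speaker keys (plus an optional punctuated_word), which mirrors the shape of Deepgram's diarized word list; the Turn class and words_to_turns function are hypothetical names, not part of Prem's or Deepgram's SDKs.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Turn:
    speaker: int   # speaker index assigned by diarization
    start: float   # turn start time in seconds
    end: float     # turn end time in seconds
    text: str      # words spoken during the turn


def words_to_turns(words: Iterable[Dict]) -> List[Turn]:
    """Merge consecutive words from the same speaker into one turn."""
    turns: List[Turn] = []
    for w in words:
        token = w.get("punctuated_word", w["word"])
        if turns and turns[-1].speaker == w["speaker"]:
            # Same speaker is still talking: extend the current turn.
            turns[-1].end = w["end"]
            turns[-1].text += " " + token
        else:
            # Speaker changed (or first word): open a new turn.
            turns.append(Turn(w["speaker"], w["start"], w["end"], token))
    return turns


sample_words = [
    {"word": "hello", "start": 0.0, "end": 0.4, "speaker": 0},
    {"word": "there", "start": 0.4, "end": 0.8, "speaker": 0},
    {"word": "hi", "start": 1.0, "end": 1.2, "speaker": 1},
    {"word": "okay", "start": 1.5, "end": 1.9, "speaker": 0},
]
sample_turns = words_to_turns(sample_words)
```

Turn-level records like these are easier to search and audit than a flat word stream, which is the property compliance reviewers would care about.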
Prem uses both real-time streaming and pre-recorded batch processing. Early deployments are architected for up to ~50 concurrent voice sessions and roughly 2,000 hours of audio per month, with room to scale as customer adoption grows.
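A quick back-of-the-envelope calculation shows how those two figures relate. The sketch assumes a 30-day month and treats every audio hour as one session-hour, which ignores batch jobs that may process faster than real time; both are simplifications, not numbers from Prem.

```python
# Assumption: a 30-day month with the service available around the clock.
HOURS_PER_MONTH = 30 * 24


def average_concurrency(audio_hours_per_month: float) -> float:
    """Average number of sessions active at once if load were perfectly even."""
    return audio_hours_per_month / HOURS_PER_MONTH


def peak_to_average(peak_sessions: int, audio_hours_per_month: float) -> float:
    """How many times larger the provisioned peak is than the even-load average."""
    return peak_sessions / average_concurrency(audio_hours_per_month)


avg_sessions = average_concurrency(2000)   # roughly 2.8 sessions on average
headroom = peak_to_average(50, 2000)       # roughly 18x the average
```

Under these assumptions, sizing for about 50 concurrent sessions leaves substantial room for bursty, business-hours traffic on top of an average load of fewer than three sessions.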
Beyond the platform API, Prem also productizes Deepgram in a direct-to-user application: Sotto, a push-to-talk Mac dictation app powered by Deepgram Nova-3.
Sotto is a system-wide dictation tool. Hold the right Option key, speak, release, and your words appear at the cursor in any Mac app. It delivers sub-200ms transcription latency, end-to-end encryption, and zero data retention: audio is encrypted on-device, processed inside a dedicated Trusted Execution Environment on Prem's infrastructure, and discarded the moment transcription completes. Nothing is stored. Nothing is logged. Nothing is used for training. The app was built for professionals whose workflows demand both speed and confidentiality: developers who don't want their code discussions hitting third-party servers, clinicians who need HIPAA-compliant dictation without a six-month procurement cycle, legal teams where client confidentiality is table stakes, and founders who think faster than they type.
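The hold, speak, release interaction can be pictured as a small state machine. The sketch below is an illustration of that pattern only, not Sotto's implementation: the class and method names are hypothetical, and it leaves out on-device encryption, microphone capture, and the network call to the inference stack.

```python
from typing import List, Optional


class PushToTalkBuffer:
    """Collects audio only while the hotkey is held, then hands it off once."""

    def __init__(self) -> None:
        self._recording = False
        self._chunks: List[bytes] = []

    def key_down(self) -> None:
        # Start a fresh recording; drop anything left from a previous one.
        self._chunks.clear()
        self._recording = True

    def feed(self, chunk: bytes) -> None:
        # Audio that arrives while the key is up is ignored.
        if self._recording:
            self._chunks.append(chunk)

    def key_up(self) -> Optional[bytes]:
        # Stop recording and return the captured audio exactly once.
        if not self._recording:
            return None
        self._recording = False
        audio = b"".join(self._chunks)
        self._chunks.clear()  # nothing stays buffered after hand-off
        return audio or None


ptt = PushToTalkBuffer()
ptt.feed(b"ignored")          # key not held yet, so this is discarded
ptt.key_down()
ptt.feed(b"ab")
ptt.feed(b"cd")
captured = ptt.key_up()       # audio to send for transcription
second_release = ptt.key_up() # releasing again yields nothing
```

Clearing the buffer at hand-off mirrors, on the client side, the zero-retention behavior the paragraph describes for the backend.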
Sotto runs on the same sovereign architecture as Prem's enterprise platform. Users authenticate with a Prem API key, audio routes through the same isolated inference stack, and the backend is fully self-hostable for organizations that want to deploy it on their own infrastructure. It is a concrete, shipped demonstration that sovereign voice AI can be as fast and accurate as any cloud alternative, and it's built on Deepgram.
Prem's promise to customers is simple: don't trust, verify. The platform is designed so that data exists only in encrypted memory during inference, is never logged or stored by default, and is governed by customer-held keys under strict jurisdictional controls.
Deepgram fits into that model. Trusted Execution Environments protect customer data in use, including audio and transcripts processed by Nova-3. End-to-end encryption is enforced, with every token encrypted in transit and at rest using customer-managed keys. Data residency is enforced per customer, with deployments primarily in Switzerland for strong sovereignty and regulatory alignment. Audio is processed transiently and not retained beyond the duration of the request.
Prem executes Data Processing Agreements with customers as needed and is actively pursuing SOC 2 certification, with plans to expand into additional frameworks over time. More detail is available at trust.prem.io.
By self-hosting Deepgram, Prem adds best-in-class STT to this security posture without introducing a new data processor or sending audio outside the environment customers already trust.
Prem is still early in its Deepgram rollout, so the emphasis is on capability and fit rather than public metrics.
Even before large-scale customer stats are ready to publish, Deepgram is enabling Prem to ship voice-enabled, privacy-preserving workflows that feel consistent with the rest of its private AI ecosystem.
Prem describes the collaboration with Deepgram as straightforward and responsive. When questions have come up around deployment and configuration, Deepgram's team has been quick to provide answers, helping Prem move from prototype to self-hosted deployment without unnecessary friction.
Because the rollout is still early, Prem hasn't yet needed heavy iteration on model configs or large-scale tuning playbooks. Both teams expect deeper collaboration over time as voice volumes grow, more languages come online, and additional regulated workloads move from pilot to production.
Prem sees voice becoming the primary interface in many of the environments it serves. Healthcare is a clear example: hands-free conversational interfaces reduce friction and cognitive load for clinicians. Internal productivity is another: employees interacting with sovereign AI agents by speaking instead of typing, without worrying about where their data goes.
In the near term, Prem plans to ship more voice-native features across its applications and expand into additional European regions with multilingual, low-latency voice running on the Swiss-hosted sovereign stack. As voice volumes grow and more languages come online, both teams expect deeper collaboration on model tuning and deployment optimization.
For Deepgram, the Prem integration opens a reference path into sovereignty-conscious markets: regulated industries, European enterprises, and any organization where 'your model runs on our infrastructure' is a hard requirement.
Simone Giacomelli, CEO and Founder, Prem AI:
"Across many conversations, especially in healthcare, we're seeing a clear pattern: voice is becoming the primary interface, as long as it's private by design. The integration with Deepgram is unlocking real-time workflows while preserving the strict confidentiality these environments demand."
For voice AI vendors considering sovereign deployments, Prem's position is clear:
"We deliver complete private AI to enterprises and consumers. For us, voice AI vendors should be collaborative partners who help us bring state-of-the-art intelligence to end users who value privacy."
For Prem, Deepgram is one of the key components turning that vision into something enterprises can run today: sovereign, verifiable AI agents that can finally speak the languages of their most regulated, privacy-sensitive users.
