By Zian (Andy) Wang

AI Content Fellow

Last Updated

Nov 1, 2024

In this article, we are going to transform an ordinary platformer game into one that can be controlled by your voice using Deepgram’s API. The focus here is more on how Deepgram can be integrated into the mechanics of a game and less on how to program a platformer in Python.

The game we will be working with is an infinite scrolling 2D platformer, where the player is initially controlled by a single key (the Up arrow in the code below). The length of the key press determines the height and distance of the player’s jump. The game features platforms for the player to land on, and the player loses if they fall between the gaps in the platforms.

The base game itself is a Python/Pygame rewrite of a side project I worked on in JavaScript. We are not going to dive into the details of how the game is written, but rather provide a high-level overview of the different components that are involved in the functioning of the script.

Base Game Overview

There are three classes in the game: Player, Platform, and Game. To start, we will import the required libraries and define a few constants.

import pygame
import random
import math
# Used later for the added audio features
import pyaudio
import wave
import audioop  # deprecated since Python 3.11 and removed from the standard library in 3.13
import time
import requests
import json
from collections import deque
import threading

# Constants
WIDTH, HEIGHT = 1200, 700
GRAVITY = 0.65
MIN_PLATFORM_WIDTH, MAX_PLATFORM_WIDTH = 100, 300
MIN_GAP_WIDTH, MAX_GAP_WIDTH = 45, 200
MAX_HEIGHT_DIFFERENCE = 100
MIN_PLATFORM_HEIGHT = 75
MOVING_PLATFORM_BUFFER = 80
SINKING_PLATFORM_SCORE = 20
MOVING_PLATFORM_SCORE = 40
DIFFICULTY_INCREASE_INTERVAL = 20

# Colors
COLORS = {
    'background': (224, 224, 224),
    'player': (74, 74, 74),
    'player_charged': (140, 110, 9),
    'static_platform': (109, 109, 109),
    'moving_platform': (90, 125, 154),
    'sinking_platform': (154, 90, 90),
    'text': (50, 50, 50),
    'highlight': (255, 165, 0)
}

The Player Class

The Player class is responsible for handling and updating all things related to the character on the screen — represented as a cube.

The update() method updates the player’s position based on its velocity, applies gravity, checks for collisions with platforms, and finally handles the jumping (when the key is released) and charging (while the key is held down) states.
The draw() method handles the player’s visuals: it puts the cube on the game window and changes its appearance while charging, giving a squeezing and glowing effect.

The class is shown below:

class Player:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.width = self.original_width = 35
        self.height = self.original_height = 35
        self.speed = 10
        self.jump_force = 0
        self.max_jump_force = 32
        self.charge_rate = 0.55
        self.velocity_y = 0
        self.velocity_x = 0
        self.is_jumping = False
        self.is_charging = False
        self.squeeze_factor = 1.0
        self.rotation = 0
        self.rotation_speed = 0

    def update(self, platforms):
        if self.is_charging and not self.is_jumping:
            self.jump_force = min(self.jump_force + self.charge_rate, self.max_jump_force)
            charge_progress = self.jump_force / self.max_jump_force
            self.squeeze_factor = 1 - (charge_progress * 0.3)
        elif not self.is_charging and self.squeeze_factor < 1.0:
            self.squeeze_factor = min(self.squeeze_factor + 0.1, 1.0)

        self.velocity_y += GRAVITY
        self.y += self.velocity_y
        self.x += self.velocity_x

        if self.is_jumping:
            self.rotation += self.rotation_speed
            self.rotation_speed *= 0.99
        else:
            self.rotation *= 0.8

        on_platform = False
        for platform in platforms:
            if self.check_platform_collision(platform):
                on_platform = True
                self.y = HEIGHT - platform.platform_height - platform.height - self.height
                self.velocity_y = 0
                self.is_jumping = False
                self.velocity_x = 0
                self.rotation = 0

                if platform.is_moving:
                    self.x += platform.move_speed

                if platform.is_sinking:
                    platform.sink_delay -= 16
                    if platform.sink_delay <= 0:
                        platform.height -= platform.sink_speed
                        self.y += platform.sink_speed
                        if platform.height <= -platform.platform_height:
                            on_platform = False
                            self.is_jumping = True

        if not on_platform and not self.is_jumping:
            self.is_jumping = True

        if self.is_jumping:
            self.velocity_x *= 0.995

    def check_platform_collision(self, platform):
        return (self.y + self.height >= HEIGHT - platform.platform_height - platform.height and
                self.y + self.height <= HEIGHT - platform.height and
                self.x + self.width > platform.x and
                self.x < platform.x + platform.width)

    def draw(self, screen):
        squeezed_width = int(self.original_width * (2 - self.squeeze_factor))
        squeezed_height = int(self.original_height * self.squeeze_factor)

        charge_progress = self.jump_force / self.max_jump_force
        r = int(COLORS['player'][0] + (COLORS['player_charged'][0] - COLORS['player'][0]) * charge_progress)
        g = int(COLORS['player'][1] + (COLORS['player_charged'][1] - COLORS['player'][1]) * charge_progress)
        b = int(COLORS['player'][2] + (COLORS['player_charged'][2] - COLORS['player'][2]) * charge_progress)
        player_color = (r, g, b)

        player_surface = pygame.Surface((squeezed_width, squeezed_height), pygame.SRCALPHA)
        pygame.draw.rect(player_surface, player_color, (0, 0, squeezed_width, squeezed_height))

        rotated_surface = pygame.transform.rotate(player_surface, self.rotation)
        new_rect = rotated_surface.get_rect(midbottom=(self.x + self.width//2, self.y + self.height))

        screen.blit(rotated_surface, new_rect.topleft)
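
To get a feel for the numbers in the Player class, here is a minimal standalone sketch (no Pygame needed) that replays the same per-frame arithmetic. It counts how many frames of charging it takes to reach max_jump_force, then steps a jump until the player returns to its starting height. The function names frames_to_full_charge and simulate_jump are made up for this illustration, and landing at the launch height is a simplifying assumption; in the game, the landing height depends on the next platform.

```python
GRAVITY = 0.65
MAX_JUMP_FORCE = 32
CHARGE_RATE = 0.55


def frames_to_full_charge():
    """Count update() calls needed for jump_force to reach the cap."""
    jump_force, frames = 0.0, 0
    while jump_force < MAX_JUMP_FORCE:
        jump_force = min(jump_force + CHARGE_RATE, MAX_JUMP_FORCE)
        frames += 1
    return frames


def simulate_jump(jump_force):
    """Step a jump frame by frame; return (frames in the air, horizontal distance)."""
    velocity_y = -jump_force          # launch values used when the key is released
    velocity_x = jump_force * 0.55
    x = y = 0.0                       # y = 0 is the launch height, up is negative
    frames = 0
    while True:
        velocity_y += GRAVITY
        y += velocity_y
        x += velocity_x
        velocity_x *= 0.995           # drag applied while is_jumping is True
        frames += 1
        if y >= 0:                    # back at (or below) the launch height
            return frames, x


if __name__ == "__main__":
    print("frames to full charge:", frames_to_full_charge())
    for force in (8, 16, 32):
        frames, distance = simulate_jump(force)
        print(f"force {force}: {frames} frames in the air, {distance:.0f} px travelled")
```

At 60 frames per second, a full charge takes 59 frames, or just under one second of holding the key.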

The Platform Class

The Platform class deals with everything about the platforms the player jumps on — they come in three flavors: static, moving, and sinking.

The update() method is where the magic happens for moving and sinking platforms. For moving platforms, it updates their horizontal position and flips their direction when they hit their movement limits. Sinking platforms start their descent after a short delay when the player lands on them.
The draw() method slaps the platform onto the game window, using different colors to distinguish between the three types — grey for static, blue for moving, and red for sinking platforms.

Here is the Platform class.

class Platform:
    def __init__(self, x, width, height, is_sinking=False, is_moving=False):
        self.x = self.original_x = x
        self.width = width
        self.height = self.original_height = height
        self.platform_height = 15 if is_moving else 30
        self.is_sinking = is_sinking
        self.sink_delay = 2100
        self.sink_speed = 1
        self.is_moving = is_moving
        self.move_distance = random.randint(90, MIN_GAP_WIDTH + MOVING_PLATFORM_BUFFER) if is_moving else 0
        self.move_speed = random.choice([-1, 1]) * (random.random() * 1.2 + 0.6) if is_moving else 0
        self.min_x = x
        self.max_x = x + self.move_distance

    def update(self):
        if self.is_moving:
            self.x += self.move_speed
            if self.x <= self.min_x or self.x + self.width >= self.max_x:
                self.move_speed *= -1
                self.x = max(self.min_x, min(self.x, self.max_x - self.width))

    def draw(self, screen):
        color = COLORS['sinking_platform'] if self.is_sinking else COLORS['moving_platform'] if self.is_moving else COLORS['static_platform']
        pygame.draw.rect(screen, color, (self.x, HEIGHT - self.platform_height - self.height, self.width, self.platform_height))
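
The bounce logic in update() can be checked without opening a window. The sketch below is a stripped-down stand-in for a moving platform that uses the same update rule and records the positions it visits. The class name MovingPlatformSketch, the helper visited_positions, and the sample numbers are illustrative and not part of the game.

```python
class MovingPlatformSketch:
    """Only the fields that Platform.update() touches."""

    def __init__(self, x, width, move_distance, move_speed):
        self.x = x
        self.width = width
        self.move_speed = move_speed
        self.min_x = x
        self.max_x = x + move_distance

    def update(self):
        self.x += self.move_speed
        # Reverse direction at either limit and clamp back inside the range
        if self.x <= self.min_x or self.x + self.width >= self.max_x:
            self.move_speed *= -1
            self.x = max(self.min_x, min(self.x, self.max_x - self.width))


def visited_positions(platform, frames):
    """Run update() for a number of frames and collect the left-edge positions."""
    positions = []
    for _ in range(frames):
        platform.update()
        positions.append(platform.x)
    return positions


if __name__ == "__main__":
    platform = MovingPlatformSketch(x=100, width=85, move_distance=110, move_speed=1.5)
    xs = visited_positions(platform, 200)
    print(f"left edge stayed within [{min(xs)}, {max(xs)}]")
```

Note that the left edge only travels move_distance minus width pixels, so a platform as wide as its move_distance would not move at all.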

The Game Class

The Game class chains everything together, orchestrating the whole show and keeping all the game elements in check.

The update() method is the heart of the game loop. It updates the player and all platforms, shifts the world when the player moves past the middle of the screen, kicks out off-screen platforms, spawns new ones, and checks if the player has fallen to their doom.
The draw() method is responsible for putting everything on the screen. It draws all platforms, the player, and the UI stuff like score and difficulty level.
The generate_platform() method is the platform factory, cranking out new platforms with random properties based on constants defined in the script as the game progresses. It’s where the difficulty setting comes into play, determining how likely you are to get those tricky moving or sinking platforms.
The run() method manages the main game loop. It handles events (like quitting or jumping), updates the game state, draws everything, and keeps the whole thing running at a smooth 60 frames per second.

Here is the Game class.

class Game:
    def __init__(self):
        pygame.init()
        self.screen = pygame.display.set_mode((WIDTH, HEIGHT))
        pygame.display.set_caption("Infinite Platformer")
        self.clock = pygame.time.Clock()
        
        pygame.font.init()
        self.font = pygame.font.Font(None, 36)
        self.large_font = pygame.font.Font(None, 72)
        self.title_font = pygame.font.Font(None, 100)
        
        self.high_score = 0
        self.reset_game()

    def reset_game(self):
        self.player = Player(100, 0)
        self.platforms = []
        self.game_over = False
        self.total_distance = 0
        self.difficulty = {'current_level': 0, 'sinking_platform_prob': 0.27, 'moving_platform_prob': 0.27}
        self.generate_initial_platforms()

    def generate_initial_platforms(self):
        self.generate_platform(0, is_first=True)
        while self.platforms[-1].x + self.platforms[-1].width < WIDTH * 2:
            last_platform = self.platforms[-1]
            gap_width = random.randint(MIN_GAP_WIDTH, MAX_GAP_WIDTH)
            self.generate_platform(last_platform.x + last_platform.width + gap_width)

        first_platform = self.platforms[0]
        self.player.x = first_platform.x + first_platform.width // 2 - self.player.width // 2
        self.player.y = HEIGHT - first_platform.platform_height - first_platform.height - self.player.height

    def generate_platform(self, x, is_first=False):
        width = 200 if is_first else random.randint(MIN_PLATFORM_WIDTH, MAX_PLATFORM_WIDTH)
        height = MIN_PLATFORM_HEIGHT if is_first else random.randint(MIN_PLATFORM_HEIGHT, MAX_HEIGHT_DIFFERENCE + MIN_PLATFORM_HEIGHT)
        is_sinking = not is_first and random.random() < self.difficulty['sinking_platform_prob']
        is_moving = not is_first and random.random() < self.difficulty['moving_platform_prob']

        adjusted_width = min(width, MIN_GAP_WIDTH + MOVING_PLATFORM_BUFFER // 2) if is_moving else width
        x = x + MOVING_PLATFORM_BUFFER if is_moving else x

        self.platforms.append(Platform(x, adjusted_width, height, is_sinking, is_moving))

    def update_difficulty(self):
        score = self.total_distance // 100
        new_level = score // DIFFICULTY_INCREASE_INTERVAL
        if new_level > self.difficulty['current_level']:
            self.difficulty['current_level'] = new_level
            self.difficulty['sinking_platform_prob'] = 0.27 + new_level * 0.03
            self.difficulty['moving_platform_prob'] = 0.27 + (new_level - 2) * 0.03 if score >= MOVING_PLATFORM_SCORE else 0

    def update(self):
        if self.game_over:
            return

        self.update_difficulty()
        self.player.update(self.platforms)

        if self.player.velocity_x > 0:
            self.total_distance += self.player.velocity_x

        if self.player.x > WIDTH // 2:
            diff = self.player.x - WIDTH // 2
            self.player.x = WIDTH // 2
            for platform in self.platforms:
                platform.x -= diff
                platform.original_x -= diff
                platform.min_x -= diff
                platform.max_x -= diff

        if self.player.y + self.player.height > HEIGHT:
            self.game_over = True
            return

        for platform in self.platforms:
            platform.update()

        self.platforms = [p for p in self.platforms if p.x + p.width > 0]
        
        if self.platforms[-1].x + self.platforms[-1].width < WIDTH * 2:
            gap_width = random.randint(MIN_GAP_WIDTH, MAX_GAP_WIDTH)
            self.generate_platform(self.platforms[-1].x + self.platforms[-1].width + gap_width)

    def draw(self):
        self.screen.fill(COLORS['background'])

        for platform in self.platforms:
            if platform.x + platform.width > 0 and platform.x < WIDTH:
                platform.draw(self.screen)

        self.player.draw(self.screen)

        self.draw_ui()

        pygame.display.flip()

    def draw_ui(self):
        # Current score
        score = self.total_distance // 100
        score_text = self.font.render(f"Score: {score}", True, COLORS['text'])
        score_rect = score_text.get_rect(topright=(WIDTH - 20, 20))
        self.screen.blit(score_text, score_rect)

        # High score
        high_score_text = self.font.render(f"High Score: {self.high_score}", True, COLORS['text'])
        high_score_rect = high_score_text.get_rect(topright=(WIDTH - 20, 60))
        self.screen.blit(high_score_text, high_score_rect)

        # Difficulty level
        difficulty_text = self.font.render(f"Level: {self.difficulty['current_level']}", True, COLORS['text'])
        difficulty_rect = difficulty_text.get_rect(topleft=(20, 20))
        self.screen.blit(difficulty_text, difficulty_rect)

        if self.game_over:
            self.draw_game_over()

    def draw_game_over(self):
        overlay = pygame.Surface((WIDTH, HEIGHT), pygame.SRCALPHA)
        overlay.fill((0, 0, 0, 128))  # Semi-transparent black overlay
        self.screen.blit(overlay, (0, 0))

        game_over_text = self.title_font.render("Game Over", True, COLORS['highlight'])
        game_over_rect = game_over_text.get_rect(center=(WIDTH // 2, HEIGHT // 2 - 50))
        self.screen.blit(game_over_text, game_over_rect)

        score = self.total_distance // 100
        final_score_text = self.large_font.render(f"Final Score: {score}", True, COLORS['text'])
        final_score_rect = final_score_text.get_rect(center=(WIDTH // 2, HEIGHT // 2 + 30))
        self.screen.blit(final_score_text, final_score_rect)

        if score > self.high_score:
            new_high_score_text = self.font.render("New High Score!", True, COLORS['highlight'])
            new_high_score_rect = new_high_score_text.get_rect(center=(WIDTH // 2, HEIGHT // 2 + 80))
            self.screen.blit(new_high_score_text, new_high_score_rect)

        restart_text = self.font.render("Press Up Arrow to Restart", True, COLORS['text'])
        restart_rect = restart_text.get_rect(center=(WIDTH // 2, HEIGHT // 2 + 130))
        self.screen.blit(restart_text, restart_rect)

    def run(self):
        running = True
        while running:
            for event in pygame.event.get():
                if event.type == pygame.QUIT:
                    running = False
                elif event.type == pygame.KEYDOWN:
                    if event.key == pygame.K_UP:
                        if self.game_over:
                            self.high_score = max(self.high_score, self.total_distance // 100)
                            self.reset_game()
                        elif not self.player.is_jumping and not self.player.is_charging:
                            self.player.is_charging = True
                            self.player.jump_force = 0
                elif event.type == pygame.KEYUP:
                    if event.key == pygame.K_UP and self.player.is_charging:
                        self.player.is_charging = False
                        self.player.is_jumping = True
                        self.player.velocity_y = -self.player.jump_force
                        self.player.velocity_x = self.player.jump_force * 0.55
                        self.player.jump_force = 0
                        self.player.rotation_speed = -0.1

            self.update()
            self.draw()
            self.clock.tick(60)

        pygame.quit()
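
The difficulty curve is easier to see in isolation. This sketch reproduces the arithmetic of update_difficulty() as a pure function; the name platform_probabilities is hypothetical, and the level-0 values are the defaults set in reset_game().

```python
DIFFICULTY_INCREASE_INTERVAL = 20
MOVING_PLATFORM_SCORE = 40


def platform_probabilities(score):
    """Return (level, sinking_prob, moving_prob) once the player reaches `score`."""
    level = score // DIFFICULTY_INCREASE_INTERVAL
    if level == 0:
        # update_difficulty() changes nothing until the level rises,
        # so the reset_game() defaults still apply
        return 0, 0.27, 0.27
    sinking = 0.27 + level * 0.03
    moving = 0.27 + (level - 2) * 0.03 if score >= MOVING_PLATFORM_SCORE else 0
    return level, sinking, moving


if __name__ == "__main__":
    for score in (0, 20, 40, 100, 200):
        level, sinking, moving = platform_probabilities(score)
        print(f"score {score:>3}: level {level}, sinking {sinking:.2f}, moving {moving:.2f}")
```

One thing this makes visible: with the code as shown, the moving-platform probability drops to 0 for scores 20 through 39 and returns at 0.27 once the score reaches 40.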

Putting it All Together

Running the game is as simple as initializing the Game class and calling its run method. The constants at the top of the script, which control everything from the window size to gravity to the platform generation parameters, must be defined before the classes. Here’s what the full layout looks like:

import pygame
import random

# Constants
WIDTH, HEIGHT = 1200, 700
GRAVITY = 0.65
MIN_PLATFORM_WIDTH, MAX_PLATFORM_WIDTH = 100, 300
MIN_GAP_WIDTH, MAX_GAP_WIDTH = 45, 200
MAX_HEIGHT_DIFFERENCE = 100
MIN_PLATFORM_HEIGHT = 75
MOVING_PLATFORM_BUFFER = 80
SINKING_PLATFORM_SCORE = 20
MOVING_PLATFORM_SCORE = 40
DIFFICULTY_INCREASE_INTERVAL = 20

# Colors
COLORS = {
    'background': (224, 224, 224),
    'player': (74, 74, 74),
    'player_charged': (140, 110, 9),
    'static_platform': (109, 109, 109),
    'moving_platform': (90, 125, 154),
    'sinking_platform': (154, 90, 90),
    'text': (50, 50, 50),
    'highlight': (255, 165, 0)
}

class Player:
    # player class as shown above
    pass

class Platform:
    # platform class as shown above
    pass

class Game:
    # game class as shown above
    pass
    
if __name__ == "__main__":
    game = Game()
    game.run()

The only dependency here is Pygame. With that installed, you can run the game, play around with the parameters yourself, and see how long you can last jumping across the infinite stretch of platforms (I got to 205!).

Adding the Deepgram Audio Magic

Here’s where the meat of the article comes in: implementing the audio controls. For Python, I’ve found that instead of using the streaming feature, which would be convenient, sticking to good old-fashioned transcription of an audio file is the most reliable. I experienced countless package and audio detection issues that were not necessarily related to Deepgram, but rather to my audio inputs and Python packages.

The premise of the audio “controls” is the number of times a trigger word is spoken: the more often the word is repeated in one utterance, the greater the player’s jump force.

For example, if the trigger word is “jump”, then the player would be sent flying if jump was repeated 10 times while a tiny jump would be executed if “jump” was only heard once.

Here’s how the audio mechanism is going to fit in the game:

Listening for audio: The game listens for audio from the input source (presumably a microphone) on a separate thread while the main game is running.
Audio detection: When the player starts speaking and the volume rises above a certain threshold, the script starts recording. The recording continues until a period of silence is detected, at which point it stops.
Post processing: The audio is amplified and saved into an audio file.
Speech to text: The file is then sent through Deepgram’s API for transcription, and the transcribed text is returned.
Command interpretation: The game looks at the text and counts the number of consecutive “jumps” (or whatever the trigger word is set to) present in the transcription.
Command execution: The game translates the number of trigger words into how high and far the player will jump, then executes the jump. While the player is in the air, audio recording is disabled until the player lands, to avoid conflicting inputs.

We define a new class, AudioProcessor, that inherits from threading.Thread, so it can run in a background thread without blocking the main game loop.

The meat of the class lies in the run method. It continuously checks if the player is ready for audio input. When ready, it records audio, transcribes it, counts jumps, and triggers the player’s jump action.

class AudioProcessor(threading.Thread):
    def __init__(self, game):
        threading.Thread.__init__(self)
        self.game = game
        self.daemon = True
        self.running = True
        print("AudioProcessor initialized")

    def run(self):
        print("AudioProcessor thread started")
        while self.running:
            if not self.game.player.is_jumping and not self.game.player.is_charging:
                print("Player ready for audio input")
                audio_file = self.save_audio(self.record_audio())
                print(f"Audio saved to {audio_file}")
                transcript = self.transcribe_audio(audio_file)
                if transcript:
                    print(f"Transcription: {transcript}")
                    jump_count = self.count_consecutive_jumps(transcript)
                    print(f"Detected {jump_count} consecutive jumps")
                    self.game.set_transcript(transcript, jump_count)
                    self.game.player.audio_jump(jump_count)
                else:
                    print("Transcription failed or returned None")
            else:
                print("Player is jumping or charging, skipping audio input")
            time.sleep(0.1)  # Short sleep to prevent CPU overuse
        print("AudioProcessor thread stopped")

Here’s what’s happening:

Player State Check: It first checks if the player is not jumping or charging. This prevents audio commands from interfering with ongoing actions.
Audio Recording: If the player is ready, it calls record_audio() to capture audio from the microphone.
Audio Saving: The recorded audio is immediately saved to a file using save_audio().
Transcription: The audio file is sent to Deepgram for transcription using transcribe_audio().
Command Interpretation: If transcription is successful, it counts the number of consecutive “jump” commands using count_consecutive_jumps().
Game State Update: The transcript and jump count are sent to the game to update its state.
Player Action: The audio_jump() method of the player is called with the jump count, triggering the jump action.
Thread Management: A small sleep is added to prevent the thread from consuming too much CPU.

The record_audio method handles the logic for detecting speech, starting the recording, and ending the recording when silence is detected. It uses a double-ended queue (deque) to hold the last few chunks of audio from before speech is detected, so the beginning of the utterance is not cut off. The recorded audio is amplified before being returned.

def record_audio(self):
    print("Starting audio recording")
    p = pyaudio.PyAudio()
    stream = p.open(format=FORMAT, channels=CHANNELS, rate=RATE, input=True, frames_per_buffer=CHUNK)
    
    frames, silent_chunks, is_recording = [], 0, False
    pre_buffer = deque(maxlen=PRE_BUFFER_SIZE)

    while True:
        data = stream.read(CHUNK)
        if not is_recording:
            pre_buffer.append(data)
            if audioop.rms(data, 2) >= SILENCE_THRESHOLD:
                print("Speech detected, starting recording")
                is_recording = True
                frames.extend(pre_buffer)
        else:
            if audioop.rms(data, 2) < SILENCE_THRESHOLD:
                silent_chunks += 1
                if silent_chunks > int(SILENCE_DURATION * RATE / CHUNK):
                    print("Silence detected, stopping recording")
                    break
            else:
                silent_chunks = 0
            frames.append(data)

    stream.stop_stream()
    stream.close()
    p.terminate()
    print("Audio recording finished")
    return audioop.mul(b''.join(frames), 2, AMPLIFICATION)
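
The pre-buffer and silence-counting logic can be tried on synthetic data without a microphone. The sketch below is a simplified stand-in rather than the article’s method: it computes RMS with the standard struct and math modules (an assumption made here because audioop is no longer in the standard library as of Python 3.13), and the names rms and segment_speech, along with the small default values, are illustrative.

```python
import math
import struct
from collections import deque


def rms(chunk: bytes) -> float:
    """Root-mean-square of 16-bit little-endian signed samples."""
    count = len(chunk) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", chunk[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def segment_speech(chunks, threshold=250, max_silent_chunks=3, pre_buffer_size=2):
    """Return the chunks kept by the same state machine record_audio() uses."""
    pre_buffer = deque(maxlen=pre_buffer_size)  # rolling window of pre-speech audio
    frames, silent_chunks, is_recording = [], 0, False
    for chunk in chunks:
        if not is_recording:
            pre_buffer.append(chunk)
            if rms(chunk) >= threshold:
                is_recording = True
                frames.extend(pre_buffer)       # keep the lead-in plus the loud chunk
        else:
            if rms(chunk) < threshold:
                silent_chunks += 1
                if silent_chunks > max_silent_chunks:
                    break                       # enough silence: stop recording
            else:
                silent_chunks = 0
            frames.append(chunk)
    return frames


if __name__ == "__main__":
    quiet = struct.pack("<4h", 0, 0, 0, 0)
    loud = struct.pack("<4h", 1000, -1000, 1000, -1000)
    kept = segment_speech([quiet] * 5 + [loud] * 2 + [quiet] * 6)
    print(f"kept {len(kept)} of 13 chunks")
```

With the article’s constants, each chunk covers 1024 / 16000 = 64 ms, so a PRE_BUFFER_SIZE of 10 keeps 640 ms of lead-in, and the silence cutoff int(SILENCE_DURATION * RATE / CHUNK) works out to 14 chunks.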

The save_audio method writes the recorded audio to a file:

def save_audio(self, audio_data, filename="recorded_audio.wav"):
    with wave.open(filename, 'wb') as wf:
        wf.setnchannels(CHANNELS)
        # Module-level helper; avoids creating a PyAudio instance that is never terminated
        wf.setsampwidth(pyaudio.get_sample_size(FORMAT))
        wf.setframerate(RATE)
        wf.writeframes(audio_data)
    print(f"Audio saved to {filename}")
    return filename

This method creates a WAV file with the recorded audio data, which can then be sent to Deepgram.
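
A quick way to confirm that the WAV parameters are consistent is to write a buffer and read it back. The sketch below does this in memory with the standard wave module, assuming the 16 kHz, mono, 16-bit settings from the audio constants; the helper names write_wav and wav_duration_seconds are illustrative.

```python
import io
import wave

RATE, CHANNELS, SAMPLE_WIDTH = 16000, 1, 2  # 2 bytes per sample matches paInt16


def write_wav(audio_data: bytes) -> bytes:
    """Wrap raw PCM bytes in a WAV container and return the file contents."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH)
        wf.setframerate(RATE)
        wf.writeframes(audio_data)
    return buffer.getvalue()


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Read the header back and compute the clip length."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


if __name__ == "__main__":
    one_second_of_silence = b"\x00\x00" * RATE
    print(wav_duration_seconds(write_wav(one_second_of_silence)), "seconds")
```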

The transcribe_audio method sends the audio file to Deepgram for transcription:

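def transcribe_audio(self, audio_file):
    print(f"Transcribing audio file: {audio_file}")
    headers = {
        "Authorization": f"Token {DEEPGRAM_KEY}",
        "Content-Type": "audio/wav",  # the request body is the raw WAV file
    }
    try:
        with open(audio_file, 'rb') as f:
            # The timeout keeps a stalled request from freezing the audio thread
            response = requests.post(DEEPGRAM_URL, headers=headers, data=f, timeout=30)
    except requests.RequestException as e:
        print(f"Transcription request failed: {e}")
        return None

    if response.status_code == 200:
        transcript = response.json()['results']['channels'][0]['alternatives'][0]['transcript']
        print(f"Transcription successful: {transcript}")
        return transcript
    else:
        print(f"Transcription error: Status code {response.status_code}")
        print(f"Response text: {response.text}")
        return None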

This method sends a POST request to Deepgram’s API with the audio file. If successful, it extracts and returns the transcribed text.
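
If the response is ever missing a channel or alternative, the chained indexing above raises an exception inside the audio thread. One defensive option, sketched here with a hypothetical helper named extract_transcript, is to treat any missing key as “no transcript”. The sample payload is hand-written and contains only the fields the method above reads; it is not real API output.

```python
def extract_transcript(payload):
    """Return the first transcript string, or None if the expected keys are missing."""
    try:
        return payload["results"]["channels"][0]["alternatives"][0]["transcript"]
    except (KeyError, IndexError, TypeError):
        return None


if __name__ == "__main__":
    sample = {"results": {"channels": [{"alternatives": [{"transcript": "jump jump jump"}]}]}}
    print(extract_transcript(sample))
    print(extract_transcript({"results": {"channels": []}}))
```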

The count_consecutive_jumps method interprets the transcribed text to determine the jump count:

def count_consecutive_jumps(self, transcript):
    words = transcript.lower().split()
    print(f"Counting jumps in words: {words}")
    count = 0
    for word in words:
        if word in ["jump", "jump.", "jump,", "go", "go,", "go.", "yep", "yep,", "yep."]:
            count += 1
        elif count > 0:
            break
    count = min(count, 10)  # Limit to 10 consecutive jumps
    print(f"Counted {count} consecutive jumps")
    return count

This method counts the first run of consecutive trigger words (“jump”, “go”, or “yep”) in the transcript, skipping any other words that come before it. Listing each word with and without trailing punctuation makes it tolerant of punctuation in the transcript, and the count is capped at 10.
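
As a variation, the punctuation variants do not have to be listed by hand. The sketch below strips leading and trailing punctuation from each word before comparing it against a set of trigger words. The name count_trigger_run and the TRIGGER_WORDS and MAX_JUMPS constants are assumptions for this example, not part of the original script.

```python
import string

TRIGGER_WORDS = {"jump", "go", "yep"}
MAX_JUMPS = 10


def count_trigger_run(transcript, triggers=TRIGGER_WORDS, limit=MAX_JUMPS):
    """Count the first run of consecutive trigger words, ignoring punctuation."""
    count = 0
    for word in transcript.lower().split():
        if word.strip(string.punctuation) in triggers:
            count += 1
        elif count > 0:
            break  # the run ended at the first non-trigger word
    return min(count, limit)


if __name__ == "__main__":
    for text in ("Jump, jump. Jump!", "uh jump jump now jump", ""):
        print(repr(text), "->", count_trigger_run(text))
```

This also accepts forms such as “jump!” or “jump?” that the hand-written list would miss.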

Integration with the Game

To integrate this audio system into the game, several modifications are made to the existing classes:

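We will initialize the AudioProcessor class at the end of the Game class’s __init__ method, after reset_game() has created the player. The transcript and jump count also get default values here, because draw_ui reads them before the first transcription arrives:

self.transcript, self.jump_count = "", 0  # shown until the first transcription arrives
self.audio_processor = AudioProcessor(self)
self.audio_processor.start()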

In the Game class, we add a new method to update the transcript and jump count:

def set_transcript(self, transcript, jump_count):
    self.transcript = transcript
    self.jump_count = jump_count

In the Player class, we add a new method to handle audio-triggered jumps:

def audio_jump(self, jump_count):
    # Ignore empty commands and anything that arrives while the player is mid-action
    if jump_count > 0 and not self.is_jumping and not self.is_charging:
        # Scale the force linearly: 10 trigger words gives the maximum jump force
        jump_force = (jump_count / 10) * self.max_jump_force
        self.is_jumping = True
        self.velocity_y = -jump_force
        self.velocity_x = jump_force * 0.55
        self.jump_force = 0
        self.rotation_speed = -0.1
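
To make the mapping concrete, the following sketch tabulates the launch velocities that audio_jump() produces for a few trigger-word counts. The function name launch_velocities is hypothetical; the arithmetic follows the method above, with max_jump_force taken from the Player class.

```python
MAX_JUMP_FORCE = 32


def launch_velocities(jump_count, max_count=10):
    """Return (velocity_y, velocity_x) for a given number of trigger words."""
    jump_force = (jump_count / max_count) * MAX_JUMP_FORCE
    return -jump_force, jump_force * 0.55


if __name__ == "__main__":
    for n in (1, 5, 10):
        vy, vx = launch_velocities(n)
        print(f"{n:>2} x 'jump' -> velocity_y {vy:.1f}, velocity_x {vx:.2f}")
```

Because the count is an integer capped at 10, voice control offers ten discrete jump strengths, whereas holding the key gives a continuous range.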

In the Game class’s draw_ui method, we can optionally add new UI elements to display the transcript and jump count:

def draw_ui(self):
    # previous code remains the same
    # Difficulty level
    difficulty_text = self.font.render(f"Level: {self.difficulty['current_level']}", True, COLORS['text'])
    difficulty_rect = difficulty_text.get_rect(topleft=(20, 20))
    self.screen.blit(difficulty_text, difficulty_rect)

    # new code
    # Display transcript
    transcript_text = self.font.render(f"Transcript: {self.transcript}", True, COLORS['text'])
    transcript_rect = transcript_text.get_rect(bottomleft=(20, HEIGHT - 60))
    self.screen.blit(transcript_text, transcript_rect)

    # Display jump count
    jump_count_text = self.font.render(f"Jumps: {self.jump_count}", True, COLORS['highlight'])
    jump_count_rect = jump_count_text.get_rect(bottomleft=(20, HEIGHT - 20))
    self.screen.blit(jump_count_text, jump_count_rect)
    # end new code

    if self.game_over:
        self.draw_game_over()

In the Game class run method, ensure the audio processor is properly stopped:

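def run(self):
    # previous code remains same
        self.update()
        self.draw()
        self.clock.tick(60)
    # new code
    self.audio_processor.running = False
    # record_audio() blocks until it hears speech, so wait only briefly;
    # the thread is a daemon and will not keep the process alive
    self.audio_processor.join(timeout=1)
    # end new code
    pygame.quit()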

Finally, at the top of the file, we need to add new constants that the new audio processing capabilities require:

# Audio Constants
CHUNK, FORMAT, CHANNELS, RATE = 1024, pyaudio.paInt16, 1, 16000
SILENCE_THRESHOLD, SILENCE_DURATION, AMPLIFICATION = 250, 0.9, 8
PRE_BUFFER_SIZE = 10
DEEPGRAM_URL = "https://api.deepgram.com/v1/listen?smart_format=true&model=nova-2&language=en-US"
DEEPGRAM_KEY = ""  # Replace with your actual API key
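
Rather than pasting the key into the source file, one option is to read it from an environment variable. This is a minimal sketch; the helper name load_deepgram_key and the variable name DEEPGRAM_API_KEY are assumptions chosen for this example, not something the script or Deepgram requires.

```python
import os


def load_deepgram_key(env_var="DEEPGRAM_API_KEY"):
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Deepgram API key")
    return key
```

With this helper, the constant becomes DEEPGRAM_KEY = load_deepgram_key(), and the key stays out of version control.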

And that’s it! Remember to replace the DEEPGRAM_KEY constant with your actual API key from your Deepgram account. Register a free account and you will receive $200 in credits. Run the game with python file_name.py and voilà! You should be able to just yell “jump” into your microphone and watch your character fly through the infinite line of platforms.

The AudioProcessor class can be readily adapted to any game or application with a similar need for audio integration. We can simply replace the count_consecutive_jumps method that takes in the transcript with any other processing function needed for the specific application. Deepgram’s efficient, low-latency audio transcription API will do the rest.

The full code for the completed, audio-integrated game is available on GitHub.

Tutorial and Demo: How to Build a Voice AI Video Game in Python
