Developer Portal
Experimental: The bot & app platform is in active development and currently works only on cloud-hosted GameVox servers. Self-hosted servers are not supported yet.

Voice

Voice bots — including Lavalink and @discordjs/voice — work without code changes. GameVox speaks Discord's voice gateway protocol verbatim, then bridges the AEAD-RTP stream to its native WebRTC SFU as a synthetic peer.

Compatible libraries

  • Lavalink v3 and v4 (Java; typically fronted by a JVM bot using JDA, discord4j, etc.)
  • @discordjs/voice (Node)
  • discord.py voice extras (PyNaCl)
  • JDA with the audio module
  • discord4j
  • Eris with the optional voice deps

Endpoints

Endpoints are per region. The bot connects to whichever one the VOICE_SERVER_UPDATE dispatch names.

  • compatible-voice-gateway-us.gamevox.com
  • compatible-voice-gateway-eu.gamevox.com
  • compatible-voice-gateway-ap.gamevox.com

The main gateway picks the region based on the channel's SFU location. Your bot library reads the endpoint field from the dispatch and connects there, so no manual routing is required.
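
As a rough illustration of that hand-off, the sketch below pulls the endpoint and token out of a VOICE_SERVER_UPDATE dispatch and builds the voice WebSocket URL. The payload shape (d.endpoint, d.token, d.guild_id) is assumed to mirror Discord's, and the helper name and sample values are hypothetical. Most bot libraries already do this for you.

```python
from typing import NamedTuple


class VoiceServer(NamedTuple):
    url: str        # WebSocket URL for the regional voice gateway
    token: str      # per-session token from the dispatch
    server_id: str  # sent back as server_id in op:0 Identify


def parse_voice_server_update(dispatch: dict, version: int = 8) -> VoiceServer:
    """Extract connection details from a VOICE_SERVER_UPDATE dispatch.

    Assumes a Discord-shaped payload: {"d": {"endpoint", "token", "guild_id"}}.
    """
    data = dispatch["d"]
    endpoint = data.get("endpoint")
    if not endpoint:
        # Assumption: as on Discord, a missing endpoint means no voice
        # server is allocated yet and the bot should wait for a new dispatch.
        raise ValueError("voice endpoint not allocated yet")
    return VoiceServer(
        url=f"wss://{endpoint}/?v={version}",
        token=data["token"],
        server_id=data["guild_id"],
    )


# Hypothetical dispatch for a channel hosted on the EU SFU.
sample = {
    "t": "VOICE_SERVER_UPDATE",
    "d": {
        "endpoint": "compatible-voice-gateway-eu.gamevox.com",
        "token": "example-token",
        "guild_id": "1234",
    },
}
server = parse_voice_server_update(sample)
```

The region never has to be chosen by the bot: whatever hostname arrives in endpoint is the one to dial.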

Connection flow

  1. Bot sends op:4 Update Voice State on the main gateway (gateway.gamevox.com) with channel_id.
  2. Main gateway dispatches VOICE_STATE_UPDATE + VOICE_SERVER_UPDATE with an endpoint and a per-session token.
  3. Bot opens WSS to wss://{endpoint}/?v=8, sends op:0 Identify with {server_id, user_id, session_id, token}.
  4. Voice gateway responds op:2 Ready with ssrc, ip, port, and supported encryption modes.
  5. Bot performs UDP IP discovery: it sends a 74-byte type-0x0001 packet, and the server echoes back a type-0x0002 packet carrying the bot's external address and port.
  6. Bot sends op:1 Select Protocol with the encryption mode it picked.
  7. Voice gateway responds op:4 Session Description with the 32-byte secret_key.
  8. Bot streams Opus over RTP, AEAD-encrypted with the negotiated mode.
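
Steps 5 and 6 are the part most often hand-rolled, so here is a minimal sketch of them. The field layout (type, length, SSRC, 64-byte address, port, all big-endian) is assumed to match Discord's IP discovery packet, which is what the 74-byte size above implies. The function names are hypothetical.

```python
import struct
from typing import Tuple

# Assumed layout, big-endian, no padding:
# type (2) | length (2) | ssrc (4) | address (64, NUL-padded) | port (2) = 74 bytes
_DISCOVERY = struct.Struct(">HHI64sH")
REQUEST, RESPONSE = 0x0001, 0x0002


def build_discovery_request(ssrc: int) -> bytes:
    """Build the 74-byte request sent to the ip/port from op:2 Ready."""
    # The length field counts everything after type and length: 74 - 4 = 70.
    return _DISCOVERY.pack(REQUEST, _DISCOVERY.size - 4, ssrc, b"", 0)


def parse_discovery_response(packet: bytes) -> Tuple[str, int]:
    """Return the bot's external (address, port) as echoed by the server."""
    kind, _length, _ssrc, raw_addr, port = _DISCOVERY.unpack(packet)
    if kind != RESPONSE:
        raise ValueError(f"unexpected discovery packet type {kind:#06x}")
    # The address is a NUL-terminated ASCII string inside the 64-byte field.
    address = raw_addr.split(b"\x00", 1)[0].decode("ascii")
    return address, port


def select_protocol(address: str, port: int, mode: str) -> dict:
    """op:1 Select Protocol payload carrying the discovered address."""
    return {
        "op": 1,
        "d": {
            "protocol": "udp",
            "data": {"address": address, "port": port, "mode": mode},
        },
    }
```

In use, the request goes out over the same UDP socket that will later carry RTP, and the parsed address and port are passed straight into select_protocol along with the chosen encryption mode.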

Encryption modes

Two AEAD-RTPSize modes are supported, matching Discord:

  • aead_aes256_gcm_rtpsize: preferred. AES-256-GCM; the nonce is derived from a 4-byte counter, and that counter is appended to the encrypted payload.
  • aead_xchacha20_poly1305_rtpsize: fallback for hosts without hardware AES acceleration (AES-NI).

Older modes (xsalsa20_poly1305, xsalsa20_poly1305_lite, xsalsa20_poly1305_suffix) are not offered. Discord retired them in 2024 and the major libraries have moved off them. If your bot library is pinned to an old mode, update it.
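
To make the mode choice and the counter-based nonce concrete, here is a small sketch. The preference order comes from the list above. The byte order of the counter (big-endian) and the zero-padding up to the cipher's nonce size are assumptions based on how common Discord voice libraries handle the rtpsize modes. The actual AEAD encryption call is left out.

```python
import struct
from typing import Iterable, Tuple

# Preference order from the list above: AES-GCM first, XChaCha20 as fallback.
PREFERRED_MODES = (
    "aead_aes256_gcm_rtpsize",
    "aead_xchacha20_poly1305_rtpsize",
)

# Nonce sizes required by the two ciphers (12 bytes for GCM, 24 for XChaCha20).
NONCE_BYTES = {
    "aead_aes256_gcm_rtpsize": 12,
    "aead_xchacha20_poly1305_rtpsize": 24,
}


def pick_mode(offered: Iterable[str]) -> str:
    """Pick the best mode from the list in op:2 Ready."""
    available = set(offered)
    for mode in PREFERRED_MODES:
        if mode in available:
            return mode
    raise ValueError("no supported AEAD mode offered; update the bot library")


def next_nonce(counter: int, mode: str) -> Tuple[bytes, bytes]:
    """Return (full_nonce, suffix) for one outgoing packet.

    suffix is the 4-byte counter appended to the encrypted payload.
    full_nonce is the same counter zero-padded to the cipher's nonce size.
    Big-endian order and zero-padding are assumptions, not confirmed here.
    """
    suffix = struct.pack(">I", counter & 0xFFFFFFFF)  # wraps at 2**32
    return suffix.ljust(NONCE_BYTES[mode], b"\x00"), suffix
```

The receiver rebuilds the full nonce from the 4-byte suffix alone, which is why only the counter travels on the wire.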

Lavalink notes

Lavalink reads the endpoint straight off the VOICE_SERVER_UPDATE payload that your client forwards to it, then dials UDP on its own. As long as the endpoint is reachable (it is), Lavalink doesn't know it isn't talking to Discord.

Tested versions: Lavalink v3.7.x and v4.0.x. If you hit a version-specific issue, report it on the developer portal so we can pin against your version in CI.
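
For readers wiring Lavalink v4 by hand, the sketch below shows the voice hand-off as a plain request description. The route and body follow Lavalink v4's Update Player REST call as we understand it; verify them against the Lavalink documentation for your version. The function name is hypothetical, and Lavalink client libraries normally send this for you.

```python
from typing import Tuple


def lavalink_voice_update(
    lavalink_session_id: str,
    guild_id: str,
    voice_session_id: str,
    server_update: dict,
) -> Tuple[str, dict]:
    """Return (path, json_body) for the PATCH that gives Lavalink v4 the voice details.

    server_update is the "d" object of VOICE_SERVER_UPDATE.
    voice_session_id comes from the bot's own VOICE_STATE_UPDATE.
    """
    path = f"/v4/sessions/{lavalink_session_id}/players/{guild_id}"
    body = {
        "voice": {
            "token": server_update["token"],
            # Passed through untouched, so Lavalink dials the GameVox endpoint.
            "endpoint": server_update["endpoint"],
            "sessionId": voice_session_id,
        }
    }
    return path, body
```

Nothing GameVox-specific appears in the request; the only difference from a Discord deployment is the hostname inside endpoint.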

Latency budget

Target latency added by the bridge: under 5 ms, measured from inbound UDP to SFU egress. Above 5 ms, music sounds wrong (timing drift on note onsets, hi-hat jitter). The bridge does Opus passthrough with no transcoding, so the budget is spent almost entirely on decryption and repacketization.

Speaking events

op:5 Speaking is bidirectional and identical to Discord's. The voice gateway maps the SSRCs of inbound human speakers to their user IDs and dispatches op:5 with {ssrc, user_id} so the bot can correlate audio with users.
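
A bot that receives audio usually keeps a small lookup table fed by these dispatches. The sketch below is one way to do it, assuming payloads shaped like {"op": 5, "d": {"ssrc", "user_id"}}. The class name is hypothetical.

```python
from typing import Dict, Optional


class SpeakerMap:
    """Tracks which user owns which SSRC, based on op:5 Speaking payloads."""

    def __init__(self) -> None:
        self._by_ssrc: Dict[int, str] = {}

    def handle(self, payload: dict) -> None:
        """Record the ssrc -> user_id pair from an inbound op:5 payload."""
        if payload.get("op") != 5:
            return
        data = payload["d"]
        # Only gateway-to-bot payloads are expected to carry user_id;
        # anything without it is ignored.
        if "user_id" in data:
            self._by_ssrc[data["ssrc"]] = data["user_id"]

    def user_for(self, ssrc: int) -> Optional[str]:
        """Look up the user behind an RTP packet's SSRC, if known."""
        return self._by_ssrc.get(ssrc)


speakers = SpeakerMap()
speakers.handle({"op": 5, "d": {"ssrc": 4242, "user_id": "9001", "speaking": 1}})
```

Each incoming RTP packet can then be attributed by calling speakers.user_for with the SSRC from its header.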

Stopping cleanly

Sending op:4 Update Voice State with channel_id: null on the main gateway drains the session. The voice gateway closes the UDP socket and removes the synthetic SFU peer; human listeners stop receiving audio immediately.
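
For reference, a join and a leave differ only in channel_id. The sketch below builds both payloads. The guild_id, self_mute, and self_deaf fields are assumed to mirror Discord's op:4 payload, and the IDs are made up.

```python
import json
from typing import Optional


def update_voice_state(
    guild_id: str,
    channel_id: Optional[str],
    self_mute: bool = False,
    self_deaf: bool = False,
) -> dict:
    """Build an op:4 Update Voice State payload for the main gateway.

    Pass channel_id=None to leave the channel and drain the voice session.
    """
    return {
        "op": 4,
        "d": {
            "guild_id": guild_id,
            "channel_id": channel_id,
            "self_mute": self_mute,
            "self_deaf": self_deaf,
        },
    }


join = update_voice_state("1234", "5678")
leave = update_voice_state("1234", None)

# None serializes to JSON null, which is what the gateway expects for a leave.
leave_json = json.dumps(leave)
```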

Not yet supported

  • DAVE / E2EE — Discord's per-channel end-to-end-encrypted voice mode. No timeline yet.
  • Soundboard — bots can't post soundboard sounds.
  • Stage channels — accepted on the wire, but no separate audience / speaker handling.
