Voice
Voice bots — including Lavalink and @discordjs/voice —
work without code changes. GameVox speaks Discord's voice gateway
protocol verbatim, then bridges the AEAD-RTP stream to its native
WebRTC SFU as a synthetic peer.
Compatible libraries
- Lavalink v3 + v4 (Java, JVM bots front-ending it via JDA / discord4j / etc.)
- @discordjs/voice (Node)
- discord.py voice extras (PyNaCl)
- JDA with the audio module
- discord4j
- Eris with the optional voice deps
Endpoints
Per region — the bot connects to whichever region the
VOICE_SERVER_UPDATE dispatch tells it to.
- compatible-voice-gateway-us.gamevox.com
- compatible-voice-gateway-eu.gamevox.com
- compatible-voice-gateway-ap.gamevox.com
The main gateway dispatches the right region based on the channel's
SFU location. Your bot library reads endpoint from the
dispatch and connects there — no manual routing required.
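For readers writing their own client rather than using a library, here is a minimal sketch of turning the dispatch into a WebSocket URL. The function name is hypothetical, and the handling of a missing or scheme-prefixed endpoint is an assumption borrowed from how Discord-compatible libraries usually behave, not something this page specifies.

```python
from typing import Any, Mapping

# Version taken from the "?v=8" query string shown in the connection flow.
VOICE_GATEWAY_VERSION = 8


def voice_ws_url(dispatch_data: Mapping[str, Any]) -> str:
    """Build the voice WebSocket URL from the `d` field of VOICE_SERVER_UPDATE."""
    endpoint = dispatch_data.get("endpoint")
    if not endpoint:
        # Assumption: a null endpoint means no voice server is allocated yet.
        raise ValueError("VOICE_SERVER_UPDATE carried no endpoint")
    # Assumption: tolerate an endpoint that already includes a scheme.
    host = endpoint.removeprefix("wss://").rstrip("/")
    return f"wss://{host}/?v={VOICE_GATEWAY_VERSION}"
```

The region never appears in bot code: whatever host the dispatch names is the one the bot dials.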
Connection flow
- Bot sends op:4 Update Voice State on the main gateway (gateway.gamevox.com) with channel_id.
- Main gateway dispatches VOICE_STATE_UPDATE + VOICE_SERVER_UPDATE with an endpoint and a per-session token.
- Bot opens WSS to wss://{endpoint}/?v=8, sends op:0 Identify with {server_id, user_id, session_id, token}.
- Voice gateway responds op:2 Ready with ssrc, ip, port, and supported encryption modes.
- Bot performs UDP IP discovery (74-byte type-0x0001 packet → echo back type-0x0002).
- Bot sends op:1 Select Protocol with the encryption mode it picked.
- Voice gateway responds op:4 Session Description with the 32-byte secret_key.
- Bot streams Opus over RTP, AEAD-encrypted with the negotiated mode.
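The Identify and IP discovery steps can be sketched as follows. The helper names are hypothetical. The field layout of the discovery packet (type, length, SSRC, 64-byte null-padded address, port, all big-endian) is an assumption based on Discord's published layout, which this page says is followed verbatim; only the 74-byte size and the two type values come from the flow above.

```python
import json
import struct
from typing import Tuple

# Assumed layout: type(2) + length(2) + ssrc(4) + address(64) + port(2) = 74 bytes.
DISCOVERY_FORMAT = ">HHI64sH"
DISCOVERY_SIZE = struct.calcsize(DISCOVERY_FORMAT)


def identify_payload(server_id: str, user_id: str, session_id: str, token: str) -> str:
    """Serialize op:0 Identify with the four fields listed in the flow."""
    return json.dumps({
        "op": 0,
        "d": {
            "server_id": server_id,
            "user_id": user_id,
            "session_id": session_id,
            "token": token,
        },
    })


def build_discovery_request(ssrc: int) -> bytes:
    """Build the type-0x0001 request; address and port are left zeroed."""
    # The length field excludes the 4 bytes of type + length (assumption: 70).
    return struct.pack(DISCOVERY_FORMAT, 0x0001, DISCOVERY_SIZE - 4, ssrc, b"", 0)


def parse_discovery_response(packet: bytes) -> Tuple[str, int]:
    """Extract the bot's external (ip, port) from the type-0x0002 reply."""
    if len(packet) != DISCOVERY_SIZE:
        raise ValueError(f"expected {DISCOVERY_SIZE} bytes, got {len(packet)}")
    ptype, _length, _ssrc, address, port = struct.unpack(DISCOVERY_FORMAT, packet)
    if ptype != 0x0002:
        raise ValueError(f"unexpected discovery packet type {ptype:#06x}")
    # The address is a null-terminated ASCII string inside the 64-byte field.
    ip = address.split(b"\x00", 1)[0].decode("ascii")
    return ip, port
```

The request goes to the ip and port from op:2 Ready over UDP; the address and port parsed from the reply are what the bot reports in op:1 Select Protocol.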
Encryption modes
Two AEAD-RTPSize modes are supported, matching Discord:
- aead_aes256_gcm_rtpsize: preferred. AES-256-GCM with the nonce derived from a 4-byte counter appended to the encrypted payload.
- aead_xchacha20_poly1305_rtpsize: fallback for libraries without AES-NI.
Older modes (xsalsa20_poly1305, xsalsa20_poly1305_lite,
xsalsa20_poly1305_suffix) are not offered. Discord
retired them in 2024 and the major libraries have moved off. If
your bot library is pinned to an old mode, update it.
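As an illustration of the negotiation, the sketch below picks a mode from the list in op:2 Ready and expands the 4-byte counter into a full-size nonce. The function names are hypothetical. The big-endian counter encoding and zero-padding to 12 bytes (AES-GCM) or 24 bytes (XChaCha20) are assumptions based on how open-source Discord libraries implement these modes; this page states only that the counter is 4 bytes and travels after the ciphertext.

```python
import struct
from typing import Sequence, Tuple

# Preference order from the list above: AES-GCM first, XChaCha20 as fallback.
PREFERRED_MODES = (
    "aead_aes256_gcm_rtpsize",
    "aead_xchacha20_poly1305_rtpsize",
)

# Nonce length each cipher expects, in bytes.
NONCE_SIZES = {
    "aead_aes256_gcm_rtpsize": 12,
    "aead_xchacha20_poly1305_rtpsize": 24,
}


def pick_mode(offered: Sequence[str], aes_available: bool = True) -> str:
    """Return the most preferred mode that the gateway offered."""
    for mode in PREFERRED_MODES:
        if mode == "aead_aes256_gcm_rtpsize" and not aes_available:
            continue  # skip AES-GCM when the host cannot run it efficiently
        if mode in offered:
            return mode
    raise ValueError(f"no supported encryption mode in {list(offered)}")


def expand_nonce(mode: str, counter: int) -> Tuple[bytes, bytes]:
    """Return (4-byte suffix sent on the wire, full nonce given to the cipher)."""
    # Assumption: big-endian counter, wrapped to 32 bits, zero-padded on the right.
    suffix = struct.pack(">I", counter & 0xFFFFFFFF)
    return suffix, suffix.ljust(NONCE_SIZES[mode], b"\x00")
```

The counter should increase by one per packet so that no nonce repeats under the same secret_key.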
Lavalink notes
Lavalink reads the endpoint straight off
VOICE_SERVER_UPDATE and dials UDP on its own. As long
as the endpoint is reachable (it is), Lavalink doesn't know it
isn't talking to Discord.
Tested versions: Lavalink v3.7.x and v4.0.x. If you hit a version-specific issue, report it on the developer portal so we can pin against your version in CI.
Latency budget
Target added latency for the bridge: under 5 ms, measured from inbound UDP to SFU egress. Above 5 ms, music sounds wrong (timing drift on note onsets, hi-hat jitter). The bridge does Opus passthrough with no transcoding, so the headline number covers only decrypt + repacketize.
Speaking events
op:5 Speaking is bidirectional and identical to
Discord. The voice gateway maps inbound human speakers' SSRCs to
their user IDs and dispatches op:5 with
{ssrc, user_id} so the bot can correlate.
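A bot that receives audio typically keeps a small lookup table fed by these events. The class below is a minimal sketch with hypothetical names; it assumes the payload carries exactly the {ssrc, user_id} fields named above and ignores anything else.

```python
from typing import Any, Dict, Mapping, Optional


class SpeakerMap:
    """Tracks which user ID owns each SSRC, based on op:5 Speaking events."""

    def __init__(self) -> None:
        self._by_ssrc: Dict[int, str] = {}

    def on_speaking(self, data: Mapping[str, Any]) -> None:
        """Record the mapping from the `d` field of an inbound op:5 payload."""
        self._by_ssrc[int(data["ssrc"])] = str(data["user_id"])

    def user_for(self, ssrc: int) -> Optional[str]:
        """Return the user ID for an RTP packet's SSRC, or None if not yet seen."""
        return self._by_ssrc.get(ssrc)
```

Packets whose SSRC has not been announced yet resolve to None, so a receiver can drop or buffer them until the matching op:5 arrives.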
Stopping cleanly
Sending op:4 Update Voice State with
channel_id: null on the main gateway drains the
session. The voice gateway closes the UDP socket and removes the
synthetic SFU peer; human listeners stop receiving audio
immediately.
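For reference, a leave request might be serialized as in the sketch below. The helper name is hypothetical, and the guild_id, self_mute, and self_deaf fields are assumptions taken from Discord's Update Voice State payload; this page names only op:4 and channel_id.

```python
import json
from typing import Optional


def voice_state_update(guild_id: str, channel_id: Optional[str]) -> str:
    """Serialize op:4 Update Voice State; channel_id=None leaves the channel."""
    return json.dumps({
        "op": 4,
        "d": {
            "guild_id": guild_id,      # assumed field, as in Discord's payload
            "channel_id": channel_id,  # None is serialized as JSON null
            "self_mute": False,        # assumed field
            "self_deaf": False,        # assumed field
        },
    })
```

The same helper with a real channel ID produces the join request from the first step of the connection flow.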
Not yet supported
- DAVE / E2EE — Discord's per-channel end-to-end-encrypted voice mode. No timeline yet.
- Soundboard — bots can't post soundboard sounds.
- Stage channels — accepted on the wire, but no separate audience / speaker handling.