API Reference
Audio transcriptions
Convert audio to text with a single non-streaming HTTP request or over a streaming WebSocket connection.
POST
/v1/audio/transcriptions
Requires avatar:interact or avatar:use. Accepts WAV files or raw PCM bytes.
Multipart upload
Send a WAV file via multipart/form-data using the file field.
curl https://<gateway-host>/v1/audio/transcriptions \
  -H "Authorization: Bearer $DISRUPTIVERAIN_CLIENT_ID:$DISRUPTIVERAIN_CLIENT_SECRET" \
  -F file=@sample.wav
WAV format
WAV uploads must be 16-bit PCM. Other codecs are rejected.
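Because non-PCM WAV files are rejected, it can be worth checking the header locally before uploading. The following is a minimal sketch, not part of the API: `isPcm16Wav` is a hypothetical helper name, and it assumes a standard RIFF/WAVE layout in which the `fmt ` chunk declares format tag 1 (PCM) and 16 bits per sample. Files that use the WAVE_FORMAT_EXTENSIBLE tag are reported as non-PCM here, which may be stricter than the server's own check.

```javascript
// Hypothetical pre-upload check: returns true when the buffer looks like a
// RIFF/WAVE file whose "fmt " chunk declares 16-bit PCM (format tag 1).
function isPcm16Wav(buf) {
  if (buf.length < 12) return false;
  const riff = buf.toString('ascii', 0, 4);
  const wave = buf.toString('ascii', 8, 12);
  if (riff !== 'RIFF' || wave !== 'WAVE') return false;

  // Walk the chunk list until the "fmt " chunk is found.
  let offset = 12;
  while (offset + 8 <= buf.length) {
    const id = buf.toString('ascii', offset, offset + 4);
    const size = buf.readUInt32LE(offset + 4);
    if (id === 'fmt ') {
      // The PCM "fmt " body is at least 16 bytes long.
      if (size < 16 || offset + 24 > buf.length) return false;
      const audioFormat = buf.readUInt16LE(offset + 8); // 1 = PCM
      const bitsPerSample = buf.readUInt16LE(offset + 22);
      return audioFormat === 1 && bitsPerSample === 16;
    }
    // Chunk bodies are padded to an even number of bytes.
    offset += 8 + size + (size % 2);
  }
  return false;
}
```

In Node.js, the helper could be called on the result of `fs.readFileSync('sample.wav')` before the multipart request is sent.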
Raw PCM upload
For raw PCM audio, provide sample_rate and optional channels and format query parameters.
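When the request is built from code, the query string can be assembled with the standard `URL` API. This is a sketch under assumptions: `buildRawPcmUrl` and the host `gateway.example.com` are placeholders invented for illustration, while the parameter names `sample_rate`, `channels`, and `format` are the ones this page documents.

```javascript
// Hypothetical helper: builds the raw PCM upload URL.
// sample_rate is mandatory; channels and format are added only when given.
function buildRawPcmUrl(host, { sampleRate, channels, format } = {}) {
  if (!Number.isInteger(sampleRate) || sampleRate <= 0) {
    throw new Error('sample_rate is required for raw PCM uploads');
  }
  const url = new URL(`https://${host}/v1/audio/transcriptions`);
  url.searchParams.set('sample_rate', String(sampleRate));
  if (channels !== undefined) url.searchParams.set('channels', String(channels));
  if (format !== undefined) url.searchParams.set('format', format);
  return url.toString();
}

// Placeholder host, for illustration only.
console.log(buildRawPcmUrl('gateway.example.com', { sampleRate: 48000, format: 'pcm_s16le' }));
// prints "https://gateway.example.com/v1/audio/transcriptions?sample_rate=48000&format=pcm_s16le"
```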
curl "https://<gateway-host>/v1/audio/transcriptions?sample_rate=48000&format=pcm_s16le" \
  -H "Authorization: Bearer $DISRUPTIVERAIN_CLIENT_ID:$DISRUPTIVERAIN_CLIENT_SECRET" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @sample.pcm
Required fields
Raw audio requires a sample rate. The default format is pcm_s16le.
Response
{
  "sessionId": "stt_123",
  "messageId": "msg_123",
  "text": "hello everyone",
  "confidence": 0.92,
  "alternatives": []
}
WS
/v1/audio/transcriptions/stream
Provide session_id and sample_rate in the query string. Send binary PCM frames and receive transcription events.
import WebSocket from 'ws';
const ws = new WebSocket('wss://<gateway-host>/v1/audio/transcriptions/stream?session_id=stt_123&sample_rate=48000', {
  headers: {
    Authorization: `Bearer ${process.env.DISRUPTIVERAIN_CLIENT_ID}:${process.env.DISRUPTIVERAIN_CLIENT_SECRET}`,
  },
});
ws.binaryType = 'arraybuffer';
// Log each transcription event sent by the server
ws.onmessage = (event) => {
  const message = JSON.parse(event.data);
  if (message.type === 'transcription') {
    console.log(message.text);
  }
};
// Wait for the connection to open; calling send() while still connecting throws
ws.onopen = () => {
  // Send PCM frames as ArrayBuffer (pcmFrame is audio supplied by your application)
  ws.send(pcmFrame);
  // Signal end of stream
  ws.send(JSON.stringify({ type: 'end' }));
};
Browser note
The WebSocket upgrade request must carry the Authorization header. Browsers cannot set custom headers on WebSocket connections, so proxy the stream through your backend if you need transcription in the browser.
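A backend that relays audio to the streaming endpoint has to cut the PCM into binary frames before sending them. The sketch below shows one way to do that for 16-bit PCM. The helper name `splitPcmFrames` and the 20 ms frame duration are assumptions made for illustration; this reference does not state a required or recommended frame size.

```javascript
// Hypothetical helper: splits 16-bit PCM (pcm_s16le) into fixed-duration frames.
// frameMs = 20 is an assumed default, not a documented requirement.
function splitPcmFrames(pcm, sampleRate, channels = 1, frameMs = 20) {
  const bytesPerSample = 2; // 16-bit samples
  const samplesPerFrame = Math.floor((sampleRate * frameMs) / 1000);
  const frameBytes = samplesPerFrame * channels * bytesPerSample;
  if (frameBytes <= 0) throw new Error('frame size must be positive');

  const frames = [];
  for (let offset = 0; offset < pcm.length; offset += frameBytes) {
    // subarray returns a view, so no audio data is copied; the last frame may be shorter.
    frames.push(pcm.subarray(offset, Math.min(offset + frameBytes, pcm.length)));
  }
  return frames;
}

// One second of silent 48 kHz mono audio yields 50 frames of 1920 bytes each.
const frames = splitPcmFrames(Buffer.alloc(96000), 48000);
console.log(frames.length, frames[0].length);
```

Each returned frame can then be passed to `ws.send` once the connection is open, followed by the `{ type: 'end' }` message.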