I'd like to be able to end a Google speech-to-text stream (created with streamingRecognize), and get back the pending SR (speech recognition) results.
In a nutshell, the relevant Node.js code:
// util.promisify is used further below
const util = require('util');
// create SR stream
const stream = speechClient.streamingRecognize(request);
// observe data event
const dataPromise = new Promise(resolve => stream.on('data', resolve));
// observe error event
const errorPromise = new Promise((_resolve, reject) => stream.on('error', reject));
// observe finish event
const finishPromise = new Promise(resolve => stream.on('finish', resolve));
// a 5-second test timeout
const timeoutPromise = new Promise(resolve => setTimeout(resolve, 5000));
// send the audio
stream.write(audioChunk);
// for testing purposes only, give the SR stream 2 seconds to absorb the audio
await new Promise(resolve => setTimeout(resolve, 2000));
// end the SR stream gracefully, by observing the completion callback
const endPromise = util.promisify(callback => stream.end(callback))();
// finishPromise wins the race here
await Promise.race([
dataPromise, errorPromise, finishPromise, endPromise, timeoutPromise]);
// endPromise wins the race here
await Promise.race([
dataPromise, errorPromise, endPromise, timeoutPromise]);
// timeoutPromise wins the race here
await Promise.race([dataPromise, errorPromise, timeoutPromise]);
// I don't see any data or error events; neither dataPromise nor errorPromise gets settled
What I experience is that the SR stream ends successfully, but I don't get any data events or error events. Neither dataPromise nor errorPromise gets resolved or rejected.
How can I signal the end of my audio, close the SR stream and still get the pending SR results?
I need to stick with the streamingRecognize API because the audio I'm streaming is real-time, even though it may stop suddenly.
To clarify, it works as long as I keep streaming the audio: I do receive the real-time SR results. However, when I send the final audio chunk and end the stream like above, I don't get the final results I'd otherwise expect.
To get the final results, I actually have to keep streaming silence for several more seconds. I feel like there must be a better way to get them.
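For illustration, here is a minimal sketch of that silence-padding approach. It assumes LINEAR16 mono audio at 16 kHz, which is not stated above, so the constants need adjusting for other formats. makeSilence and padWithSilence are hypothetical helper names, not part of the Google client library, and isDone is a caller-supplied check (for example, a flag set when a final result arrives).

```javascript
// Assumed audio format (not stated in the question): LINEAR16, mono, 16 kHz.
const SAMPLE_RATE_HERTZ = 16000;
const BYTES_PER_SAMPLE = 2; // 16-bit PCM

// Returns a zero-filled buffer, which is digital silence for signed 16-bit PCM.
function makeSilence(durationMs, sampleRateHertz = SAMPLE_RATE_HERTZ) {
  const samples = Math.round((sampleRateHertz * durationMs) / 1000);
  return Buffer.alloc(samples * BYTES_PER_SAMPLE);
}

// Writes silence chunks at roughly real-time pace until isDone() returns true
// or maxMs worth of silence has been sent. Resolves with the chunk count.
async function padWithSilence(stream, isDone, maxMs = 5000, chunkMs = 100) {
  const chunk = makeSilence(chunkMs);
  let written = 0;
  for (let sent = 0; sent < maxMs && !isDone(); sent += chunkMs) {
    stream.write(chunk);
    written += 1;
    // pace the writes so the silence arrives like live audio would
    await new Promise((resolve) => setTimeout(resolve, chunkMs));
  }
  return written;
}
```

The idea would be to call padWithSilence(stream, () => gotFinalResult) right after the last real audio chunk, and only then end the stream.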
Updated: so it appears that the only proper time to end a streamingRecognize stream is upon a data event where StreamingRecognitionResult.is_final is true. It also appears we're expected to keep streaming audio until a data event is fired, to get any result at all, final or interim.
This looks like a bug to me, filing an issue.
Updated: it now seems to have been confirmed as a bug. Until it's fixed, I'm looking for a potential workaround.
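As a stopgap, the observation above (end the stream only once a final result has arrived) could be sketched roughly as follows. This is an assumption-laden sketch, not a confirmed fix: it assumes the Node.js client exposes is_final as isFinal on each entry of response.results, endOnFinalResult is a hypothetical helper name, and FakeRecognizeStream is only a stand-in for trying the logic without calling the API.

```javascript
const { EventEmitter } = require('events');

// Resolves with the first response whose first result is final, ending the
// stream at that point. Rejects on a stream error or after timeoutMs.
function endOnFinalResult(stream, timeoutMs) {
  return new Promise((resolve, reject) => {
    let done = false;
    const finish = (settle, value) => {
      if (done) return;
      done = true;
      clearTimeout(timer);
      settle(value);
    };
    const timer = setTimeout(() => {
      finish(reject, new Error('timed out waiting for a final result'));
    }, timeoutMs);
    stream.on('data', (response) => {
      const result = response.results && response.results[0];
      // ignore interim results and anything after we have already finished
      if (!result || !result.isFinal || done) return;
      stream.end(); // the final result is in, so ending should lose nothing
      finish(resolve, response);
    });
    stream.on('error', (err) => finish(reject, err));
  });
}

// Hypothetical stand-in for the recognize stream (records whether end() ran).
class FakeRecognizeStream extends EventEmitter {
  constructor() {
    super();
    this.ended = false;
  }
  end() {
    this.ended = true;
  }
}
```

With a real stream this would be used as `const response = await endOnFinalResult(stream, 5000);` after the last audio chunk is written. It does not remove the need to keep audio (or silence) flowing until that final data event fires.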
from How to end Google Speech-to-Text streamingRecognize gracefully and get back the pending text results?