Wednesday, 10 October 2018

Streaming Audio in FLAC or AMR_WB to the Google Speech API

I need to run the Google Speech API in fairly low-bandwidth environments.

Based on what I have read about best practices, my best bet seems to be the AMR_WB format.
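To put the bandwidth concern in rough numbers, here is a back-of-the-envelope sketch. It assumes 16 kHz, mono, 16-bit capture (the settings used in the code further down) and compares it with the highest AMR-WB codec mode, 23.85 kbit/s. The function and constant names are made up for illustration, and gRPC/framing overhead is ignored.

```kotlin
// Approximate bitrate of uncompressed PCM, which is what LINEAR16 carries.
// Hypothetical helper for illustration only; not part of any Google or Android API.
fun pcmBitrateKbps(sampleRateHz: Int, bitsPerSample: Int = 16, channels: Int = 1): Double =
    sampleRateHz.toLong() * bitsPerSample * channels / 1000.0

// Highest of the nine AMR-WB codec modes (6.60 to 23.85 kbit/s).
const val AMR_WB_MAX_KBPS = 23.85

fun main() {
    val pcm = pcmBitrateKbps(16000)      // 16 kHz mono, 16-bit
    val ratio = pcm / AMR_WB_MAX_KBPS    // how much larger the raw PCM stream is
    println("LINEAR16 @ 16 kHz mono: $pcm kbit/s")
    println("AMR_WB (highest mode):  $AMR_WB_MAX_KBPS kbit/s")
    println("PCM / AMR_WB ratio:     $ratio")
}
```

Under these assumptions the raw PCM stream is roughly ten times larger than AMR-WB at its highest mode, which is why a compressed encoding looks attractive here.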

However, with the following code no exception is thrown and onError(t: Throwable) is never called, yet onNext(value: StreamingRecognizeResponse) never receives any results from the API either.

If I change the encoding passed to .setEncoding() from FLAC or AMR_WB back to LINEAR16, everything works fine.

AudioEmitter.kt

fun start(
        encoding: Int = AudioFormat.ENCODING_PCM_16BIT,
        channel: Int = AudioFormat.CHANNEL_IN_MONO,
        sampleRate: Int = 16000,
        subscriber: (ByteString) -> Unit
)

MainActivity.kt

builder.streamingConfig = StreamingRecognitionConfig.newBuilder()
        .setConfig(RecognitionConfig.newBuilder()
                .setLanguageCode("en-US")
                .setEncoding(RecognitionConfig.AudioEncoding.AMR_WB)
                .setSampleRateHertz(16000)
                .build())
        .setInterimResults(true)
        .setSingleUtterance(false)
        .build()
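One thing that may be worth checking: AudioEmitter.start() defaults to AudioFormat.ENCODING_PCM_16BIT, so the bytes handed to the subscriber may still be raw PCM even though the config declares AMR_WB. As far as I know, setEncoding() only tells the API how to interpret the bytes and does not transcode anything on the client. The sketch below is a hypothetical diagnostic (the names sniffContainer and Container are my own, not part of the Speech client): it looks at the first chunk for the FLAC stream marker "fLaC" or the AMR-WB file header "#!AMR-WB\n". Whether the streaming API expects that AMR-WB header is an assumption I have not verified. A raw PCM buffer would show neither marker.

```kotlin
// Hypothetical diagnostic helper; not part of the Speech client or the Android SDK.
enum class Container { FLAC, AMR_WB, UNKNOWN }

// Well-known magic bytes: FLAC stream marker and AMR-WB single-channel file header.
val FLAC_MAGIC: ByteArray = "fLaC".toByteArray(Charsets.US_ASCII)
val AMR_WB_MAGIC: ByteArray = "#!AMR-WB\n".toByteArray(Charsets.US_ASCII)

// True if this array begins with the given prefix.
fun ByteArray.startsWith(prefix: ByteArray): Boolean =
    size >= prefix.size && prefix.indices.all { this[it] == prefix[it] }

// Guess the container from the first bytes of the first audio chunk.
fun sniffContainer(firstChunk: ByteArray): Container = when {
    firstChunk.startsWith(FLAC_MAGIC) -> Container.FLAC
    firstChunk.startsWith(AMR_WB_MAGIC) -> Container.AMR_WB
    else -> Container.UNKNOWN   // e.g. raw PCM straight from the recorder
}

fun main() {
    val pcmLike = ByteArray(320)  // stand-in for a buffer of raw 16-bit samples (silence)
    println(sniffContainer(pcmLike))                                  // UNKNOWN
    println(sniffContainer("fLaC".toByteArray(Charsets.US_ASCII)))    // FLAC
}
```

In the app this could be called once inside the subscriber, for example on ByteString.toByteArray() of the first chunk. If it reports UNKNOWN while the config says FLAC or AMR_WB, the audio probably needs to be encoded before it is sent.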


