Parameters for the observer that converts the speech protos into a TranscriptionResult.
Allows profanity to be filtered by the Cloud Speech API.
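To make this concrete, a minimal proto3 sketch of such an observer-parameters message follows; the message name, field name, and field number are assumptions for illustration, not taken from this documentation.

syntax = "proto3";

// Sketch only: parameters for the observer that converts the speech protos
// into a TranscriptionResult. Name and numbering are assumed.
message CloudSpeechStreamObserverParams {
  // Allows profanity to be filtered by the Cloud Speech API.
  bool filter_profanity = 1;
}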
Details about how the audio signal should be compressed prior to sending it to the server.
If the encoder isn't supported, uncompressed audio will be used. When this is false, other EncoderParams fields are ignored.
Uses variable bitrate encoding, if available. Currently this is available for OggOpus only.
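One plausible shape for the EncoderParams message described above, continuing the same hypothetical .proto file; the field names enable_encoder and use_vbr and their numbers are illustrative guesses.

// Details about how the audio signal should be compressed prior to sending
// it to the server. Field names below are assumed.
message EncoderParams {
  // If the encoder isn't supported, uncompressed audio will be used.
  // When this is false, the other EncoderParams fields are ignored.
  bool enable_encoder = 1;
  // Uses variable bitrate encoding, if available (currently OggOpus only).
  bool use_vbr = 2;
  // Additional fields (e.g. the codec to use) would follow; they are not
  // described in this section.
}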
When true, hypotheses are held a bit longer and displayed only once they are unlikely to change again.
Required.
Words to be passed to the speech recognizer as bias. It is up to each implementation to decide whether these will be used or not.
Select which model to use. Not all models are necessarily available for all recognition systems or locales. It is up to the individual session to warn the user about availability.
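Taken together, the fields above suggest a session-parameters message roughly like the following sketch; the message name and every field name and number are assumptions, and the field documented only as "Required." above is omitted because its identity is unclear.

message SpeechSessionParams {  // name assumed
  // When true, hypotheses are held a bit longer and displayed only once
  // they are unlikely to change again.
  bool delay_unstable_hypotheses = 1;
  // Words passed to the speech recognizer as bias; each implementation
  // decides whether to use them.
  repeated string bias_words = 2;
  // Which model to use; not all models are available for every
  // recognition system or locale.
  string model = 3;
}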
An utterance-level copy of the text.
Confidence for the whole utterance [0, 1].
The epoch time at which the utterance was started.
The epoch time at which the utterance was completed.
The identity of the speaker.
Word-level detail. NOTE: Some recognizers (namely the CloudSpeech API) do not give fine-grain information until results are finalized.
The language code in this result. See https://cloud.google.com/speech-to-text/docs/languages for more details. For example: English (United States): en-US; Chinese, Mandarin (Traditional, Taiwan): cmn-Hant-TW.
Fine-grain information about each word. NOTE: the TranscriptResultFormatter may colorize the coarse-grain transcript using the corresponding word-level information (such as confidence and speaker_id) when fine-grain word_level_detail is not empty.
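The result fields above map naturally onto a message like the sketch below; the field names, numbers, and the choice of int64 epoch milliseconds for the timestamps are assumptions (WordLevelDetail is sketched after the next group of comments).

message TranscriptionResult {  // name taken from the text above
  // An utterance-level copy of the text.
  string text = 1;
  // Confidence for the whole utterance, in [0, 1].
  float confidence = 2;
  // Epoch times at which the utterance was started and completed
  // (milliseconds here is a guess at the unit).
  int64 start_epoch_ms = 3;
  int64 end_epoch_ms = 4;
  // The identity of the speaker.
  int32 speaker_id = 5;
  // The language code in this result, e.g. "en-US".
  string language_code = 6;
  // Fine-grain information about each word; may be empty until results
  // are finalized.
  repeated WordLevelDetail word_level_detail = 7;
}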
Confidence for just this word [0, 1].
An integer tag for the identity of the active speaker.
The time at which the word was started.
The time at which the word was completed.
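A matching sketch of the per-word detail message; the word field itself is not described above and is included only as an assumption, as are all names, numbers, and units.

message WordLevelDetail {  // name inferred from "word_level_detail" above
  // The recognized word (assumed field).
  string word = 1;
  // Confidence for just this word, in [0, 1].
  float confidence = 2;
  // An integer tag for the identity of the active speaker.
  int32 speaker_tag = 3;
  // Times at which the word was started and completed (unit assumed).
  int64 start_ms = 4;
  int64 end_ms = 5;
}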
Silences longer than this will cause a space to be inserted.
Number of '\n' characters to add in the event of extended silence. 1 moves to the next line, 2 leaves one blank line between the two lines, and so on.
Number of '\n' characters to add in the event of a language switch. 1 moves to the next line, 2 leaves one blank line between the two lines, and so on.
Put current hypotheses in italics.
If true, use a yellow->blue colormap to indicate confidence.
The color theme used for the text.
A label that indicates which speaker is active.
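The formatter options above could be gathered into a message such as the following; all names, numbers, and types (notably the millisecond duration and the string speaker label) are illustrative guesses, and TextColor refers to the color enum sketched further below.

message TranscriptionResultFormatterOptions {  // name assumed
  // Silences longer than this cause a space to be inserted
  // (millisecond unit is a guess).
  int64 extended_silence_ms = 1;
  // Number of '\n' characters to add after extended silence.
  int32 line_breaks_on_silence = 2;
  // Number of '\n' characters to add after a language switch.
  int32 line_breaks_on_language_switch = 3;
  // Put current hypotheses in italics.
  bool italicize_current_hypothesis = 4;
  // If true, use a yellow->blue colormap to indicate confidence.
  bool colorize_by_confidence = 5;
  // The color theme used for the text (relationship to the TextColor enum
  // below is a guess).
  TextColor text_color = 6;
  // A label that indicates which speaker is active (type assumed).
  string speaker_label = 7;
}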
Color selection for the text (does not change background). Dark colors for a black-on-white theme. Bright colors for a white-on-black theme.
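A sketch of the text-color enum this comment describes; none of the value names below are given in the documentation, so they are purely illustrative, with the dark/bright split following the theme note above.

enum TextColor {  // name and values assumed
  TEXT_COLOR_UNSPECIFIED = 0;
  // Dark colors, for a black-on-white theme.
  DARK_RED = 1;
  DARK_BLUE = 2;
  // Bright colors, for a white-on-black theme.
  BRIGHT_YELLOW = 3;
  BRIGHT_CYAN = 4;
}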
Details on the manner in which the transcript will be colored.
Will do NO_COLORING.
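Finally, a sketch of the coloring-mode enum; NO_COLORING comes from the text above, while the enum name, the default value name, and the two word-level coloring modes (hinted at by the earlier confidence and speaker_id notes) are assumptions.

enum TranscriptColoring {  // name assumed
  // Will do NO_COLORING.
  TRANSCRIPT_COLORING_DEFAULT = 0;
  NO_COLORING = 1;
  // Hypothetical modes based on the word-level detail notes above.
  COLOR_BY_WORD_CONFIDENCE = 2;
  COLOR_BY_SPEAKER_ID = 3;
}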