package google.cloud.speech.v1

service Speech

cloud_speech.proto:37

Service that implements Google Cloud Speech API.
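
For orientation, a minimal sketch of reaching this service with the `google-cloud-speech` Python client (assumes Application Default Credentials are configured):

    from google.cloud import speech

    # SpeechClient wraps the Speech service, which exposes the Recognize,
    # LongRunningRecognize, and StreamingRecognize RPCs.
    client = speech.SpeechClient()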

message LongRunningRecognizeMetadata

cloud_speech.proto:563

Describes the progress of a long-running `LongRunningRecognize` call. It is included in the `metadata` field of the `Operation` returned by the `GetOperation` call of the `google::longrunning::Operations` service.

message LongRunningRecognizeResponse

cloud_speech.proto:554

The only message returned to the client by the `LongRunningRecognize` method. It contains the result as zero or more sequential `SpeechRecognitionResult` messages. It is included in the `result.response` field of the `Operation` returned by the `GetOperation` call of the `google::longrunning::Operations` service.
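
A hedged sketch of the long-running flow with the Python client; the bucket path and timeout are placeholders:

    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(language_code="en-US")
    audio = speech.RecognitionAudio(uri="gs://my-bucket/long-audio.flac")  # placeholder URI

    # Returns a google.longrunning Operation; its metadata field carries
    # LongRunningRecognizeMetadata (progress) and its response a
    # LongRunningRecognizeResponse.
    operation = client.long_running_recognize(config=config, audio=audio)
    response = operation.result(timeout=300)
    for result in response.results:
        print(result.alternatives[0].transcript)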

message RecognitionAudio

cloud_speech.proto:520

Contains audio data in the encoding specified in the `RecognitionConfig`. Either `content` or `uri` must be supplied. Supplying both or neither returns [google.rpc.Code.INVALID_ARGUMENT][google.rpc.Code.INVALID_ARGUMENT]. See [content limits](https://cloud.google.com/speech-to-text/quotas#content).

Used in: LongRunningRecognizeRequest, RecognizeRequest
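
A minimal sketch of the two mutually exclusive ways to supply audio (file name and bucket are placeholders):

    from google.cloud import speech

    # Exactly one of `content` or `uri` may be set; both or neither yields
    # INVALID_ARGUMENT.
    with open("audio.raw", "rb") as f:  # placeholder local file
        inline_audio = speech.RecognitionAudio(content=f.read())
    gcs_audio = speech.RecognitionAudio(uri="gs://my-bucket/audio.raw")  # placeholder URI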

message RecognitionConfig

cloud_speech.proto:150

Provides information to the recognizer that specifies how to process the request.

Used in: LongRunningRecognizeRequest, RecognizeRequest, StreamingRecognitionConfig
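
A minimal synchronous example, assuming 16 kHz mono LINEAR16 audio in a placeholder Cloud Storage object:

    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    )
    audio = speech.RecognitionAudio(uri="gs://my-bucket/audio.raw")  # placeholder
    response = client.recognize(config=config, audio=audio)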

enum RecognitionConfig.AudioEncoding

cloud_speech.proto:173

The encoding of the audio data sent in the request. All encodings support only 1 channel (mono) audio, unless the `audio_channel_count` and `enable_separate_recognition_per_channel` fields are set. For best results, the audio source should be captured and transmitted using a lossless encoding (`FLAC` or `LINEAR16`). The accuracy of the speech recognition can be reduced if lossy codecs are used to capture or transmit audio, particularly if background noise is present. Lossy codecs include `MULAW`, `AMR`, `AMR_WB`, `OGG_OPUS`, `SPEEX_WITH_HEADER_BYTE`, and `MP3`. The `FLAC` and `WAV` audio file formats include a header that describes the included audio content. You can request recognition for `WAV` files that contain either `LINEAR16` or `MULAW` encoded audio. If you send `FLAC` or `WAV` audio file format in your request, you do not need to specify an `AudioEncoding`; the audio encoding format is determined from the file header. If you specify an `AudioEncoding` when you send `FLAC` or `WAV` audio, the encoding configuration must match the encoding described in the audio header; otherwise the request returns an [google.rpc.Code.INVALID_ARGUMENT][google.rpc.Code.INVALID_ARGUMENT] error code.

Used in: RecognitionConfig
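
A sketch of the header rule described above: raw PCM needs an explicit encoding and sample rate, while FLAC/WAV headers let you omit them:

    from google.cloud import speech

    # Raw LINEAR16 PCM: encoding and sample rate must be stated explicitly.
    pcm_config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    )

    # FLAC: the file header already describes the audio, so encoding may be
    # omitted (if set, it must match the header or INVALID_ARGUMENT is returned).
    flac_config = speech.RecognitionConfig(language_code="en-US")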

message RecognitionMetadata

cloud_speech.proto:372

Description of audio data to be recognized.

Used in: RecognitionConfig

enum RecognitionMetadata.InteractionType

cloud_speech.proto:375

Use case categories that describe the audio recognition request.

Used in: RecognitionMetadata

enum RecognitionMetadata.MicrophoneDistance

cloud_speech.proto:412

Enumerates the types of capture settings describing an audio file.

Used in: RecognitionMetadata

enum RecognitionMetadata.OriginalMediaType

cloud_speech.proto:429

The original media the speech was recorded on.

Used in: RecognitionMetadata

enum RecognitionMetadata.RecordingDeviceType

cloud_speech.proto:441

The type of device the speech was recorded with.

Used in: RecognitionMetadata
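
A hedged sketch tying the metadata enums above into a config (all values here are illustrative):

    from google.cloud import speech

    metadata = speech.RecognitionMetadata(
        interaction_type=speech.RecognitionMetadata.InteractionType.PHONE_CALL,
        microphone_distance=speech.RecognitionMetadata.MicrophoneDistance.NEARFIELD,
        original_media_type=speech.RecognitionMetadata.OriginalMediaType.AUDIO,
        recording_device_type=speech.RecognitionMetadata.RecordingDeviceType.PHONE_LINE,
    )
    config = speech.RecognitionConfig(language_code="en-US", metadata=metadata)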

message SpeakerDiarizationConfig

cloud_speech.proto:354

Config to enable speaker diarization.

Used in: RecognitionConfig
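
A minimal sketch enabling diarization (the speaker-count bounds are placeholders); speaker tags are then reported per word in the results:

    from google.cloud import speech

    diarization = speech.SpeakerDiarizationConfig(
        enable_speaker_diarization=True,
        min_speaker_count=2,  # placeholder bounds
        max_speaker_count=6,
    )
    config = speech.RecognitionConfig(
        language_code="en-US",
        diarization_config=diarization,
    )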

message SpeechContext

cloud_speech.proto:500

Provides "hints" to the speech recognizer to favor specific words and phrases in the results.

Used in: RecognitionConfig
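
A minimal sketch of passing phrase hints (the phrases are placeholders):

    from google.cloud import speech

    config = speech.RecognitionConfig(
        language_code="en-US",
        speech_contexts=[
            speech.SpeechContext(phrases=["Kubernetes", "BigQuery"])  # placeholder hints
        ],
    )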

message SpeechRecognitionAlternative

cloud_speech.proto:708

Alternative hypotheses (a.k.a. n-best list).

Used in: SpeechRecognitionResult, StreamingRecognitionResult
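
A sketch of requesting and reading the n-best list; `max_alternatives` caps how many hypotheses come back per result (URI is a placeholder):

    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(language_code="en-US", max_alternatives=3)
    audio = speech.RecognitionAudio(uri="gs://my-bucket/audio.flac")  # placeholder
    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        for alternative in result.alternatives:
            print(alternative.confidence, alternative.transcript)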

message SpeechRecognitionResult

cloud_speech.proto:694

A speech recognition result corresponding to a portion of the audio.

Used in: LongRunningRecognizeResponse, RecognizeResponse

message StreamingRecognitionConfig

cloud_speech.proto:123

Provides information to the recognizer that specifies how to process the request.

Used in: StreamingRecognizeRequest
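
A minimal sketch of wrapping a RecognitionConfig for streaming; `interim_results` controls whether partial hypotheses are returned:

    from google.cloud import speech

    recognition_config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    )
    streaming_config = speech.StreamingRecognitionConfig(
        config=recognition_config,
        interim_results=True,
    )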

message StreamingRecognitionResult

cloud_speech.proto:656

A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.

Used in: StreamingRecognizeResponse
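
A hedged sketch of driving StreamingRecognize and reading interim versus final results. It assumes the Python client's streaming_recognize helper, which sends the streaming config as the first request; the chunked file reader is a placeholder audio source:

    from google.cloud import speech

    def audio_chunks(path, chunk_size=4096):
        """Yield raw audio bytes from a local file (placeholder source)."""
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                yield chunk

    client = speech.SpeechClient()
    streaming_config = speech.StreamingRecognitionConfig(
        config=speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=16000,
            language_code="en-US",
        ),
        interim_results=True,
    )
    requests = (
        speech.StreamingRecognizeRequest(audio_content=chunk)
        for chunk in audio_chunks("audio.raw")  # placeholder file
    )
    for response in client.streaming_recognize(config=streaming_config, requests=requests):
        for result in response.results:
            if result.is_final:
                print(result.alternatives[0].transcript)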

enum StreamingRecognizeResponse.SpeechEventType

cloud_speech.proto:626

Indicates the type of speech event.

Used in: StreamingRecognizeResponse

message WordInfo

cloud_speech.proto:728

Word-specific information for recognized words.

Used in: SpeechRecognitionAlternative
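
A sketch of getting per-word timing; `enable_word_time_offsets` must be set on the config for start and end times to be populated (URI is a placeholder):

    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        language_code="en-US",
        enable_word_time_offsets=True,
    )
    audio = speech.RecognitionAudio(uri="gs://my-bucket/audio.flac")  # placeholder
    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        for word in result.alternatives[0].words:
            print(word.word, word.start_time, word.end_time)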