package vosk.tts

rpc UtteranceSynthesis (UtteranceSynthesisRequest, stream UtteranceSynthesisResponse)
tts_service.proto:93
Synthesizing text into speech.
message UtteranceSynthesisRequest
tts_service.proto:72
- string model = 1
  The name of the model, which specifies basic synthesis functionality. Currently this field should be left empty; do not set it.
- oneof Utterance
  Text to synthesize, given in one of the supported text synthesis markups.
  - string text = 2
    Raw text (e.g. "Hello, Alice").
- repeated Hints hints = 3
  Optional hints for synthesis.
- optional AudioFormatOptions output_audio_spec = 4
  Optional. Default: 22050 Hz, linear 16-bit signed little-endian PCM, with a WAV header.
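
As an illustration of how the request fields fit together, here is a minimal sketch that builds a plain Python dict shaped like UtteranceSynthesisRequest. The helper name build_request is hypothetical and not part of the service. It assumes the dict keys follow the proto field names, so that, given stubs generated from tts_service.proto, something like google.protobuf.json_format.ParseDict could turn the dict into a real message. The model field is left unset, as the description above requires.

```python
from typing import Any, Dict, Optional


def build_request(text: str, sample_rate_hertz: Optional[int] = None) -> Dict[str, Any]:
    """Build a dict shaped like UtteranceSynthesisRequest (hypothetical helper)."""
    if not text:
        raise ValueError("text must be a non-empty string")
    # `model` is deliberately left unset: the reference says it should be empty.
    request: Dict[str, Any] = {"text": text}
    if sample_rate_hertz is not None:
        # Ask for raw PCM at an explicit rate instead of the default WAV output.
        request["output_audio_spec"] = {
            "raw_audio": {
                "audio_encoding": "LINEAR16_PCM",
                "sample_rate_hertz": sample_rate_hertz,
            }
        }
    return request


default_request = build_request("Hello, Alice")
raw_request = build_request("Hello, Alice", sample_rate_hertz=16000)
```

Omitting output_audio_spec leaves the choice of format to the server default described above. The 16000 Hz value is only an example; which rates the server accepts is not stated in this reference.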
message UtteranceSynthesisResponse
tts_service.proto:46
- optional AudioChunk audio_chunk = 1
  Part of synthesized audio.

message AudioChunk
Used in: UtteranceSynthesisResponse

- bytes data = 1
  Sequence of bytes of the synthesized audio, in the format specified in output_audio_spec.
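
Because the RPC returns a stream of responses, a client typically has to join the data payloads in arrival order before using the audio. The sketch below shows that step with the standard library only. It assumes, without confirmation from this reference, that the concatenated chunks of the default output form one valid WAV file. The response stream is simulated by splitting a small mono WAV file into pieces; join_audio_chunks and fake_stream are hypothetical names.

```python
import io
import wave
from typing import Iterable


def join_audio_chunks(chunks: Iterable[bytes]) -> bytes:
    """Concatenate the `data` payloads of streamed responses in arrival order."""
    return b"".join(chunks)


# Stand-in for a response stream: build a tiny WAV file and split it into pieces.
# Mono audio is an assumption made for this demo only.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)                   # 16-bit samples
    writer.setframerate(22050)               # default rate named in the reference
    writer.writeframes(b"\x00\x00" * 2205)   # 2205 frames of silence
payload = buffer.getvalue()
fake_stream = [payload[i:i + 1024] for i in range(0, len(payload), 1024)]

audio = join_audio_chunks(fake_stream)

# Read the joined bytes back to confirm they parse as WAV.
with wave.open(io.BytesIO(audio), "rb") as reader:
    frame_count = reader.getnframes()
    sample_rate = reader.getframerate()
```

With a real client, the iterable passed to join_audio_chunks would be the audio_chunk.data value of each streamed UtteranceSynthesisResponse.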

message AudioFormatOptions
Used in: UtteranceSynthesisRequest

oneof AudioFormat
- RawAudio raw_audio = 1
  The audio format specified in request parameters.
- ContainerAudio container_audio = 2
  The audio format specified inside the container metadata.

message ContainerAudio
Used in: AudioFormatOptions

- ContainerAudio.ContainerAudioType container_audio_type = 1

enum ContainerAudio.ContainerAudioType
Used in: ContainerAudio

- CONTAINER_AUDIO_TYPE_UNSPECIFIED = 0
- WAV = 1
  Audio with a bit depth of 16 bits, signed little-endian (linear PCM).
- OGG_OPUS = 2
  Data is encoded using the Opus audio codec and packaged in the OGG container format.
- MP3 = 3
  Data is encoded using MPEG-1/2 Layer III and stored in the MP3 format.

message Hints
Used in: UtteranceSynthesisRequest

oneof Hint
A hint for the TTS engine that specifies characteristics of the synthesized audio.
- int64 speaker_id = 1
  ID of speaker to use.
- double speech_rate = 2
  Hint to change speech rate.
- string role = 3
  Hint to specify pronunciation character for the speaker.
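
Since Hint is a oneof, each Hints message carries exactly one of these fields, and the request field hints is repeated. Setting both a speaker and a speech rate therefore takes two list entries. A minimal sketch of that shape, using plain dicts keyed by the proto field names (build_hints is a hypothetical helper, and the values shown are illustrative only):

```python
from typing import Any, Dict, List, Optional


def build_hints(
    speaker_id: Optional[int] = None,
    speech_rate: Optional[float] = None,
    role: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Return one single-field dict per hint, mirroring the `Hint` oneof."""
    candidates = {"speaker_id": speaker_id, "speech_rate": speech_rate, "role": role}
    # Each supplied value becomes its own entry, so no entry sets two oneof fields.
    return [{name: value} for name, value in candidates.items() if value is not None]


hints = build_hints(speaker_id=2, speech_rate=1.2)
```

The valid ranges for speaker_id and speech_rate, and the accepted role strings, depend on the deployed model and are not given in this reference.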

message RawAudio
Used in: AudioFormatOptions

- RawAudio.AudioEncoding audio_encoding = 1
  Encoding type.
- int64 sample_rate_hertz = 2
  Sampling frequency of the signal, in hertz.

enum RawAudio.AudioEncoding
Used in: RawAudio

- AUDIO_ENCODING_UNSPECIFIED = 0
- LINEAR16_PCM = 1
  Audio with a bit depth of 16 bits, signed little-endian (linear PCM).
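
To make the LINEAR16_PCM layout concrete: every sample occupies two bytes, least significant byte first, interpreted as a signed integer. The sketch below decodes such a buffer and derives its duration from sample_rate_hertz. Both function names are hypothetical, and the single-channel default is an assumption, since this reference does not state the channel count.

```python
import struct
from typing import List


def decode_linear16(data: bytes) -> List[int]:
    """Unpack 16-bit signed little-endian samples into Python ints."""
    if len(data) % 2:
        raise ValueError("LINEAR16 data must contain an even number of bytes")
    count = len(data) // 2
    return list(struct.unpack(f"<{count}h", data))


def duration_seconds(data: bytes, sample_rate_hertz: int, channels: int = 1) -> float:
    """Duration of raw 16-bit PCM: bytes / (2 bytes * channels * rate)."""
    return len(data) / (2 * channels * sample_rate_hertz)


samples = decode_linear16(b"\x01\x00\xff\xff\x00\x80")  # [1, -1, -32768]
one_second = duration_seconds(b"\x00" * 44100, sample_rate_hertz=22050)  # 1.0
```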