Next available ID: 4
Next available ID: 14
This message defines a generic utterance class that holds the waveform, its transcript, alignments, and phonetic posteriorgrams.
Original waveform data, an S*C matrix, where S is the number of samples and C is the number of channels.
Sampling frequency
Text transcript
A TextGrid file serialized into a string
Full phonetic posteriorgram (PPG), a T*D matrix, where T is the number of frames and D is the number of feature dimensions.
Monophone PPG, a T*D matrix.
Phoneme-level alignment, in frames
Word-level alignment, in frames
Frame-level alignment; the symtable is required to decode it
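The matrix conventions above (S*C for the waveform, T*D for a PPG) can be sketched as a small container with shape checks. This is only an illustration: the names `UtteranceSketch`, `wav`, `fs`, `text`, and `ppg` are hypothetical, since the field names are not shown here, and the sampling frequency is assumed to be in Hz.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Row-major matrix: one inner list per row.
Matrix = List[List[float]]


def matrix_shape(m: Matrix) -> Tuple[int, int]:
    """Return (rows, cols); raise ValueError if the rows differ in length."""
    if not m:
        return (0, 0)
    cols = len(m[0])
    if any(len(row) != cols for row in m):
        raise ValueError("matrix rows have unequal length")
    return (len(m), cols)


@dataclass
class UtteranceSketch:
    # Hypothetical field names; only the descriptions come from the schema.
    wav: Matrix = field(default_factory=list)  # S x C: samples by channels
    fs: int = 0                                # sampling frequency, assumed Hz
    text: str = ""                             # text transcript
    ppg: Matrix = field(default_factory=list)  # T x D: frames by dimensions

    def duration_seconds(self) -> float:
        """Duration is the number of samples S divided by the sampling frequency."""
        num_samples, _ = matrix_shape(self.wav)
        if self.fs <= 0:
            raise ValueError("sampling frequency must be positive")
        return num_samples / self.fs


# Usage: 8 stereo samples at 4 Hz last 2 seconds.
utt = UtteranceSketch(wav=[[0.0, 0.0] for _ in range(8)], fs=4, text="hi")
print(matrix_shape(utt.wav), utt.duration_seconds())
```

Storing samples as rows keeps the first axis as time for both the waveform and the frame-level features, which matches the S*C and T*D descriptions.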
Next available ID: 4
Next available ID: 4
Next available ID: 4
Next available ID: 4
Kaldi-related settings.
Next available ID: 6
Original wave file location
Next available ID: 11
General American English
Mandarin-accent English
Mid-/South-America-Spanish-accent English
Spain-Spanish-accent English
Arabic-accent English
Korean-accent English
Indian English
Vietnamese-accent English
Canadian English
British English
Scottish English
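The accent list above can be mirrored as an enumeration. The sketch below assumes zero-based numbering in the listed order, which is consistent with "Next available ID: 11" for eleven values but is not confirmed by the schema; all member names are hypothetical.

```python
from enum import IntEnum


class AccentSketch(IntEnum):
    # Hypothetical names and numbering; only the descriptions come from the schema.
    GENERAL_AMERICAN = 0
    MANDARIN = 1
    LATIN_AMERICAN_SPANISH = 2  # Mid-/South-America-Spanish-accent English
    SPAIN_SPANISH = 3
    ARABIC = 4
    KOREAN = 5
    INDIAN = 6
    VIETNAMESE = 7
    CANADIAN = 8
    BRITISH = 9
    SCOTTISH = 10


def next_available_id(enum_cls) -> int:
    """Smallest ID above every value in use, as tracked by the schema comments."""
    return max(member.value for member in enum_cls) + 1


print(next_available_id(AccentSketch))
```

Tracking the next free ID in a comment helps avoid reusing a number when a value is added later.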
Next available ID: 3
Male
Female
Other
Next available ID: 5
Next available ID: 16
Alpha for MCEP analysis
F0 search range: floor
F0 search range: ceil
Analysis timestamp
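The analysis settings listed above lend themselves to a simple validity check before they are stored. The following is a minimal sketch, assuming the MCEP alpha is the all-pass frequency-warping constant (which must satisfy |alpha| < 1) and that the F0 range is given in Hz. The class and field names are hypothetical, and the numbers in the usage line are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocoderSettingsSketch:
    # Hypothetical field names; only the descriptions come from the schema.
    mcep_alpha: float  # alpha for MCEP analysis
    f0_floor: float    # lower bound of the F0 search range, assumed Hz
    f0_ceil: float     # upper bound of the F0 search range, assumed Hz

    def __post_init__(self) -> None:
        # Assumption: alpha is an all-pass warping constant, so |alpha| < 1.
        if not -1.0 < self.mcep_alpha < 1.0:
            raise ValueError("mcep_alpha must lie strictly between -1 and 1")
        # The search range must be a non-empty positive interval.
        if not 0.0 < self.f0_floor < self.f0_ceil:
            raise ValueError("expected 0 < f0_floor < f0_ceil")


# Usage with illustrative values.
settings = VocoderSettingsSketch(mcep_alpha=0.42, f0_floor=71.0, f0_ceil=800.0)
print(settings)
```

Rejecting an inverted F0 range at construction time keeps a bad configuration from propagating into every analysis result that records it.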
Next available ID: 4
Spectrogram, a T*D matrix.
Mel-Frequency Cepstral Coefficients, a T*D matrix.
Mel-Cepstral Coefficients, a T*D matrix.
Next available ID: 6
Fundamental frequency
Full aperiodicity, a T*D matrix.
Band aperiodicity, a T*D matrix.
Voicing
Start time of each analysis window in seconds
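Because the start time of every analysis window is stored, a time in seconds can be mapped to a frame index with a sorted lookup. The sketch below is an assumption-laden illustration: `window_start_times` generates uniformly spaced start times purely for the example (the stored values need not be uniform), and both function names are hypothetical.

```python
from bisect import bisect_right
from typing import List


def window_start_times(num_frames: int, frame_shift_s: float) -> List[float]:
    """Illustrative start times for uniformly spaced windows (an assumption)."""
    return [i * frame_shift_s for i in range(num_frames)]


def frame_at(start_times: List[float], t: float) -> int:
    """Index of the last window that starts at or before time t (seconds).

    Only requires start_times to be sorted in ascending order.
    """
    index = bisect_right(start_times, t) - 1
    if index < 0:
        raise ValueError("t precedes the first analysis window")
    return index


# Usage: four windows spaced 0.25 s apart.
times = window_start_times(4, 0.25)
print(times, frame_at(times, 0.6))
```

The same lookup applies to the per-frame features (fundamental frequency, aperiodicity, voicing), since they share the frame axis T.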
Next available ID: 5
Contains everything from vocoder analysis.
Next available ID: 3