package eidos.audition

Interface for controlling the audio feature extractor. Next available ID: 3

optional StimulusConfig config = 1
Stimulus configuration containing input-specific parameters such as sampling rate, model-agnostic analysis parameters and so on.
optional AuditoryPipelineConfig pipeline = 2
Auditory pipeline consisting of one or more models.

Model configuration. Next available ID: 4

Used in: AuditoryPipelineConfig

AuditoryModelType model_type = 1
Model type.
AuditoryStageType stage_type = 2
Model stage type.
optional google.protobuf.Any config = 3
Concrete model configuration.

Auditory models currently supported. Next available ID: 12

Used in: AuditoryModelConfig

MODEL_UNKNOWN = 0
Placeholder for an unknown (uninitialized) model.
MODEL_MOCK_BASILAR_MEMBRANE = 1
Mock basilar membrane model.
MODEL_CARFAC = 2
Cascade of Asymmetric Resonators with Fast-Acting Compression (CARFAC). See Richard F. Lyon, "Using a Cascade of Asymmetric Resonators with Fast-Acting Compression as a Cochlear Model for Machine-Hearing Applications", Autumn Meeting of the Acoustical Society of Japan (2011), pp. 509--512.
MODEL_BAUMGARTE = 3
Peripheral ear model by Frank Baumgarte. See F. Baumgarte: "Ein psychophysiologisches Gehoermodell zur Nachbildung von Wahrnehmungsschwellen fuer die Audiocodierung", PhD Dissertation, University of Hannover, 2000.
MODEL_ZILANY_IHC_2014 = 4
Peripheral model up to and including the hair cell by Zilany et al., based on: Zilany, M.S.A., Bruce, I.C., and Carney, L.H. (2014). "Updated parameters and expanded simulation options for a model of the auditory periphery," Journal of the Acoustical Society of America. Zilany, M.S.A., Bruce, I.C., Nelson, P.C., and Carney, L.H. (2009). "A phenomenological model of the synapse between the inner hair cell and auditory nerve: Long-term adaptation with power-law dynamics," Journal of the Acoustical Society of America, 126(5): 2390-2412.
MODEL_GAMMATONE_SLANEY = 5
Gammatone filterbank model. Two implementations are available: 1. M. Slaney (1998): "Auditory Toolbox Version 2", Technical Report #1998-010, Interval Research Corporation, 1998; 2. Ning Ma's implementation of: M. Cooke (1993): "Modelling Auditory Processing and Organisation", Cambridge University Press, Series "Distinguished Dissertations in Computer Science", August. One can select either implementation via model-specific configuration.
MODEL_MEDDIS_SYNAPSE_1986 = 6
Inner hair cell synapse model by Ray Meddis et al.: - Ray Meddis (1986): "Simulation of mechanical to neural transduction in the auditory receptor", Journal of the Acoustical Society of America 79(3), 702--711. - Ray Meddis, Michael J. Hewitt, and Trevor M. Shackleton (1990): "Implementation details of a computation model of the inner hair‐cell auditory‐nerve synapse", The Journal of the Acoustical Society of America 87, 1813.
MODEL_BRUCE_SYNAPSE_2018 = 7
Inner hair cell synaptic model from Carney, Bruce and Zilany labs: Bruce, I.C., Erfani, Y., and Zilany, M.S.A. (2018). "A Phenomenological model of the synapse between the inner hair cell and auditory nerve: Implications of limited neurotransmitter release sites", Hearing research, 360, 40--54, (Special Issue on "Computational Models in Hearing").
MODEL_BRUCE_SPIKES_2018 = 8
Synapse spike generator model from Carney, Bruce and Zilany labs: Bruce, I.C., Erfani, Y., and Zilany, M.S.A. (2018). "A Phenomenological model of the synapse between the inner hair cell and auditory nerve: Implications of limited neurotransmitter release sites", Hearing research, 360, 40--54, (Special Issue on "Computational Models in Hearing").
MODEL_JACKSON_SPIKES = 9
Synapse spike generator model from Scott Jackson: Jackson BS, Carney LH (2005), "The spontaneous-rate histogram of the auditory nerve can be explained by only two or three spontaneous rates and long-range dependence", J. Assoc. Res. Otolaryngol. 6:148-159.
MODEL_ZHANG_SPIKES_2001 = 10
Synapse spike generator model from Zhang et al.: Zhang, X., Heinz, M. G., Bruce, I. C., & Carney, L. H. (2001): "A phenomenological model for the responses of auditory-nerve fibers: I. Nonlinear tuning with compression and suppression.", The Journal of the Acoustical Society of America, 109(2), 648-670.
MODEL_SUMNER_SYNAPSE_2002 = 11
Inner hair cell synapse model by Sumner et al.: Sumner, C. J., Lopez-Poveda, E. A., O'Mard, L. P. and Meddis, R. (2002): "A revised model of the inner-hair cell and auditory-nerve complex.", The Journal of the Acoustical Society of America (JASA), vol.111, no.5, pp. 2178--2188.

Types of outputs the model can produce. Next available ID: 5

OUTPUT_UNKNOWN = 0
Placeholder for an unknown output.
OUTPUT_BASILAR_MEMBRANE_DISPLACEMENT = 1
Displacement of the basilar membrane in response to the pressure stimuli. This forms the excitation for the inner hair cells.
OUTPUT_IHC_TRANSMEMBRANE_POTENTIAL = 2
Transmembrane potential across inner hair cells (afferents).
OUTPUT_SYNAPSE_FIRING_RATES = 3
Auditory nerve synapse: Firing rate probabilities.
OUTPUT_SYNAPSE_SPIKE_TIMES = 4
Auditory nerve synapse: Spike onset times.

Next available ID: 2

Used in: AuditoryFeatureExtractorConfig

repeated AuditoryModelConfig models = 1
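
As a rough illustration of how the pipeline nests model configurations, the sketch below mirrors the documented enums and messages with plain Python dataclasses. It does not use the generated protobuf classes. The enum values are copied from this page, but the pairing of a gammatone filterbank with a Meddis synapse and the `validate_pipeline` helper (including the rule that stages must not go backwards) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class AuditoryModelType(IntEnum):
    # Subset of the documented values.
    MODEL_UNKNOWN = 0
    MODEL_GAMMATONE_SLANEY = 5
    MODEL_MEDDIS_SYNAPSE_1986 = 6


class AuditoryStageType(IntEnum):
    STAGE_UNKNOWN = 0
    STAGE_BASILAR_MEMBRANE = 1
    STAGE_HAIR_CELLS = 2
    STAGE_AUDITORY_NERVE_SYNAPSE = 3


@dataclass
class AuditoryModelConfig:
    model_type: AuditoryModelType = AuditoryModelType.MODEL_UNKNOWN
    stage_type: AuditoryStageType = AuditoryStageType.STAGE_UNKNOWN


@dataclass
class AuditoryPipelineConfig:
    models: List[AuditoryModelConfig] = field(default_factory=list)


def validate_pipeline(pipeline):
    """Hypothetical check: no unknown entries, stages never go backwards."""
    if not pipeline.models:
        return False
    previous = AuditoryStageType.STAGE_UNKNOWN
    for model in pipeline.models:
        if model.model_type == AuditoryModelType.MODEL_UNKNOWN:
            return False
        if model.stage_type == AuditoryStageType.STAGE_UNKNOWN:
            return False
        if model.stage_type < previous:
            return False
        previous = model.stage_type
    return True


# Illustrative two-model pipeline: filterbank first, synapse model second.
pipeline = AuditoryPipelineConfig(models=[
    AuditoryModelConfig(AuditoryModelType.MODEL_GAMMATONE_SLANEY,
                        AuditoryStageType.STAGE_BASILAR_MEMBRANE),
    AuditoryModelConfig(AuditoryModelType.MODEL_MEDDIS_SYNAPSE_1986,
                        AuditoryStageType.STAGE_AUDITORY_NERVE_SYNAPSE),
])
print(validate_pipeline(pipeline))
```

Whether the real extractor enforces any ordering between stages is not stated here, so the check should be read as one plausible sanity test rather than documented behavior.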

"The auditory system transforms sound waves into distinct patterns of neural activity, which are then integrated with information from other sensory systems to guide behavior, including orienting movements to acoustical stimuli and intraspecies communication. The first stage of this transformation occurs at the external and middle ears, which collect sound waves and amplify their pressure, so that the sound energy in the air can be successfully transmitted to the fluid-filled cochlea of the inner ear. In the inner ear, a series of biomechanical processes occur that break up the signal into simpler, sinusoidal components, with the result that the frequency, amplitude, and phase of the original signal are all faithfully transduced by the sensory hair cells and encoded by the electrical activity of the auditory nerve fibers. One product of this process of acoustical decomposition is the systematic representation of sound frequency along the length of the cochlea, referred to as tonotopy." From "Neuroscience", Dale Purves et al. (2011). The following enum defines several stages in this processing pipeline. A model may implement one or more of these stages. Next available ID: 4

Used in: AuditoryModelConfig

STAGE_UNKNOWN = 0
Placeholder for an unknown auditory stage.
STAGE_BASILAR_MEMBRANE = 1
Basilar membrane.
STAGE_HAIR_CELLS = 2
Hair cells: inner (afferent) and outer (efferent) hair cells.
STAGE_AUDITORY_NERVE_SYNAPSE = 3
Auditory nerve (AN) synapse.

This message consists of several sub-configurations that are parsed and converted to the configuration structs required by the CARFAC API. This is cumbersome and could be avoided if the CARFAC implementation used protocol buffers. Next available ID: 6

optional CarfacConfig.CarParams car = 1
The Cascade of Asymmetric Resonators (CAR).
optional CarfacConfig.IhcParams ihc = 2
Inner Hair Cell (IHC) filter params.
optional CarfacConfig.AgcParams agc = 3
Automatic Gain Control (AGC) parameters.
optional CarfacConfig.OutputTypes output_types = 4
Signal types to save.
bool agc_open_loop = 5
Enabling <agc_open_loop> breaks the AGC feedback loop, making the filters linear. The normal value is false, in which case feedback from the output level controls the filter damping, giving a compressive amplitude characteristic that reduces the output dynamic range.

Automatic gain control (AGC) parameters for designing AGC filters. Next available ID: 5

Used in: CarfacConfig

int32 num_stages = 1
If zero, the AGC is disabled.
double agc_stage_gain = 2
double agc_mix_coeff = 3
repeated double time_constants = 4
repeated int32 decimation = 5
repeated double agc1_scales = 6
repeated double agc2_scales = 7
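
The only documented rule for these fields is that a zero `num_stages` disables the AGC. The sketch below shows one way a caller might sanity-check the message before use. It assumes that each repeated field carries exactly one entry per AGC stage, which this page does not state. The `AgcParams` dataclass is a plain Python stand-in for the generated class, `check_agc_params` is a hypothetical helper, and the numeric values are arbitrary placeholders rather than CARFAC defaults.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgcParams:
    num_stages: int = 0
    agc_stage_gain: float = 0.0
    agc_mix_coeff: float = 0.0
    time_constants: List[float] = field(default_factory=list)
    decimation: List[int] = field(default_factory=list)
    agc1_scales: List[float] = field(default_factory=list)
    agc2_scales: List[float] = field(default_factory=list)


def agc_enabled(params):
    """The AGC is disabled when num_stages is zero."""
    return params.num_stages > 0


def check_agc_params(params):
    """Returns a list of problems found; an empty list means no problems."""
    problems = []
    if params.num_stages < 0:
        problems.append("num_stages must be non-negative")
    if not agc_enabled(params):
        return problems
    # Assumption: one entry per stage in every repeated field.
    for name in ("time_constants", "decimation", "agc1_scales", "agc2_scales"):
        count = len(getattr(params, name))
        if count != params.num_stages:
            problems.append(
                f"{name} has {count} entries, expected {params.num_stages}")
    return problems


# Placeholder values for a two-stage AGC.
params = AgcParams(
    num_stages=2,
    agc_stage_gain=2.0,
    agc_mix_coeff=0.5,
    time_constants=[0.002, 0.008],
    decimation=[8, 2],
    agc1_scales=[1.0, 1.4],
    agc2_scales=[1.6, 2.3],
)
print(check_agc_params(params))
```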

Parameters required to design the set of coefficients implementing 'The Cascade of Asymmetric Resonators' (CAR). Next available ID: 12

Used in: CarfacConfig

double velocity_scale = 1
Used for the velocity nonlinearity.
double v_offset = 2
The offset gives us the quadratic part.
double min_zeta = 3
The minimum damping factor in mid-freq channels.
double max_zeta = 4
The maximum damping factor in mid-freq channels.
double first_pole_theta = 5
double zero_ratio = 6
This is how far zero is above the pole.
double high_f_damping_compression = 7
A range from 0 to 1 to compress theta.
double erb_per_step = 8
double min_pole_hz = 9
double erb_break_freq = 10
double erb_q = 11

Inner hair cell (IHC) parameters, which are used to design the IHC filters. Next available ID: 9

Used in: CarfacConfig

bool just_half_wave_rectify = 1
bool one_capacitor = 2
double tau_lpf = 3
double tau1_out = 4
double tau1_in = 5
double tau2_out = 6
double tau2_in = 7
double ac_corner_hz = 8

Message with the flags indicating which signals to store. Next available ID: 5

Used in: CarfacConfig

bool store_nap = 1
Store Neural Activity Patterns (NAP).
bool store_bm = 2
Store Basilar Membrane (BM) displacements.
bool store_ohc = 3
Store Outer Hair Cell (OHC) signals.
bool store_agc = 4
Store Automatic Gain Control (AGC) signals.

Next available ID: 2

GammatoneFilterbankType filter_type = 1
Type of the filterbank.

Used in: GammatoneFilterbankConfig

GAMMATONE_FILTERBANK_SLANEY = 0
Malcolm Slaney's filterbank. See - M. Slaney (Apple TR #35), "An Efficient Implementation of the Patterson-Holdsworth Cochlear Filter Bank.", 33-34.
GAMMATONE_FILTERBANK_COOKE_AND_MA = 1
This gammatone filter is based on the implementation by Ning Ma from University of Sheffield who, in turn, based his implementation on an original algorithm from Martin Cooke's Ph.D thesis (Cooke, 1993) using the base-band impulse invariant transformation. See - http://www.dcs.shef.ac.uk/~ning/resources/gammatone

Particular models may choose to ignore some of these parameters. For example, CARFAC computes its number of output channels adaptively and therefore ignores the <num_channels> parameter in this message. Next available ID: 14

Used in: AuditoryFeatureExtractorConfig

int32 sample_rate = 1
Input sampling rate.
float audio_scaling_gain = 2
Audio scaling gain. By default we scale the waveform to [-1.0,1.0] range. To these values we can also apply a gain factor. Note from Dick Lyon on CARFAC: "be aware that the -1 to 1 range, if used fully, represents very loud sound. For "normal" level, you probably want to throw in a gain of 0.01 to 0.1 on top of the mapping from int16 to that range".
int32 downsample_step = 3
Downsample the output by keeping every <n>-th sample. Some implementations prefer this value over customizing the output sample rate.
bool store_intermediate_outputs = 4
Store outputs from the intermediate stages of processing. For example, if the pipeline consists of basilar membrane followed by a synaptic model, the outputs of both stages are stored.
int32 num_channels = 5
Number of channels (frequency bands) for analysis corresponding to <n> equidistant locations along the cochlea. If unset, model-specific defaults will be used.
float lowest_cf_hz = 6
Lowest characteristic frequency (CF) for analysis (in Hz). If unset, model-specific defaults will be used.
float highest_cf_hz = 7
Highest characteristic frequency (CF) for analysis (in Hz). If unset, model-specific defaults will be used.
int32 output_resample_up_factor = 8
Resample the response: Upsampling factor <p>. The signal is resampled by <p/q>, where <q> is the downsampling factor.
int32 output_resample_down_factor = 9
Resample the response: Downsampling factor <q>. The signal is resampled by <p/q>, where <p> is the upsampling factor.
bool apply_window_to_outputs = 10
If enabled, applies a windowing function to the response. By default no windowing is applied and the response contains the original number of stimulus samples.
float window_duration_sec = 11
Window (also frame) duration (in seconds).
float frame_shift_sec = 12
Frame shift (in seconds). After computing each frame, advance to the next frame by this amount.
WindowFunction window_function = 13
Type of the windowing function (if applying windowing).
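
Several of the fields above describe simple arithmetic on the stimulus or the response, which the following Python sketch tries to make concrete for a single channel. It is not the extractor's implementation. The helper names are hypothetical, the int16 divisor of 32768 is an assumed mapping to the [-1.0, 1.0] range, and the choices to round frame sizes to whole samples and to drop a trailing partial frame are also assumptions.

```python
from fractions import Fraction


def scale_int16(samples, audio_scaling_gain=1.0):
    """Maps int16 samples to roughly [-1.0, 1.0] and applies an extra gain.

    Dividing by 32768 is an assumed int16 mapping, not taken from the source.
    """
    return [s / 32768.0 * audio_scaling_gain for s in samples]


def downsample(samples, downsample_step):
    """Keeps every downsample_step-th sample, starting with the first."""
    if downsample_step < 1:
        raise ValueError("downsample_step must be at least 1")
    return samples[::downsample_step]


def resampled_rate(sample_rate, up_factor, down_factor):
    """Returns the rate after resampling by <p/q> as an exact fraction."""
    if up_factor < 1 or down_factor < 1:
        raise ValueError("resampling factors must be at least 1")
    return Fraction(sample_rate * up_factor, down_factor)


def num_frames(num_samples, sample_rate, window_duration_sec, frame_shift_sec):
    """Counts full frames; a trailing partial frame is dropped (assumption)."""
    frame_len = round(window_duration_sec * sample_rate)
    shift = round(frame_shift_sec * sample_rate)
    if frame_len < 1 or shift < 1:
        raise ValueError("window and shift must span at least one sample")
    if num_samples < frame_len:
        return 0
    return 1 + (num_samples - frame_len) // shift


# Gain of 0.1 on top of the int16 mapping, as suggested in the note above.
scaled = scale_int16([16384, -32768], audio_scaling_gain=0.1)
kept = downsample(list(range(10)), 3)            # [0, 3, 6, 9]
rate = resampled_rate(16000, 1, 2)               # 8000 Hz
frames = num_frames(16000, 16000, 0.025, 0.010)  # 98 frames in one second
print(scaled, kept, rate, frames)
```

With a 25 ms window and a 10 ms shift at 16 kHz, each frame spans 400 samples and consecutive frames start 160 samples apart, which is where the count of 98 full frames for one second of audio comes from.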

Calcium conductance mode as defined in DSAM. Next available ID: 2

Used in: Sumner2002HairCellSynapseConfig

SUMNER2002_CA_COND_MODE_ORIGINAL = 0
As defined in the paper.
SUMNER2002_CA_COND_MODE_REVISED = 1

Cleft replenishment mode as defined in DSAM. Next available ID: 2

Used in: Sumner2002HairCellSynapseConfig

SUMNER2002_CLEFT_REPLENISH_MODE_ORIGINAL = 0
As defined in the paper.
SUMNER2002_CLEFT_REPLENISH_MODE_UNITY = 1

Next available ID: 7

Sumner2002CaCondMode ca_cond_mode = 1
Calcium conductance mode.
Sumner2002CleftReplenishMode cleft_replenish_mode = 2
Cleft replenishment mode.
bool output_spikes = 3
If enabled, outputs spike rates, otherwise outputs probabilities.
double g_ca_max = 4
Maximum calcium conductance (in Siemens units).
double perm_ca_0 = 5
Calcium threshold concentration.
int32 max_free_pool_m = 6
Maximum number of transmitter packets (quanta) in free pool.

Next available ID: 5

int32 num_channels = 1
Number of channels.
int32 sample_rate = 2
Sample rate (in Hz).
int32 bits_per_sample = 3
Number of bits per sample.
repeated float samples = 4
Samples.

Type of the windowing function. For various types of functions see "Discrete-time Signal Processing" by Alan V. Oppenheim and Ronald W. Schafer (1989). Next available ID: 3

Used in: StimulusConfig

WINDOW_FUNCTION_NONE = 0
No windowing function to apply.
WINDOW_FUNCTION_HANN = 1
Hann window (see https://en.wikipedia.org/wiki/Hann_function).
WINDOW_FUNCTION_HAMMING = 2
Hamming window. See: https://en.wikipedia.org/wiki/Window_function#Hamming_window.
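
For reference, the sketch below computes coefficients for the three documented window types in plain Python. It uses the common symmetric textbook definitions, with Hamming coefficients 0.54 and 0.46. Whether the extractor uses the symmetric or periodic form, or these exact constants, is not stated here, so treat those details as assumptions. `window_coefficients` is a hypothetical helper, and returning all ones for `WINDOW_FUNCTION_NONE` is likewise an illustrative choice.

```python
import math
from enum import IntEnum


class WindowFunction(IntEnum):
    WINDOW_FUNCTION_NONE = 0
    WINDOW_FUNCTION_HANN = 1
    WINDOW_FUNCTION_HAMMING = 2


def window_coefficients(kind, length):
    """Returns `length` coefficients of a symmetric window."""
    if length < 1:
        raise ValueError("length must be at least 1")
    if kind == WindowFunction.WINDOW_FUNCTION_NONE or length == 1:
        return [1.0] * length
    if kind == WindowFunction.WINDOW_FUNCTION_HANN:
        a0, a1 = 0.5, 0.5
    elif kind == WindowFunction.WINDOW_FUNCTION_HAMMING:
        a0, a1 = 0.54, 0.46
    else:
        raise ValueError(f"unsupported window function: {kind}")
    # w[n] = a0 - a1 * cos(2 * pi * n / (N - 1)), for n = 0 .. N - 1.
    return [a0 - a1 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


hann = window_coefficients(WindowFunction.WINDOW_FUNCTION_HANN, 5)
hamming = window_coefficients(WindowFunction.WINDOW_FUNCTION_HAMMING, 5)
print([round(w, 3) for w in hann])
print([round(w, 3) for w in hamming])
```

The Hann window tapers to zero at both ends, while the Hamming window stops at 0.08, which is the main practical difference between the two.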

Next available ID: 2

int32 num_fibers = 1
Collect responses for the fiber population of the supplied size.

message AuditoryFeatureExtractorConfig

optional StimulusConfig config = 1

optional AuditoryPipelineConfig pipeline = 2

message AuditoryModelConfig

AuditoryModelType model_type = 1

AuditoryStageType stage_type = 2

optional google.protobuf.Any config = 3

enum AuditoryModelType

MODEL_UNKNOWN = 0

MODEL_MOCK_BASILAR_MEMBRANE = 1

MODEL_CARFAC = 2

MODEL_BAUMGARTE = 3

MODEL_ZILANY_IHC_2014 = 4

MODEL_GAMMATONE_SLANEY = 5

MODEL_MEDDIS_SYNAPSE_1986 = 6

MODEL_BRUCE_SYNAPSE_2018 = 7

MODEL_BRUCE_SPIKES_2018 = 8

MODEL_JACKSON_SPIKES = 9

MODEL_ZHANG_SPIKES_2001 = 10

MODEL_SUMNER_SYNAPSE_2002 = 11

enum AuditoryOutputType

OUTPUT_UNKNOWN = 0

OUTPUT_BASILAR_MEMBRANE_DISPLACEMENT = 1

OUTPUT_IHC_TRANSMEMBRANE_POTENTIAL = 2

OUTPUT_SYNAPSE_FIRING_RATES = 3

OUTPUT_SYNAPSE_SPIKE_TIMES = 4

message AuditoryPipelineConfig

repeated AuditoryModelConfig models = 1

enum AuditoryStageType

STAGE_UNKNOWN = 0

STAGE_BASILAR_MEMBRANE = 1

STAGE_HAIR_CELLS = 2

STAGE_AUDITORY_NERVE_SYNAPSE = 3

message CarfacConfig

optional CarfacConfig.CarParams car = 1

optional CarfacConfig.IhcParams ihc = 2

optional CarfacConfig.AgcParams agc = 3

optional CarfacConfig.OutputTypes output_types = 4

bool agc_open_loop = 5

message CarfacConfig.AgcParams

int32 num_stages = 1

double agc_stage_gain = 2

double agc_mix_coeff = 3

repeated double time_constants = 4

repeated int32 decimation = 5

repeated double agc1_scales = 6

repeated double agc2_scales = 7

message CarfacConfig.CarParams

double velocity_scale = 1

double v_offset = 2

double min_zeta = 3

double max_zeta = 4

double first_pole_theta = 5

double zero_ratio = 6

double high_f_damping_compression = 7

double erb_per_step = 8

double min_pole_hz = 9

double erb_break_freq = 10

double erb_q = 11

message CarfacConfig.IhcParams

bool just_half_wave_rectify = 1

bool one_capacitor = 2

double tau_lpf = 3

double tau1_out = 4

double tau1_in = 5

double tau2_out = 6

double tau2_in = 7

double ac_corner_hz = 8

message CarfacConfig.OutputTypes

bool store_nap = 1

bool store_bm = 2

bool store_ohc = 3

bool store_agc = 4

message GammatoneFilterbankConfig

GammatoneFilterbankType filter_type = 1

enum GammatoneFilterbankType

GAMMATONE_FILTERBANK_SLANEY = 0

GAMMATONE_FILTERBANK_COOKE_AND_MA = 1

message StimulusConfig