Proto commits in triton-inference-server/common

The Protocol Buffers files changed in the following 74 commits:

Commit:b0b5387
Author:Sai Kiran Polisetty
Committer:GitHub

doc: Enforce `max_inflight_requests` as a shared limit across ensemble requests (#152)

The documentation is generated from this commit.

Commit:6a318ca
Author:Sai Kiran Polisetty
Committer:GitHub

feat: Add support for `max_inflight_requests` parameter to prevent unbounded memory growth in ensemble models (#141)
Co-authored-by: Yingge He <157551214+yinggeh@users.noreply.github.com>

Commit:bfb7c4d
Author:Indrajit Bhosale

Pre-Commit fix

Commit:86e510a
Author:Indrajit Bhosale

Draft for ModelInfer

Commit:54618ab
Author:Indrajit Bhosale

Draft for ModelInfer

Commit:95197f1
Author:Indrajit Bhosale

Create New service for callback

Commit:7478ed9
Author:Indrajit Bhosale

Create New service for callback

Commit:4843947
Author:Indrajit Bhosale

Create New service for callback

Commit:3948525
Author:Yingge He
Committer:GitHub

feat: Per-model metric customization (#126)

Commit:15f7227
Author:fpetrini15

Test

Commit:2e9cb9a
Author:Sai Kiran Polisetty
Committer:GitHub

Fix shape and reformat free tensor handling in the input byte size check (#125)
* Update model_config.proto

Commit:00b3a71
Author:Jacky
Committer:GitHub

Add cancellation into response statistics (#113)

Commit:bf4b163
Author:Jacky
Committer:GitHub

Add response statistics (#112)
* Add response stats to protobuf
* Remove mentioning decoupled on comments

Commit:a506fbe
Author:Francesco Petrini
Committer:GitHub

Support Double-Type Infer/Response Parameters
* Support Double-Type Infer/Response Parameters

Commit:00a4288
Author:Jacky
Committer:GitHub

Add runtime to model configuration (#103)
* Add runtime to model config
* Update copyright

Commit:a8a7341
Author:Iman Tabrizian
Committer:GitHub

Generative ->Iterative (#107) (#108)
* name change
* updated language
* updated with default value
* updated language
Co-authored-by: Neelay Shah <neelays@nvidia.com>

Commit:3ecedb0
Author:Neelay Shah
Committer:GitHub

Generative ->Iterative (#107)
* name change
* updated language
* updated with default value
* updated language

Commit:805dbcf
Author:Iman Tabrizian
Committer:Misha Chornyi

Add options for growable memory and single state buffers (#104)
* Add same input/output bstate buffer option
* Add an option for using GrowableMemory
* Review comments
* Format
* Review comments
* Review comment
* Fix description

Commit:9f8c873
Author:Iman Tabrizian
Committer:GitHub

Add options for growable memory and single state buffers (#104)
* Add same input/output bstate buffer option
* Add an option for using GrowableMemory
* Review comments
* Format
* Review comments
* Review comment
* Fix description

Commit:adef772
Author:GuanLuo
Committer:GitHub

Add new sequence batcher parameter for generative sequence (#102)

Commit:468eb21
Author:dyastremsky
Committer:GitHub

Add GitHub action to format and lint code (#96)
* Add pre-commit hook
* Run commit hooks, remove ignored word list
* Add GitHub action
* Add Java to Clang
* Fix pre-commit to include all Python files
* Remove old formatter
* Remove unused skipped files
* Remove codeql because no more Python
* Add more pre-commit filetype checkers
* Trim whitespace hook
* Remove unnecessary dependency
* Add mixed-line-ending and case-conflicts checks
* Add copyright
* Update max-line-length
* Remove unnecessary line
* End of file
* Fix comment
* Add and apply isort
* Remove duplicate copyrights, add hooks link
* Pin workflow Ubuntu version
* Flake8 Black style, move Flake8 conf to toml
* Alphabetize configs by tool
* Move flake8 back into pre-commit-config
* Restore clang-format file
* Eof newline
* Fix yaml spacing
* Normalize spacing
* Normalize config indentation
* Update line limit in clang-format to 80 chars
* Update workflows to run on every PR
* Run pre-commit

Commit:072ad13
Author:Tanmay Verma
Committer:GitHub

Add preserve_ordering field to oldest strategy in sequence scheduler config (#97) (#98)
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

Commit:a2de06f
Author:Ryan McCormick
Committer:GitHub

Add preserve_ordering field to oldest strategy in sequence scheduler config (#97)

Commit:1df32b9
Author:dyastremsky
Committer:GitHub

Auto-format (#95)

Commit:7ff0105
Author:Neelay Shah
Committer:GitHub

Updating Service and Model Config Protobuf for uint64 Request Priority (#93)
Co-authored-by: qmas <q.massoz@evs.com>

Commit:869bf83
Author:Neelay Shah
Committer:GitHub

Revert "Updating Service and Model Config Protobuf for uint64 Request Priority" (#92)
This reverts commit e3048594e2ed6d7532099c80b8fb26ec42dd7fe9.

Commit:d1ac878
Author:nnshah1

Revert "Updating Service and Model Config Protobuf for uint64 Request Priority"
This reverts commit e3048594e2ed6d7532099c80b8fb26ec42dd7fe9.

Commit:e304859
Author:Neelay Shah
Committer:GitHub

Updating Service and Model Config Protobuf for uint64 Request Priority
* change priority from uint32 to uint64 in model_config
* add uint64 and double types to inference parameters
Co-authored-by: qmas <q.massoz@evs.com>

Commit:34a0f79
Author:nnshah1

updated with documentation on support for double and uint64

Commit:b0d13a2
Author:Neelay Shah

adding uint64 and double param to infer parameter.

Commit:31004d0
Author:Neelay Shah
Committer:GitHub

change priority from uint32 to uint64 in model_config
Co-authored-by: qmas <q.massoz@evs.com>

Commit:501aa75
Author:GuanLuo
Committer:GitHub

Add memory usage report in GRPC statistic service (#88)
* Update GRPC service proto
* Fix type
* Fix type

Commit:f9904d9
Author:GuanLuo
Committer:GitHub

Update documentation for "platform" (#89)

Commit:974998c
Author:GuanLuo
Committer:GitHub

Add reserved namespace field in ensemble step (#81)

Commit:7b37a24
Author:dyastremsky
Committer:GitHub

Add protobuf for GRPC health check (#80)
* Draft health service
* Formatting
* Clean up
* Change build order
* Add health proto to targets
* Change ordering
* Reordering build
* Add comments
* Copyrights, formatting
* Keep implemented methods
* Remove Python health executables
* Rename health library
* Naming

Commit:c06c43b
Author:Iman Tabrizian
Committer:GitHub

Improve the documentation for input_data_file. (#76)

Commit:cb62c76
Author:kthui
Committer:Misha Chornyi

Revert per response metrics

Commit:050e5ba
Author:kthui
Committer:GitHub

Revert per response metrics (#74)

Commit:b018b65
Author:Iman Tabrizian
Committer:GitHub

Add response statistics to GRPC frontend (#71)
* Add response statistics to GRPC frontend
* Improve docs
* Improve comments
* Add no response count
* Improve documentation clarity
Co-authored-by: kthui <18255193+kthui@users.noreply.github.com>

Commit:58a25d1
Author:Iman Tabrizian

Update documentation for execution accelerators

Commit:d401744
Author:Francesco Petrini
Committer:GitHub

Incorporating Dynamic Logging (#70)
* Migrating Changes
* New line
* Add comments

Commit:051c706
Author:GuanLuo
Committer:GitHub

Add 'count' field for warmup (#61)
* Add 'repeat_count' field for warmup
* Address comment
* Change "repeat_count" to "count"

Commit:976afde
Author:GuanLuo
Committer:GitHub

Extend GRPC ModelRepositoryParameter to allow bytes (#51)

Commit:2e51208
Author:Ryan McCormick
Committer:GitHub

Add TYPE_BF16 scaffolding (#49)
* TYPE_BF16 scaffolding
* Add note on BF16 datatype requiring use raw contents

Commit:fc2f0a6
Author:GuanLuo
Committer:GitHub

Add batch input item shape specification (#43)
* Add batch input item shape specification
* Fix copyright
* Address comment

Commit:b9099c4
Author:Ryan McCormick
Committer:GitHub

Add cache_miss to grpc stub (#42)
* Add cache_miss to grpc stub
* Update 2022 copyright header
* Review comments

Commit:59c891c
Author:GuanLuo
Committer:GitHub

Extend load API in GRPC service (#41)
* Extend load API
* Fix copyright

Commit:b1ef9c1
Author:Ryan McCormick
Committer:GitHub

Update GRPC stub to include cache stats (#37)
* Add cache_hit stat to common GRPC protobuf
* Update GRPC proto to match server/docs/protocol/extension_statistics.md
* Add more details on cache hits per review feedback
* Add more details to 'cache_hit' field and refer to it in the 'compute_*' fields

Commit:b7e11ba
Author:GuanLuo
Committer:GitHub

Add GRPC trace service (#40)
* Add GRPC trace service
* Fix up
* Address comment
* Expose JSON null check
* Address comment
* Address comment

Commit:65dec4c
Author:CoderHam

map cannot be in oneof - create new message for map

Commit:09b6735
Author:CoderHam

fix TensorStructure def

Commit:481d507
Author:CoderHam

review edits

Commit:f2e67ed
Author:CoderHam

cleanup

Commit:8f6ee44
Author:CoderHam

test

Commit:3a8e7a3
Author:CoderHam

Add TensorStructure field for I/O

Commit:c009eeb
Author:Iman Tabrizian
Committer:GitHub

Add state initialization setting to model config protobuf (#36)
* Add state initialization setting to model config protobuf
* Review edit
* Remove nested metadata

Commit:f939abe
Author:GuanLuo
Committer:GitHub

Add optional field in ModelInput message (#35)
* Add optional field in ModelInput message
* Fix comment

Commit:dc3cbd2
Author:Iman Tabrizian
Committer:Iman Tabrizian

Review edit

Commit:175e2d5
Author:Iman Tabrizian
Committer:Iman Tabrizian

Add state initialization setting to model config protobuf

Commit:e8c269d
Author:deadeyegoodwin
Committer:GitHub

Fix GRPC protocol error. KServer protocol specifies 'bytes_contents' (#34)

Commit:cc58c85
Author:Iman Tabrizian
Committer:GitHub

Add state description to model config (#28)
* Add state description to the protobuf
* Review edits

Commit:893d3c1
Author:Tanmay Verma
Committer:GitHub

Add response cache enable setting in model config (#30)
* Add response cache enable setting in model config
* Format fix
* Use composite message for response cache settings

Commit:fe7e548
Author:Tanmay Verma
Committer:GitHub

Add clarification for rate limiter config priority (#29)

Commit:e726d90
Author:David Goodwin

Remove some legacy 'custom backend' references

Commit:86f1931
Author:Tanmay Verma
Committer:GitHub

Document memory impact of the output_copy_stream (#27)

Commit:6b6e981
Author:Ashwini Khade

bug fix

Commit:5ab636c
Author:Ashwini Khade

add more configuration params for ORT

Commit:ce91438
Author:Kris Hung
Committer:GitHub

Extend START, END, READY controls to allow BOOL type (#22)
* Add bool type
* Update identifier
* Update identifier
Co-authored-by: Kris Hung <krish@krish-dt.nvidia.com>

Commit:2492327
Author:GuanLuo
Committer:GitHub

Add host policy field (#17)
* Add numa id field
* Enforce the NUMA id to be the same as GPU id for GPU instance
* Modify to "host_policy" as a more general approach
* Address comment
* Fix rebase artifact

Commit:a0e3d6d
Author:Hemant Jain
Committer:GitHub

Add support for DLA/secondary device specification (#18)
* Add support for DLA/secondary device specification
* Address review comments
* Improve description and other cleanup

Commit:996299e
Author:GuanLuo
Committer:GitHub

Add 'passive' field in ModelInstanceGroup (#16)

Commit:47f791e
Author:deadeyegoodwin
Committer:GitHub

Integrate minor doc changes (#15)

Commit:011b7ac
Author:David Goodwin
Committer:deadeyegoodwin

Move protobuf to common

Commit:feaebe7
Author:David Goodwin
Committer:deadeyegoodwin

Integrate change from triton-inference-server/server repo
> e2208d2dd5effd0 src/core/grpc_service.proto
> commit 09271f9c4d4d935bd9667dd2be2208d2dd5effd0
> Author: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
> Date: Mon Apr 5 09:48:32 2021 -0700
>
> Add end point to unload model and its dependents (#2684)