The Protocol Buffers files changed in these 74 commits:
| Commit: | b0b5387 | |
|---|---|---|
| Author: | Sai Kiran Polisetty | |
| Committer: | GitHub | |
doc: Enforce `max_inflight_requests` as a shared limit across ensemble requests (#152)
The documentation is generated from this commit.
| Commit: | 6a318ca | |
|---|---|---|
| Author: | Sai Kiran Polisetty | |
| Committer: | GitHub | |
feat: Add support for `max_inflight_requests` parameter to prevent unbounded memory growth in ensemble models (#141) Co-authored-by: Yingge He <157551214+yinggeh@users.noreply.github.com>
| Commit: | bfb7c4d | |
|---|---|---|
| Author: | Indrajit Bhosale | |
Pre-Commit fix
| Commit: | 86e510a | |
|---|---|---|
| Author: | Indrajit Bhosale | |
Draft for ModelInfer
| Commit: | 54618ab | |
|---|---|---|
| Author: | Indrajit Bhosale | |
Draft for ModelInfer
| Commit: | 95197f1 | |
|---|---|---|
| Author: | Indrajit Bhosale | |
Create New service for callback
| Commit: | 7478ed9 | |
|---|---|---|
| Author: | Indrajit Bhosale | |
Create New service for callback
| Commit: | 4843947 | |
|---|---|---|
| Author: | Indrajit Bhosale | |
Create New service for callback
| Commit: | 3948525 | |
|---|---|---|
| Author: | Yingge He | |
| Committer: | GitHub | |
feat: Per-model metric customization (#126)
| Commit: | 15f7227 | |
|---|---|---|
| Author: | fpetrini15 | |
Test
| Commit: | 2e9cb9a | |
|---|---|---|
| Author: | Sai Kiran Polisetty | |
| Committer: | GitHub | |
Fix shape and reformat free tensor handling in the input byte size check (#125) * Update model_config.proto
| Commit: | 00b3a71 | |
|---|---|---|
| Author: | Jacky | |
| Committer: | GitHub | |
Add cancellation into response statistics (#113)
| Commit: | bf4b163 | |
|---|---|---|
| Author: | Jacky | |
| Committer: | GitHub | |
Add response statistics (#112) * Add response stats to protobuf * Remove mentioning decoupled on comments
| Commit: | a506fbe | |
|---|---|---|
| Author: | Francesco Petrini | |
| Committer: | GitHub | |
Support Double-Type Infer/Response Parameters * Support Double-Type Infer/Response Parameters
| Commit: | 00a4288 | |
|---|---|---|
| Author: | Jacky | |
| Committer: | GitHub | |
Add runtime to model configuration (#103) * Add runtime to model config * Update copyright
| Commit: | a8a7341 | |
|---|---|---|
| Author: | Iman Tabrizian | |
| Committer: | GitHub | |
Generative ->Iterative (#107) (#108) * name change * updated language * updated with default value * updated language Co-authored-by: Neelay Shah <neelays@nvidia.com>
| Commit: | 3ecedb0 | |
|---|---|---|
| Author: | Neelay Shah | |
| Committer: | GitHub | |
Generative ->Iterative (#107) * name change * updated language * updated with default value * updated language
| Commit: | 805dbcf | |
|---|---|---|
| Author: | Iman Tabrizian | |
| Committer: | Misha Chornyi | |
Add options for growable memory and single state buffers (#104) * Add same input/output bstate buffer option * Add an option for using GrowableMemory * Review comments * Format * Review comments * Review comment * Fix description
| Commit: | 9f8c873 | |
|---|---|---|
| Author: | Iman Tabrizian | |
| Committer: | GitHub | |
Add options for growable memory and single state buffers (#104) * Add same input/output bstate buffer option * Add an option for using GrowableMemory * Review comments * Format * Review comments * Review comment * Fix description
| Commit: | adef772 | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Add new sequence batcher parameter for generative sequence (#102)
| Commit: | 468eb21 | |
|---|---|---|
| Author: | dyastremsky | |
| Committer: | GitHub | |
Add GitHub action to format and lint code (#96)
* Add pre-commit hook
* Run commit hooks, remove ignored word list
* Add GitHub action
* Add Java to Clang
* Fix pre-commit to include all Python files
* Remove old formatter
* Remove unused skipped files
* Remove codeql because no more Python
* Add more pre-commit filetype checkers
* Trim whitespace hook
* Remove unnecessary dependency
* Add mixed-line-ending and case-conflicts checks
* Add copyright
* Update max-line-length
* Remove unnecessary line
* End of file
* Fix comment
* Add and apply isort
* Remove duplicate copyrights, add hooks link
* Pin workflow Ubuntu version
* Flake8 Black style, move Flake8 conf to toml
* Alphabetize configs by tool
* Move flake8 back into pre-commit-config
* Restore clang-format file
* Eof newline
* Fix yaml spacing
* Normalize spacing
* Normalize config indentation
* Update line limit in clang-format to 80 chars
* Update workflows to run on every PR
* Run pre-commit

| Commit: | 072ad13 | |
|---|---|---|
| Author: | Tanmay Verma | |
| Committer: | GitHub | |
Add preserve_ordering field to oldest strategy in sequence scheduler config (#97) (#98) Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
| Commit: | a2de06f | |
|---|---|---|
| Author: | Ryan McCormick | |
| Committer: | GitHub | |
Add preserve_ordering field to oldest strategy in sequence scheduler config (#97)
| Commit: | 1df32b9 | |
|---|---|---|
| Author: | dyastremsky | |
| Committer: | GitHub | |
Auto-format (#95)
| Commit: | 7ff0105 | |
|---|---|---|
| Author: | Neelay Shah | |
| Committer: | GitHub | |
Updating Service and Model Config Protobuf for uint64 Request Priority (#93) Co-authored-by: qmas <q.massoz@evs.com>
| Commit: | 869bf83 | |
|---|---|---|
| Author: | Neelay Shah | |
| Committer: | GitHub | |
Revert "Updating Service and Model Config Protobuf for uint64 Request Priority" (#92) This reverts commit e3048594e2ed6d7532099c80b8fb26ec42dd7fe9.
| Commit: | d1ac878 | |
|---|---|---|
| Author: | nnshah1 | |
Revert "Updating Service and Model Config Protobuf for uint64 Request Priority" This reverts commit e3048594e2ed6d7532099c80b8fb26ec42dd7fe9.
| Commit: | e304859 | |
|---|---|---|
| Author: | Neelay Shah | |
| Committer: | GitHub | |
Updating Service and Model Config Protobuf for uint64 Request Priority * change priority from uint32 to uint64 in model_config * add uint64 and double types to inference parameters Co-authored-by: qmas <q.massoz@evs.com>
| Commit: | 34a0f79 | |
|---|---|---|
| Author: | nnshah1 | |
updated with documentation on support for double and uint64
| Commit: | b0d13a2 | |
|---|---|---|
| Author: | Neelay Shah | |
adding uint64 and double param to infer parameter.
| Commit: | 31004d0 | |
|---|---|---|
| Author: | Neelay Shah | |
| Committer: | GitHub | |
change priority from uint32 to uint64 in model_config Co-authored-by: qmas <q.massoz@evs.com>
| Commit: | 501aa75 | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Add memory usage report in GRPC statistic service (#88) * Update GRPC service proto * Fix type * Fix type
| Commit: | f9904d9 | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Update documentation for "platform" (#89)
| Commit: | 974998c | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Add reserved namespace field in ensemble step (#81)
| Commit: | 7b37a24 | |
|---|---|---|
| Author: | dyastremsky | |
| Committer: | GitHub | |
Add protobuf for GRPC health check (#80) * Draft health service * Formatting * Clean up * Change build order * Add health proto to targets * Change ordering * Reordering build * Add comments * Copyrights, formatting * Keep implemented methods * Remove Python health executables * Rename health library * Naming
| Commit: | c06c43b | |
|---|---|---|
| Author: | Iman Tabrizian | |
| Committer: | GitHub | |
Improve the documentation for input_data_file. (#76)
| Commit: | cb62c76 | |
|---|---|---|
| Author: | kthui | |
| Committer: | Misha Chornyi | |
Revert per response metrics
| Commit: | 050e5ba | |
|---|---|---|
| Author: | kthui | |
| Committer: | GitHub | |
Revert per response metrics (#74)
| Commit: | b018b65 | |
|---|---|---|
| Author: | Iman Tabrizian | |
| Committer: | GitHub | |
Add response statistics to GRPC frontend (#71) * Add response statistics to GRPC frontend * Improve docs * Improve comments * Add no response count * Improve documentation clarity Co-authored-by: kthui <18255193+kthui@users.noreply.github.com>
| Commit: | 58a25d1 | |
|---|---|---|
| Author: | Iman Tabrizian | |
Update documentation for execution accelerators
| Commit: | d401744 | |
|---|---|---|
| Author: | Francesco Petrini | |
| Committer: | GitHub | |
Incorporating Dynamic Logging (#70) * Migrating Changes * New line * Add comments
| Commit: | 051c706 | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Add 'count' field for warmup (#61) * Add 'repeat_count' field for warmup * Address comment * Change "repeat_count" to "count"
| Commit: | 976afde | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Extend GRPC ModelRepositoryParameter to allow bytes (#51)
| Commit: | 2e51208 | |
|---|---|---|
| Author: | Ryan McCormick | |
| Committer: | GitHub | |
Add TYPE_BF16 scaffolding (#49) * TYPE_BF16 scaffolding * Add note on BF16 datatype requiring use raw contents
| Commit: | fc2f0a6 | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Add batch input item shape specification (#43) * Add batch input item shape specification * Fix copyright * Address comment
| Commit: | b9099c4 | |
|---|---|---|
| Author: | Ryan McCormick | |
| Committer: | GitHub | |
Add cache_miss to grpc stub (#42) * Add cache_miss to grpc stub * Update 2022 copyright header * Review comments
| Commit: | 59c891c | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Extend load API in GRPC service (#41) * Extend load API * Fix copyright
| Commit: | b1ef9c1 | |
|---|---|---|
| Author: | Ryan McCormick | |
| Committer: | GitHub | |
Update GRPC stub to include cache stats (#37) * Add cache_hit stat to common GRPC protobuf * Update GRPC proto to match server/docs/protocol/extension_statistics.md * Add more details on cache hits per review feedback * Add more details to 'cache_hit' field and refer to it in the 'compute_*' fields
| Commit: | b7e11ba | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Add GRPC trace service (#40) * Add GRPC trace service * Fix up * Address comment * Expose JSON null check * Address comment * Address comment
| Commit: | 65dec4c | |
|---|---|---|
| Author: | CoderHam | |
map cannot be in oneof - create new message for map
| Commit: | 09b6735 | |
|---|---|---|
| Author: | CoderHam | |
fix TensorStructure def
| Commit: | 481d507 | |
|---|---|---|
| Author: | CoderHam | |
review edits
| Commit: | f2e67ed | |
|---|---|---|
| Author: | CoderHam | |
cleanup
| Commit: | 8f6ee44 | |
|---|---|---|
| Author: | CoderHam | |
test
| Commit: | 3a8e7a3 | |
|---|---|---|
| Author: | CoderHam | |
Add TensorStructure field for I/O
| Commit: | c009eeb | |
|---|---|---|
| Author: | Iman Tabrizian | |
| Committer: | GitHub | |
Add state initialization setting to model config protobuf (#36) * Add state initialization setting to model config protobuf * Review edit * Remove nested metadata
| Commit: | f939abe | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Add optional field in ModelInput message (#35) * Add optional field in ModelInput message * Fix comment
| Commit: | dc3cbd2 | |
|---|---|---|
| Author: | Iman Tabrizian | |
| Committer: | Iman Tabrizian | |
Review edit
| Commit: | 175e2d5 | |
|---|---|---|
| Author: | Iman Tabrizian | |
| Committer: | Iman Tabrizian | |
Add state initialization setting to model config protobuf
| Commit: | e8c269d | |
|---|---|---|
| Author: | deadeyegoodwin | |
| Committer: | GitHub | |
Fix GRPC protocol error. KServer protocol specifies 'bytes_contents' (#34)
| Commit: | cc58c85 | |
|---|---|---|
| Author: | Iman Tabrizian | |
| Committer: | GitHub | |
Add state description to model config (#28) * Add state description to the protobuf * Review edits
| Commit: | 893d3c1 | |
|---|---|---|
| Author: | Tanmay Verma | |
| Committer: | GitHub | |
Add response cache enable setting in model config (#30) * Add response cache enable setting in model config * Format fix * Use composite message for response cache settings
| Commit: | fe7e548 | |
|---|---|---|
| Author: | Tanmay Verma | |
| Committer: | GitHub | |
Add clarification for rate limiter config priority (#29)
| Commit: | e726d90 | |
|---|---|---|
| Author: | David Goodwin | |
Remove some legacy 'custom backend' references
| Commit: | 86f1931 | |
|---|---|---|
| Author: | Tanmay Verma | |
| Committer: | GitHub | |
Document memory impact of the output_copy_stream (#27)
| Commit: | 6b6e981 | |
|---|---|---|
| Author: | Ashwini Khade | |
bug fix
| Commit: | 5ab636c | |
|---|---|---|
| Author: | Ashwini Khade | |
add more configuration params for ORT
| Commit: | ce91438 | |
|---|---|---|
| Author: | Kris Hung | |
| Committer: | GitHub | |
Extend START, END, READY controls to allow BOOL type (#22) * Add bool type * Update identifier * Update identifier Co-authored-by: Kris Hung <krish@krish-dt.nvidia.com>
| Commit: | 2492327 | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Add host policy field (#17) * Add numa id field * Enforce the NUMA id to be the same as GPU id for GPU instance * Modify to "host_policy" as a more general approach * Address comment * Fix rebase artifact
| Commit: | a0e3d6d | |
|---|---|---|
| Author: | Hemant Jain | |
| Committer: | GitHub | |
Add support for DLA/secondary device specification (#18) * Add support for DLA/secondary device specification * Address review comments * Improve description and other cleanup
| Commit: | 996299e | |
|---|---|---|
| Author: | GuanLuo | |
| Committer: | GitHub | |
Add 'passive' field in ModelInstanceGroup (#16)
| Commit: | 47f791e | |
|---|---|---|
| Author: | deadeyegoodwin | |
| Committer: | GitHub | |
Integrate minor doc changes (#15)
| Commit: | 011b7ac | |
|---|---|---|
| Author: | David Goodwin | |
| Committer: | deadeyegoodwin | |
Move protobuf to common
| Commit: | feaebe7 | |
|---|---|---|
| Author: | David Goodwin | |
| Committer: | deadeyegoodwin | |
Integrate change from triton-inference-server/server repo
> e2208d2dd5effd0 src/core/grpc_service.proto
> commit 09271f9c4d4d935bd9667dd2be2208d2dd5effd0
> Author: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
> Date: Mon Apr 5 09:48:32 2021 -0700
>
> Add end point to unload model and its dependents (#2684)