The Protocol Buffers files changed in the following 6 commits:
Commit: | a9cbbdc | |
---|---|---|
Author: | yjc9696 | |
Committer: | GitHub |
support MOE EP (#73)
Co-authored-by: yangjiacheng.yjc <yangjiacheng.yjc@alibaba-inc.com>
The documentation is generated from this commit.
Commit: | 163850f | |
---|---|---|
Author: | zhenglaiwen.zlw | |
Committer: | zhenglaiwen.zlw |
some bugfix
- uuid crash issue
- update lora implement
- set page size by param
- delete deprecated files
Commit: | a8b9f8e | |
---|---|---|
Author: | Jiejing Zhang | |
Committer: | zhenglaiwen.zlw |
Update For Version 2.0: add support for CUDA and VLM (#43) * release dashinfer 2.0 version thirdparty: add cutlass. python: spanattention build from source. benchmark: add stop model in the end.
Commit: | a216786 | |
---|---|---|
Author: | Jiejing Zhang | |
Committer: | Jiejing Zhang |
Update For Version 2.0: add support for CUDA and VLM (#43) * release dashinfer 2.0 version thirdparty: add cutlass. python: spanattention build from source. benchmark: add stop model in the end.
Commit: | 9ef6e35 | |
---|---|---|
Author: | zhenglaiwen.zlw | |
Committer: | zhenglaiwen.zlw |
fix memory leak bug, add default config to helper, update convert_model api
- bugfix
  - helper: check if get empty generated_elem
  - fix python input memory leak
  - avoid async copy python inputs
  - fix bug caused by inconsistent definition of RequestHandle
- engine
  - worker, model: EnqueueRequest -> StartRequestImpl
  - generation: output token_logprobs
- helper
  - add default config
  - add ConfigManager to merge and check user config
  - use torch related api only within the helper class
  - release torch model after conversion
- examples
  - cpp: erase screen before get inputs
  - py: shutdown executor after finishing tasks
  - py: use jinja template to format prompt
  - py: update ipynb basic example and corresponding doc
- doc
  - add model_type to root readme
  - update modelscope notebook pic and doc
  - update future plan in root readme
Commit: | 877529e | |
---|---|---|
Author: | Laiwen Zheng | |
Committer: | Laiwen Zheng |
add source code