The Protocol Buffers files changed in the following 28 commits:
| Commit: | fe80f53 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(server): auto max_batch_total_tokens for flash att models (#630)
The documentation is generated from this commit.
| Commit: | e74bd41 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(server): add paged attention to flash models (#516) Closes #478
| Commit: | 895c5f1 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(server): only compute prefill logprobs when asked (#406) Close #288
| Commit: | 218c9ad | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat: decrease IPC proto size (#367) Closes #307 #308
| Commit: | db2b4e0 | |
|---|---|---|
| Author: | Nicolas Patry | |
| Committer: | GitHub | |
feat(router): new healthcheck that skips the queue (#244) Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com> Co-authored-by: OlivierDehaene <olivier@huggingface.co>
| Commit: | ebc74d5 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(router): use number of tokens in batch as input for dynamic batching (#226) Co-authored-by: Nick Hill <nickhill@us.ibm.com>
| Commit: | 343437c | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(router): add device and dtype info (#215)
| Commit: | 9987960 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(router): make router input validation optional (#164)
| Commit: | 610bb1f | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(benchmark): tui based benchmarking tool (#149)
| Commit: | f000068 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(server): clear cache on error (#143)
| Commit: | b49dbf2 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
fix(server): use server tokenizer as gt (#128)
| Commit: | 1a2d682 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat: support typical sampling (#114) closes #112
| Commit: | 9b8ea6a | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(server): add logits watermark (#90)
| Commit: | 0ac184c | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(server): add special token bool (#85)
| Commit: | 20c3c59 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(router): refactor API and add openAPI schemas (#53)
| Commit: | 313194f | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat(server): support repetition penalty (#47)
| Commit: | 017a2a8 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat: Add token streaming using ServerSideEvents support (#41)
| Commit: | 54fec93 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
fix(server): fix seeding with multiple shards (#44)
| Commit: | 4f9ac67 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
Revert "feat: Add token streaming using ServerSideEvents support" (#40) Reverts huggingface/text-generation-inference#36
| Commit: | 7fbfbb0 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat: Add token streaming using ServerSideEvents support (#36)

Add token streaming using ServerSideEvents (SSE). The signature of the SSE events is:

```rust
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

struct ErrorResponse {
    error: String,
}
```
| Commit: | cd298bc | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat: Support sampling seeding (#37) Co-authored-by: Yannic Kilcher <yk@users.noreply.github.com>
| Commit: | 32a2530 | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat: Return logprobs (#8)
| Commit: | 718096f | |
|---|---|---|
| Author: | OlivierDehaene | |
| Committer: | GitHub | |
feat: Support stop sequences (#7)
| Commit: | 427d7cc | |
|---|---|---|
| Author: | OlivierDehaene | |
feat(server): Support AutoModelForSeq2SeqLM
| Commit: | c5665f5 | |
|---|---|---|
| Author: | OlivierDehaene | |
feat(server): Support generic AutoModelForCausalLM
| Commit: | f16f2f5 | |
|---|---|---|
| Author: | Olivier Dehaene | |
| Committer: | OlivierDehaene | |
v0.1.0
| Commit: | 4c693e6 | |
|---|---|---|
| Author: | Olivier Dehaene | |
Refactored gRPC interface. Added validation logic.
| Commit: | 295831a | |
|---|---|---|
| Author: | Olivier Dehaene | |
Init