Proto commits in Preemo-Inc/text-generation-inference

The Protocol Buffers files changed in these 28 commits:

2023-07-19

Commit:	fe80f53
Author:	OlivierDehaene	2023-07-19 09:31:25 +0200
Committer:	GitHub	2023-07-19 09:31:25 +0200

feat(server): auto max_batch_total_tokens for flash att models (#630)

The documentation is generated from this commit.

2023-06-30

Commit:	e74bd41
Author:	OlivierDehaene	2023-06-30 19:09:59 +0200
Committer:	GitHub	2023-06-30 19:09:59 +0200

feat(server): add paged attention to flash models (#516) Closes #478

2023-06-02

Commit:	895c5f1
Author:	OlivierDehaene	2023-06-02 17:12:30 +0200
Committer:	GitHub	2023-06-02 17:12:30 +0200

feat(server): only compute prefill logprobs when asked (#406) Close #288

2023-05-24

Commit:	218c9ad
Author:	OlivierDehaene	2023-05-24 19:19:57 +0200
Committer:	GitHub	2023-05-24 19:19:57 +0200

feat: decrease IPC proto size (#367) Closes #307 #308

2023-04-26

Commit:	db2b4e0
Author:	Nicolas Patry	2023-04-26 20:23:54 +0200
Committer:	GitHub	2023-04-26 20:23:54 +0200

feat(router): new healthcheck that skips the queue (#244)

Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
Co-authored-by: OlivierDehaene <olivier@huggingface.co>

2023-04-24

Commit:	ebc74d5
Author:	OlivierDehaene	2023-04-24 17:59:00 +0200
Committer:	GitHub	2023-04-24 17:59:00 +0200

feat(router): use number of tokens in batch as input for dynamic batching (#226) Co-authored-by: Nick Hill <nickhill@us.ibm.com>

2023-04-21

Commit:	343437c
Author:	OlivierDehaene	2023-04-21 15:36:29 +0200
Committer:	GitHub	2023-04-21 15:36:29 +0200

feat(router): add device and dtype info (#215)

2023-04-09

Commit:	9987960
Author:	OlivierDehaene	2023-04-09 20:22:27 +0200
Committer:	GitHub	2023-04-09 20:22:27 +0200

feat(router): make router input validation optional (#164)

2023-03-30

Commit:	610bb1f
Author:	OlivierDehaene	2023-03-30 15:26:27 +0200
Committer:	GitHub	2023-03-30 15:26:27 +0200

feat(benchmark): tui based benchmarking tool (#149)

2023-03-28

Commit:	f000068
Author:	OlivierDehaene	2023-03-28 11:29:35 +0200
Committer:	GitHub	2023-03-28 11:29:35 +0200

feat(server): clear cache on error (#143)

2023-03-16

Commit:	b49dbf2
Author:	OlivierDehaene	2023-03-16 12:12:26 +0100
Committer:	GitHub	2023-03-16 12:12:26 +0100

fix(server): use server tokenizer as gt (#128)

2023-03-09

Commit:	1a2d682
Author:	OlivierDehaene	2023-03-09 11:33:57 +0100
Committer:	GitHub	2023-03-09 11:33:57 +0100

feat: support typical sampling (#114) closes #112

2023-03-02

Commit:	9b8ea6a
Author:	OlivierDehaene	2023-03-02 12:30:41 +0100
Committer:	GitHub	2023-03-02 12:30:41 +0100

feat(server): add logits watermark (#90)

2023-02-24

Commit:	0ac184c
Author:	OlivierDehaene	2023-02-24 15:55:57 +0100
Committer:	GitHub	2023-02-24 15:55:57 +0100

feat(server): add special token bool (#85)

2023-02-03

Commit:	20c3c59
Author:	OlivierDehaene	2023-02-03 12:43:37 +0100
Committer:	GitHub	2023-02-03 12:43:37 +0100

feat(router): refactor API and add openAPI schemas (#53)

2023-02-01

Commit:	313194f
Author:	OlivierDehaene	2023-02-01 15:58:42 +0100
Committer:	GitHub	2023-02-01 15:58:42 +0100

feat(server): support repetition penalty (#47)

2023-01-31

Commit:	017a2a8
Author:	OlivierDehaene	2023-01-31 17:04:00 +0100
Committer:	GitHub	2023-01-31 17:04:00 +0100

feat: Add token streaming using ServerSideEvents support (#41)

Commit:	54fec93
Author:	OlivierDehaene	2023-01-31 16:01:15 +0100
Committer:	GitHub	2023-01-31 16:01:15 +0100

fix(server): fix seeding with multiple shards (#44)

Commit:	4f9ac67
Author:	OlivierDehaene	2023-01-31 14:21:51 +0100
Committer:	GitHub	2023-01-31 14:21:51 +0100

Revert "feat: Add token streaming using ServerSideEvents support" (#40) Reverts huggingface/text-generation-inference#36

Commit:	7fbfbb0
Author:	OlivierDehaene	2023-01-31 11:49:43 +0100
Committer:	GitHub	2023-01-31 11:49:43 +0100

feat: Add token streaming using ServerSideEvents support (#36)

Add token streaming using ServerSideEvents (SSE).

The signature of the SSE events is:

```rust
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

struct ErrorResponse {
    error: String,
}
```

2023-01-30

Commit:	cd298bc
Author:	OlivierDehaene	2023-01-30 15:36:16 +0100
Committer:	GitHub	2023-01-30 15:36:16 +0100

feat: Support sampling seeding (#37) Co-authored-by: Yannic Kilcher <yk@users.noreply.github.com>

2022-12-15

Commit:	32a2530
Author:	OlivierDehaene	2022-12-15 17:03:56 +0100
Committer:	GitHub	2022-12-15 17:03:56 +0100

feat: Return logprobs (#8)

2022-12-12

Commit:	718096f
Author:	OlivierDehaene	2022-12-12 18:25:22 +0100
Committer:	GitHub	2022-12-12 18:25:22 +0100

feat: Support stop sequences (#7)

2022-11-04

Commit:	427d7cc
Author:	OlivierDehaene	2022-11-04 18:03:04 +0100

feat(server): Support AutoModelForSeq2SeqLM

Commit:	c5665f5
Author:	OlivierDehaene	2022-11-04 14:22:47 +0100

feat(server): Support generic AutoModelForCausalLM

2022-10-20

Commit:	f16f2f5
Author:	Olivier Dehaene	2022-10-18 15:19:03 +0200
Committer:	OlivierDehaene	2022-10-20 19:14:44 +0200

v0.1.0

2022-10-11

Commit:	4c693e6
Author:	Olivier Dehaene	2022-10-11 16:50:54 +0200

Refactored gRPC interface
Added validation logic

2022-10-08

Commit:	295831a
Author:	Olivier Dehaene	2022-10-08 12:30:12 +0200

Init