Proto commits in mudler/LocalAI

These 58 commits are when the Protocol Buffers files have changed:

2025-04-26

Commit:	2c9279a
Author:	Ettore Di Giacinto	2025-04-26 18:05:01 +0200
Committer:	GitHub	2025-04-26 18:05:01 +0200

feat(video-gen): add endpoint for video generation (#5247) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

The documentation is generated from this commit.

2025-04-19

Commit:	61cc76c
Author:	Ettore Di Giacinto	2025-04-19 15:52:29 +0200
Committer:	GitHub	2025-04-19 15:52:29 +0200

chore(autogptq): drop archived backend (#5214) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Commit:	a7be2d2
Author:	Ettore Di Giacinto	2025-04-19 09:44:34 +0200
Committer:	Ettore Di Giacinto	2025-04-19 09:45:47 +0200

chore(autogptq): drop archived backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2025-03-05

Commit:	67f7bff
Author:	Ettore Di Giacinto	2025-03-06 00:40:58 +0100
Committer:	GitHub	2025-03-06 00:40:58 +0100

chore(deps): update llama.cpp and sync with upstream changes (#4950) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2025-02-18

Commit:	6a6e1a0
Author:	Brandon Beiler	2025-02-18 13:27:58 -0500
Committer:	GitHub	2025-02-18 19:27:58 +0100

feat(vllm): Additional vLLM config options (Disable logging, dtype, and Per-Prompt media limits) (#4855) * Adding the following vLLM config options: disable_log_status, dtype, limit_mm_per_prompt Signed-off-by: TheDropZone <brandonbeiler@gmail.com> * using " marks in the config.yaml file Signed-off-by: TheDropZone <brandonbeiler@gmail.com> * adding in missing colon Signed-off-by: TheDropZone <brandonbeiler@gmail.com> --------- Signed-off-by: TheDropZone <brandonbeiler@gmail.com>

2025-02-02

Commit:	1d6afbd
Author:	Ettore Di Giacinto	2025-02-02 13:25:03 +0100
Committer:	GitHub	2025-02-02 13:25:03 +0100

feat(llama.cpp): Add support to grammar triggers (#4733) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2025-01-21

Commit:	a1d5462
Author:	Ettore Di Giacinto	2025-01-21 10:35:01 +0100

Stores to chromem (WIP) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2025-01-17

Commit:	96f8ec0
Author:	mintyleaf	2025-01-17 20:05:58 +0400
Committer:	GitHub	2025-01-17 17:05:58 +0100

feat: add machine tag and inference timings (#4577) * Add machine tag option, add extraUsage option, grpc-server -> proto -> endpoint extraUsage data is broken for now Signed-off-by: mintyleaf <mintyleafdev@gmail.com> * remove redurant timing fields, fix not working timings output Signed-off-by: mintyleaf <mintyleafdev@gmail.com> * use middleware for Machine-Tag only if tag is specified Signed-off-by: mintyleaf <mintyleafdev@gmail.com> --------- Signed-off-by: mintyleaf <mintyleafdev@gmail.com>

2025-01-14

Commit:	9b6826d
Author:	Ettore Di Giacinto	2024-10-18 18:19:42 +0200
Committer:	Ettore Di Giacinto	2025-01-14 17:13:58 +0100

aujdio Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Commit:	ebfe8dd
Author:	Ettore Di Giacinto	2024-11-18 19:12:27 +0100
Committer:	Ettore Di Giacinto	2025-01-14 17:13:58 +0100

gRPC client stubs Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2024-12-06

Commit:	d4c1746
Author:	Ettore Di Giacinto	2024-12-06 10:23:59 +0100
Committer:	GitHub	2024-12-06 10:23:59 +0100

feat(llama.cpp): expose cache_type_k and cache_type_v for quant of kv cache (#4329) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2024-12-03

Commit:	44a5dac
Author:	Ettore Di Giacinto	2024-12-03 22:41:22 +0100
Committer:	GitHub	2024-12-03 22:41:22 +0100

feat(backend): add stablediffusion-ggml (#4289) * feat(backend): add stablediffusion-ggml Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(ci): track stablediffusion-ggml Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Use default scheduler and sampler if not specified Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Move cfg scale out of diffusers block Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Make it working Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix: set free_params_immediately to false to call the model in sequence https://github.com/leejet/stable-diffusion.cpp/issues/366 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2024-11-20

Commit:	b1ea931
Author:	Ettore Di Giacinto	2024-11-20 14:48:40 +0100
Committer:	GitHub	2024-11-20 14:48:40 +0100

feat(silero): add Silero-vad backend (#4204) * feat(vad): add silero-vad backend (WIP) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(vad): add API endpoint Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(vad): correctly place the onnxruntime libs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(vad): hook silero-vad to binary and container builds Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gRPC): register VAD Server Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(Makefile): consume ONNX_OS consistently Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(Makefile): handle macOS Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

2024-11-05

Commit:	947224b
Author:	Ettore Di Giacinto	2024-11-05 15:14:33 +0100
Committer:	GitHub	2024-11-05 15:14:33 +0100

feat(diffusers): allow multiple lora adapters (#4081) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2024-10-31

Commit:	61c964d
Author:	Ettore Di Giacinto	2024-10-31 12:12:22 +0100
Committer:	GitHub	2024-10-31 12:12:22 +0100

fix(grpc): pass by modelpath (#4023) Instead of trying to derive it from the model file. In backends that specify HF url this results in a fragile logic. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2024-10-23

Commit:	835932e
Author:	Ettore Di Giacinto	2024-10-23 15:46:06 +0200

feat: update proto file Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2024-10-01

Commit:	f84b55d
Author:	siddimore	2024-10-01 05:41:20 -0700
Committer:	GitHub	2024-10-01 14:41:20 +0200

feat: Add Get Token Metrics to GRPC server (#3687) * Add Get Token Metrics to GRPC server Signed-off-by: Siddharth More <siddimore@gmail.com> * Expose LocalAI endpoint Signed-off-by: Siddharth More <siddimore@gmail.com> --------- Signed-off-by: Siddharth More <siddimore@gmail.com>

2024-09-28

Commit:	50a3b54
Author:	siddimore	2024-09-28 08:23:56 -0700
Committer:	GitHub	2024-09-28 17:23:56 +0200

feat(api): add correlationID to Track Chat requests (#3668) * Add CorrelationID to chat request Signed-off-by: Siddharth More <siddimore@gmail.com> * remove get_token_metrics Signed-off-by: Siddharth More <siddimore@gmail.com> * Add CorrelationID to proto Signed-off-by: Siddharth More <siddimore@gmail.com> * fix correlation method name Signed-off-by: Siddharth More <siddimore@gmail.com> * Update core/http/endpoints/openai/chat.go Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Signed-off-by: Siddharth More <siddimore@gmail.com> * Update core/http/endpoints/openai/chat.go Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Signed-off-by: Siddharth More <siddimore@gmail.com> --------- Signed-off-by: Siddharth More <siddimore@gmail.com> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

2024-09-19

Commit:	191bc2e
Author:	Ettore Di Giacinto	2024-09-19 12:26:53 +0200
Committer:	GitHub	2024-09-19 12:26:53 +0200

feat(api): allow to pass audios to backends (#3603) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Commit:	fbb9fac
Author:	Ettore Di Giacinto	2024-09-19 11:21:59 +0200
Committer:	GitHub	2024-09-19 11:21:59 +0200

feat(api): allow to pass videos to backends (#3601) This prepares the API to receive videos as well for video understanding. It works similarly to images, where the request should be in the form: { "type": "video_url", "video_url": { "url": "url or base64 data" } } Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2024-08-24

Commit:	81ae92f
Author:	Dave	2024-08-23 20:20:28 -0400
Committer:	GitHub	2024-08-24 00:20:28 +0000

feat: elevenlabs `sound-generation` api (#3355) * initial version of elevenlabs compatible soundgeneration api and cli command Signed-off-by: Dave Lee <dave@gray101.com> * minor cleanup Signed-off-by: Dave Lee <dave@gray101.com> * restore TTS, add test Signed-off-by: Dave Lee <dave@gray101.com> * remove stray s Signed-off-by: Dave Lee <dave@gray101.com> * fix Signed-off-by: Dave Lee <dave@gray101.com> --------- Signed-off-by: Dave Lee <dave@gray101.com> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

2024-06-24

Commit:	03b1cf5
Author:	Ettore Di Giacinto	2024-06-24 19:21:22 +0200
Committer:	GitHub	2024-06-24 19:21:22 +0200

feat(whisper): add translate option (#2649) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2024-06-01

Commit:	b99182c
Author:	Chakib Benziane	2024-06-01 20:26:27 +0200
Committer:	GitHub	2024-06-01 18:26:27 +0000

TTS API improvements (#2308) * update doc on COQUI_LANGUAGE env variable Signed-off-by: blob42 <contact@blob42.xyz> * return errors from tts gRPC backend Signed-off-by: blob42 <contact@blob42.xyz> * handle speaker_id and language in coqui TTS backend Signed-off-by: blob42 <contact@blob42.xyz> * TTS endpoint: add optional language paramter Signed-off-by: blob42 <contact@blob42.xyz> * tts fix: empty language string breaks non-multilingual models Signed-off-by: blob42 <contact@blob42.xyz> * allow tts param definition in config file - consolidate TTS options under `tts` config entry Signed-off-by: blob42 <contact@blob42.xyz> * tts: update doc Signed-off-by: blob42 <contact@blob42.xyz> --------- Signed-off-by: blob42 <contact@blob42.xyz> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

2024-05-13

Commit:	e49ea01
Author:	Ettore Di Giacinto	2024-05-13 19:07:51 +0200
Committer:	GitHub	2024-05-13 19:07:51 +0200

feat(llama.cpp): add `flash_attention` and `no_kv_offloading` (#2310) feat(llama.cpp): add flash_attn and no_kv_offload Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2024-04-24

Commit:	b664edd
Author:	Ettore Di Giacinto	2024-04-25 00:19:02 +0200
Committer:	GitHub	2024-04-25 00:19:02 +0200

feat(rerankers): Add new backend, support jina rerankers API (#2121) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2024-04-20

Commit:	03adc1f
Author:	Taikono-Himazin	2024-04-20 23:37:02 +0900
Committer:	GitHub	2024-04-20 14:37:02 +0000

Add tensor_parallel_size setting to vllm setting items (#2085) Signed-off-by: Taikono-Himazin <kazu@po.harenet.ne.jp>

2024-04-15

Commit:	e843d7d
Author:	Ettore Di Giacinto	2024-04-15 19:47:11 +0200
Committer:	GitHub	2024-04-15 19:47:11 +0200

feat(grpc): return consumed token count and update response accordingly (#2035) Fixes: #1920

2024-04-11

Commit:	12c0d94
Author:	Ludovic Leroux	2024-04-11 13:20:22 -0400
Committer:	GitHub	2024-04-11 19:20:22 +0200

feat: use tokenizer.apply_chat_template() in vLLM (#1990) Use tokenizer.apply_chat_template() in vLLM Signed-off-by: Ludovic LEROUX <ludovic@inpher.io>

2024-03-22

Commit:	643d85d
Author:	Richard Palethorpe	2024-03-22 20:14:04 +0000
Committer:	GitHub	2024-03-22 21:14:04 +0100

feat(stores): Vector store backend (#1795) Add simple vector store backend Signed-off-by: Richard Palethorpe <io@richiejp.com>

2024-03-14

Commit:	20136ca
Author:	Ettore Di Giacinto	2024-03-14 23:08:34 +0100
Committer:	GitHub	2024-03-14 23:08:34 +0100

feat(tts): add Elevenlabs and OpenAI TTS compatibility layer (#1834) * feat(elevenlabs): map elevenlabs API support to TTS This allows elevenlabs Clients to work automatically with LocalAI by supporting the elevenlabs API. The elevenlabs server endpoint is implemented such as it is wired to the TTS endpoints. Fixes: https://github.com/mudler/LocalAI/issues/1809 * feat(openai/tts): compat layer with openai tts Fixes: #1276 * fix: adapt tts CLI

Commit:	d2934dd
Author:	Ettore Di Giacinto	2024-03-14 18:06:23 +0100
Committer:	Ettore Di Giacinto	2024-03-14 18:12:47 +0100

feat(elevenlabs): map elevenlabs API support to TTS This allows elevenlabs Clients to work automatically with LocalAI by supporting the elevenlabs API. The elevenlabs server endpoint is implemented such as it is wired to the TTS endpoints. Fixes: https://github.com/mudler/LocalAI/issues/1809

2024-03-01

Commit:	9394113
Author:	Ludovic Leroux	2024-03-01 16:48:53 -0500
Committer:	GitHub	2024-03-01 22:48:53 +0100

Bump vLLM version + more options when loading models in vLLM (#1782) * Bump vLLM version to 0.3.2 * Add vLLM model loading options * Remove transformers-exllama * Fix install exllama

2024-01-25

Commit:	cb75127
Author:	Ettore Di Giacinto	2024-01-26 00:13:21 +0100
Committer:	GitHub	2024-01-26 00:13:21 +0100

transformers: correctly load automodels (#1643) * backends(transformers): use AutoModel with LLM types * examples: animagine-xl * Add codellama examples

2024-01-19

Commit:	9e653d6
Author:	Ettore Di Giacinto	2024-01-19 23:42:50 +0100
Committer:	GitHub	2024-01-19 23:42:50 +0100

feat: 🐍 add mamba support (#1589) feat(mamba): Initial import This is a first iteration of the mamba backend, loosely based on mamba-chat(https://github.com/havenhq/mamba-chat).

2023-12-13

Commit:	7641f92
Author:	Ettore Di Giacinto	2023-12-13 13:20:22 -0500
Committer:	GitHub	2023-12-13 19:20:22 +0100

feat(diffusers): update, add autopipeline, controlnet (#1432) * feat(diffusers): update, add autopipeline, controlenet * tests with AutoPipeline * simplify logic

2023-11-13

Commit:	ad0e30b
Author:	Ettore Di Giacinto	2023-11-13 22:40:16 +0100
Committer:	GitHub	2023-11-13 22:40:16 +0100

refactor: move backends into the backends directory (#1279) * refactor: move backends into the backends directory Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor: move main close to implementation for every backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-11-11

Commit:	803a0ac
Author:	Ettore Di Giacinto	2023-11-11 18:40:48 +0100
Committer:	GitHub	2023-11-11 18:40:48 +0100

feat(llama.cpp): support lora with scale and yarn (#1277) * feat(llama.cpp): support lora with scale Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(llama.cpp): support yarn Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Commit:	0eae727
Author:	Ettore Di Giacinto	2023-11-11 13:14:59 +0100
Committer:	GitHub	2023-11-11 13:14:59 +0100

:fire: add LaVA support and GPT vision API, Multiple requests for llama.cpp, return JSON types (#1254) * wip * wip * Make it functional Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * wip * Small fixups * do not inject space on role encoding, encode img at beginning of messages Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add examples/config defaults * Add include dir of current source dir * cleanup * fixes Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups * Revert "fixups" This reverts commit f1a4731ccadf7226c6589d6d39131376f0811625. * fixes Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-09-22

Commit:	a28ab18
Author:	Ettore Di Giacinto	2023-09-22 15:52:38 +0200
Committer:	GitHub	2023-09-22 15:52:38 +0200

feat(vllm): Allow to set quantization (#1094) This particularly useful to set AWQ **Description** Follow up of #1015 **Notes for Reviewers** **[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)** - [ ] Yes, I signed my commits.  --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-09-14

Commit:	8ccf5b2
Author:	Ettore Di Giacinto	2023-09-14 17:44:16 +0200
Committer:	GitHub	2023-09-14 17:44:16 +0200

feat(speculative-sampling): allow to specify a draft model in the model config (#1052) **Description** This PR fixes #1013. It adds `draft_model` and `n_draft` to the model YAML config in order to load models with speculative sampling. This should be compatible as well with grammars. example: ```yaml backend: llama context_size: 1024 name: my-model-name parameters: model: foo-bar n_draft: 16 draft_model: model-name ``` --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-09-04

Commit:	dc307a1
Author:	Ettore Di Giacinto	2023-09-04 19:25:23 +0200
Committer:	GitHub	2023-09-04 19:25:23 +0200

feat: add vall-e-x (#1007) **Description** This PR fixes #985 **Notes for Reviewers** **[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)** - [ ] Yes, I signed my commits.  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-08-25

Commit:	44bc7aa
Author:	Ettore Di Giacinto	2023-08-25 21:58:46 +0200
Committer:	GitHub	2023-08-25 21:58:46 +0200

feat: Allow to load lora adapters for llama.cpp (#955) **Description** This PR fixes # **Notes for Reviewers** **[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)** - [ ] Yes, I signed my commits.  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-08-22

Commit:	901f070
Author:	Dave	2023-08-22 12:48:06 -0400
Committer:	GitHub	2023-08-22 18:48:06 +0200

Feat: rwkv improvements: (#937)

2023-08-18

Commit:	8cb1061
Author:	Dave	2023-08-18 15:23:14 -0400
Committer:	GitHub	2023-08-18 21:23:14 +0200

Usage Features (#863)

2023-08-17

Commit:	2bacd01
Author:	Ettore Di Giacinto	2023-08-17 23:38:59 +0200
Committer:	GitHub	2023-08-17 23:38:59 +0200

feat(diffusers): add img2img and clip_skip, support more kernels schedulers (#906) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-08-15

Commit:	37700f2
Author:	Ettore Di Giacinto	2023-08-16 01:11:42 +0200
Committer:	GitHub	2023-08-16 01:11:42 +0200

feat(diffusers): add DPMSolverMultistepScheduler++, DPMSolverMultistepSchedulerSDE++, guidance_scale (#903) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-08-14

Commit:	a96c3bc
Author:	Ettore Di Giacinto	2023-08-14 23:12:00 +0200
Committer:	GitHub	2023-08-14 23:12:00 +0200

feat(diffusers): various enhancements (#895)

2023-08-09

Commit:	8c781a6
Author:	Ettore Di Giacinto	2023-08-09 08:38:51 +0200
Committer:	GitHub	2023-08-09 08:38:51 +0200

feat: Add Diffusers (#874) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-08-07

Commit:	3c8fc37
Author:	Ettore Di Giacinto	2023-08-08 01:10:05 +0200

feat: Add UseFastTokenizer Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Commit:	a843e64
Author:	Ettore Di Giacinto	2023-08-07 22:39:10 +0200
Committer:	Ettore Di Giacinto	2023-08-07 22:53:28 +0200

feat: add initial AutoGPTQ backend implementation

2023-08-02

Commit:	5ca21ee
Author:	Ettore Di Giacinto	2023-08-03 00:51:08 +0200
Committer:	GitHub	2023-08-03 00:51:08 +0200

feat: add ngqa and RMSNormEps parameters (#860) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-07-27

Commit:	096d98c
Author:	Ettore Di Giacinto	2023-07-27 21:56:05 +0200
Committer:	GitHub	2023-07-27 21:56:05 +0200

fix: add rope settings during model load, fix CUDA (#821) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Commit:	b96e30e
Author:	Ettore Di Giacinto	2023-07-27 18:41:04 +0200
Committer:	GitHub	2023-07-27 18:41:04 +0200

fix: use bytes in gRPC proto instead of strings (#813) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-07-25

Commit:	569c1d1
Author:	Ettore Di Giacinto	2023-07-25 19:05:27 +0200
Committer:	GitHub	2023-07-25 19:05:27 +0200

feat: add rope settings and negative prompt, drop grammar backend (#797) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-07-14

Commit:	ae533ca
Author:	Ettore Di Giacinto	2023-07-15 01:19:43 +0200

feat: move gpt4all to a grpc service Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Commit:	b816009
Author:	Ettore Di Giacinto	2023-07-15 01:19:43 +0200

feat: add falcon ggllm via grpc client Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Commit:	58f6aab
Author:	Ettore Di Giacinto	2023-07-15 01:19:43 +0200

feat: move llama to a grpc Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Commit:	1d0ed95
Author:	Ettore Di Giacinto	2023-07-15 01:19:43 +0200

feat: move other backends to grpc This finally makes everything more consistent Signed-off-by: Ettore Di Giacinto <mudler@localai.io>