The following five commits changed the Protocol Buffers files:
| Commit | 828468c |
|---|---|
| Author | dm4 |
| Committer | hydai |

[WASI-NN] ggml: support compute single in RPC mode

Signed-off-by: dm4 <dm4@secondstate.io>
The documentation is generated from this commit.
| Commit | 6f7ffa1 |
|---|---|
| Author | dm4 |
| Committer | hydai |

[WASI-NN] rpc: implement `load_by_name_with_config`

Signed-off-by: dm4 <dm4@secondstate.io>
| Commit | 9c60444 |
|---|---|
| Author | dm4 |
| Committer | dm4 |

[WASI-NN] rpc: implement `load_by_name_with_config`

Signed-off-by: dm4 <dm4@secondstate.io>
| Commit | f38b8db |
|---|---|
| Author | Akihiro Suda |
| Committer | hydai |

[WASI-NN] Support RPC mode

RPC mode allows using another Wasi-NN instance that is running on a remote WasmEdge instance, via `ssh -R remote.sock:local.sock`. An example use case is to allow a Linux VM (e.g., Lima) guest to use the host GPU. The gRPC proto can be repurposed for non-WASM applications as well.

Build
=====

Set `WASMEDGE_BUILD_WASI_NN_RPC` to `ON`. It is enabled by default when gRPC (libgrpc++-dev) is installed.

Usage
=====

Host 1 (rpc server / ssh client, e.g., Lima host with physical GPU)
-----

```
wasi_nn_rpcserver \
  --nn-rpc-uri unix:///$HOME/nn_server.sock \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf
ssh \
  -R /tmp/nn_client.sock:$HOME/nn_server.sock \
  host2
```

Host 2 (rpc client / ssh server, e.g., Lima guest)
-----

```
wasmedge \
  --nn-rpc-uri unix:///tmp/nn_client.sock \
  wasmedge-ggml-llama-interactive.wasm \
  default "1 + 1 = ?"
```

See <https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml-llama-interactive> for how to obtain `llama-2-7b-chat.Q5_K_M.gguf` and `wasmedge-ggml-llama-interactive.wasm`.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
| Commit | e78fedf |
|---|---|
| Author | Akihiro Suda |
| Committer | hydai |

[WASI-NN] Support RPC mode

RPC mode allows using another Wasi-NN instance that is running on a remote WasmEdge instance, via `ssh -R remote.sock:local.sock`. An example use case is to allow a Linux VM (e.g., Lima) guest to use the host GPU. The gRPC proto can be repurposed for non-WASM applications as well.

Build
=====

Set `WASMEDGE_BUILD_WASI_NN_RPC` to `ON`. It is enabled by default when gRPC (libgrpc++-dev) is installed.

Usage
=====

Host 1 (rpc server / ssh client, e.g., Lima host with physical GPU)
-----

```
wasi_nn_rpcserver \
  --nn-rpc-uri unix:///$HOME/nn_server.sock \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf
ssh \
  -R /tmp/nn_client.sock:$HOME/nn_server.sock \
  host2
```

Host 2 (rpc client / ssh server, e.g., Lima guest)
-----

```
wasmedge \
  --nn-rpc-uri unix:///tmp/nn_client.sock \
  wasmedge-ggml-llama-interactive.wasm \
  default "1 + 1 = ?"
```

See <https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml-llama-interactive> for how to obtain `llama-2-7b-chat.Q5_K_M.gguf` and `wasmedge-ggml-llama-interactive.wasm`.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>