llama.cpp supervisor #32
So the initial usage flow and rules might be:

1. The user starts a new Supervisor instance pointing at an existing llama.cpp instance.
2. The user controls the llama.cpp instance through the Supervisor REST API.
3. Changing the llama.cpp instance configuration makes the Supervisor restart the llama.cpp instance with the new configuration options applied. The Supervisor restarts the llama.cpp instance on its initial llama.cpp address.
4. The llama.cpp address can also be changed through the Supervisor REST API. If old running Agents break because of a new llama.cpp address, that is the user's responsibility.
5. While llama.cpp instances are restarting, the reverse proxy / load balancer must not drop incoming requests to them.
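Rule 3 above could be sketched roughly as follows. This is a minimal illustration, not the actual Paddler implementation; the class, method, and flag names are all assumptions, and a real supervisor would expose `update_config` behind its REST API.

```python
import subprocess


class Supervisor:
    """Owns one llama.cpp process; restarts it whenever its config changes."""

    def __init__(self, binary, addr, config):
        self.binary = binary
        self.addr = addr            # initial llama.cpp address (rules 1 and 3)
        self.config = dict(config)  # runtime-changeable options
        self.process = None

    def build_args(self):
        # Flag names here are illustrative, not the real llama.cpp CLI.
        args = [self.binary, "--host", self.addr]
        for key, value in sorted(self.config.items()):
            args += [f"--{key}", str(value)]
        return args

    def start(self):
        self.process = subprocess.Popen(self.build_args())

    def update_config(self, new_config):
        """What a config change through the REST API would trigger (rule 3):
        stop the old process, then start a new one with the merged options,
        keeping the initial llama.cpp address."""
        self.config.update(new_config)
        if self.process is not None:
            self.process.terminate()
            self.process.wait()
        self.start()
```

The point of keeping the address fixed in `update_config` is that only an explicit address change (rule 4) should move the instance; ordinary option changes restart it in place.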
Should the Supervisor be optional? If the basic Paddler ecosystem can work without the Supervisor, just with the balancer, llama.cpp, and some agent instances, should the Supervisor be an optional compile-time feature?
Supervisor aggregate address: its purpose is not clear to me. What is the point of
Paddler binaries downloading: any ideas, suggestions, or more details from the community on this behavior?
Nope, because it doesn't introduce additional build requirements. I made the web GUI optional because it pulled in Node as a dependency to build the front-end; I wanted Paddler to have a way to be built with just Rust. The supervisor, even though it is optional to use, does not require Node or anything like that.
[draft]
General idea:
There should be a tool to manage existing llama.cpp instances. Not all llama.cpp parameters can be changed at runtime, and an instance should also be brought back up when it exits for any reason.
Paddler could potentially download and manage the llama.cpp versions it supports.
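The "brought back up when it exits for any reason" behavior could be a plain keep-alive loop. A minimal sketch, under assumed names and a fixed backoff (the real supervisor would presumably use smarter backoff and logging):

```python
import subprocess
import time


def keep_alive(argv, should_run, backoff_seconds=1.0):
    """Respawn the llama.cpp process whenever it exits, for any reason.

    `argv` is the command line to run; `should_run` is a callable the
    supervisor flips to False when the user asks it to stop.
    """
    while should_run():
        process = subprocess.Popen(argv)
        process.wait()  # blocks until the child exits or crashes
        if should_run():
            time.sleep(backoff_seconds)  # avoid a tight crash loop
```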
For example, it should be possible to restart a specific llama.cpp instance through the `supervisor-controller-addr` API. Paddler should keep requests on hold while the supervisor restarts llama.cpp instances.
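Keeping requests on hold during a restart amounts to gating the proxy on a readiness flag. Here is a rough sketch of that idea (all names are hypothetical; `forward` stands in for the real upstream call, and a production proxy would do this per backend, asynchronously):

```python
import threading


class HoldingProxy:
    """The reverse proxy must not drop requests while llama.cpp restarts;
    instead it parks them until the backend is ready again."""

    def __init__(self, forward):
        self.forward = forward
        self.ready = threading.Event()
        self.ready.set()  # backend starts out available

    def begin_restart(self):
        self.ready.clear()  # new requests now block instead of failing

    def finish_restart(self):
        self.ready.set()    # release every parked request

    def handle(self, request, timeout=30.0):
        if not self.ready.wait(timeout):
            raise TimeoutError("backend did not come back in time")
        return self.forward(request)
```

The timeout keeps a restart that never completes from holding connections forever; requests parked past it fail explicitly rather than hanging.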