llama.cpp supervisor #32

Open
mcharytoniuk opened this issue Dec 20, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@mcharytoniuk
Member

mcharytoniuk commented Dec 20, 2024

[draft]

General idea:

There should be a tool to manage existing llama.cpp instances. Not all llama.cpp parameters can be changed at runtime, so changing them means restarting the instance. The tool should also bring the instance back up when it exits for any reason.
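The "bring it back up when it exits" part can be sketched as a simple restart loop. This is only an illustration using the Rust standard library, not Paddler's implementation: the names `supervise` and `backoff_delay`, the `max_restarts` limit, and the backoff policy are all assumptions made for the example (a real supervisor would likely loop until told to stop).

```rust
use std::io;
use std::process::{Child, Command, ExitStatus};
use std::thread;
use std::time::Duration;

/// Delay before the next restart attempt: 1 s, 2 s, 4 s, ... capped at 30 s.
/// (Hypothetical policy, chosen only for this sketch.)
pub fn backoff_delay(attempt: u32) -> Duration {
    let capped = attempt.min(5); // avoid shifting by a huge amount
    Duration::from_secs((1u64 << capped).min(30))
}

/// Starts `program` and restarts it whenever it exits, for any reason.
/// Gives up after `max_restarts` restarts and returns the last exit status.
pub fn supervise(program: &str, args: &[&str], max_restarts: u32) -> io::Result<ExitStatus> {
    let mut restarts = 0;
    loop {
        // Spawning fails with an error if the binary cannot be started at all.
        let mut child: Child = Command::new(program).args(args).spawn()?;
        // Block until the child exits (cleanly, by crash, or by signal).
        let status = child.wait()?;
        if restarts >= max_restarts {
            return Ok(status);
        }
        eprintln!("{} exited with {}; restarting", program, status);
        thread::sleep(backoff_delay(restarts));
        restarts += 1;
    }
}
```

For instance, `supervise("./llama-server", &[], 3)` would start the binary from the current directory and restart it up to three times.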

Paddler can potentially download and manage the llama.cpp version that it supports.

For example:

# --supervisor-aggregate-addr: the supervisor reports llama.cpp status to that server
# --supervisor-controller-addr: exposes an API to manage a specific llama.cpp instance
paddler supervisor \
    --llama-server-path ./llama-server \
    --supervisor-aggregate-addr 127.0.0.1:8085 \
    --supervisor-controller-addr 127.0.0.1:8089

# downloads the latest supported llama.cpp version
paddler download

Then, it should be possible to restart that specific llama.cpp instance through the supervisor-controller-addr API.

Paddler should keep requests on hold while the supervisor restarts llama.cpp instances.
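One way to picture "keeping requests on hold" is a gate that every request passes through before it is forwarded. The sketch below is an assumption about how this could look, built on `Mutex` and `Condvar` from the standard library; the type `RestartGate` and its methods are hypothetical names, and the timeout is there so a request is not held forever if the instance never comes back.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Gate that request handlers pass through before forwarding to llama.cpp.
pub struct RestartGate {
    restarting: Mutex<bool>,
    ready: Condvar,
}

impl RestartGate {
    pub fn new() -> Self {
        RestartGate {
            restarting: Mutex::new(false),
            ready: Condvar::new(),
        }
    }

    /// Called by the supervisor right before it stops the instance.
    pub fn begin_restart(&self) {
        *self.restarting.lock().unwrap() = true;
    }

    /// Called once the new instance is up; releases all held requests.
    pub fn finish_restart(&self) {
        *self.restarting.lock().unwrap() = false;
        self.ready.notify_all();
    }

    /// Blocks while a restart is in progress.
    /// Returns `true` if the request may proceed, `false` if `timeout` elapsed first.
    pub fn wait_until_ready(&self, timeout: Duration) -> bool {
        let guard = self.restarting.lock().unwrap();
        let (guard, _timed_out) = self
            .ready
            .wait_timeout_while(guard, timeout, |restarting| *restarting)
            .unwrap();
        !*guard
    }
}
```

A handler would call `wait_until_ready` first and only return an error to the client when it yields `false`, so a short restart is invisible to callers apart from added latency.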

@mcharytoniuk mcharytoniuk added the enhancement New feature or request label Dec 20, 2024
@Propfend
Collaborator

So the initial usage flow and rules might be:

1 - The user runs a command to start a new Supervisor instance pointing to an existing llamacpp instance.

2 - The user controls the llamacpp instance through the Supervisor REST API.

3 - Changing the llamacpp instance configuration makes the Supervisor restart the instance with the new configuration options applied. The Supervisor restarts the instance on its initial llamacpp address.

4 - The llamacpp address can also be changed through the Supervisor REST API. If already-running Agents break because of the new llamacpp address, that is the user's responsibility.

5 - While llamacpp instances are restarting, the reverse proxy load balancer must not drop incoming requests to them.
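Rules 3 and 4 above can be sketched as a small pure function that merges a change request into the current configuration. Everything here is hypothetical: the types `InstanceConfig` and `ConfigChange`, the function `apply_change`, and the representation of options as a flag-to-value map are assumptions for illustration, not an existing Paddler API.

```rust
use std::collections::BTreeMap;

/// Hypothetical description of one llamacpp instance as the Supervisor sees it.
#[derive(Clone, Debug, PartialEq)]
pub struct InstanceConfig {
    /// Address the llamacpp server listens on.
    pub addr: String,
    /// Remaining server options, stored as flag -> value.
    pub options: BTreeMap<String, String>,
}

/// A change request as it might arrive through the Supervisor REST API.
#[derive(Default)]
pub struct ConfigChange {
    /// `None` keeps the initial address (rule 3); `Some` replaces it (rule 4).
    pub addr: Option<String>,
    /// Options to add or overwrite; options not listed stay as they are.
    pub options: BTreeMap<String, String>,
}

/// Returns the configuration to restart with, and whether a restart is needed.
pub fn apply_change(current: &InstanceConfig, change: ConfigChange) -> (InstanceConfig, bool) {
    let mut next = current.clone();
    if let Some(addr) = change.addr {
        next.addr = addr;
    }
    next.options.extend(change.options);
    // An empty or no-op change should not bounce the instance.
    let needs_restart = next != *current;
    (next, needs_restart)
}
```

Keeping this step free of side effects means the Supervisor can decide whether to restart at all before it touches the running process.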

@Propfend
Collaborator

Propfend commented Jan 7, 2025

Should Supervisor be optional?

If the basic Paddler ecosystem can work without the supervisor, with just the balancer, llamacpp and an agent instance, should the supervisor be an optional compile-time feature?

@Propfend
Collaborator

Propfend commented Jan 7, 2025

Supervisor aggregate address

Its purpose is not clear to me. What is the point of the supervisor-aggregate-addr arg? Would 8085 be the port of the load balancer's management server? Why would the supervisor report llamacpp status to the load balancer's management server if agents already do so? Do you mean its status as an OS process?

@Propfend
Collaborator

Propfend commented Jan 7, 2025

Paddler binaries downloading

Any ideas, suggestions, or further details from the community on the following behavior?

Paddler can potentially download and manage the llama.cpp version that it supports.

paddler download # downloads the latest supported llama.cpp version

@Propfend Propfend mentioned this issue Jan 7, 2025
@mcharytoniuk
Member Author

Should Supervisor be optional?

If the basic Paddler ecosystem can work without the supervisor, with just the balancer, llamacpp and an agent instance, should the supervisor be an optional compile-time feature?

Nope, because it doesn't introduce additional build requirements. I made the web GUI optional because it introduced Node as a dependency for building the front-end, and I wanted Paddler to be buildable with just Rust. Even though the supervisor is optional, it does not require Node or anything like that.


2 participants