llama.cpp supervisor #32
So the initial usage flow and rules might be:

1. The user starts a new Supervisor instance pointing at an existing llama.cpp instance.
2. The user controls the llama.cpp instance through the Supervisor REST API.
3. Changing the llama.cpp instance configuration makes the Supervisor restart the llama.cpp instance with the new configuration options applied. The Supervisor restarts the llama.cpp instance on its initial llama.cpp address.
4. The llama.cpp address can also be changed through the Supervisor REST API. If old running Agents break because of a new llama.cpp address, that is the user's responsibility.
5. While llama.cpp instances are restarting, the reverse proxy / load balancer must not drop incoming requests to them.
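Rule 3 above could be sketched roughly as follows. This is a minimal illustration, not the actual Paddler implementation; the class, method, and flag names are all assumptions, and a real supervisor would expose `update_config` behind its REST API.

```python
import subprocess


class Supervisor:
    """Owns one llama.cpp process; restarts it whenever its config changes."""

    def __init__(self, binary, addr, config):
        self.binary = binary
        self.addr = addr            # initial llama.cpp address (rules 1 and 3)
        self.config = dict(config)  # runtime-changeable options
        self.process = None

    def build_args(self):
        # Flag names here are illustrative, not the real llama.cpp CLI.
        args = [self.binary, "--host", self.addr]
        for key, value in sorted(self.config.items()):
            args += [f"--{key}", str(value)]
        return args

    def start(self):
        self.process = subprocess.Popen(self.build_args())

    def update_config(self, new_config):
        """What a config change through the REST API would trigger (rule 3):
        stop the old process, then start a new one with the merged options,
        keeping the initial llama.cpp address."""
        self.config.update(new_config)
        if self.process is not None:
            self.process.terminate()
            self.process.wait()
        self.start()
```

The point of keeping the address fixed in `update_config` is that only an explicit address change (rule 4) should move the instance; ordinary option changes restart it in place.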
Should the Supervisor be optional? If the basic Paddler ecosystem can work without the Supervisor, just with the balancer, llama.cpp, and some agent instances, should the Supervisor be an optional compile-time feature?
Supervisor aggregate address: its purpose is not clear to me. What is the point of
Paddler binaries downloading: any ideas, suggestions, or more details from the community on this behavior?
Nope, because it doesn't introduce additional build requirements. I made the web GUI optional because it pulled in Node as a dependency to build the front-end; I wanted Paddler to have a way to be built with just Rust. The supervisor, even though it is optional to use, does not require Node or anything like that.
[draft]
General idea:
There should be a tool to manage existing llama.cpp instances. Not all llama.cpp parameters can be changed at runtime, and an instance should also be brought back up when it exits for any reason.
Paddler could potentially download and manage the llama.cpp versions it supports.
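The "brought back up when it exits for any reason" behavior could be a plain keep-alive loop. A minimal sketch, under assumed names and a fixed backoff (the real supervisor would presumably use smarter backoff and logging):

```python
import subprocess
import time


def keep_alive(argv, should_run, backoff_seconds=1.0):
    """Respawn the llama.cpp process whenever it exits, for any reason.

    `argv` is the command line to run; `should_run` is a callable the
    supervisor flips to False when the user asks it to stop.
    """
    while should_run():
        process = subprocess.Popen(argv)
        process.wait()  # blocks until the child exits or crashes
        if should_run():
            time.sleep(backoff_seconds)  # avoid a tight crash loop
```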
For example, it should be possible to restart a specific llama.cpp instance through the `supervisor-controller-addr` API. Paddler should keep requests on hold while the supervisor restarts llama.cpp instances.
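Keeping requests on hold during a restart amounts to gating the proxy on a readiness flag. Here is a rough sketch of that idea (all names are hypothetical; `forward` stands in for the real upstream call, and a production proxy would do this per backend, asynchronously):

```python
import threading


class HoldingProxy:
    """The reverse proxy must not drop requests while llama.cpp restarts;
    instead it parks them until the backend is ready again."""

    def __init__(self, forward):
        self.forward = forward
        self.ready = threading.Event()
        self.ready.set()  # backend starts out available

    def begin_restart(self):
        self.ready.clear()  # new requests now block instead of failing

    def finish_restart(self):
        self.ready.set()    # release every parked request

    def handle(self, request, timeout=30.0):
        if not self.ready.wait(timeout):
            raise TimeoutError("backend did not come back in time")
        return self.forward(request)
```

The timeout keeps a restart that never completes from holding connections forever; requests parked past it fail explicitly rather than hanging.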