Issues: OpenCSGs/llm-inference
#137 · API server blocked while a request is in progress [bug]
Opened May 9, 2024 by SeanHH86
#123 · GGUF implementation makes a duplicate copy because it cannot detect the config.json file in the cache folder
Opened Apr 24, 2024 by depenglee1707
#120 · vLLM implementation cannot download models from repositories other than Hugging Face
Opened Apr 23, 2024 by depenglee1707
#103 · Add an inference SDK for invocation [enhancement]
Opened Apr 17, 2024 by SeanHH86
#99 · Requested tokens (817) exceed context window of 512 [bug]
Opened Apr 16, 2024 by SeanHH86
#68 · Support loading Qwen1.5-72B-Chat-GPTQ-Int4 via auto_gptq [enhancement]
Opened Apr 3, 2024 by SeanHH86
#48 · No default value for "timeout" when "batch_wait_timeout_s: 0" is missing from the YAML config
Opened Mar 25, 2024 by depenglee1707
#37 · Inference Gradio web UI responds with random words for the DeepSeek Instruct model
Opened Mar 20, 2024 by KinglyWayne