liyanboSustech

Highlights

  • Pro

Popular repositories

  1. llama.cpp Public

    Forked from ggerganov/llama.cpp

    LLM inference in C/C++

    C++

  2. tensorrtx Public

    Forked from wang-xinyu/tensorrtx

    Implementation of popular deep learning networks with TensorRT network definition API

    C++

  3. InfiniGen Public

    Forked from snu-comparch/InfiniGen

    InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)

    Python

  4. H2O Public

    Forked from FMInference/H2O

    [NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

    Python

  5. prompt-cache Public

    Forked from yale-sys/prompt-cache

    Modular and structured prompt caching for low-latency LLM inference

    Python

  6. serverlesskv Public