Stars
Android real-time display control software
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
A lightweight library for portable low-level GPU computation using WebGPU.
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Pure C++ implementation of several models for real-time chatting on your computer (CPU)
Java Bindings for llama.cpp - A Port of Facebook's LLaMA model in C/C++
Public Domain library for rectifying Chinese coordinates
llama.cpp fork with additional SOTA quants and improved performance
LM inference server implementation based on llama.cpp.