Lists (27)
audio_recognition
autonomous-driving-system
computer-vision
conflicts
courses
distributed_inference_mixed_hw
distributed inference on heterogeneous hardware
🔮 Future ideas
gis
image-editing
js
linux
LISA
LLM
macos
ML
ocr
python
python-compilers
rag
llm Retrieval-augmented generation
robotics
simulation
statistics
tagging
vue
webmap
webserver
Writing
Stars
Writing Extension for Text Generation WebUI
An example of utilizing large language models (LLMs) from within the Ren'Py engine
A zero dependency web UI for any LLM backend, including KoboldCpp, OpenAI and AI Horde
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
OCR, layout analysis, reading order, table recognition in 90+ languages
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
A modular graph-based Retrieval-Augmented Generation (RAG) system
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
This tool uses AI to evaluate your pronunciation.
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
ROS 2 enabled 2D mobile robot simulator for behavior prototyping.
Python sample codes and textbook for robotics algorithms.
Navigation2's dynamic obstacle detection, tracking, and processing pipelines.
[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models
[ICML'24 Oral] "MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions"
Official Code for DragGAN (SIGGRAPH 2023)
Unofficial Implementation of DragGAN - "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold" (full-featured DragGAN implementation with online demo and local deployment; code and models fully open source; supports Windows, macOS, Linux)
🆙 Upscayl - #1 Free and Open Source AI Image Upscaler for Linux, MacOS and Windows.
Python scripts for the Segment Anything 2 (SAM2) model in ONNX
The entrance repository of Markdown presentation ecosystem
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
Use Florence 2 to auto-label data for use in training fine-tuned object detection models.
The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models."
Strong and Open Vision Language Assistant for Mobile Devices