Stars
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Efficient LLM Inference over Long Sequences
Ip2region (2.0 - xdb) is an offline IP address management framework and locator that supports billions of data segments with ten-microsecond lookup performance. xdb engine implementations for many programming…
The official Python library for the OpenAI API
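A minimal sketch of a chat completion call with this client, assuming openai >= 1.0 and an OPENAI_API_KEY in the environment (the model name is just an example):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```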
Things you can do with the token embeddings of an LLM
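One example of such a thing, as a hedged sketch using Hugging Face transformers (an assumed tool choice, not necessarily what the repo uses): read the input-embedding matrix out of a small model and list the tokens nearest to a query token by cosine similarity.

```python
import torch
from transformers import AutoModel, AutoTokenizer

name = "gpt2"  # small model chosen purely for illustration
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

emb = model.get_input_embeddings().weight.detach()  # (vocab_size, dim)
query = emb[tok.encode("king")[0]]                  # embedding of the query token
sims = torch.nn.functional.cosine_similarity(query, emb)
for i in sims.topk(5).indices:                      # 5 nearest tokens in embedding space
    print(repr(tok.decode([int(i)])))
```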
You like pytorch? You like micrograd? You love tinygrad! ❤️
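The autograd flavor in one tiny example, adapted from the project README:

```python
from tinygrad import Tensor

x = Tensor.eye(3, requires_grad=True)
y = Tensor([[2.0, 0, -2.0]], requires_grad=True)
z = y.matmul(x).sum()  # forward pass, micrograd-style scalar loss
z.backward()

print(x.grad.numpy())  # dz/dx
print(y.grad.numpy())  # dz/dy
```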
llama3 implementation one matrix multiplication at a time
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
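A sketch against minbpe's documented BasicTokenizer interface (the training text and vocab size are arbitrary examples; vocab_size - 256 is the number of merges learned):

```python
from minbpe import BasicTokenizer

tokenizer = BasicTokenizer()
tokenizer.train("aaabdaaabac" * 100, vocab_size=256 + 12)  # learn 12 merges
ids = tokenizer.encode("aaabdaaabac")
assert tokenizer.decode(ids) == "aaabdaaabac"  # lossless round trip
print(ids)
```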
Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
⚡️HivisionIDPhotos: a lightweight and efficient AI tool for making ID photos.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image.
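The zero-shot classification pattern from the repo README (the image path and candidate labels are placeholders):

```python
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("image.png")).unsqueeze(0).to(device)
text = clip.tokenize(["a diagram", "a dog", "a cat"]).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print("Label probs:", probs)  # highest probability = most relevant snippet
```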
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, and more.
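A minimal sketch of the two-call interface (the language list and image path are examples; models download on first use):

```python
import easyocr

reader = easyocr.Reader(['ch_sim', 'en'])  # load detector + recognizer once
results = reader.readtext('receipt.png')   # list of (bbox, text, confidence)
for bbox, text, conf in results:
    print(f"{conf:.2f}  {text}")
```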
SuperPrompt is an attempt to engineer prompts that might help us understand AI agents.
Official inference repo for FLUX.1 models
A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube videos.
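A hedged sketch of the basic download flow (the URL is a placeholder):

```python
from pytube import YouTube

yt = YouTube("https://www.youtube.com/watch?v=dQw4w9WgXcQ")  # placeholder URL
stream = yt.streams.get_highest_resolution()  # best progressive mp4 stream
stream.download(output_path=".")
```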
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
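A minimal pipeline sketch, with English NER chosen as the example task (models download on first run):

```python
import stanza

stanza.download('en')  # fetch English models once
nlp = stanza.Pipeline('en', processors='tokenize,ner')
doc = nlp("Barack Obama was born in Hawaii.")
for ent in doc.ents:
    print(ent.text, ent.type)  # e.g. PERSON, GPE
```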
A Chinese NLP preprocessing and parsing package: accurate, efficient, and easy to use. www.jionlp.com
Text preprocessing, representation and visualization from zero to hero.
SGLang is a fast serving framework for large language models and vision language models.
Chinese and English sensitive-word lists, language detection, location and carrier lookup for Chinese and international phone numbers, gender inference from names, phone number extraction, ID card number extraction, email extraction, Chinese and Japanese personal name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, subversive word list, violence and terrorism word list, traditional/simplified Chinese conversion, English approximations of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of run-together English, assorted Chinese word vectors, company name collection, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Example models using DeepSpeed
A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).