Stars
A high-throughput and memory-efficient inference and serving engine for LLMs
get things from one computer to another, safely
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
📝 A text file containing 479k English words for all your dictionary/word-based projects, e.g. auto-completion / autosuggestion
📚 Parameterize, execute, and analyze notebooks
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)
prompt2model - Generate Deployable Models from Natural Language Instructions
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Code for the EMNLP 2019 paper "Text Summarization with Pretrained Encoders"
Data and tools for generating and inspecting OLMo pre-training data.
Python client for the Twitter 'search Tweets' and 'count Tweets' endpoints (v2/Labs/premium/enterprise). Now supports Twitter API v2 /recent and /all search endpoints.
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
A Python library that interfaces with the MediaWiki API. This is a mirror from gerrit.wikimedia.org. Do not submit any patches here. See https://www.mediawiki.org/wiki/Developer_account for contrib…
Implementation of ChatGPT RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
Code for ACL 2020 paper: "Extractive Summarization as Text Matching"
Data and software for building the ACL Anthology.
A tool for holistic analysis of language generation systems
A tool that uses AI to automatically recommend commit messages.
Fetch an academic paper or web article and send it to the reMarkable tablet with a single command
BARTScore: Evaluating Generated Text as Text Generation
SoTA Abstract Meaning Representation (AMR) parsing with word-node alignments in PyTorch. Includes checkpoints and other tools such as statistical significance Smatch.
SacreROUGE is a library dedicated to the use and development of text generation evaluation metrics with an emphasis on summarization.
Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction
A Toolkit for Distributional Control of Generative Models