
Starred repositories
🦉 Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle)
🦙 Integrating LLMs into structured NLP pipelines
📚 Process PDFs, Word documents and more with spaCy
🍬 Confection: the sweetest config system for Python
📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) o…
A list of useful Open Source tools and scrapers to gather data for LLMs
This is the repository content that contains all of the course code
Keep searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget)
A Python module that wraps the pdftoppm utility to convert PDFs to PIL Image objects
📝 Python package to calculate readability statistics of a text object - paragraphs, sentences, articles.
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
A Python library for dewarping/straightening/reformatting document images and PDFs
A text-to-speech (TTS) and Speech-to-Speech (STS) library built on Apple's MLX framework, providing efficient speech synthesis on Apple Silicon.
Toolkit for linearizing PDFs for LLM datasets/training
Examples and guides for using the Gemini API
CCCS security control profiles expressed using OSCAL
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
📝 Design doc template & examples for machine learning systems (requirements, methodology, implementation, etc.)
willwade / tts-wrapper
Forked from mediatechlab/tts-wrapper
TTS-Wrapper makes it easier to use text-to-speech APIs by providing a unified and easy-to-use interface.
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
RAGChecker: A Fine-grained Framework For Diagnosing RAG
Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
🛒 Simple recommender with matrix factorization, graph, and NLP. Beating the regular collaborative filtering baseline.
kachiO / kura
Forked from ivanleomk/kura
Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embeddings recursively. This helps us understand user behaviour on…
Fully open reproduction of DeepSeek-R1