Stars
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Simple Python library to cache sync/async function results, with disk persistence and I/O tuning options
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Open source RGB lighting control that doesn't depend on manufacturer software. Supports Windows, Linux, macOS. Mirror of https://gitlab.com/CalcProgrammer1/OpenRGB. Releases can be found on GitLab.
High performance self-hosted photo and video management solution.
A Python program that turns an LLM, running on Ollama, into an automated researcher, which will with a single query determine focus areas to investigate, do web searches and scrape content from vari…
Search for words, documents, images, videos, news, maps and text translation using the DuckDuckGo.com search engine. Downloading files and images to a local hard drive.
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Inventory to calculate your level of burnout or burnout recovery. Based on Likert-scale items, bipolar testing, and split-half consistency.
Scriptable database and system performance benchmark
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
PasiKoodaa / F5-TTS
Forked from SWivid/F5-TTS. Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Lightweight and extensible compatibility layer between dataframe libraries!
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
The Phoronix Test Suite: open-source, cross-platform automated testing/benchmarking software.
(Mirror) S3-compatible object store for small self-hosted geo-distributed deployments. Main repo: https://git.deuxfleurs.fr/Deuxfleurs/garage
Unofficial / Community provided Visual Studio Code AppImage - stable release
Large-scale LLM inference engine
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark" [NeurIPS 2024]
Inference and training library for high-quality TTS models.