Stars
Efficient Triton Kernels for LLM Training
Create architecture diagrams from code automatically using large language models (LLMs).
Open Control Plane for Tables in Data Lakehouse
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
DSPy: The framework for programming—not prompting—language models
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
Open source project for data preparation for LLM application builders
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
Open-source vector similarity search for Postgres
Header-only C++/Python library for fast approximate nearest neighbors
A library for efficient similarity search and clustering of dense vectors.
A tutorial on building an LSM-Tree storage engine in a week.
A RocksDB-compatible KV storage engine with better performance
🎨 Diagram as Code for prototyping cloud system architectures
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…
Command-line (CLI) tool to inspect Apache Parquet files on the go
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
💬 An extensive collection of exceptional resources dedicated to the captivating world of talking face synthesis! ⭐ If you find this repo useful, please give it a star! 🤩
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
One minute of voice data can be enough to train a good TTS model! (few-shot voice cloning)
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…
This is the official repository for OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering [CVPR2023].
Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
Includes notes on using Apache Spark in general, notes on using Spark for Physics, how to run TPCDS on PySpark, how to create histograms with Spark, tools for performance testing CPUs, Jupyter note…
Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an Apache Spark Performance Dashboard using container technology.