Stars
🦜🔗 Build context-aware reasoning applications
A latent text-to-image diffusion model
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
🔊 Text-Prompted Generative Audio Model
Google Research
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
A multi-voice TTS system trained with an emphasis on quality
High-Resolution Image Synthesis with Latent Diffusion Models
Official inference library for Mistral models
This repository contains demos I made with the Transformers library by HuggingFace.
PyTorch code and models for the DINOv2 self-supervised learning method.
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
Multi-Joint dynamics with Contact. A general purpose physics simulator.
Using low-rank adaptation to quickly fine-tune diffusion models.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Flax is a neural network library for JAX that is designed for flexibility.
Taming Transformers for High-Resolution Image Synthesis
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
Notebooks using the Hugging Face libraries 🤗
Materials for the Hugging Face Diffusion Models Course
A small package to create visualizations of PyTorch execution graphs
Open-source and strong foundation image recognition models.
Kandinsky 2 — multilingual text2image latent diffusion model