Stars
Towards Modality Generalization: A Benchmark and Prospective Analysis
TensorZero creates a feedback loop for optimizing LLM applications — turning production data into smarter, faster, and cheaper models.
LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]
[ICLR2023] Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation (CDCD).
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
ImageBind: One Embedding Space to Bind Them All
MU-LLaMA: Music Understanding Large Language Model
Evaluation functions for music/audio information retrieval/signal processing algorithms.
A curated list of Video to Audio Generation
Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model
Manually annotated chord dataset of US pop songs and the Popular Music Collection of the RWC Music Database
SD-Trainer: LoRA & Dreambooth training scripts and GUI for diffusion models, using kohya-ss's trainer.
A large-scale dataset of caption-annotated MIDI files.
This repository contains metadata for the WavCaps dataset and code for downstream tasks.
The Song Describer dataset is an evaluation dataset made of ~1.1k captions for 706 permissively licensed music recordings.
Stable Diffusion web UI
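Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant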
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A curated list of awesome 3d generation papers
Responsive Resume/CV Website Using HTML, CSS, and JavaScript
A modern static resume template and theme. Powered by Jekyll and GitHub pages.