Stars
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
Official repository for KoMT-Bench built by LG AI Research
An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the si…
Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch
Unified automatic quality assessment for speech, music, and sound.
Investment Research for Everyone, Everywhere.
Ola: Pushing the Frontiers of Omni-Modal Language Model
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Everything you need to build state-of-the-art foundation models, end-to-end.
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension
Official inference repo for FLUX.1 models
[NeurIPS 2024 Spotlight] The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"
Motion-Controllable Video Diffusion via Warped Noise
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling
Repository for Accent Recognition (Hackathon @SLT2022)
Self-Supervised Speech Pre-training and Representation Learning Toolkit
VoiceBench: Benchmarking LLM-Based Voice Assistants
This toolbox aims to unify audio generation model evaluation for easier comparison.
Algorithms for Intelligent Assessment of Human Personality Traits based on Multimodal Data, for ranking potential candidates to perform professional responsibilities
Learn how to use the Cognitive Services Python SDK with these samples