Stars
Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)
Digital Avatar Conversational System - Linly-Talker. ✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…
shamith09 / pygyat
Forked from mathialo/bython
Python with rizz.
Generate music based on natural language prompts using LLMs running locally
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Backtests several trading strategies and ranks them according to their profit return.
A collection of open-source server components and Python libraries for financial data projects and automated trading.
adefossez / demucs
Forked from facebookresearch/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
A collection of prompts, system prompts and LLM instructions
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Open-Source Web UI for Apache Kafka Management
Real-time face swap and one-click video deepfake with only a single image
[INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark
Library to simplify the way you create and manipulate sounds with the Web Audio API.
SaaS Boilerplate - Open Source and free SaaS stack that lets you build SaaS products faster in React, Django and AWS. Focus on essential business logic instead of coding repeatable features!
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…
Official Code for MotionCtrl [SIGGRAPH 2024]
ComfyUI nodes for Stable Video Diffusion
Universal and Transferable Attacks on Aligned Language Models
Improved AnimateDiff for ComfyUI and Advanced Sampling Support
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.