yamano1212
Real-time Speech-Text Foundation Model Toolkit (wip)

Python 128 12 Updated Oct 14, 2024

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 7,483 597 Updated Feb 9, 2025

Open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for designing complex, interactive environments where agents can act,…

Python 895 72 Updated Jan 30, 2025

A simple, easy-to-hack GraphRAG implementation

Python 2,375 228 Updated Jan 15, 2025

HumanLayer enables AI agents to communicate with humans in tool-based and async workflows. Guarantee human oversight of high-stakes function calls with approval workflows across slack, email and mo…

Python 586 50 Updated Feb 6, 2025

Kanji vector graphics

Python 1,108 188 Updated Jan 30, 2025

Repository for the Lux AI Challenge, season 3 @NeurIPS 24. Hosted on @kaggle

Python 299 63 Updated Feb 5, 2025
Python 10 1 Updated Aug 20, 2024
Jupyter Notebook 23 1 Updated Mar 7, 2023

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.

Python 7,016 534 Updated Dec 25, 2024

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

Python 67,164 7,202 Updated Feb 15, 2025
Python 5,694 930 Updated Feb 15, 2025

Whisper with Medusa heads

Python 822 50 Updated Feb 11, 2025

Various AI scripts. Mostly Stable Diffusion stuff.

Python 4,004 450 Updated Feb 15, 2025
Python 1,872 133 Updated Nov 8, 2024

Train high-quality text-to-image diffusion models in a data & compute efficient manner

Python 475 36 Updated Feb 12, 2025

[ICCV 2023] DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders

Jupyter Notebook 1,211 126 Updated Dec 31, 2024

[ECCV 2024] OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models

Python 682 45 Updated Jul 2, 2024

A general fine-tuning kit geared toward diffusion models.

Python 2,082 198 Updated Feb 10, 2025

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Python 11,403 833 Updated Jul 18, 2024

LLaVA-JP is a Japanese VLM trained with the LLaVA method

Python 59 13 Updated Jul 3, 2024
Python 3,395 310 Updated Feb 13, 2025

A repository of Japanese Phoneme-Level BERT

Python 22 2 Updated Dec 16, 2023

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Python 821 126 Updated Jan 6, 2025

User dictionary in openjtalk format

Python 5 Updated Feb 26, 2024

Code for HyperSeg and HyperSum

Python 12 Updated May 17, 2024

Detect file content types with deep learning

Rust 8,420 435 Updated Feb 10, 2025

Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.

Jupyter Notebook 79 6 Updated Oct 18, 2023