yamano1212

9 followers · 14 following

Achievements

Stars

yangdongchao / RSTnet

Real-time Speech-Text Foundation Model Toolkit (wip)

Python 128 12 Updated Oct 14, 2024

kyutai-labs / moshi

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 7,483 597 Updated Feb 9, 2025

Thytu / Agentarium

open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for designing complex, interactive environments where agents can act,…

Python 895 72 Updated Jan 30, 2025

gusye1234 / nano-graphrag

A simple, easy-to-hack GraphRAG implementation

Python 2,375 228 Updated Jan 15, 2025

humanlayer / humanlayer

HumanLayer enables AI agents to communicate with humans in tool-based and async workflows. Guarantee human oversight of high-stakes function calls with approval workflows across slack, email and mo…

Python 586 50 Updated Feb 6, 2025

KanjiVG / kanjivg

Kanji vector graphics

Python 1,108 188 Updated Jan 30, 2025

Lux-AI-Challenge / Lux-Design-S3

Repository for the Lux AI Challenge, season 3 @NeurIPS 24. Hosted on @kaggle

Python 299 63 Updated Feb 5, 2025

pashanitw / W2V2-BERT-ASR-Training

Python 13 1 Updated Mar 25, 2024

pashanitw / xeus-finetune

Python 10 1 Updated Aug 20, 2024

h1karu-s / pretraining_LayoutLMv3_PubLayNet

Jupyter Notebook 23 1 Updated Mar 7, 2023

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 7,016 534 Updated Dec 25, 2024

comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 67,164 7,202 Updated Feb 15, 2025

kohya-ss / sd-scripts

Python 5,694 930 Updated Feb 15, 2025

aiola-lab / whisper-medusa

Whisper with Medusa heads

Python 822 50 Updated Feb 11, 2025

ostris / ai-toolkit

Various AI scripts. Mostly Stable Diffusion stuff.

Python 4,004 450 Updated Feb 15, 2025

XLabs-AI / x-flux

Python 1,872 133 Updated Nov 8, 2024

apple / ml-mdm

Train high-quality text-to-image diffusion models in a data & compute efficient manner

Python 475 36 Updated Feb 12, 2025

piddnad / DDColor

[ICCV 2023] DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders

Jupyter Notebook 1,211 126 Updated Dec 31, 2024

kongzhecn / OMG

[ECCV 2024] OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models

Python 682 45 Updated Jul 2, 2024

bghira / SimpleTuner

A general fine-tuning kit geared toward diffusion models.

Python 2,082 198 Updated Feb 10, 2025

instantX-research / InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Python 11,403 833 Updated Jul 18, 2024

kousw / experimental-consistory

Python 109 6 Updated Mar 3, 2024

tosiyuki / LLaVA-JP

LLaVA-JP is a Japanese VLM trained by LLaVA method

Python 59 13 Updated Jul 3, 2024

LLaVA-VL / LLaVA-NeXT

Python 3,395 310 Updated Feb 13, 2025

kyamauchi1023 / PL-BERT-ja

A repository of Japanese Phoneme-Level BERT

Python 22 2 Updated Dec 16, 2023

wenet-e2e / wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Python 821 126 Updated Jan 6, 2025

WariHima / KanaYomi-dict

User dictionary in openjtalk format

Python 5 Updated Feb 26, 2024

seongminp / hyperseg

Code for HyperSeg and HyperSum

Python 12 Updated May 17, 2024

google / magika

Detect file content types with deep learning

Rust 8,420 435 Updated Feb 10, 2025

FrenchKrab / IS2023-powerset-diarization

Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.

Jupyter Notebook 79 6 Updated Oct 18, 2023