Rhatanii


Official Implementation of Video-MA2MBA

JavaScript 10 Updated Dec 3, 2024

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.

Python 2,136 269 Updated Jan 10, 2025

Dialogue model that produces empathetic responses when trained on the EmpatheticDialogues dataset.

Python 477 65 Updated Dec 3, 2021

HumanOmni

Python 75 4 Updated Mar 3, 2025

Official implementation of USR (NeurIPS 2024)

Python 30 2 Updated Dec 21, 2024

Official Implementation of SALOVA

JavaScript 5 Updated Nov 27, 2024

Visualize the intermediate output of Mistral 7B

Python 343 15 Updated Jan 22, 2025

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 1,094 72 Updated Jan 23, 2025

Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language (AAAI 2025)

Python 9 3 Updated Sep 6, 2024

This repository contains speaker-labeled information for the VoxCeleb2 and LRS3 audio-visual datasets. (AAAI 2025)

Python 8 Updated Sep 6, 2024

Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)

Python 14 1 Updated Sep 24, 2024

💬 An extensive collection of exceptional resources dedicated to the captivating world of talking face synthesis! ⭐ If you find this repo useful, please give it a star! 🤩

1,016 56 Updated Feb 10, 2025

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Python 7,716 660 Updated Aug 13, 2024

[CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation

Python 32 2 Updated Sep 6, 2024

Python 311 25 Updated May 19, 2024

[ACL 2024 Findings] Official PyTorch Implementation code for realizing the technical part of CoLLaVO: Crayon Large Language and Vision mOdel to significantly improve zero-shot vision language perfo…

Python 95 14 Updated Jun 28, 2024

PyTorch implementation of "Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens"

Python 11 Updated Mar 9, 2024

PyTorch implementation that adds new features to the Segment-Anything code. The features support batch input on the full-grid prompt (automatic mask generation) with post-process…

Python 152 10 Updated Dec 7, 2023

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 31,092 6,487 Updated Jan 9, 2025

Multi-server GPU monitoring utilities

Python 39 8 Updated Sep 17, 2019

Official code for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)

Jupyter Notebook 1,584 215 Updated Nov 29, 2022