Stars
Inefficient sample code for getting screen contents in Unity on Meta Quest to work around the lack of 'camera access'
[NAACL Findings 2024] PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits
A Web UI for easy subtitle generation using the Whisper model.
OpenMMLab Rotated Object Detection Toolbox and Benchmark
BLSP-Emo: Towards Empathetic Large Speech-Language Models
Korean Sentence Embedding Repository
CoreNet: A library for training deep neural networks
[CVPR 2024 Highlight] Official PyTorch implementation of SpatialTracker: Tracking Any 2D Pixels in 3D Space
A few demos of Unity visionOS 2D Window and Fully Immersive VR modes.
Diffusion model papers, survey, and taxonomy
Generate images with predefined facial expressions.
[BIONLP@ACL 2024] XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.
Radiology Objects in COntext (ROCO): A Multimodal Image Dataset
Official Code and Dataset for "High-fidelity 3D Human Digitization from Single 2K Resolution Images" (CVPR 2023 Highlight)
Reading list for research topics in multimodal machine learning
✨✨Latest Advances on Multimodal Large Language Models
[IJCARS] An official repository for "Minimal Data Requirement for Realistic Endoscopic Image Generation with Stable Diffusion"
Deep Learning Paper Reading Meeting-Archive
ImageBind: One Embedding Space to Bind Them All