Official implementation of Occupancy-Based Dual Contouring (SIGGRAPH Asia 2024).

Python 83 4 Updated Nov 14, 2024

This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.

TypeScript 4,389 408 Updated Jan 16, 2025

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 4,967 479 Updated Nov 5, 2024

ImageNet3D: Towards General-Purpose Object-Level 3D Understanding

Python 15 1 Updated Dec 6, 2024

Dataset for visual perspective taking

Python 8 Updated Oct 8, 2024

[CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding

Python 82 Updated Nov 26, 2024

Papers, code and datasets about deep learning for 3D Object Detection.

605 55 Updated Dec 1, 2023

Code for "Open Vocabulary Monocular 3D Object Detection"

Python 29 Updated Jan 2, 2025

Official repository for our work on micro-budget training of large-scale diffusion models.

Python 1,116 43 Updated Jan 12, 2025

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Python 127 4 Updated Dec 17, 2024

[arXiv 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control

383 10 Updated Jan 12, 2025

Python 717 60 Updated Jan 19, 2025

Python 245 5 Updated Jan 14, 2025

This series will take you on a journey from the fundamentals of NLP and Computer Vision to the cutting edge of Vision-Language Models.

729 67 Updated Jan 7, 2025

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Python 7,173 440 Updated Jan 9, 2025

Official pytorch repository for “Guidance with Spherical Gaussian Constraint for Conditional Diffusion”

Python 52 2 Updated Jul 17, 2024

Official repository for PERSE: Personalized 3D Generative Avatars from A Single Portrait

103 1 Updated Jan 5, 2025

Open-source evaluation toolkit of large vision-language models (LVLMs), support 160+ VLMs, 50+ benchmarks

Python 1,705 245 Updated Jan 22, 2025

Official code implementation of Slow Perception: Let's Perceive Geometric Figures Step-by-step

Python 82 4 Updated Jan 11, 2025

ROOT: VLM based System for Indoor Scene Understanding and Beyond

Jupyter Notebook 19 Updated Jan 22, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 7,672 596 Updated Jan 21, 2025

Official repo and evaluation implementation of VSI-Bench

Python 343 21 Updated Jan 17, 2025

Code release for https://kovenyu.com/WonderWorld/

Python 411 17 Updated Dec 22, 2024

Code for FreeScale, a tuning-free method for higher-resolution visual generation

Python 112 3 Updated Dec 23, 2024

A generative world for general-purpose robotics & embodied AI learning.

Python 23,154 1,924 Updated Jan 22, 2025

A precise and stable CFG for negative prompts, derived via guided sampling with contrastive loss.

Python 4 Updated Dec 27, 2024