- The Chinese University of Hong Kong
- Hong Kong
Stars
A curated list of wireless sensing security works, organized by signal roles: Victims, Weapons, and Shields
Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020)
Python tool for converting files and office documents to Markdown.
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
Open-Sora: Democratizing Efficient Video Production for All
CVPR and NeurIPS poster examples and templates. May we have in-person poster sessions soon!
A lecture note for understanding deep learning
Google Research
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
gprMax is open source software that simulates electromagnetic wave propagation using the Finite-Difference Time-Domain (FDTD) method for numerical modelling of Ground Penetrating Radar (GPR)
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, codes, and related websites
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
[CVPR 2024] The code for paper 'Towards Learning a Generalist Model for Embodied Navigation'
[ICML 2024] Official code repository for 3D embodied generalist agent LEO
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
📚 Jupyter notebook tutorials for OpenVINO™
Self-contained, minimalistic implementation of diffusion models with PyTorch.
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
[ICRA2023] Implementation of Visual Language Maps for Robot Navigation