Starred repositories
Visualize streams of multimodal data. Free, fast, easy to use, and simple to integrate. Built in Rust.
GLM-4 series: Open Multilingual Multimodal Chat LMs
A paper list of robotic grasping and related works
The official codebase for ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation (CVPR 2024)
DownKyi (哔哩下载姬), a video download tool for the Bilibili website; supports batch downloading, 8K, HDR, and Dolby Vision, and provides a toolbox (audio/video extraction, watermark removal, etc.)
👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search…
Parameter-Efficient Abstractive Question Answering over Tables and over Text
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
XLNet: Generalized Autoregressive Pretraining for Language Understanding
PyTorch implementation of "SMITE: Segment Me In TimE"
PDDLStream: Integrating Symbolic Planners and Blackbox Samplers
👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024)
[AAAI 2024 Oral] M2CLIP: A Multimodal, Multi-Task Adapting Framework for Video Action Recognition
A comprehensive list of robot manipulation resources, including papers, code, and related websites.
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
This repository contains the bare minimum for communicating with the Shadow Hand from a remote computer: URDF models and messages.
Materials for the Ability Hand API, including documentation, URDF, MATLAB, and Python examples.