Stars
Don't Judge by the Look: Towards Motion Coherent Video Representation (ICLR2024)
[ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Open Thoughts: Fully Open Data Curation for Thinking Models
Bringing BERT into modernity via both architecture changes and scaling
This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
[AAAI 2024 Oral] M2CLIP: A Multimodal, Multi-Task Adapting Framework for Video Action Recognition
A custom RPC framework implemented with Netty + Kyro + Zookeeper, with a detailed implementation walkthrough and related tutorials.
🚀 Train a 27M-parameter vision-language model (VLM) from scratch in just 3 hours!
openvla / openvla
Forked from TRI-ML/prismatic-vlms. OpenVLA: an open-source vision-language-action model for robotic manipulation.
Isaac Gym Environments for Legged Robots
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
A generative world for general-purpose robotics & embodied AI learning.
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
[NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Llama Chinese community: the Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are aggregated in real time, and all code has been updated for Llama3, aiming to build the best Chinese Llama model, fully open source and commercially usable.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…