Stars
Bringing BERT into modernity via both architecture changes and scaling
This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
[AAAI 2024 Oral] M2CLIP: A Multimodal, Multi-Task Adapting Framework for Video Action Recognition
A custom RPC framework implemented with Netty+Kryo+Zookeeper, with a detailed implementation walkthrough and related tutorials.
🚀 Train a 27M-parameter vision-language model (VLM) from scratch in just 3 hours!
openvla / openvla
Forked from TRI-ML/prismatic-vlms. OpenVLA: An open-source vision-language-action model for robotic manipulation.
Isaac Gym Environments for Legged Robots
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
A generative world for general-purpose robotics & embodied AI learning.
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
[NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
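Llama Chinese Community: the Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.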
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…
Tools to Design or Visualize Architecture of Neural Network
FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models
FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment
[ACMMM 2024] Implementation of the paper “Multi-Modality Co-Learning for Efficient Skeleton-based Action Recognition”.
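A comprehensive C++ resource list (Chinese edition): standard libraries, web application frameworks, artificial intelligence, databases, image processing, machine learning, logging, code analysis, and more. Maintained and updated by the teams behind the WeChat official accounts 「开源前哨」 and 「CPP开发者」.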