
Rethinking Step-by-step Visual Reasoning in LLMs

Python 238 15 Updated Jan 24, 2025

An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the si…

TypeScript 9,960 941 Updated Feb 11, 2025

Python bindings for llama.cpp

Python 8,591 1,050 Updated Jan 29, 2025

LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

Python 341 15 Updated Jan 13, 2025

This repository collects research papers of large Vision Language Models in Autonomous driving and Intelligent Transportation System. The repository will be continuously updated to track the lates…

216 16 Updated Feb 10, 2025

Fully open reproduction of DeepSeek-R1

Python 18,783 1,580 Updated Feb 11, 2025

[ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer

Python 36 2 Updated Oct 18, 2023

[CVPR 2024] MAPLM: A Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding

Python 121 3 Updated Nov 20, 2023

TensorDict is a PyTorch-dedicated tensor container.

Python 874 79 Updated Feb 10, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 9,660 1,251 Updated Feb 1, 2025

a family of versatile and state-of-the-art video tokenizers.

Python 329 21 Updated Jan 15, 2025

VideoX: a collection of video cross-modal models

Python 1,004 164 Updated Jun 3, 2024

PyTorch implementation for CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation

Python 33 Updated Jan 8, 2025

This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.

TypeScript 4,915 497 Updated Jan 27, 2025

CUDA Templates for Linear Algebra Subroutines

C++ 6,186 1,064 Updated Feb 11, 2025

Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)

Python 1,864 206 Updated May 20, 2024

Official implementation of "Why are Visually-Grounded Language Models Bad at Image Classification?" (NeurIPS 2024)

Jupyter Notebook 67 4 Updated Oct 19, 2024

tiny vision language model

Python 7,303 570 Updated Feb 7, 2025

Microsoft Automatic Mixed Precision Library

Python 563 47 Updated Sep 29, 2024
1 Updated Feb 7, 2025

An open-source implementation for fine-tuning SmolVLM.

Python 7 Updated Jan 24, 2025

Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas

Python 3,425 461 Updated Jan 26, 2025

A suite of image and video neural tokenizers

Jupyter Notebook 1,547 66 Updated Feb 11, 2025

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Python 7,435 464 Updated Feb 11, 2025

Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali

Python 1,786 122 Updated Feb 11, 2025

fast python port of arc90's readability tool, updated to match latest readability.js!

Python 2,719 351 Updated Jan 15, 2025

Fast low-bit matmul kernels in Triton

Python 227 18 Updated Feb 11, 2025

🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.

Python 10,042 947 Updated Feb 11, 2025

Implementing the 4 agentic patterns from scratch

Jupyter Notebook 1,016 106 Updated Jan 25, 2025