Stars
Solve Visual Understanding with Reinforced VLMs
SGLang is a fast serving framework for large language models and vision language models.
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting.
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
A generative world for general-purpose robotics & embodied AI learning.
Mouse and keyboard recording and automation similar to 按键精灵 (Quick Macro); simulates clicks and keystrokes | automate mouse clicks and keyboard input
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Memory-Guided Diffusion for Expressive Talking Video Generation
[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence
[ICLR'23 Spotlight & IJCV'24] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
[CVPR 2025] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving
A new TensorRT integration framework that makes it easy to integrate many tasks
[ACM MM 2022] Official Rail-DB and Rail-Net
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
A C++ framework for programming real-time applications
Python scripts for the Segment Anything 2 (SAM2) model in ONNX
Automate browser-based workflows with LLMs and Computer Vision