DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 3,470 1,458 Updated Feb 9, 2025

Code for "Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation"

Python 59 1 Updated Dec 12, 2024

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 15,659 2,056 Updated Feb 1, 2025

A C++/Python implementation of the StreetLearn environment based on images from Street View, as well as a TensorFlow implementation of goal-driven navigation agents solving the task published in “L…

C++ 303 61 Updated Jul 21, 2020

Multimodal Large Language Models for Remote Sensing (RS-MLLMs): A Survey

188 5 Updated Jan 23, 2025

[AAAI 2025] This repo contains evaluation code for the paper “UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios”

Python 19 1 Updated Feb 9, 2025

Awesome-Remote-Sensing-Vision-Language-Models

154 8 Updated Apr 27, 2024

The official PyTorch implementation of Exploring the Interactive Guidance for Unified and Effective Image Matting [arXiv]

Python 24 Updated Mar 29, 2024

The official implementation of the paper “Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm”

Python 42 2 Updated Dec 14, 2024

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Python 836 61 Updated Jan 16, 2025

✨✨Latest Advances on Multimodal Large Language Models

13,817 890 Updated Feb 11, 2025

[ICLR 2025 Spotlight] The official implementation of the paper “LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models”

Python 127 1 Updated Feb 11, 2025

This is the repo for the paper Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining.

39 Updated Dec 3, 2024

Awesome lists about framework figures in papers

691 16 Updated Feb 2, 2025

[ECCV 2024] The official implementation of the paper "Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network".

Python 61 2 Updated Feb 11, 2025

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.

Python 25,607 1,948 Updated Feb 11, 2025

Official code for CVPR 2022 paper "Rethinking Visual Geo-localization for Large-Scale Applications"

Python 311 59 Updated Jun 2, 2024

[ECCV-2020 (spotlight)] Self-supervising Fine-grained Region Similarities for Large-scale Image Localization. 🌏 PyTorch open-source toolbox for image-based localization (place recognition).

Python 272 41 Updated Jul 28, 2021

A Survey on Vision-Language Geo-Foundation Models (VLGFMs)

151 8 Updated Jan 14, 2025

Papers related to remote sensing in CVPR 2024

170 10 Updated Jun 24, 2024

[CVPR 2024] 3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions

Python 56 3 Updated Aug 30, 2024

VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis

Python 73 6 Updated Jan 8, 2025
Jupyter Notebook 18 6 Updated Mar 18, 2024

A professional list on Multi-modal Data Fusion Models and Key Datasets for Urban Computing.

123 10 Updated Dec 16, 2024

An Awesome Collection of Urban Foundation Models (UFMs).

154 14 Updated Feb 2, 2025

[CVPR 2024, Highlight] The official implementation of the paper "SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation".

39 2 Updated Dec 6, 2024
Python 65 3 Updated Jun 13, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 23,283 2,300 Updated Feb 12, 2025

Official implementation of CVPR 2024 paper: "FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition"

Python 457 15 Updated Oct 21, 2024