

calflops is designed to calculate FLOPs, MACs, and parameter counts for a wide range of neural networks, such as Linear, CNN, RNN, GCN, and Transformer models (BERT, LLaMA, and other large language models)

Python 700 28 Updated Jun 27, 2024

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Python 5,195 160 Updated Feb 22, 2025

Official Repo for AAAI 2025 G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o

Python 4 1 Updated Dec 17, 2024

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,204 2,134 Updated Feb 1, 2025

[ICLR 2025] Autoregressive Video Generation without Vector Quantization

Python 390 10 Updated Feb 22, 2025

SwinIR: Image Restoration Using Swin Transformer (official repository)

Python 4,656 562 Updated May 14, 2024

This repo contains code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation"

Python 11 Updated Jan 7, 2025

[ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding

Python 29 1 Updated Dec 13, 2024

PyTorch implementation of RCG https://arxiv.org/abs/2312.03701

Python 904 39 Updated Sep 27, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 41,577 5,100 Updated Feb 20, 2025

👁️ 🖼️ 🔥PyTorch Toolbox for Image Quality Assessment, including PSNR, SSIM, LPIPS, FID, NIQE, NRQM(Ma), MUSIQ, TOPIQ, NIMA, DBCNN, BRISQUE, PI and more...

Python 2,255 187 Updated Feb 12, 2025

[ECCV 2024] codes of DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior

Python 3,570 300 Updated Dec 12, 2024

[ECCV2024] Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization

Python 947 63 Updated Sep 23, 2024

Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024

Jupyter Notebook 1,454 82 Updated Jun 28, 2024

PyTorch code and models for the DINOv2 self-supervised learning method.

Jupyter Notebook 9,862 887 Updated Aug 7, 2024

A curated list of face restoration & enhancement papers and resources

TeX 145 19 Updated Sep 16, 2023
Python 111 5 Updated Jul 8, 2024

[ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Python 153 13 Updated Sep 24, 2024

A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team.

Python 69 3 Updated Oct 14, 2024

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

PostScript 18,571 2,256 Updated Nov 13, 2024

LVBench: An Extreme Long Video Understanding Benchmark

Python 80 1 Updated Aug 30, 2024

Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"

Python 405 11 Updated Sep 2, 2024

[ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation

Python 244 9 Updated Feb 14, 2025

Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models

Python 77 8 Updated Jun 28, 2024

Awesome papers & datasets specifically focused on long-term videos.

247 12 Updated Nov 15, 2024

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,624 1,329 Updated Feb 21, 2025

[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, Lei Li, Sishuo Chen, Xu Sun, Lu Hou

Python 102 2 Updated Feb 23, 2025

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

462 19 Updated Dec 14, 2024

Dataset introduced in PlotQA: Reasoning over Scientific Plots

73 8 Updated Jun 20, 2023