zhangyan (zhangyan-ucas)

Institute of Information Engineering, Chinese Academy of Sciences
Beijing, China

Stars

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook · 13,428 stars · 1,297 forks · Updated Dec 25, 2024

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python · 6,470 stars · 566 forks · Updated Dec 31, 2024

zhangyan-ucas / TEA

Official implementation of the paper "Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues"

Python · 9 stars · Updated Dec 20, 2024

weichaozeng / TextCtrl

[2024-NeurIPS] TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control

Python · 40 stars · 4 forks · Updated Dec 16, 2024

ragavsachdeva / magi

Generate a transcript for your favourite Manga: Detect manga characters, text blocks and panels. Order panels. Cluster characters. Match texts to their speakers. Perform OCR.

307 stars · 12 forks · Updated Dec 20, 2024

clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Python · 5,940 stars · 479 forks · Updated Jul 11, 2024

clovaai / units

Python · 73 stars · 10 forks · Updated Aug 7, 2023

Hon-Wong / Elysium

[ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM

Python · 64 stars · 2 forks · Updated Oct 25, 2024

ViTAE-Transformer / DeepSolo

The official repo for [CVPR'23] "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting" & [ArXiv'23] "DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multi…

Python · 253 stars · 34 forks · Updated Aug 9, 2024