This repo is the official implementation of the [**ECCV2024**](https://eccv.ecva.net/) paper *GiT: Towards Generalist Vision Transformer through Universal Language Interface*.
> [Haiyang Wang*](https://scholar.google.com/citations?user=R3Av3IkAAAAJ&hl=en&oi=ao), [Hao Tang*](https://scholar.google.com/citations?user=MyarrsEAAAAJ&hl=en), [Li Jiang](https://scholar.google.com/citations?user=5cIodxsAAAAJ&hl=en) $^\dagger$, [Shaoshuai Shi](https://scholar.google.com/citations?user=DC9wzBgAAAAJ&hl=en&oi=ao), [Muhammad Ferjad Naeem](https://scholar.google.com/citations?user=PR2DwYYAAAAJ&hl=en), [Hongsheng Li](https://scholar.google.com/citations?user=BN2Ze-QAAAAJ&hl=en&oi=ao), [Bernt Schiele](https://scholar.google.com/citations?user=z76PBfYAAAAJ&hl=en), [Liwei Wang](https://scholar.google.com/citations?user=VZHxoh8AAAAJ&hl=en) $^\dagger$
> - Primary contact: Haiyang Wang ( [email protected] ), Hao Tang ( [email protected] )

## Overview
- [💫 What we want to do](https://github.com/Haiyang-W/GiT?tab=readme-ov-file#-what-we-want-to-do)
- [🤔 Introduction](https://github.com/Haiyang-W/GiT?tab=readme-ov-file#-introduction)
- [👀 Todo](https://github.com/Haiyang-W/GiT?tab=readme-ov-file#-todo)
- [🚀 Main Results](https://github.com/Haiyang-W/GiT?tab=readme-ov-file#-main-results)
- [🛠️ Quick Start](https://github.com/Haiyang-W/GiT?tab=readme-ov-file#%EF%B8%8F-quick-start)
- [👍 Acknowledgments](https://github.com/Haiyang-W/GiT?tab=readme-ov-file#-acknowledgement)
- [📘 Citation](https://github.com/Haiyang-W/GiT?tab=readme-ov-file#-citation)

<div align="center">
<img src="assets/Figure1.png" width="800"/>
</div>
## 📣 News
- [24-7-01] 🤗 GiT was accepted by [ECCV2024](https://eccv.ecva.net/).
- [24-3-15] 🚀 Training and inference code is released.
- [24-3-15] 👀 GiT is released on [arXiv](https://arxiv.org/abs/2403.09394).
## 💫 What we want to do
### Model architectures across various AI domains are converging towards <font color=Red>*Multi-Layer Plain Transformers*</font>.
We aim to unify the model architecture of vision and language through a plain transformer, **reducing human biases** such as modality-specific encoders and task-specific heads. A key advancement in deep learning is the shift from hand-crafted to autonomously learned features, which inspires us to likewise reduce the human-designed aspects of the architecture. Moreover, benefiting from the flexibility of plain transformers, our framework can extend to more modalities, such as [point clouds](https://github.com/Haiyang-W/UniTR) and graphs.
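To make this concrete, here is a minimal sketch of the idea (not the project's actual implementation): a single stack of plain transformer layers consumes image patches and text tokens in one shared sequence, with no modality-specific encoder and no task-specific head. All hyperparameters and names below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PlainMultiModalTransformer(nn.Module):
    """Sketch: one plain transformer shared by vision and language.

    Images and text are turned into tokens by two lightweight embeddings,
    concatenated into a single sequence, and processed by the same stack
    of plain transformer layers; a single vocabulary head produces the
    output for every task.
    """

    def __init__(self, vocab_size=32000, d_model=768, n_layers=12,
                 n_heads=12, patch_size=16, in_chans=3):
        super().__init__()
        # Linear patch projection (a ViT-style patch embedding).
        self.patch_embed = nn.Conv2d(in_chans, d_model,
                                     kernel_size=patch_size, stride=patch_size)
        self.token_embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)
        # One language-modeling head; no detection or segmentation heads.
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, image, text_ids):
        patches = self.patch_embed(image).flatten(2).transpose(1, 2)  # (B, N, d)
        words = self.token_embed(text_ids)                            # (B, T, d)
        seq = torch.cat([patches, words], dim=1)   # one shared token sequence
        return self.lm_head(self.blocks(seq))      # logits over the vocabulary
```

Positional embeddings and the causal masking needed for autoregressive decoding are omitted here for brevity.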
## 🤔 Introduction
Building a universal computation model that works across all tasks is a cornerstone of artificial intelligence, as it reduces the need for task-specific designs. In this project, we introduce GiT (**G**eneralist V**i**sion **T**ransformer). GiT has the following characteristics:
- 😮 **Minimalist architecture design similar to LLMs**: GiT consists solely of a single transformer, without additional vision encoders or adapters.
- 🚀 **Covering all types of visual understanding tasks**: GiT addresses a spectrum of visual tasks, including object-level tasks (e.g., object detection), pixel-level tasks (e.g., semantic segmentation), and vision-language tasks (e.g., image captioning).
- 🤗 **Multi-task ability through a unified language interface**: Similar to LLMs, GiT exhibits task synergy in multi-task training: tasks mutually enhance one another, yielding significant improvements over isolated training (a toy serialization example follows this list).
- 🔥 **Strong performance on zero-shot and few-shot benchmarks**: GiT scales well with model size and data, demonstrating remarkable generalizability across diverse scenarios after training on 27 datasets.
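As a toy illustration of what a unified language interface can look like, the sketch below serializes heterogeneous targets into one flat token stream; the `<bin_*>` location vocabulary and the templates are assumptions for illustration, not the paper's exact serialization scheme.

```python
# Toy sketch of a unified language interface: every task's target becomes
# one flat token sequence, so a single next-token objective covers all tasks.
# The <bin_*> vocabulary and templates are assumptions for illustration only.

def serialize_detection(boxes, labels, num_bins=1000):
    """Quantize normalized box coordinates into discrete location tokens."""
    tokens = []
    for (x1, y1, x2, y2), label in zip(boxes, labels):
        tokens += [f"<bin_{int(c * (num_bins - 1))}>" for c in (x1, y1, x2, y2)]
        tokens.append(label)
    return tokens

def serialize_caption(caption):
    """Captions are already language; whitespace-tokenize for this demo."""
    return caption.split()

print(serialize_detection([(0.1, 0.2, 0.5, 0.8)], ["dog"]))
# ['<bin_99>', '<bin_199>', '<bin_499>', '<bin_799>', 'dog']
print(serialize_caption("a dog runs on grass"))
# ['a', 'dog', 'runs', 'on', 'grass']
```

Because both targets live in the same discrete output space, one shared decoder can be trained on all of them, which is what makes the multi-task synergy described above possible.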
## 👀 Todo

- [x] Release the [arXiv](https://arxiv.org/abs/2403.09394) version.