Commit
Kye committed Feb 8, 2024
1 parent 3dc0c4c commit ad2e578
Showing 1 changed file with 16 additions and 1 deletion.
17 changes: 16 additions & 1 deletion README.md
@@ -1,7 +1,8 @@
[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# Screen AI
Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding"
Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding". The flow is:
img + text -> patch sizes -> vit -> embed + concat -> attn + ffn -> cross attn + ffn + self attn -> to out. [Paper link](https://arxiv.org/abs/2402.04615)
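
The flow above can be read as a sequence of shape transformations. The following is a minimal shape-only sketch, not the `screenai` package's API: `Shape`, `patchify`, `concat_tokens`, `trace_flow`, and all default sizes (224x224 image, 16-pixel patches, 32 text tokens, width 512) are hypothetical names and values chosen for illustration. It assumes the ViT and the attention blocks preserve the `(batch, tokens, dim)` shape, which is typical for transformer layers but is not confirmed by the README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """Shape of a token sequence: (batch, tokens, dim)."""
    batch: int
    tokens: int
    dim: int


def patchify(batch, height, width, patch_size, dim):
    # Split the image into non-overlapping patch_size x patch_size patches.
    # Each patch is assumed to be projected to one token of width `dim`.
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    n_patches = (height // patch_size) * (width // patch_size)
    return Shape(batch, n_patches, dim)


def concat_tokens(image, text):
    # Concatenate along the token axis; batch size and width must agree.
    if (image.batch, image.dim) != (text.batch, text.dim):
        raise ValueError("image and text must share batch size and dim")
    return Shape(image.batch, image.tokens + text.tokens, image.dim)


def trace_flow(batch=1, height=224, width=224, patch_size=16,
               text_len=32, dim=512):
    """Return (stage name, Shape) pairs for each step of the flow."""
    stages = []
    image = patchify(batch, height, width, patch_size, dim)
    stages.append(("patch sizes", image))
    # Assumption: the ViT encoder keeps the number of tokens and the width.
    stages.append(("vit", image))
    text = Shape(batch, text_len, dim)
    fused = concat_tokens(image, text)
    stages.append(("embed + concat", fused))
    # Assumption: attention and feed-forward blocks are shape-preserving.
    # A real output head may project to a different width; omitted here.
    for name in ("attn + ffn", "cross attn + ffn + self attn", "to out"):
        stages.append((name, fused))
    return stages


if __name__ == "__main__":
    for name, shape in trace_flow():
        print(f"{name}: ({shape.batch}, {shape.tokens}, {shape.dim})")
```

With these assumed defaults, a 224x224 image yields 14 x 14 = 196 patch tokens, and concatenating 32 text tokens gives a fused sequence of 228 tokens that the later blocks carry through unchanged.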

## Install
`pip3 install screenai`
@@ -41,3 +42,17 @@ print(out.shape)

# License
MIT


## Citation
```bibtex
@misc{baechler2024screenai,
title={ScreenAI: A Vision-Language Model for UI and Infographics Understanding},
author={Gilles Baechler and Srinivas Sunkara and Maria Wang and Fedir Zubach and Hassan Mansoor and Vincent Etter and Victor Cărbune and Jason Lin and Jindong Chen and Abhanshu Sharma},
year={2024},
eprint={2402.04615},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
