Update README.md
Update readme
MulongXie authored Jul 6, 2021
1 parent a21afa4 commit 78332cf
Showing 1 changed file with 27 additions and 31 deletions: README.md

# UIED - UI element detection, detecting UI elements from UI screenshots or drawings

This project is still ongoing and this repo may be updated irregularly. I have also developed a web app for UIED at http://uied.online.

## Related Publications:
[1. UIED: a hybrid tool for GUI element detection](https://dl.acm.org/doi/10.1145/3368089.3417940)

[2. Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination?](https://arxiv.org/abs/2008.05132)

## What is it?

UI Element Detection (UIED) is an old-fashioned computer vision (CV) based element detection approach for graphical user interfaces.

The input of UIED can be various UI images, such as mobile app or web page screenshots, UI designs drawn in Photoshop or Sketch, and even hand-drawn UI designs. The approach then detects and classifies text and graphic UI elements, and exports the detection results as a JSON file for further use.
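
For illustration, a minimal sketch of inspecting an exported result in Python; the file path and the field names (``compos``, ``class``, ``position``) are assumptions, as the JSON schema is not documented here:

```python
import json

# Hypothetical sketch: the output path and the field names below are
# assumptions for illustration, not UIED's documented schema.
with open('data/output/result.json') as f:
    result = json.load(f)

for compo in result.get('compos', []):
    # each detected element presumably carries a class label and a position
    print(compo.get('class'), compo.get('position'))
```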

UIED comprises two parts to detect UI text and graphic elements, such as buttons, images and input bars.
* For text, it leverages [Google OCR](https://cloud.google.com/vision/docs/ocr) to perform detection (a sketch follows this list).


![UIED Approach](https://github.com/MulongXie/UIED/blob/master/data/demo/approach.png)

* For graphical elements, it uses old-fashioned CV and image processing algorithms with a set of creative innovations to locate the elements, and applies a CNN to classify them.
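
As a rough illustration of the text-detection step, a minimal sketch of calling Google OCR directly through the ``google-cloud-vision`` client library; this is not UIED's own wrapper code, and the image path is a placeholder:

```python
# Minimal sketch of Google OCR text detection via the google-cloud-vision
# client library; not UIED's own wrapper code. Requires GCP credentials
# (GOOGLE_APPLICATION_CREDENTIALS) to be configured.
from google.cloud import vision

def detect_text(img_path):
    client = vision.ImageAnnotatorClient()
    with open(img_path, 'rb') as f:
        image = vision.Image(content=f.read())
    response = client.text_detection(image=image)
    # text_annotations[0] is the whole text block; the rest are single words
    for annotation in response.text_annotations[1:]:
        box = [(v.x, v.y) for v in annotation.bounding_poly.vertices]
        print(annotation.description, box)
```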

## How to use?

### Dependency
* **Pandas 0.23.4**

### Installation
<!-- Install the mentioned dependencies, and download two pre-trained models from [this link](https://drive.google.com/drive/folders/1MK0Om7Lx0wRXGDfNcyj21B0FL1T461v5?usp=sharing) for EAST text detection and GUI element classification. -->

<!-- Change ``CNN_PATH`` and ``EAST_PATH`` in *config/CONFIG.py* to your locations. -->

The new version of UIED, equipped with Google OCR, is easy to deploy and no pre-trained model is needed. Simply download the repo and install the dependencies listed above.

### Usage
To test your own image(s):
* To test a single image, change *input_path_img* in ``run_single.py`` to your input image; the results will be output to *output_root* (a configuration sketch follows this list).
* To test multiple images, change *input_img_root* in ``run_batch.py`` to your input directory; the results will be output to *output_root*.
* To adjust the parameters on the fly, use ``run_testing.py``.
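
For instance, configuring ``run_single.py`` might look like the following; the variable names come from the instructions above, and the paths are placeholders:

```python
# Hypothetical excerpt of the edit made at the top of run_single.py;
# only the two variables named above are shown, the paths are placeholders.
input_path_img = 'data/input/my_screenshot.png'  # image to test
output_root = 'data/output'                      # detection results go here
# then run:  python run_single.py
```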

> Note: The best set of parameters varies for different types of GUI images (mobile app, web, PC). Three critical ones are ``{'param-grad', 'param-block', 'param-minarea'}``, which can be easily adjusted in *detect_compo/ip_region_proposal.py*; I highly recommend first playing with ``run_testing.py`` to pick a good set of parameters for your data.
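
As a hypothetical illustration, such a parameter set might be written as follows; the keys are the parameter names cited above, while the values and comments are placeholders, not recommended settings:

```python
# Hypothetical sketch of the three critical detection parameters named in
# detect_compo/ip_region_proposal.py; values are placeholders to be tuned
# with run_testing.py, not recommended settings.
key_params = {
    'param-grad': 10,     # gradient/binarization sensitivity (assumed meaning)
    'param-block': 5,     # block size for region scanning (assumed meaning)
    'param-minarea': 50,  # minimum pixel area of a candidate element (assumed meaning)
}
```
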
## File structure
``cnn/``
* Used to train classifier for graphic UI elements
* Set path of the CNN classification model

``config/``
* Set data paths
* Set parameters for graphic elements detection

``data/``
* Input UI images and output detection results

``detect_compo/``
* Non-text GUI component detection

``detect_text/``
* GUI text detection using Google OCR

``detect_merge/``
* Merge the detection results of non-text and text GUI elements

The major detection algorithms are in ``detect_compo/``, ``detect_text/`` and ``detect_merge/``.
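
To give a flavor of the merge step, here is a toy sketch of one plausible strategy: dropping non-text boxes that are mostly covered by an OCR text box. This is an assumption-laden illustration, not the actual ``detect_merge/`` implementation, and the ``(x1, y1, x2, y2)`` box format is assumed:

```python
# Toy illustration of merging non-text and text detections: discard a
# component box when a text box covers most of it. Not the actual
# detect_merge/ logic; the box format (x1, y1, x2, y2) is an assumption.
def coverage(a, b):
    """Fraction of box a's area covered by box b."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    return inter / area_a if area_a else 0.0

def merge(compo_boxes, text_boxes, thresh=0.8):
    kept = [c for c in compo_boxes
            if all(coverage(c, t) < thresh for t in text_boxes)]
    return kept + text_boxes
```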

## Demo
GUI element detection result for a web screenshot:
