Table of Contents
- Nowadays there are many object detection models, such as Faster R-CNN, SSD, and YOLO.
- More specifically, we apply the latest version of YOLO, namely YOLOv7. To extract the ROI of the ID card, we additionally apply a perspective transform based on the four corners of the image: top-left, top-right, bottom-left, and bottom-right.
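The perspective transform from four detected corners can be sketched with plain NumPy, as below. This is a minimal illustration of the idea, not the repository's actual implementation (which would typically use `cv2.getPerspectiveTransform` and `cv2.warpPerspective`); the function names and example coordinates are ours.

```python
import numpy as np

def compute_homography(src, dst):
    """Solve for the 3x3 homography H mapping 4 src points to 4 dst points (DLT).

    Each correspondence (x, y) -> (u, v) gives two linear equations in the
    8 unknown entries of H (H[2][2] is fixed to 1).
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, pt):
    """Apply homography H to a single (x, y) point."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# Four detected corners (tl, tr, br, bl) of a tilted card ...
card = [(30, 40), (620, 70), (600, 420), (25, 390)]
# ... mapped onto an upright 640x400 rectangle.
rect = [(0, 0), (640, 0), (640, 400), (0, 400)]
H = compute_homography(card, rect)
```

Warping every pixel of the image with `H` (what `cv2.warpPerspective` does) then yields the rectified card.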
- However, even after cropping the ROI, the orientation of the image may still be incorrect. Many applications use a classification model such as a CNN, ResNet50, or AlexNet to categorize the corners, but this approach slows down inference.
- Therefore, we decided to use simple geometry instead: we compute the rotation angle from the oriented vector between the top-left and top-right corners, as described in this repository.
First of all, we need to create an Anaconda environment.
- conda
conda create -n your_conda_environment
conda activate your_conda_environment
Then, we install our frameworks and libraries using pip.
- pip
pip install -r path/to/requirements.txt
We recommend using Python 3.8.12 for this repository.
- Check CUDA and install Pytorch with conda
nvidia-smi
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
- Clone the repository
git clone https://github.com/Syun1208/IDCardDetectionAndRecognition.git
- Preprocessing Data
-
Dataset size
Total: 21777 images (100%)
Train: 10888 images (50%)
Val: 4355 images (20%)
Test: 6534 images (30%)
-
Label structure of the dataset.
[
  {
    "image": "/home/long/Downloads/datasets/version1/top_132045101_13680_jpg.rf.6d2adba419f676ee9bbab8c5a277a1b2.jpg",
    "id": 13946,
    "label": [
      {
        "points": [
          [8.88888888888889, 36.796875],
          [86.25, 37.1875],
          [85.83333333333333, 64.765625],
          [9.305555555555555, 64.609375]
        ],
        "polygonlabels": ["top-cmnd"],
        "original_width": 720,
        "original_height": 1280
      }
    ],
    "annotator": 9,
    "annotation_id": 16871,
    "created_at": "2022-09-27T11:06:56.424119Z",
    "updated_at": "2022-09-27T11:06:58.197087Z",
    "lead_time": 15.073
  },
  ......................
]
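The `points` above are percentages of the original image width and height. Converting one polygon into a normalized YOLO bounding box, as the conversion scripts below do, can be sketched with this hypothetical helper (pure Python; the helper name is ours):

```python
def points_to_yolo_bbox(points):
    """Convert percent-polygon points (0-100) to a YOLO box.

    Returns (x_center, y_center, width, height), all normalized to [0, 1],
    as the axis-aligned bounding box of the polygon.
    """
    xs = [x / 100.0 for x, _ in points]
    ys = [y / 100.0 for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return ((x_min + x_max) / 2, (y_min + y_max) / 2,
            x_max - x_min, y_max - y_min)

# Example: a polygon covering the left half of the top quarter of the image.
xc, yc, w, h = points_to_yolo_bbox([[0, 0], [50, 0], [50, 25], [0, 25]])
```

A YOLO label line is then `class_id xc yc w h` written to the corresponding `.txt` file.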
-
Folder structure trained on YOLOv7
├── test
│   ├── images
│   └── labels
├── train
│   ├── images
│   └── labels
└── val
    ├── images
    └── labels
-
If you want to convert custom datasets (JSON) to YOLO bounding-box labels, please run this command:
python path/to/data/preprocessing/convertJson2YOLOv5Label.py --folderBoundingBox path/to/labels --folderImage path/to/images --imageSaveBoundingBox path/to/save/visualization --jsonPath path/to/json/label
-
If you want to convert custom datasets (JSON) to YOLO polygons and the four corners of the images, please run this command:
python path/to/data/preprocessing/convertJson2YOLOv54Corners.py --folderBoundingBox path/to/save/labels --folderPolygon path/to/save/labels --folderImage path/to/images --imageSaveBoundingBox path/to/save/visualization --imageSavePolygon path/to/save/visualization --jsonPath path/to/json/label
-
Pad the images in your dataset.
python path/to/data/preprocessing/augment_padding_datasets.py --folder path/to/folder/images --folder_save path/to/save/result
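What the padding step does can be sketched as padding each image to a square with a constant border. This is our guess at the script's behavior, written with NumPy only; the real script may pad differently.

```python
import numpy as np

def pad_to_square(img, value=0):
    """Pad an H x W (x C) image with a constant border so it becomes square,
    centering the original content."""
    h, w = img.shape[:2]
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    pad = [(top, side - h - top), (left, side - w - left)]
    pad += [(0, 0)] * (img.ndim - 2)  # leave channel axis untouched
    return np.pad(img, pad, constant_values=value)

square = pad_to_square(np.ones((2, 4)))  # 2x4 image -> 4x4 image
```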
- Testing on local computer
- Set your image path and options, then run to see the result
python path/to/main.py --weights path/to/weight.pt --cfg-detection yolov7 --img_path path/to/image
- Testing on API
- Change the local host and port to the values you want to use
python path/to/fast_api.py --local_host your/local/host --port your/port
- Request to API
- If you want to send a large batch of images to the API, run this command
python path/to/test_api.py --url link/to/api --source path/to/folder/images
- Data Cleaning
- Preprocessing Data
- Model Survey and Selection
- Do research on papers
- Configuration and Training Model
- Testing and Evaluation
- Implement Correcting Image Orientation
- Build Docker and API using FastAPI
- Write Report and Conclusion
See the open issues for a full list of proposed features (and known issues).
Based on the predicted bounding boxes, we rotate the image in one of three cases (90, 180, or 270 degrees) by calculating the angle between the vector Ox and the vector AB, where A is the top-left and B is the top-right corner of the image, as shown below.
Assume that the vector AB(xB - xA, yB - yA) connects the top_left (tl) and top_right (tr) coordinates. From this we derive the rotation equation.
On the other hand, if the image's angle is nonzero and greater than 180 degrees, it is rotated according to the condition below; otherwise, it is rotated as in the figure above.
Finally, we rotate the image anti-clockwise by that angle.
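The angle calculation described above can be sketched as follows (our own minimal version; the repository's exact conditions may differ). Note that image coordinates have y growing downward, so `atan2` is taken on the raw pixel coordinates:

```python
import math

def correction_angle(top_left, top_right):
    """Snapped angle (0/90/180/270) of the top-edge vector AB relative to Ox,
    with A = top_left and B = top_right; this determines which of the
    rotation cases to apply."""
    dx = top_right[0] - top_left[0]
    dy = top_right[1] - top_left[1]
    angle = math.degrees(math.atan2(dy, dx))  # angle between AB and Ox
    return round(angle / 90) % 4 * 90         # snap to the nearest 90 degrees

# Upright card: AB points along +x, so no rotation case is triggered.
assert correction_angle((0, 0), (1, 0)) == 0
```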
- Polygon Detection
- Correcting Image Rotation
- Image Alignment
- Results in API
[
  { "image_name": "back_sang1_jpg.rf.405e033a9ecb2fb3593541e6ae20d056.jpg" },
  [
    { "class_id": 0, "class_name": "top_left", "bbox_coordinates": [11, 120, 111, 287], "confidence_score": 0.76953125 },
    { "class_id": 1, "class_name": "top_right", "bbox_coordinates": [519, 136, 636, 295], "confidence_score": 0.85498046875 },
    { "class_id": 2, "class_name": "bottom_right", "bbox_coordinates": [524, 383, 636, 564], "confidence_score": 0.89697265625 },
    { "class_id": 3, "class_name": "bottom_left", "bbox_coordinates": [41, 404, 104, 560], "confidence_score": 0.7001953125 }
  ],
  {
    "polygon_coordinates": {
      "top_left": { "x_min": 61.0, "y_min": 203.5 },
      "top_right": { "x_max": 577.5, "y_min": 215.5 },
      "bottom_right": { "x_max": 580.0, "y_max": 473.5 },
      "bottom_left": { "x_min": 72.5, "y_max": 482.0 }
    }
  }
]
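The response is a JSON list mixing three elements: the image name, the per-corner detections, and the refined polygon coordinates. A client can unpack it like this (a sketch with stdlib `json` only; the key names are taken from the sample above):

```python
import json

def parse_response(raw):
    """Split the API's 3-element response list into its parts."""
    payload = json.loads(raw)
    image_name = payload[0]["image_name"]
    detections = payload[1]                       # list of per-corner boxes
    polygon = payload[2]["polygon_coordinates"]   # refined corner points
    return image_name, detections, polygon

# Abbreviated example response in the format shown above.
sample = json.dumps([
    {"image_name": "card.jpg"},
    [{"class_id": 0, "class_name": "top_left",
      "bbox_coordinates": [11, 120, 111, 287], "confidence_score": 0.77}],
    {"polygon_coordinates": {"top_left": {"x_min": 61.0, "y_min": 203.5}}},
])
name, dets, poly = parse_response(sample)
```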
- Fork the Project
- Create your Feature Branch
git checkout -b exist/folder
- Commit your Changes
git commit -m 'Initial Commit'
- Push to the Branch
git remote add origin https://git.sunshinetech.vn/dev/ai/icr/idc-transformation.git
git branch -M main
git push -uf origin main
- Open a Pull Request
My Information - LinkedIn - [email protected]
Project Link: https://github.com/Syun1208/IDCardDetectionAndRecognition.git