- C++ shared object (.so) with Neon SIMD for Python, runnable on macOS (Ventura 13.3) and Linux (Ubuntu 22.04.02). Compiled with -O3 for speed
- C++ .so with Pybind11 for Python
The best template matching implementation on the Internet.
Using C++/MFC/OpenCV to build a Normalized Cross Correlation (NCC)-based image alignment algorithm
The result measures the similarity of two images, and the formula is as follows:
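As a minimal sketch of such a similarity score, the snippet below computes a zero-mean normalized cross-correlation between a template and an equally sized image patch in plain Python. It assumes the zero-mean variant (the form OpenCV calls `TM_CCOEFF_NORMED`); whether this project uses exactly that variant is an assumption, and the function name `ncc_score` is hypothetical.

```python
import math


def ncc_score(patch, template):
    """Zero-mean normalized cross-correlation of two equally sized grayscale images.

    Both arguments are lists of rows of pixel intensities.
    Returns a value in [-1, 1]; 1 means the patch equals the template
    up to a brightness gain and offset.
    """
    a = [float(v) for row in patch for v in row]
    b = [float(v) for row in template for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("patch and template must be non-empty and the same size")

    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)

    # Numerator: correlation of the mean-subtracted intensities.
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    # Denominator: product of the two standard deviations (unnormalized).
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A constant image has zero variance, so the score is undefined; report 0.
    return num / den if den > 0 else 0.0


if __name__ == "__main__":
    template = [[10, 20], [30, 40]]
    brighter = [[25, 45], [65, 85]]  # template * 2 + 5
    print(ncc_score(brighter, template))
```

Because the mean is subtracted and the result is divided by both standard deviations, a patch that differs from the template only by gain and offset still scores 1.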
- Rotation invariant, with rotation precision as high as possible
- Uses an image pyramid as the search strategy, which is 4~128 times faster than the original NCC method (depending on template size) because it minimizes the inspection area at the top level of the image pyramid
- Reduces the time OpenCV spends on rotation by setting only the needed "size" and modifying the rotation matrix
- SIMD version of image convolution (especially useful for large templates)
  - Update: Neon SIMD added to the macOS .so version
- Optimizes the function GetNextMaxLoc () with struct s_BlockMax, for special cases where the template is far smaller than the source image, and for a large TargetNumber. The gain is substantial:
  - Test case: Src10 (3648 X 3648) and Dst10 (54 X 54)
  - Effect: execution time drops from 534 ms to 100 ms (a 434% speedup, i.e. about 5.3 times faster)
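The pyramid strategy above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the helper names are hypothetical, `min_reduced_area` stands for the "Min Reduced Area" parameter of this README, and the rule "keep downsampling while the template area exceeds that value" is an assumed way of choosing the top level. The size rule `(n + 1) // 2` matches the default output size of OpenCV's `pyrDown`.

```python
def top_pyramid_level(template_width, template_height, min_reduced_area=256):
    """Count 2x downsampling steps until the template area is at most min_reduced_area."""
    level = 0
    area = template_width * template_height
    while area > min_reduced_area:
        area //= 4  # halving width and height divides the area by about 4
        level += 1
    return level


def level_size(width, height, level):
    """Image size after `level` downsampling steps, rounding up like pyrDown's default."""
    for _ in range(level):
        width, height = (width + 1) // 2, (height + 1) // 2
    return width, height


if __name__ == "__main__":
    # Sizes taken from the comparison in this README.
    top = top_pyramid_level(762, 521, min_reduced_area=256)
    print(top)                          # 6
    print(level_size(4024, 3036, top))  # (63, 48)
    print(level_size(762, 521, top))    # (12, 9)
```

With these sizes, the exhaustive search at the top level runs on a 63 X 48 image with a 12 X 9 template instead of the full-resolution pair; lower levels then only refine the candidates found above them.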
Inspection Image: 4024 X 3036
Template Image: 762 X 521
Library | Index | Score | Angle | PosX | PosY | Execution Time |
---|---|---|---|---|---|---|
My Tool | 0 | 1 | 0.046 | 1725.857 | 1045.433 | 76ms 🎖️ |
My Tool | 1 | 0.998 | -119.979 | 2662.869 | 1537.446 | |
My Tool | 2 | 0.991 | 120.150 | 1768.936 | 2098.494 | |
Cognex | 0 | 1 | 0.030 | 1725.960 | 1045.470 | 125ms |
Cognex | 1 | 0.989 | -119.960 | 2663.750 | 1538.040 | |
Cognex | 2 | 0.983 | 120.090 | 1769.250 | 2099.410 | |
Aisys | 0 | 1 | 0 | 1726.000 | 1045.500 | 202ms |
Aisys | 1 | 0.990 | -119.935 | 2663.630 | 1539.060 | |
Aisys | 2 | 0.979 | 120.000 | 1769.63 | 2099.780 | |
Note: for best performance, make sure you are using the Release version of both this project and the OpenCV DLL. O2-related optimization settings significantly affect efficiency, and the difference between Debug and Release can be up to 7 times in some cases.
test0 - with user interface
test1 (164 ms, 80 ms (SIMD version), Target Number=5, Max Overlap=0.8, Score=0.8, Tolerance Angle=180)
test2 (237 ms, 175 ms (SIMD version))
test3 (152 ms, 100 ms (SIMD version))
test4 (21 ms, Target Number=38, Score=0.8, Tolerance Angle=0, Min Reduced Area=256)
test5 (27 ms)
test6 (1157 ms, 657 ms (SIMD version), Target Number=15, Score=0.8, Tolerance Angle=180, Min Reduced Area=256)
test7 (18 ms, Target Number=100, Score=0.5, Tolerance Angle=0, Max Overlap=0.5, Min Reduced Area=1024)
1. Download Visual Studio 2017 or a newer version
2. Check the option "x86 and x64 version of C++ MFC"
3. Install
4. Open MatchTool.vcxproj
5. Upgrade the project if required
6. Open this project's property page
7. Set "General-Output Directory" to the .exe directory you want (usually the directory where your opencv_worldXX.dll is located)
8. Choose the SDK version you have in "General-Windows SDK Version"
9. Choose the toolset you have in "General-Platform Toolset" (for me, it is Visual Studio 2017 (v141))
10. Go to "VC++ Directories" and set "Include Directories" to your own OpenCV include path (e.g. C:\OpenCV3.1\opencv\build\include or C:\OpenCV4.0\opencv\build\include)
11. Set "Library Directories" to your own OpenCV library path (the directory where your opencv_worldXX.lib is located)
12. Go to "Linker-Input" and add the library name (e.g. opencv_world310d_vs2017.lib or opencv_world401d.lib)
13. Make sure that your opencv_worldXX.dll and MatchTool.Lang are in the same directory as the .exe of this project
1. Select Debug_4.X or Release_4.X in "Solution Configuration"
2. Repeat steps 10~12 of the previous section (the "Include Directories", "Library Directories", and "Linker-Input" settings)
- Select the language you want
- Drag the source image to the left area
- Drag the Dst (template) image to the top-right area
- Click the "Execute" button
- Target Number: the maximum number of objects you want to find in the inspection image
- Max OverLap Ratio: (the overlap area between two findings) / area of golden sample
- Score (Similarity): minimum accepted similarity of findings (0~1); a lower score increases execution time
- Tolerance Angle: possible rotation of targets in the inspection image (180 means the search range is -180~180); a larger angle increases execution time. You can also push the "↓" button to select two angle ranges
- Min Reduced Area: the minimum area of the top level of the image pyramid (training stage)
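To make the Max OverLap Ratio definition concrete, here is a small sketch of a greedy overlap filter. It is a simplification under stated assumptions: findings are treated as axis-aligned boxes the size of the template (rotated findings would need a rotated-rectangle intersection instead), and the names `overlap_ratio` and `filter_overlapping` are hypothetical, not taken from this project.

```python
def overlap_ratio(box_a, box_b, template_area):
    """Intersection area of two axis-aligned boxes (x, y, w, h) divided by the template area."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    return (inter_w * inter_h) / template_area


def filter_overlapping(candidates, template_size, max_overlap):
    """Keep the best-scoring candidates; drop any that overlap a kept one too much.

    candidates: list of (score, x, y) tuples, where (x, y) is the top-left corner.
    Returns the kept candidates sorted by decreasing score.
    """
    w, h = template_size
    kept = []
    for score, x, y in sorted(candidates, reverse=True):
        box = (x, y, w, h)
        if all(
            overlap_ratio(box, (kx, ky, w, h), w * h) <= max_overlap
            for _, kx, ky in kept
        ):
            kept.append((score, x, y))
    return kept


if __name__ == "__main__":
    found = [(0.90, 0, 0), (0.95, 2, 0), (0.80, 100, 100)]
    # The first two boxes share 80% of the template area, so only the better one survives.
    print(filter_overlapping(found, template_size=(10, 10), max_overlap=0.5))
```

A lower ratio keeps only well-separated findings, while a ratio close to 1 lets nearly coincident findings through.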
- Results are sorted by score (decreasing order)
- Angles: detected rotation of findings
- PosX, PosY: pixel position of findings
contact information: [email protected]
- C++ shared library (.so) for Python (Unix-ARM64, Ubuntu 22.04.02-ARM64)
- C++/MFC DLL for .NET Framework (Windows)
- pure C++ DLL for Python (Windows)
- pybind11 .so