
Find object issues with large session file (> 4GB) #66

Open

nhudinh2103 opened this issue Sep 7, 2018 · 5 comments
@nhudinh2103

Hi matlabbe,

When I use find-object to load my dataset (~25,000 files) and save it as a session, the generated file is big (4.1 GB).

When I reload the session into find-object on Linux (CentOS), it causes strange behaviour.

Without tcmalloc:

  • Find-object crashes with a Segmentation fault exception

[screenshot: segmentation fault]

  • Stacktrace

[screenshot: stack trace]

With tcmalloc:

  • Find-object prints a warning (... large alloc ...) when loading the session and cannot detect objects from the scene

[screenshot: large alloc warning with tcmalloc]

[screenshot: detection attempt with 4 GB session]

Loading with a smaller subset of the data (1 GB):

  • Find-object loads the session and detects objects with no problem

[screenshot: detection with 1 GB session]

Note:

  • When I load the big session file (> 4GB) on Windows, it crashes (as it does on Linux)
  • I have sent the objects and sessions to your email ([email protected]).
@nhudinh2103
Author

I think this issue may be related to OpenCV:
opencv/opencv#1438

@matlabbe
Member

matlabbe commented Sep 7, 2018

I don't get a crash with the dataset you sent me. Here is the end of the vocabulary update using the parameters saved in your session 1gb-model.bin (after clicking "update objects" in the GUI):

[ INFO] Object 222114, 163 words from 163 descriptors (770992 words, 35 ms) 
[ INFO] Object 222115, 136 words from 137 descriptors (771089 words, 32 ms) 
[ INFO] Object 222116, 200 words from 202 descriptors (771229 words, 46 ms) 
[ INFO] Object 222117, 150 words from 152 descriptors (771347 words, 37 ms) 
[ INFO] Object 222120, 272 words from 272 descriptors (771486 words, 4971 ms) updated
[ INFO] Object 222121, 260 words from 265 descriptors (771645 words, 13 ms) 
[ INFO] Object 222122, 227 words from 230 descriptors (771769 words, 18 ms) 
[ INFO] Object 222123, 169 words from 171 descriptors (771882 words, 14 ms) 
[ INFO] Object 222124, 341 words from 344 descriptors (772117 words, 33 ms) 
[ INFO] Object 222125, 306 words from 308 descriptors (772264 words, 34 ms) 
[ INFO] Object 222126, 338 words from 339 descriptors (772482 words, 43 ms) 
[ INFO] Object 222127, 411 words from 417 descriptors (772771 words, 59 ms) 
[ INFO] Object 222128, 324 words from 326 descriptors (773025 words, 54 ms) 
[ INFO] Object 222132, 134 words from 134 descriptors (773140 words, 24 ms) 
[ INFO] Object 222136, 120 words from 120 descriptors (773232 words, 22 ms) 
[ INFO] Object 222143, 197 words from 198 descriptors (773361 words, 39 ms) 
[ INFO] Creating incremental vocabulary... done! size=773361 (743127 ms)
[ INFO] Extracting descriptors from object -1...
[ INFO] 1599 descriptors extracted from object -1 (in 91 ms)
[ INFO] (13:51:10.008) 2 objects detected!

When saving this session to test2.bin and reloading it from the console with the scene:

./find_object --scene scene/20180720_004.jpg --session test2.bin --console --json out.json
[ INFO] Options:
[ INFO]    GUI mode = false
[ INFO]    Session path: "test2.bin"
[ INFO]    Scene path: "scene/20180720_004.jpg"
[ INFO]    JSON path: "out.json"
[ INFO]    Settings path: ""
[ INFO]    Vocabulary path: ""
[ INFO] Extracting descriptors from object -1...
[ INFO] 1599 descriptors extracted from object -1 (in 73 ms)
[ INFO] 2 objects detected! (368 ms)
[ INFO] JSON written to "out.json"

$ cat out.json
...
"objects" : [ "object_113936", "object_219996" ]

If I use your session:

./find_object --scene scene/20180720_004.jpg --session session/1gb-model.bin --console
[ INFO] Options:
[ INFO]    GUI mode = false
[ INFO]    Session path: "session/1gb-model.bin"
[ INFO]    Scene path: "scene/20180720_004.jpg"
[ INFO]    JSON path: ""
[ INFO]    Settings path: ""
[ INFO]    Vocabulary path: ""
[ INFO] Extracting descriptors from object -1...
[ INFO] 1599 descriptors extracted from object -1 (in 102 ms)
[ INFO] Object 219996 detected! (461 ms)

Only one object is detected. I suspect that regenerating the vocabulary on my system (maybe with a different OpenCV version) produced a slightly different vocabulary:
test2.bin: 773361 words
1gb-model.bin: 774636 words

This is the OpenCV version I am using (Ubuntu 16.04, with ROS Kinetic installed):

$ ldd find_object | grep opencv
	libopencv_imgcodecs3.so.3.3 => /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_imgcodecs3.so.3.3 (0x00007fc7c493c000)
	libopencv_core3.so.3.3 => /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_core3.so.3.3 (0x00007fc7c3a03000)
	libopencv_xfeatures2d3.so.3.3 => /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_xfeatures2d3.so.3.3 (0x00007fc7c2bd1000)
	libopencv_video3.so.3.3 => /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_video3.so.3.3 (0x00007fc7c25e8000)
	libopencv_calib3d3.so.3.3 => /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_calib3d3.so.3.3 (0x00007fc7c2243000)
	libopencv_features2d3.so.3.3 => /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_features2d3.so.3.3 (0x00007fc7c1f61000)
	libopencv_videoio3.so.3.3 => /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_videoio3.so.3.3 (0x00007fc7c1d2e000)
	libopencv_flann3.so.3.3 => /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_flann3.so.3.3 (0x00007fc7c1ad4000)
	libopencv_imgproc3.so.3.3 => /opt/ros/kinetic/lib/x86_64-linux-gnu/libopencv_imgproc3.so.3.3 (0x00007fc7bf1de000)

cmake output:

cmake ..
-- CATKIN_BUILD=
-- Found Tcmalloc: /usr/lib/libtcmalloc_minimal.so
-- --------------------------------------------
-- Info :
--   CMAKE_INSTALL_PREFIX = /usr/local
--   CMAKE_BUILD_TYPE = Release
--   PROJECT_VERSION = 0.6.2
--   With OpenCV 3 xfeatures2d module (SIFT/SURF/BRIEF/FREAK) = YES
--   With Qt5 = YES
--   With tcmalloc = YES
-- --------------------------------------------
-- Configuring done
-- Generating done

@nhudinh2103
Author

Hi matlabbe,

Can you share the config you used when loading the objects above, so I can compare it with mine?

You can try with 4gb-model.bin and object-session-4GB.zip, which I have uploaded.

@matlabbe
Member

I used the same config file. I tested the 4GB dataset, and there was a maximum vocabulary size limitation in how the session was saved and loaded. The code uses a QByteArray to save the data, which can hold at most 2 GB (max integer). The vocabulary is now compressed to keep the exported vocabulary size under 2 GB (900 MB compressed instead of 3 GB uncompressed on this dataset); a sketch of the idea follows below. If the compressed vocabulary is still over 2 GB, an error is shown. See commit 556bf5f
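
For reference, here is a minimal sketch of the save-side idea described above. It is not the actual code from commit 556bf5f: the function name saveVocabulary(), the use of raw zlib instead of Qt's qCompress(), and the stream layout are assumptions for illustration only.

#include <QByteArray>
#include <QDataStream>
#include <opencv2/core.hpp>
#include <zlib.h>
#include <limits>
#include <vector>

// Serialize a compressed copy of the vocabulary so that the QByteArray
// written to the session stays under its 2 GB (INT_MAX) capacity.
bool saveVocabulary(QDataStream & stream, const cv::Mat & words)
{
    // e.g. 6370743 x 128 CV_32F = ~3.1 GB raw, already too large for a
    // single QByteArray (uLong is 64-bit on LP64 Linux, so the byte
    // count itself does not overflow here)
    uLong rawBytes = (uLong)(words.total() * words.elemSize());

    std::vector<uchar> buffer(compressBound(rawBytes));
    uLongf compressedBytes = (uLongf)buffer.size();
    if(compress2(buffer.data(), &compressedBytes, words.data, rawBytes,
                 Z_DEFAULT_COMPRESSION) != Z_OK)
    {
        return false;
    }

    // If even the compressed vocabulary exceeds INT_MAX bytes, it cannot
    // be held in a QByteArray: report an error instead of letting the
    // size silently overflow (the likely cause of the original crash).
    if(compressedBytes > (uLongf)std::numeric_limits<int>::max())
    {
        return false;
    }

    stream << words.rows << words.cols << words.type();
    stream << QByteArray::fromRawData((const char *)buffer.data(),
                                      (int)compressedBytes);
    return true;
}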

Example:

./find_object --session_new test4GB.bin --config config/config.ini --objects objects/TanDinh-scaled --console --images_not_saved

Detection:

./find_object --session test4GB.bin --scene scene/20180720_004.jpg --json out.json --console
[ INFO] Options:
[ INFO]    GUI mode = false
[ INFO]    Session path: "test4GB.bin"
[ INFO]    Scene path: "scene/20180720_004.jpg"
[ INFO]    JSON path: "out.json"
[ INFO]    Settings path: ""
[ INFO]    Vocabulary path: ""
[ INFO] Loading words to objects references...
[ INFO] Loaded 6370743 object references...
[ INFO] Loading words... (compressed format: 924 MB)
[ INFO] Uncompress vocabulary...
[ INFO] Words: 6370743x128 (3110 MB)
[ INFO] Update vocabulary index...
[ INFO] 1599 descriptors extracted from object -1 (in 36 ms)
[ INFO] 3 objects detected! (682 ms)
[ INFO] JSON written to "out.json"

$ cat out.json
...
"objects" : [ "object_94238", "object_113936", "object_219996" ]
}
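
As a sanity check on the sizes in the log above (assuming 32-bit float descriptors): 6370743 words × 128 dimensions × 4 bytes = 3,261,820,416 bytes ≈ 3110 MB, which matches the "Uncompress vocabulary" line and confirms that the raw vocabulary can never fit in a single 2 GB QByteArray. A hypothetical load-side counterpart to the saveVocabulary() sketch above would therefore decompress straight into the preallocated cv::Mat rather than through an intermediate QByteArray:

// Hypothetical counterpart to the saveVocabulary() sketch; not the
// actual find-object code. Decompresses directly into the cv::Mat
// buffer so no QByteArray larger than 2 GB is ever created.
cv::Mat loadVocabulary(QDataStream & stream)
{
    int rows, cols, type;
    QByteArray compressed; // e.g. the 924 MB blob from the log above
    stream >> rows >> cols >> type >> compressed;

    cv::Mat words(rows, cols, type); // 6370743x128 CV_32F = ~3110 MB
    uLongf rawBytes = (uLongf)(words.total() * words.elemSize());
    if(uncompress(words.data, &rawBytes,
                  (const Bytef *)compressed.constData(),
                  (uLong)compressed.size()) != Z_OK)
    {
        return cv::Mat();
    }
    return words;
}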

@nhudinh2103
Author

Thanks matlabbe
