daisy-gpu

Implementation of the DAISY descriptor [1] on GPU using deep learning libraries. Code is provided for PyTorch, TensorFlow 1, and TensorFlow 2.

This implementation is based on and borrows some parts of code from the scikit-image version available at https://github.com/scikit-image/scikit-image/blob/master/skimage/feature/_daisy.py.

This code can process a batch of images simultaneously for better performance. The most expensive operation when running in GPU mode is allocating space for the descriptors on the GPU. However, this step is only performed when the shape of the input batch changes; subsequent calls with batches of the same shape reuse the allocated memory and are therefore much faster.
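The reuse strategy above can be sketched as a buffer keyed by batch shape. The class below is a minimal illustration of that idea, not the library's actual code:

```python
import numpy as np

class DescriptorBuffer:
    """Reuse one preallocated output array while the batch shape is
    unchanged; reallocate only when a differently shaped batch arrives.
    A sketch of the reuse strategy described above."""

    def __init__(self):
        self._shape = None
        self._buf = None

    def get(self, shape):
        if shape != self._shape:  # first call, or shape changed: allocate
            self._buf = np.empty(shape, dtype=np.float32)
            self._shape = shape
        return self._buf  # same shape as before: reuse, no allocation

buf = DescriptorBuffer()
a = buf.get((2, 200, 109, 256))
b = buf.get((2, 200, 109, 256))  # same shape: the buffer is reused
```

With this pattern, only the first call (or a call after a shape change) pays the allocation cost, which matches the behavior described above.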

Code for SIFT Flow descriptors on GPU is also available at https://github.com/hmorimitsu/sift-flow-gpu.

Requirements

Base

Deep learning libraries

Usage

PyTorch version

A simple example is shown below. A more complete practical example is available as a Jupyter demo notebook.

from daisy_torch import DaisyTorch

daisy = DaisyTorch()
imgs = [
    read_some_image,
    read_another_image
]
descs = daisy.extract_descriptor(imgs) # This first call can be
                                       # slower, due to memory allocation
imgs2 = [
    read_yet_another_image,
    read_even_one_more_image
]
descs2 = daisy.extract_descriptor(imgs2) # Subsequent calls are faster,
                                         # if images retain same shape

# descs[0] is the descriptor of imgs[0] and so on.
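The `read_some_image` placeholders stand for 2-D grayscale arrays. A hedged sketch of preparing such a batch with NumPy, assuming single-channel `float32` input as the TensorFlow example's `np.stack(imgs, axis=0)[:, None].astype(np.float32)` suggests (random arrays stand in for real images here):

```python
import numpy as np

def to_grayscale_f32(img):
    """Convert an HxW or HxWx3 uint8 image to a 2-D float32 array."""
    img = np.asarray(img)
    if img.ndim == 3:
        img = img.mean(axis=2)  # simple channel average as grayscale
    return img.astype(np.float32)

# Random arrays as stand-ins for read_some_image / read_another_image
rng = np.random.default_rng(0)
imgs = [to_grayscale_f32(rng.integers(0, 256, (436, 1024, 3), dtype=np.uint8))
        for _ in range(2)]
batch = np.stack(imgs, axis=0)[:, None]  # (batch, channel, height, width)
```

The list `imgs` would then be passed to `daisy.extract_descriptor(imgs)` as in the example above.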

Tensorflow 1 version

from daisy_tf import DaisyTF

daisy = DaisyTF()
imgs = [
    read_some_image,
    read_another_image
]
imgs2 = [
    read_yet_another_image,
    read_even_one_more_image
]
imgs_tf, descs_tf = daisy.extract_descriptor(imgs[0].shape[:2])
with tf.Session() as sess:
    # This first call can be slower, due to memory alloc
    descs = sess.run(
        descs_tf,
        feed_dict={
            imgs_tf:
            np.stack(imgs, axis=0)[:, None].astype(np.float32)})
    # Subsequent calls are faster if images retain same shape
    descs2 = sess.run(
        descs_tf,
        feed_dict={
            imgs_tf:
            np.stack(imgs2, axis=0)[:, None].astype(np.float32)})

# descs[0] is the descriptor of imgs[0] and so on.

Tensorflow 2 version

from daisy_tf2 import DaisyTF2

daisy = DaisyTF2()
imgs = [
    read_some_image,
    read_another_image
]
descs = daisy.extract_descriptor(imgs) # This first call can be
                                       # slower, due to memory alloc
imgs2 = [
    read_yet_another_image,
    read_even_one_more_image
]
descs2 = daisy.extract_descriptor(imgs2) # Subsequent calls are faster,
                                         # if images retain same shape

# descs[0] is the descriptor of imgs[0] and so on.

Benchmark

Update: the times reported previously were incorrect because they were measured without synchronizing the GPU. The table below has been updated with the corrected values.

  • Machine configuration:
    • Intel i7 8750H
    • NVIDIA GeForce GTX1070
    • Image size: 1024 × 436
    • Descriptor size: 200
| Batch size | PyTorch CPU (ms) | PyTorch GPU (ms)¹ | PyTorch FP16 GPU (ms)¹ | TF2 GPU (ms)¹ | PyTorch GPU (ms)² | TF1 GPU (ms)² | TF2 GPU (ms)² |
|---:|---:|---:|---:|---:|---:|---:|---:|
| 1 | 309.8 | 27.9 | 25.0 | 21.3 | 37.9 | 26.5 | 31.6 |
| 2 | 534.9 | 39.8 | 34.8 | 38.0 | 57.1 | 48.2 | 63.4 |
| 4 | 998.3 | 79.6 | 67.1 | 75.3 | 113.5 | 92.6 | 123.6 |
| 8 | 2009.8 | 158.3 | 134.9 | 150.4 | 226.4 | 187.0 | 251.1 |

¹ NOT including the time to transfer the result from GPU to CPU.

² Including the time to transfer the result from GPU to CPU.

These times are the median of 5 runs, measured after a warm-up run that allocates the descriptor space in memory (see the introduction).
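The synchronization caveat from the update note generalizes: GPU calls typically return before the kernels finish, so the clock must stop only after a device synchronization. Below is a hedged sketch of a median-of-runs timing harness in pure Python; with PyTorch you would pass `torch.cuda.synchronize` as `sync` (the toy workload here merely stands in for a call like `daisy.extract_descriptor(imgs)`):

```python
import time
import statistics

def benchmark_ms(fn, warmup=1, runs=5, sync=lambda: None):
    """Median wall-clock time of fn() in milliseconds.

    Warm-up runs absorb one-time costs (e.g. descriptor allocation);
    sync() flushes pending asynchronous work, such as GPU kernels, so
    the clock only stops once the result is actually ready.
    """
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(runs):
        sync()
        start = time.perf_counter()
        fn()
        sync()  # wait for asynchronous work before stopping the clock
        times.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(times)

# Toy CPU workload standing in for the descriptor extraction call
t = benchmark_ms(lambda: sum(range(100_000)))
```

Without the second `sync()` call, a GPU benchmark only measures kernel-launch time, which is how the original (incorrect) numbers came about.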

References

[1] E. Tola, V. Lepetit, and P. Fua. DAISY: An Efficient Dense Descriptor Applied to Wide-Baseline Stereo. IEEE TPAMI, 2010. DOI: 10.1109/TPAMI.2009.77.
