A C++ reimplementation of Near Duplicate Video Detection - Get a 64-bit comparable hash-value for any video (Video Hash).

helloall1900/vhash


The hash tool for duplicate video and image detection


Introduction

vhash is a C++ reimplementation of videohash for detecting near-duplicate videos. It takes any input video or image file and generates a 64-bit comparable hash value.
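As an illustration of what "comparable" can mean for 64-bit hashes, the sketch below counts the bits in which two hash values differ (the Hamming distance). This README does not state how vhash compares hashes, so the helper names `hamming_distance` and `is_near_duplicate` and the threshold of 4 bits are assumptions made for the example only.

```cpp
#include <bitset>
#include <cstdint>

// Number of differing bits between two 64-bit hashes
// (0 = identical, 64 = every bit differs).
inline int hamming_distance(std::uint64_t a, std::uint64_t b) {
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}

// Hypothetical helper: treat two hashes as near-duplicates when at most
// `max_bits` bits differ. The default of 4 is an illustrative assumption,
// not a value taken from vhash.
inline bool is_near_duplicate(std::uint64_t a, std::uint64_t b, int max_bits = 4) {
    return hamming_distance(a, b) <= max_bits;
}
```

A smaller distance means the two inputs look more alike to the hash; a suitable threshold depends on the data and would need to be tuned.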


Build vhash

Requirements

  • A C++ compiler that supports C++14
  • CMake >= 3.11

Dependencies

External

  • opencv for image decoding & resizing
  • ffmpeg for video decoding & frame extraction
  • fftw for discrete cosine transform (DCT)
  • sqlite3 for file hash value caching
  • spdlog for logging
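To show how a DCT can lead to a 64-bit hash, here is a minimal pHash-style sketch: compute the 8x8 lowest-frequency DCT coefficients of a square grayscale image and set one bit for each coefficient above their median. This is an assumption about the general technique, not a description of vhash's actual pipeline. It uses a naive DCT loop instead of fftw to stay self-contained, and the names `dct_coeff` and `dct_hash` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>
#include <vector>

// Naive 2-D DCT-II coefficient (u, v) of an n x n grayscale image stored
// row-major. Normalisation constants are omitted because only the relative
// magnitudes of the coefficients matter for the hash below.
double dct_coeff(const std::vector<double>& img, int n, int u, int v) {
    const double pi = std::acos(-1.0);
    double sum = 0.0;
    for (int y = 0; y < n; ++y)
        for (int x = 0; x < n; ++x)
            sum += img[y * n + x] *
                   std::cos((2 * x + 1) * u * pi / (2.0 * n)) *
                   std::cos((2 * y + 1) * v * pi / (2.0 * n));
    return sum;
}

// pHash-style sketch (requires n >= 8): keep the 8x8 lowest-frequency
// coefficients and set bit i when coefficient i exceeds the upper median.
std::uint64_t dct_hash(const std::vector<double>& img, int n) {
    std::array<double, 64> low{};
    for (int v = 0; v < 8; ++v)
        for (int u = 0; u < 8; ++u)
            low[v * 8 + u] = dct_coeff(img, n, u, v);

    std::array<double, 64> sorted = low;
    std::nth_element(sorted.begin(), sorted.begin() + 32, sorted.end());
    const double median = sorted[32];

    std::uint64_t hash = 0;
    for (int i = 0; i < 64; ++i)
        if (low[i] > median) hash |= (std::uint64_t{1} << i);
    return hash;
}
```

Because each bit only records whether a coefficient lies above the median, scaling the brightness of the whole image uniformly leaves the hash unchanged. A production implementation would use a fast transform such as the one fftw provides rather than this quadratic loop.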

CentOS

sudo yum install opencv-devel ffmpeg-devel fftw-devel sqlite-devel spdlog-devel

Ubuntu

sudo apt install libopencv-dev libavformat-dev libavcodec-dev libavdevice-dev libavutil-dev libswscale-dev
sudo apt install libfftw3-dev libsqlite3-dev libspdlog-dev

macOS

brew install opencv@4 ffmpeg@5 fftw sqlite spdlog
brew link ffmpeg@5

Included

Compile

git clone https://github.com/helloall1900/vhash.git
cd vhash
make
bin/vhash hash tests/testdata/lena.png

Development

Dependencies

CentOS

sudo yum install gtest-devel google-benchmark-devel

Ubuntu

sudo apt install libgtest-dev libbenchmark-dev

macOS

brew install googletest google-benchmark

Features

  • Generate the hash value of a single file or of all files in a directory.
  • Store file hash values in a database cache to speed up hash generation.
  • Find duplicate video or image files in a directory.
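The duplicate-finding feature can be pictured as grouping files by their hash value. The sketch below groups paths whose hashes are exactly equal; it is a simplified illustration under that assumption, and the `FileHash` alias and `group_duplicates` function are hypothetical names, not part of vhash. Whether vhash also matches hashes that differ by a few bits is not stated here.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// (path, 64-bit hash) pair; a hypothetical stand-in for one hashed file.
using FileHash = std::pair<std::string, std::uint64_t>;

// Returns one group of paths per hash value shared by two or more files.
// Groups are ordered by hash value; paths keep their input order.
std::vector<std::vector<std::string>> group_duplicates(
        const std::vector<FileHash>& files) {
    std::map<std::uint64_t, std::vector<std::string>> by_hash;
    for (const auto& f : files) by_hash[f.second].push_back(f.first);

    std::vector<std::vector<std::string>> groups;
    for (auto& kv : by_hash) {
        if (kv.second.size() > 1) groups.push_back(std::move(kv.second));
    }
    return groups;
}
```

Files with a unique hash are left out of the result, so an empty return value means no exact-hash duplicates were found.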

Usage

Hash

Generating hash for video or image files

Usage: vhash hash [OPTIONS] path  

Positionals:  
path TEXT:PATH(existing) REQUIRED file or directory path  

Options:  
-h,--help                   Print this help message and exit  
-e,--ext TEXT ...           file extension filter (i.e. -e mp4,mkv)  
-c,--cache TEXT             cache file or url  
-o,--output TEXT            output file  
-C,--use-cache              use cache  
-r,--recursive              recursively find files  
-P,--no-progress            not print progress bar  

bin/vhash hash -C -o hash.txt some_dir_path

Cache

Operating on hash cache

Usage: vhash cache [OPTIONS] [path]  

Positionals:  
path TEXT                     full file path  

Options:  
-h,--help                     Print this help message and exit  
-c,--cache TEXT               cache file or url  
-f,--find                     find cache item  
-d,--del                      delete cache item  
-C,--clear                    clear all hash cache  
-p,--pure                     pure expired hash cache  
-P,--pure-period INT [604800] pure period in seconds

bin/vhash cache -f some_file_path
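The default period of 604800 seconds shown in the help text equals 7 days. As a rough sketch of what an expiry check could look like, the code below compares the age of a cache entry against that period. How vhash actually records timestamps and decides expiry is not documented here, so `is_expired` and `kDefaultPurePeriod` are hypothetical names and the logic is an assumption.

```cpp
#include <chrono>

// Default taken from the help text: 604800 seconds (7 days).
constexpr std::chrono::seconds kDefaultPurePeriod{604800};

// Hypothetical check: an entry counts as expired when it was stored more
// than `period` before `now`. The rule itself is an assumption, not
// vhash's documented behaviour.
inline bool is_expired(std::chrono::system_clock::time_point stored_at,
                       std::chrono::system_clock::time_point now,
                       std::chrono::seconds period = kDefaultPurePeriod) {
    return now - stored_at > period;
}
```

Passing `now` explicitly instead of reading the clock inside the function keeps the check easy to test.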

Dup

Finding duplicate video or image files

Usage: vhash dup [OPTIONS] [path]  

Positionals:  
path TEXT:PATH(existing)    file or directory path  

Options:  
-h,--help                   Print this help message and exit  
-e,--ext TEXT ...           file extension filter (i.e. -e mp4,mkv)  
-c,--cache TEXT             cache file or url  
-o,--output TEXT            output file  
-C,--use-cache              use cache  
-r,--recursive              recursively find files  
-P,--no-progress            not print progress bar

bin/vhash dup -C -o dup.txt some_dir_path

Credits


License

License: MIT

Copyright (c) 2023 Leo. See LICENSE for details.