Real-Time Human Grasp Detection in Videos

This project implements a deep-learning object detection network that predicts, in real time, whether an RGB video contains a hand grasping an object.

Pre-Publication Paper: https://drive.google.com/file/d/1YBY8jsC4y6fuyIjIW1ykdgBmoluEp3AG/view?usp=sharing

Presentation Video: https://www.youtube.com/watch?v=y7nI9wQG0e8

Grab detection is the detection of hands grasping objects.

This is a fork of AlexeyAB's implementation of YOLOv4 on Darknet, linked here: https://github.com/AlexeyAB/darknet

How to use:

First, install the required dependencies as described in the YOLOv4 GitHub repository: https://github.com/AlexeyAB/darknet#requirements

Alternatively, Augmented Startups provides a two-part step-by-step guide for Windows on YouTube: https://www.youtube.com/watch?v=5pYh1rFnNZs

https://www.youtube.com/watch?v=sUxAVpzZ8hU (when following this video, clone this repository instead of the YOLOv4 repository)

Download the required trained weights:

https://drive.google.com/file/d/1B9WDT8EKs0NLcTynmniGzeyvvGuh_Fcc/view?usp=sharing

Put the weights in the folder:

~/darknet/build/darknet/x64/backup

To run detection on a video, go to the directory:

~/darknet/build/darknet/x64

Then run on the command line:

darknet.exe detector demo data/obj.data yolo-obj.cfg backup/yolo-obj_best.weights filename_of_your_video
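To process several videos without retyping the command, the call above can be wrapped in a small script. The following is a minimal sketch, not part of this repository: the function names `build_demo_command` and `run_demo` are hypothetical, and the install location is assumed to be the `~/darknet/build/darknet/x64` folder mentioned above. It only uses the command-line arguments already shown in this README.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: darknet was built in the folder named in this README.
DARKNET_DIR = Path.home() / "darknet" / "build" / "darknet" / "x64"


def build_demo_command(video_path, executable="darknet.exe"):
    """Return the argument list for the demo command shown above."""
    return [
        str(executable),
        "detector",
        "demo",
        "data/obj.data",
        "yolo-obj.cfg",
        "backup/yolo-obj_best.weights",
        str(video_path),
    ]


def run_demo(video_path, darknet_dir=DARKNET_DIR):
    """Run the demo on one video from inside the darknet x64 directory."""
    darknet_dir = Path(darknet_dir)

    # Fail early with a clear message if the weights were not downloaded.
    weights = darknet_dir / "backup" / "yolo-obj_best.weights"
    if not weights.is_file():
        raise FileNotFoundError(f"Trained weights not found: {weights}")

    # Resolve the video path first, because the working directory changes.
    video = Path(video_path).resolve()
    if not video.is_file():
        raise FileNotFoundError(f"Video not found: {video}")

    # The relative paths (data/, backup/) are resolved against darknet_dir.
    cmd = build_demo_command(video, executable=darknet_dir / "darknet.exe")
    subprocess.run(cmd, cwd=darknet_dir, check=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Each command-line argument is treated as one video file.
    for path in sys.argv[1:]:
        run_demo(path)
```

Saved under a name of your choice (for example `run_grab_demo.py`, a hypothetical filename), it could be called as `python run_grab_demo.py first.mp4 second.mp4`. The videos are processed one after another, and the script stops at the first video for which darknet exits with an error.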

Our Grab Dataset:

A human grasping dataset taken from different angles.

https://drive.google.com/file/d/1xeTGrnWud8X1A9PuonK_mwIHl6UsHeUr/view?usp=sharing

Demo Videos:

Results on a video from the Grab Dataset Evaluation Set

https://drive.google.com/open?id=1Hs_dKiOXMXJupfJTYxankmKEhLa_U2Q0

Results on a video from the UTGrasp Dataset

https://drive.google.com/open?id=1L9LAARDvmwcIoDtLduz9YnWYOrOeSDeK

Experimental Results:

Average Precision results at IoU of 0.5
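For readers unfamiliar with the metric, the IoU (intersection over union) threshold decides when a predicted box counts as a correct detection. The following is a minimal sketch of the standard definition, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates. The function names are hypothetical, and this is not the evaluation code used to produce the results in this project.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union is the sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def matches_at_threshold(pred_box, gt_box, threshold=0.5):
    """True if the predicted box overlaps the ground truth enough to count."""
    return iou(pred_box, gt_box) >= threshold


# A prediction covering 60% of the ground-truth box and nothing else.
print(iou((0, 0, 10, 10), (0, 0, 10, 6)))                    # 0.6
print(matches_at_threshold((0, 0, 10, 10), (0, 0, 10, 6)))   # True
```

In a full average precision computation, a detection would also need to carry the correct class label, and each ground-truth box can be matched by at most one detection. Those steps are omitted here.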

Average FPS on Videos

Comparison with other object detection architectures on the task of grab detection:
