Skip to content

Deep Pose Estimation implemented using Tensorflow with Custom Architecture for fast inference.

License

Notifications You must be signed in to change notification settings

scyoyo/tf-pose-estimation

This branch is 143 commits ahead of Zumbalamambo/tf-openpose:master.

Folders and files

NameName
Last commit message
Last commit date

Latest commit

8543abf · Feb 16, 2018
Jan 5, 2018
Feb 7, 2018
Jan 5, 2018
Feb 7, 2018
Feb 7, 2018
Jan 10, 2018
Feb 9, 2018
Feb 16, 2018
Jan 5, 2018
Oct 24, 2017
Feb 7, 2018
Jan 10, 2018
Jan 5, 2018
Jul 31, 2017
Feb 7, 2018
Jan 10, 2018
Feb 5, 2018

Repository files navigation

tf-pose-estimation

'Openpose' for human pose estimation have been implemented using Tensorflow. It also provides several variants that have made some changes to the network structure for real-time processing on the CPU or low-power embedded devices.

You can even run this on your macbook with descent FPS!

Original Repo(Caffe) : https://github.com/CMU-Perceptual-Computing-Lab/openpose

CMU's Original Model
on Macbook Pro 15"
Mobilenet Variant
on Macbook Pro 15"
Mobilenet Variant
on Jetson TX2
cmu-model mb-model-macbook mb-model-tx2
~0.6 FPS ~4.2 FPS @ 368x368 ~10 FPS @ 368x368
2.8GHz Quad-core i7 2.8GHz Quad-core i7 Jetson TX2 Embedded Board

Implemented features are listed here : features

Important Updates

2018.2.7 Arguments in run.py script changed. Support dynamic input size.

Install

Dependencies

You need dependencies below.

  • python3
  • tensorflow 1.4.1+
  • opencv3, protobuf, python3-tk

Install

$ git clone https://www.github.com/ildoonet/tf-openpose
$ cd tf-openpose
$ pip3 install -r requirements.txt

Models

I have tried multiple variations of models to find optmized network architecture. Some of them are below and checkpoint files are provided for research purpose.

  • cmu

    • the model based VGG pretrained network which described in the original paper.
    • I converted Weights in Caffe format to use in tensorflow.
    • pretrained weight download
  • dsconv

    • Same architecture as the cmu version except for the depthwise separable convolution of mobilenet.
    • I trained it using 'transfer learning', but it provides not-enough speed and accuracy.
  • mobilenet

    • Based on the mobilenet paper, 12 convolutional layers are used as feature-extraction layers.
    • To improve on small person, minor modification on the architecture have been made.
    • Three models were learned according to network size parameters.
    • I published models which is not the best ones, but you can test them before you trained a model from the scratch.

Download Tensorflow Graph File(pb file)

Before running demo, you should download graph files. You can deploy this graph on your mobile or other platforms.

  • cmu (trained in 656x368)
  • mobilenet_thin (trained in 432x368)

CMU's model graphs are too large for git, so I uploaded them on an external cloud. You should download them if you want to use cmu's original model. Download scripts are provided in the model folder.

$ cd models/graph/cmu
$ bash download.sh

Inference Time

Dataset Model Inference Time
Macbook Pro i5 3.1G
Inference Time
Jetson TX2
Coco cmu 10.0s @ 368x368 OOM @ 368x368
5.5s @ 320x240
Coco dsconv 1.10s @ 368x368
Coco mobilenet_accurate 0.40s @ 368x368 0.18s @ 368x368
Coco mobilenet 0.24s @ 368x368 0.10s @ 368x368
Coco mobilenet_fast 0.16s @ 368x368 0.07s @ 368x368

Demo

Test Inference

You can test the inference feature with a single image.

$ python3 run.py --model=mobilenet_thin --resolution=432x368 --image=...

The image flag MUST be relative to the src folder with no "~", i.e:

--image ../../Desktop

Then you will see the screen as below with pafmap, heatmap, result and etc.

inferent_result

Realtime Webcam

$ python3 run_webcam.py --model=mobilenet_thin --resolution=432x368 --camera=0

Then you will see the realtime webcam screen with estimated poses as below. This Realtime Result was recored on macbook pro 13" with 3.1Ghz Dual-Core CPU.

Python Usage

This pose estimator provides simple python classes that you can use in your applications.

See run.py or run_webcam.py as references.

e = TfPoseEstimator(get_graph_path(args.model), target_size=(w, h))
humans = e.inference(image)
image = TfPoseEstimator.draw_humans(image, humans, imgcopy=False)

ROS Support

See : etcs/ros.md

Training

See : etcs/training.md

References

OpenPose

[1] https://github.com/CMU-Perceptual-Computing-Lab/openpose

[2] Training Codes : https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation

[3] Custom Caffe by Openpose : https://github.com/CMU-Perceptual-Computing-Lab/caffe_train

[4] Keras Openpose : https://github.com/michalfaber/keras_Realtime_Multi-Person_Pose_Estimation

Lifting from the deep

[1] Arxiv Paper : https://arxiv.org/abs/1701.00295

[2] https://github.com/DenisTome/Lifting-from-the-Deep-release

Mobilenet

[1] Original Paper : https://arxiv.org/abs/1704.04861

[2] Pretrained model : https://github.com/tensorflow/models/blob/master/slim/nets/mobilenet_v1.md

Libraries

[1] Tensorpack : https://github.com/ppwwyyxx/tensorpack

Tensorflow Tips

[1] Freeze graph : https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py

[2] Optimize graph : https://codelabs.developers.google.com/codelabs/tensorflow-for-poets-2

About

Deep Pose Estimation implemented using Tensorflow with Custom Architecture for fast inference.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • PureBasic 76.6%
  • Python 23.3%
  • Other 0.1%