This project leverages the Jetson Nano's computational power to augment a drone with computer vision capabilities and allow gesture control. The deep learning model deployed here is part of a larger project, a Pose Classification Kit, focusing on pose estimation/classification applications toward new human-machine interfaces.
- Demonstration & Processing pipeline description
- Getting Started
- Usage
- System performance
- Additional resources
- License
- Acknowledgments
- Install PyTorch and Torchvision - see PyTorch for Jetson Nano.
- Install TensorFlow - see Installing TensorFlow For Jetson Platform. Note that TensorFlow is already installed on JetPack.
- Install torch2trt
git clone https://github.com/NVIDIA-AI-IOT/torch2trt
cd torch2trt
sudo python3 setup.py install --plugins
- Install other miscellaneous packages
sudo pip3 install numpy Pillow pymavlink dronekit
- Clone this repository on your Jetson Nano
git clone https://github.com/ArthurFDLR/drone-gesture-control
cd drone-gesture-control
- Download the TensorRT-Pose pre-trained model resnet18_baseline_att_224x224_A_epoch_249.pth and place it in the folder .\drone-gesture-control\models
- Run the installation procedure. This operation can take a while.
sudo python3 install.py
- Wire the UART pins D15 (RX) and D14 (TX) on the J41 expansion header of the Jetson Nano carrier board to a MAVLink-enabled serial port on your flight controller. See below a setup example using the Pixhawk 4 flight controller. The default baud rate is 57600.
- Disable the Serial Console trigger on the serial connection - see Jetson Nano - UART.
systemctl stop nvgetty
systemctl disable nvgetty
udevadm trigger
- Connect your camera to the CSI port 0. You might have to adapt the GStreamer pipeline for your camera - see gstream_pipeline in .\drone-gesture-control\__main__.py. The camera used for development is an Arducam IMX477 mini.
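The wiring and camera steps above can be sketched in Python as follows. This is only an illustration, not the code found in `__main__.py`: the helper names `build_csi_pipeline` and `open_vehicle`, the serial device path `/dev/ttyTHS1`, the capture resolution, and the 224x224 output size (chosen to match the model name) are all assumptions. Only the baud rate of 57600 and CSI port 0 come from the steps above.

```python
# Hypothetical helpers illustrating the hardware setup; names and defaults
# are assumptions, not the project's actual implementation.

SERIAL_DEVICE = "/dev/ttyTHS1"  # assumption: device node of the J41 UART pins
BAUD_RATE = 57600               # default baud rate given in the wiring step


def build_csi_pipeline(sensor_id=0, capture_width=1280, capture_height=720,
                       output_width=224, output_height=224,
                       framerate=30, flip_method=0):
    """Return a GStreamer pipeline string for a CSI camera on a Jetson."""
    return (
        f"nvarguscamerasrc sensor-id={sensor_id} ! "
        f"video/x-raw(memory:NVMM), width=(int){capture_width}, "
        f"height=(int){capture_height}, format=(string)NV12, "
        f"framerate=(fraction){framerate}/1 ! "
        f"nvvidconv flip-method={flip_method} ! "
        f"video/x-raw, width=(int){output_width}, "
        f"height=(int){output_height}, format=(string)BGRx ! "
        "videoconvert ! video/x-raw, format=(string)BGR ! appsink"
    )


def open_vehicle(device=SERIAL_DEVICE, baud=BAUD_RATE):
    """Open a MAVLink connection to the flight controller with dronekit."""
    # Imported lazily so the pipeline helper can be used without dronekit.
    from dronekit import connect
    return connect(device, baud=baud, wait_ready=True)


if __name__ == "__main__":
    # Print the pipeline so it can be inspected or adapted to another camera.
    print(build_csi_pipeline())
```

If OpenCV is built with GStreamer support, the returned string can typically be passed to `cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)`. Capture size, frame rate, and flip method are the parameters most likely to need adjusting for a different camera.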
python3 drone-gesture-control
The gesture control system currently supports only basic, yet essential, commands:
- T: Arm the drone if it is disarmed and landed; Disarm the drone if it is armed and landed.
- Traffic_AllStop: Take-off at an altitude of 1.8m if the drone is armed and landed; Land if the drone is in flight.
- Traffic_RightTurn: Move 4m to the right if the drone is armed.
- Traffic_LeftTurn: Move 4m to the left if the drone is armed.
- Traffic_BackFrontStop: Move 2m backward if the drone is armed.
- Traffic_FrontStop: Move 2m forward if the drone is armed.
- Yoga_UpwardSalute: Return to Launch (RTL).
For safety, the system only transmits commands to the flight controller when it is in GUIDED mode. We recommend binding a switch on your radio controller to this mode for ease of use.
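The command rules above can be summarized as a small decision function. This is a minimal sketch under stated assumptions, not the project's implementation: `DroneState`, `gesture_to_command`, and the command names such as `"MOVE_RIGHT"` are hypothetical and chosen only for illustration. The conditions follow the list literally, so lateral moves check only that the drone is armed.

```python
from typing import NamedTuple, Optional, Tuple


class DroneState(NamedTuple):
    """Hypothetical snapshot of the flight controller state."""
    armed: bool
    in_flight: bool
    mode: str


# Hypothetical command names; distances in meters are taken from the list above.
MOVES = {
    "Traffic_RightTurn": ("MOVE_RIGHT", 4.0),
    "Traffic_LeftTurn": ("MOVE_LEFT", 4.0),
    "Traffic_BackFrontStop": ("MOVE_BACKWARD", 2.0),
    "Traffic_FrontStop": ("MOVE_FORWARD", 2.0),
}


def gesture_to_command(gesture: str,
                       state: DroneState) -> Optional[Tuple[str, Optional[float]]]:
    """Map a recognized pose label to a command, or None if nothing applies."""
    # Commands are only transmitted while the flight controller is in GUIDED mode.
    if state.mode != "GUIDED":
        return None
    landed = not state.in_flight
    if gesture == "T":
        # Toggle arming, but only while the drone is on the ground.
        if not landed:
            return None
        return ("DISARM", None) if state.armed else ("ARM", None)
    if gesture == "Traffic_AllStop":
        if state.in_flight:
            return ("LAND", None)
        # Take off to 1.8 m only if the drone is armed and landed.
        return ("TAKEOFF", 1.8) if state.armed else None
    if gesture in MOVES:
        return MOVES[gesture] if state.armed else None
    if gesture == "Yoga_UpwardSalute":
        return ("RTL", None)
    return None
```

Keeping the mapping as a pure function of the pose label and the current state makes the rules easy to unit-test without a drone attached.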
The classification model used in this project is the best performing model of the Pose Classification Kit (PCK). It yields great results in terms of both accuracy and inference time. During flights, the onboard camera often captures only a person's upper body, so the system has to be highly reliable even on partial inputs. To verify this property, the model is tested on two datasets: the original PCK dataset and the same samples with missing keypoints. The testing accuracies on these datasets reach 98.3% and 95.1%, respectively. As shown in the confusion matrices below (left: original dataset - right: partial inputs), even poses that are hard for humans to distinguish when looking only at the upper body are almost perfectly classified by the model. After optimization (see .\install.py), the whole processing pipeline, from image capture to drone control, consistently runs at 9.5 Hz to 13 Hz.
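For readers unfamiliar with how a testing accuracy relates to a confusion matrix, the sketch below shows the computation: correctly classified samples lie on the diagonal, and accuracy is their share of all samples. The function name `accuracy_from_confusion` is hypothetical, and the matrix is a toy two-class example, not data from this project.

```python
def accuracy_from_confusion(matrix):
    """Return overall accuracy from a square confusion matrix.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    # Correct predictions are the diagonal entries.
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


if __name__ == "__main__":
    # Toy values for illustration only (not the project's results).
    toy_matrix = [
        [48, 2],
        [1, 49],
    ]
    print(accuracy_from_confusion(toy_matrix))
```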
- Vision-Based Gesture-Controlled Drone: A Comprehensive Python Package to Deploy Embedded Pose Recognition Systems, Arthur Findelair, August 11, 2021
- Presentation slides of the project
Distributed under the MIT License. See LICENSE for more information.
Many thanks to the ECASP Laboratory from the Illinois Institute of Technology that has provided all the necessary hardware to develop this project.