DeepClaw is intended to be set up as a robotic grasping platform for deep-learning-based research and development. The robot is configured to perform tasks similar to the arcade claw crane game. The goal of the game is for the player to steer the crane in the horizontal x-y plane to pick an optimal coordinate at which to drop the claw along the vertical z axis. While descending, the claw closes as it approaches the bottom of a transparent, enclosed cabinet filled with stuffed toys as rewards. If successful, one reward (or several, if lucky) is picked up by the closing claw and delivered to the lucky (or skilled) player when the claw opens above a drop hole. The game has been a popular fixture in arcades for a long time. In this project, a robotic system is set up in the same way to perform grasping tasks for deep learning training and validation.
- Start
- Preliminary Preparation
- Safety Check
- From Idle Position (x_idle, y_idle, z_idle=0, theta_idle, open)
- Operation
- Blind Grips (~5k in 1 week)
- Blind Learning with all Blind Grips attempted in 1 week
- Daily Grips (~1k in 1 day)
- Daily Learning with all Daily Grips attempted in 1 day
- Repeat Daily Grips and Daily Learning for 8 weeks
- End
- Back to Idle Position (x_idle, y_idle, z_idle=0, theta_idle, open)
- Learning and Grasping Result Summary
- System Setup
- World => Pedestal => Arm => FT Sensor => Gripper
- World => Desk => Tray => Objects (to be picked)
- World => Desk => Bin => Objects (to be placed)
- World => Camera Kinect (main rgb-d camera for training, focusing on the tray)
- World => Camera LifeCam (auxiliary rgb camera for training, focusing on the tray)
- World => Camera Canon (record the whole experiment, focusing on the whole robot operation)
Coordinate Setup
| Item | Notes | X | Y | Z | Theta |
| --- | --- | --- | --- | --- | --- |
| World | Measured | | | | |
| Pedestal Foot A | Measured | | | | |
| Pedestal Foot B | Measured | | | | |
| Pedestal Center | Calculated | | | | |
| Arm Base Center | Calculated | | | | |
| Desk Foot A | Measured | | | | |
| Desk Foot B | Measured | | | | |
| Desk Center | Calculated | | | | |
| Tray Foot A | Measured | | | | |
| Tray Foot B | Measured | | | | |
| Tray Center | Calculated | | | | |
| Bin Foot A | Measured | | | | |
| Bin Foot B | Measured | | | | |
| Bin Center | Calculated | | | | |
| Camera Kinect | Measured | | | | |
| Camera LifeCam | Measured | | | | |
| Camera Canon | Measured | | | | |
| Zero Plane | Measured | | | | |
| Hover Plane | Measured | | | | |
| Grip Plane | Measured | | | | |
| Idle Position | Measured | | | | |
| Drop Position | Measured | | | | |
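The "Calculated" rows are presumably derived from the measured ones. Below is a minimal sketch, assuming each Center is simply the midpoint of its two measured Feet; the actual derivation may differ, and the values shown are hypothetical.

```python
import numpy as np

def midpoint(foot_a, foot_b):
    """Estimate a Center row as the midpoint of its two measured Foot rows.

    foot_a, foot_b: (x, y, z) coordinates in the World frame.
    Treating the Center as the midpoint is an assumption, not a documented rule.
    """
    return (np.asarray(foot_a, dtype=float) + np.asarray(foot_b, dtype=float)) / 2.0

# Hypothetical measured values in metres, for illustration only.
pedestal_center = midpoint((0.10, 0.20, 0.0), (0.90, 0.20, 0.0))
```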
Hardware
- Robot System (RS)
- Arm: UR5 from Universal Robots
- FT Sensor: FT300 from Robotiq
- Gripper: 3-Finger Adaptive Gripper from Robotiq
- Camera 1: Kinect for Xbox One from Microsoft
- Camera 2: LifeCam from Microsoft
- Camera 3: Canon EOS M3
- Learning Computer (LC)
- OS: Ubuntu Trusty 14.04.5
- CPU: Intel Core i7 6800K Hex-Core 3.4GHz
- GPU: Single 12GB NVIDIA TITAN X (Pascal)
- RAM: 32GB Corsair Dominator Platinum 3000MHz (2 X 16GB)
- SSD: 1TB Samsung 850 Pro Series
The blind stage starts with Blind Grips, followed by Blind Learning, aiming to initialize the neural network. Essentially, we need to give the robot a chance to explore the world before we benchmark its performance. Specifically, we choose the following strategy to collect some initial test data and better understand the nature of this experiment.
- Start the robot with totally blind grasps, i.e. reach out to any coordinate within the allowed workspace, perform a grasp, and check the result right after (a sampling sketch follows the lists below).
- The advantage would be the simplicity of automating the programming without involving the neural network, and
- The disadvantage would be the lack of purpose, which may result in a low grasp success rate in the beginning, slowing down the learning process.
Based on initial results collected, this process could be improved with the following strategy to boost the learning process.
- Start the robot with human-guided grasps, i.e. send labelled coordinates of potential grasps to the robot and let it grasp (this is the part that is interestingly like a "claw machine", possibly making it a joy to "play" with.)
- The advantage would be a "manual" speedup of the learning process through purposeful and successful grasps, making it a supervised learning problem, and
- The disadvantage would be the uncertainty of introducing human intervention into an autonomously learned grasping network; it is unclear whether this would help or hurt.
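As a reference, here is a minimal sketch of the fully blind strategy, sampling a uniformly random grasp vector within assumed workspace bounds. The GraspVector fields, the bound values, and sample_blind_grasp are placeholders rather than project code; in the human-guided variant, the coordinate would instead come from a queue of labelled grasps.

```python
import random
from collections import namedtuple

# Grasp vector (x, y, z, theta, gripper); field names are illustrative, not the project API.
GraspVector = namedtuple("GraspVector", "x y z theta gripper")

# Placeholder workspace limits (metres / radians); replace with the measured planes above.
X_RANGE = (-0.30, 0.30)
Y_RANGE = (-0.30, 0.30)
Z_GRIP = -0.20            # assumed grip plane height relative to the zero plane
THETA_RANGE = (-1.57, 1.57)

def sample_blind_grasp():
    """Sample a totally blind grasp attempt uniformly within the allowed workspace."""
    return GraspVector(
        x=random.uniform(*X_RANGE),
        y=random.uniform(*Y_RANGE),
        z=Z_GRIP,
        theta=random.uniform(*THETA_RANGE),
        gripper="open",
    )
```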
The Blind Grips consist of many Blind Grip Cycles, and each Blind Grip Cycle consists of many Blind Grip Attempts. Note that one can treat the tray as the bin to speed up data collection, and one can also merge the idle and drop positions to further simplify the workflow. A pseudocode sketch of one cycle follows the list below.
- Start from idle position c_idle
- [Blind] Grip Cycle n = 1 (start of this Learning Cycle)
- ...
- [Blind] Grip Cycle n (end of last Grip Cycle n-1 = start of this Grip Cycle n)
- Shot 00
- [take picture I_n_00 without gripper, only the objects in tray]
- Move 00
- [move gripper horizontally to c_n_0 following a random vector v_n_0 = (x_n_0, y_n_0, z_n_0=0, st_n_0, ct_n_0, open) to the zero plane]
- Shot 01
- [take picture I_n_01 with gripper and objects in tray]
- Move 1
- [move gripper to c_n_1 following a random vector v_n_1 = (x_n_1, y_n_1, z_n_1=grip, st_n_1, ct_n_1, open) to the grip plane]
- Shot 1
- [take picture I_n_1 with gripper and objects in tray]
- Pick
- [close the gripper for a grip attempt]
- Shot 11
- [take picture I_n_11 of the tray, recording the immediate grip result]
- Move
- [move gripper to the drop position c_drop above the bin]
- Shot 12
- [take picture I_n_12 of the tray, recording the confirmed results in the tray after grip attempt, the number of objects in tray "-1"]
- Drop
- [open the gripper at drop position c_drop to release object to the bin]
- Shot 13
- [take picture I_n_13 of the bin, recording the confirmed results in the bin after drop, the number of objects in bin "+1"]
- Shot 00
- [Blind] Grip Cycle n+1 (end of this Grip Cycle n = start of next Grip Cycle n+1)
- ...
- [Blind] Grip End (end of this Learning Cycle = start of next Learning Cycle)
- Move [move gripper back to idle position c_idle, marking the end of the experiment]
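Putting the steps together, here is a minimal pseudocode sketch of one Blind Grip Cycle, reusing sample_blind_grasp and Z_GRIP from the sketch above. The arm, gripper, and camera objects (move_to, close, open, capture) are assumed driver interfaces, not the actual RS API.

```python
def blind_grip_cycle(arm, gripper, camera, c_drop):
    """One Blind Grip Cycle: Shot 00/01, Move 00/1, Shot 1, Pick, Shots 11-13, Drop.

    arm, gripper, and camera stand in for the actual RS drivers (assumed interfaces).
    """
    record = {}

    record["I_00"] = camera.capture()              # Shot 00: tray only, no gripper in view
    v0 = sample_blind_grasp()._replace(z=0.0)      # random waypoint v_n_0 on the zero plane
    arm.move_to(v0)                                # Move 00
    record["I_01"] = camera.capture()              # Shot 01: gripper above the tray

    v1 = v0._replace(z=Z_GRIP)                     # descend to the grip plane (waypoint v_n_1)
    arm.move_to(v1)                                # Move 1
    record["I_1"] = camera.capture()               # Shot 1

    gripper.close()                                # Pick
    record["I_11"] = camera.capture()              # Shot 11: immediate grip result
    arm.move_to(c_drop)                            # Move to the drop position above the bin
    record["I_12"] = camera.capture()              # Shot 12: tray after the attempt ("-1")
    gripper.open()                                 # Drop
    record["I_13"] = camera.capture()              # Shot 13: bin after the drop ("+1")

    record["waypoints"] = (v0, v1)
    return record
```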
After the Blind Grips, the data collected from the Robot System (RS) will be supplied to the Learning Computer (LC) for network training (a sketch of assembling one training example follows the list). Data to be input to the network includes
- Camera data
- I_n_00, I_n_01
- all need to be preprocessed with a random crop
- both rgb-d and rgb camera data could be used for training, taken from two different angles
- Robot data
- c_n_1 - c_n_0
- note that this is vector data representing relative position
- Reward data
- r_g_n [0/1] grasp success marker processed from I_n_11 and I_n_12
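A rough sketch of how one training example could be assembled from the items above, assuming the images arrive as H x W x C NumPy arrays; the crop size and dictionary layout are illustrative choices, not the project's actual data format.

```python
import numpy as np

def random_crop(image, out_h=472, out_w=472):
    """Random crop used as preprocessing; the output size here is a placeholder."""
    h, w = image.shape[:2]
    top = np.random.randint(0, h - out_h + 1)
    left = np.random.randint(0, w - out_w + 1)
    return image[top:top + out_h, left:left + out_w]

def make_example(I_00, I_01, c_0, c_1, success):
    """Bundle camera, robot, and reward data for one grip attempt."""
    return {
        "images": (random_crop(I_00), random_crop(I_01)),   # I_n_00, I_n_01
        "motion": np.asarray(c_1) - np.asarray(c_0),        # relative vector c_n_1 - c_n_0
        "reward": 1 if success else 0,                       # r_g_n from I_n_11 vs I_n_12
    }
```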
Each grip cycle can be further simplified into a few Integrated Task Flows (ITFs) as follows (also captured as a small mapping after the list).
- [Blind] Grip Cycle n (end of last Grip Cycle n-1 = start of this Grip Cycle n)
- ITF-SMS0 (= Shot 00 => Move 00 => Shot 01)
- ITF-MoSo (= Move 1 => Shot 1)
- ITF-Pick (= Pick => Shot 11 => Move => Shot 12 => Drop => Shot 13)
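The same grouping, captured as a small mapping for bookkeeping (names only, no behaviour):

```python
# The three ITFs as ordered action sequences.
ITF_ACTIONS = {
    "ITF-SMS0": ["Shot 00", "Move 00", "Shot 01"],
    "ITF-MoSo": ["Move 1", "Shot 1"],
    "ITF-Pick": ["Pick", "Shot 11", "Move", "Shot 12", "Drop", "Shot 13"],
}
```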
Once a set of initial weights is obtained, the data collection and model training process is carried out daily. For example, the Robot System will collect data during the day and leave the Learning Computer to update the network overnight. The daily stage is expected to last for 8 weeks, accumulating a total of 30~50k attempts. The dual camera input could possibly double the training data to 60~80k attempts, but at the risk of overfitting the model.
- Start from idle position c_idle
- [Daily] Grip Cycle n = 1 (start of today's Learning Cycle)
- ...
- [Daily] Grip Cycle n (end of last Grip Cycle n-1 = start of this Grip Cycle n)
- ITF-SMS0 (= Shot 00 => Move 00 => Shot 01)
- CEMs (decide whether to Pick, Move, or Raise; see the decision-loop sketch after this list)
- [CEMs denote 3 cycles of calculation using the Cross-Entropy Method (CEM) to infer a possible new motor command v_n_x* for a new picking waypoint using the trained model g_m.
- A decision is made on whether a successful grasp can be performed based on the ratio p_n between the predicted grasp success probability at the current waypoint, g_m(I_n_x-1, close), and that at the new waypoint, g_m(I_n_x-1, v_n_x*).]
- ITF-MoSo (if 50% < p_n <= 90%)
- Repeat CEMs no more than 10 times
- ITF-Pick (if p_n > 90%)
- Close the gripper for a grip attempt and stop this Grip Cycle
- ITF-Raise (if p_n <= 50%)
- Move the gripper by raising it 10 cm directly and stop this Grip Cycle
- [Daily] Grip Cycle n+1 (end of this Grip Cycle n = start of next Grip Cycle n+1)
- ...
- [Daily] Grip End (end of this Learning Cycle = start of next Learning Cycle)
- Move [move gripper back to idle position c_idle, marking the end of the experiment]
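A minimal sketch of the per-cycle decision loop described above; g_m, cem_infer, and execute_moso are assumed interfaces standing in for the trained model, the 3-iteration CEM optimization, and the ITF-MoSo execution, respectively.

```python
def decide_and_act(g_m, cem_infer, execute_moso, I_prev, max_rounds=10):
    """Run the per-cycle decision loop: Pick, MoSo (repeat), or Raise.

    g_m(image, command) -> predicted grasp success probability (trained model).
    cem_infer(g_m, image) -> new motor command v* from 3 CEM iterations.
    execute_moso(v) -> new tray image after moving to waypoint v (ITF-MoSo).
    All three callables are assumed interfaces, not actual project code.
    """
    for _ in range(max_rounds):                     # repeat CEMs no more than 10 times
        v_star = cem_infer(g_m, I_prev)
        p_n = g_m(I_prev, "close") / max(g_m(I_prev, v_star), 1e-6)

        if p_n > 0.9:
            return "ITF-Pick"                       # close the gripper, stop this cycle
        if p_n <= 0.5:
            return "ITF-Raise"                      # raise the gripper 10 cm, stop this cycle
        I_prev = execute_moso(v_star)               # 0.5 < p_n <= 0.9: move and re-evaluate

    return "ITF-Raise"                              # fall back after 10 rounds of CEMs
```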
The End Stage is straightforward, as follows.
- Back to Idle Position (x_idle, y_idle, z_idle=0, theta_idle, open)
- Learning and Grasping Result Summary
Image Data (I_n) and Motor Data (v_n) are the main data to be transmitted and processed. However, there is potential to include further Motion Data, Grip Data, and Sensor Data (TBD)
Efficient data saving (TBD)
Time stamp synchronization (TBD)
The whole learning process consists of many (m) repeated Learning Cycles (a class sketch follows the list below)
- Each Learning Cycle consists of many (n) repeated Grip Cycles.
- Each Grip Cycle consists of a series of Integrated Task Flows (ITFs)
- Each ITF consists of a series of Actions (Shot, Move, Pick, Drop, etc.)
- Each Action executes/generates certain Data
- Shot Action
- generates visual Data
- Move Action
- executes waypoint Data (position and rotation as a vector)
- generates motion Data (Arm, Gripper, FT Sensor)
- Pick Action
- executes close grip Data (adaptive grasp mode: Basic or Pinch)
- generates motion/sensor Data (Gripper, FT Sensor)
- Drop Action
- executes open grip Data (adaptive grasp mode: Basic or Pinch)
- generates motion/sensor Data (Gripper, FT Sensor)
- XXXX Action
- XXXX
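This hierarchy could be captured with a few small classes; a rough sketch with placeholder fields, not the project's actual data model.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class Action:
    """One step in an ITF: executes input data, generates output data."""
    name: str
    executes: Dict[str, Any] = field(default_factory=dict)   # e.g. waypoint, grip mode
    generates: Dict[str, Any] = field(default_factory=dict)  # e.g. images, motion, FT readings

@dataclass
class ITF:
    """A named sequence of Actions."""
    name: str
    actions: List[Action] = field(default_factory=list)

@dataclass
class GripCycle:
    """A series of ITFs; a Learning Cycle would hold a list of these."""
    index: int
    itfs: List[ITF] = field(default_factory=list)

# Example: a Move action executes waypoint data and generates motion data.
move_1 = Action("Move 1",
                executes={"waypoint": (0.1, 0.2, -0.2, 0.0, "open")},
                generates={"arm": [], "gripper": [], "ft_sensor": []})
```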