Commit

Update README
boucherm committed Oct 27, 2018
1 parent b39ce6b commit 3435294
Showing 1 changed file with 7 additions and 7 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -1,4 +1,4 @@
-This project is both an exercise and an attempt at using a webcam to predict the mouse pointer position.
+This project is mainly an exercise but also an attempt at using a webcam to predict the mouse pointer position.
This is ( not ) achieved ( but not that far ) using a ( rather simple ) neural net.
As a neural net needs to be trained, training data is needed.
This project presents three python scripts: one to acquire data, one to define and train a neural net, one to test the results.
@@ -17,16 +17,16 @@ Displays a black image that covers all your screen.
A blue pixel surrounded by a grey square is displayed ( for each new position ).
While looking at the blue pixel, press `space` to acquire some images ( meanwhile the grey square is removed ).
Once a point data has been gathered a new point is selected and displayed.
-Points are sampled on a 10x10 grid.
+Points are first sampled on a 10x10 grid, then on 40 locations along the screen border.
Once the 10x10 grid has been scanned the program stops.
Press `q` or `escape` to stop early, but a complete run shouldn't be too long.

To provide some robustness to sensor noise and blinking, five images are taken for each sampled point.
-As a complete acquisition run scans 10x10 positions, it produces 500 training data.
-It usually takes between 1.5mn and 2mn for me.
+As a complete acquisition run scans 10x10 + 40 positions, it produces 700 training data.
+It usually takes about 2mn for me.

-You should probably aim for at least 20 000 images ( therefore performing 40 acquisition runs ).
-Perform different acquisition runs preferably at different times of the day, at different positions in front of your screen, etc...
+You should probably aim for at least 20 000 images ( therefore performing 30 acquisition runs ).
+Perform different acquisition runs preferably at different: times of the day, positions in front of your screen, screen height, screen inclination, etc...
In short: try to make the training data representative of the final use cases.

Quirk: the net is prone to learn to only use the head orientation ( thus ignoring eyes ).
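
The counts changed in the hunk above (a 10x10 grid plus 40 border locations, five images per point, a target of 20 000 images) can be checked with a short sketch. The function and constant names below are hypothetical and do not come from the project's scripts; only the numbers are taken from the README.

```python
import math

# Values quoted in the README text; the names are illustrative assumptions.
GRID_SIDE = 10         # points are sampled on a 10x10 grid
BORDER_POINTS = 40     # then on 40 locations along the screen border
IMAGES_PER_POINT = 5   # five images are taken for each sampled point


def images_per_run(grid_side=GRID_SIDE,
                   border_points=BORDER_POINTS,
                   images_per_point=IMAGES_PER_POINT):
    """Number of training images produced by one complete acquisition run."""
    return (grid_side * grid_side + border_points) * images_per_point


def runs_needed(target_images, per_run):
    """Smallest number of complete runs that reaches target_images."""
    return math.ceil(target_images / per_run)


if __name__ == "__main__":
    old_per_run = images_per_run(border_points=0)  # before this commit
    new_per_run = images_per_run()                 # after this commit
    print("old:", old_per_run, "images/run,", runs_needed(20_000, old_per_run), "runs")
    print("new:", new_per_run, "images/run,", runs_needed(20_000, new_per_run), "runs")
```

With the new layout, 29 complete runs already reach 20 000 images, so the README's figure of 30 runs leaves a small margin.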
@@ -46,7 +46,7 @@ File: `Training/train.py`
The net is defined in `Training/gaze_net.py`
To retain enough details, images are downscaled only to a 320x160 resolution.
This is quite big, therefore training is quite long.
-With an nvidia gtx970 card, training on about 40 000 images takes about 10h.
+With an nvidia gtx970 card, training on about 40 000 images takes about 18h.
During training the parameters are regularly saved to disk.
Resulting files have the names: `gn_320x240_mse0.XXXX_epochY.pt`.
In these names `mse0.XXXX` is the mean squared error for the net on the training data, and `epochY` simply is the epoch number.
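
Since the saved parameter files encode the training error and the epoch in their names, a small helper can sort through them. The following is a minimal sketch, assuming names follow the `gn_320x240_mse0.XXXX_epochY.pt` pattern quoted above exactly; `parse_checkpoint_name`, `lowest_mse` and the example file names are hypothetical and not part of the project.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Pattern inferred from the README's `gn_320x240_mse0.XXXX_epochY.pt` example.
_NAME_RE = re.compile(r"^gn_(\d+)x(\d+)_mse(\d+\.\d+)_epoch(\d+)\.pt$")


class Checkpoint(NamedTuple):
    name: str
    width: int
    height: int
    mse: float   # mean squared error on the training data
    epoch: int


def parse_checkpoint_name(name: str) -> Optional[Checkpoint]:
    """Return the fields encoded in a checkpoint name, or None if it does not match."""
    m = _NAME_RE.match(name)
    if m is None:
        return None
    return Checkpoint(name, int(m.group(1)), int(m.group(2)),
                      float(m.group(3)), int(m.group(4)))


def lowest_mse(names: Iterable[str]) -> Optional[Checkpoint]:
    """Pick the checkpoint with the lowest training MSE, ignoring other files."""
    parsed = [c for c in map(parse_checkpoint_name, names) if c is not None]
    return min(parsed, key=lambda c: c.mse) if parsed else None


if __name__ == "__main__":
    # Made-up file names, for illustration only.
    files = ["gn_320x240_mse0.0123_epoch4.pt",
             "gn_320x240_mse0.0098_epoch9.pt",
             "notes.txt"]
    print(lowest_mse(files))
```

Because the error in the name is measured on the training data, the lowest value is not necessarily the checkpoint that behaves best in the test script.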