
Convolutional (Conv) layer

Accepts as input:

<ul>
<li>feature vector of size $W_1 \times H_1 \times D_1$</li>
<li>filters of size $F \times F \times D_1 \times D_2$</li>
<li>biases of length $D_2$</li>
<li>stride $S$</li>
<li>amount of zero padding $P$</li>
</ul>

Outputs another feature vector of size $W_2 \times H_2 \times D_2$, where

<ul>
<li>$W_2 = \frac{W_1 - F + 2P}{S} + 1$</li>
<li>$H_2 = \frac{H_1 - F + 2P}{S} + 1$</li>
</ul>
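As a quick check of the output-size formulas, the arithmetic can be sketched in a few lines of Python. The function name `conv_output_size` is hypothetical, and raising an error when the division is not exact is an assumption made here for clarity; some libraries round down instead.

```python
def conv_output_size(w1, h1, f, s, p):
    """Return (W2, H2) using W2 = (W1 - F + 2P) / S + 1, and likewise for H2."""
    # Assumption: reject configurations where the filter does not tile
    # the padded input evenly, rather than silently rounding down.
    if (w1 - f + 2 * p) % s != 0 or (h1 - f + 2 * p) % s != 0:
        raise ValueError("filter does not fit the padded input evenly")
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    return w2, h2


# A 5x5 filter with stride 1 and padding 2 preserves the spatial size.
print(conv_output_size(32, 32, 5, 1, 2))      # (32, 32)
# An 11x11 filter with stride 4 and no padding on a 227x227 input.
print(conv_output_size(227, 227, 11, 4, 0))   # (55, 55)
```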

The $d$-th channel of the output feature vector is obtained by performing a valid convolution with stride $S$ of the $d$-th filter and the padded input, and then adding the $d$-th bias.
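The layer described above can be sketched as a naive, loop-based implementation in plain Python. This is an illustration under stated assumptions, not any particular library's implementation: the function name `conv_layer`, the nested-list layouts `x[row][col][channel]` and `filters[i][j][channel][d]`, and the choice not to flip the filter (cross-correlation, as is common in deep learning code) are all assumptions.

```python
def conv_layer(x, filters, biases, stride=1, pad=0):
    """Naive conv layer on nested lists: x[r][c][k], filters[i][j][k][d], biases[d]."""
    h1, w1, d1 = len(x), len(x[0]), len(x[0][0])
    f, d2 = len(filters), len(biases)

    # Zero-pad the input by `pad` pixels on all four sides.
    hp, wp = h1 + 2 * pad, w1 + 2 * pad
    xp = [[[0.0] * d1 for _ in range(wp)] for _ in range(hp)]
    for r in range(h1):
        for c in range(w1):
            xp[r + pad][c + pad] = list(x[r][c])

    # Output size from the formulas W2 = (W1 - F + 2P)/S + 1 (same for H2).
    h2 = (h1 - f + 2 * pad) // stride + 1
    w2 = (w1 - f + 2 * pad) // stride + 1

    out = [[[0.0] * d2 for _ in range(w2)] for _ in range(h2)]
    for r in range(h2):
        for c in range(w2):
            for d in range(d2):
                acc = biases[d]
                # Assumption: no filter flipping (cross-correlation).
                for i in range(f):
                    for j in range(f):
                        for k in range(d1):
                            acc += xp[r * stride + i][c * stride + j][k] * filters[i][j][k][d]
                out[r][c][d] = acc
    return out


# 3x3 single-channel input, one 2x2 filter of ones, zero bias.
x = [[[1.0], [2.0], [3.0]],
     [[4.0], [5.0], [6.0]],
     [[7.0], [8.0], [9.0]]]
filters = [[[[1.0]], [[1.0]]],
           [[[1.0]], [[1.0]]]]
print(conv_layer(x, filters, [0.0]))
# [[[12.0], [16.0]], [[24.0], [28.0]]]
```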

Stride

The amount by which a filter shifts spatially when convolving it with a feature vector.
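To make the effect of the stride concrete, the following sketch lists the positions at which a filter is applied along one spatial dimension. The helper name `window_starts` is hypothetical, and the example assumes no padding.

```python
def window_starts(length, f, stride):
    """Start indices of a length-f filter sliding over `length` elements."""
    return list(range(0, length - f + 1, stride))


print(window_starts(7, 3, 1))  # [0, 1, 2, 3, 4]
print(window_starts(7, 3, 2))  # [0, 2, 4]
```

With stride 1 the filter visits every valid position; with stride 2 it skips every other one, which is why the output shrinks.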

Dilation

A filter is dilated by a factor $r$ by inserting, in every one of its channels independently, $r - 1$ zeros between adjacent filter elements.
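A minimal sketch of dilation on a single filter channel, assuming the common convention that a dilation factor of `factor` places `factor - 1` zeros between neighboring elements. The function name `dilate` and its parameters are hypothetical.

```python
def dilate(kernel, factor):
    """Insert factor - 1 zeros between neighboring elements of a 2-D kernel."""
    rows, cols = len(kernel), len(kernel[0])
    # A dilated F x F kernel spans factor * (F - 1) + 1 elements per side.
    out_rows = factor * (rows - 1) + 1
    out_cols = factor * (cols - 1) + 1
    out = [[0] * out_cols for _ in range(out_rows)]
    for r in range(rows):
        for c in range(cols):
            out[r * factor][c * factor] = kernel[r][c]
    return out


print(dilate([[1, 2], [3, 4]], 2))
# [[1, 0, 2], [0, 0, 0], [3, 0, 4]]
```

A factor of 1 leaves the kernel unchanged; for a multi-channel filter the same operation would be applied to each channel separately.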

Fully connected (FC) layer

In practice, FC layers can be implemented as convolutional layers. To see how, note that when an input feature vector of size $H \times W \times D_1$ is convolved with a filter bank of size $H \times W \times D_1 \times D_2$, the result is an output feature vector of size $1 \times 1 \times D_2$. Since the convolution is valid and the filter cannot move spatially, the operation is equivalent to a fully connected one. Moreover, when this feature vector of size $1 \times 1 \times D_2$ is convolved with another filter bank of size $1 \times 1 \times D_2 \times D_3$, the result is of size $1 \times 1 \times D_3$. In this case, again, the convolution is done over a single spatial location and is therefore equivalent to a fully connected layer.
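The equivalence can be checked numerically with a small sketch. The names `fc_forward` and `full_size_conv`, the nested-list layouts, and the integer test data are all assumptions chosen for illustration; the point is only that a valid convolution whose filter is as large as the input computes the same dot products as a fully connected layer on the flattened input.

```python
def fc_forward(x_flat, weights, biases):
    """Fully connected layer: x_flat[n], weights[n][d], biases[d]."""
    return [biases[d] + sum(x_flat[n] * weights[n][d] for n in range(len(x_flat)))
            for d in range(len(biases))]


def full_size_conv(x, filters, biases):
    """Valid convolution with an H x W x D1 x D2 filter bank on an H x W x D1 input.

    The filter cannot move, so there is exactly one output location;
    the returned list holds the D2 values of the 1 x 1 x D2 output.
    """
    h, w, d1 = len(x), len(x[0]), len(x[0][0])
    out = []
    for d in range(len(biases)):
        acc = biases[d]
        for i in range(h):
            for j in range(w):
                for k in range(d1):
                    acc += x[i][j][k] * filters[i][j][k][d]
        out.append(acc)
    return out


# 2 x 2 x 2 input and a 2 x 2 x 2 x 2 filter bank (integers keep the sums exact).
x = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]
filters = [[[[1, (i * 2 + j) * 2 + k] for k in range(2)]
            for j in range(2)]
           for i in range(2)]
biases = [0, 10]

# Flatten input and filters in the same (row, column, channel) order.
x_flat = [x[i][j][k] for i in range(2) for j in range(2) for k in range(2)]
weights = [filters[i][j][k] for i in range(2) for j in range(2) for k in range(2)]

print(full_size_conv(x, filters, biases))   # [36, 178]
print(fc_forward(x_flat, weights, biases))  # [36, 178]
```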