ML Intro 3

What are convolutional neural networks?

https://www.cs.ryerson.ca/~aharley/vis/conv/ - an interactive visualization of a convolutional neural network.

When we look at images, our brains focus on finding features. Features serve the same purpose for a CNN: they are what give the computer vision.

Yann LeCun is the grandfather of CNNs.

A CNN takes in an input image and produces an output label for the image (the image class). So how does the computer see? It starts by converting the image into an array: a black-and-white image becomes a 2D array with each pixel value running from 0 to 255, whereas a color image becomes a 3D array storing the RGB values for each pixel.
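
To make that concrete, here is a minimal numpy sketch (the pixel values are made up for illustration) showing how a grayscale image is just a 2D array and a color image a 3D one:

```python
import numpy as np

# A tiny 4x4 grayscale "image": a 2D array of pixel intensities 0-255.
gray = np.array([
    [  0,  50, 100, 150],
    [ 50, 100, 150, 200],
    [100, 150, 200, 255],
    [150, 200, 255, 255],
], dtype=np.uint8)
print(gray.shape)   # (4, 4) -> height x width

# A color image adds a third dimension for the R, G, B channels.
color = np.zeros((4, 4, 3), dtype=np.uint8)  # height x width x channels
color[0, 0] = [255, 0, 0]                    # top-left pixel is pure red
print(color.shape)  # (4, 4, 3)
```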

In one line: CNN = Convolution (+ ReLU) -> Max Pooling -> Flattening -> Full Connection

1 - Convolution

(Figure: the convolution function.)

Introducing the feature detector

A convolution is an integral of the product of two functions, and it shows you how one function modifies the shape of the other. The image array is overlaid with a feature detector (also called a kernel or filter), usually a 3 x 3 array. At each position you multiply the corresponding values and sum them up: element-wise multiplication of matrices followed by a sum. The stride is the step size with which the feature detector moves across the image; a stride of 1 or 2 is conventional.

(Figure: a feature detector sliding across the image to produce the feature map.)

Image + Feature Detector = Feature Map (also called a convolved image). Each cell of the feature map records how strongly the feature matched at that position.

The image size is reduced in the process. The highest number in your feature map is where the pattern matches best. This mimics human perception, because we also see features rather than individual pixels.
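
Here is a rough numpy sketch of that sliding multiply-and-sum. `convolve2d` and the diagonal-edge kernel are illustrative, not a library API:

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image`, multiplying element-wise and
    summing at each position (what a CNN layer actually computes)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out_h = (ih - kh) // stride + 1
    out_w = (iw - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            feature_map[i, j] = np.sum(patch * kernel)  # multiply, then sum
    return feature_map

image = np.array([
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
])
# A hypothetical 3x3 detector for a diagonal pattern.
kernel = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
])
print(convolve2d(image, kernel, stride=1))
# Output is 3x3: the image shrinks, and the largest value marks the best match.
```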

1 (b) - ReLU Layer

The next step applies the rectified linear unit (ReLU), the rectifier function f(x) = max(0, x), to the feature maps to increase non-linearity. This matters because images themselves are highly non-linear.
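
A quick sketch of the rectifier applied element-wise to a feature map (plain numpy, nothing framework-specific):

```python
import numpy as np

def relu(x):
    # Rectifier: negative values become 0, positives pass through unchanged.
    return np.maximum(0, x)

feature_map = np.array([[-2.0, 1.0], [3.0, -0.5]])
print(relu(feature_map))  # [[0. 1.] [3. 0.]]
```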

2 - Pooling

This addresses situations where the subject is not in a standard position, for example the direction the subject is facing. We want to add an element of spatial invariance. There are many pooling functions, but we will focus on max pooling for now. It looks like convolution, but instead of multiplying and summing it simply records the maximum value in each window. Pooling also shrinks the feature map and helps prevent overfitting.

(Figure: max pooling over the feature map.)
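
A minimal numpy sketch of max pooling; `max_pool` is a hypothetical helper, and the 2x2 window with stride 2 is the common choice assumed here:

```python
import numpy as np

def max_pool(feature_map, size=2, stride=2):
    """Slide a `size` x `size` window and keep only the maximum value,
    so small shifts in a feature's position don't change the output."""
    h, w = feature_map.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    pooled = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i*stride:i*stride+size, j*stride:j*stride+size]
            pooled[i, j] = window.max()
    return pooled

fm = np.array([
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 5, 7],
    [1, 1, 8, 2],
])
print(max_pool(fm))  # [[6. 2.] [2. 8.]] -- half the size, peaks preserved
```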

3 - Flattening

The pooled feature map is then flattened into a single column vector to be sent to the neural network for further processing.
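
In numpy terms, flattening is a one-liner (reusing the pooled map from the sketch above):

```python
import numpy as np

pooled = np.array([[6., 2.], [2., 8.]])
vector = pooled.flatten()   # row by row into a 1D vector
print(vector)               # [6. 2. 2. 8.] -> input for the fully connected layers
```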

4 - Full Connection

This stage adds a whole ANN on top. The flattened features are combined to learn further, more abstract attributes. During backpropagation we don't just adjust the weights of the fully connected layers; the feature detectors themselves are adjusted too.

Each output neuron learns which neurons in the final fully connected layer it should value more: it notices which ones make it light up, and attributes a higher importance (weight) to them.
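
As a toy illustration (the weights here are random, standing in for learned ones), each output neuron is just a weighted sum over the whole flattened vector:

```python
import numpy as np

rng = np.random.default_rng(0)

x = np.array([6., 2., 2., 8.])      # flattened feature vector (4 inputs)
W = rng.normal(size=(3, 4)) * 0.1   # hypothetical weights: 3 output classes
b = np.zeros(3)

z = W @ x + b   # each output neuron weighs every input;
print(z)        # backprop adjusts W (and the filters upstream)
```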

CNN Feature : Softmax and Cross Entropy

The reason the output neuron values add up to 1, even though the input was a long unnormalized vector, is the softmax function.

Softmax goes hand in hand with cross-entropy. We don't have to use the mean squared error loss function, because cross-entropy works better for classification: thanks to the logarithm, it still produces a useful gradient during backpropagation even when the predicted probability for the correct class is very low.

(Figure: the cross-entropy formula, L = -Σᵢ yᵢ log(ŷᵢ), where yᵢ is the one-hot true label and ŷᵢ the predicted probability.)
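
A small numpy sketch of both pieces; the scores and class indices are made up for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract max for numerical stability
    return e / e.sum()        # outputs are positive and sum to 1

def cross_entropy(probs, true_class):
    return -np.log(probs[true_class])   # only the true class's probability matters

z = np.array([2.0, 1.0, 0.1])           # raw scores from the output layer
p = softmax(z)
print(p, p.sum())                       # probabilities summing to 1.0
print(cross_entropy(p, true_class=0))   # small loss: network is fairly confident
print(cross_entropy(p, true_class=2))   # large loss: log punishes tiny probabilities
```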

(Figure: loss function error types.)

CNN Feature : Eligibility Tracing

The agent takes n steps, sums the rewards over those n steps, and passes the total to the neural network, which gives it more insight: the agent knows where the sequence ends up instead of assessing one step at a time. And it doesn't only look at the cumulative reward; it also keeps a trace, so that if it receives a punishment it can tell which earlier step was likely the cause.
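
A minimal sketch of the n-step cumulative reward idea (the rewards and discount factor here are made up for illustration):

```python
import numpy as np

def n_step_return(rewards, gamma=0.99):
    """Discounted sum of the rewards collected over the last n steps,
    passed to the network as one learning signal instead of n separate ones."""
    return sum(r * gamma**t for t, r in enumerate(rewards))

# Hypothetical trajectory: small step costs, then a big punishment at the end.
rewards = [-0.1, -0.1, -0.1, -10.0]
print(n_step_return(rewards))  # the whole sequence is credited with the punishment
```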