What is a neural net thinking about? (an experiment)

Vineet
3 min read · May 22, 2019


So lately I’ve been getting interested in deep learning, and I started working through Stanford’s CS231n and its assignments.

In Assignment 1, you have to implement a two-layer fully connected neural network and train it on CIFAR-10. That’s not very satisfying, because a network that small doesn’t work well on CIFAR-10: you get around 40% accuracy. No fun.

So I searched for an easier problem, a gateway drug let’s say. And then I found the MNIST dataset on Kaggle. Handwritten digit recognition using deep learning! And you can actually get 90+% accuracy quite easily! Best drug.

The problem goes like this: you have a 28 by 28 image of a handwritten digit, and you have to recognize which digit it is. So I trained my two-layer net on it, and after some tuning it got to 91% accuracy on Kaggle’s test dataset.
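If you haven’t seen the assignment, the network is just two affine layers with a ReLU in between. Here’s a rough NumPy sketch of the forward pass; the names W1, b1, W2, b2 and the flattened 784-pixel input are my own illustrative choices, not the assignment’s exact code:

```python
import numpy as np

def two_layer_forward(X, W1, b1, W2, b2):
    """affine -> ReLU -> affine, returning one score per digit class.

    X is (N, 784): each 28x28 image flattened into a row vector.
    """
    h = np.maximum(0, X @ W1 + b1)  # hidden layer activations, shape (N, H)
    scores = h @ W2 + b2            # class scores, shape (N, 10)
    return scores, h
```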

But what is it thinking about?

A natural question to ask. Does it now have a notion of what a ‘1’ or a ‘0’ looks like? I’d like to see that notion. In other words: what’s the image that gets the least loss when the correct label is set to 1?

Here’s what it thinks the best 1 in the world is.

[Image: my neural net’s wet dream of digit 1]

Yup. That’s it. Doesn’t look much like a one to me personally but hey who am I to judge? :P

Here’s a 5 for you. NSFW.

[Image: my neural net’s wet dream of digit 5]

OK this is getting disturbing. Let me now tell you how I generated these images.

Generating the best image

We train networks through backpropagation. That means we calculate the analytic gradient of a loss function and propagate it backwards using the chain rule of multivariate calculus. Backpropagation through a fully connected layer gives three gradients: dW, db and dx, that is, the gradient with respect to the weight matrix, then the bias, and finally the input.
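For the curious, here’s roughly what that looks like in NumPy for a single affine layer y = x·w + b. This is a sketch of the standard derivation, not my exact assignment code:

```python
import numpy as np

def affine_backward(dout, x, w):
    """Backprop through y = x @ w + b, given the upstream gradient dout."""
    dx = dout @ w.T        # gradient wrt the input, same shape as x
    dW = x.T @ dout        # gradient wrt the weight matrix, same shape as w
    db = dout.sum(axis=0)  # gradient wrt the bias
    return dx, dW, db
```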

So we get the gradient with respect to the input too! During gradient descent, instead of changing the weights, what if we change x instead? The loss will still decrease, obviously.

So here’s what I did. I took a fully trained two-layer neural net, the one with 91% accuracy on Kaggle. Then I initialized an input image of all zeros (or some random initialization).

Now, if I want to make it look like a 9, I calculate the loss on that input image as if its label were 9. Once I backprop and get dx, I just run gradient descent on the image. That is how the images above were generated: they are the inputs that produce the least loss when given the labels 1 and 5.
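Put together, the whole procedure is only a few lines. A minimal sketch, assuming a net whose loss(X, y) method returns (loss, grads) like CS231n’s TwoLayerNet, extended (my assumption) so that grads['X'] also hands back dx, the gradient with respect to the input:

```python
import numpy as np

def dream_digit(net, label, steps=200, lr=1.0):
    """Gradient-descend on the *input* until the net's loss for `label` is tiny.

    Assumes net.loss(X, y) returns (loss, grads) and that grads['X']
    holds the gradient wrt the input (an extension of the CS231n API).
    """
    x = np.zeros((1, 784))            # start from a blank 28x28 image
    y = np.array([label])
    for _ in range(steps):
        loss, grads = net.loss(x, y)  # forward + backward pass
        x -= lr * grads['X']          # update the image, not the weights
    return x.reshape(28, 28)
```

Something like dream_digit(net, 5) would then spit out the net’s ideal 5, like the image above.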

So if one day you took that last image and showed it to my net, it would say with utmost certainty that it’s a 5, and the most beautiful 5 it has ever seen.

Why the funky images?

I don’t know. Please tell me. This is my first neural network.

Also, does the algorithm I used to generate these images already have a name? I bet it does, given how many people are researching this stuff. Please tell me about it!

So that was my experiment. An experiment that has no conclusion, sadly.

