Who should really be praised?
First things first: the news coverage can easily make you feel that Google did this all on its own and that the entire idea of Deep Dream originated at Google. Wrong. The idea grew out of a long line of research papers and efforts to visualize the effect of applying a deep neural network to an image (for now, just assume that a deep neural network is a black box that takes in an image and tells you which objects are present in it). More recently, a better way of visualizing these effects was formulated, developed, and demonstrated by researchers at Oxford University led by Dr. Andrea Vedaldi in a series of papers [1], [2]. Google, like any other IT giant, enjoys more media attention than the researchers who actually developed the idea, but half points to Google for releasing the code that can generate the hallucinations. I would have given three-quarter points had Google also set up a web server where people could generate their own dream images simply by uploading a photo.
Deep-dreaming for dummies
Real understanding comes from doing, so if you do not actually perform this experiment, you are not doing justice to the time you put into reading all this material. Look outside. See those clouds? Now start sifting through the list of common objects you usually come across: dog, cow, sofa, tree, cat, cartoon characters, and so on. While you run through this train of objects, try to find a cloud shaped like the object you are thinking of. Once you have found one, start focusing on that cloud with that object in mind. You will start seeing finer details in the cloud that progressively make it look more like the object. Within a couple of minutes you will feel like calling someone over to share this awesome cloud that bears a striking resemblance to the object, but to your dismay, the person next to you might not see it immediately, even though it is right there! Take a minute to point out the different parts of the object in the cloud, and you have a partner in the crime of imagination. What happened? How did an object that looks so real appear in the clouds? Well, actually, it did not. Your brain simply tricked you into finding that object in the cloud because you were pushing it so hard to find something. This phenomenon is called priming, and it plays a huge role in shaping our behavior and mindset, to the extent that even a minute change in context can alter our behavior drastically. More about priming and its socio-political effects later; for now, let us concentrate on deep dreaming.
Just like you can find objects in clouds or shadows, a deep neural network (referred to as a DNN from here on) finds objects in images. But, unlike yours, the eyes of a DNN are not very sharp; they are somewhat out of focus, so images look blurry and vague to the DNN. As a result, a nice-looking image of a mountain looks like a cloud to the DNN, a cloud in which it finds a hint of an object, say a building. Let us now add the final piece of the dreaming: the ability to reshape the clouds to bring out the object the DNN is looking for, much like us playing on a beach and creating objects on a canvas of sand at will. Remember how we do that? We start with lines, curves, dots, and primitive shapes (squares, triangles, circles) and incrementally keep adding them, specific to the object we have in mind, until we are satisfied with our creation. A DNN does exactly the same thing with the given image. At first look, it thinks a particular object might be in the input image, and it then keeps altering the image in whatever way increases its belief that the object is there, until it is satisfied with its creation. Why, then, do these images look so weird, and why don't I see the objects that a stupid DNN can see? The answer lies in the question: a DNN is stupid, so its criterion for calling a cat a cat is stupid as well. Just as a brown trunk topped with a green blob makes a tree for an amateur artist, black eyes in a furry blob make a cat for a DNN. So you see, it is all relative: just as a trained artist considers your tree drawing a hallucination, you consider the DNN's drawing a hallucination. It is this lack of expressiveness that gives rise to the feeling of hallucination at different levels.
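That "keep altering the image to increase its belief" step is, at its core, gradient ascent on the input image rather than on the network's weights. Here is a minimal toy sketch of that loop. To keep it self-contained, a hand-made linear "feature detector" stands in for a real trained network layer (a big assumption: actual Deep Dream backpropagates through a full CNN such as GoogLeNet); the names `activation` and `dream_step` are illustrative, not Google's API.

```python
# Toy sketch of the Deep Dream update rule: nudge the *image* so that
# a chosen "feature detector" fires more strongly on it.

def activation(image, feature):
    # How strongly the (fake, linear) feature fires on this image.
    return sum(p * f for p, f in zip(image, feature))

def dream_step(image, feature, lr=0.1):
    # For this linear activation, the gradient w.r.t. each pixel is just
    # `feature`, so the ascent step pushes every pixel toward the pattern
    # the "network" wants to see -- the reshaping-the-cloud step.
    return [p + lr * f for p, f in zip(image, feature)]

image = [0.2, 0.5, 0.1, 0.9]      # a tiny 4-pixel "image"
feature = [1.0, -1.0, 1.0, -1.0]  # the pattern the detector looks for

before = activation(image, feature)
for _ in range(20):               # repeat until "satisfied"
    image = dream_step(image, feature)
after = activation(image, feature)
print(after > before)             # the detector now fires more strongly
```

In a real network the gradient is computed by backpropagation and the loop usually adds extra regularizers (smoothness, jitter, multi-scale "octaves") so the result stays image-like, but the skeleton is the same: pick a layer, then repeatedly change the pixels in the direction that amplifies what that layer already sees.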
What next? Magic shroom trips !!!
Hopefully by now you have a fair intuitive grasp of how deep dreams are created. It is time to extend the idea to videos by training a DNN for video classification using not only the image frames but also the temporal information in the videos. Once we have such a network, we can use a similar pipeline to create hallucinatory video sequences from an input video, using a spatio-temporal prior for the reconstruction. Next, throw in Oculus virtual-reality videos as the input to the hallucinator, and we have a working recipe for 3D hallucinations that can be plugged straight into an Oculus VR headset. Those who have experienced an Oculus VR headset know its effect and the reality of the virtual world it can create. I think it would make for a very potent hallucination experience in a virtual world, much like a shroom trip without the harmful effects. All in all, exciting days are ahead in the realm of Artificial Intelligence, and contrary to the much-hyped fear propaganda against AI, it is going to walk into your life through the front door, so you had better start educating yourself about it for a warm welcome.