Generative Adversarial Network

Generative adversarial networks (GANs) are a class of neural networks built with the intent of learning a probability distribution and mimicking it. A common misconception is that GANs simply learn to generate images. That is true to some extent: GANs learn the representation of the images and their distribution, not merely how to produce images. There is real mathematics involved, not simply two engineered networks. If it were hardly about the networks, why weren't autoencoders able to do the same job as efficiently as GANs? The answer lies in the particular way these networks are trained. Okay, then let's delve into GANs.

Why are GANs so successful?

The success of GANs compared to other generative models stems from the fact that they have moved away from traditional maximum-likelihood approaches. Instead, GANs try to minimise the JS-divergence (Jensen-Shannon divergence) between the real and generated distributions. Unlike most other generative models, which attempt to approximate the probability density function explicitly, GANs generate a distribution by using a neural network, and hence deal with the probability density function only indirectly.
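As a small illustration (not from the original post), the JS-divergence mentioned above can be computed for two discrete distributions; this sketch uses NumPy, and the distributions are made-up examples:

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions (assumes q > 0 wherever p > 0)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0                      # 0 * log(0) is treated as 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js_divergence(p, q):
    """JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2."""
    m = (np.asarray(p, dtype=float) + np.asarray(q, dtype=float)) / 2
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

p = [0.5, 0.5, 0.0]                   # illustrative distributions
q = [0.0, 0.5, 0.5]
print(js_divergence(p, q))            # ≈ 0.3466, i.e. 0.5 * ln 2
```

Note that unlike the KL-divergence, the JS-divergence is symmetric and always finite, which is part of why it is a convenient measure of distance between the two distributions.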

Construction:
Like other generative models such as Variational Autoencoders (VAEs), generative adversarial networks primarily consist of two neural networks, namely the Generator and the Discriminator, which are trained in parallel (that is, both networks are trained in a single pass). The generator generates data (samples) from random noise and passes the generated data to the discriminator. The discriminator's job is to predict whether the data it is being shown is real or fake (the generator's output).

Both networks are initially untrained, and they train by helping each other in an indirect way. The discriminator sees both fake images (generated by the generator) and real images (from the training data). The purpose of the discriminator is to predict a real image as real and a fake image as fake.

The generator essentially consists of a series of deconvolution (transposed-convolution) layers and fully connected layers.
The discriminator is a classifier that tries to distinguish real images from fake ones, and hence consists of convolutional layers and fully connected layers.
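The two networks above can be sketched in PyTorch. This is a minimal illustration using only fully connected layers for brevity (the post describes deconvolution and convolution layers for image data); all layer sizes are made-up:

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784        # illustrative: e.g. 28x28 flattened images

# Generator: maps random noise to a data-space sample
generator = nn.Sequential(
    nn.Linear(latent_dim, 128),
    nn.ReLU(),
    nn.Linear(128, data_dim),
    nn.Tanh(),                        # outputs scaled to [-1, 1]
)

# Discriminator: classifies a sample as real (1) or fake (0)
discriminator = nn.Sequential(
    nn.Linear(data_dim, 128),
    nn.LeakyReLU(0.2),
    nn.Linear(128, 1),
    nn.Sigmoid(),                     # probability that the input is real
)

z = torch.randn(16, latent_dim)       # a batch of random noise
fake = generator(z)                   # generated samples
score = discriminator(fake)           # D's belief that each sample is real
print(fake.shape, score.shape)
```

The sigmoid on the discriminator's output and the tanh on the generator's output are common choices, but the architecture itself is just a placeholder for whatever convolutional stack the task demands.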

It's really not the network architecture that matters the most; it's the training procedure and the GAN's loss function. From now on we denote the generator function by G and the discriminator function by D, and we assume that both G and D are differentiable.


In simple terms, the GAN game works like this: the generator tries to fool the discriminator, and the discriminator tries not to get fooled. It is a min-max game.

Loss function:

The equation governing this training procedure is:

min_G max_D V(D, G) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]

Where V is the value function.
We define a prior p_z(z) on the input noise variables, so G(z, θ) is a mapping from noise space to data space. We train D to maximise the probability of assigning the correct label to both training examples and samples from G, and we simultaneously train G to minimise log(1 − D(G(z))). In the original GAN paper, this objective is shown to be convex in the space of probability density functions, and under that idealised view we can expect gradient-based training to converge. Why does convexity matter?
Because while training, we generally move in the direction of the gradients. Suppose our loss function is non-convex, with more than one local minimum. Then, when we use stochastic gradient descent, we might get stuck in a local minimum that is not the global minimum. With a convex function, any local minimum we end up in is also the global minimum.
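The value function above translates directly into alternating updates for D and G. Below is a sketch of one such training step in PyTorch; the networks, sizes, and the random `real` batch are illustrative stand-ins, not the post's actual setup:

```python
import torch
import torch.nn as nn

latent_dim, data_dim, batch = 64, 784, 16   # illustrative sizes
generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                          nn.Linear(128, data_dim), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
                              nn.Linear(128, 1), nn.Sigmoid())
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)

real = torch.randn(batch, data_dim)         # stand-in for a real data batch
z = torch.randn(batch, latent_dim)          # noise sampled from p_z(z)

# Discriminator step: maximise log D(x) + log(1 - D(G(z))),
# implemented as minimising the negative; detach() stops gradients
# from flowing back into G during D's update.
d_loss = -(torch.log(discriminator(real)).mean()
           + torch.log(1 - discriminator(generator(z).detach())).mean())
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: minimise log(1 - D(G(z))), as in the value function.
# (In practice, many implementations instead minimise -log D(G(z)),
# the non-saturating variant, for stronger early gradients.)
g_loss = torch.log(1 - discriminator(generator(z))).mean()
opt_g.zero_grad()
g_loss.backward()
opt_g.step()

print(float(d_loss), float(g_loss))
```

In a real training loop, these two steps would repeat over minibatches drawn from the dataset and fresh noise samples each iteration.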




Problems in GANs

  • Generator gradients:
In general, it is tough to train G because we train it using the discriminator's gradients. In the Wasserstein GAN paper, Martin Arjovsky et al. showed theoretically that as the discriminator approaches its optimal state, the generator's gradients tend towards zero. So, after a certain point, the generator's training stalls.
  • Mode collapse:
GANs also suffer from mode collapse: the generator keeps producing the same (or very similar) images and effectively stops learning.
  • The problem of bringing diversity:
Bringing diversity to the generated images is tough. The generator often mimics the training dataset but fails to learn the full distribution.

With this, we conclude this post. From the next post, we will start dealing with variants of GANs.


