D Discriminator
G Generator
A generative model takes training examples drawn from a distribution pdata and learns an estimate of that distribution.
pdata generates the examples x, but it is unknown. We try to approximate it with pmodel.
Estimate
Why should we care about generative models?
2 networks
First:
Second:
TODO : see again
Fed with z, generates x
x = G(z; θ(G))
z are the latent variables; the generator maps them to the observable variables x.
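A minimal sketch of x = G(z; θ(G)), assuming a toy one-dimensional generator that is just an affine map. The function name `generator`, the parameter values and the Gaussian prior on z are illustrative assumptions, not part of these notes; a real generator would be a neural network.

```python
import random

def generator(z, theta):
    """Toy generator G(z; theta): an affine map from latent z to observable x."""
    weight, bias = theta
    return weight * z + bias

random.seed(0)
theta_g = (2.0, 1.0)  # hypothetical parameters theta(G)
z_samples = [random.gauss(0.0, 1.0) for _ in range(5)]   # latent z, assumed ~ N(0, 1)
x_samples = [generator(z, theta_g) for z in z_samples]   # observables x = G(z; theta(G))
```

With this choice the generated x follow a Gaussian with mean 1 and standard deviation 2; changing θ(G) changes the distribution of x without changing the distribution of z.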
The likelihood is the probability that the model pmodel, with its parameters θ, assigns to the training examples.
If the examples match the model distribution, the likelihood is high and the model is a good fit.
We can also work in log space. The goal is to maximize the likelihood by changing the parameters θ (or by changing the chosen model).
Equivalently, we can minimize the KL divergence between the empirical distribution and pmodel:
DKL(pdata(x)∥pmodel(x; θ))
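A small numeric check of why the two views agree, assuming discrete distributions with made-up probabilities. The KL divergence splits into the negative entropy of pdata (which does not depend on θ) minus the expected log-likelihood under pmodel, so minimizing one maximizes the other.

```python
import math

p_data = [0.5, 0.3, 0.2]    # hypothetical empirical distribution
p_model = [0.4, 0.4, 0.2]   # hypothetical model distribution for some theta

def kl_divergence(p, q):
    """D_KL(p || q) for discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def expected_log_likelihood(p, q):
    """E_{x ~ p}[log q(x)]: average log-likelihood of the model q on data from p."""
    return sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

entropy = -sum(pi * math.log(pi) for pi in p_data)   # H(p_data), independent of theta
kl = kl_divergence(p_data, p_model)
ell = expected_log_likelihood(p_data, p_model)

# D_KL(p_data || p_model) = -H(p_data) - E[log p_model]
identity_gap = abs(kl - (-entropy - ell))
```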
The two networks are trained by trying to find a Nash equilibrium.
Generator: creates samples by generalizing from the original data (without reproducing the original examples).
G(z; θ(G))
Discriminator: examines samples to decide whether they are real or fake. It is trained with supervised learning on labels in {0, 1} (real or fake): D(x; θ(D))
The generator tries to minimize J(G)(θ(D); θ(G)) while controlling only θ(G).
The Nash equilibrium is a tuple (θ(D); θ(G)) that is a local minimum of J(D) with respect to θ(D) and a local minimum of J(G) with respect to θ(G).
For the discriminator, the cost is the cross-entropy, but computed on 2 subsets (real and generated).
It gives an estimate of the ratio pdata(x) / pmodel(x).
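A sketch of the discriminator cost, assuming the usual form J(D) = −1/2 E_x log D(x) − 1/2 E_z log(1 − D(G(z))), which is not written out in these notes. The discriminator outputs below are made-up numbers, and the function names are illustrative. The ratio estimate comes from the fact that the optimal discriminator is D*(x) = pdata(x) / (pdata(x) + pmodel(x)), so D / (1 − D) recovers pdata(x) / pmodel(x).

```python
import math

def discriminator_cost(d_real, d_fake):
    """Cross-entropy on two subsets: D outputs on real samples and on generated ones."""
    real_term = -sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return 0.5 * real_term + 0.5 * fake_term

def density_ratio(d):
    """Estimate of pdata(x) / pmodel(x) from a discriminator output D(x)."""
    return d / (1.0 - d)

d_real = [0.9, 0.8, 0.7]   # hypothetical D(x) on real samples
d_fake = [0.2, 0.1, 0.3]   # hypothetical D(G(z)) on generated samples

cost = discriminator_cost(d_real, d_fake)
chance_cost = discriminator_cost([0.5], [0.5])   # a D that cannot tell: cost is log 2
```

A discriminator that outputs 0.5 everywhere gives a ratio of 1, meaning it sees pmodel as indistinguishable from pdata.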
We assume a zero-sum (minimax) game, so J(D) = −J(G). The value function V(θ(D), θ(G)) = −J(D)(θ(D), θ(G)) summarizes the game.
θ(G)* = arg min_θ(G) max_θ(D) V(θ(D), θ(G))
Problem:
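If the generator minimizes the same quantity, it will "generate" data like the initial data by overfitting. The discriminator will always win by rejecting: when it rejects generated samples with high confidence, the generator's gradient vanishes. If the generated samples differ a lot from the real ones, it is easy to tell which is which.
Giving the generator its own cost makes it possible to avoid simply inverting the sign of the discriminator's cost.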
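A sketch of why the sign flip alone is a problem, assuming the line above refers to the non-saturating heuristic, where the generator minimizes −log D(G(z)) instead of log(1 − D(G(z))). It also assumes D = sigmoid(a) with a the discriminator's logit on a generated sample. The gradients are written in closed form; the function names are illustrative.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def minimax_grad(a):
    """d/da of log(1 - sigmoid(a)): the generator term in the zero-sum game."""
    return -sigmoid(a)

def non_saturating_grad(a):
    """d/da of -log(sigmoid(a)): the generator term with the target flipped instead."""
    return -(1.0 - sigmoid(a))

a = -10.0  # the discriminator confidently rejects the generated sample
g_minimax = minimax_grad(a)          # close to 0: almost no learning signal
g_heuristic = non_saturating_grad(a) # close to -1: strong learning signal
```

Under these assumptions the zero-sum cost gives the generator almost no gradient exactly when it is doing badly, while the flipped-target cost keeps the gradient large there.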
KL
Reverse KL
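A small numeric illustration of how the two directions differ, assuming a made-up discrete pdata with two modes and two candidate models. KL here means D_KL(pdata ∥ pmodel) and reverse KL means D_KL(pmodel ∥ pdata). All probabilities are hypothetical.

```python
import math

def kl(p, q):
    """D_KL(p || q) for discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p_data = [0.495, 0.01, 0.495]   # two modes separated by a low-probability gap
q_cover = [0.34, 0.32, 0.34]    # model that spreads mass over everything
q_one_mode = [0.98, 0.01, 0.01] # model that commits to a single mode

forward_cover = kl(p_data, q_cover)        # KL, spread-out model
forward_one = kl(p_data, q_one_mode)       # KL, single-mode model
reverse_cover = kl(q_cover, p_data)        # reverse KL, spread-out model
reverse_one = kl(q_one_mode, p_data)       # reverse KL, single-mode model
```

In this toy case KL prefers the model that covers both modes (even though it puts mass in the gap), while reverse KL prefers the model that picks one mode and ignores the other.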
TODO IDEA: learning triangle. Data comes in with noise. Two adversarial networks run in parallel, each trying to learn about the data and to discriminate real from fake samples...
Allows separating the distribution (e.g. knowing whether the sample is a dog or a house avoids generating a mix of the two)
Mode collapse problem (the Helvetica scenario): the generator focuses on a single output only,