Progressive growing of GANs trains the network in multiple phases. In phase 1, the generator takes a latent vector z and uses two convolution layers to generate 4×4 images. We then train the discriminator on these generated images together with 4×4 (downsampled) real images. Once training stabilizes, we add two more convolution layers to upsample the images to 8×8 in the generator, and two more convolution layers to downsample the images in the discriminator.
Indeed, with 9 phases in total, we can generate 1024 × 1024 celebrity-like images.
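To see why 9 phases reach 1024 × 1024, note that the resolution starts at 4×4 and doubles each phase. A tiny sketch (the function name is my own):

```python
def resolution_at_phase(phase):
    """Assumed schedule: 4x4 at phase 1, doubling at every new phase."""
    return 4 * 2 ** (phase - 1)

print([resolution_at_phase(p) for p in range(1, 10)])
# phases 1..9 give 4, 8, 16, ..., 1024
```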
Progressive training speeds up and stabilizes regular GAN training. Most iterations are done at lower resolutions, so training is 2–6 times faster than other approaches at comparable image quality. In short, it produces higher-resolution images with better quality. Here is the final network when all phases of training are completed.
When new layers are added in each phase, they are faded in smoothly, with a weight α that increases linearly from 0 to 1.
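The fade-in is just a linear blend between the old path (the previous resolution's output, upsampled) and the new convolution path. A minimal sketch, assuming both paths already produce images at the new resolution (the function and variable names are my own):

```python
import numpy as np

def fade_in(old_path, new_path, alpha):
    """Blend the upsampled output of the old low-res path with the
    output of the newly added high-res layers.
    alpha ramps linearly from 0 (old path only) to 1 (new path only)."""
    return (1.0 - alpha) * old_path + alpha * new_path

# alpha schedule over a fade-in period of, say, fade_steps training steps:
# alpha = min(1.0, step / fade_steps)
```

At α = 0 the network behaves exactly as before the new layers were added; at α = 1 the new layers fully take over, so the transition never shocks the already-trained layers.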
The progressive GAN also uses a simplified form of minibatch discrimination to improve image diversity. It computes the standard deviation of each feature at each spatial location over the minibatch, then averages these values to yield a single scalar. This scalar is replicated at all spatial locations and concatenated as an extra feature map in one of the last layers of the discriminator. If the generated images do not have the same diversity as the real images, this value will differ between the two, and the discriminator can use it to penalize the generator.
The progressive GAN initializes the filter weights with 𝒩(0, 1) and then scales the weights at runtime for each layer (the "equalized learning rate"):

ŵᵢ = wᵢ / c

where c is the inverse of the per-layer normalization constant from He's initializer, i.e. 1/c = √(2 / nᵢₙ), with nᵢₙ the number of input connections to the layer.
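A minimal sketch of this runtime scaling, assuming a convolution weight tensor of shape (out_channels, in_channels, kh, kw) so that the fan-in is in_channels · kh · kw (the function name is my own):

```python
import numpy as np

def runtime_scaled_weights(shape, rng=None):
    """Store weights drawn from N(0, 1); apply the He constant
    sqrt(2 / fan_in) at runtime, in the forward pass."""
    if rng is None:
        rng = np.random.default_rng(0)
    w = rng.standard_normal(shape)        # stored weights ~ N(0, 1)
    fan_in = int(np.prod(shape[1:]))      # in_channels * kh * kw
    he_const = np.sqrt(2.0 / fan_in)      # = 1/c in the notation above
    return w * he_const                   # effective weights used in the layer
```

Because adaptive optimizers such as Adam normalize updates per parameter, moving the scaling to runtime equalizes the effective learning rate across layers with very different fan-ins.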
For the generator, the feature vector at each pixel is normalized after every convolution layer (pixelwise feature normalization):

b₍ₓ,ᵧ₎ = a₍ₓ,ᵧ₎ / √( (1/N) Σⱼ (aʲ₍ₓ,ᵧ₎)² + ε )

where a₍ₓ,ᵧ₎ and b₍ₓ,ᵧ₎ are the original and normalized feature vectors at pixel (x, y), and N is the number of feature maps.
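This normalization is a one-liner over the channel dimension. A sketch for activations of shape (N, C, H, W):

```python
import numpy as np

def pixel_norm(a, eps=1e-8):
    """Normalize the C-dimensional feature vector at every pixel so its
    mean square is 1; prevents feature magnitudes from escalating."""
    return a / np.sqrt((a ** 2).mean(axis=1, keepdims=True) + eps)
```

Unlike batch normalization, this uses no learned parameters and no batch statistics; it simply keeps the magnitude of the generator's activations from spiraling out of control during the adversarial game.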