
GAN — Self-Attention Generative Adversarial Networks (SAGAN)

How can a GAN use attention to improve image quality, the way attention improves accuracy in language translation and image captioning? For example, an image captioning network focuses on different areas of the image as it generates each word of the caption.


Motivation

GAN models trained on ImageNet handle classes dominated by texture (landscape, sky) well but perform much worse on classes defined by structure. For example, a GAN may render a dog’s fur nicely but fail badly on its legs. Convolutional filters are good at capturing local spatial information, but their receptive fields may not be large enough to cover larger structures. We can increase the filter size or the depth of the network, but this makes GANs even harder to train.
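
To get a feel for how slowly the receptive field grows, here is a small sketch that computes it for a stack of convolutions. The helper name `receptive_field` and the layer configurations are made up for illustration; the arithmetic is the usual rule that each layer adds (kernel size − 1) times the product of all earlier strides.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    layers: list of (kernel_size, stride) pairs, ordered input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Four 3x3 convolutions with stride 1 only see a 9x9 patch.
print(receptive_field([(3, 1)] * 4))   # 9
# Strides help, but four 3x3 stride-2 layers still see only 31x31.
print(receptive_field([(3, 2)] * 4))   # 31
# Covering a 128x128 image with 3x3 stride-1 layers takes dozens of layers.
print(receptive_field([(3, 1)] * 63))  # 127
```

So relating two distant parts of an object with plain convolutions requires a deep stack, which is exactly what makes training harder.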

Image for post
Image for post
Source

Design

For each convolutional layer,

  • Compute the self-attention output.
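
As a rough illustration of this step, here is a minimal NumPy sketch of self-attention over a feature map. It assumes the formulation as I recall it from the SAGAN paper: three 1×1 convolutions f, g and h (written here as plain matrices `w_f`, `w_g`, `w_h`), an attention map β obtained with a softmax over locations, and an output y = γ·o + x with a learnable scalar γ that starts at 0. All function and variable names are my own, and details such as the final 1×1 convolution are left out.

```python
import numpy as np


def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_f, w_g, w_h, gamma):
    """x: (C, H, W) feature map; w_f, w_g: (C_k, C); w_h: (C, C)."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)   # (C, N), one column per spatial location
    f = w_f @ flat               # (C_k, N)
    g = w_g @ flat               # (C_k, N)
    v = w_h @ flat               # (C, N)
    s = f.T @ g                  # s[i, j] = f(x_i) . g(x_j), shape (N, N)
    beta = softmax(s, axis=0)    # beta[i, j]: weight of location i for output j
    o = v @ beta                 # o[:, j] = sum_i beta[i, j] * h(x_i)
    y = gamma * o + flat         # residual connection scaled by gamma
    return y.reshape(c, h, w), beta


# Hypothetical sizes: 16 channels, 4x4 feature map, C_k = 2 (assumed C / 8).
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 4, 4))
w_f = rng.normal(size=(2, 16)) * 0.1
w_g = rng.normal(size=(2, 16)) * 0.1
w_h = rng.normal(size=(16, 16)) * 0.1

y0, beta = self_attention(x, w_f, w_g, w_h, gamma=0.0)  # identity at init
y1, _ = self_attention(x, w_f, w_g, w_h, gamma=0.5)     # attention mixed in
```

With γ = 0 the layer returns its input unchanged, so the network can first rely on local convolutional features and only gradually learn to use the non-local ones.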
Figure: visualization of the attention map for the location marked by the red dot (source).

Loss function

SAGAN uses hinge loss to train the network.

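Here is a minimal NumPy sketch of the hinge loss, assuming the standard GAN hinge formulation: the discriminator minimizes E[max(0, 1 − D(x))] + E[max(0, 1 + D(G(z)))], and the generator minimizes −E[D(G(z))]. The function names and the sample scores are made up for illustration.

```python
import numpy as np


def d_hinge_loss(d_real, d_fake):
    """Discriminator loss from raw (unbounded) scores on real and fake images."""
    loss_real = np.mean(np.maximum(0.0, 1.0 - d_real))  # push real scores above +1
    loss_fake = np.mean(np.maximum(0.0, 1.0 + d_fake))  # push fake scores below -1
    return loss_real + loss_fake


def g_hinge_loss(d_fake):
    """Generator loss: raise the discriminator's score on generated images."""
    return -np.mean(d_fake)


d_real = np.array([2.0, 0.5])   # hypothetical scores on real images
d_fake = np.array([-2.0, 0.0])  # hypothetical scores on generated images

print(d_hinge_loss(d_real, d_fake))  # 0.75
print(g_hinge_loss(d_fake))          # 1.0
```

Samples that are already on the correct side of the margin (the 2.0 and −2.0 above) contribute nothing to the discriminator loss.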

Implementation

Self-attention is not applied to the generator only: both the generator and the discriminator use it. To improve training, the discriminator and the generator use different learning rates (the two time-scale update rule, called TTUR in the paper). In addition, spectral normalization (SN) is used to stabilize GAN training. Performance is measured in FID (the lower the better).

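To make the two training tricks more concrete, here is a minimal NumPy sketch of spectral normalization using power iteration. The function name, the toy weight matrix and the iteration count are my own choices. In practice, implementations usually run one power-iteration step per training update and keep the vector `u` between updates. The TTUR learning rates in the comment are the values I recall from the paper, so treat them as an assumption.

```python
import numpy as np

# Assumed TTUR setting (as I recall it from the paper): the discriminator
# learns faster than the generator.
LR_DISCRIMINATOR = 4e-4
LR_GENERATOR = 1e-4


def spectral_normalize(w, n_iter=20, seed=0):
    """Divide a 2-D weight matrix by an estimate of its largest singular value."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=w.shape[0])
    for _ in range(n_iter):
        # Power iteration: alternate between right and left singular vectors.
        v = w.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = w @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ w @ v  # estimate of the spectral norm
    return w / sigma, sigma


# Toy weight matrix with known singular values 3, 1 and 0.5.
w = np.diag([3.0, 1.0, 0.5])
w_sn, sigma = spectral_normalize(w)
```

After normalization the largest singular value of `w_sn` is 1, which bounds how much the layer can stretch its input.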

Reference

Self-Attention Generative Adversarial Networks
