Generative adversarial networks: the creative side of machine learning

The rapid development of artificial intelligence (AI) is making machines smarter. Since they have the ability to use input data to learn autonomously, machines are creating new ways to support humans in carrying out increasingly complex tasks.

One solution that’s very promising and already delivering impressive results in many areas is known as generative adversarial networks (GANs). GANs are primarily used to generate images, but they also allow the automatic creation of text. But what exactly are GANs? How do they work? And what suitable applications do they offer?

What is a GAN?

Before we explore what GANs can do for us, let’s look at what generative adversarial networks actually are.

A GAN is a machine learning system introduced in 2014 by Ian Goodfellow and his team. Its task is to generate creations of its own based on a set of real example data. The results can be deceptively realistic, to the point where it becomes hard to tell that an image was generated by a computer rather than created by human hands. To achieve this, a GAN uses two neural networks that are trained against each other.

The generator network is tasked with creating a fake. The system is trained on example data, such as photos of women. Over the course of training, the generator learns what properties the originals have in common, and it then creates its own photo on that basis. The new picture is therefore not a duplicate of any one piece of source data, but an entirely new image that is similar in nature: in our example, a photo of a (non-existent) woman.

Both the real example data and the generated data are passed to the partner network. The task of this discriminator network is to check all the data it receives and determine whether it is real or fake. An image is deemed fake not only if it deviates too far from the real data, but also if it’s too perfect. If the generator simply takes the average of all the data and produces a new image from it, the result will be easy to identify as machine-generated. The discriminator, therefore, also filters out results that don’t appear natural.

Both networks try to outdo each other. If the discriminator network recognises a fake dataset, it rejects the data. In this case, the generator network wasn’t good enough and needs to keep learning. At the same time, the discriminator also learns. Because the two networks are trained against each other, this is referred to as adversarial training; and since both are typically multi-layer neural networks, GANs belong to the field of deep learning. The generator attempts to create datasets that appear so genuine that the discriminator classifies them as real. The discriminator, on the other hand, tries to examine and understand the real examples so closely that false datasets have no chance of being classified as real.
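
The opposing goals of the two networks can be written down as two loss functions. The following is a minimal sketch in Python, not the implementation of any particular library. It assumes the discriminator outputs a score between 0 (fake) and 1 (real) for each sample; the function names are hypothetical, and the generator loss shown is the commonly used variant that rewards fakes for being scored as real.

```python
import math

EPS = 1e-12  # small constant that guards against log(0)

def discriminator_loss(real_scores, fake_scores):
    """Low when real samples score near 1 and fakes score near 0."""
    real_term = sum(-math.log(max(s, EPS)) for s in real_scores) / len(real_scores)
    fake_term = sum(-math.log(max(1.0 - s, EPS)) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

def generator_loss(fake_scores):
    """Low when the discriminator scores the generator's fakes near 1."""
    return sum(-math.log(max(s, EPS)) for s in fake_scores) / len(fake_scores)
```

If the discriminator is reduced to guessing and gives every sample a score of 0.5, discriminator_loss evaluates to 2 ln 2, or roughly 1.386. A discriminator that separates real from fake well scores below that value, while a generator that fools it drives generator_loss towards zero.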

How do GANs work?

Like any other machine learning system, a GAN first needs to be trained. In simplified form, this training proceeds through six steps:

  1. Problem definition: In the first step, a problem has to be defined which the system should try to solve. Here, the developers collect real data that the system can use.
  2. Architecture: Different problems require different generative adversarial networks. For this reason, the GAN has to be given the right architecture for the application.
  3. First discriminator training: Actual training begins during this step. The generator is stopped, while the discriminator only analyses the real data and learns to understand it.
  4. First generator training: Now the discriminator is stopped and the generator starts to generate falsified data.
  5. Second discriminator training: The discriminator network is now fed the new, falsified data from the generator and has to decide which datasets are true and which are false.
  6. Second generator training: The generator network is further improved with the result of the second discriminator training stage. The generator network gets to know the weaknesses of the discriminator and attempts to exploit them and generate even more realistic, fake datasets.
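
In practice, the alternation described in steps 3 to 6 is usually repeated many times in a loop. The sketch below illustrates this with a deliberately tiny example in plain Python. Everything specific in it is an assumption made for illustration: the "real" data are numbers drawn from a normal distribution around 4.0, the generator is a single shift parameter b applied to random noise, the discriminator is a logistic classifier with parameters w and c, and the function name, learning rate and batch size are arbitrary.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def train_toy_gan(steps=200, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    b = 0.0          # generator parameter: G(z) = z + b
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]      # assumed "real" data
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(batch)]  # generator output

        # Discriminator update (generator frozen): push D(real) -> 1, D(fake) -> 0.
        grad_w = grad_c = 0.0
        for x in real:
            err = sigmoid(w * x + c) - 1.0  # gradient of -log D w.r.t. the logit
            grad_w += err * x
            grad_c += err
        for x in fake:
            err = sigmoid(w * x + c)        # gradient of -log(1 - D) w.r.t. the logit
            grad_w += err * x
            grad_c += err
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator update (discriminator frozen): push D(fake) -> 1.
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(batch)]
        grad_b = sum((sigmoid(w * x + c) - 1.0) * w for x in fake) / batch
        b -= lr * grad_b
    return b, w, c

b, w, c = train_toy_gan()
print("generator shift:", b, "discriminator:", w, c)
```

Starting from b = 0, the generator's shift is pushed in the direction of the real data because that is where the discriminator currently assigns higher scores. A toy setup like this is not tuned and may oscillate around the target rather than settle on it, which is one reason real GAN training needs careful balancing.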

Both networks develop as part of this competition, thereby becoming better and more efficient. The generator network learns how to produce increasingly realistic datasets. The discriminator network learns how to identify even seemingly real datasets as false.

What challenges does the system need to overcome?

As with almost any technology, the developers of GANs face a number of challenges that have to be overcome to ensure that training runs smoothly.

Balanced competition

As explained above, GANs are based on the competition between two neural networks. But this can only work if both networks are roughly equally strong and effective. If one of the two networks is clearly superior, the system will collapse. For instance, if the generator is too effective, the discriminator will classify all falsified data as real. If, on the other hand, the discriminator has the upper hand, it will classify all the data from the generator as fake. In either case, neither network can improve any further.
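
One way to see why an overly strong discriminator stalls training is to look at the learning signal the generator receives. The sketch below is an illustration under stated assumptions rather than a description of any specific system: it assumes the discriminator's score is D = sigmoid(logit) and compares the gradient of the original minimax generator loss, log(1 - D), with that of the widely used alternative, -log(D). The function names are hypothetical.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def saturating_grad(logit):
    # Gradient of log(1 - D) with respect to the logit, where D = sigmoid(logit).
    return -sigmoid(logit)

def non_saturating_grad(logit):
    # Gradient of -log(D) with respect to the logit.
    return sigmoid(logit) - 1.0

# A more negative logit means the discriminator is more certain the sample is fake.
for logit in (0.0, -4.0, -8.0):
    print(logit, sigmoid(logit), saturating_grad(logit), non_saturating_grad(logit))
```

As the discriminator becomes more confident that a sample is fake, the first gradient shrinks towards zero, so the generator receives almost no signal to learn from. The second gradient stays close to -1 in the same situation, which is why this alternative loss is often preferred early in training.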

Correctly understanding objects

Generative adversarial networks often have problems correctly recognising and understanding objects. This is particularly true for images. Here’s an example: A real image shows two cats, each with two eyes. If the generator doesn’t understand the complete structure and positioning of the objects in the image, it might generate an image of one cat with four eyes instead. GANs can also be caught out by perspective and fail to understand that two images depict the same subject from different angles.

Where are GANs used?

Generative adversarial networks gained special attention – even beyond the field of computer science – after the artist collective Obvious used the technology to generate a work of art. The painting was sold at auction for $432,500. But a GAN can also deliver astonishing results outside artistic applications.

Video prediction

Based on the individual frames of a video, GANs can predict how it will continue and thereby extend the footage autonomously beyond its end. They take into account the elements of the video, including motions and actions, as well as background changes like rain or fog.

Image generation using text

GANs can generate images based on a description. For example, they can use a script to independently generate a storyboard.

Generation of complex objects

Generative adversarial networks can automatically transform even simple sketches into complex three-dimensional objects in very little time. Thanks to GANs, a simple drawing of a tree can be used to create a highly complex image with tiny details, like leaves fluttering in the wind and a swaying trunk.

Improving image details

GANs can add new details to an image that was taken in poor resolution or has missing picture elements. To do so, generative adversarial networks use information learned from similar images to fill in the missing image information.

Developing new products

Some companies are already experimenting with GANs in product development and creating completely new designs and product lines using the system.

Product text generation

GANs can also handle text creation and are already used to generate product texts, which play a major role in the purchase decisions of consumers. Using GANs, these descriptions can not only be created quickly, but the networks can also analyse which product texts were most successful in the past and use this information to compose similar texts.

Generative adversarial networks are already being successfully put to use across all these areas. Companies and developers are constantly working on new application possibilities. In the near future, GANs will likely have a major influence on many aspects of our lives and work.
