How Do Those AI Face Generator Things Work?

technojules/Julia
Dec 9, 2022

Just by browsing the internet, you may have heard of or seen fascinating apps and websites that can generate a seemingly random human face within seconds. You may even have tried some out to see what kinds of faces they can generate.

How do they do it? Are the faces those of real people?

The machine learning part

Many of these platforms train a GAN (Generative Adversarial Network), a type of machine learning model, on a dataset of real human faces so that it learns the range of human facial features and can then generate new faces. As a result, most of the time the generated faces you see on a website or in an app are not real, meaning you won’t be able to trace a photo back to an actual person.

But what is a GAN?

Generative adversarial networks (GANs) are a type of “generative” machine learning model: they learn patterns in their input data and output newly generated data that follows those patterns. They sit at the intersection of AI and art, creativity, and the humanities, since GANs can be used to generate new paintings, video scenes, music, poetry, stories, and more. The results can be so realistic and distinctive that artists and researchers are using GANs to create new styles of artwork and music and make an impact on these creative fields.

The GAN Structure

A GAN has two “sub-models”: the Generator and the Discriminator. The Generator creates new, fake examples (like fake human faces). The Discriminator classifies the examples it is shown as real or fake. To train a GAN, one feeds it examples of real input data, has the Generator produce outputs meant to resemble that data, and then updates both sub-models based on how well the Discriminator told the two apart. The key point is that the Generator and Discriminator compete against each other: the Discriminator tries to tell training data samples apart from Generator-generated data, while the Generator tries to produce data that the Discriminator mistakes for real.
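To make the back-and-forth concrete, here is a minimal sketch of alternating training on a made-up one-dimensional problem, using only the Python standard library. Everything in it is an assumption for illustration: the "real data" are numbers drawn around 4.0, the Generator is the line g(z) = a*z + b, the Discriminator is D(x) = sigmoid(w*x + c), and the losses are binary cross-entropy with the commonly used "non-saturating" Generator loss. Real face generators use deep networks and a framework, not hand-written gradients.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def discriminator_step(gen, disc, x_real, z, lr):
    """Update only the discriminator (w, c); the generator (a, b) is frozen."""
    a, b = gen
    w, c = disc
    x_fake = a * z + b
    s_real = sigmoid(w * x_real + c)   # probability that the real sample is real
    s_fake = sigmoid(w * x_fake + c)   # probability that the fake sample is real
    # Gradients of -log(s_real) - log(1 - s_fake) with respect to w and c
    grad_w = (s_real - 1.0) * x_real + s_fake * x_fake
    grad_c = (s_real - 1.0) + s_fake
    return (w - lr * grad_w, c - lr * grad_c)

def generator_step(gen, disc, z, lr):
    """Update only the generator (a, b); the discriminator (w, c) is frozen."""
    a, b = gen
    w, c = disc
    s_fake = sigmoid(w * (a * z + b) + c)
    # Gradients of -log(s_fake) with respect to a and b
    grad_a = (s_fake - 1.0) * w * z
    grad_b = (s_fake - 1.0) * w
    return (a - lr * grad_a, b - lr * grad_b)

def train_toy_gan(steps=200, lr=0.05, seed=0):
    rng = random.Random(seed)
    gen = (1.0, 0.0)    # hypothetical starting values for a, b
    disc = (0.1, 0.0)   # hypothetical starting values for w, c
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)   # one "real" training sample
        disc = discriminator_step(gen, disc, x_real, rng.gauss(0.0, 1.0), lr)
        gen = generator_step(gen, disc, rng.gauss(0.0, 1.0), lr)
    return gen, disc

gen, disc = train_toy_gan()
```

Each step function returns new values for only one sub-model, which mirrors the idea that the two take turns learning while the other is held fixed. The sketch makes no claim about how well this toy setup converges.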

How the Discriminator part works

The Discriminator is simply a classifier, for example a CNN when the data are images. Its goal is to distinguish real data from data produced by the Generator. To train it, one feeds in both real images/data (positive examples) and fake data created by the Generator (negative examples). Importantly, the Generator does not train while the Discriminator trains: its weights stay constant, though it still produces new examples. A GAN uses two loss functions to measure how it is doing, one for the Discriminator and one for the Generator; while the Discriminator is training, the Generator loss is ignored. The Discriminator loss penalizes the Discriminator for misclassifying real data as fake or fake data as real, and the Discriminator updates its weights by backpropagating that loss through its network.
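As a rough illustration of what "penalizing misclassification" can look like, the sketch below computes a binary cross-entropy loss, which is one common choice for a Discriminator (the article's sources do not pin down a single loss, so treat this as an assumption). The function name and the example scores are made up; each score is the Discriminator's estimated probability that an input is real.

```python
import math

def discriminator_loss(real_scores, fake_scores):
    """Binary cross-entropy over a batch of real and a batch of fake inputs.

    Each score is the discriminator's probability that the input is real,
    so real inputs should score near 1 and fake inputs near 0.
    """
    eps = 1e-12  # avoids log(0) for extreme scores
    real_term = -sum(math.log(p + eps) for p in real_scores) / len(real_scores)
    fake_term = -sum(math.log(1.0 - p + eps) for p in fake_scores) / len(fake_scores)
    return real_term + fake_term

# A discriminator that is mostly right: real scored high, fake scored low
good = discriminator_loss(real_scores=[0.9, 0.8], fake_scores=[0.1, 0.2])

# A discriminator that is mostly wrong: real scored low, fake scored high
bad = discriminator_loss(real_scores=[0.2, 0.1], fake_scores=[0.9, 0.8])
```

Here `good` comes out much smaller than `bad`, so following the gradient of this loss pushes the Discriminator toward the correct real/fake labels.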

How the Generator part works

The Generator takes a vector as input and generates a new data sample. This input vector is drawn at random, typically from a Gaussian distribution. After training, points in this vector space, called the latent space (its variables are hidden, in that we can’t observe them directly), correspond to points in the problem domain, so the latent space forms a compressed, high-level representation of the data. New points drawn from the latent space are given to the Generator as input, and each one produces a different output. GANs can learn latent spaces for images, music, and text, and sample from them to create new data that is similar to the training data but usually not identical to it.
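The idea of "draw a random vector, get a new sample" can be sketched in a few lines of plain Python. The generator here is a hypothetical stand-in, a single linear layer with made-up weights, whereas a real Generator is a trained deep network; the dimensions (4 latent values, 3 output values) are also chosen only for illustration.

```python
import random

def sample_latent(dim, rng):
    """Draw one latent vector with independent standard-normal entries."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def toy_generator(z, weights, bias):
    """Hypothetical stand-in for a trained generator: one linear layer
    that maps a latent vector z to an output vector."""
    return [sum(w * zi for w, zi in zip(row, z)) + b
            for row, b in zip(weights, bias)]

rng = random.Random(0)
latent_dim, output_dim = 4, 3

# Made-up weights; a real generator learns these during training
weights = [[rng.uniform(-1.0, 1.0) for _ in range(latent_dim)]
           for _ in range(output_dim)]
bias = [0.0] * output_dim

# Two different points in latent space give two different outputs
z1 = sample_latent(latent_dim, rng)
z2 = sample_latent(latent_dim, rng)
sample1 = toy_generator(z1, weights, bias)
sample2 = toy_generator(z2, weights, bias)
```

Feeding the same latent vector in again reproduces the same output, which is why each point in the latent space can be thought of as standing for one particular generated sample.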

Sounds pretty cool, right? However, there are some caveats:

Caveats

While GANs can be used to create fascinating artwork and images, some people have abused them to create deepfakes: fake but very realistic images or videos, usually of people. Deepfakes can be damaging to the people depicted, especially when they show the person doing something bad that they never did in real life. The same ability that makes GANs powerful can therefore be used to fabricate false accusations that damage people’s reputations, especially those of famous people.

Conclusion

However, as the use of GANs grows and creative machine learning advances, GANs could lead to new inventions or enhance existing artistic and creative work. For example, we could have higher-resolution images, more varieties of music to listen to, and so on.

Sources Used

https://www.theverge.com/tldr/2019/2/15/18226005/ai-generated-fake-people-portraits-thispersondoesnotexist-stylegan

https://keras.io/examples/generative/conditional_gan/

https://becominghuman.ai/generative-adversarial-networks-for-text-generation-part-1-2b886c8cab10

https://neptune.ai/blog/6-gan-architectures

https://machinelearningmastery.com/what-are-generative-adversarial-networks-gans/

https://neurohive.io/en/tag/gan/

https://developers.google.com/machine-learning/gan/discriminator

https://developers.google.com/machine-learning/gan/training
