DCGAN Example with Keras and TensorFlow

In this post, we’re going to delve into the inner workings of a deep convolutional generative adversarial network (DCGAN) with Keras. Since the TensorFlow library already includes Keras, we’ll use some of its preprocessing functions as well.

Furthermore, the model we’re going to demonstrate is an image synthesis model. In other words, we’re going to feed it a dataset of images and it will learn to produce similar ones.

I should also mention that, for demonstration purposes, I’m going to resize the images to 128px width and 128px height. The reason is that the training process then doesn’t take forever, while still producing promising results.

About the dataset

We’ll be using a dataset of anime faces, where each image is 512px wide and 512px tall. They are also all in .png format, and since there are over 25 thousand of them, they take up quite a bit of space.

In case you’re concerned about saving space on your disk, I would recommend converting each image into .jpg format and deleting the original.
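
If you do go that route, a minimal sketch of the conversion could look like the one below. It assumes the extracted .png files sit in a single folder (e.g. datasets/anime/out2, which is where the download step further down puts them); also keep in mind that the loading code later in this post globs for '*.png' and re-saves each file as .jpg itself, so you’d need to adjust it accordingly if you delete the originals up front.

from glob import glob
import os
from PIL import Image

# convert every .png in the folder to .jpg and delete the original to save disk space
for png_path in glob('datasets/anime/out2/*.png'):
    jpg_path = os.path.splitext(png_path)[0] + '.jpg'
    Image.open(png_path).convert('RGB').save(jpg_path)
    os.remove(png_path)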

Coding a DCGAN with TensorFlow and Keras

First things first, we need to import all the necessary libraries, including tensorflow and keras.

import tensorflow as tf
from glob import glob
import imageio
import numpy as np
import os
from tensorflow.keras import layers
from tqdm import tqdm
from kaggle.api.kaggle_api_extended import KaggleApi
from PIL import ImageFile, Image

In addition, we need to set the ImageFile option for loading truncated images to True, so PIL doesn’t fail when it encounters a truncated file.

# for importing images properly
ImageFile.LOAD_TRUNCATED_IMAGES = True

Next, we need to authenticate our connection to the Kaggle API, so we’ll be able to download our dataset with it.

After that, we can go ahead and call a function that will download the dataset to a local directory. We also set the unzip argument to True, so it automatically extracts the downloaded images.

# authenticate Kaggle API connection
api = KaggleApi()
api.authenticate()

# download dataset from https://www.kaggle.com/datasets/prasoonkottarathil/gananime-lite
api.dataset_download_files(
    'prasoonkottarathil/gananime-lite',
    path='datasets/anime',
    unzip=True
)
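
By default, the Kaggle client reads your credentials from ~/.kaggle/kaggle.json. As far as I know, it also accepts them from environment variables, so if you prefer not to keep a file around, something along these lines should work (placeholder values; they need to be set before the kaggle import and authentication above run):

import os

# alternative to ~/.kaggle/kaggle.json: pass credentials via environment variables
# (placeholder values; set these before the kaggle import / api.authenticate() call)
os.environ['KAGGLE_USERNAME'] = 'your_username'
os.environ['KAGGLE_KEY'] = 'your_api_key'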

Before we begin preprocessing the dataset, we’re going to set up a couple of hyperparameters. We’ll use these throughout the whole process, from data preprocessing to training the finished model.

# set hyperparameters
BATCH_SIZE = 32
EPOCHS = 45
NOISE_DIM = 256
SAMPLES_TO_GENERATE = 16
IMG_SIZE = 128
AUTOTUNE = tf.data.AUTOTUNE

Data preprocessing

Now, we’re ready to define two functions that will load the images, preprocess them, and store them in a tf.data.Dataset.

# define function to load an image and preprocess it
def load_image(path):
    image_png = Image.open(path)

    jpg_path = os.path.splitext(path)[0] + '.jpg'
    image_png.save(jpg_path)
    image_jpg = imageio.imread(jpg_path)
    image = tf.image.resize(image_jpg, (IMG_SIZE, IMG_SIZE))
    image = tf.cast(image, tf.float32)
    image = (image - 127.5) / 127.5

    return image

# define function for creating a dataset
def load_dataset(root_path):
    samples = glob(os.path.join(root_path, '*.png'))
    images = np.empty(shape=(len(samples), IMG_SIZE, IMG_SIZE, 3), dtype=np.float32)

    for idx in tqdm(range(len(samples))):
        image = load_image(samples[idx])
        images[idx] = image

    # shuffle, batch and prefetch (using the AUTOTUNE setting from our hyperparameters)
    images = tf.data.Dataset.from_tensor_slices(images).shuffle(len(samples)).batch(BATCH_SIZE).prefetch(AUTOTUNE)
    
    return images

To explain what exactly happens with each image:

  • load the .png image with the Image class
  • convert it to .jpg format and save it
  • load the .jpg image back in
  • resize the image to 128×128 pixels
  • convert the uint8 pixel values into float values
  • normalize these values into the range [-1, 1]

We repeat this process for each image and finally batch them all into a dataset, making it ready for training the model.
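
As a quick sanity check of the normalization step, you can run it on a few hand-picked pixel values; 0, 127.5 and 255 should map to -1, 0 and 1 respectively:

# sanity check: pixel values 0, 127.5 and 255 map to -1, 0 and 1
sample = tf.constant([0.0, 127.5, 255.0])
print(((sample - 127.5) / 127.5).numpy())  # [-1.  0.  1.]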

Define the Generator model of DCGAN with Keras

Next, we need to define each DCGAN component separately, so we’ll begin with the Generator here. Additionally, each component has its own loss function, so we define that as well.

# define Generator part of DCGAN
class Generator(tf.keras.Model):
    def __init__(self):
        super(Generator, self).__init__()

        self.model = tf.keras.Sequential()
        
        self.model.add(layers.Dense(16 * 16 * 256, use_bias=False, input_shape=(NOISE_DIM,)))
        self.model.add(layers.BatchNormalization())
        self.model.add(layers.LeakyReLU())
        self.model.add(layers.Reshape((16, 16, 256)))
        assert self.model.output_shape == (None, 16, 16, 256)
        self.model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
        assert self.model.output_shape == (None, 16, 16, 128)
        self.model.add(layers.BatchNormalization())
        self.model.add(layers.LeakyReLU())
        self.model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
        assert self.model.output_shape == (None, 32, 32, 64)
        self.model.add(layers.BatchNormalization())
        self.model.add(layers.LeakyReLU())
        self.model.add(layers.Conv2DTranspose(32, (5, 5), strides=(2, 2), padding='same', use_bias=False))
        assert self.model.output_shape == (None, 64, 64, 32)
        self.model.add(layers.BatchNormalization())
        self.model.add(layers.LeakyReLU())
        self.model.add(layers.Conv2DTranspose(3, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
        assert self.model.output_shape == (None, IMG_SIZE, IMG_SIZE, 3)

        self.optimizer = tf.keras.optimizers.Adam(1e-4)
    
    def call(self, x):
        x = self.model(x)

        return x
    
    def loss(self, fake_output):
        cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
        loss = cross_entropy(tf.ones_like(fake_output), fake_output)

        return loss
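
Before moving on, it’s worth sanity-checking that the stack of transposed convolutions really ends at 128×128×3. Running a single random noise vector through an untrained Generator should print (1, 128, 128, 3):

# quick shape check with untrained weights and a single noise vector
generator = Generator()
noise = tf.random.normal([1, NOISE_DIM])
print(generator(noise).shape)  # (1, 128, 128, 3)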

Define the Discriminator model of DCGAN with Keras

The Discriminator is the other side of the coin: it is responsible for determining whether the Generator produces convincing images or not.

# define Discriminator part of DCGAN 
class Discriminator(tf.keras.Model):
    def __init__(self, conv_dim=32):
        super(Discriminator, self).__init__()

        self.model = tf.keras.Sequential([
            layers.Conv2D(conv_dim, 5, 2, padding='same', use_bias=False, input_shape=[IMG_SIZE, IMG_SIZE, 3]),
            layers.LeakyReLU(0.2),
            layers.Dropout(0.3),
            layers.Conv2D(conv_dim * 2, 5, 2, padding='same', use_bias=False),
            layers.LeakyReLU(0.2),
            layers.Dropout(0.3),
            layers.Conv2D(conv_dim * 4, 5, 2, padding='same', use_bias=False),
            layers.LeakyReLU(0.2),
            layers.Dropout(0.3),
            layers.Conv2D(conv_dim * 8, 5, 2, padding='same', use_bias=False),
            layers.LeakyReLU(0.2),
            layers.Dropout(0.3),
            layers.Flatten(),
            layers.Dense(1)
        ])

        self.optimizer = tf.keras.optimizers.Adam(1e-4)
    
    def call(self, x):
        x = self.model(x)

        return x
    
    def loss(self, real_output, fake_output):
        cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
        real_loss = cross_entropy(tf.ones_like(real_output), real_output)
        fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
        loss = real_loss + fake_loss

        return loss
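
The same kind of quick check works here: feeding the (still untrained) generator output from the previous snippet through the discriminator should yield one raw logit per image, i.e. a shape of (1, 1):

# quick check: one logit per input image (reuses generator and noise from the previous snippet)
discriminator = Discriminator()
fake_image = generator(noise, training=False)
print(discriminator(fake_image, training=False).shape)  # (1, 1)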

Define the DCGAN model with Keras

Now, to put both components together, we instantiate the two models we’ve already defined. We also need to define custom training functions, which will train both models simultaneously.

# put it all together    
class DCGAN(tf.keras.Model):
    def __init__(self, batch_size=BATCH_SIZE, noise_dim=NOISE_DIM, samples_to_generate=SAMPLES_TO_GENERATE):
        super().__init__()

        self.batch_size = batch_size
        self.noise_dim = noise_dim
        self.samples_to_generate = samples_to_generate

        self.generator = Generator()
        self.discriminator = Discriminator()

        self.seed = tf.random.normal([samples_to_generate, noise_dim])

        self.checkpoint_dir = 'gan checkpoints'
        self.checkpoint_prefix = os.path.join(self.checkpoint_dir, 'anime_ckpt')
        self.checkpoint = tf.train.Checkpoint(
            generator_optimizer=self.generator.optimizer,
            discriminator_optimizer=self.discriminator.optimizer,
            generator=self.generator,
            discriminator=self.discriminator
        )

    # training step that adjusts weights of both models simultaneously
    def train_step(self, images):
        noise = tf.random.normal([self.batch_size, self.noise_dim])

        with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
            generated_images = self.generator(noise, training=True)

            real_output = self.discriminator(images, training=True)
            fake_output = self.discriminator(generated_images, training=True)

            gen_loss = self.generator.loss(fake_output)
            disc_loss = self.discriminator.loss(real_output, fake_output)

        # compute gradients outside the tape context, then apply them to each model's weights
        gen_gradients = gen_tape.gradient(gen_loss, self.generator.trainable_variables)
        disc_gradients = disc_tape.gradient(disc_loss, self.discriminator.trainable_variables)

        self.generator.optimizer.apply_gradients(zip(gen_gradients, self.generator.trainable_variables))
        self.discriminator.optimizer.apply_gradients(zip(disc_gradients, self.discriminator.trainable_variables))
    
    # iterate training steps for entire dataset multiple times
    def train(self, dataset, epochs):
        # restore existing weights
        self.checkpoint.restore(tf.train.latest_checkpoint(self.checkpoint_dir))
        for epoch in tqdm(range(epochs)):
            for batch in dataset:
                self.train_step(batch)
            
            # save weights after every 15th epoch
            if (epoch + 1) % 15 == 0:
                self.checkpoint.save(file_prefix=self.checkpoint_prefix)

        # generate examples and save them
        os.makedirs('gan generated images', exist_ok=True)  # create the output directory if it doesn't exist
        generated_samples = self.generator(self.seed, training=False)
        for i in range(generated_samples.shape[0]):
            image = np.array(generated_samples[i, :, :, :] * 127.5 + 127.5)
            image = Image.fromarray(image.astype(np.uint8))
            image = image.convert('RGB')
            image.save(os.path.join('gan generated images', 'epoch_{:04d}_sample_{:04d}.jpg'.format(epochs, i)))

Train the model

Now all that’s left is to preprocess the dataset with the function we defined, instantiate the DCGAN model, and train it.

# load data, create a GAN model, and train it  
data = load_dataset('datasets/anime/out2')
gan = DCGAN()
gan.train(data, EPOCHS)
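
Once a checkpoint has been saved, you don’t have to retrain to get new samples. A short sketch of restoring the latest checkpoint (from the 'gan checkpoints' directory set up above) and generating a fresh batch could look like this:

# restore the latest saved weights and sample new images without retraining
gan = DCGAN()
gan.checkpoint.restore(tf.train.latest_checkpoint('gan checkpoints'))
noise = tf.random.normal([SAMPLES_TO_GENERATE, NOISE_DIM])
samples = gan.generator(noise, training=False)  # values in [-1, 1], shape (16, 128, 128, 3)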

Conclusion

To conclude, we demonstrated how to code a simple DCGAN model with the Keras library and how to train it on a dataset of anime faces.

I hope this article helped you gain a better understanding of GAN models and how to implement them in practice.
