Deep learning can feel like trying to find your path on a snowy mountain. Having solid principles makes you more confident in making decisions. We have all been there.


In the previous post, we thoroughly introduced and inspected all the aspects of the LSTM cell. One may argue that RNN approaches are obsolete and there is no point in studying them. It is true that a more recent category of methods called Transformers [5] has come to dominate natural language processing. However, deep learning never ceases to surprise me, RNNs included. …

Nowadays, transfer learning from ImageNet is the absolute standard in computer vision. Self-supervised learning dominates natural language processing, but this does not mean there are no significant computer vision use cases where it should be considered.

There are indeed a lot of cool self-supervised tasks that one can devise when one deals with images, such as jigsaw puzzles [6], image colorization, image inpainting, or even unsupervised image synthesis.
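To make the jigsaw-puzzle idea concrete, here is a minimal sketch of how one unlabeled image becomes a self-supervised training pair: the image is cut into a grid of tiles, the tiles are shuffled, and the permutation index serves as a free label for the network to predict. The function name and grid size are illustrative, not from [6].

```python
import numpy as np

def jigsaw_sample(image, grid=3, rng=None):
    """Turn one unlabeled image into a (shuffled_tiles, permutation) pair.
    The permutation acts as a free 'label' for self-supervision.
    image: (H, W) array with H and W divisible by `grid`."""
    rng = np.random.default_rng(rng)
    h, w = image.shape
    th, tw = h // grid, w // grid
    # Cut the image into grid*grid non-overlapping tiles, row by row.
    tiles = [image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
             for r in range(grid) for c in range(grid)]
    # Shuffle them with a known permutation; the model must recover it.
    perm = rng.permutation(grid * grid)
    shuffled = [tiles[i] for i in perm]
    return np.stack(shuffled), perm

img = np.arange(36.0).reshape(6, 6)   # toy 6x6 "image"
tiles, perm = jigsaw_sample(img, grid=3, rng=0)
```

In practice the tiles would be fed through a shared backbone and a classifier over a fixed set of permutations, but the pretext-data construction above is the core of the trick.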

But what happens when the time dimension comes into play? How can you approach the video-based tasks that you would like to solve?

So, let’s start from the beginning, one concept at…

Getting Started

In a previous post, we discussed 2K image-to-image translation, video to video synthesis, and large-scale class-conditional image generation. Namely, pix2pixHD, vid-to-vid, and BigGAN.

But how far are we from generating realistic style-based images?

To this end, in this part, we will focus on style incorporation via adaptive instance normalization. To do so, we will revisit concepts of in-layer normalization that will prove quite useful in our understanding of GANs.
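As a taste of what adaptive instance normalization (AdaIN) does, here is a minimal NumPy sketch: each channel of the content features is normalized to zero mean and unit variance, then rescaled with the style features' channel-wise statistics. This is a simplified illustration, not the full StyleGAN layer (which derives the scale and shift from a learned mapping of the latent code).

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization: normalize each channel of the
    content feature map, then re-style it with the style's per-channel
    mean and std. Arrays have shape (channels, height, width)."""
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    normalized = (content - c_mean) / (c_std + eps)  # zero mean, unit std
    return s_std * normalized + s_mean               # adopt style stats

rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(3, 8, 8))
style = rng.normal(2.0, 0.5, size=(3, 8, 8))
out = adain(content, style)
```

The key property is that the output's per-channel statistics now match the style input's, which is exactly how style is "incorporated" in a single normalization step.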

No style, no party!

StyleGAN (A Style-Based Generator Architecture for Generative Adversarial Networks 2018)

Building on our understanding of GANs, instead of just generating images, we will now be able to control their style! How cool is that? But, wait a minute. …

Wasserstein distance, boundary equilibrium and progressively growing GAN

GANs dominate deep learning tasks such as image generation and image translation.

In the previous post, we reached the point of understanding unpaired image-to-image translation. Nevertheless, there are some really important concepts you have to understand before implementing your own super cool deep GAN model.

In this part, we will take a look at some foundational works. We will see the most common GAN distance function and why it works. Then, we will perceive the training of GANs as an attempt to find the equilibrium of a two-player game. …
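To preview the two-player framing, here is a minimal sketch of the Wasserstein (earth mover's) objective as used in WGAN: the critic tries to maximize the gap between its scores on real and fake samples, while the generator tries to close it. Function names are illustrative; a full implementation would also enforce the critic's Lipschitz constraint (weight clipping or a gradient penalty).

```python
import numpy as np

def critic_loss(critic_real, critic_fake):
    # The critic maximizes E[D(real)] - E[D(fake)];
    # written as a loss to be minimized.
    return -(np.mean(critic_real) - np.mean(critic_fake))

def generator_loss(critic_fake):
    # The generator maximizes E[D(fake)], i.e. minimizes -E[D(fake)].
    return -np.mean(critic_fake)
```

At the equilibrium of this game, the critic can no longer separate real from generated samples, which is precisely the training goal.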

Nikolas Adaloglou

AI Research Engineer
