Overview of Style Transfer (Deep Harmonization)

Last Updated : 16 Jul, 2020
Painting has long been one of the most prominent ways for people to express emotion and their understanding of the world around them. For example, a painting of a tiger takes its content from real-world tigers, but the style of texturing and colouring depends heavily on the creator.

What is Style Transfer in Neural Networks?

Suppose you have a photograph (P) captured on your phone, and you want to stylize it in the manner of an artwork (A). The process of taking the content of one image (P) and the style of another image (A) to generate an image (X) that matches the content of P and the style of A is called Style Transfer or Deep Harmonization. You cannot obtain X by simply overlaying A on P.

Architecture & Algorithm

Gatys et al. showed in 2015 that the content and style of an image can be separated, and hence that the content and style of different images can be combined. They used a convolutional neural network (CNN) called VGG-19 (VGG stands for Visual Geometry Group, the University of Oxford group that developed it), which is 19 layers deep (16 CONV layers and 3 FC layers). VGG-19 is pre-trained on the ImageNet dataset, which is maintained by the Stanford Vision Lab at Stanford University. Gatys et al. used average pooling in place of max pooling and did not use the FC layers.

Pooling is typically used to reduce the spatial size of feature maps, which in turn reduces the amount of computation. There are two types of pooling, max pooling and average pooling, as depicted below:
[Figure: MAX & AVG Pool]

[Figure: Pooling Process]
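The difference between the two pooling types can be illustrated with a minimal sketch in plain Python. The helper `pool2d` and the toy feature map `fmap` are hypothetical names chosen for this illustration, and the sketch assumes non-overlapping 2 x 2 windows; it is not the pooling code used inside VGG-19.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping size x size pooling over a 2-D list of numbers.

    Hypothetical helper for illustration only; rows or columns that do not
    fill a complete window are dropped.
    """
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            # Collect the values inside the current window.
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                out_row.append(max(window))                 # keep the strongest activation
            else:
                out_row.append(sum(window) / len(window))   # keep the mean activation
        pooled.append(out_row)
    return pooled


# Toy 4 x 4 feature map (made-up values).
fmap = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]

print(pool2d(fmap, mode="max"))  # [[7, 8], [9, 6]]
print(pool2d(fmap, mode="avg"))  # [[4.0, 5.0], [4.5, 3.0]]
```

Both variants shrink the 4 x 4 map to 2 x 2. Max pooling keeps only the largest value in each window, while average pooling blends all four values.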

Losses in Style Transfer:
  • Content Loss
    Let us select a hidden layer l of VGG-19 at which to calculate the content loss. Let p be the original (content) image and x the generated image, and let P^l and F^l denote their respective feature representations at layer l. The content loss is then defined as:
    L_{\text{content}}(p, x, l)=\frac{1}{2} \sum_{i j}\left(F_{i j}^{l}-P_{i j}^{l}\right)^{2}
  • Style Loss
    For this, we first have to calculate the Gram matrix. The correlation between two filters (channels) i and j at layer l is the dot product between their vectorized feature maps, where k runs over the positions in the feature map. The matrix thus obtained is called the Gram matrix (G):
    G_{i j}^{l}=\sum_{k} F_{i k}^{l} F_{j k}^{l}
    The style loss is the squared difference between the Gram matrix of the style image and the Gram matrix of the generated image.
  • Total Loss
    The total loss is defined by the formula below, where α and β are hyperparameters that weight the content loss against the style loss and are set as required.
    L_{\text {total}}(P, A, X)=\alpha \times L_{\text {content}}+\beta \times L_{\text {style}}
    The generated image X is, in theory, the image for which the total loss is smallest, so that the content loss and the style loss are both low. That means X matches the content of P and the style of A at the same time, which is the desired output.
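The three losses above can be sketched in plain Python for a single layer. This is a minimal illustration, not the article's or Gatys et al.'s implementation: the function names, the toy feature matrices and the default values of `alpha` and `beta` are all assumptions. Each matrix has one row per filter and one column per spatial position. The text above only says that the style loss is a squared difference of Gram matrices; the 1 / (4 N^2 M^2) scaling used here is the per-layer normalization from Gatys et al., with N filters and M positions.

```python
def content_loss(gen, content):
    """0.5 * sum_ij (F_ij - P_ij)^2 for one layer."""
    return 0.5 * sum((f - p) ** 2
                     for f_row, p_row in zip(gen, content)
                     for f, p in zip(f_row, p_row))


def gram_matrix(features):
    """G_ij = sum_k F_ik * F_jk; row i is the vectorized feature map of filter i."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def style_loss(gen, style):
    """Squared Gram-matrix difference for one layer, scaled by 1 / (4 N^2 M^2)."""
    n, m = len(gen), len(gen[0])  # N filters, M positions
    g, a = gram_matrix(gen), gram_matrix(style)
    squared = sum((gv - av) ** 2
                  for g_row, a_row in zip(g, a)
                  for gv, av in zip(g_row, a_row))
    return squared / (4 * n ** 2 * m ** 2)


def total_loss(gen, content, style, alpha=1.0, beta=1000.0):
    """alpha * L_content + beta * L_style (default weights are illustrative)."""
    return alpha * content_loss(gen, content) + beta * style_loss(gen, style)


# Toy features: 2 filters x 3 positions (made-up values).
gen_feats = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]]      # generated image x
content_feats = [[1.0, 1.0, 0.0], [0.0, 1.0, 2.0]]  # content image p
style_feats = [[2.0, 0.0, 0.0], [0.0, 0.0, 1.0]]    # style image a

print(gram_matrix(gen_feats))                  # [[5.0, 2.0], [2.0, 2.0]]
print(content_loss(gen_feats, content_feats))  # 1.0
print(total_loss(gen_feats, content_feats, style_feats))
```

In an actual style transfer run, the features would come from VGG-19 layers, the style loss would typically be summed over several layers, and the pixels of X would be updated by gradient descent on the total loss using a framework with automatic differentiation.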
Note: This is a very exciting new field, made possible by hardware optimizations, parallelism with CUDA (Compute Unified Device Architecture) and Intel's hyper-threading.

Code & Output

You can find the entire code, data files and outputs of Style Transfer in __CA__'s GitHub repo (bonus for sticking around: it has code for audio styling as well!).