Study Group II

Report of Week 13

Deep Learning Study Group II is a 16-week study group in which we cover an advanced deep learning study series for AI enthusiasts and computer engineers. We follow the materials each week and get together on Saturdays to discuss them.

On March 2, we came to inzva for the 13th week of the study group to talk about fun topics such as face recognition and neural style transfer under the guidance of Macit Giray Gökırmak.

Gökırmak began the session by introducing face verification: a one-to-one matching task in which the network compares an input image against a single claimed identity and decides whether they match. Face recognition, on the other hand, is a one-to-many matching task: given a pool made up of the identities of many people, the network must determine which identity, if any, the input face belongs to, or report it as unrecognized.
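The one-to-one versus one-to-many distinction can be sketched in a few lines of NumPy. The embeddings and threshold below are toy values standing in for the output of a trained face network, not a real system:

```python
import numpy as np

def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return float(np.linalg.norm(a - b))

def verify(query, claimed, threshold=0.5):
    """Face verification: one-to-one check against a claimed identity."""
    return distance(query, claimed) < threshold

def recognize(query, database, threshold=0.5):
    """Face recognition: one-to-many search; returns the best match or None."""
    best = min(database, key=lambda name: distance(query, database[name]))
    return best if distance(query, database[best]) < threshold else None

# Toy embeddings standing in for the output of a trained face network.
db = {"alice": np.array([0.1, 0.9]), "bob": np.array([0.8, 0.2])}
probe = np.array([0.12, 0.88])

print(verify(probe, db["alice"]))  # one-to-one: True
print(recognize(probe, db))        # one-to-many: alice
```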

Before going any further, Gökırmak explained a process called one-shot learning, which aims to train a network from very little data so that it can verify a face from a single example image - just like our brain.

After that, he proceeded to introduce Siamese Networks. The term Siamese Networks refers to architectures in which two or more neural networks share identical subnetworks, that is, the same parameters and weights. This architecture is applied to measure the similarity of two inputs and requires less data thanks to the shared parameters and weights. The idea is to pass both inputs through the shared subnetwork and compare the resulting outputs to produce a final score that tells us how similar the inputs are. With a binary classification head on top, these networks can tell us whether the two inputs belong to the same class or not.
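A minimal sketch of the idea, assuming a single linear layer as the shared subnetwork and a sigmoid-style score over embedding distance (a real Siamese network would use a deep, trained encoder):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))  # one shared weight matrix = the identical subnetwork

def embed(x):
    """Shared subnetwork: both inputs pass through the SAME parameters."""
    return np.tanh(W @ x)

def similarity(x1, x2):
    """Compare the two embeddings; score approaches 1 as inputs get closer."""
    d = np.linalg.norm(embed(x1) - embed(x2))
    return 1.0 / (1.0 + d)

a = rng.normal(size=8)
print(similarity(a, a))                   # identical inputs -> 1.0
print(similarity(a, rng.normal(size=8)))  # different inputs -> smaller score
```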

Gökırmak then mentioned a technique called Triplet Loss, a loss function used to train the network that updates the gradients by comparing an anchor image with both a positive and a negative sample. The loss is defined over three things: an anchor image, a positive image from the same class, and a negative image from a different class. It is important to note that whereas the distance between the positive image and the anchor image (defined as d(a,p)) must be low, the distance between the negative image and the anchor image (d(a,n)) must be high. Therefore the triplet loss can be written as L(a, p, n) = max(d(a,p) - d(a,n) + α, 0), where α is a margin that forces the negative to stay strictly farther from the anchor than the positive.
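The standard triplet loss formula can be sketched directly in NumPy; the 2-D points below are illustrative stand-ins for learned embeddings:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(d(a,p) - d(a,n) + margin, 0): pull positives in, push negatives out."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(d_ap - d_an + margin, 0.0)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])  # same identity, close to the anchor
n = np.array([1.0, 0.0])  # different identity, far away
print(triplet_loss(a, p, n))  # 0.0: the margin is already satisfied
```

When the loss is zero the triplet contributes no gradient, which is why training pipelines typically mine "hard" triplets that still violate the margin.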


In the second part of this week’s session, Gökırmak started to dive into one of the most popular topics in this area, neural style transfer, with which we can generate a new image by transferring a sample artwork’s artistic style onto our own content.

To implement this technique, we need three main components:

  • Content Image: The image we want to apply neural style transfer to.

  • Style Image: The artwork from which we want to transfer the style.

  • Generated Image: The output image, produced by blending the content image with the style of the style image.

Gökırmak then proceeded to talk about defining a cost function for the generated image and the importance of minimizing it to obtain the outcome we want. This cost has two parts: a content cost, which - to put it simply - tells us how similar the contents of the generated image and the content image are, and a style cost, which tells us how similar the generated image’s style is to the style image’s. Minimizing a weighted sum of the two makes the final outcome resemble the content image in terms of content and the style image in terms of style.
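A minimal sketch of this two-part cost, assuming the activations come from some fixed network layer and using the common Gram-matrix formulation of style; the weights alpha and beta and the random activations are illustrative only:

```python
import numpy as np

def content_cost(a_C, a_G):
    """How far the generated image's activations are from the content image's."""
    return float(np.mean((a_C - a_G) ** 2))

def gram(a):
    """Gram matrix: channel-to-channel correlations that capture 'style'."""
    flat = a.reshape(a.shape[0], -1)  # (channels, height * width)
    return flat @ flat.T

def style_cost(a_S, a_G):
    """How far the generated image's Gram matrix is from the style image's."""
    return float(np.mean((gram(a_S) - gram(a_G)) ** 2))

def total_cost(a_C, a_S, a_G, alpha=10.0, beta=40.0):
    """J(G) = alpha * J_content + beta * J_style, minimized over G."""
    return alpha * content_cost(a_C, a_G) + beta * style_cost(a_S, a_G)

rng = np.random.default_rng(0)
a_C = rng.normal(size=(3, 4, 4))  # fake activations for the content image
a_S = rng.normal(size=(3, 4, 4))  # fake activations for the style image
print(total_cost(a_C, a_S, a_C))  # only the style term remains when G = C
```

In a full implementation the generated image itself is updated by gradient descent on this cost, with activations taken from several layers of a pretrained network.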


Gökırmak also mentioned a few additional resources where we can study these topics further. You can find them below:

Next session, we will start following a new course named Sequence Models, the last course of our Study Group, and begin learning about Recurrent Neural Networks.


Macit Giray Gökırmak received his B.Sc. in Computer Engineering from Halic University in 2004 and his M.Sc. in Computer Engineering in 2012. He currently works as a machine learning engineer in the artificial intelligence tribe at Turkiye Is Bankasi.

His main fields of work are financial forecasting, computer vision (tracking/detection/recognition) and natural language processing (chatbots/classification of customer complaints etc.).


Subscribe to our newsletter here and stay tuned for more hacker-driven activities.

inzva is supported by BEV Foundation, an education foundation for the digital native generation which aims to build communities that foster peer-learning and encourage mastery through one-to-one mentorship.