DeepLearning.ai Study Group II
Report of Week 2
Deep Learning Study Group II is a 16-week study group in which we cover an advanced deep learning study series for AI enthusiasts and computer engineers. We follow the materials on https://www.deeplearning.ai each week and get together on Saturdays to discuss them.
On December 8, we gathered at our sanctuary for the second week of DeepLearning.ai Study Group II and discussed the course titled “Neural Networks Basics”.
After an introductory first week, we were ready to dive into the concept of Neural Networks by further examining the model and studying its basic notions.
This week’s guide, Alara Dirik, began the session by explaining binary classification, the task of having a Neural Network decide which of two classes an image belongs to. Alara used the example of a cat image and went on to explain how to train a Neural Network to predict whether an image is, in fact, a cat image or not.
Alara proceeded to introduce logistic regression, an approach different from the linear regression covered in the previous week, for the participants who were unfamiliar with this kind of learning algorithm.
When it comes to binary classification, logistic regression gives better results when outputting a prediction, because with linear regression the output may be greater than one or even negative, which would make no sense as a probability. Forcing the output to fall between zero and one is very difficult with linear regression, so a learning algorithm such as logistic regression is more convenient for this specific task.
In short, while linear regression can help us estimate continuous variables, logistic regression is simply more effective when it comes to classifying data.
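The difference in output range can be seen in a small sketch. The raw scores below are made up for illustration, and the `sigmoid` helper is the standard logistic function rather than code taken from the session.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw scores, as a linear model w.x + b might produce them.
z = np.array([-4.0, -0.5, 0.0, 2.0, 7.0])

linear_output = z              # can be negative or greater than one
logistic_output = sigmoid(z)   # always strictly between zero and one

print("linear:  ", linear_output)
print("logistic:", logistic_output)
```

Only the second row can be read as the probability that an image is a cat image.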
To achieve the necessary results, we first need to train the parameters of the logistic regression model, which requires defining a cost function. After talking about the cost function, Alara explained how to implement a loss function, which shows how far we are from the optimum with our current parameters. “Thus,” Alara said, “the problem of detecting if the image is a cat or non-cat image turned into a more familiar one – finding a local optima for a given function.”
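As a rough sketch, and assuming the cross-entropy form commonly used with logistic regression (the report does not spell out the formula), the loss for a single example and the cost over a set of examples might look like this. The labels and predictions are invented for illustration.

```python
import numpy as np

def loss(y_hat, y):
    """Per-example cross-entropy loss (assumed form, not given in the report)."""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def cost(y_hat, y):
    """Cost: the average of the per-example losses."""
    return np.mean(loss(y_hat, y))

# Hypothetical labels (1 = cat, 0 = non-cat) and two sets of predictions.
y = np.array([1, 0, 1])
good_predictions = np.array([0.9, 0.1, 0.8])
bad_predictions = np.array([0.2, 0.7, 0.3])

print(cost(good_predictions, y))  # smaller: close to the labels
print(cost(bad_predictions, y))   # larger: far from the labels
```

The closer the predictions are to the true labels, the smaller the cost, which is what makes it a useful quantity to minimize.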
After showing the loss function, which measures how well our model is doing when making predictions, Alara introduced gradient descent, an optimization algorithm used to minimize the cost function. This algorithm, which repeatedly takes a step in the direction opposite to the gradient, ultimately brings us to a local minimum.
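The update rule is easiest to see on a toy problem. The sketch below minimizes a made-up one-dimensional function, f(w) = (w - 3)^2, instead of the logistic regression cost. The function, starting point, and learning rate are all illustrative assumptions.

```python
def f(w):
    """Toy function with its minimum at w = 3."""
    return (w - 3.0) ** 2

def gradient(w):
    """Derivative of f with respect to w."""
    return 2.0 * (w - 3.0)

w = 0.0               # arbitrary starting point
learning_rate = 0.1   # step size (assumed value)

for step in range(100):
    # Move a small step in the direction opposite to the gradient.
    w = w - learning_rate * gradient(w)

print(w)  # very close to 3.0
```

Each step shrinks the distance to the minimum by a constant factor here. With a learning rate that is too large, the steps would overshoot instead.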
Alara also showed how to implement this method by using computational graphs and how to speed up the gradient descent algorithm with a technique called vectorization. Vectorization uses matrix operations, as opposed to explicit for loops, when implementing the gradient descent algorithm. The NumPy package in Python runs these matrix operations in optimized native code that can exploit parallelism, allowing the model to be trained much faster.
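To make the contrast concrete, here is a minimal sketch that computes the logistic regression gradients twice: once with explicit for loops and once with NumPy matrix operations. It assumes the cross-entropy loss, for which the per-example error term is `a - y`, and stores one example per column of `X`. The function names and the random data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradients_loop(w, b, X, Y):
    """Gradients computed with explicit for loops over examples and features."""
    n, m = X.shape
    dw = np.zeros(n)
    db = 0.0
    for i in range(m):
        z = b
        for j in range(n):
            z += w[j] * X[j, i]
        dz = sigmoid(z) - Y[i]
        for j in range(n):
            dw[j] += X[j, i] * dz
        db += dz
    return dw / m, db / m

def gradients_vectorized(w, b, X, Y):
    """The same gradients expressed as matrix operations."""
    m = X.shape[1]
    A = sigmoid(w @ X + b)   # predictions for all m examples at once
    dZ = A - Y               # error term for every example
    return X @ dZ / m, np.sum(dZ) / m

# Random illustrative data: 4 features, 50 examples.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 50))
Y = (rng.random(50) > 0.5).astype(float)
w = rng.normal(size=4)
b = 0.1

dw_loop, db_loop = gradients_loop(w, b, X, Y)
dw_vec, db_vec = gradients_vectorized(w, b, X, Y)
print(np.allclose(dw_loop, dw_vec), np.isclose(db_loop, db_vec))
```

Both versions return the same numbers. The vectorized one is shorter and, on larger inputs, usually much faster.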
At the end of the session, our guide Alara and the participants collaboratively completed the “Python Basics with numpy” assignment, which covered basic Python and the numpy functions used while coding our model.
Alara also mentioned a group of sources she found helpful while trying to thoroughly comprehend the subject.
You can find them here:
- Logistic Regression and Other Types of Regression
- Logistic Regression Simplified
- Image Classification - Mario vs Wario
Next week, we will continue studying the subject by focusing on neural networks with a single hidden layer: Shallow Neural Networks.
Guide of the Week: ALARA DIRIK
Alara Dirik is a machine learning engineer and Master's student in Data Analytics at the University of Glasgow.
She is an avid learner with domain expertise in Internet of Things and Natural Language Processing.