AI Projects #3
AI Projects #3 came to its happy ending on November 9, 2019, as we concluded our three-month marathon with a grand showcase, to which we invited all the AI enthusiasts in our community as well as our supporting academics and members.
This batch, we were happy both to raise the number of presenting groups from six to eight and to regularly host three teams, two of them traveling from Ankara to attend the program, in Beykoz Kundura, where they held productive brainstorming sessions before each week's meetup and spent the night solving the problems that came up as their projects progressed.
You can see a recap of our showcase, which captures some of the best moments of our productive day.
#3 PROJECTS
ANOMALY DETECTION WITH UNSUPERVISED AND GENERATIVE MODELS
The first team to take the stage, one of the groups that became our regular visitors, consisted of Ali Akay, Onur Boyar, Oğuz Kaplan and Ramazan Yarar.
Ali Akay started his journey at inzva with the second edition of our DeepLearning.ai Study Group, moved on to Applied AI, and finally became a participant in our AI Projects.
Onur Boyar, a previous participant known for his dedicated work, also proved to be an excellent lead, running an admirably organized project process.
The newcomers, Oğuz Kaplan and Ramazan Yarar, were quick to become part of our AI community and welcome faces at our sleepovers.
The team tried to tackle the problem of anomaly detection with a data set consisting of cyberattacks. As their baseline, they chose to apply the LightGBM algorithm.
To improve on the baseline, they wanted to try auto-encoder, variational auto-encoder and generative adversarial network (GAN) methods. Their main challenge was the class imbalance in the data set: there were many "normal" cases and only a few anomalies. Therefore, precision and the area under the ROC curve were their primary metrics.
They achieved very good results with the auto-encoder model: while the variational auto-encoder and GAN methods achieved better accuracy, the best area under the ROC curve belonged to the auto-encoder. Its true positive rate was 100%, meaning the auto-encoder caught all of the anomalies, while also labeling some normal cases as anomalies. They presented all of these findings in a very nice presentation.
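The detection step behind this approach is simple: score each sample by its reconstruction error and flag everything above a cutoff chosen from the normal data. The sketch below illustrates the idea with simulated error scores; in the real project the errors would come from the trained auto-encoder, and the distributions and the 99th-percentile cutoff here are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated reconstruction errors: a trained auto-encoder reconstructs
# normal traffic well (low error), while attacks reconstruct poorly.
normal_err = rng.normal(0.10, 0.02, size=950)
anomaly_err = rng.normal(0.60, 0.10, size=50)
errors = np.concatenate([normal_err, anomaly_err])
labels = np.concatenate([np.zeros(950), np.ones(50)])  # 1 = anomaly

# Choose the cutoff from the normal errors only (99th percentile here):
threshold = np.percentile(normal_err, 99)
predicted = (errors > threshold).astype(int)

tpr = float(predicted[labels == 1].mean())  # recall on the anomalies
fpr = float(predicted[labels == 0].mean())  # normal cases wrongly flagged
print(f"threshold={threshold:.3f}, TPR={tpr:.2f}, FPR={fpr:.2f}")
```

With well-separated error distributions the cutoff catches every anomaly at the cost of flagging a small fraction of normal traffic, which matches the trade-off the team reported.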
We were also happy to hear that the project will go on and one of the team members, Oğuz Kaplan, will use the knowledge gained here as a part of his master’s thesis!
GOAL-ORIENTED CHATBOT USING REINFORCEMENT LEARNING
A project which proved that distance is no obstacle to efficient teamwork: our very own Chatbot team had four members, two from Ankara and two from Istanbul, who worked toward a common goal under the guidance of our former Vision Cohort attendee Semih Yağcıoğlu, a Ph.D. candidate from Hacettepe University in Ankara.
The rest of the team also stood out for their hard work and enthusiasm: an undergrad from the same university, Sercan Amaç, exceeded expectations with his discipline and helpfulness.
The other two newcomers to our community, Cemil Güney and Efehan Danışman, came to every meetup to offer valuable advice to other teams and to represent their Ankara-based teammates, who could not attend every meetup due to the distance, so their absence was never felt during the project.
The team aimed to build a comprehensive natural language processing (NLP) project. Their idea was a goal-oriented chatbot with three interconnected components: a natural language understanding (NLU) module, a dialogue state tracker (DST) module, and a natural language generation (NLG) module.
The purpose of the NLU module was to understand the user input. For example, if the user typed "find me an Italian restaurant", the NLU module's job would be to understand that the user wanted to make a reservation at an Italian restaurant and to output user dialogue acts (information) to the DST. The DST would then keep track of the state of the dialogue and pass an abstract output, such as "ask budget", to the NLG module. Finally, the NLG module would take this input and generate a well-formed sentence asking the user about their budget.
All three of these models were implemented using pre-trained deep neural networks. The team declared they wanted to continue their project and as their final goal, they hoped to implement a Telegram bot.
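The data flow between the three modules can be sketched with toy rule-based stand-ins. The team's actual modules were pre-trained neural networks; the intent, slot names and sentences below are entirely hypothetical and only show how the pipeline hands information along.

```python
def nlu(utterance):
    """Toy NLU: map raw text to dialogue acts (an intent plus slots)."""
    acts = {"intent": "find_restaurant", "slots": {}}
    if "italian" in utterance.lower():
        acts["slots"]["cuisine"] = "italian"
    return acts

def dst(state, acts):
    """Toy DST: fold the new dialogue acts into the tracked dialogue state."""
    new_state = dict(state)
    new_state.update(acts["slots"])
    return new_state

def policy_and_nlg(state):
    """Toy policy + NLG: pick an abstract action and render it as a sentence."""
    if "budget" not in state:  # abstract action: "ask budget"
        return "What is your budget for the restaurant?"
    return f"Booking an {state['cuisine']} restaurant within {state['budget']}."

state = {}
acts = nlu("find me an Italian restaurant")
state = dst(state, acts)
reply = policy_and_nlg(state)
print(reply)  # the tracked state has no budget yet, so the bot asks for one
```

Replacing each of the three functions with a learned model, as the team did, leaves this overall control flow unchanged.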
You can check out their GitHub here and their presentation over here.
IMAGE-TO-IMAGE (SKETCH-TO-PHOTOGRAPH) TRANSLATION USING SKETCHYGAN
Having graduated from the "junior position" they held in the previous batch of AI Projects, Boğaziçi University undergraduates Cemre Efe Karakaş and Eylül Yalçınkaya, together with Burak Bekçi, an Istanbul Technical University undergraduate and previous participant in our Applied AI Study Group, gave one of the most acclaimed presentations in AI Projects history as they showcased a very fun application of GANs, turning sketches of their own faces into photographs.
The purpose of this project was to convert a sketch into an actual image. A GAN architecture was used, since this is an image-to-image translation problem. The team first wanted to replicate the results of their reference paper, SketchyGAN [1]. This special GAN architecture contains the masked residual unit (MRU) as a novel component: apart from the traditional convolutions and the features obtained from them, the MRU layer takes downsampled and upsampled versions of the actual images as input during the encoding and decoding parts of the generator. Beyond this novel architecture, SketchyGAN also uses extra loss terms added to the original GAN loss, such as a perceptual loss and a diversity loss. The team showed their results by also presenting their own sketches translated into actual images.
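Adding such extra terms to the generator objective amounts to a weighted sum of losses. The exact formulations and weights are in the SketchyGAN paper; the sketch below uses simplified stand-ins (an L1 perceptual distance, a pairwise diversity term, and illustrative weights of our own choosing) just to show the composition.

```python
import numpy as np

def gan_loss(d_fake):
    """Non-saturating generator loss: push D's scores for fakes toward 1."""
    return float(-np.mean(np.log(d_fake + 1e-8)))

def perceptual_loss(feat_fake, feat_real):
    """L1 distance between feature maps of generated and real images."""
    return float(np.mean(np.abs(feat_fake - feat_real)))

def diversity_loss(images):
    """Reward varied outputs: negative mean pairwise L1 distance in a batch."""
    return float(-np.mean(np.abs(images[:, None] - images[None, :])))

rng = np.random.default_rng(0)
d_fake = rng.uniform(0.5, 0.9, size=8)       # discriminator scores for fakes
feat_fake, feat_real = rng.random((8, 16)), rng.random((8, 16))
images = rng.random((8, 16))                 # flattened generated "images"

lam_p, lam_d = 10.0, 1.0                     # illustrative weights
total = (gan_loss(d_fake)
         + lam_p * perceptual_loss(feat_fake, feat_real)
         + lam_d * diversity_loss(images))
print(f"total generator loss: {total:.3f}")
```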
Check out the full presentation and their code on GitHub.
CONVERTING RECIPE VIDEOS TO TEXT USING COMPUTER VISION
This batch of AI Projects had another unique aspect, as we welcomed two teams from Ankara who were working on their graduation projects to complete their undergraduate degrees.
The first team to join us to complete part of their graduation project was suggested by one of our supporting academics, Pınar Duygulu Şahin from Hacettepe University.
The team, consisting of Meltem Tokgöz, Fatmanur Turhan, Sevda Sayan and Furkan Çağlar Gülmez, stole our hearts with the number of women engineers on it, since here at inzva we strongly advocate for women in engineering, regardless of borders.
In this project, the team wanted to recognize the objects and actions in recipe videos and convert them to text-based recipes. The first step of the project therefore had two parts: action recognition and object recognition. Since this is also the team's graduation project, during the limited time of the AI Projects programme they focused on the action recognition part. They used a kitchen video data set for action recognition, along with pre-trained noun and verb classification models, to get their preliminary results. At the showcase, they presented their action recognition results.
FLYING-ASSISTANT: VISUAL HELPER FOR VISUALLY IMPAIRED PEOPLE
Students of Aykut Erdem, the members of the second graduation-project team were also very dear to us, as they worked on a social-good project aimed at improving the quality of life of visually impaired people.
The audience's appreciation grew with each passing moment as the team members, Yunus Emre Özköse, Enes Furkan Çiğdem and Furkan Çağlayan, not only gave a well-structured presentation that stated their message upfront but also demonstrated how their flying assistant works in a live show with the drone they brought.
The team divided the project into three parts: depth estimation, object detection, and visual SLAM, with each member focusing on a different part. For depth estimation, they used a pre-trained self-supervised monocular depth estimation model, and the YOLOv3 architecture with a pre-trained model was used for object detection. After training and testing all three models, they implemented an integrated system consisting of all three modules. Since this is their graduation project, a navigation system, a helper Android application, calibration of the individual modules, and better drone integration are still on their project schedule. They also want to improve their frame rate by tuning the models, to shorten the system's reaction time. See the presentation here.
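One way such an integration step might look is combining each frame's detections with the estimated depth map to find the closest obstacle. This is a toy sketch with synthetic data; the function, the bounding-box format and the median-depth rule are our own assumptions, not the team's code.

```python
import numpy as np

def nearest_obstacle(detections, depth_map):
    """Return (label, depth) of the closest detected object, by median depth
    over each bounding box (x0, y0, x1, y1) in pixel coordinates."""
    best = None
    for label, (x0, y0, x1, y1) in detections:
        d = float(np.median(depth_map[y0:y1, x0:x1]))
        if best is None or d < best[1]:
            best = (label, d)
    return best

# Synthetic frame: everything 10 m away except one nearby region.
depth = np.full((120, 160), 10.0)
depth[40:80, 60:100] = 1.5
dets = [("person", (60, 40, 100, 80)), ("chair", (0, 0, 30, 30))]
label, dist = nearest_obstacle(dets, depth)
print(label, dist)  # the person region is the nearest obstacle
```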
UNPAIRED IMAGE-TO-IMAGE TRANSLATION USING CYCLEGAN
As can be seen from the number of projects tackling their various implementations, generative adversarial networks (GANs) have become very popular in recent years, in both theoretical research and applications, and image-to-image translation is only one of the many applications that GANs have greatly improved.
A team of two undergrads, Kayacan Vesek and Artun Akdoğan, showed us another fun way to use GANs: simply put, they tried to create their own Instagram filter.
In this project, Vesek and Akdoğan replicated the results of the CycleGAN paper [2]. They trained CycleGAN on a data set of aligned and cropped celebrity faces with many labeled attributes, such as hair color, glasses, hats, and so on. They then fed the trained model their own photographs taken with glasses on, and the network converted them into photographs without glasses. Vesek and Akdoğan presented their results at the showcase and showed their synthetic photographs to the audience; you can find more in their presentation.
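CycleGAN's key training signal, the cycle-consistency loss, can be illustrated with toy one-line "generators". The real G and F are deep convolutional networks; these stand-ins only demonstrate the there-and-back idea behind unpaired translation.

```python
import numpy as np

def cycle_consistency_loss(x, G, F):
    """L1 cycle loss: translating there and back should recover the input."""
    return float(np.mean(np.abs(F(G(x)) - x)))

# Toy stand-ins for the two generators (the real G and F are deep CNNs):
G = lambda x: x + 1.0   # e.g. the "add glasses" direction
F = lambda x: x - 1.0   # e.g. the "remove glasses" direction

x = np.ones((4, 4))
loss = cycle_consistency_loss(x, G, F)
print(loss)  # 0.0 for a perfectly consistent generator pair
```

Minimizing this loss in both directions is what lets CycleGAN learn from unpaired image collections.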
HAND POSE ESTIMATION AND ITS APPLICATIONS
inzva member Ahmet Melek led another team this batch by turning the graduation assignment he gave at the end of the Applied AI Study Group he led last July into a complete project for AI Projects #3.
This project team also included Can Bulguoğlu, who led not one but two teams in our second batch, and İrem Zırhlıoğlu, one of the most appreciated female role models in our community. Originally a student of Chemical Engineering, İrem has improved her technical skills tremendously through discipline and self-motivation since we first met her back in January, in the second edition of our DeepLearning.ai Study Group.
The last member of the team was Cemil Güney, who took part in this project as his second one and showed a dedicated effort in both.
In this project, Melek and his team trained a neural network regression model to predict hand joint points from a given 2D camera input. They also tried to train a model to perform gesture recognition on the predicted hand joint points for human-computer interaction, such as using gestures to control electronic devices. While their trained network was not able to classify the gestures correctly, they solved this problem by implementing a rule-based system that classifies some simple gestures.
Finally, they put these modules together to build a mouse-control system using laptop webcams. They gave an impressive demonstration at the showcase, controlling the mouse with hand movements and clicking with hand gestures.
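A rule-based gesture classifier of this kind can be sketched in a few lines. The joint names, the pinch-for-click rule and the 0.05 threshold below are our own illustrative assumptions, not the team's actual rules.

```python
import numpy as np

def classify_gesture(joints):
    """Classify from predicted 2D joint points (normalized image
    coordinates): a pinch (thumb tip close to index tip) counts as a click."""
    thumb = np.array(joints["thumb_tip"])
    index = np.array(joints["index_tip"])
    if np.linalg.norm(thumb - index) < 0.05:   # illustrative threshold
        return "click"
    return "move"

open_hand = {"thumb_tip": (0.30, 0.50), "index_tip": (0.60, 0.20)}
pinch = {"thumb_tip": (0.42, 0.40), "index_tip": (0.44, 0.41)}
print(classify_gesture(open_hand), classify_gesture(pinch))  # move click
```

A few hand-written distance and angle rules like this are often more robust than a small learned classifier when training data is scarce, which matches the team's experience.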
They also plan to continue their work as you can see in their presentation. So make sure to follow the project on Github!
SOLVING COMBINATORIAL OPTIMIZATION PROBLEMS WITH REINFORCEMENT LEARNING
Furkan Gürsoy, a previous participant of AI Projects #2, and Batuhan Koyuncu, who joined our community in July during the Applied AI Study Group, were the last to take the stage as we concluded our showcase.
Furkan Gürsoy and Batuhan Koyuncu chose a challenging and relatively novel problem: they wanted to replicate the results of a paper that solves graph-based combinatorial optimization problems with attention [3]. They focused on one particular combinatorial optimization problem, the famous traveling salesman problem (TSP). Their experiments on TSP-20 and TSP-50 networks, containing 20 and 50 nodes respectively, showed that the approach compares very well with the best heuristics. To put it differently, their solutions were very close to those achieved by state-of-the-art heuristics.
The novelty of their work was generating synthetic data from different distributions, such as the normal and exponential distributions, to train and test the attention network. They finally presented their results on a real-world traveling salesman problem over the capital cities of the US states. You can take a look at their presentation for a more detailed explanation.
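To make the heuristic comparison concrete, the sketch below scores a simple nearest-neighbour heuristic against a random tour on a synthetic TSP-20 instance. This is a baseline illustration only; the paper's attention model and the team's actual experiments are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
cities = rng.random((20, 2))          # a synthetic TSP-20 instance

def tour_length(order, pts):
    """Total length of the closed tour visiting pts in the given order."""
    p = pts[order]
    return float(np.linalg.norm(p - np.roll(p, -1, axis=0), axis=1).sum())

def nearest_neighbour(pts):
    """Greedy heuristic: always move to the closest unvisited city."""
    unvisited = set(range(len(pts)))
    order = [0]
    unvisited.remove(0)
    while unvisited:
        last = pts[order[-1]]
        nxt = min(unvisited, key=lambda i: float(np.linalg.norm(pts[i] - last)))
        order.append(nxt)
        unvisited.remove(nxt)
    return order

nn = nearest_neighbour(cities)
random_order = list(rng.permutation(20))
print(f"NN tour: {tour_length(nn, cities):.3f}, "
      f"random tour: {tour_length(random_order, cities):.3f}")
```

A learned solver like the attention network is evaluated the same way: by the gap between its tour lengths and those of strong heuristics.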
This batch of AI Projects came with its own unique traits and offered both us and the participants fresh experiences, as we had sleepovers in our home Beykoz Kundura and inter-city projects that overcame the distance.
But as always, a program is only successful if it fulfills its main objective, which, in this case, is having fun through free research with like-minded people!
Our AI Projects will be accepting new applications for the first batch of 2020 starting January 15!
Thanks to Microsoft for providing our teams with the GPUs they needed throughout the program.
inzva is supported by BEV Foundation, an education foundation for the digital native generation which aims to build communities that foster peer-learning and encourage mastery through one-to-one mentorship.
Subscribe to our newsletter here and stay tuned for more hacker-driven activities.
References
[1] W. Chen and J. Hays. "SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis." arXiv:1801.02753 (2018).
[2] J.-Y. Zhu, et al. "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks." Proceedings of the IEEE International Conference on Computer Vision (2017).
[3] W. Kool, H. van Hoof, and M. Welling. "Attention, Learn to Solve Routing Problems!" arXiv:1803.08475 (2018).