8 Awesome Data Science Capstone Projects from Praxis Business School
It is not the strongest or the most intelligent who will survive but those who can best manage change.
Evolution is the only way anything can survive in this universe. And when it comes to industry-relevant education in a fast-evolving domain like Machine Learning and Artificial Intelligence, you must evolve or you will simply perish over time.
I have personally experienced this first hand while building Analytics Vidhya. It still amazes me to see where we started and where we are today. During this period, there have been several ups and downs, several product launches, product re-launches and what not! But one thing has been a constant in our story – constant evolution!
So, when I got an invite to join the panel judging Capstone projects by students of the PGP in Data Science with ML & AI program at Praxis Business School, the same school whose program I had reviewed almost 4 years back, I was curious to see and learn how their evolution had panned out.
My interaction with the students four years ago was quite different from my experience sitting in a panel of judges for Capstone projects. You get to see the final outcome coming from a rigorous program as opposed to just having a classroom interaction. This is like the proof of the pudding!
I was hoping to find out answers to 2 broad questions in the process:
- How has the program evolved over the years?
- What kind of projects are students currently doing, and how industry-relevant are they?
With those questions in mind, I boarded an early morning flight to Bengaluru and was on the Praxis campus by 9:00 a.m. Since the evaluations were supposed to start at 10:30 a.m., I had some time on my hands.
I used this time to catch up with the course faculty Gourab Nath, and other judges of our esteemed panel – Suresh Bommu (Advanced Analytics Practice Head at Wipro Limited) and Rudrani Ghosh (Director at American Express Merchant Recommender and Signal Processing team).
I also grabbed some authentic South Indian breakfast in the process. 🙂
Program Details and Capstone Projects
For people who are not aware – Praxis Business School offers a year-long program – PGP in Data Science with ML & AI at both its campuses – Kolkata and Bengaluru. The program is structured in a manner where the first 9 months are spent in the classroom with in-house and industry faculty and the last 3 months are spent as an intern with an industry partner.
The Capstone project happens before the internship actually starts. So, students spent a total of 9 months in the classroom and had been doing these projects for the last 3 months (month 6 – month 9 in the curriculum).
How has the Program Evolved over the Years?
The last time I had visited Praxis was in 2015 and I was dead sure that the program would have evolved. The question was how much? In which direction? What are the key takeaways for the students and how are the students from Praxis doing in the real world?
So, let me share my findings based on the interaction with Gourab and the rest of the panel.
How Much has the Program Evolved? In which Direction?
The first noticeable change was the name of the program itself. Back in 2015, the Program was called PGP in Business Analytics as most of the material in the course was related to Business Analytics and Statistical Modelling.
Over time, the program has evolved a lot – I was surprised to see the number of topics that are covered in the program. Here is a screenshot of topics covered in the curriculum, picked directly from their site:
The curriculum now includes not only Machine Learning and Deep Learning, but also Big Data tools and business-focused topics. As far as I can see, it has become a comprehensive course for data scientists.
What are the key takeaways for the students undergoing the program?
I think the best way to judge this is to look at the projects. So I held off on this question until I had seen them, and the projects turned out to be sufficient proof by themselves.
Needless to say, I was pretty excited by these discussions. With the context of this evolution in mind, I was ready for the rest of the day.
Here are the views of Gourab Nath, part of the judging panel and Assistant Professor of Praxis’ Data Science Program:
Collection of images is a challenging task for projects that involve topics like face recognition. Previously, we were using an approach that was a little time-consuming.
So, this time we decided to take a more systematic approach to collecting the images, one that could save our participants a great deal of time. The teams working on such projects designed and developed an easy-to-handle application for facial image collection.
A participant was requested to sit in front of the computer where we had the software running, and all he/she needed to do was enter his/her name and press a capture button to start the image collection process.
The students at Praxis Business School are strongly encouraged not to depend heavily on tools and packages and to focus more on writing algorithms. This approach helps them code better no matter what programming language they use.
Capstone Projects by Current Passing out batch at Praxis Business School
A glance at the list of projects confirmed my views until now. I could see projects on Machine Learning, Natural Language Processing (NLP) and Computer Vision (CV).
More importantly, it looked like these projects were not based on open datasets. The problems mentioned were unique and I was not aware of many open datasets addressing them. Now, I was curious and excited to see what the students had built and how well they had done.
Here’s the list of Capstone Projects done by students at Praxis Business School:
- Detection of Spam Reviews
- Opinion Mining on Mobile Phone Features
- Drowsiness Detection using Computer Vision
- Gesture Recognition using Computer Vision
- Team Selection using Computer Vision
- Attendance Tracking System using Computer Vision
- Recommender System for Fashion Apparel
- Nearest Document Search
Just to put things in perspective – most of the students presenting to us did not have any knowledge of predictive modeling and machine learning till July 2018 – when they started with the program.
Details of the Capstone Projects
Let’s look at each capstone project in a bit more detail to understand what it was about plus the tools and techniques used in each project.
Project 1 – Detection of Spam Reviews
Customer reviews have a huge influence on potential buyers of any product. A number of false reviews may push that influence in either a positive or a negative direction. In either case, customers may make wrong decisions, and the trustworthiness of online opinions becomes an issue.
In this project, we investigate opinion spam in reviews.
Note that this problem is different from email spam classification. Email spam usually refers to unsolicited commercial advertisements to attract people towards some products or services and hence they usually contain some prominent features.
Our specific problem is more challenging because untruthful opinion spam is much harder to deal with. These kinds of spamming material can be carefully crafted and made indistinguishable.
Tools: Python [Packages: NLTK, sklearn]
Techniques: Shingle Method, n-grams, Feature Extraction
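To give a feel for the shingle method listed above, here is a minimal sketch in plain Python. It breaks each review into overlapping word n-grams (shingles) and flags pairs of reviews whose shingle sets overlap heavily. The Jaccard measure, the threshold of 0.5, the function names and the sample reviews are all assumptions made for illustration; the students' actual pipeline is not described here.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles (word n-grams) in a review."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def near_duplicates(reviews, k=3, threshold=0.5):
    """Return index pairs of reviews whose shingle overlap meets the threshold."""
    sets = [shingles(r, k) for r in reviews]
    pairs = []
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if jaccard(sets[i], sets[j]) >= threshold:
                pairs.append((i, j))
    return pairs

# Made-up reviews: the first two differ by a single word.
reviews = [
    "This phone is amazing and the battery lasts all day",
    "This phone is amazing and the battery lasts all week",
    "Delivery was late and the box was damaged",
]
flagged = near_duplicates(reviews)
print(flagged)
```

Near-identical reviews posted many times are only one possible signal of opinion spam, so a check like this would typically feed into further feature extraction rather than act as a classifier on its own.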
Project 2 – Opinion Mining on Mobile Phone Features
You open amazon.com and find that lots of customers have given great reviews about a well-branded mobile phone you are interested in. You wonder – are these good reviews due to the camera of the phone? Or, how good is the battery of the phone? And what about the display?
Since the number of reviews is really large, it is almost impractical for readers to go through all of them to evaluate the product, so answers to these kinds of questions can be really helpful in making decisions.
In this project, our focus is to identify various features of a mobile phone that the customers are talking about in their reviews and mine the customers’ opinion on these features.
Further, we focus on identifying the polarity of these opinions and summarizing the reviews. Finally, we develop a user interface that summarizes the opinions about the features of the phone and ranks the customer reviews based on their utility. We also propose an architecture that can do the same for the reviews of any mobile phone.
Tools: Python [Packages: NLTK, SpaCy, sklearn], Wix.com (for the website creation)
Techniques: Fuzzy Matching, POS tagging, Association Rules Mining, Compactness Pruning, Redundancy Pruning, identifying sentiments based on the word list and weights in AFINN and WordNet
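The word-list approach to sentiment mentioned above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the tiny lexicon below is a made-up stand-in with AFINN-style integer weights (not the real AFINN list), the feature names are hard-coded rather than mined from the reviews, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical mini word list with AFINN-style integer weights;
# the real AFINN list is far larger and its weights may differ.
LEXICON = {"great": 3, "good": 2, "love": 3, "poor": -2, "bad": -3, "terrible": -3}

# Hypothetical feature vocabulary; the project mined features from the reviews.
FEATURES = ("camera", "battery", "display")

def feature_sentiment(reviews):
    """Sum the lexicon weights of each sentence that mentions a feature."""
    scores = defaultdict(int)
    for review in reviews:
        for sentence in re.split(r"[.!?]+", review.lower()):
            words = re.findall(r"[a-z']+", sentence)
            weight = sum(LEXICON.get(w, 0) for w in words)
            for feature in FEATURES:
                if feature in words:
                    scores[feature] += weight
    return dict(scores)

# Made-up reviews for illustration.
reviews = [
    "The camera is great. The battery is poor.",
    "I love the display! Battery life is terrible.",
]
summary = feature_sentiment(reviews)
print(summary)
```

Scoring at the sentence level keeps an opinion about the battery from leaking into the camera score when both appear in the same review.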
Check out a demonstration of this project below:
Project 3 – Drowsiness Detection using Computer Vision
How many times has this happened to you: you started a movie on your computer at night and fell asleep in the middle of it, and when you woke up the next day, you had no clue how far you had watched? Happens to the best of us.
In this project, we focus on developing an application that will be able to detect if you are asleep and automatically pause the video for you. The system waits to see if you wake up in the next 30 minutes. In case you don’t, it will save a snapshot of the screen, close all the windows and shut down your computer automatically.
Tools: Python, OpenCV, TensorFlow, Keras
Techniques: Viola-Jones algorithm (Rapid Object Detection using a Boosted Cascade of Simple Features), Inception V3, LSTM
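Leaving the vision models aside, the control flow described above (pause when the viewer falls asleep, wait 30 minutes, then save a snapshot and shut down) can be sketched as a small decision function. The function name, its arguments and the action labels are hypothetical; they stand in for whatever the application actually calls, which is not described here.

```python
WAIT_SECONDS = 30 * 60  # the 30-minute grace period described above

def next_action(asleep_seconds, paused, wait_seconds=WAIT_SECONDS):
    """Pick the next step from how long the viewer has appeared asleep.

    asleep_seconds is None while the viewer is judged awake. The returned
    strings are illustrative labels, not a real video player API.
    """
    if asleep_seconds is None:
        return "no_action"
    if not paused:
        return "pause_video"
    if asleep_seconds >= wait_seconds:
        return "snapshot_and_shutdown"
    return "keep_waiting"

print(next_action(5, paused=False))
print(next_action(1800, paused=True))
```

In a real application, the detector's per-frame output would be turned into `asleep_seconds` by a timer, and each label would map to an operating system or player call.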
Project 4 – Gesture Recognition using Computer Vision
Picture this – you are watching a video on your computer but are feeling way too lazy to use the mouse or the keyboard to control the video player. Sounds familiar?
We have a solution for you!
In this project, we focus on making the computer recognize some special gestures which will enable one to control a video player by just using those gestures.
For example, showing your palm in front of the system will enable the pause and the un-pause function. You will also be able to control the volume, fast forward a video or rewind it. You will also be able to do a wide range of other things like changing the slides of your PPT, changing pages, scrolling, etc. without grabbing your mouse or keyboard.
Tools: Python [Packages: OpenCV, keyboard and mouse packages (from PyPI), TensorFlow, Keras]
Techniques: Green Screen (for background subtraction), Single-Shot Multi-box Detector (SSD)
Project 5 – Team Selection using Computer Vision
Students are asked to create teams for their projects or their assignments, which is of course a very common thing in every school and college. The class representative (CR) creates a Google spreadsheet and shares it with everyone.
Students, after deciding who they want to team up with, populate the spreadsheet with the names of their team members. But the CR must remember the rules given by their Professor – the team size should be three and every team must have one female member at least.
So, the CR checks the restrictions and if everything is fine, he/she shares it with the Professor. This is one way to do it.
Or, you can do it the smart way.
You stand with your teams in front of the computer, the computer checks the restrictions, recognizes you, and fills in the database with your names and photos.
But remember, the computer won’t allow you to register if the constraints are not satisfied or when at least one of the members in your team is already registered as members of any other team. So, you cannot fool it!
Tools: Python [Packages: OpenCV, TensorFlow backend, Keras, Imutils, face_recognition, pickle, dlib, cmake, tkinter (for GUI development)]
Techniques: VGG-NET 19, HOG Detector
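The registration rules described above (a team of three, at least one female member, nobody already on another team) are easy to express in code once the faces have been recognized. The sketch below assumes the recognition step has already produced a name and a gender label for each person; the function names, the "F"/"M" labels and the sample names are all hypothetical.

```python
def validate_team(members, registered, team_size=3):
    """Check a proposed team against the rules; return (ok, reason).

    members is a list of (name, gender) tuples and registered is the set
    of names already placed on some team.
    """
    names = [name for name, _ in members]
    if len(members) != team_size or len(set(names)) != team_size:
        return False, f"team must have exactly {team_size} distinct members"
    if not any(gender == "F" for _, gender in members):
        return False, "team needs at least one female member"
    taken = sorted(set(names) & registered)
    if taken:
        return False, "already registered: " + ", ".join(taken)
    return True, "ok"

def register_team(members, registered):
    """Add the team to the registry only if every rule is satisfied."""
    ok, reason = validate_team(members, registered)
    if ok:
        registered.update(name for name, _ in members)
    return ok, reason

registered = set()
first = register_team([("Asha", "F"), ("Ravi", "M"), ("Kiran", "M")], registered)
second = register_team([("Asha", "F"), ("Dev", "M"), ("Arjun", "M")], registered)
print(first, second)
```

Returning a reason along with the verdict lets the GUI tell the team which rule they broke instead of simply refusing to register them.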
Project 6 – Attendance Tracking System using Computer Vision
In this project, we developed a system to record class attendance using computer vision.
After a faculty member logs in to the system with a password and sets the period, the camera opens up to capture pictures of the class. A number of snapshots of the class are first passed through a face detector and then through a face recognizer.
After the system recognizes the students, it updates the attendance spreadsheet and saves the captured image in its respective image directory – labeling it by the date and time of the day. The unidentified students are marked as absent.
Tools: Python [Packages: dlib, OpenCV, TensorFlow, Keras, sklearn, tkinter (for GUI development)]
Techniques: Haar Cascade Classifier, HOG, Siamese Model (One Shot Learning), kNN
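The last step of such a pipeline, matching detected faces to known students and marking everyone else absent, can be sketched with a nearest-neighbour lookup. The example below assumes each face has already been turned into an embedding vector by a recognizer; it uses toy 2-D vectors, a 1-nearest-neighbour match and a made-up distance cutoff of 0.6, none of which come from the project itself.

```python
import math

def nearest_student(embedding, known, max_distance=0.6):
    """Return the closest known student, or None if no one is close enough."""
    best_name, best_dist = None, float("inf")
    for name, reference in known.items():
        dist = math.dist(embedding, reference)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None

def mark_attendance(detected_embeddings, known):
    """Mark recognized students present and every other known student absent."""
    present = {nearest_student(e, known) for e in detected_embeddings}
    present.discard(None)  # faces that matched nobody are ignored
    return {name: ("present" if name in present else "absent") for name in known}

# Toy reference embeddings and detections, purely for illustration.
known = {"Asha": (0.0, 0.0), "Ravi": (1.0, 1.0), "Meena": (2.0, 0.0)}
detected = [(0.1, 0.0), (1.1, 0.9), (5.0, 5.0)]
sheet = mark_attendance(detected, known)
print(sheet)
```

The distance cutoff matters: without it, a visitor's face would always be assigned to whichever student happens to be nearest.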
Project 7 – Recommender System for Fashion Apparel
The use of a recommender system in e-commerce companies is a highly targeted approach that can generate a high conversion rate. These systems help customers discover the products which they might be interested in and will likely purchase.
In this project, we have created a recommender system for a small fashion apparel business that:
- Allows the customers to search by the image of a product
- Gives a personalized recommendation to the heavy buyers, and
- Displays the most frequently purchased item for the selected item
Techniques: kNN, Collaborative Filtering, Content-Based Filtering, Autoencoders
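One simple reading of the third capability listed above is "which item is most often bought together with the selected one". Under that assumption, it can be sketched with plain co-occurrence counting over past orders. This is not kNN or collaborative filtering proper, and the order data and function names below are made up for illustration.

```python
from collections import Counter
from itertools import permutations

def co_purchase_counts(orders):
    """Count how often each pair of items appears in the same order."""
    counts = {}
    for order in orders:
        for a, b in permutations(set(order), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts

def most_bought_with(item, counts):
    """Return the item most often purchased with `item`, or None if unseen."""
    if item not in counts:
        return None
    return counts[item].most_common(1)[0][0]

# Made-up order history.
orders = [
    ["jeans", "belt"],
    ["jeans", "belt", "shirt"],
    ["jeans", "shirt"],
    ["jeans", "belt"],
]
counts = co_purchase_counts(orders)
print(most_bought_with("jeans", counts))
```

A production system would usually normalize these counts by item popularity, since raw counts tend to recommend the best sellers for everything.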
Here’s a demo video of this project:
Project 8 – Nearest Document Search
In this project, we have created a nearest document search engine for news reading. The application will not just recommend related news but also give you the sentiment and highlight important words associated with each article. If an article is long and you do not want to read all of it, fair enough, this app will have a summarized version ready for you.
Tools: Python [Packages: NLTK, sklearn, sumy, vaderSentiment, tkinter (for GUI development)]
Techniques: kNN, KDTree, Word Cloud, Lex Rank Summarizer
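The core of a nearest document search can be sketched in plain Python: turn each article into a TF-IDF vector and return the articles whose vectors are closest to the one being read. The version below uses cosine similarity with a brute-force scan instead of a KDTree, and the smoothed IDF formula, function names and sample headlines are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict of term weights) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency so that no weight is zero.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()}
            for tokens in tokenized]

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def nearest_documents(query_index, docs, k=2):
    """Return the indices of the k documents most similar to docs[query_index]."""
    vectors = tfidf_vectors(docs)
    query = vectors[query_index]
    scored = [(cosine(query, v), i) for i, v in enumerate(vectors) if i != query_index]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]

# Made-up headlines for illustration.
docs = [
    "Stock markets rally as banks report strong profits",
    "Banks report record profits and markets rally",
    "Local team wins the football championship final",
    "Football fans celebrate championship win",
]
print(nearest_documents(0, docs, k=1))
```

A brute-force scan is fine for a few thousand articles; a tree-based index such as the KDTree listed above becomes worthwhile as the collection grows.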
How relevant were these projects for the Industry?
One of the most critical questions I had was – are these projects industry relevant? Bridging the gap between academia and industry has been a significant challenge in data science. It turns out the answer is quite comprehensive.
In the last 4 years, the number of companies hiring has increased 4 times (from 15 in 2015 to 60 in 2018-19) and the average salary has nearly doubled (from 5 LPA in 2015 to 9 LPA in 2018-19).
So, here are the thoughts of my fellow panelists on this topic:
“I am very impressed on the scope, objectives, and contents of the capstone projects executed by Praxis students. The majority of the projects are around the application of deep learning concepts which they have learned as a part of the course work.
The entire project execution and development activities were well planned and organized. Starting from defining the problem statement, challenges, real-time application and finally presenting the results.” – Suresh Bommu, Advanced Analytics Practice Head at Wipro Limited
“What really stood out for me was the effort put in by students in attempting to create an end-to-end product with a UI as well as the variety of projects and its extended application.” – Rudrani Ghosh, Director at American Express Merchant Recommender and Signal Processing team
Key Takeaways from the day
I loved the day and would live it again without second thoughts. But there were a few things which stood out for me:
- There was a stark difference in the kind of projects students are doing now. In a period of 9 months, they have learned the subject and completed a Capstone project. This would not have been possible without the efforts of the students themselves and the faculty members.
- Most of these projects exposed students to the perils of design thinking, of creating and collecting a dataset, and of cleaning it. I just loved this aspect. I am sure the students realised that building a deep learning model is far easier than actually collecting the data for it.
- I also loved the way students presented their projects. They created video teasers and demo sessions to bring out the work they had done.
It was great to see the high level of projects presented by these students. As I mentioned, I was glad to see the students picking up challenging problems for which datasets are not openly available.
At the end of the day, I had to rush back to the airport. Day trips to Bengaluru are bad! And the fact that I had to rush through the projects of a few students only made it worse. I would have loved to spend more than a day there: the energy of the class, the faculty and the judges was infectious 🙂 Looking at these projects, I can confidently say that Praxis Business School continues to offer one of the best full-time programs in Machine Learning and Deep Learning in India.