Kaggle Grandmaster Series – Exclusive Interview with 2x Kaggle Grandmaster Prashant Banerjee
“Data Science is full of opportunities and challenges and I want to explore those opportunities.”
Mindset is one of the factors that will determine how fruitful your Kaggle journey is in the long run. Many people give up right at the beginning after negative reviews and failures, without realizing that this is the path most Kaggle Grandmasters have taken.
To challenge your belief on this, we are back with the 11th edition of the Kaggle Grandmasters series and this time joining us to tell his story is Prashant Banerjee.
Prashant is a 2x Kaggle Grandmaster, with titles in the Notebooks and Discussions categories, where he holds the 20th and 54th ranks respectively. His kernels are highly valued in the community, and his discussions generate a lot of buzz.
Further, Prashant completed his Bachelor of Technology (ME) at the Institute of Engineering and Technology (IET) Kanpur and holds a Post Graduate Diploma in Business Administration. He currently works as a Data Analyst at Puma Power.
You can go through the previous Kaggle Grandmaster Series Interviews here.
In this interview, we cover a range of topics, including:
- Prashant’s Education and Work
- Prashant’s Kaggle Journey from Scratch to becoming a Kaggle Grandmaster
- Prashant’s Advice for Beginners in Data Science
- Prashant’s Inspiration and Future plans
Prashant’s Education and Work
Analytics Vidhya (AV): Please tell us something about your educational background and how you got interested in the Data Science field.
Prashant Banerjee (PB): I did a Bachelor of Technology in Mechanical Engineering from the Institute of Engineering and Technology (IET) Kanpur. I also pursued a Post Graduate Diploma in Business Administration (major: Finance and Statistics) from Jaipuria Institute of Management, Lucknow.
I always knew analytics was my cup of tea. So, I got a Data Analyst role at Puma Power. But I realized that I was not doing justice to my skills and caliber. I have a passion for numbers and challenges, so I researched other opportunities where I could use my skills. That is how I came to know about the Data Science field. I kept researching it, and that further deepened my interest.
Now, I can say that –
“Data Science is full of opportunities and challenges and I want to explore those opportunities.”
AV: You work at Puma Power as a Data Analyst. Could you describe your daily tasks and job role?
PB: My work as a Data Analyst encompasses various roles and responsibilities, as follows:-
- My day starts with team meetings to address specific business objectives.
- Then, I collect data from specific departments and crunch, interpret and analyze it, and report the results.
- Then, I create presentations and reports based on recommendations and findings.
- I work with management to prioritize business and information needs and identify opportunities for improvement.
- I locate and define new process improvement opportunities.
- Then, I also acquire data from various sources and maintain databases.
- I define new data collection and analytics processes.
AV: What kind of skillset does a DS aspirant need to have to land a role in Puma Power’s data team?
PB: A candidate should have a few basic qualities to get a role in the Data Science team. Amongst others, these are:-
- Proficiency in Python or R as a programming language.
- Strong foundations in statistics, modeling, machine learning, data processing, etc.
- Knowledge of data structures.
- Numerical and logical reasoning aptitude.
- An analytical and inquisitive mind for problem-solving.
Prashant’s Kaggle Journey from Scratch to becoming a 2x Kaggle Grandmaster
AV: You are a Kaggle Double Grandmaster – Notebooks and Discussions. That is a RARE achievement and takes a remarkable amount of dedication and effort. What were the challenges you faced in achieving each title individually and how did you overcome them?
PB: I joined Kaggle two years ago and I faced lots of difficulties initially.
I started with notebooks but did not get a positive response from the community. It is quite difficult to gain recognition at the start of your Kaggle journey. I realized that the fault lay in my notebooks, so I worked on reorganizing them. I restructured my notebooks in a neat manner, writing clear and concise code with proper explanations. Slowly, I gained recognition, and after that I received favorable results. Thus, I overcame the challenge with perseverance and hard work.
Similarly, I participated in discussions but faced lots of difficulties. So, I started posting useful resources for learning machine learning and deep learning, and I started threads aimed at beginners. With these little tricks, I overcame the challenge on the Kaggle discussions front.
Kaggle is full of challenges and opportunities, but it is up to us to overcome the challenges and take advantage of the opportunities.
AV: What is your check-list of the must-do tasks for creating an expert-level notebook?
PB: I focus on different things when writing a notebook. They are listed below:-
- Quality content
- Clear and concise code
- Proper explanation and reasoning
- Easy to navigate and follow
AV: Which of your notebooks has gained the most upvotes and what is the key to creating a popular notebook?
PB: The notebook “ALASKA2: Image Steganalysis – All you need to know” has gained the most upvotes.
The key to creating a popular notebook is that it should appeal to the masses. It must be a good-quality notebook with informative content, and it should serve a specific purpose.
AV: On the Kaggle discussions front, do you have a framework in mind when you answer a question or start a new thread?
PB: In general, I don’t have a framework in mind when I participate in discussions. I try to be specific, to the point, clear, concise, and polite when I interact with someone.
Prashant’s Advice for Beginners in Data Science
AV: We see most aspirants going for the competitions and hackathons while skimming over the discussions aspect. What is the benefit of participating in discussions as a beginner?
PB: Competitions and hackathons are great as they provide an opportunity to practice and hone your skills. But, you should also actively participate in discussions. There are lots of benefits from participating in discussions.
I think the most important benefit is that discussions will boost your knowledge and skills. Kaggle has a solid knowledge-sharing community. You can share your knowledge, clarify your doubts, ask for help, and discuss anything you want.
Discussions help you to build your profile. You will get noticed by others and build relationships with them.
AV: On the other hand, what are the 5 key points a beginner should keep in mind when trying to become a Notebooks Grandmaster?
PB: Becoming a Notebooks GM is not an easy task. So, I will list 5 key points to keep in mind along the way. I hope these 5 points will help beginners become Notebooks GMs.
They are listed below:-
Quality
I think that quality is the most important thing in notebooks. A beginner should produce high-quality notebooks to become a Notebooks GM.
Knowledge sharing
If you share your knowledge with the Kaggle community, you can become a Notebooks GM relatively quickly. That’s what I did.
Hard work and Dedication
Becoming a Notebooks GM is a great achievement, but it takes a lot of hard work and dedication.
Perseverance
Perseverance is another quality a beginner should have. Sometimes, you may not receive a positive response to your notebook. But you should not get discouraged; just move on.
An eye for opportunity
A beginner should always look for opportunities. If a competition is launched, try to write notebooks associated with that competition.
AV: Can you list 5 notebooks and 5 discussion threads that inspired you during your Kaggle journey?
PB: My top 5 inspirational notebooks are listed below:-
Every beginner should work on the very famous Titanic dataset. I have worked on it too and I found the above notebook quite useful. As the name suggests, it is a complete End to End ML Pipeline. The author presented different aspects of Machine Learning in great detail.
While working on a problem, we should first explore the dataset to gain insights about it. The above notebook covers just that, describing the data exploration process in detail.
The above is a very useful notebook on the Titanic dataset. It teaches us the machine learning modeling process step by step. It also teaches us how to make predictions using ensemble models.
This is a high-quality notebook by Serigne. He describes how to approach a regression problem and how to make predictions using stacked regressions in great detail.
This is a gem by Gabriel Preda. He describes how to solve a classification problem using Predictive Modeling.
The top 5 inspirational discussion threads are listed below:-
The above discussion thread presents various resources to learn Data Science on Kaggle.
The above post is aimed at beginners. The author describes how to become a Data Scientist on your own.
In the above post, we get links to a YouTube channel where we can see lots of interviews with Kaggle GMs and competition winners.
In the above post, we get to know how to write more professional data science code.
In the above thread, we can chat with other Kagglers about various aspects of Machine Learning.
Another enriching interview with another Kaggle Grandmaster. I hope this adds value to your journey down the road.
This is the ninth interview of the Kaggle Grandmasters Series. We recommend you go through a couple of the previous interviews as well:
- Kaggle Grandmaster Series – Notebooks Grandmaster and Rank #2 Dan Becker’s Data Science Journey!
What did you learn from this interview? Are there other data science leaders you would want us to interview? Let us know in the comments section below!