Year in Review: Best of Analytics Vidhya from 2015
People say that 90% of startups fail by the time they reach their second year! Thanks to all of you, we have not only made it into the remaining 10%, but have come through with flying colors!
I still remember my last day at my job. My friends at work were curious: “How big a market could data scientists and business analysts be?” As a first-time entrepreneur with a 6-month-old daughter, I felt scared that I did not know the answer. I was leaving my cushy job to try out something no one had tried before. All I knew was the glaring knowledge gap I wanted to address and how passionately I felt about it.
Thankfully, it all worked out. Today, we are one of the largest and fastest-growing data science communities in the world. Our traffic has grown to 5x what it was when 2015 started, and it is still growing at a healthy pace. We started the year with the launch of our discussion portals, then added different forms of content such as infographics and cheat sheets, a salary test and a resource finder. In the second half of the year, we also launched our hackathon platform.
Through 2015, our community got bigger and bigger. We felt a huge shift in workload, but it was a pleasing one. We wrote our hearts out to provide the best possible knowledge on the subject matter, and we hope you enjoyed it. Here are some of the best pieces of content created by our community in 2015. Read them, put the knowledge to the test and stay warm as the year comes to an end.
We promise to make 2016 even more exciting and knowledge-rich for you.
Note: If you have anything to share (suggestions, opinions, good moments, bad moments, or anything else), you can write to us at [email protected]
10 Best Articles of 2015
Many of us get confused when choosing the right algorithm. It’s quite common, actually. People struggle to decide whether logistic regression or a decision tree would give the better result. If you are stuck in such a situation, this article will come to your rescue. Here you’ll find a complete explanation of 10 machine learning algorithms in Python & R. If you are a complete beginner, this should help you get started with machine learning today.
GitHub is not just for web programmers, as most of us perceive it to be. It is an open source repository for data science folks as well. While working on this article, I was astonished by the depth of free resources available there. I ended up creating a list of top data scientists in the world. Since they are on GitHub, you can check their code repositories and the projects they’ve worked on. It is both inspiring and exciting to connect with such people.
Those who aren’t curious end up learning only multiple regression and logistic regression. That’s it. In total, there are 7 types of regression techniques which can be used in various situations. Do you know even 5 of them? I’m certain many of you don’t. Don’t panic. Here is a complete guide on 7 types of regression techniques used in predictive modeling.
Exploring data sets and developing a deep understanding of the data is one of the most important skills every data scientist should possess. People estimate that the time spent on these activities can go as high as 80% of the project time in some cases. If you use Python, here’s a complete beginner’s guide to data exploration in Python, with code. It uses Python libraries such as NumPy, Matplotlib, Seaborn and Pandas.
Since both involve predictive modeling, people wanted to know the difference between machine learning and statistical modeling. This article draws a clear line of demarcation between the two in 7 points. The difference between them has narrowed significantly over the past decade. Both branches have learned a lot from each other and will come even closer in the future.
I was once looking for free tutorials to learn machine learning. Every time I searched, Google suggested that I watch YouTube. I had been oblivious to this side of YouTube. I explored it and found a huge reserve of tutorials on data science. I created a playlist and shared it on the internet. I’m glad that people found it helpful. Here’s a complete list of must-watch YouTube videos on machine learning, deep learning and neural networks.
This is a complete tutorial on the Random Forest algorithm. It is used in a wide range of situations, though the accuracy of its results may vary. It’s a must-have algorithm in your machine learning armory. Random forests are incredibly powerful and can be implemented quickly. These days, random forest has also become a standard method of checking variable importance.
Here is a complete guide to creating basic to advanced visualizations in R. R offers a good set of built-in functions and libraries (such as ggplot2, leaflet and lattice) to build visualizations and present data. These are convenient and allow you to create visualizations in no time.
PyCon conferences are held every year around the world. They have helped many beginners embrace Python and become experts in it. Their hour-long workshops and tutorials are enriched with practical experience. Here’s a collection of the best tutorials, which you must watch if you love Python.
Data scientists are no less than artists. They make paintings in the form of digital visualizations (of data) with the aim of revealing the hidden patterns / insights in it. Two popular Python libraries for this are Matplotlib and Seaborn. Here is a demonstration of various charts built using these libraries in Python.
5 Best Infographics of 2015
Now learn machine learning algorithms even faster. Here’s a cheat sheet which presents code in Python and R for machine learning algorithms. You won’t find conceptual explanations of the algorithms here, but rather their practical use and application.
There remains deep confusion about recently emerged job profiles. Some believe that, with machine learning, the work of a statistician or data analyst may become outdated. But the truth is different. These profiles do differ in their nature of work and responsibilities. Here is an infographic which explains the roles and responsibilities of these top jobs in the analytics industry.
When you watch a movie, you will have noticed that you don’t forget its story, cast or music all of a sudden. It stays in your mind. How about learning some data science this way? I chose these movies based on their relevance, ratings and the audience’s love for them. My personal favorite is the movie ‘Her’. The operating system is intelligent and adorable. Here are the 10 must-watch movies.
R has a repository of more than 5000 packages. If anyone were to use them all, they would be hard to remember. Instead, the packages have been categorized by role and nature of work. For example, you can’t use dplyr for visualizing data. Hence, it is important to learn which package is best suited to which situation. Here is an infographic on the types of useful packages available in R.
Reading books is the best way to gain wisdom and knowledge. Books provide a concrete and truthful view of a subject. If you are an avid reader, here is a list of must-read books for people keen to start their career in analytics. I compiled this list considering the relevance and ratings of the books. These books will broaden your outlook and sharpen your ability to learn and improve faster.
5 Best Discussions of 2015
5 Tips from Winning Data Scientists
These winners emerged from our hackathons. They are now mentors for many young data scientists in our community. Below are the tips shared by these data scientists:
1. As a beginner, you must commit yourself to learning feature engineering.
2. Think out of the box. Learn to use the H2O and GraphLab libraries in R or Python.
3. You must sharpen your boosting and ensemble skills.
4. No one can teach you parameter tuning. You’ll learn it best by yourself. It is not rocket science. You just try, fail, rebound and succeed.
5. Don’t lose hope when your model accuracy doesn’t improve. You should feel fortunate that you are stuck, because this is where your real learning will begin.
With this, we come to the end of this article. Once again, we would like to thank all our readers and users. Without you, it wouldn’t have been possible to build a community which is growing every day, faster than ever. Do check out these resources; they are the best ones from 2015. People not only loved them, but also shared them widely on social media.
Did you find this article useful? Share your views and opinions in the comments section below.
If you like what you just read and want to continue your analytics learning, subscribe to our emails, follow us on Twitter or like our Facebook page.