Jul 21 2014

Using statistics: How to understand population distributions?

One of the common queries I get on the blog is: “I am not a Mathematics / Statistics graduate. Can I still become a good business analyst?” or “I am not good at statistics. Can I still change my career to become a business analyst?” The simple answer to the question is – you …

Continue reading »

Jul 17 2014

Introduction to Markov chain : simplified!

A Markov chain is a simple concept which can explain most complicated real-time processes. Speech recognition, text identifiers, path recognition and many other artificial intelligence tools use this simple principle, called a Markov chain, in some form. In this article we will illustrate how easy it is to understand this concept. Markov chain is based on …

Continue reading »

Jul 15 2014

Baby steps in Python – Libraries and data structures

In one of the posts last month, we started taking baby steps in learning Python for data analysis. This post will take you one step ahead in your journey to learn Python. By the end of this post, you will understand the role of several Python libraries and the various kinds of data structures used in Python. …

Continue reading »

Jul 13 2014

How to use “VLOOKUP()” like functionality in QlikView?

Whenever I interact with a QlikView user who has recently migrated from Excel, one of the most common queries that comes through is: “How do I apply VLOOKUP() in QlikView?” For starters, VLOOKUP is the Excel way of joining 2 datasets through a common key. It is somewhat similar to joins in SQL. …

Continue reading »

Jul 08 2014

Who is the world cheering for? 2014 FIFA WC winner predicted using Twitter feed (in R)

Sports are filled with emotions! The cheering of the audience and reactions to events on various media channels are some of the factors which make a huge impact on the minds of the players. If people support you, your chances of winning are greatly enhanced. A live example of this fact is the statistics of the Indian cricket team playing in India and …

Continue reading »

Jul 07 2014

Definitive guide to prepare for an analytics interview

Let’s face it! Facing an analytics interview can be daunting at times! I have met a lot of analysts who come across as good analysts when you interact with them informally. But something happens to them as soon as they enter an interview! Have you seen one of these analysts and wondered what happens to them …

Continue reading »

Jul 03 2014

Using Facebook as an analyst (Hint – using R)

Facebook has a huge data bank, and it allows us to make use of it to some extent. October is a month of celebration in India. We have festivals like Diwali and Dushehra in October, which make the entire month a time to celebrate and reunite. Every time we meet our friends and relatives at different places, to make …

Continue reading »

Jul 01 2014

Baby steps in learning Python for data analysis

Last weekend turned out to be a very special one! My 10-month-old daughter took her first baby steps, and watching her take those steps was one of the most beautiful moments of my life. A baby, brimming with excitement to reach out to her father, trying to balance, while exploring her newly acquired skill …

Continue reading »

Jun 27 2014

Comparing a Random Forest to a CART model (Part 2)

Random forest is one of the most commonly used algorithms in Kaggle competitions. Along with good predictive power, random forest models are pretty simple to build. We have previously explained the algorithm of a random forest ( Introduction to Random Forest ). This article is the second part of the series on comparison of a random …

Continue reading »

Jun 24 2014

What is deep learning and why is it getting so much attention?

A few days back, the content feed reader which I use showed 2 articles on deep learning among its top 10. That is when I thought I needed a better understanding of what deep learning is. I probably first noticed the term “deep learning” sometime late last year. And it has grown in its presence …

Continue reading »

Jun 20 2014

Comparing a CART model to Random Forest (Part 1)

I created my first simple regression model with my father in 8th standard (year: 2002) in MS Excel. Obviously, my contribution to that model was minimal, but I really enjoyed the graphical representation of the data. We tried validating all the assumptions etc. for this model. By the end of the exercise, we had 5 sheets of the simple regression …

Continue reading »

Jun 16 2014

SAS launches a free version – but, is it good enough?

I have spent the entire 7 years of my corporate work experience working on SAS. So, when I heard that SAS launched a free version (late May) – I was all excited! My initial reaction was that if SAS becomes available for free, it would become the preferred choice of analysis tool for people entering …

Continue reading »

Jun 10 2014

Unveiling Analytics Vidhya Apprentice – a programme to graduate with recognition for your knowledge!

It has been more than a year since we started our journey to change how Analytics knowledge flows in communities. The experience has been rewarding, fulfilling, gratifying and filled with a lot of learning at the same time. In this short span, we have become one of the leading analytics blogs (in India) and have …

Continue reading »

Jun 10 2014

Introduction to Random forest – Simplified

With the increase in computational power, we can now choose algorithms which perform very intensive calculations. One such algorithm is “Random Forest”, which we will discuss in this article. While the algorithm is very popular in various competitions (e.g. the ones running on Kaggle), the end output of the model is like a black box and …

Continue reading »

Jun 04 2014

Must have books for data scientists (or aspiring ones)

I am back to one of my favourite topics – books! To double up the excitement, this time the list is for data scientists (or aspiring ones). Unlike the previous lists, these books are not for the light readers. These books are meant for people who enjoy programming and statistics – just the kind a …

Continue reading »
