## Introduction

I’ve spent close to a decade in data science & analytics now. Over this period, I have learnt new ways of working on data sets and creating interesting stories. However, before I could succeed, I failed numerous times. Success doesn’t come easy!

How did I succeed? The answer is simple. Every time I failed, I said to myself, ‘Let’s take one more step’. And I managed to travel a long distance. I learnt statistics, data mining, SAS, R, Python and Machine Learning along the way.

I must admit that, in the last 10 years, the methods of predictive modeling have become faster, and data sets have grown larger than ever. We ran into constraints when faced with Big Data, but people came up with several big data technologies to deal with them.

It’s overwhelming to see how much things have changed. Still, many people are struggling to catch up and find success in the data science industry.

Hence, I decided to share 20 things that experience has taught me in the last 10 years. I hope you find them useful. The idea is to help people who don’t have a mentor to give them this advice all the time. So go ahead and read!

### Some Useful Resources

Learn Machine Learning Algorithms

Learn k-fold Cross Validation

Resources on Neural Networks and Deep Learning

Master Structured Thinking Skill

Kunal, good day. First of all, thank you so much for your blogs on Analytics Vidhya. Your blogs help me understand ML and business analytics better. My name is Arailym. I am from Kazakhstan. I want to build my career in business analytics. I have a B.S. degree in Economics. I know the basics of statistics and am trying to learn as many ML algorithms as I can. I see that the company where I want to apply requires deep learning math. I have read blogs on linear algebra and probability, but I am afraid that I won’t be able to recall all the math (especially discrete mathematics, integrals and so on).

Also, I want to participate in a university competition. They require us to present a project. I don’t know where to get available data or which tool I should use. I am afraid that if I take data from Kaggle, they would think that I cheated. I want to solve problems like segmentation or fraud detection. Please help me. Thank you in advance.

Hi Kunal,

Thanks for sharing. I have a query on the 5th one, ensemble modeling.

I was under the impression that different, appropriate algorithms are used to build models and improve accuracy. But here you say to combine models and algorithms to boost accuracy. Can you elaborate on that by citing examples?

Thanks,

Sir …

I regularly follow your posts on LinkedIn. They are very helpful.

After completing my MCA in 2007, I had only 1 year of Java experience. After that I got married. Due to some circumstances I couldn’t continue my job. But now I want to restart my career in the big data industry. Is it possible to re-enter this industry after such a long gap? I am already doing small courses from Coursera, as you suggested. I am eagerly waiting for your valuable suggestion.

I also completed my MCA in 2007, but my undergraduate degree is a B.Com. I am struggling to do something in Business Analytics.

Hi Kunal,

Excellent article, thanks for sharing the key steps/lessons for folks like me who are interested in Data Science/Analytics.

Excellent !

Crisp, insightful and to the point. Thank you very much for sharing your experiences with us.

Awesome! The areas of focus are summarized really well.

I beg to differ on one point though! Python is nowhere near R in statistical modeling and ease of use.

People say R has a steep learning curve, and it’s true! Python is easy, again true, but given the cleanliness of the R environment for statistics, its gigantic library of packages, and the incredible syntax highlighter in RStudio, I don’t really see myself using Python. As far as code execution speed is concerned, you can always use the parallel library in R, write vectorised code, use the H2O library, or use the enhanced R distribution from Revolution R Open. Python is the king of flexibility and R is the undisputed king of statistics.

Excellent! Great information. Thank you for sharing your valuable insights! It helps.

Thanks for sharing your experience. Really it’s an excellent article. ☺

Kunal ji,

Excellent article

I am using the SVM function directly on 6 lakh rows and it hangs. Should I code SVM line by line and then try, or is there some other method?

Can RStudio handle huge data without Hadoop?

Good info Kunal, thanks.

One of the best articles I have read. I am glad I read it. A big to-do list is in the pipeline now. Thank you so much!