Kunal Jain — November 18, 2015

Introduction

I’ve spent close to a decade in data science & analytics now. Over this period, I have learnt new ways of working on data sets and creating interesting stories. However, before I could succeed, I failed numerous times. Success doesn’t come easy!

How did I succeed? The answer is simple. Every time I failed, I said to myself, ‘Let’s take one more step’. And I managed to travel a long distance. I learnt statistics, data mining, SAS, R, Python and machine learning along the way.

Over the last 10 years, the methods of predictive modeling have become faster, while data sets have grown larger than ever. We ran into constraints when faced with Big Data, but people responded with several big data technologies.

It’s overwhelming to see how much things have changed. Still, many people are struggling to catch up and succeed in the data science industry.

Hence, I decided to share 20 things that experience has taught me over the last 10 years. I hope you find them useful. The idea is to help people who don’t have a mentor to give them this kind of advice. So go ahead and read!

[Infographic: successful data scientist]

Some Useful Resources

Learn Python

Learn Ensemble Modeling

Learn Boosting Algorithms

Learn Machine Learning Algorithms

Learn k-fold Cross Validation

Learn Feature Engineering

Resources on Neural Networks and Deep Learning

Master Structured Thinking Skill

 


About the Author

Kunal Jain

Kunal is a postgraduate from IIT Bombay in Aerospace Engineering. He has spent more than 10 years in the field of Data Science. His work experience ranges from mature markets like the UK to a developing market like India. During this period he has led teams of various sizes and has worked with various tools like SAS, SPSS, Qlikview, R, Python and Matlab.


15 thoughts on "Lifetime Lessons: 20 Things Every Data Scientist Must Know Today"

Arai says: November 18, 2015 at 8:52 am
Kunal, good day. First of all, thank you so much for your blogs on Analytics Vidhya. They help me understand ML and business analytics better. My name is Arailym. I am from Kazakhstan. I want to build my career in business analytics. I have a B.S. degree in Economics. I know the basics of statistics and I am trying to learn as many ML algorithms as I can. I see that the company where I want to apply requires deep learning math. I have read blogs on linear algebra and probability, but I am afraid that I can't recall all the math (especially discrete mathematics, integrals and so on). Also, I want to participate in a university competition. They require us to present a project. I don't know where to get available data or which tool I should use. I am afraid that if I take data from Kaggle, they would think that I cheated. I want to solve problems like segmentation or fraud detection. Please help me. Thank you in advance.
Akash says: November 18, 2015 at 11:11 am
Hi Kunal, Thanks for sharing. I have a query on the 5th one, ensemble modeling. I was under the impression that different, appropriate algorithms are used to build models and improve accuracy. But here you say to combine models and algorithms to boost accuracy. Can you elaborate on that by citing examples? Thanks,
Moumita Mitra says: November 18, 2015 at 2:43 pm
Sir, I regularly follow your posts on LinkedIn. They are very helpful. After completing my MCA in 2007 I had only 1 year of Java experience. After that I got married. Due to some circumstances I couldn't continue my job. But now I want to restart my career in the Big Data industry. Is it possible to re-enter this industry after taking such a long gap? I am already doing small courses from Coursera as you suggested. I am eagerly waiting for your valuable suggestion.
Ramdas Narayanan says: November 18, 2015 at 4:25 pm
Hi Kunal, Excellent article, thanks for sharing the key steps/lessons for folks like me who are interested in Data Science/Analytics.
Ankur says: November 19, 2015 at 3:35 am
Excellent!
Ram Marthi says: November 19, 2015 at 4:31 am
Crisp, insightful and to the point. Thank you very much for sharing your experiences with us.
Venkat says: November 19, 2015 at 7:53 am
Awesome! The areas of focus are summarized really well.
Satwik Mittal says: November 19, 2015 at 2:05 pm
I beg to differ on one point though! Python is nowhere near R in statistical modeling and ease of use. People say R has a steep learning curve, and it's true! Python is easy, again true, but given the cleanliness of the R environment for statistics, the gigantic library of R packages, and the incredible syntax highlighter of RStudio, I don't really see myself using Python. As far as code execution speed is concerned, you can always use the parallel library in R, write vectorised code, use the H2O library, or use the enhanced R distribution from Revolution R Open. Python is the king of flexibility and R is the undisputed king of statistics.
Venkata Sreedhar Nalam says: November 20, 2015 at 11:30 am
Excellent! Great information. Thank you for sharing your valuable insights! It helps.
Mustafa says: November 22, 2015 at 7:38 pm
Thanks for sharing your experience. Really it's an excellent article. ☺
Pushparaj says: November 27, 2015 at 10:39 am
Kunal ji, Excellent article
Pushparaj says: November 27, 2015 at 10:45 am
I also completed my MCA in 2007, but my UG is B.Com. I am struggling to do something in Business Analytics.
SHASHANK says: December 02, 2015 at 6:04 pm
I am using the SVM function directly on 6 lakh (600,000) rows and it hangs. Should I code SVM line by line and then try, or is there some other method? Can RStudio handle huge data without Hadoop?
Vishanta says: December 07, 2015 at 11:55 am
Good info Kunal, thnx.
Himanshu says: October 01, 2016 at 1:31 pm
One of the best articles I have read. I am glad I read it. A big to-do list is in the pipeline now. Thank you so much!
