MyStory: How I transitioned to Data Science after 6 years in Data Warehousing
Before I was introduced to Data Science, I had worked in data-intensive data warehousing for more than 6 years. In 2013, as part of a transformation initiative, I had the opportunity to work on a problem in which we were required to build a model to predict the probability of a customer buying a product. This widened my horizons.
Until then, I was under the impression that data could be used only for presenting or analyzing what had already happened. I did some research on Analytics and Predictive Modelling and realized that, even though it is an extension of BI, it requires a different skill set, i.e. Statistics, Machine Learning, SQL, and business acumen in the industry one is working in.
I had good knowledge of SQL and a bit of business acumen, so I was determined to learn Statistics and Machine Learning. In 2014, I enrolled in an online course by Edvancer, which gave me a solid foundation in Analytics using SAS and R.
I had plans to do an MBA but could not pursue it full time as I could not leave my job, and I did not want to do an MBA through distance learning. Based on my search pattern, Google recommended a few Analytics programs run by reputed business schools. I applied to IIM Bangalore and Great Lakes Institute of Management, Chennai, and was shortlisted by both.
I opted for Great Lakes because IIM Bangalore conducts most of its classes on weekends and I was not in a position to attend classes on Saturdays. Moreover, recordings of classes were not available at IIMB. I was not in Bangalore at that point, so attending classes over the web would have meant missing out on peer learning.
Great Lakes and Analytics Edge
I underwent the Great Lakes PGPBABI program during 2014-2015. As I am not an MBA graduate, the program helped me fill certain gaps in my career and gave me good exposure to, and a solid foundation in, Analytics.
After completing the PGPBABI program at Great Lakes, I did the “Analytics Edge” program on edX (this tip was shared by Kunal. Thanks Kunal!). As part of Analytics Edge, we were supposed to take part in a Kaggle competition that counted towards the grade. I was able to finish in the top 90 among 2932 contestants (top 3%), which gave a great boost to my confidence.
Applying What I Learnt
Now the time had come to apply what I had learned at my workplace. I mentioned my interest to my boss, who was happy to support me. The initial days were challenging because “In theory there is no difference between theory and practice, but in practice there is”.
To overcome this challenge, Analytics Vidhya (AV) and its community were a lot of help. I used to get my queries clarified on the discuss portal. What was even more helpful were the hackathons conducted regularly by AV. I made it a point not to miss any hackathon, even when I was travelling.
Hackathons allow you to benchmark yourself against others, get introduced to the wider community and make friends. Like many others, I started at the bottom but slowly improved, finishing within the top 10 many times and within the top 10% in a Kaggle competition.
For every hackathon, I try something new. Sometimes it works and sometimes I get my fingers burned by overfitting the public leaderboard. Each of my hackathon experiences is worth its weight in gold. Thomas Edison once remarked, “I did not fail 1000 times but found out 1000 ways how not to make a bulb”. My experience is similar.
I am happy with my current ranking in the top 10 on AV (at the time of writing this article), but I am trying my best to improve it. There is still a lot of distance to travel before reaching the heights of @SRK or @Vopani or others in the top 5. They will remain my inspiration to excel. By the time I reach that level, I am sure they will have doubled theirs.
These are some FAQs from people who are in their mid-careers and would like to make a transition to Data Science. I have answered to the extent I know. Please feel free to correct me if I am wrong.
How difficult/easy is it to transition?
There is no single answer to this question. In fact, if one has enough data, one can run a classification model to predict the probability of transition with variables such as the following (some that I can think of):
- Years of Experience (0-2, 2-5, 5-10, 10-15, 15-20 and > 20)
- Chance of Internal transition (Y/N)
- Previous experience in BI
- Expertise in R
- Expertise in Python
- Expertise in SAS
- Expertise in R and Python
- Domain Knowledge
- Strong Background in Marketing, Risk, Supply Chain, Finance etc.
- Expertise in SQL
- In depth understanding of statistical concepts
- Passion for Data Science
- Comfortable with numbers
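To make the idea of a classification model concrete, here is a minimal sketch of a logistic regression that outputs a probability of transition. Everything in it is an assumption for illustration only: the five yes/no feature names are a simplified subset of the list above, the data is randomly generated by an invented rule (no real candidates are involved), and the model is fitted with plain gradient descent rather than any particular library.

```python
import math
import random

# Hypothetical yes/no features, loosely based on the list above.
FEATURES = [
    "internal_transition",
    "bi_experience",
    "knows_r_or_python",
    "knows_sql",
    "statistics_depth",
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_data(n, seed=0):
    """Generate made-up candidates. Labels follow an invented rule, not real data."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = [rng.randint(0, 1) for _ in FEATURES]
        # Invented ground truth: each extra skill raises the chance of transition.
        p = sigmoid(-2.0 + 0.9 * sum(x))
        rows.append(x)
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


def predict_proba(w, b, x):
    """Predicted probability of transition for one candidate."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(rows, labels, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w = [0.0] * len(rows[0])
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(w, b, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


rows, labels = make_synthetic_data(400)
w, b = fit_logistic(rows, labels)

# Compare a candidate with none of the listed strengths to one with all of them.
p_none = predict_proba(w, b, [0, 0, 0, 0, 0])
p_all = predict_proba(w, b, [1, 1, 1, 1, 1])
print(f"P(transition | no strengths)  = {p_none:.2f}")
print(f"P(transition | all strengths) = {p_all:.2f}")
```

With real data, the experience bands and the remaining variables would be added as further columns, and the fitted weights would indicate which factors matter most.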
Does one need to meet all the above criteria to make a transition to Analytics? No, it depends on the total number of years of experience (relevant Analytics experience is assumed to be nil), as below (for MBA candidates, a strong background in Marketing, Risk, Supply Chain or Finance is assumed):
0-1 years or Freshers: Passion for Data Science, curiosity and comfort with numbers. Good to have expertise in SQL and an in-depth understanding of statistical concepts.
1-2 years: Freshers' criteria + expertise in either R, Python or SAS.
2-5 years: 1-2 years' criteria + in-depth understanding of statistical concepts. Chances of transition are higher if either an internal transition is possible or one has previous experience in BI. Good to have knowledge of both R and Python or experience in SAS (non data science).
5-10 years: 2-5 years' criteria + domain knowledge + knowledge of both R and Python or experience in SAS (non data science). Chances of transition are higher if either an internal transition is possible or one has previous experience in BI. If one is a non-MBA candidate, it is good to have a strong background in Marketing, Risk, Supply Chain or Finance.
> 10 years: All the criteria are applicable. One should be able to formulate one's own use cases and derive ROI, and should not leave anything to chance.
What works and what does not?
Internal transition is a much better way of making the transition. If one has more than 5 years of experience, having only completed a data science course will not help. One needs to substantiate one's knowledge and learn to apply it in one's own industry.
Those who have more than 10 years of experience should be capable of providing an end-to-end solution, i.e. starting from finding a use case to executing it and presenting the ROI to the stakeholders.
What kind of challenges can come up during the transition?
Losing patience, aversion to coding, lack of domain knowledge in the industry the candidate has worked in, etc.
One has to be patient, wait for opportunities and grab them when they present themselves. I have seen people who get demotivated after a few months and lose the will to pursue a data science career. Obviously, it also depends on how passionate the candidate is.
There are a few people who get trained in GUI-driven software like SAS Enterprise Miner or SPSS and are not willing to train themselves in R or Python because they are averse to writing code. They should overcome this aversion to stand a better chance.
I have seen that certain people, especially those coming from the software industry, lack the domain knowledge required to provide an end-to-end solution. They are comfortable with coding but do not have business acumen. This may be a big handicap, especially when one's total years of experience are high and relevant experience is low.
It is true that as the number of years of non-relevant experience increases, the chances of transition to data science diminish, but the probability never becomes zero. Remember!
Being at the right place at the right time with the right skills matters.
What would be your advice to people?
“Think like a CEO”. Do not restrict yourself to just model building. Get involved in all phases of Data Science, right from problem definition to ROI justification.
How much time, effort and resources are required to make the transition?
During the first year or initial period, imagine yourself training for the Olympics. If one is working, one has to put in 2 hours of study in the morning and another 2 hours in the evening. Work with a lot of data and be positive.
Accept failures as stepping stones to success. The time taken to transition would range anywhere from 6 months to 2 years, or sometimes more than 2 years.
How does industry view transitions?
Industry views favorably the transition of candidates who have less than 5 years of total non data science experience. For others with more than five years of total experience but little relevant experience in data science, it considers data science skills only as an add-on.
Data Science is not about new tools or technology. Some of the algorithms used in Data Science were conceived 30-40 years back. It is all about “Data to Decisions”. To be a good data scientist, you need a mixture of coding skills, data management skills, modeling skills and business skills.
Do not confine yourself to a tool or algorithm. There is no such thing as a good or bad tool or algorithm. Whatever works for the business to solve its problems and improve ROI is welcome. Transition for mid-career people takes time, and one should have patience and passion to make a successful transition.
Disclaimer: Our stories are published as narrated by the community members. They do not represent Analytics Vidhya’s view on any product / services / curriculum.