A Primer on Getting Started with Data Science for Beginners
This article was published as a part of the Data Science Blogathon.
“Data Scientist: the sexiest job of the 21st Century”.
“Data Science will become a USD 178 billion market by 2025.”
“Companies are investing hugely in data science.”
After completing my engineering degree and starting my job, I was continuously bombarded with these statements on the internet. I was puzzled, and just as Lord Buddha wanted to know life’s truth, I wanted to clear up my doubts. To find answers, I searched the internet and approached many people both inside and outside this domain.
In this article, I have compiled how I decided to start my journey into Data Science. We will discuss how to build your digital profile and how to learn and retain Data Science concepts. Are you confused about whether to start with Data Science or not? Don’t worry, I will share the tips and tricks that helped me clear up my own doubts.
Is Data Science real or just old wine in a new bottle?
Data science is the field of study that combines domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from data. Data science practitioners apply machine learning algorithms to numbers, text, images, video, audio, and more to produce artificial intelligence (AI) systems to perform tasks that ordinarily require human intelligence. In turn, these systems generate insights that analysts and business users can translate into tangible business value.
Data Science has always been among us. Excel, SQL, and statistics are its early tools. That does not make the field obsolete; Data Science keeps amazing us with new and updated magic tricks. Earlier we used to feed data into an Excel sheet and then plot graphs by hand. Nowadays data is stored automatically and graphs are plotted automatically with advanced visualization tools. Along the way, Data Science has given us many buzzwords like Machine Learning, Deep Learning, and AIOps, and it will continue to do so.
There’s nothing like one size fits all
To be honest, there is no definite path, and there should not be one. The real essence of data science is people from different backgrounds and technical specialties working together. Based on the journey I have covered so far, I could simply give you course names and tell you to complete those courses and projects, but that would feel like a burden. You would end up in a rat race of finishing courses and projects and eventually get exhausted. I am an electronics engineer whose final-year project was a facial recognition smart door built with a Raspberry Pi, and that interest led me to learn Machine Learning and Deep Learning. I did not follow a specific path either; I always went with what I liked and what kept me awake at night.
Learning Data Science: Yes or NO?
Many of you might be wondering whether a non-CS/IT person can learn Data Science. The answer is yes. A non-CS/IT person can learn Data Science, and a CS/IT person is under no compulsion to learn it.
No one needs to start DS, ML, or AI because of societal pressure. You may feel left out if you are not extensively preparing yourself for the AI wrath. But if you are good at what you do and love doing it, it is totally fine to keep doing it and to stay updated by reading news and blogs.
Just as biodiversity is needed to balance an ecosystem, tech diversity is needed for a prosperous community. We will always need mechanical engineers, electrical engineers, artists, web developers, app developers, content creators, doctors, filmmakers, CAs, athletes, and more.
How deep should you dive into Data Science?
In Data Science there is something for everyone, so set your fears aside and start training your brain’s model on it. The input features of this model are your current domain, programming interest, curiosity, enthusiasm, passion, and hard work.
So let’s look at some domains associated with Data Science that you can pursue:
- Data Visualization: If you are creative and love statistics, you can learn data visualization tools and become a Data Viz Engineer.
- Data Engineering and Data Warehousing: It deals with storing and querying the data for future analysis. Maintaining data is as important as making predictions. A good prediction model requires good quality data.
- Cloud and Distributed Computing: If you are an expert in IT networking then familiarize yourself with the data science project life cycle and you can design and deploy Data Science models for easy access.
- Business Intelligence and Strategy: If you are a domain expert then you are the backbone of the whole data science project. A BI Strategist is responsible for managing dashboards, reporting to stakeholders, testing and validating models, and documenting.
- IoT Developers: If you are into hardware and fond of building circuits and controllers, you can take on the part of gathering data with sensors and preparing it for real-time analysis or storage.
- Computer Vision: If you love image processing then you can apply deep learning concepts and work on automating processes and building object detection models.
- ML Engineer: Machine learning engineers feed data into models defined by data scientists. They’re also responsible for taking theoretical data science models and helping scale them out to production-level models that can handle terabytes of real-time data.
- NLP Engineers: An NLP engineer’s responsibilities include transforming natural language data into useful features, using NLP techniques, to feed classification algorithms.
In the future, many more new job profiles will come into the picture. The only thing that will keep you job-ready is continuous learning.
So what’s my guide to Machine Learning?
I enrolled in several self-paced courses. I did not restrict myself to the videos and assignments the courses provided. After some time, I would read the topic name and then learn about it from research papers, internet searches, books, and other sources. You can also pick any course that feels right and fits your budget. Look at how the content is delivered and what other services are offered. Below I share the roadmap I followed while preparing. I won’t bind you to a rigid timeline; you can follow it at your own pace.
Mathematics and Statistics
Learn the basics of statistics and brush up on your school and college mathematics.
- Derivative and Function minimization
- Vectors and Matrices
- Probability Distribution
- Random Variables
- Normal Distribution
- Hypothesis Testing
- Z-Test and T-Test
- Chi-squared Test
- ANOVA Test
A basic knowledge of the above topics is enough.
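To make one of the topics above concrete, here is a minimal sketch of a one-sample Z-test in Python, using only the standard library. The function name `one_sample_z_test`, the sample values, and the assumed population mean and standard deviation are all hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, population_mean, population_std):
    """Return (z statistic, two-sided p-value) for a one-sample Z-test.

    Assumes the population standard deviation is known.
    """
    n = len(sample)
    standard_error = population_std / sqrt(n)
    z = (mean(sample) - population_mean) / standard_error
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical measurements; the null hypothesis is "the true mean is 50"
sample = [52, 48, 55, 51, 49, 53, 50, 54]
z, p = one_sample_z_test(sample, population_mean=50, population_std=3)
print(f"z = {z:.3f}, p = {p:.3f}")
```

A small p-value (commonly below 0.05) would suggest rejecting the null hypothesis. With this made-up sample the evidence is not strong enough to do so.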
For programming, you can select any language of your choice. I chose Python. Some fundamental concepts to know:
- Data Types
- Object-Oriented Programming
- Exception Handling
Don’t worry if you are not comfortable with these concepts in the beginning. After ample practice, you will be confident.
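As a rough illustration of how these three concepts fit together in Python, consider the short sketch below. The `Dataset` class and `safe_average` helper are hypothetical names invented for this example, not part of any library.

```python
class Dataset:
    """A tiny container for numeric readings (hypothetical example class)."""

    def __init__(self, name, values):
        self.name = name            # data type: str
        self.values = list(values)  # data type: list of numbers

    def average(self):
        """Return the mean of the values, or raise ValueError if empty."""
        if not self.values:
            raise ValueError(f"dataset '{self.name}' is empty")
        return sum(self.values) / len(self.values)


def safe_average(dataset):
    """Exception handling: return None instead of crashing on empty data."""
    try:
        return dataset.average()
    except ValueError:
        return None


temps = Dataset("temperatures", [21.5, 22.0, 23.5])
empty = Dataset("empty", [])
print(safe_average(temps))
print(safe_average(empty))
```

The class bundles data with the behavior that operates on it, and the `try`/`except` block keeps one bad input from stopping the whole program.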
Understand how different algorithms work and how to implement them:
- Linear Regression
- Logistic Regression
- Decision Tree
- Random Forest
- Ensemble, Bagging, Boosting
- Naive Bayes
- K-Means Clustering
- Hierarchical Clustering
- Principal Component Analysis
- Support Vector Machine
- Time Series and Anomaly Detection
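Implementing an algorithm from scratch is one way to understand how it works. As a minimal sketch, here is simple linear regression (the first item in the list above) fitted with the ordinary least squares formulas in plain Python. The function names and the hours-versus-scores data are made up for illustration; in practice you would likely use an established library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Hypothetical data: hours studied vs. exam score
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)
print(f"slope = {slope}, intercept = {intercept}")
print(f"predicted score for 6 hours: {predict(6, slope, intercept)}")
```

Because the toy data lies exactly on a straight line, the fit recovers that line. Real data is noisy, and the fitted line then minimizes the sum of squared errors instead.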
If you are an experienced professional, you can start with advanced Excel and other data visualization tools like Power BI and Tableau. There are also many platforms where you can make ML predictions without any knowledge of coding.
If you are a fresher, start learning to program and complete a few courses on ML. Experiment a lot and keep working hard. A word of motivation for freshers: “If ML is your passion, then be like Batman: do your office work in the day and follow your passion at night.”
If you are in college, attend conferences, workshops, and tech fests, complete courses, participate in hackathons, meet as many people as you can, and most importantly, enjoy the process.
Retaining what you learn
Now that you are into learning, let me share some tips on how to retain what you learn:
- Hands-on implementation of projects. Start with a basic project and grow it. Integrate it with apps, deploy it on cloud platforms, and so on.
- Feynman Learning Technique:
  - Choose a concept you want to learn about.
  - Pretend you are teaching it to a student in grade 6.
  - Identify gaps in your explanation, then go back to the source material to understand it better.
  - Review and simplify.
- You can record videos of yourself explaining concepts and upload them to YouTube. In the end, you will have your own video notes to refer to.
You are not limited to YouTube; any social media platform works. I keep my digital notes on Instagram and Facebook.
Building your digital profile will not only help you retain things but also help you build connections with others in the industry. You can showcase your work, collaborate with others, and work on projects. This practice develops communication and overall interpersonal skills, which many people lack.
Add every bit of code you write to GitHub and write a nice README.md for it. Start blogging or create a website to showcase what you are doing (Google Sites is enough). Just digitally record everything you do. Blogger, Medium, a Facebook Page: you can write anywhere you want.
So this was my article discussing what data science is and how far one should go in learning it. I have also shown how to learn effectively. If you have any doubts or want suggestions on which books or blogs to follow, ask me in the comment section. Lastly, I would like to say: enjoy the process rather than binding yourself to courses.