Before creation, God did just pure mathematics. Then he thought it would be a pleasant change to do some applied.
-John Edensor Littlewood
Mathematics and statistics are the foundations of data science and machine learning. Most of the successful data scientists I know come from one of these areas – computer science, applied mathematics and statistics, or economics. If you wish to excel in data science, you must have a good understanding of basic algebra and statistics.
However, learning maths can be intimidating for people without a background in mathematics. First, you have to identify what to study and what not to. The list can include linear algebra, calculus, probability, statistics, discrete mathematics, regression, optimization and many more topics. What do you do? How deep do you want to go in each of these topics? It is very difficult to navigate through this by yourself.
If you have faced this situation before – don’t worry! You are at the right place now. I have done the hard work for you. Here is a list of popular open courses on Maths for Data science from Coursera, edX, Udemy and Udacity. The list has been carefully curated to give you a structured path to teach you the required concepts of mathematics used in data science.
Get started now to learn & explore mathematics for data science.
To help you navigate through the courses, I have divided the article into beginner, intermediate and advanced sections. Choose your level of expertise in mathematics before delving further. I have also added the prerequisites for each course, so you can check whether you know these topics before starting the course.
A few courses may require you to finish the preceding course for better understanding. So, make sure that you either know the subject or have taken these courses.
Read on to find out the right course for you!
Duration: 4 weeks
Led by: Duke University (Coursera)
If you are a beginner with very minimal knowledge of mathematics, then this course is for you. In this course, you will learn about concepts of algebra like set theory, inequalities, functions, coordinate geometry, logarithms, probability theory and many more.
This course will take you through all the basic maths skills required for data science and provide a strong foundation.
The course starts on 9 Jan 2017 and is led by professors from Duke University.
Prerequisites: Basic maths skills
Duration: 8 weeks
Led by: Udacity
This course by Udacity is an excellent beginner's guide for learning statistics. It is fun, practical and filled with examples. The Descriptive Statistics course will first make you familiar with different terms of statistics and their definitions. Then you will learn about statistics concepts like central tendency, variability, standard normal distribution and sampling distribution.
This course doesn’t require any prior knowledge of statistics and is open for enrollment.
Prerequisites: None
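To give a feel for the topics this course covers, here is a minimal sketch of central tendency and variability in Python. This is my own illustration and not material from the course; the `page_views` numbers are made up, and it uses only Python's standard `statistics` module.

```python
import statistics

# Hypothetical sample (made-up numbers): daily page views for a small blog.
# The last value is a deliberate outlier.
page_views = [12, 15, 15, 18, 20, 22, 95]

# Central tendency: three different ways to describe a "typical" day.
mean_views = statistics.mean(page_views)      # pulled upward by the outlier
median_views = statistics.median(page_views)  # middle value, robust to the outlier
mode_views = statistics.mode(page_views)      # most frequent value

# Variability: sample standard deviation (divides by n - 1).
sample_std = statistics.stdev(page_views)

# Standardizing a value: how many standard deviations the outlier
# lies above the mean (its z-score).
z_outlier = (page_views[-1] - mean_views) / sample_std

print(f"mean={mean_views:.2f} median={median_views} mode={mode_views}")
print(f"sample std={sample_std:.2f} z-score of outlier={z_outlier:.2f}")
```

Notice how a single extreme value drags the mean well above the median. That gap is one reason you learn more than one measure of central tendency.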
Duration: 8 weeks
Led by: Udacity
After you have gone through the Descriptive Statistics course, it is time for Inferential statistics. The same practical approach to the subject continues in this course.
In this course, you will learn concepts of statistics like estimation, hypothesis testing, t-test, chi-square test, one-way Anova, two-way Anova, correlation, and regression.
There are problem sets and quiz questions after each topic. You will also be able to test your learning on a real-life dataset at the end of the course. The course is open for enrollment.
Prerequisites: Complete understanding of Descriptive Statistics (the course mentioned above)
Alternate Course: You can also look at Statistics: Unlocking the World of Data. It is a 6-week course run by the University of Edinburgh (edX).
Duration: 5 weeks
Led by: Duke University (Coursera)
This course will give you hands-on experience in data visualization and numeric statistics using R and RStudio.
The course will first take you through basics of probability and data exploration to give a basic understanding to get started. Then, it will individually explain various concepts under each topic in detail. At the end, you will be tested on a data analysis project using a real-world dataset.
The course is led by a professor of statistics at Duke University and is also a prerequisite for the Statistics in R specialization. If you are looking to learn R for data science, then you must take this course. The course is open for enrollment.
Prerequisites: Basic Statistics and knowledge of R
Duration: 1 week
Led by: Davidson College (Udemy)
As the name suggests, this course tells you how maths is being used everywhere from Angry Birds to Google. It is a fun approach to applied mathematical concepts.
In this course, you will learn how equations of lines are used to create computer fonts, how graph theory plays a vital role in Angry Birds, how linear systems model the performance of a sports team, and how Google uses probability and simulation to lead the race in search engines.
The course is led by a mathematics professor at Davidson College and is open for enrollment.
Prerequisites: Understanding of linear algebra and programming
Duration: 6 weeks
Led by: Purdue University (edX)
This course is designed for anyone looking for a career in data science & information science. It covers essentials of mathematical probabilities.
In this course, you will learn the basic concepts of probability, random variables, distributions, Bayes Theorem, probability mass functions and CDFs, joint distributions and expected values.
Once you are familiar with the basics, you will learn about advanced concepts such as the Bernoulli and Binomial distributions, Geometric distribution, Negative Binomial distribution, Poisson distribution, Hypergeometric distribution and discrete uniform distribution.
After taking this course you will have a thorough understanding of how probability is used in everyday life. The course is open for enrollment.
Prerequisite: Basic Statistics
Duration: 4 weeks
Led by: Johns Hopkins University (Coursera)
Honestly, the “Bio” in “Biostatistics” is misleading. This course is all about fundamental probability and statistics techniques for data analysis.
The course covers topics on probability, expectations, conditional probabilities, distributions, confidence intervals, bootstrapping, binomial proportions, and logs.
Prior knowledge of linear algebra and programming will be advantageous but is not mandatory to begin this course. The course starts on 16 Jan 2017 and is led by a biostatistics professor at Johns Hopkins University.
A well-paced course with a complete introduction to mathematical statistics.
Prerequisites: Basic Linear algebra, calculus and programming useful but not mandatory
Duration: 5 weeks
Led by: Davidson College (edX)
This is an interesting course on applications of linear algebra in data science.
The course will first take you through the fundamentals of linear algebra. Then, it will introduce you to applications of linear algebra such as recognizing handwritten numbers and ranking sports teams, along with online codes.
The course is open for enrollment.
Prerequisite: Basic linear algebra
Duration: 8 weeks
Led by: Stanford University (Coursera)
In this mathematical thinking course from Stanford, you will learn how to develop analytical thinking skills. The course teaches you interesting ways to develop out-of-the-box thinking and helps you remain ahead of the competitive curve.
In this course, you will learn about analysis of a language, quantifiers, brief introduction to number theory and real analysis. To make the most of this course one must have familiarity with algebra, number system and elementary set theory.
The course starts from 9 Jan 2017 and is led by professors at Stanford. It is open for enrollment.
Prerequisites: Basic algebra, number system and elementary set theory.
By this time, you know all the basic concepts a data scientist needs to know. This is the time to take your mathematical knowledge to the next level.
Duration: 4 weeks
Led by: University of California (Coursera)
Bayesian Statistics is an important topic in data science. For some reason, it does not get as much attention as it deserves.
In this course, the first section covers basic probability topics like conditional probability, probability distributions and Bayes Theorem. Then you will learn about statistical inference for both the Frequentist and Bayesian approaches, methods for selecting prior distributions, models for discrete data and Bayesian analysis for continuous data.
Prior knowledge of statistics concepts is required to take this course. The course starts from 16 Jan 2017.
Prerequisite: Basic & Advanced Statistics
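If Bayes Theorem is new to you, the classic screening-test example shows why it matters. The sketch below is my own illustration and not taken from the course; the function name `posterior` and the three rates are assumptions chosen only to make the arithmetic easy to follow.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Bayes Theorem: P(condition | positive test).

    prior               -- P(condition) before seeing the test result
    sensitivity         -- P(positive | condition)
    false_positive_rate -- P(positive | no condition)
    """
    # Total probability of a positive result (the "evidence").
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


# Hypothetical numbers: 1% of people have the condition, the test
# catches 95% of true cases and wrongly flags 5% of healthy people.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(condition | positive test) = {p:.3f}")
```

Even with a fairly accurate test, the posterior probability stays low because the condition is rare to begin with. This is the role the prior plays in Bayesian inference.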
Duration: 8 weeks
Led by: Stanford University and University of British Columbia (Coursera)
Game theory is an important component of data science. In this course, you will learn about the basics of game theory and its applications. If you are looking to master reinforcement learning this year, this course is a must for you.
The course provides basic understanding of representing games and strategies, the extensive form (which computer scientists call game trees), Bayesian games (modeling things like auctions), repeated and stochastic games. Each concept has been explained with the help of examples and applications.
The course is led by professors from Stanford University and The University of British Columbia. The course is open for enrollment.
Prerequisite: Basic probability and mathematical thinking
Duration: 5 weeks
Led by: Stanford University and The University of British Columbia (Coursera)
After going through the basics of Game theory in the previous course, this course is on the advanced applications of game theory.
You will learn about how to design interactions between agents in order to achieve good social outcomes. The three main topics covered are social choice theory, mechanism design, and auctions. The course starts from 30 Jan 2017 and is led by professors from Stanford University & The University of British Columbia.
The course is open for enrollment.
Prerequisite: Basics of Game Theory
Duration: 4 weeks
Led By: Harvard University (edX)
Matrix algebra is used in various tools for experimental design and analysis of high-dimensional data.
For easy understanding, the course has been divided into seven parts to give you a step-by-step approach. You will learn about matrix algebra notation & operations, application of matrix algebra to data analysis, linear models and QR decomposition.
The language used throughout the course is R. Feel free to choose which part of the course caters more to your interest and take the course accordingly.
The course is conducted by biostatistics professors at Harvard University and is open for enrolment now.
Prerequisite: Basic Linear algebra and knowledge of R
Duration: 6 weeks
Led by: Johns Hopkins University (Coursera)
This course is the first part of a two-part series on advanced linear statistical learning models. If you have an understanding of regression models and are looking to explore this topic further, you must take this course.
In this course, you will learn about one & two parameter regression, linear regression, general least squares, least squares examples, bases & residuals.
Before you proceed further, let me be clear: to take this course you need a basic understanding of linear algebra & multivariate calculus, statistics & regression models, familiarity with proof-based mathematics and a working knowledge of R. The course starts on 23 Jan 2017.
Prerequisite: Linear Algebra, calculus, statistics and knowledge of R
Duration: 6 weeks
Led by: Johns Hopkins University
This is the second part of the course on advanced linear statistical learning models. If you have an understanding of regression models and are looking to explore this topic further, you must take this course.
In this course, you will learn about the basics of statistical modeling, the multivariate normal distribution, distributional results, and residuals.
Before you proceed further, let me be clear: to take this course you need a basic understanding of linear algebra & multivariate calculus, statistics & regression models, familiarity with proof-based mathematics and a working knowledge of R. The course starts on 23 Jan 2017.
Prerequisite: Linear Algebra, calculus, statistics and knowledge of R
Duration: 8 weeks
Led by: University of Notre Dame (edX)
I am someone who is very curious to know how mathematics can be used to drive deeper insights in sports and everyday life.
I came across this course, which shows how your favorite sport uses mathematics to analyze data and know the trends, performance of players and their fellow teams.
In this course, you will learn how inductive reasoning is used in mathematical analysis and how probability is used to evaluate data and assess the risk and outcomes of any event.
All the major team sports, athletic sports, and even extreme sports like mountain climbing have been covered in the course. The course is led by professors of the University of Notre Dame and is currently open for enrolment.
Prerequisite: Statistics & Linear Algebra
Bravo, by now – you would be on your own. You would have developed a knack for mathematics & statistics and would feel confident about continuous learning – way to go!
Duration: 8 weeks
Led by: University of Melbourne (Coursera)
Every industry & company makes use of optimization. Airlines use optimization to ensure a fixed turnaround time, and e-commerce companies like Amazon use it for on-time delivery of products. Macro-level applications of optimization include delivering electricity to millions of people, paving the way for new medical drug discoveries and many more.
This course gives you a thorough understanding of discrete optimization and how it is being used in our everyday lives. First, it will take you through the fundamentals of discrete optimization and its various techniques. You will learn about constraint, linear and mixed integer programming. The last section of the course includes advanced topics on optimization.
The prerequisites to take this course are good programming skills, knowledge of fundamental algorithms, and linear algebra. The course starts on 16 Jan 2017 and is conducted by professors at the University of Melbourne.
Prerequisite: Programming, algorithms and linear algebra
Duration: 4 weeks
Led by: Johns Hopkins University
If you aspire to become a next-generation sequencing data scientist, then you must take this course.
In this course, you will learn about exploratory analysis, linear modeling, hypothesis testing & multi-hypothesis testing, and different types of studies like RNA-seq, GWAS, ChIP-Seq, and DNA Methylation studies. This course is part of the Genomic Data Scientist specialization from Johns Hopkins. The course starts on 16 Jan 2017.
Prerequisite: Advanced Statistics and algorithms
Duration: 8 weeks
Led by: UTMB Health (edX)
This course is an introduction to data analysis using biomedical big data.
In this course, you will learn about fundamental components of biostatistical methods. Working with biomedical big data can pose various challenges for someone not familiar with statistics.
Learn how basic statistics is used with biomedical data types. You will learn about the basics of R programming, how to create & interpret graphical summaries of data, and inferential statistics for parametric & non-parametric methods. It will give you hands-on experience in R with biomedical problem types.
The course is open for enrolment.
Prerequisite: Advanced statistics and knowledge of R
I hope you found this article useful. By now, you would have identified the learning areas for yourself. If you are from a mathematics background, you can choose the right courses for yourself. On the other hand, if you do not have a mathematics background, then start from the beginner section and move ahead.
If you have taken any of these courses, let us know your feedback about them. Share your opinions with me and other users through the comments below. Through this article I wanted to provide you with a list of resources available at your disposal in mathematics for data science. I hope you make good use of them.
Excellent compilation.
Awesome and excellent post. Thank you very much. I'm wondering if you could give me some advices also for time series, econometrics and financial analysis courses
Hi Luis, I'm glad you found it helpful. These resources will be helpful for you. 1. Econometrics: Methods and Applications 2. The Language and Tools of Financial Analysis 3. For time series you can go through this article https://www.analyticsvidhya.com/blog/2015/12/complete-tutorial-time-series-modeling/
Hi Swati, Great article for beginners like me. But I had to google for what MOOC (Massive Open Online Courses) actually means.
Hey Swati, Amazing info. Extremely helpful. I am embarking on DS journey and this is going to be the stepping stone.
Nice article! Few more useful courses: Statistics in Medicine (Stanford Online) by Kristin Sainani Probabilistic Graphical Models 3 part series (Coursera) by Daphne Koller
Advanced Calculus helps in understanding Multivariate Analysis
Hi Swati, Very useful article and came up at right time for me. I was looking for brushing up all mathematics and statistics to make my journey to data science much smoother. But not aware what learn and what to leave. With this article you have given just that. Thanks a lot!!!!
Hi Swati, A very useful article for people at all levels of preparation, but i am in the middle stage like i completed intro to descriptive statistics and intro to inferential statistics course earlier there i learned only about the linear regression, can i go along furthur and learn other regression techniques or follow this blog i am really confused, so please respond me.
Hi Shivanesh, Follow this learning path, https://www.analyticsvidhya.com/blog/2017/01/the-most-comprehensive-data-science-learning-plan-for-2017/ It is the most comprehensive resource, which will help you follow the right path for learning data science.
Hi Swati, It is of immense help for someone who wants to start their journey in data science...
Just what I need to start my journey into data science. Thanks a bunch Swati Kashyap
Really Great Article on how to start Data Science Analytics journey.
Thanks for the list. I've come across mathxplain.com which has all the above topics for free if anyone has a low budget :)
Hi Swati, Nice collection of articles you have there. I am at 3rd MOOC, would you recommend me MOOC which uses python instead of R? I am on Python track that's why the question.
thank you very much for your post but i think calculus is very important for data science so any advice ! Thanks
Is there a good course for calculus- from single variable to multivariable?