Learn Big Data Analytics using Top YouTube Videos, TED Talks & other resources


Companies have invested heavily in Big Data over the last few years. This rise in the use of big data analytics has created high demand for skilled big data professionals. While the usefulness of this spending is still debated, there is a clear increase in Big Data jobs, as a quick search on Indeed shows.


Given the sharp increase in demand, big data has become a lucrative area in which to upskill. However, if you are someone like me, who needs a good overview and an understanding of the practical benefits before learning something more formally, you will probably struggle to find structured resources, as I did some time back.

There are a lot of technologies and terminologies associated with Big Data, which can act as an additional roadblock when you are getting started. Names like Hadoop, MapReduce, Spark, MongoDB and Hive take some time to get used to! After receiving an overwhelming response to my previous article on Top YouTube Videos on Machine Learning, Deep Learning, Neural Network, and seeing the lack of structured resources, the answer was simple! Here’s another YouTube special for Big Data aspirants: my take on immersing yourself in Big Data!

Disclaimer: We DO NOT intend to promote any brand or service through this article. The videos listed in this article are solely based on their relevance / usefulness to the audience. 


Who is expected to benefit most from watching these videos?

I have written this article with Big Data beginners in mind. Hence, it is best suited for candidates keen to start their career in big data analytics. If you are already an experienced big data professional, this article might not be what you are looking for! However, you can still draw ‘inspiration’ from the TED Talks listed below.

This article is structured to give a complete overview of the various technologies used in Big Data Analytics. The TED Talks at the beginning are meant to add a pinch of inspiration to your learning path. These talks invite you to imagine an exciting world driven by numbers, analytics and big data technologies.


TED Talks on Big Data

1. Introduction to Big Data by Hilary Mason, Chief Data Scientist at Bitly

Duration: 11:30 mins


Summary: In this short video, Hilary talks about the rise of big data and how it is going to impact our work environment. She also highlights the small but significant changes behind big data, in CPUs, data and algorithms. Later, she describes the profile of a data scientist in her own style and highlights applications of big data in our day-to-day lives.


2. Big Data, Small World: Kirk Borne at TEDxGeorgeMasonU

Duration: 22:00 mins


Summary: Dr. Kirk Borne begins by talking about his journey to becoming a data scientist. Later, he covers some of the best ideas behind data mining and how they can be applied to our daily lives. He also talks about the ‘small world phenomenon’ and ‘six degrees of separation’. Finally, he reveals some surprising statistics about big data which suggest that the future world will be driven by data.


3. Kenneth Cukier: Big Data is Better Data

Duration: 16:00 mins


Summary: Kenneth places great emphasis on using data available at the granular level. Every byte of data has something to reveal; all it requires is an engineer to discover it. He believes that, with the amount of information now available, we can answer questions that were difficult even to think of earlier. Data has made us more powerful, and it can be our greatest power if we put it to use wisely.


4. What to do with all this Big Data?

Duration: 12:29 mins


Summary: Susan believes ‘We are not the passive consumers of data and technology. Rather, we shape the data and make meaning from it’. In this short video, she shares her perspective on the rise of big data and the different ways of making the best use of it. Data doesn’t create meaning, we do. Data offers us a vast ocean of information which has to be churned to extract useful insights.


5. How to find the worst place to park in New York City using Big Data?

Duration: 11:53 mins


Summary: The title says it all. The speaker uses statistics and visualization to infer the worst place to park in NYC. To avoid missing any important information, he captures all the important variables in his graphical representations. If you ever wanted to see data put to use in real life, you shouldn’t miss it.


Get Introduced to Big Data Terminologies

I would highly recommend these YouTube videos to people who are new to big data analytics. Watching these quick videos (~3 mins each) will give you a clear overview of the different big data technologies and the relations between them.

1. What is HBase? (Duration – 3:00 mins)

2. What is Hadoop? (Duration – 3:12 mins)

3. What is MapReduce? (Duration – 2:39 mins)

4. What is HDFS? (Duration – 2:51 mins)

5. What is Flume? (Duration – 2:59 mins)

6. What is Pig? (Duration – 3:01 mins)

7. What is Hive? (Duration – 2:52 mins)

8. What is Avro? (Duration – 3:00 mins)

9. What is Oozie? (Duration – 2:28 mins)

10. What is Zookeeper? (Duration – 3:26 mins)


10 Tutorials on Big Data Analytics

1. Hadoop Crash Course Workshop

Duration – 55:32 mins


Summary: As the name suggests, this video covers Hadoop and related concepts in less than an hour. The speaker begins with a quick introduction to Hadoop, followed by an explanation of the Hadoop ecosystem and distributions, and HDFS in detail. Later, multiple components of Hadoop such as MapReduce, YARN and Tez are explained using some interesting stories. Finally, he winds up this crash course by revealing some of the less popular but very useful ways of accessing data.


2. Fundamentals of MapReduce

Duration – 32:03 mins


Summary: This is a complete tutorial for learning the basics of MapReduce. The series is divided into 5 parts, each covering a specific module of MapReduce. This introductory video gives a detailed overview of its importance, related job opportunities, applications and usage. As you work through the following parts, you will cover the essential fundamentals of MapReduce. Do check the Up Next section while you are there!
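
If you would like a feel for the idea before watching, here is a minimal sketch of the map, shuffle and reduce steps as a word count in plain Python. It is my own illustration, not taken from the video and not Hadoop's API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical, and everything runs in a single process, whereas a real cluster spreads these steps across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the list of counts for one word into a total."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    pairs = []
    for document in documents:
        # On a real cluster, each document (or block) would be mapped
        # on a different machine; here we simply loop.
        pairs.extend(map_phase(document))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data is everywhere"]
    print(word_count(docs))
    # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

The point of the split is that the map and reduce functions only ever see one document or one key at a time, which is what makes it possible to distribute the work.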


3. Enabling R on Hadoop

Duration – 40:25 mins


Summary: This tutorial teaches you how to integrate Hadoop with R. The speaker follows a step-by-step process for setting up R on Hadoop. Concepts like RHadoop, RHive and various related R libraries are discussed. He also discusses the varied uses of R and how R programming has evolved over the years.


4. Introduction to Deep Learning on Hadoop

Duration – 41:14 mins


Summary: The speaker beautifully explains the concept of deep learning using Hadoop. Deep learning is one of the most talked-about topics in the data science community. Scientists and researchers are working hard to discover new patterns using deep learning. The concept is explained in a simple manner in this video. Topics like deep belief networks and implementation on Hadoop / YARN are also discussed.


5. Introduction to Apache Cassandra

Duration – 1:15:06 hours


Summary: I have found very few videos on Apache Cassandra, but this one makes up for that. Here’s a complete introduction to Apache Cassandra from scratch. The rise of Apache Cassandra is catching the eye of companies and professionals across the world. In this video, the speaker explains the algorithms used, its essential features, its benefits and the reasons behind launching Apache Cassandra ~6 years back.


6. Introduction to PIG

Duration – 30:56 mins


Summary: This is a complete tutorial for learning about Pig. The instructor begins with an overview of Pig, followed by a comparison between Pig and SQL. Since the two are very similar, it makes for an interesting comparison. He also explains how to use Pig Latin. On top of all this, the basic steps of Pig installation are illustrated.


7. An overview of Apache Spark

Duration – 1:06:14 hours


Summary: This tutorial lives up to its title by teaching you about Spark and how it can help in shaping the world. It begins with a quick refresher on MapReduce, followed by Spark and the advantages of using this technology. The speaker has explained these concepts beautifully.


8. Introduction to Hive and HiveQL

Duration – 1:06:19 hours


Summary: Hive is built on top of Hadoop to provide data management, querying and analysis. This tutorial discusses Hive architecture, Hive operations and other related functions. It not only gives you the theoretical knowledge, but also shows the practical side by demonstrating the same operations in the terminal.


9. Introduction to NoSQL

Duration – 54:51 mins


Summary: This is one of the best videos I have come across on NoSQL databases. You’ll find an introduction to NoSQL databases along with all the other essential knowledge you need on the subject. This tutorial covers the applications, advantages, disadvantages, compatibility, usage, characteristics and various other essential features of NoSQL. I recommend this video to everyone.


10. MongoDB Tutorial for Beginners

Duration – 4:34:47 hours


Summary: If you ever longed to learn MongoDB, here is the complete resource for you. This tutorial comprehensively covers all aspects of MongoDB and NoSQL databases. Though it is quite long (> 4 hours), you can watch it in parts. Prior knowledge of JavaScript would be an advantage for learning MongoDB through this tutorial. It begins with an introduction to NoSQL databases, followed by an explanation of MongoDB, how to run MongoDB queries, Node.js, advanced data processing and a way to learn MongoDB on cloud Ubuntu.

Alternate resource: MongoDB course on Udacity


End Notes

If you have watched the videos listed above, you should be equipped with the essentials of Big Data by now. In this article, I have highlighted the most helpful YouTube videos and TED Talks I found on the internet. The videos listed are intended to build your big data basics and make your learning path easier.

If you wish to reap maximum benefit from these videos, I’d urge you to make notes and get your hands dirty while watching them. In case I have missed any important video, feel free to mention it in the comments section below.

If you like what you just read & want to learn more on Big Data, subscribe to our emails, follow us on Twitter or like our Facebook page.


  • Kamal T says:

    Wow! This is very close to what I was looking for.
    I just need a little clarification, is big data analytics different from business analytics, if yes then how?
    I’m currently going through statistics videos on khan academy. I hope that helps too?

    • Anon says:

      I’m a learner too. As far as I understand, Business Analytics need not necessarily involve huge amounts of data: you could simply be using the data from your factory or the sales figures of a (datawise) small- or medium-sized company, to increase productivity or project profits and things like that. And all of this data could potentially be stored in an ordinary MySQL DB, for example, and would not require the use of techniques like MapReduce to decrease computational resources.

      “BigData” refers to the phenomenon of companies like Google, Amazon and Facebook which have access to petabytes of new data every day, from which they want to extract patterns (ostensibly to serve their customers better 😉; in reality, to push more ads into their faces). The sizes of these datasets are extremely large, and the structure (or lack of one) doesn’t resemble simpler ones like factory output. With more companies tapping into the smartphone/tablet boom every day, even smaller players now have access to large data sets (but still relatively small when compared to Google, say), so they look for what are typically called “BigData Technologies”, like the ones covered by the videos listed in this blog post.

      • Kamal T says:

        So can I broadly say this ” A person equipped with big data tools might apply for a business analytics job but a business analytics person may not be able to apply for a big data job”
        From what I have understood business analytics would just mean using tools such as R, Python,etc with some techniques such as regression, random forest, etc (no idea what they are) to make some sense out of data.
        Big data on the other hand might require using all of the above with more sophistication since the amount of data is too large. Am I on the right track?

        • Anon says:

          ‘So can I broadly say this ” A person equipped with big data tools might apply for a business analytics job but a business analytics person may not be able to apply for a big data job”’

          Not necessarily. BigData is an umbrella term/buzz-word that connotes all things right from large-scale digital data acquisition, to data cleaning and storage, data analysis and business analytics (the last being involved in business decisions rather than the nitty-gritty of data wrangling). Someone like Kunal is better suited to answer the question of who would get which job.

          Regression, RandomForests etc. are statistical techniques that one applies to data to find patterns or predict values. Simply put they are black boxes which spit out numbers based on the data you’ve fed them. To make proper use of them requires both knowledge of statistics (a given) and the business domain (for context).

          R and Python are programming languages which are conducive to programming statistical models. One could use them to crunch numbers on anything from a table with 100,000 rows with definite values (not really big data) to one with millions of rows and hundreds of variables (big data), with a mixture of text, numbers and whatnot.

          ‘Big data on the other hand might require using all of the above with more sophistication since the amount of data is too large. Am I on the right track?’

          Yep. Large and largely ‘unstructured’.

          Of course, all of this is what I’ve learnt in the last few months by reading articles on the web and discussing it with people on Analytics Vidhya (blog and forum). I could be wrong about certain things, so I invite others to correct me if that’s the case.

  • Sushil says:

    Plz add a button “Add to reading list” to all your videos.

  • Kumar Chinnakali says:

    AV Team, AV is a kaizen tool for me.

    Wow, what a great collection. It’s dictionary. Hats off to AV team.

    Warm Regards,
    Kumar Chinnakali

  • Anon says:

    Hadley Wickham provides a non-hype idea of what the term Big Data means (to him) – see the last question:


  • Aditya says:

    I have 4 years of data visualization experience and want to learn Big Data, these are very helpful resources for getting a good start. Thanks for empowering!
