Get Knowledge from Best Ever Data Science Discussions on Reddit

avcontentteam 28 Jan, 2016
12 min read



While composing this enriching list of data science discussions, I found this awesome ‘poem’ drafted statistically. Isn’t it pretty cool? It is dedicated to every person who thinks this world is driven by data (including my team):

A curvy young belle, Billie Jean,
Makes a measure of each man she’s seen.
Yet the size of one sample
Was sufficiently ample
To skew median far from the mean.
The jealous lads started to hate him,
And the ladies all lined up to date him;
Not his charm, nor his suit
Sets the gals in pursuit,
But the infamous size of his datum.
That persistent young lass, Billie Jean,
Measured more, she was quite the machine!
Like a Bell, things now curved,
And all same, she observed,
Were the median, mode and the mean.
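
For the curious, the statistics behind the limerick can be checked in a few lines of Python. This is only a sketch with made-up numbers (both samples are assumptions for illustration): one extreme value pulls the mean far away from the median, while a larger, symmetric, bell-shaped sample brings the mean, median and mode together.

```python
from statistics import mean, median, mode

# Hypothetical measurements: a small sample with one extreme value.
small_sample = [5, 6, 6, 7, 30]
skewed_mean = mean(small_sample)      # pulled upward by the outlier
skewed_median = median(small_sample)  # barely affected by the outlier

# Hypothetical larger, symmetric ("bell-shaped") sample.
bell_sample = [4, 5, 5, 6, 6, 6, 7, 7, 8]
bell_mean = mean(bell_sample)
bell_median = median(bell_sample)
bell_mode = mode(bell_sample)

print(skewed_mean, skewed_median)
print(bell_mean, bell_median, bell_mode)
```

In the first sample the mean lands well above the median; in the second, all three measures coincide, just as Billie Jean observed.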

What does this poem suggest? Of course, I’ve shared it for a reason.

And the subtle reason is that there are many more interesting ways to learn subjects that are otherwise difficult. People learn data science and its related concepts using textbooks, guides, video tutorials, interactive websites and every other means at their disposal. Adding to this list, there’s another interesting and inspiring way to learn: ‘discussions’. Because of this need, we run a discussion portal on Analytics Vidhya as well. But today’s article focuses on a much larger community: Reddit.

Reddit, the Front Page of the Internet, is a community known for its candid and outspoken discussions. It is a platform where spam is despised and genuine content is welcomed. It has content related to everything under the sun; I’d be astonished if it didn’t. However, for people who are new to Reddit, reading or participating in Reddit discussions can be very intimidating at the start. It can take some time before you understand how the forum works.

But there is a ton of useful knowledge on Reddit to learn from, and people who are active on Reddit would agree. Hence, in this article, we have summarized some of the best discussions related to Machine Learning, Deep Learning, Neural Networks, Artificial Intelligence, Python, R Programming, Big Data and Statistics. I hope you benefit from it, especially if you don’t follow Reddit religiously or haven’t been able to fit into the community.


Table of Contents

  1. AMA with Top Data Scientists
  2. Top 9 Discussions on Data Science Tutorials / Books
  3. Top 5 Discussions on Career / Jobs in Data Science
  4. Top 7 Discussions on Machine Learning, Neural Networks, Deep Learning
  5. Top 6 Discussions on Python and R Programming
  6. Top 6 leisure reads on Machine Learning, Python, Statistics


AMA with Top Data Scientists

I would do injustice to this article if I didn’t start with AMAs. AMAs (Ask Me Anything) are an important part of the Reddit community: a setup where you can put questions to experts in a particular field. I suppose these people require no formal introduction. They are considered to be among the best brains working on the development of machine learning, deep learning, neural networks etc. If you ever seek inspiration, check out what these folks are up to. Below are the AMAs, with 3 key highlights of about 4 lines each:


1. AMA with Geoffrey Hinton


a) On human brain – ‘The brain has about 10^14 synapses and we only live for about 10^9 seconds. So we have a lot more parameters than data. This motivates the idea that we must do a lot of unsupervised learning since the perceptual input (including proprioception) is the only place we can get 10^5 dimensions of constraint per second’.

b) On his career – ‘My father was a Stalinist and sent me to a private Christian school where we had to pray every morning. From a very young age I was convinced that many of the things that the teachers and other kids believed were just obvious nonsense. That’s great training for a scientist and it transferred very well to artificial intelligence. But it was a nasty shock when I found out what Stalin actually did.’

c) On Dark Knowledge – ‘Yes, I invented the term “Dark Knowledge”. It’s inspired by the idea that most of the knowledge is in the ratios of tiny probabilities that have virtually no influence on the cost function used for training or on the test performance. So the normal things we look at miss out on most of the knowledge, just like physicists miss out on most of the matter and energy.’
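
The point about ‘ratios of tiny probabilities’ in (c) can be illustrated with a softmax. What follows is a minimal sketch, not code from the AMA: the logits and class names are invented, and raising the softmax temperature is assumed here as one way to make those small probabilities visible (this is the idea used in Hinton’s work on distilling knowledge from one network into another).

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature flattens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a picture of a dog; classes are (dog, cat, car).
logits = [10.0, 4.0, -2.0]

hard = softmax(logits)                   # nearly one-hot
soft = softmax(logits, temperature=5.0)  # small probabilities become visible

# "cat" is far more likely than "car" in both cases, but at temperature 1
# both probabilities are so small that they barely affect the loss.
ratio_hard = hard[1] / hard[2]
ratio_soft = soft[1] / soft[2]
print(hard, ratio_hard)
print(soft, ratio_soft)
```

At temperature 1 almost all of the probability sits on ‘dog’, yet the model still rates ‘cat’ hundreds of times more likely than ‘car’. That ratio is the kind of knowledge the usual metrics never look at.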


2. AMA with Yann LeCun


a) On Career in Deep Learning – ‘Read, learn from online material, try things for yourself. Take as many math and physics courses as you can, and learn to program. You have to figure out what’s important, know what to ignore, and know how to approximate. These are skills you need to conceptualize, model, and analyze ML models. Another set of courses that are relevant is signal processing, optimization, and control/system theory.’

b) On most overlooked things in machine learning – ‘Kernel methods are great for many purposes, but they are merely glorified template matching. There is nothing magical about margin maximization. It’s just another way of saying “L2 regularization” (despite the cute math). There is no opposition between deep learning and graphical models. Many deep learning approaches can be seen as factor graphs.’

c) On emotions in Robots – ‘Emotions do not necessarily lead to irrational behavior. They sometimes do, but they also often save our lives. If emotions are anticipations of outcome (like fear is the anticipation of impending disasters or elation is the anticipation of pleasure), or if emotions are drives to satisfy basic ground rules for survival (like hunger, desire to reproduce), then intelligent agents will have to have emotions.’
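
The remark in (b) that margin maximization is ‘just another way of saying L2 regularization’ can be read off the usual soft-margin linear SVM objective, which is a hinge loss plus an L2 penalty on the weights. The sketch below illustrates that reading under stated assumptions: the toy data, the weight vectors and the function names `hinge_loss` and `svm_objective` are all hypothetical.

```python
def hinge_loss(w, b, X, y):
    """Average hinge loss max(0, 1 - y * (w.x + b)) over the data."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        score = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        total += max(0.0, 1.0 - y_i * score)
    return total / len(X)

def svm_objective(w, b, X, y, lam):
    """Soft-margin SVM objective: hinge loss + lam * ||w||^2 (L2 penalty)."""
    l2_penalty = sum(w_j * w_j for w_j in w)
    return hinge_loss(w, b, X, y) + lam * l2_penalty

# Hypothetical, linearly separable toy data with labels in {-1, +1}.
X = [[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]

# Both weight vectors classify every point with zero hinge loss...
small_w = [0.5, 0.0]  # geometric margin 1/||w|| = 2.0
large_w = [5.0, 0.0]  # geometric margin 1/||w|| = 0.2

# ...so the L2 term alone prefers the smaller norm, i.e. the wider margin.
obj_small = svm_objective(small_w, 0.0, X, y, lam=0.1)
obj_large = svm_objective(large_w, 0.0, X, y, lam=0.1)
print(obj_small, obj_large)
```

Since the margin is 1/||w||, shrinking the weight norm and widening the margin are the same thing once the data is classified correctly.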


3. AMA with Andrew Ng and Adam Coates



a) Recommendation after Coursera ML course – ‘Here’re a few common paths: 1. Many people are applying ML to projects by themselves at home, or in their companies. This helps both with your learning, as well as helps build up a portfolio of ML projects in your resume (if that is your goal). If you’re not sure what projects to work on, Kaggle competitions can be a great way to start.’

b) Best place to work for ML & AI Engineers – ‘I think Baidu, Google and Facebook are all great places to work! But Baidu Research is very much a startup environment. With ~40 people in our Silicon Valley team, we also invest a lot in employee development. I think these things make the best possible combination for driving machine learning research, which is why both of us (Adam & Andrew) had decided to join Baidu.’

c) On key drivers of Deep Learning – ‘I think the two key drivers of deep learning are: – Rise of computation. Not just GPUs, but now the migration toward HPC (high performance computing, aka supercomputers). – Rise of availability of data, because of the digitization of our society, in which increasing amounts of activity on computers/cellphones/etc. creates data.’


4. AMA with Yoshua Bengio



a) On choice of academia – ‘I like academia because I can choose what to work on, I can choose to work on long-term goals, I can work for the benefit of humanity rather than for a specific company, and I can talk about my work freely. Note that to different degrees, my esteemed colleagues in large industrial labs also enjoy some of that freedom.’

b) On RNN – ‘Recurrent or recursive nets are really useful tools for modelling all kinds of dependency structures on variable-sized objects. We have made progress on ways to train them and it is one of the important areas of current research in the deep learning community. Examples of applications: speech recognition (especially the language part), machine translation, sentiment analysis, speech synthesis, handwriting synthesis and recognition, etc.’

c) On upcoming challenges in NLP – ‘I believe that the really interesting challenge in NLP, which will be the key to actual “natural language understanding”, is the design of learning algorithms that will be able to learn to represent meaning. There are also more computational challenges: we need to be able to train much larger models (say 10000x bigger), and we can’t afford to wait 10000x more time for training.’


5. AMA with Jürgen Schmidhuber



a) On the future of RNN – ‘The world of RNNs is such a big world because RNNs (the deepest of all NNs) are general computers, and because efficient computing hardware in general is becoming more and more RNN-like, as dictated by physics: lots of processors connected through many short and few long wires. Both supervised learning RNNs and reinforcement learning RNNs will be greatly scaled up.’

b) On his hobbies – ‘In my spare time, I am trying to compose music, and create visual art. And while I am doing this, it seems obvious to me that art and science and music are driven by the same basic principle. I think the basic motivation (objective function) of artists and scientists and comedians is data compression progress, that is, the first derivative of data compression performance on the observed history.’

c) On future of AI – ‘20 years from now we’ll have 10,000 times faster computers for the same price, plus lots of additional medical data to train them. I assume that even the already existing neural network algorithms will greatly outperform human experts in most if not all domains of medical diagnosis, from melanoma detection to plaque detection in arteries, and innumerable other applications.’


Top 9 Discussions on Tutorials / Books

1. I’m struggling to learn Machine Learning on my own. How should I overcome this hurdle?

Here you’ll find a complete guide to overcoming the recurring roadblocks you hit while mastering machine learning. Beyond the learning itself, you’ll be inspired by how other people have endured the same situation.

2. What are some good resources to learn Recurrent Neural Networks?

Here’s a list of all the resources people have found essential for learning Recurrent Neural Networks. You’ll also find people’s reviews and opinions of the respective resources.

3. Which is the best book for Machine Learning in Python?

Here’s a list of all the books essential to master the concepts of machine learning. If you like to learn from books rather than blogs / videos, you just can’t miss this.

4. How should I start learning Natural Language Processing?

People have learnt NLP using various resources available for free. They are listed in this discussion. Moreover, you’ll find other essential information on learning the subject.

5. How to install Deep Dream on Windows with a Vagrant dev environment?

Deep Learning algorithms are trained by giving them a huge number of images and telling them what object is in each image. Here’s a step-by-step tutorial on setting up your machine for deep learning.

6. What are the best resources for beginners to learn Big Data Analytics?

Considering the strong growth of the big data industry, a lot of people are deciding to enter it. Here’s a compiled list of useful resources that have helped other people make this move.

7. What books or papers are must read for every professional statistician?

If you are into statistical research and like to explore related concepts, you just can’t miss this discussion. Here you’ll find a comprehensive list of the best books and white papers on statistical learning.

8. List of Blogs on Machine Learning

As the name suggests, if you want to learn machine learning, here’s a list of top machine learning blogs which you can subscribe to right away.

9. What are some of the best Python videos to watch?

If learning from videos is what excites you, here’s a compiled list of the best Python videos that people have shared, along with their reviews. You’ll know which are the best among the best once you check this discussion.


Top 5 Discussions on Career / Jobs

1. List of commonly asked interview questions on Python

As the title suggests, here’s a compiled list of questions that candidates have been asked in their Python interviews. If you are lucky, you can also find their solutions in the threads.

2. Advice on industry jobs for PhDs in Machine Learning

Many people are opting to pursue a PhD in Machine Learning, particularly in the USA. Here, experienced ML professionals share advice that gives an overview of available and upcoming opportunities.

3. How can I start my career in Artificial Intelligence?

Deciding to work with artificial intelligence is undoubtedly a brave and lucrative decision. Here’s a learning path that highlights the best ways to start a career in AI.

4. What could be the areas of interest to start a PhD in Machine Learning today?

As mentioned above, with increased interest in PhDs, people are searching for topics and areas of interest to pursue. Here is a list of areas suggested as good candidates for research.

5. What are some of the best practices to learn and apply for an entry-level data analyst?

In one line, this discussion can answer all your apprehensions related to a career in data analytics. Experienced professionals from all over the world have generously contributed to provide the best learning path.


Top 7 Discussions on Machine Learning / Deep Learning / Neural Networks

1. Why is Lua such a popular language for Machine Learning?

Lua is widely used by Facebook, Google, Twitter etc. Here are some surprising reasons why the giants of the internet and social media world prefer this language over other top programming languages.

2. What’s so great about Extreme Learning Machines?

ELM is basically a 2-layer neural net in which the first layer is fixed and random, and the second layer is trained. Here’s a full description of this concept and its greatness.
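
To make that one-line description concrete, here is a minimal pure-Python sketch of the idea, not code from the discussion. It assumes tanh hidden units, Gaussian random weights and a small ridge term for the least-squares fit; the names `train_elm`, `hidden_layer`, `predict` and `solve_linear_system` are hypothetical helpers written for this illustration.

```python
import math
import random

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def hidden_layer(x, W, biases):
    """First layer: fixed random weights followed by tanh; never trained."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b0)
            for row, b0 in zip(W, biases)]

def train_elm(X, y, n_hidden=20, ridge=1e-6, seed=0):
    rng = random.Random(seed)
    W = [[rng.gauss(0, 1) for _ in X[0]] for _ in range(n_hidden)]
    biases = [rng.gauss(0, 1) for _ in range(n_hidden)]
    H = [hidden_layer(x, W, biases) for x in X]
    # Second layer: ridge least squares, (H^T H + ridge * I) beta = H^T y.
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(n_hidden)]
    beta = solve_linear_system(A, rhs)
    return W, biases, beta

def predict(x, W, biases, beta):
    """Output is a linear combination of the fixed hidden activations."""
    return sum(b * h for b, h in zip(beta, hidden_layer(x, W, biases)))

# XOR is not linearly separable, so it needs the nonlinear random features.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
W, biases, beta = train_elm(X, y)
predictions = [predict(x, W, biases, beta) for x in X]
print([round(p, 3) for p in predictions])
```

Note that no gradient descent is involved: the only ‘training’ is a single linear solve for the output weights, which is where the speed claims for ELMs come from.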

3. What are Deep Dream images? How do I make my own?

This is a complete installation guide to deep learning with essential instructions for dealing with images and related algorithms. Tutorials are also available.

4. Is deep learning basically just neural networks with multiple hidden layers in it?

The available answers are good enough to clear up all your confusion about deep learning, neural networks and the way they operate. People have explained it in the best possible manner. Check out which explanation suits you best.

5. Arrival of self driving cars by 2020 by using Deep Learning

This discussion is about the future world of automation and programming: the arrival of self-driving cars, the use of complex algorithms and the replacement of humans with robots.

6. New ‘deep learning’ technique enables robot mastery of skills via trial and error

Speaking of robots, here’s another piece of research, which introduces a new technique for making a robot better at day-to-day tasks using trial and error. The phase of robot evolution has started.

7. Artificial Intelligence will take over the World

This is a popular discussion that remained in the news for a long time. Here you’ll find an enriching perspective on the growth and impact of AI in industries and its repercussions for human existence.


Top 6 Discussions on R Programming / Python

1. R Users, what was something simple you learned late that you wish you learned early?

People tend to discover the most interesting things in a programming language after they have spent sufficient time on it. But if you have just started with R, you’re lucky! You might find this useful.

2. Those of you who regularly use both R and Python for statistical analysis – when do you use each, and why?

The ultimate winner between Python and R is best decided by their users, especially those who use both. Here’s a compiled list of opinions on the strengths of these languages.

3. I know R. How easy would it be to pick up SQL?

Even though you may not use it daily, companies do require SQL skills in candidates. Here is some useful career advice for you (if you already know R or Python).

4. What are the top 10 built-in Python modules that a new Python programmer needs to know in detail?

If you have recently learnt Python, you must check out this discussion. There are many Python modules that are quite helpful but that people don’t know about.

5. Experienced Python Users: What’s the most recent new thing you learned about the language?

Learning is a dynamic process. Here’s a compiled list of Python code snippets and tricks that people have discovered over the years and have now shared with the younger generation.

6. What are the most common misconceptions in Python?

People working on Python 2.x and Python 3.x unknowingly tend to mess up arguments, structures, loops etc. Here’s a complete list of minor but significant misconceptions that people have come across and since cleared up.


Top 6 Leisure Reads on ML, Python, Statistics

1. Explain Bayesian Statistical Methods to me like I’m a five-year-old

I found it quite interesting. The arduous concepts of Bayesian statistics have been explained in the simplest manner, such that anyone can understand them and never forget.

2. Course Review: How is Andrew Ng’s Stanford Machine Learning course?

Here’s a genuine, unbiased review of the famous Andrew Ng ML course on Coursera from various people who have taken it. This discussion should help you decide your next step.

3. 11 facts about Data Science that you must know

One of the facts is ‘You should embrace the Bayesian approach’, and the first discussion in this segment already explains it, which makes it even more important for you to learn. Do check out the remaining 10 facts.

4. List of Favourite Statistics Jokes

Rhyme like a statistician, joke like a statistician. You’ll find some hilarious yet intuitive jokes on statistics that you might not have heard yet.

5. How do you use Python to automate tasks in life or at work?

The role of Python is not limited to programming and data analysis; it goes much beyond that. Here people share how they use Python in their day-to-day tasks. You’ll be amazed!

6. What are some fun APIs and libraries to screw around with and learn from?

There are many APIs that you might be unaware of but that are quite useful and fun to play around with. You might like to put some of them to use and do something crazy.


End Notes

If you are reading this part, I’m sure you have found some amazing stuff to add to your bookmark list. After going through all these discussions, I realized there were so many things I was unaware of initially. That being said, there are many things you can’t learn from books, only from experience. And these discussions were filled with enriching experiences and opinions.

You might find this article a bit lengthy, but don’t worry, you are allowed to read it in parts.

According to you, which is the best discussion of all? Which discussions helped you the most? Do share your opinion in the comments section below. Also, if you follow our discussions, which were the most useful ones according to you?

If you like what you just read & want to continue your analytics learning, subscribe to our emails, follow us on Twitter or like our Facebook page.



Responses From Readers


Gaurav Kant Goel 05 Aug, 2015

Great collection! :)

hemanth 05 Aug, 2015

Help Help!!! :O :O I am a great fan of Analytics Vidhya, your team have a perfect approach in reaching people who are very new to analyitcs :) Taking into mind as you people have great exposure into analytics i came up with a life changing problem, a small suggestion is needed from you people. I've been placed in two firms as data science guy, one is a newly established firm where i am the only one with analytics knowledge and other is a reputed firm with an established data science team. Taking my future into consideration please suggest me which company to join for my better future in analytics and data science. Thanks in advance :)

Hari Prasad 25 Nov, 2017

Hi, Thanks for sharing the great information about Data Science..... Its useful and helpful information…Keep Sharing. Thanks Hari