Get Knowledge from Best Ever Data Science Discussions on Reddit
While compiling this list of data science discussions, I found this awesome ‘poem’ written in statistical terms. Isn’t it pretty cool? It is dedicated to everyone who thinks this world is driven by data (including my team):
A curvy young belle, Billie Jean,
Makes a measure of each man she’s seen.
Yet the size of one sample
Was sufficiently ample
To skew median far from the mean.
The jealous lads started to hate him,
And the ladies all lined up to date him;
Not his charm, nor his suit
Sets the gals in pursuit,
But the infamous size of his datum.
That persistent young lass, Billie Jean,
Measured more, she was quite the machine!
Like a Bell, things now curved,
And all same, she observed,
Were the median, mode and the mean.
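For readers who want to see the poem's statistics in action, here is a small sketch in Python. The numbers are made up purely for illustration and the helper name `summarize` is my own. It shows one extreme value pulling the mean away from the median, and a symmetric, bell-like sample where the mean, median and mode coincide.

```python
import statistics

# Made-up measurements, purely for illustration (not taken from any real data).
typical = [10, 11, 12, 12, 13]
with_outlier = typical + [60]               # one "sufficiently ample" sample
bell_shaped = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # symmetric, with a single peak

def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )

for name, sample in [
    ("typical", typical),
    ("with outlier", with_outlier),
    ("bell-shaped", bell_shaped),
]:
    mean, median, mode = summarize(sample)
    print(f"{name}: mean={mean:.2f} median={median} mode={mode}")
```

The median barely notices the outlier while the mean jumps, which is the gap the first stanza jokes about.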
What does this poem suggest? Of course, I’ve shared it for a reason.
The subtle reason is that there are many more interesting ways to learn subjects that are otherwise difficult. People learn data science and its related concepts using textbooks, guides, video tutorials, interactive websites and every other means at their disposal. There is another interesting and inspiring way to learn: ‘discussions’. Because they are so useful, we run a discussion portal on Analytics Vidhya as well. But today’s article focuses on a much larger community: Reddit.
Reddit, ‘the Front Page of the Internet’, is a community known for its candid and outspoken discussions. It is a platform where spam is despised and genuine content is welcomed, and it has content on almost everything under the sun. However, for people who are new to Reddit, reading or participating in its discussions can be very intimidating at the start. It can take some time before you understand how the forum works.
But there is a ton of useful knowledge on Reddit, and people who are active there would agree. Hence, in this article, we have summarized some of the best discussions related to Machine Learning, Deep Learning, Neural Networks, Artificial Intelligence, Python, R Programming, Big Data and Statistics. I hope you benefit from it, even if you don’t follow Reddit religiously or can’t fit into the community.
Table of Contents
- AMA with Top Data Scientists
- Top 9 Discussions on Data Science Tutorials / Books
- Top 5 Discussions on Career / Jobs in Data Science
- Top 7 Discussions on Machine Learning, Neural Networks, Deep Learning
- Top 6 Discussions on Python and R Programming
- Top 6 leisure reads on Machine Learning, Python, Statistics
AMA with Top Data Scientists
I would do injustice to this article if I didn’t start with AMAs. AMAs (Ask Me Anything) are an important part of the Reddit community: a setup where you can ask questions to experts in a particular field. I suppose these people require no formal introduction. They are considered to be among the best brains working on the development of machine learning, deep learning, neural networks etc. If you ever seek inspiration, check out what these folks are up to. Below are the AMAs, with 3 key highlights from each:
a) On human brain – ‘The brain has about 10^14 synapses and we only live for about 10^9 seconds. So we have a lot more parameters than data. This motivates the idea that we must do a lot of unsupervised learning since the perceptual input (including proprioception) is the only place we can get 10^5 dimensions of constraint per second’.
b) On his career – ‘My father was a Stalinist and sent me to a private Christian school where we had to pray every morning. From a very young age I was convinced that many of the things that the teachers and other kids believed were just obvious nonsense. That’s great training for a scientist and it transferred very well to artificial intelligence. But it was a nasty shock when I found out what Stalin actually did.’
c) On Dark Knowledge – ‘Yes, I invented the term “Dark Knowledge”. It’s inspired by the idea that most of the knowledge is in the ratios of tiny probabilities that have virtually no influence on the cost function used for training or on the test performance. So the normal things we look at miss out on most of the knowledge, just like physicists miss out on most of the matter and energy.’
a) On Career in Deep Learning – ‘Read, learn from online material, try things for yourself. Take as many math and physics courses as you can, and learn to program. You have to figure out what’s important, know what to ignore, and know how to approximate. These are skills you need to conceptualize, model, and analyze ML models. Another set of courses that are relevant is signal processing, optimization, and control/system theory.’
b) On most overlooked things in machine learning – ‘Kernel methods are great for many purposes, but they are merely glorified template matching. There is nothing magical about margin maximization. It’s just another way of saying “L2 regularization” (despite the cute math). There is no opposition between deep learning and graphical models. Many deep learning approaches can be seen as factor graphs.’
c) On emotions in Robots – ‘Emotions do not necessarily lead to irrational behavior. They sometimes do, but they also often save our lives. If emotions are anticipations of outcome (like fear is the anticipation of impending disasters or elation is the anticipation of pleasure), or if emotions are drives to satisfy basic ground rules for survival (like hunger, desire to reproduce), then intelligent agent will have to have emotions.’
a) Recommendation after Coursera ML course – ‘Here’re a few common paths: 1. Many people are applying ML to projects by themselves at home, or in their companies. This helps both with your learning, as well as helps build up a portfolio of ML projects in your resume (if that is your goal). If you’re not sure what projects to work on, Kaggle competitions can be a great way to start.’
b) Best place to work for ML & AI Engineers – ‘I think Baidu, Google and Facebook are all great places to work! But Baidu Research is very much a startup environment. With ~40 people in our Silicon Valley team, we also invest a lot in employee development. I think these things make the best possible combination for driving machine learning research, which is why both of us (Adam & Andrew) had decided to join Baidu.’
c) On key drivers of Deep Learning – ‘I think the two key drivers of deep learning are: – Rise of computation. Not just GPUs, but now the migration toward HPC (high performance computing, aka supercomputers). – Rise of availability of data, because of the digitization of our society, in which increasing amounts of activity on computers/cellphones/etc. creates data.’
a) On choice of academia – ‘I like academia because I can choose what to work on, I can choose to work on long-term goals, I can work for the benefit of humanity rather than for a specific company, and I can talk about my work freely. Note that to different degrees, my esteemed colleagues in large industrial labs also enjoy some of that freedom.’
b) On RNN – ‘Recurrent or recursive nets are really useful tools for modelling all kinds of dependency structures on variable-sized objects. We have made progress on ways to train them and it is one of the important areas of current research in the deep learning community. Examples of applications: speech recognition (especially the language part), machine translation, sentiment analysis, speech synthesis, handwriting synthesis and recognition, etc.’
c) On upcoming challenges in NLP – ‘I believe that the really interesting challenge in NLP, which will be the key to actual “natural language understanding”, is the design of learning algorithms that will be able to learn to represent meaning. There are also more computational challenges: we need to be able to train much larger models (say 10000x bigger), and we can’t afford to wait 10000x more time for training.’
a) On the future of RNN – ‘The world of RNNs is such a big world because RNNs (the deepest of all NNs) are general computers, and because efficient computing hardware in general is becoming more and more RNN-like, as dictated by physics: lots of processors connected through many short and few long wires. Both supervised learning RNNs and reinforcement learning RNNs will be greatly scaled up.’
b) On his hobbies – ‘In my spare time, I am trying to compose music, and create visual art. And while I am doing this, it seems obvious to me that art and science and music are driven by the same basic principle. I think the basic motivation (objective function) of artists and scientists and comedians is data compression progress, that is, the first derivative of data compression performance on the observed history.’
c) On future of AI – ’20 years from now we’ll have 10,000 times faster computers for the same price, plus lots of additional medical data to train them. I assume that even the already existing neural network algorithms will greatly outperform human experts in most if not all domains of medical diagnosis, from melanoma detection to plaque detection in arteries, and innumerable other applications.’
Top 9 Discussions on Tutorials / Books
Here you’ll find a complete guide to overcoming the periodic roadblocks you hit while mastering machine learning. Beyond the learning itself, you’ll be inspired by how other people have endured such situations.
Here’s a list of all the resources people have found essential for learning Recurrent Neural Networks. You’ll also find people’s reviews and opinions of each resource.
Here’s a list of all the books essential to master the concepts of machine learning. If you like to learn from books rather than blogs / videos, you just can’t miss this.
People have learnt NLP using various resources available for free. They are listed in this discussion. Moreover, you’ll find other essential information on learning this concept.
Deep Learning algorithms are trained by giving them a huge number of images, and telling them what object is in each image. Here’s a step by step tutorial on setting up your machine for deep learning.
Given the steady growth of the big data industry, a lot of people are deciding to enter it. Here’s a compiled list of useful resources that have helped other people make this move.
If you are into statistical research and like to explore related concepts, you just can’t miss this discussion. Here you’ll find a comprehensive list of the best books / white papers on statistical learning.
As the name suggests, if you want to learn machine learning, here’s a list of top machine learning blogs which you can subscribe to right away.
If learning from videos is what excites you, here’s a compiled list of the best Python videos that people have shared, along with their reviews. You’ll know which are the best of the best once you check this discussion.
Top 5 Discussions on Career / Jobs
As the title suggests, here’s a compiled list of questions that candidates have been asked in their Python interviews. If you are lucky, you can also find their solutions in the threads.
Many people are opting to pursue a PhD in Machine Learning, prominently in the USA. Experienced ML professionals have shared advice that gives an overview of available and upcoming opportunities.
Deciding to work with artificial intelligence is undoubtedly a brave and lucrative choice. Here’s a learning path that highlights the best ways to start a career in AI.
As mentioned above, with increased interest in PhDs, people are searching for topics and areas of interest to pursue. Here is a list of possible research areas.
In one line, this discussion can answer all your apprehensions related to a career in data analytics. Experienced professionals from all over the world have generously contributed to provide the best learning path.
Top 7 Discussions on Machine Learning / Deep Learning / Neural Networks
Lua is being widely used by Facebook, Google, Twitter etc. Here are some surprising reasons why the giants of the internet and social media world prefer this language over other top programming languages.
ELM is basically a 2-layer neural net in which the first layer is fixed and random, and the second layer is trained. Here’s a full description of this concept and its greatness.
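To make the one-line description of ELM concrete, here is a minimal sketch in plain Python, under some loud assumptions: all function names (`make_elm`, `train_output_layer`, and so on) are hypothetical, the toy task (fitting a sine curve) is my own choice, and the output layer is trained with simple gradient descent to keep the example dependency-free, whereas ELM write-ups usually solve for the output weights in closed form with least squares. The key idea it illustrates is only what the description says: the first layer is random and never updated, and only the second layer is trained.

```python
import math
import random

def make_elm(n_inputs, n_hidden, seed=0):
    rng = random.Random(seed)
    # First layer: random weights and biases, never updated after creation.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                for _ in range(n_hidden)]
    b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    # Second layer: starts at zero; the last entry acts as the output bias.
    w_out = [0.0] * (n_hidden + 1)
    return w_hidden, b_hidden, w_out

def hidden_features(x, w_hidden, b_hidden):
    # tanh of a random projection, plus a constant 1.0 for the output bias.
    feats = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w_hidden, b_hidden)]
    return feats + [1.0]

def predict(x, model):
    w_hidden, b_hidden, w_out = model
    feats = hidden_features(x, w_hidden, b_hidden)
    return sum(w * h for w, h in zip(w_out, feats))

def mse(xs, ys, model):
    return sum((predict(x, model) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_output_layer(xs, ys, model, lr=0.05, epochs=500):
    w_hidden, b_hidden, w_out = model
    # The hidden layer is fixed, so its features are computed only once.
    feats = [hidden_features(x, w_hidden, b_hidden) for x in xs]
    n = len(xs)
    for _ in range(epochs):
        grad = [0.0] * len(w_out)
        for h, y in zip(feats, ys):
            err = sum(w * hj for w, hj in zip(w_out, h)) - y
            for j, hj in enumerate(h):
                grad[j] += 2 * err * hj / n
        # Only the second-layer weights are ever updated.
        for j in range(len(w_out)):
            w_out[j] -= lr * grad[j]
    return model

if __name__ == "__main__":
    xs = [[i / 10] for i in range(-20, 21)]
    ys = [math.sin(x[0]) for x in xs]
    model = make_elm(n_inputs=1, n_hidden=8)
    print("MSE before training:", mse(xs, ys, model))
    train_output_layer(xs, ys, model)
    print("MSE after training: ", mse(xs, ys, model))
```

Because the hidden layer never changes, training reduces to a linear regression on random features, which is why ELMs are quick to fit.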
This is a complete installation guide to deep learning with essential instruction for dealing with images and related algorithms. Tutorials are also available.
The available answers are good enough to clear up your confusion about deep learning, neural networks and the way they operate. People have explained them in the best possible manner. Check out which explanation suits you best.
This discussion is about the future world of automation and programming: the arrival of self-driving cars, the use of complex algorithms and the replacement of humans with robots.
Speaking of robots, here’s a piece of research that introduces a new technique to make a robot better at performing day-to-day tasks using trial and error. The phase of robot evolution has started.
This is a popular discussion that remained in the news for a long time. Here you’ll find an enriching perspective on the growth and impact of AI in industries and its repercussions for human existence.
Top 6 Discussions on R Programming / Python
People tend to discover the most interesting features of a programming language only after they have spent sufficient time with it. But if you have just started with R, you’re lucky! You might find this useful.
The ultimate winner among python and R can best be decided by their users, especially by those who use both. Here’s a compiled list of opinions for the greatness of these languages.
Even though you may not use it daily, companies do require SQL skills in candidates. Here’s some useful career advice for you (if you already know R or Python).
If you have recently learnt python, you must check out this discussion once. There are many python modules which are surely quite helpful but people don’t know about them.
Learning is a dynamic process. Here’s a compiled list of Python code snippets and tricks that people have discovered over the years and have now shared with the younger generation.
People working with Python 2.x and Python 3.x tend to unknowingly mix up arguments, structures, loops etc. Here’s a complete list of minor but significant misconceptions that people have come across and have now cleared up.
Top 6 Leisure Reads on ML, Python, Statistics
I found it quite interesting. The arduous concepts of Bayesian statistics have been explained in the simplest possible manner, so that anyone can understand them and never forget.
Here’s a genuine, unbiased review of the famous Andrew Ng ML course on Coursera from various people who have taken it. This discussion should help you decide your next step.
One of the facts is ‘You should embrace the Bayesian approach’, and the first discussion of this segment already explains it, which makes it even more important for you to learn. Do check out the rest of the 10 facts.
Rhyme like a statistician, joke like a statistician. You’ll find some of the hilarious yet intuitive jokes on statistics which you might not have heard of yet.
The role of Python is not limited to programming and data analysis, but goes much beyond that. Here people are sharing how they use Python in their day-to-day tasks. You’ll be amazed!
There are many APIs which you might be unaware of that are quite useful and fun to play around with. You might like to put any of them to use and do something crazy.
If you are reading this part, I’m sure you have found some amazing stuff to add to your bookmarks. After going through all these discussions, I realized there were so many things I had been unaware of. That said, there are many things which you can’t learn from books but only from experience. And these discussions were filled with enriching experiences and opinions.
You might find this article a bit lengthy, but don’t worry, you are allowed to read it in parts.
According to you, which is the best discussion of all? Which discussions helped you the most? Do share your opinion in the comments section below. Also, if you follow our discussions, which were the most useful ones according to you?