Working with Python has always been a good experience for me, not just because of its easy syntax but also because of its phenomenal community support. I must admit that Python has always surprised me with how widely it is used in industry.
PyCon 2016 was held from May 28th to June 5th in Portland, Oregon. It featured an amazing series of Python tutorials and talks, and the panel of speakers was second to none. The creator of Python, Guido van Rossum, also delivered a keynote on the state of Python. For people like you and me who couldn’t attend the conference, I’ve created this post so that you can take part in this knowledge fest too.
Whether you are a novice, intermediate or expert Python user, and whether you work in web development or data science, PyCon had something for everyone. Topics such as Bayesian statistics, deep learning, data cleaning, text mining and machine learning were discussed.
From a data science perspective, I’ve listed the most useful tutorials from PyCon 2016. For your convenience, I’ve also added a short summary of each video. You’ll find the tutorials in two forms: workshops and talks. Workshops are longer and explain concepts in detail. Talks are shorter.
Note: I would like to sincerely thank PyCon for generously sharing such enriching content from PyCon 2016 on YouTube.
List of Workshops
1. Keynote by Guido van Rossum
Duration – 42:13 mins
It would be unfair if we didn’t devote some time to listening to the creator of Python. As an entrepreneur, I’ve always found more excitement in building products than in just using them. Having created such a magnificent product, van Rossum shares his experience of guiding the evolution of the Python programming language and where it is headed in the near future. He also talks about Python 2.7, 3.5, 3.6 and various other developments which you, as a Python user, should know about.
2. Essential Data Science Skills For Every Programmer
Duration – 3:23:19 hrs
This is a must-watch tutorial for anyone aspiring to learn Python for data science. It’s a beginner-level tutorial in which Andy Terrel takes you through hands-on practice on data sets using Pandas, Scikit-learn and other PyData tools. You’ll become familiar with data munging, modeling and methods to make predictions. Towards the end of the tutorial, you’ll also learn about building interactive data visualizations which can be deployed on the web.
3. The Fellowship of the Data
Duration – 1:52:12 hrs
We always talk about building predictive models in Python on available data sets. But have you ever wondered about collecting data using Python? That’s why I’m in love with Python: it can do much more than you might imagine. This tutorial teaches various methods of collecting, storing and organizing data using Python. The good thing is that you learn by doing. By the end of this tutorial, you should be able to collect, store and merge data in one pipeline using Python.
4. Computational Statistics
Duration – 2:29:18 hrs
Statistics and math are two things which a data scientist must be good at. Hence, if you are an aspiring data scientist, you must watch this video. This tutorial introduces you to traditional yet powerful statistical methods such as estimation, hypothesis testing and Monte Carlo simulation. Rather than theoretical explanations, the focus is on learning through practical exercises.
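To give a flavour of hypothesis testing by simulation, here is a minimal sketch of my own (it is not taken from the tutorial). It assumes a made-up experiment of 250 coin flips with 140 heads, and the function name `simulated_p_value` is purely illustrative. It estimates how often a fair coin would produce a result at least that far from the expected 125 heads.

```python
import random


def simulated_p_value(n_flips, observed_heads, n_trials=10_000, seed=42):
    """Estimate a two-sided p-value for a fair coin by Monte Carlo simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    expected = n_flips / 2
    observed_dev = abs(observed_heads - expected)
    extreme = 0
    for _ in range(n_trials):
        # Simulate one experiment with a fair coin
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        # Count runs at least as far from the expectation as the observed one
        if abs(heads - expected) >= observed_dev:
            extreme += 1
    return extreme / n_trials


# Hypothetical data: 140 heads in 250 flips
p = simulated_p_value(n_flips=250, observed_heads=140)
print(f"simulated two-sided p-value: {p:.3f}")
```

The appeal of this style is that the only statistical idea you need is "simulate the null hypothesis and count", with no lookup tables or closed-form formulas.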
5. Cleaning and Prepping Data
Duration – 1:55:07 hrs
Most of the time, the data sets you get for model building are messy, i.e. they contain invalid values, missing values, outliers etc. As a data scientist, it is important to learn the skills of data cleaning and preparing an informative data set for model building. This tutorial is a must-watch for beginners. In it, Renee adopts a step-wise approach to demonstrate data cleaning in Python. Be ready with your code editor: it’s a practical workshop.
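As a rough illustration of what such cleaning involves, here is a small sketch of my own using only the standard library (the workshop itself may well use other tools). The `raw_ages` column is made up, and the choices in `clean_ages` (a valid range of 0 to 120 and median imputation) are assumptions for the example, not recommendations from the video.

```python
import statistics

# Made-up raw column with missing, invalid and outlying entries
raw_ages = ["34", "", "29", "n/a", "41", "380", "37"]


def clean_ages(values, low=0, high=120):
    """Parse ages, treat bad or out-of-range entries as missing, fill with the median."""
    parsed = []
    for v in values:
        try:
            age = float(v)
        except ValueError:
            parsed.append(None)  # empty or non-numeric entry
            continue
        # Values outside the plausible range are treated as missing too
        parsed.append(age if low <= age <= high else None)
    valid = [a for a in parsed if a is not None]
    fill = statistics.median(valid)  # simple imputation choice for this sketch
    return [fill if a is None else a for a in parsed]


cleaned = clean_ages(raw_ages)
print(cleaned)
```

Median imputation is only one option; dropping the rows or flagging them in a separate column can be more appropriate depending on the model.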
6. Introduction to Python for Data Analysis and Visualization
Duration – 2:54:16 hrs
Data analysis helps to discover hidden trends in the data. This is a must-watch tutorial for every novice in data science. Here, you’ll learn about the steps involved in data analysis and ways to perform these steps in Python. By the end of this video, you should have enough expertise to study and analyze small data sets.
7. Regular Expressions
If you still struggle with using regular expressions (as most of us do), this beginner’s tutorial has to be your next stop. Trey Hunner (the speaker) guides you through the basics of regular expressions. By the end of this video, you should be able to build regular expressions on your own by working through practice problems and discussions. The good thing is that there is hardly any theory involved; the focus is on building practical understanding.
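For readers who have never used them, here is a short sketch of my own with Python’s built-in `re` module (it is not from the tutorial). The sample sentence and both patterns are made up for illustration, and the e-mail pattern is deliberately simplified rather than a full validator.

```python
import re

# Made-up sample text
text = "Contact alice@example.com or bob.smith@example.org by 2016-06-05."

# Simplified e-mail pattern: local part, "@", then dot-separated domain labels
email_pattern = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# ISO-style date with capture groups for year, month and day
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

emails = email_pattern.findall(text)  # list of every matching substring
match = date_pattern.search(text)     # first match, or None if nothing matches
year, month, day = match.groups()

print(emails)
print(year, month, day)
```

`findall` returns whole matches here because the only group in the e-mail pattern is non-capturing (`(?:...)`), while `groups()` returns the three captured date parts.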
8. Practical Network Analysis Made Simple
Duration – 2:40:13 hrs
If you’ve always preferred to learn from applications rather than digging into theory, this tutorial is a must for you! Eric J. Ma demonstrates the use of network analysis in Python. But what is network analysis? It is a modeling tool widely used to map complex relationships between entities. The concept is applied extensively by Facebook, Google and Amazon in their recommender systems. It’s an intermediate-level tutorial, so make sure you are comfortable with the basics of Python.
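To make the idea concrete, here is a tiny sketch of my own in plain Python (the tutorial itself may use a dedicated library). The friendship `edges` and the helper names `build_graph` and `degree_centrality` are made up for illustration. It stores a network as an adjacency mapping and computes degree centrality, i.e. the share of other nodes each node is directly connected to.

```python
from collections import defaultdict

# Made-up undirected friendship network
edges = [("ana", "ben"), ("ana", "cy"), ("ben", "cy"), ("cy", "dee")]


def build_graph(edge_list):
    """Build an adjacency mapping: node -> set of neighbouring nodes."""
    graph = defaultdict(set)
    for a, b in edge_list:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


graph = build_graph(edges)
centrality = degree_centrality(graph)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])
```

In this toy network "cy" is linked to everyone else, so it comes out as the most central node.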
9. Diving into Machine Learning with TensorFlow
Duration – 2:55:05 hrs
TensorFlow is an open source software library from Google for numerical computation using data flow graphs. This tutorial guides you from the basics of machine learning to building a text classification model using TensorFlow. Along the way, you’ll learn how TensorFlow works and how you can build and train a model with it!
10. Machine Learning with Text in Scikit Learn
Duration – 2:44:32 hrs
Suppose you are given a data set. It has numeric variables which you can easily work with. Alongside them, it has variables which consist of text, such as a house address, a product description etc. Knowing how to deal with such variables can give your predictive model an immense boost. In this tutorial, Kevin (founder of dataschool.io) shares this knowledge using multiple practice examples. It’s a practical workshop, so be ready to reproduce the code at your end.
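The usual first step with text variables is turning them into numbers. The sketch below is my own standard-library stand-in for a bag-of-words representation, not the code from the workshop; in scikit-learn this job is typically done by `CountVectorizer`. The sample `docs` and the helper names are made up.

```python
import re
from collections import Counter

# Made-up product/house descriptions
docs = [
    "Spacious house near the park",
    "Cozy house, quiet street",
    "Park view apartment",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Return the sorted vocabulary and one row of token counts per document."""
    vocab = sorted({tok for doc in documents for tok in tokenize(doc)})
    rows = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        rows.append([counts[tok] for tok in vocab])
    return vocab, rows


vocab, rows = bag_of_words(docs)
print(vocab)
for row in rows:
    print(row)
```

Each row is now a numeric feature vector that can sit next to the other numeric variables in a model.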
11. Large Scale Data Analysis Tools in Python
Duration – 2:54:41 hrs
With the growing demand for large scale data analysis, did you think Python would be left behind? Broadly, this tutorial teaches you to handle big data in Python. You’ll learn the basics of using Hadoop / MapReduce and Spark from Python. At the end, Sarah and Sean provide a hands-on exercise for a practical understanding of data analysis on large data sets.
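If MapReduce is new to you, the following single-machine sketch of my own shows the shape of the idea with a word count. It does not use Hadoop or Spark; the function names `map_phase`, `shuffle` and `reduce_phase` and the sample `lines` are made up to mirror the three stages.

```python
from collections import defaultdict

# Made-up input lines
lines = ["big data in python", "python for big data", "data tools"]


def map_phase(line):
    """Emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework would between stages."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values for each key; here, sum the counts."""
    return {key: sum(values) for key, values in groups.items()}


pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
print(counts)
```

On a real cluster the map and reduce steps run in parallel on different machines, which is why each step only looks at one line or one key at a time.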
12. Making an impact with Python NLP Tools
Duration – 2:54:35 hrs
This tutorial is best suited for people with prior experience in string manipulation. In it, you’ll get familiar with a toolkit specially designed to work with text data in Python. Brands are using NLP to identify customer sentiment on social media platforms and feedback forms in order to improve their brand perception. Hence, this concept is widely used in industry and is a must-know for a data scientist.
13. Faster Python Programs
Duration – 3:03:58 hrs
This will interest you if you are comfortable coding in Python and would like to reduce your computation time. Mike Mueller introduces some handy tips and tricks to optimize your Python programs. Optimization requires knowledge of algorithms and data structures, which is also explained. The ability to write faster programs for creating quick visualizations and models is driven by an optimization strategy. If this isn’t entirely clear yet, it may become clearer as you work through the practical exercises with Mike.
14. Computational Geometry in Python
Duration – 2:34:50 hrs
If you are interested in pursuing fields like robotics, geo mapping, astrophysics and more, this tutorial should give you a good head start. In simple words, computational geometry is nothing but a way to solve problems which are shaped by spatial dimensions, such as geographical information, network building etc. Understanding this video requires a deep knowledge of mathematics and related concepts.
List of Talks
1. Beginners Guide to Deep Learning
Duration – 28:51 mins
Deep learning techniques have brought disruptive advancements to the field of data science. Be it robotics, images, speech or anomaly detection, deep learning algorithms are widely known for solving complex data problems. This talk introduces you to concepts like convolutional nets, backpropagation, image recognition and restricted Boltzmann machines. Irene uses interesting real-life examples to connect deep learning with everyday human life.
2. Web Scraping and Data Analysis of NHL Penalties
Duration – 30:19 mins
This is a good video to watch to understand the use of Python, data analysis and web scraping in real life. Wendy used Python to scrape the web and analyze the data in order to compute results on NHL penalties. In this video, she follows a step-wise approach, from collecting the data to analyzing it and generating useful insights. Although a lot has been covered in the previous videos, none of them gives a complete end-to-end overview, and this talk fills that gap.
3. IPython Notebook in Data Intensive Communities
Duration – 30:01 mins
This talk encourages using the IPython Notebook for data science work. I myself prefer working in the IPython Notebook to working in any text editor. There are several benefits and I could state many, but the speakers explain them interactively in the video. So, if you are a beginner or intermediate in data science and use Python, this video will give you a fresh perspective on this tool.
4. Statistics for Hackers
Duration – 40:32 mins
Another must-watch for novices. You could call it a crash course on statistics using Python. In this talk, Jake clears up the confusion around jargon like distribution, confidence interval, p-value and t-test by using computational methods like sampling, shuffling, simulation and cross-validation. He also shares his strategies and approach for building a powerful statistical model.
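As an example of the "shuffling" idea, here is a minimal permutation test of my own (the data and the function name `permutation_test` are made up, and this is not code from the talk). It asks how often a random relabeling of the pooled measurements produces a difference in group means at least as large as the one observed.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_shuffles=5000, seed=1):
    """Return the observed mean difference and a two-sided shuffled p-value."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # relabel the measurements at random
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_shuffles


# Made-up measurements for two groups
group_a = [12.1, 11.8, 13.4, 12.9, 14.0]
group_b = [10.2, 11.1, 10.8, 11.5, 10.9, 11.3]

observed, p_value = permutation_test(group_a, group_b)
print(f"difference in means: {observed:.2f}, p-value: {p_value:.4f}")
```

If the group labels did not matter, shuffling them should often reproduce a gap this large; a small p-value says it rarely does.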
Just watching these videos won’t make you a better analyst. You need to practice too. For best results, take notes while watching. This will help you quickly refer back to a topic later.
While watching these videos, there were several moments when I felt that there are many things in Python which I am yet to explore. Once again, I would like to thank the Python community for being so generous and always helpful in times of need. If you would like to see more such videos from PyCon 2016, you can check out their YouTube channel.
Did you find this list of tutorials and talks helpful? Which tutorial or talk did you like the most? Share your experience or suggestions in the comments below.