rstudio::conf 2018 was held in San Diego two weeks ago. Most of the slides and materials have now been shared by the presenters. While the focus was understandably on deep learning, quite a few other interesting packages were shared during the event.
The conference ran over two days, and we have highlighted the most exciting things from each day in this article.
The first day of the conference focused heavily on the tidyverse. The keynote speaker was Diane Cook, on the topic “To the tidyverse and beyond: Challenges for the Future in Data Science”. The takeaway from her talk was that tidy data provides the glue between raw data and the data found in statistics textbooks, and that it will continue to help in various fields in the future.
Davis Vaughan presented “The future of time series and financial analysis in the tidyverse”, where he revealed a couple of packages which, as the talk's title suggests, make it far easier to deal with messy time series and financial data. Keeping the theme of finance going, Emily Riederer has created a package called “tidycf”, which makes cash flow analysis a whole lot simpler and more interpretable.
Emily Robinson shed some light on “The lesser known stars of the tidyverse”. The presentation looks at some of the ways of tidying your data using lesser-known tidyverse functions. You can view her presentation slides here.
A few of the other talks on day 1 included:
- Debugging techniques in RStudio by Amanda Gadrow
- Understanding PCA using Shiny and Stack Overflow data by Julia Silge
- Best practices for working with databases by Edgar Ruiz
- Plumber: turning your R code into an API by Jeff Allen
You can watch the entire day’s video below:
The second day had a heavy dose of deep learning in R. The keynote on this day was “Machine Learning with TensorFlow and R”, presented by JJ Allaire. He kicked things off with a tour of the basics of tensors and introduced the tensorflow package in R. Mr. Allaire wrapped up his talk by demoing various ways of deploying tensorflow and keras models, including publishing them directly to RStudio Connect.
He was followed by Google’s Michael Quinn, whose talk was “Large scale machine learning using TensorFlow, BigQuery and CloudML Engine within RStudio”. Once you’ve developed a tensorflow or keras model, you can deploy it to Google’s CloudML using the ‘cloudml’ package.
Keeping the theme of deep learning going, Javier Luraschi (from RStudio) gave a talk on “Deploying TensorFlow models with tfdeploy”. The ‘tfdeploy’ package provides a unified way of deploying models directly to various platforms including CloudML and RStudio Connect.
One of the more intriguing talks of the conference was by Ali Zaidi on “Reinforcement learning in Minecraft with CNTK-R”. Mr. Zaidi demonstrated how he trained a deep learning model to control a character in the popular video game Minecraft. The character was taught to navigate a puzzling maze and to understand a few tidbits of natural language. The package used for training the model was “CNTK-R”.
Some of the other fascinating talks included:
- “You can make a package in 20 minutes” by Jim Hester
- “Contributing to the tidyverse” by Mara Averick
- “Achieving impact with advanced analytics: Breaking down the adoption barrier” by Aaron Horowitz
- “Storytelling with R” by Olga Pierce
- “Building Spark ML pipelines with sparklyr” by Kevin Kuo
Watch the entire day’s video here.
All the material presented at the conference can be found on GitHub here. It will also be updated with the recordings of each session once they are available.
Our take on this
Deep learning was the general theme running throughout day 2. It has made significant strides in the world of machine learning and, given its increasing influence, it came as no surprise that plenty of packages are being created for this purpose. R has a plethora of supporting packages for TensorFlow, which is quite encouraging for people using this language.
Apart from that, the ‘tidyverse’ got a lot of love from the presenters, and new ways of tidying up messy data were shown. If you are an R user, you should absolutely check out this conference using the links above.