DataHack Radio #18: Andriy Burkov’s Journey to Writing the Ultimate 100-Page Machine Learning Book
Have you looked at the most commonly recommended books on Machine Learning, only to feel overwhelmed by their thickness and the effort it would take to read them?
If you feel that way – don’t worry! You are not alone. A lot of people face this situation but do very little about it. Not Andriy Burkov! Andriy saw this and thought that the ideal Machine Learning book for beginners should be written within 100 pages.
More importantly – he wrote the book and published it. His recently launched ‘Hundred-Page Machine Learning Book’ has quickly ascended the bestseller list and is perched at #1 on Amazon (under the ‘machine learning’ category). The book has even been endorsed by the great Peter Norvig!
It was our pleasure hosting him on episode #18 of DataHack Radio. Kunal and Andriy had a rich discussion on several topics, including:
- Andriy’s foray into machine learning and artificial intelligence
- Industry experience, including NLP projects at Gartner
- His idea behind writing the book
- The challenges he faced while trying to condense topics into this crisp format
- Advice to aspiring data scientists
Andriy Burkov’s Background & Foray into Artificial Intelligence
Andriy’s professional career began in Ukraine at the turn of the millennium. He created his own online startup while completing his degree in Computer Engineering and Networking. But after he had worked on it for 3 years, the dotcom bubble burst and his investor decided to withdraw.
Andriy wasn’t about to give up on his dreams. The fire within him to build another online startup continued to burn bright. This wasn’t possible in Ukraine back then though, given the economic situation. Attracting another investor was proving to be impossible.
The first thought Andriy had was to move to Europe with his family. They considered France but eventually settled on Quebec, Canada. The primary reason was immigration, and Quebec seemed a good fit because of its French-speaking community.
Once in Canada, Andriy spent some time looking for jobs before finally settling on doing his Master’s in computer science and artificial intelligence. He converted this into a Ph.D, choosing multi-agent systems as his primary topic. His thesis was titled ‘Leveraging Repeated Games for Solving Complex Multi-agent Decision Problems’.
Want to see how much the field of AI has changed in the last decade? Here’s an eye-opening anecdote Andriy told us:
“One of my ex-colleagues, when he finished his Ph.D in Quebec City (several years before me), couldn’t find a job in artificial intelligence. It was such an exotic field.”
It’s mind-blowing how quickly technology changes our lives.
Gaining Valuable Industry Experience Post-Ph.D
Two paths opened up once Andriy finished his Ph.D – research or teaching. The former appealed to him far more than becoming a full-time professor. So he decided to dip his toes into an industry role with Fujitsu, a Japanese multinational company. He worked with Fujitsu for 2 years before moving on.
From there, Andriy shifted to WANTED Technologies, a job announcement portal for companies. His role was primarily about turning terabytes of job announcements from online job boards into structured knowledge. It was an excellent experience that has played a big part in his professional career.
That was followed by a move to a company that was acquired twice, the second time by Gartner. In Andriy’s own words:
“I survived two acquisitions!”
There was an understandable seed of doubt – will working for such a huge organization stifle creativity? But Gartner’s work culture soon put any doubts to rest.
Andriy leads a team working on the research and development side of a product called Talent Neuron. It is a portal that “combines big data and statistical insights to provide global talent, location and competitive intelligence for any industry or function”.
Andriy’s Interest and Work in NLP
A quick glance at Andriy’s LinkedIn profile tells us he is a Natural Language Processing (NLP) expert. That’s a topic we at Analytics Vidhya are very passionate about. So picking Andriy’s brains on NLP felt like a natural fit!
His team’s role at Gartner is geared more towards applied text analytics than core computational linguistics. Here are a few intriguing tasks the team works on (or has worked on):
- Parsing of free-form résumés to detect and extract candidate name, employers, skills, certifications, and employment history
- Parsing of job descriptions to detect and extract job title, salary, and employer name, and to assign a standard occupation code
- Automated normalization, cleanup, and deduplication of noisy, multi-language data
- Novel data detection
- Language detection / text segmentation by language
- Sequence labeling (queries, addresses, salaries)
This is one of the most absorbing sections in this episode, especially for NLP enthusiasts. Andriy spoke about the complexities and nuances of language modeling with several examples. A must-listen!
Idea behind the ‘Hundred-Page Machine Learning Book’
How did the idea of writing this book come about? It definitely wasn’t the usual route! The thought hadn’t even crossed Andriy’s mind initially.
Andriy has built up a huge following on LinkedIn over the years. He posts some superb stuff related to machine learning every day (you should definitely give him a follow).
“I have a ton of machine learning and artificial intelligence books at home. But I never end up finishing any of them. I asked myself why?”
A huge reason for that is we’ve become increasingly busy. We barely have time to finish a book, especially one that runs over 500 pages. He put his thoughts in a LinkedIn post, saying that if he ever wrote a book on machine learning, it wouldn’t be more than 100 pages long.
He had no intention of actually writing it though! The post went viral – hundreds of likes, tons of comments. The majority of comments could be divided into two categories:
- The nay-sayers: It’s impossible to condense so much knowledge into 100 pages
- The enthusiasts: Andriy, you can write the book. Go ahead and do it!
“I took a week to think about it. I told myself – ‘I will try to write several chapters. If it goes well, we’ll see. If it doesn’t go well, I’ll stop.’”
The first three chapters were written over a weekend and the response? Overwhelmingly positive. And while the number of pages ran a shade above a hundred, Andriy still managed to pack in all the fundamentals and essential knowledge a data scientist should have. A tremendous achievement!
One of the most notable things about the book is that anyone can read it online. If you like it or find it useful, you should buy the paperback or hardcover edition. I really appreciate Andriy’s thought process behind doing this.
Challenges and Trade-offs while Writing the Book
Given the crisp nature of the book, leaving certain concepts out was inevitable. But which ones? And to what degree? Those were critical questions Andriy addressed in this section.
The idea, as we discussed earlier, was to include all the fundamentals, such as the mathematics behind core machine learning algorithms. So topics like reinforcement learning and backpropagation were either not included or just touched upon. A fair trade-off, in my opinion.
Advice to People Entering Data Science (Aspiring Data Science Professionals)
Here’s a brief summary of the key points Andriy spoke about:
- Learn the basics of computer science
- Learn to be a good programmer
- Find an interesting problem and work your socks off to solve it (there are plenty of platforms to find datasets)
Just hearing Andriy talk about his book made me appreciate how hard the journey must have been. I have personally read the book and can’t recommend it enough. It is a gem and will help thousands of aspiring data scientists make the leap into this field.
Have you read the book yet? Let me know in the comments section below!