Develop Your Own Personal Deep Learning Image Dataset using this Python Script

Pranav Dar 09 Apr, 2018

Overview

  • This Python script lets you download hundreds of images from Google Images
  • You can search by keywords and/or key phrases
  • You can also invoke the script from another Python file

 

Introduction

Developing your own dataset can be a really tedious and time-consuming task. And when it comes to images, multiply the amount of effort by 100.

So this Python script will come in handy for people who don’t have a lot of time on their hands but want to build an exhaustive image dataset for deep learning purposes. Using it, you can download hundreds of Google images to your own machine.

This script is a command-line Python program. You can search for keywords and/or key phrases in Google Images and optionally download the resulting images. You can also invoke the script from another Python file.
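For that last point, here is a minimal sketch of what calling the library from your own Python file might look like. The `googleimagesdownload` class, its `download` method and the `keywords`, `limit` and `print_urls` argument keys follow the library's README as we understand it, so check them against the version you install. The helper names `build_arguments` and `download_images` are ours, purely for illustration.

```python
def build_arguments(keywords, limit=20, print_urls=False):
    """Assemble the argument dictionary passed to the downloader.

    `keywords` is a list of search terms; the library takes them as one
    comma-separated string (an assumption based on its README).
    """
    return {
        "keywords": ",".join(keywords),
        "limit": limit,
        "print_urls": print_urls,
    }


def download_images(keywords, limit=20):
    """Hypothetical wrapper: run one download and return the library's result."""
    # Imported lazily so this file still loads before the package is installed.
    from google_images_download import google_images_download

    downloader = google_images_download.googleimagesdownload()
    return downloader.download(build_arguments(keywords, limit))
```

Calling `download_images(["polar bears", "beaches"], limit=5)` would then need network access and the package installed. Keeping the argument building in its own function lets you check your settings without triggering a download.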

It’s a fairly straightforward program and needs no extra dependencies if you only want up to 100 images per keyword. If you need more than 100, you’ll have to install the Selenium library along with chromedriver.
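If you build the arguments in code, the 100-image threshold can be captured in a small guard like the sketch below. We are assuming the library accepts a `chromedriver` argument holding the path to the chromedriver binary, as its README describes. The function name `arguments_for` and the constant `MAX_WITHOUT_SELENIUM` are hypothetical names of our own.

```python
MAX_WITHOUT_SELENIUM = 100  # threshold described in the article


def arguments_for(keyword, limit, chromedriver_path=None):
    """Build download arguments, requiring chromedriver above 100 images."""
    args = {"keywords": keyword, "limit": limit}
    if limit > MAX_WITHOUT_SELENIUM:
        if chromedriver_path is None:
            raise ValueError(
                "More than 100 images requested: install Selenium and "
                "pass the path to chromedriver."
            )
        # Assumed argument key, taken from the library's README.
        args["chromedriver"] = chromedriver_path
    return args
```

Failing early like this is a design choice on our part: it surfaces the missing dependency before any download starts, instead of partway through a large job.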

This program is compatible with both Python 2 and 3. You can see the underlying structure of the code in the image below:

You can install this library with pip by running the command below:

$ pip install google_images_download
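You may also see the package name written with hyphens instead of underscores. As far as we know, pip treats the two spellings as the same project, because package names are normalized as described in PEP 503. The sketch below reproduces that normalization rule; the function name `normalize_project_name` is ours and is not part of pip.

```python
import re


def normalize_project_name(name):
    """Collapse runs of '-', '_' and '.' into '-' and lowercase (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Both spellings reduce to the same normalized name.
same = normalize_project_name("google_images_download") == normalize_project_name(
    "google-images-download"
)
```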

Read more about this library and access the source code on GitHub here.

Please take note of the below disclaimer regarding copyright terms before using these images:

This program lets you download tons of images from Google. Please do not download or use any image that violates its copyright terms. Google Images is a search engine that merely indexes images and allows you to find them. It does NOT produce its own images and, as such, it doesn’t own copyright on any of them. The original creators of the images own the copyrights.

Images published in the United States are automatically copyrighted by their owners, even if they do not explicitly carry a copyright warning. You may not reproduce copyright images without their owner’s permission, except in “fair use” cases, or you could risk running into lawyer’s warnings, cease-and-desist letters, and copyright suits. Please be very careful before its usage!

 

Our take on this

This is a really cool script that you can use for your own personal purposes. Practice deep learning problems on your own machine, but please do not use the images commercially or distribute them without the owner’s permission!

It’s a pretty easy script to replicate once you understand how it works underneath. Let us know your experience using it in the comments below.

 


 



Responses From Readers


Priyanka 12 Apr, 2018

Thank you for this awesome script reference. Just one correction...it is pip install google-images-download