Tavish Srivastava — Published On March 22, 2015 and Last Modified On July 19th, 2020
Beginner Business Analytics Machine Learning Python Structured Data Technique

This article is going to be different from the rest of my articles published on Analytics Vidhya – both in terms of content and format. I usually layout my article such that after a read, the reader is left to think about how this article can be implemented on grounds.

Hacking Google Maps 1

In this article, I will start with a round of brainstorming around a particular type of business problem and then talk about a sample analytics based solution to these problems. To make use of this article make sure that you follow my instructions carefully.

Let’s start with a few business cases:

  1. Retail bank: Optimize primary bank branch allocation for all the customers. This is to make sure that the bank branch allotted to the customer is close to the mailing or permanent address of the customer for his convenience.  This might be specially applicable, if we open a new branch and the closest branch for many existing customer changes to this new branch.
  2. Retail Store chain: Send special offers to your loyal customers. But offers could be region specific so same offer cannot be sent to all. Hence, you first need to find the closest store to the customer and then mail the offer which is currently applicable for that store.
  3. Credit card company who sells co-branded cards: You wish to find out all partner stores which are closest to your existing client base and then mail them appropriate offers.
  4. Manufacturing plant: Wish to find out wholesalers near your plant for components required in manufacturing of the product.

What is so common in all the problems mentioned above? Each of these problems deal with getting the distance between multiple combination of source and target destinations.

Exercise : Think about at-least 2 such cases in your current industry and then at least 2 cases outside your current industry and write them in the comment section below.


A common approach 

I have worked in multiple domains and saw this problem being solved in similar fashion which gives approximate but quick results.

Exercise : Can you think of a method to do the same using your currently available data and resources?

Here is the approach :

You generally have a PIN CODE for both source and destination. Using these PIN CODES, we find the centroid of these regions. Once you have both the centroids, you check their latitude and longitude. You finally calculate the eucledian distance between these two points. We approximate our required distance with this number. Following figure will explain the process better :



The two marked areas refers to different PIN CODES and the distance 10 kms is used as an approximate distance between the two points.

Exercise : Can you think of challenges with this approach ? 

Here are a few I can think of :

  1. If the point of interest is far away from the centroid, this approach will give inaccurate results.
  2. Some times the centroid of other PIN CODE can be more closer to the point of interest than its own PIN CODE. But because it falls in area of the distant PIN CODE, we still approximate the point of interest with the centroid of distant PIN CODE.
  3. In cases where we need finer distances than the precision of PIN CODE demarcation, this method will lead nowhere. Imagine a scenario where two branches of a bank and customer address is located in the same PIN CODE. We have no way to find the closest branch.
  4. The distance calculated is a point to point distance and not on road. Imagine a scenario when you have two PIN Codes right next to each other but you have valley between which you need to circle around to reach destination.


A manual Approach

Say you have two branches and a single customer, how will you make a call between the two branches (which one is closer)? Here is a step by step approach :

  1. You choose the first combination of branch-customer pair.
  2. You feed the two addresses in Google Maps.
  3. You pick the distance/time on road
  4. You fill in the distance in the table with the combinations (2 in this case)
  5. Repeat the same process with the other combination.


How to automate this approach?

Obviously, this process cannot be done manually for millions of customers and thousands of branches. But this process can be well automated (however, Google API have a few caps on the total number of searches). Here is a simple Python code which can be used to create functions to calculate the distance between two points on Google Map.



Exercise : Create a table with a few sources and destinations. Use these functions to find distance and time between those points. Reply “Done without support” if you are able to implement the code without looking at the rest of the solution.

Here is how we can read in a table of different source-destination combinations :

code2Notice that we have all types of combinations here. Combination 1 is a combo of two cities. Combo 4 is a combination of two detailed address. Combo 6 is a combination of a city and a monument. Let’s now try to get the distances and time & check if they make sense.

code3All the distance and time calculations in this table look accurate.

Exercise : What are the benefits of using this approach over the PIN CODE approach mentioned above? Can you think of a better way to do this task?

Here is the complete Code :

[stextbox id=”grey”]
import googlemaps
from datetime import datetime
def finddist(source, destination):
     gmaps = googlemaps.Client(key='XXX')
    now = datetime.now()
   directions_result = gmaps.directions(source, destination, mode="driving",departure_time=now)
   for map1 in directions_result:
         overall_stats = map1['legs']
         for dimensions in overall_stats:
                distance = dimensions['distance']
                return [distance['text']]
def findtime(source, destination):
      gmaps = googlemaps.Client(key='XXX')
      now = datetime.now()
      directions_result = gmaps.directions(source, destination, mode="driving",departure_time=now)
      for map1 in directions_result:
            overall_stats = map1['legs']
            for dimensions in overall_stats:
                   duration = dimensions['duration']
                   return [duration['text']]

import numpy as np
import pandas as pd
import pylab as pl 
import os
cities = pd.read_csv("cities.csv")
cities["distance"] = 0
cities["time"] = 0
for i in range(0,8):
 source = cities['Source'][i]
 destination = cities['Destination'][i]
 cities['distance'][i] = finddist(source,destination)
 cities['time'][i] = findtime(source,destination)


End Notes

GoogleMaps API come with a few limitations on the total number of searches. You can have look at the documentation, if you see a use case of this algorithm.

Did you find the article useful? Share with us find more use cases of GoogleMaps API usage apart from the one mentioned in this article? Also share with us any links of related video or article to leverage GoogleMaps API. Do let us know your thoughts about this article in the box below.

If you like what you just read & want to continue your analytics learning, subscribe to our emailsfollow us on twitter or like our facebook page.

About the Author

Tavish Srivastava
Tavish Srivastava

Tavish Srivastava, co-founder and Chief Strategy Officer of Analytics Vidhya, is an IIT Madras graduate and a passionate data-science professional with 8+ years of diverse experience in markets including the US, India and Singapore, domains including Digital Acquisitions, Customer Servicing and Customer Management, and industry including Retail Banking, Credit Cards and Insurance. He is fascinated by the idea of artificial intelligence inspired by human intelligence and enjoys every discussion, theory or even movie related to this idea.

Our Top Authors

Download Analytics Vidhya App for the Latest blog/Article

6 thoughts on "Hacking Google Maps to create distance features in your model / applications"

sumalatha says: March 23, 2015 at 11:18 am
Is it a sophisticated implementation of any of the methods that are used to solve transportation problems (like modi method, VAM method etc)? Reply
Tavish Srivastava
Tavish Srivastava says: March 23, 2015 at 12:28 pm
Hi Sumalatha, This article is on how to find distances between two points and has nothing to do with transportation problem frameworks. However, you can use the method described in this article to find distances between points and then use algorithms like VAM and Modi, to optimize routes. Thank you for sharing this thought. This just adds a new business case to the list I have mentioned in this article : Using distance mapping techniques in transportation problems. Tavish Reply
Sunny says: August 12, 2015 at 4:36 am
First of all, thanks a lot for sharing this wonderful article. It really helps students like me to think beyond text-books. I was trying to replicate this project and came up with some doubts (may be very silly). I will be obliged if you can answer it: a) While generating the Key from Google Maps - which one (Directions API/Distance Matrix API/any other) to select. Else if, the key is totally different than the three options above, can you please include a brief process to generate the Key. b) Will this code work for source and destination as latitude and longitude. If not, can something be done to calculate distance based on these. Thanks a lot in advance! It really helps :) Reply
Pratima Joshi
Pratima Joshi says: November 21, 2016 at 6:18 am
Hi, I had the same question/suggestion as sunny to use Longitude/Latitude instead of PIN code to find the distance between source and destination. Reply
Pratima Joshi
Pratima Joshi says: November 21, 2016 at 6:22 am
Also, 2 more business cases: 1. Finding nearby restaurants from a location and providing suggestions to the user. 2. Finding nearby health service providers (hospitals, clinics, pharmacies etc.) Reply
Ajas says: November 23, 2016 at 8:41 am
Hi Tavish, Thanks for sharing, it was great reading it. I am having one question, in common approach "Why do we need to calculate the centroid of point of interests and take euclidean distance?" can't we just take euclidean distance directly between point of interests ? (it may be a silly question, but still i want to know.) Thanks Reply

Leave a Reply Your email address will not be published. Required fields are marked *