
Hacking Google Maps to create distance features in your model / applications


This article is going to be different from the rest of my articles published on Analytics Vidhya, both in terms of content and format. I usually lay out my articles so that, after reading, the reader is left to think about how the ideas can be implemented on the ground.


In this article, I will start with a round of brainstorming around a particular type of business problem and then talk about a sample analytics-based solution to these problems. To get the most out of this article, make sure that you follow my instructions carefully.

Let’s start with a few business cases:

  1. Retail bank: Optimize primary bank branch allocation for all customers. The aim is to make sure that the branch allotted to a customer is close to the customer’s mailing or permanent address, for convenience. This is especially applicable when we open a new branch and it becomes the closest branch for many existing customers.
  2. Retail store chain: Send special offers to your loyal customers. Offers can be region specific, so the same offer cannot be sent to everyone. Hence, you first need to find the store closest to each customer and then mail the offer currently applicable at that store.
  3. Credit card company that sells co-branded cards: You wish to find the partner stores closest to your existing client base and then mail those clients the appropriate offers.
  4. Manufacturing plant: You wish to find wholesalers near your plant for the components required to manufacture the product.

What do all the problems mentioned above have in common? Each of them deals with getting the distance between multiple combinations of source and destination.

Exercise : Think about at least 2 such cases in your current industry and at least 2 cases outside your current industry, and write them in the comment section below.

 

A common approach 

I have worked in multiple domains and have seen this problem solved in a similar fashion each time, one that gives approximate but quick results.

Exercise : Can you think of a method to do the same using your currently available data and resources?

Here is the approach :

You generally have a PIN CODE for both the source and the destination. Using these PIN CODES, we find the centroid of each region. Once you have both centroids, you look up their latitude and longitude. Finally, you calculate the Euclidean distance between these two points and use this number as an approximation of the required distance. The following figure explains the process better :

[Figure: centroid-to-centroid distance between two PIN CODE areas]

 

The two marked areas refer to different PIN CODES, and the distance of 10 km between their centroids is used as the approximate distance between the two points.
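
The centroid approach can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the `PIN_CENTROIDS` lookup and its coordinates are made up for illustration, and instead of a plain Euclidean distance on raw latitude/longitude it uses the great-circle (haversine) formula, which is a common way to get a straight-line distance in kilometres from two coordinates.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Hypothetical centroid lookup: PIN code -> (latitude, longitude) in degrees.
# These values are made up for illustration; use your own centroid table.
PIN_CENTROIDS = {
    "PIN_A": (12.90, 77.60),
    "PIN_B": (12.99, 77.60),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def pin_distance_km(source_pin, destination_pin, centroids=PIN_CENTROIDS):
    """Approximate the distance between two PIN codes by their centroids."""
    lat1, lon1 = centroids[source_pin]
    lat2, lon2 = centroids[destination_pin]
    return haversine_km(lat1, lon1, lat2, lon2)


print(round(pin_distance_km("PIN_A", "PIN_B"), 1))
```

Every address inside a PIN CODE gets the same centroid here, so two addresses in the same PIN CODE end up at a distance of zero.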

Exercise : Can you think of challenges with this approach ? 

Here are a few I can think of :

  1. If the point of interest is far away from the centroid, this approach will give inaccurate results.
  2. Sometimes the centroid of a neighbouring PIN CODE can be closer to the point of interest than the centroid of its own PIN CODE. But because the point falls within the area of its own PIN CODE, we still approximate it with that more distant centroid.
  3. In cases where we need finer distances than the precision of PIN CODE demarcation, this method leads nowhere. Imagine a scenario where two branches of a bank and the customer’s address are all located in the same PIN CODE. We have no way to find the closest branch.
  4. The distance calculated is a point-to-point distance, not an on-road distance. Imagine two PIN CODES right next to each other but separated by a valley that you need to drive around to reach the destination.

 

A manual Approach

Say you have two branches and a single customer. How will you decide which of the two branches is closer? Here is a step-by-step approach :

  1. You choose the first branch-customer pair.
  2. You feed the two addresses into Google Maps.
  3. You note the on-road distance/time.
  4. You fill in the distance in a table of the combinations (2 in this case).
  5. You repeat the same process for the other combination.
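
The steps above amount to filling a table with one row per branch-customer pair and then picking the smallest distance. Here is a minimal sketch of that bookkeeping. The names `build_distance_table`, `closest_branch` and `fake_distance` are hypothetical, and the road distances are invented stand-ins for whatever you would read off Google Maps.

```python
from itertools import product


def build_distance_table(customers, branches, distance_fn):
    """Return one row per branch-customer pair with its distance."""
    table = []
    for customer, branch in product(customers, branches):
        table.append(
            {"customer": customer, "branch": branch,
             "distance": distance_fn(customer, branch)}
        )
    return table


def closest_branch(table, customer):
    """Pick the branch with the smallest distance for one customer."""
    rows = [row for row in table if row["customer"] == customer]
    return min(rows, key=lambda row: row["distance"])["branch"]


# Stand-in for the Google Maps lookup: hypothetical road distances in km.
FAKE_ROAD_KM = {
    ("Customer 1", "Branch A"): 4.2,
    ("Customer 1", "Branch B"): 2.7,
}


def fake_distance(customer, branch):
    return FAKE_ROAD_KM[(customer, branch)]


table = build_distance_table(["Customer 1"], ["Branch A", "Branch B"], fake_distance)
print(closest_branch(table, "Customer 1"))
```

Passing the distance lookup in as a function keeps the table logic separate from how the distance is obtained, so the same code works with a manual lookup, a centroid approximation or an API call.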

 

How to automate this approach?

Obviously, this process cannot be done manually for millions of customers and thousands of branches. But it can be automated well (although the Google Maps API has a few caps on the total number of searches). Here is some simple Python code that defines functions to calculate the distance and travel time between two points on Google Maps.

[Image: Python code defining the distance and time functions]

 

Exercise : Create a table with a few sources and destinations. Use these functions to find distance and time between those points. Reply “Done without support” if you are able to implement the code without looking at the rest of the solution.

Here is how we can read in a table of different source-destination combinations :

[Image: table of source-destination combinations]

Notice that we have all types of combinations here. Combination 1 is a pair of cities. Combination 4 is a pair of detailed addresses. Combination 6 pairs a city with a monument. Let’s now get the distances and times and check whether they make sense.

[Image: the same table with distance and time filled in]

All the distance and time calculations in this table look accurate.

Exercise : What are the benefits of using this approach over the PIN CODE approach mentioned above? Can you think of a better way to do this task?

Here is the complete Code :

import os
from datetime import datetime

import googlemaps
import pandas as pd


def finddist(source, destination):
    # Replace 'XXX' with your own Google Maps API key.
    gmaps = googlemaps.Client(key='XXX')
    now = datetime.now()
    directions_result = gmaps.directions(source, destination, mode="driving", departure_time=now)
    # Return the distance text of the first leg of the first route.
    for map1 in directions_result:
        overall_stats = map1['legs']
        for dimensions in overall_stats:
            distance = dimensions['distance']
            return distance['text']


def findtime(source, destination):
    # Replace 'XXX' with your own Google Maps API key.
    gmaps = googlemaps.Client(key='XXX')
    now = datetime.now()
    directions_result = gmaps.directions(source, destination, mode="driving", departure_time=now)
    # Return the duration text of the first leg of the first route.
    for map1 in directions_result:
        overall_stats = map1['legs']
        for dimensions in overall_stats:
            duration = dimensions['duration']
            return duration['text']


os.chdir(r"C:\Users\Tavish\Desktop")
cities = pd.read_csv("cities.csv")
# The API returns text values, so start with empty string columns.
cities["distance"] = ""
cities["time"] = ""
for i in range(len(cities)):
    source = cities['Source'][i]
    destination = cities['Destination'][i]
    # .loc writes into the DataFrame itself and avoids chained assignment.
    cities.loc[i, 'distance'] = finddist(source, destination)
    cities.loc[i, 'time'] = findtime(source, destination)
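
The code above makes two API calls per row and stores formatted text. If you want numbers you can sort and compare, one option is to take both values from a single Directions result. The sketch below assumes the usual response shape (a list of routes, each with `legs` whose `distance` and `duration` entries carry a numeric `value` in meters and seconds). `summarize_route` is a hypothetical helper, and `sample` is a hand-built stand-in with made-up numbers, not a real API response.

```python
def summarize_route(directions_result):
    """Return (distance_km, duration_min) for the first route, or None.

    Expects the list returned by gmaps.directions(...): a list of routes,
    each with 'legs' that carry 'distance' and 'duration' dicts whose
    'value' fields are in meters and seconds.
    """
    if not directions_result:
        return None
    legs = directions_result[0]["legs"]
    meters = sum(leg["distance"]["value"] for leg in legs)
    seconds = sum(leg["duration"]["value"] for leg in legs)
    return meters / 1000.0, seconds / 60.0


# Hand-built stand-in with the same shape as a Directions response;
# the numbers are made up for illustration.
sample = [{"legs": [{"distance": {"text": "12.5 km", "value": 12500},
                     "duration": {"text": "30 mins", "value": 1800}}]}]

print(summarize_route(sample))
```

With a real client you would pass in the result of `gmaps.directions(source, destination, mode="driving", departure_time=now)`, which halves the number of requests per source-destination pair.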

 

End Notes

The Google Maps API comes with a few limitations on the total number of searches. Have a look at the documentation if you see a use case for this approach.
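
One simple way to stay within such limits is to avoid asking for the same source-destination pair twice. Below is a minimal sketch using Python's `functools.lru_cache`. The names `make_cached_lookup` and `fake_lookup` are hypothetical, and the sketch assumes that reusing an earlier answer for a repeated pair is acceptable for your use case (traffic-dependent travel times will not be refreshed).

```python
from functools import lru_cache


def make_cached_lookup(lookup_fn, maxsize=None):
    """Wrap a (source, destination) lookup so repeated pairs hit a cache."""

    @lru_cache(maxsize=maxsize)
    def cached(source, destination):
        return lookup_fn(source, destination)

    return cached


# Stand-in for an API-backed function such as finddist; it only counts calls.
calls = []


def fake_lookup(source, destination):
    calls.append((source, destination))
    return "{} -> {}".format(source, destination)


lookup = make_cached_lookup(fake_lookup)
pairs = [("A", "B"), ("A", "B"), ("B", "C"), ("A", "B")]
results = [lookup(s, d) for s, d in pairs]
print(len(pairs), "lookups,", len(calls), "underlying calls")
```

In practice you would wrap the real lookup function instead of `fake_lookup`. Many customers share the same address or branch pair, so the saving can be substantial.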

Did you find the article useful? Share with us more use cases of the Google Maps API apart from the ones mentioned in this article. Also share any links to related videos or articles on leveraging the Google Maps API. Do let us know your thoughts about this article in the box below.

If you like what you just read and want to continue your analytics learning, subscribe to our emails, follow us on Twitter or like our Facebook page.

6 Comments

  • sumalatha says:

    Is it a sophisticated implementation of any of the methods that are used to solve transportation problems (like modi method, VAM method etc)?

  • Tavish Srivastava says:

    Hi Sumalatha,
    This article is on how to find distances between two points and has nothing to do with transportation problem frameworks. However, you can use the method described in this article to find distances between points and then use algorithms like VAM and Modi, to optimize routes. Thank you for sharing this thought. This just adds a new business case to the list I have mentioned in this article : Using distance mapping techniques in transportation problems.

    Tavish

  • Sunny says:

    First of all, thanks a lot for sharing this wonderful article. It really helps students like me to think beyond text-books. I was trying to replicate this project and came up with some doubts (may be very silly). I will be obliged if you can answer it:

    a) While generating the Key from Google Maps – which one (Directions API/Distance Matrix API/any other) to select. Else if, the key is totally different than the three options above, can you please include a brief process to generate the Key.

    b) Will this code work for source and destination as latitude and longitude. If not, can something be done to calculate distance based on these.

    Thanks a lot in advance! It really helps 🙂

  • Pratima Joshi says:

    Hi,
    I had the same question/suggestion as sunny to use Longitude/Latitude instead of PIN code to find the distance between source and destination.

  • Pratima Joshi says:

    Also, 2 more business cases:
    1. Finding nearby restaurants from a location and providing suggestions to the user.
    2. Finding nearby health service providers (hospitals, clinics, pharmacies etc.)

  • Ajas says:

    Hi Tavish,

    Thanks for sharing, it was great reading it. I am having one question, in common approach “Why do we need to calculate the centroid of point of interests and take euclidean distance?” can’t we just take euclidean distance directly between point of interests ? (it may be a silly question, but still i want to know.)
    Thanks
