Read and Update Google Spreadsheets with Python!
- Learn how to set up a Google service account
- Read and write data in Google Spreadsheets using Python
Automating work is one of the quickest ways to improve operational efficiency. In an era where success depends on speed, automating repetitive tasks plays a key role in every industry, even at the most basic level. Yet many of us do not know how to automate such tasks and end up stuck in a loop of doing the same things manually again and again.
For instance, we often spend hours every day extracting data, copy-pasting it into spreadsheets, and creating reports. It would be great if we could just run a script that uploads the data to the spreadsheet and prepares the report with a single click. Report automation has several advantages: you save time on data collection, avoid typos, and can focus more on the analysis.
In this article, we will go through a step-by-step process to set up a Google service account. We will use the Google APIs to read Google Spreadsheets data with Python, and we will also update the data in the spreadsheet with Python. We are going to read cricket commentary data from a spreadsheet, find the number of runs scored by each batsman, and then upload the results to a separate sheet.
In case you are unfamiliar with Python, do have a look at our free course Introduction to Python
Table of Contents
- Create Google Service Account
- Read Data from Google Sheets
- Update Data in Google Sheets
Create Google Service Account
To read and update Google Spreadsheets data in Python, we have to create a service account. It is a special type of account used to make authorized API calls to Google Cloud services. First of all, make sure that you have a Google account. Then follow these steps to create a Google service account.
- Go to the developer’s console and click on the Create Project button.
- Provide the project name and, optionally, the organization name. Then click on the Create button.
- Now that our project is created, we need to enable the APIs that this project requires. Click on the Enable APIs and Services button to search for the APIs that Google provides. We will add two APIs to our project:
- Google Sheets API
- Google Drive API
- Then, in the search bar, search for these APIs and click on the enable button.
- The Google Sheets API allows you to access Google Spreadsheets, so you can read and modify their content.
- The Google Drive API allows you to access resources in Google Drive.
- Once you have enabled the required APIs in your project, it’s time to create credentials for the service account. Click on the Create Credentials button to continue.
- Now, select Google Drive API for the question about the type of API required. We will be calling the API from a non-UI platform, so select Other non-UI (e.g. cron job, daemon). Select Application Data in the next question, as we do not require any user data to run our application. We are also not using any cloud-based compute engine for our application. Finally, click on the What credentials do I need? button.
- Just as you would share a Google Spreadsheet with other people and grant them edit or view-only permission, we will grant access to our service account. We will give it complete access so that we can both read and write the spreadsheets, and then download the JSON file of the credentials.
Now, a JSON file will be downloaded which contains the keys to access the API. Our Google service account is ready to use. In the next section, we will read and modify the data in the spreadsheet.
Read Data from Google Sheets
We will read the commentary data of the India-Bangladesh cricket match. You can access the data here.
The spreadsheet contains ball-by-ball data for the complete match. We will do a very basic task and calculate how many runs were scored by each batsman. We can do this with a simple groupby in pandas. Finally, we will upload the results to a separate sheet.
Provide access to the Google Sheet
Now, we need to provide access to the Google Sheet so that the API can access it. Open the JSON file that we downloaded from the developer’s console. Look for the client_email field in the JSON file and copy its value.
Then click on the Share button on the spreadsheet and provide access to this client email.
Now, we are ready to code and access the sheet using Python. The steps are as follows.
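If you prefer not to open the file by hand, the client_email value can also be read with a few lines of Python. This is only a small sketch using the standard json module; the function name read_client_email is made up for illustration, and the file path is whatever name your downloaded credentials file has.

```python
import json


def read_client_email(keyfile_path):
    """Return the service account's email address from the credentials file."""
    with open(keyfile_path, encoding="utf-8") as f:
        key_data = json.load(f)
    # The service-account key file stores the address under "client_email".
    return key_data["client_email"]
```

For example, `print(read_client_email("credentials.json"))` would print the address to share the sheet with, assuming the file was saved under that hypothetical name.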
1. Importing the Libraries
We will use the gspread and oauth2client libraries to authorize and make API calls to Google Cloud services.
You can install the libraries using the following commands.
!pip3 install gspread
!pip3 install --upgrade google-api-python-client oauth2client
2. Define the scope of the application
Then, we will define the scope of the application and add the JSON file that has the credentials to access the API.
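A minimal sketch of this step could look as follows. The two scope URLs are the ones commonly used with gspread for Sheets and Drive access, and the helper name make_client is an assumption made for this example. The third-party imports sit inside the function so the file can be loaded even before the libraries are installed.

```python
# Scopes requested for the service account: spreadsheet feeds and Drive access.
SCOPE = [
    "https://spreadsheets.google.com/feeds",
    "https://www.googleapis.com/auth/drive",
]


def make_client(keyfile_path):
    """Return an authorized gspread client for a service-account key file.

    keyfile_path is the path to the JSON file downloaded from the console.
    """
    # Imported here so that this module loads without the libraries installed.
    import gspread
    from oauth2client.service_account import ServiceAccountCredentials

    # Build the credentials from the key file, limited to the scopes above.
    creds = ServiceAccountCredentials.from_json_keyfile_name(keyfile_path, SCOPE)
    # Exchange the credentials for a client object that talks to the API.
    return gspread.authorize(creds)
```

Calling `client = make_client("credentials.json")` (a hypothetical file name) gives the client object used in the following steps. Recent gspread releases also offer `gspread.service_account(filename=...)`, which may be a simpler alternative since oauth2client is deprecated.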
3. Create the Sheet Instance
Use the client object to open the spreadsheet. You just need to pass the title of the spreadsheet as the argument. Alternatively, you can pass the URL of the spreadsheet.
Access a particular sheet: A single spreadsheet can contain multiple sheets. You can access a particular sheet by passing its index to the get_worksheet function. For the first sheet, pass the index 0, and so on.
The API provides some basic functionality, such as getting the number of columns with col_count and reading the value of a particular cell.
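The two ways of opening a spreadsheet can be sketched like this. The wrapper open_spreadsheet is a made-up helper for illustration; it relies on gspread's client methods open (by title) and open_by_url, and it expects the authorized client from the previous step.

```python
def open_spreadsheet(client, title=None, url=None):
    """Open a spreadsheet with an authorized gspread client, by URL or by title."""
    if url is not None:
        # Opening by URL avoids ambiguity when several files share a title.
        return client.open_by_url(url)
    if title is not None:
        # Opening by title looks the spreadsheet up among the shared files.
        return client.open(title)
    raise ValueError("Pass either a title or a url")
```

For instance, `sheet = open_spreadsheet(client, title="commentary")`, where "commentary" stands in for whatever your spreadsheet is called.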
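As a rough illustration of these calls, the sketch below picks a worksheet by index and reads its column count and one cell. The helper name describe_worksheet and the default row and column are arbitrary choices for this example; get_worksheet, col_count and cell are gspread's own.

```python
def describe_worksheet(sheet, index=0, row=2, col=3):
    """Return the worksheet at `index`, its column count, and one cell's value."""
    worksheet = sheet.get_worksheet(index)       # index 0 is the first sheet
    n_cols = worksheet.col_count                 # number of columns in the grid
    cell_value = worksheet.cell(row, col).value  # row and col are 1-based
    return worksheet, n_cols, cell_value
```

The returned worksheet object is the "sheet instance" that the next steps read from.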
4. Get all records
Then, we will get all the data present in the sheet using the get_all_records function. It returns a list of dictionaries, one per row, with the values of the header row as keys.
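A short sketch of this call, wrapped in a hypothetical helper named preview_records so that only the first few rows are shown:

```python
def preview_records(worksheet, n=2):
    """Fetch all rows as dictionaries and return the first n of them."""
    # get_all_records uses the first row of the sheet as the dictionary keys.
    records = worksheet.get_all_records()
    return records[:n]
```

Printing the result of `preview_records(sheet_instance)` is a quick way to check the column names before building a dataframe; sheet_instance here is the worksheet object obtained in step 3.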
5. Convert the Dictionary to the Dataframe
In data science, pandas is one of the most preferred libraries for data manipulation tasks. So we will first convert the list of dictionaries into a pandas dataframe.
In case you are not comfortable with pandas, I would highly recommend enrolling in this free course: Pandas for Data Analysis in Python
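The conversion itself is short. In the sketch below, records is the list of dictionaries returned in step 4, and the function name records_to_dataframe is just an illustrative wrapper around the pandas DataFrame constructor.

```python
import pandas as pd


def records_to_dataframe(records):
    """Build a dataframe with one row per record and one column per key."""
    return pd.DataFrame(records)
```

Each dictionary key becomes a column, so the dataframe columns match the header row of the sheet.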
6. Grouping Batsman
Then, we will group the data by batsman and sum the runs scored by each one. The resulting dataframe is what we will upload to the separate sheet.
Now, we will add this dataframe to the Google Sheet.
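A sketch of the grouping step is shown below. The column names "Batsman" and "Runs" are assumptions about the header row of the commentary sheet, so adjust them to match your data; runs_per_batsman is a made-up helper name.

```python
import pandas as pd


def runs_per_batsman(records_df, batsman_col="Batsman", runs_col="Runs"):
    """Return one row per batsman with the total of the runs column."""
    # as_index=False keeps the batsman name as a regular column,
    # which makes the result easy to turn into rows for the sheet.
    return records_df.groupby(batsman_col, as_index=False)[runs_col].sum()
```

With `runs = runs_per_batsman(records_df)`, the runs dataframe has two columns: the batsman and the total runs.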
Update Data in Google Sheets
The following are steps to update data in google sheets.
Create a Separate Sheet
First, we will create a separate sheet to store the results. For that, use the add_worksheet function and pass the title of the sheet and the number of rows and columns required. After that, get the instance of the second sheet by providing its index, which is 1.
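This step might be written as follows. The title "runs" and the size of 10 rows by 2 columns are example values, and create_results_sheet is a hypothetical helper. Note that add_worksheet already returns the new worksheet, so the sketch uses that return value directly.

```python
def create_results_sheet(sheet, title="runs", rows=10, cols=2):
    """Add a new worksheet to the spreadsheet and return it."""
    # add_worksheet appends the new sheet after the existing ones and returns it.
    # If the spreadsheet had exactly one sheet before, this is the same object
    # you would get from sheet.get_worksheet(1).
    return sheet.add_worksheet(title=title, rows=rows, cols=cols)
```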
Once you run this command, you will see that a separate sheet is created.
Update values to the sheet
Then, convert the runs dataframe into a 2-D list and add the values to the sheet with a single function call. You will get back a message with the number of rows and columns updated, along with some more details.
To summarize, in this article we walked through the steps involved in creating a service account and saw how to read and write Google Spreadsheets right from your Python console. We downloaded the spreadsheet data, converted it into a pandas dataframe, created a groupby table, and uploaded it back to the spreadsheet. This API can be very helpful in automating reports.
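The text does not name the function, so the sketch below assumes gspread's insert_rows, which takes a list of rows and writes them starting at the first row of the worksheet. The helper name upload_dataframe and the include_header option are additions for illustration only.

```python
def upload_dataframe(worksheet, df, include_header=False):
    """Write the dataframe's values into the worksheet and return the API response."""
    # Turn the dataframe into a 2-D list: one inner list per row.
    values = df.values.tolist()
    if include_header:
        # Optionally put the column names in front as the first row.
        values = [df.columns.tolist()] + values
    # insert_rows sends all rows in one request, starting at row 1 by default.
    return worksheet.insert_rows(values)
```

A call such as `upload_dataframe(sheet_runs, runs)` would then upload the grouped results, where sheet_runs is the worksheet created in the previous step.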
In case you want to brush up on your spreadsheet concepts, I recommend the following article and course.
I hope this helps you in automating scripts and saving loads of your valuable time. Reach out in the comment section in case of any doubts. I will be happy to help.