Ahh, user rights. The cause and solution to all of life’s identification problems. In this blog post I explain how you can access (private) google spreadsheets using the Python gspread library. Before you get started: make sure you have administrator rights to the spreadsheets you are trying to work with.
Let’s get started:
- Create a Google Cloud Platform project.
- Enable the Sheets API
- Go to the service account page, select your project and create a new service account.
- Press ‘Continue’ when you reach step 2 if you don’t want to give this service account access to the GCP project
- In step 3, click Create Key and select JSON. Download the key.
- Once downloaded, rename the file to cred.json and move it to your project folder.
- Use the following code authorize and load in the spreadsheets. Please, fill in your spreadsheet URL. Don’t run the code just yet. Else, you’ll run into the ‘The caller does not have permission’ (403) error.
import pandas as pd import gspread from oauth2client.service_account import ServiceAccountCredentials import numpy as np scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive'] credentials = ServiceAccountCredentials.from_json_keyfile_name('cred.json', scope) gc = gspread.authorize(credentials) gsheet = gc.open_by_url('<YOUR URL HERE>') spreadsheets = [spreadsheet.get_all_values() for spreadsheet in gsheet.worksheets()] headers = [data.pop(0) for data in spreadsheets] data = [pd.DataFrame(spreadsheets[i], columns = headers[i]) for i in range(0,len(spreadsheets))]
- Go back to the service account page and copy the email address associated with your service account.
- Go to your spreadsheet and click Share in the top right corner.
- Click Advanced in the bottom right corner of the window that pops up.
- Send the invite to the service account by pasting the email address and clicking Send.
If you wait for a minute now and run the Python code, you will load in all the data inside your spreadsheets inside a list of pandas DataFrames.