How to access a Google spreadsheet with Python
井民全, Jing, mqjing@gmail.com
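This document shows how to access a Google spreadsheet from Python.
First, you need to set up an online service (a Google Cloud project) dedicated to accessing Google Drive and Google Sheets.
Our Python code will then go through this Google API service
to access the user's spreadsheet.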
1. Setup Google API
1.1. Create a Google Cloud project
Step 1: Create a Project
[Select a Project] -> [New Project]
Step 2: Enable the [Drive] and [Sheets] APIs
[Google APIs] -> [Library] or [Enable APIs and Services]
1.2. Get credentials for your services
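Not just anyone can knock on the door and use this service; only an app that holds a credential (the secret JSON) may use it.
When an authorized app accesses a user's documents, we want it to obtain the user's consent through OAuth.
So we set up an OAuth-type credential.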
Step 1: Set up the OAuth credential
[Create Credentials] -> on the left side, under [Credentials]: [OAuth consent screen]
Project Name
Step 2: Create an OAuth client ID for generating access tokens.
Command:
(a) [Create credentials] -> [OAuth client ID]: choose Other, and set the name to climb-auto-gsheet
(b) Then download the secret JSON and save it as client_secret.json, the file name the script in section 3 expects
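Before running anything against the API, it can help to check that the downloaded secret JSON looks right. The following is a minimal sketch, assuming the usual layout of an OAuth client secret file: one top-level section named "installed" (desktop/Other clients) or "web", holding at least client_id, client_secret, auth_uri and token_uri. The helper name check_client_secret and the sample values are made up for illustration; the sample is not a real credential.

```python
import json

# Keys we assume every OAuth client secret section carries.
REQUIRED_KEYS = ("client_id", "client_secret", "auth_uri", "token_uri")


def check_client_secret(text):
    """Return the client type ('installed' or 'web') if the JSON looks usable.

    Hypothetical helper: raises ValueError when the expected section or keys
    are missing.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("client secret must be a JSON object")
    for client_type in ("installed", "web"):
        section = data.get(client_type)
        if isinstance(section, dict):
            missing = [key for key in REQUIRED_KEYS if key not in section]
            if missing:
                raise ValueError("client secret is missing keys: " + ", ".join(missing))
            return client_type
    raise ValueError("no 'installed' or 'web' section found")


# Placeholder content for illustration only.
sample = json.dumps({
    "installed": {
        "client_id": "example-id",
        "client_secret": "example-secret",
        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
        "token_uri": "https://oauth2.googleapis.com/token",
    }
})

if __name__ == "__main__":
    print(check_client_secret(sample))
```

To check your own file, read it first, for example check_client_secret(open('client_secret.json').read()).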
2. Create a sheet to work with
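After creating the sheet, note its ID: the script in section 3 needs it, and it is the part of the sheet's URL between /spreadsheets/d/ and the next slash. Here is a minimal sketch for pulling it out of a URL. The function name spreadsheet_id_from_url and the example URL are made up for illustration.

```python
import re

# The ID is the path segment right after /spreadsheets/d/ .
_SHEET_ID_RE = re.compile(r"/spreadsheets/d/([a-zA-Z0-9_-]+)")


def spreadsheet_id_from_url(url):
    """Return the spreadsheet ID embedded in a Google Sheets URL."""
    match = _SHEET_ID_RE.search(url)
    if match is None:
        raise ValueError("not a Google Sheets URL: " + url)
    return match.group(1)


if __name__ == "__main__":
    # Hypothetical URL; replace it with the URL of your own sheet.
    example_url = "https://docs.google.com/spreadsheets/d/EXAMPLE_ID_123/edit#gid=0"
    print(spreadsheet_id_from_url(example_url))
```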
3. Python Code
3.1. Code
gsheet_test.py
from __future__ import print_function

from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools
import pandas as pd

SPREADSHEET_ID = '1ZsGOUlpYP************TU0'  # Spreadsheet ID; you can get it from the sheet's URL.
RANGE_NAME = 'Sheet1'  # Name of the worksheet to read

def get_google_sheet(spreadsheet_id, range_name):
    """Fetch a range of values from a Google spreadsheet via the Sheets API."""
    # Our app only asks to read the user's Google spreadsheets.
    scopes = 'https://www.googleapis.com/auth/spreadsheets.readonly'
    # Set up the Sheets API. credentials.json caches the user's token between runs.
    store = file.Storage('credentials.json')
    creds = store.get()
    if not creds or creds.invalid:
        # No valid cached token: run the OAuth consent flow.
        flow = client.flow_from_clientsecrets('client_secret.json', scopes)
        creds = tools.run_flow(flow, store)
    service = build('sheets', 'v4', http=creds.authorize(Http()))
    # Call the Sheets API
    gsheet = service.spreadsheets().values().get(spreadsheetId=spreadsheet_id, range=range_name).execute()
    return gsheet

def gsheet2df(gsheet):
    """Converts Google sheet data to a Pandas DataFrame.

    Note: This function assumes that the first row of your data is a header row.
    Also note that the Sheets API omits trailing empty cells, so a row can be
    shorter than the header. For the code below to work, make sure your sheet
    doesn't contain empty cells, or update the code to account for such rows.
    """
    rows = gsheet.get('values', [])
    if len(rows) < 2:
        # Nothing besides (at most) a header row: return an empty DataFrame.
        print('No data found.')
        return pd.DataFrame()
    header = rows[0]   # Assumes first line is header!
    values = rows[1:]  # Everything else is data.
    all_data = []
    for col_id, col_name in enumerate(header):
        column_data = []
        for row in values:
            column_data.append(row[col_id])
        ds = pd.Series(data=column_data, name=col_name)
        all_data.append(ds)
    df = pd.concat(all_data, axis=1)
    return df

gsheet = get_google_sheet(SPREADSHEET_ID, RANGE_NAME)
print('gsheet = ', gsheet)
df = gsheet2df(gsheet)
print('Dataframe size = ', df.shape)
print(df.head())
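The docstring of gsheet2df warns that empty cells need extra handling. One way to deal with them is to pad every data row to the width of the header before building the DataFrame. This is a sketch only, assuming the response dict has the shape returned by the values().get() call above, with rows under the 'values' key and trailing empty cells left out. The helper name pad_rows and the sample data are made up for illustration.

```python
def pad_rows(gsheet):
    """Split a values response into (header, rows), padding short rows with ''.

    Hypothetical helper: rows longer than the header are truncated to its width.
    """
    values = gsheet.get('values', [])
    if not values:
        return [], []
    header, rows = values[0], values[1:]
    width = len(header)
    # [''] * n is an empty list when n <= 0, so full-width rows pass through unchanged.
    padded = [row[:width] + [''] * (width - len(row)) for row in rows]
    return header, padded


# Made-up sample in the shape of a Sheets API values response.
sample = {
    'range': 'Sheet1!A1:C3',
    'majorDimension': 'ROWS',
    'values': [
        ['name', 'grade', 'note'],
        ['route A', '5.10a'],          # trailing empty cell omitted
        ['route B', '5.9', 'slab'],
    ],
}

if __name__ == '__main__':
    header, rows = pad_rows(sample)
    print(header)
    print(rows)
```

With the rows padded, pd.DataFrame(rows, columns=header) builds the same kind of table as gsheet2df without failing on short rows.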
3.2. Run
python gsheet_test.py
The user will be asked whether to allow our Python program to access Google Sheets.
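The consent prompt should only appear on the first run: the script stores the resulting token in credentials.json and reuses it afterwards. If you later change the scope (for example from read-only to read/write), the cached token no longer matches, and deleting the file should make the next run ask for consent again. This is a small sketch under that assumption; the helper name reset_cached_token is made up for illustration.

```python
import os


def reset_cached_token(path='credentials.json'):
    """Delete the cached token file; return True if a file was removed."""
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


if __name__ == '__main__':
    if reset_cached_token():
        print('Cached token removed; the next run will ask for consent again.')
    else:
        print('No cached token found.')
```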
3.3. Result
4. Reference
5. Further Reading
- Get credentials, https://gspread.readthedocs.io/en/latest/oauth2.html