Claude-skill-registry google-drive-file-processor
Workflow and ready-to-import helpers for connecting to Google Drive with a service account, listing folders, and routing files based on MIME type. Use this skill whenever you need to download/export Docs, Sheets, Slides, Forms, or arbitrary binaries and surface their contents as pandas tables or local artifacts.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/google-drive-file-processor" ~/.claude/skills/majiayu000-claude-skill-registry-google-drive-file-processor && rm -rf "$T"
manifest:
skills/data/google-drive-file-processor/SKILL.md
Google Drive File Processor
Codex already knows the Google APIs exist; what it needs is a self-contained replica of the OpsDashboard helpers. Copying this skill directory into any project gives you `references/google_drive_processing_example.py`, which contains production-ready helpers (`build_drive_clients`, `process_drive_file`, `process_drive_folder`, `fetch_sheet_tables`, etc.) that require only a service-account JSON blob.
Authentication & secrets
- Store the full service account JSON (plus `folder_id`) under `st.secrets["gdrive_secrets"]`, or load it from disk and pass the dict directly to `build_drive_clients`.
- Always request the read-only scopes used here: `https://www.googleapis.com/auth/drive.readonly`, `https://www.googleapis.com/auth/spreadsheets.readonly`, and `https://www.googleapis.com/auth/presentations.readonly`.
- Build clients with `service_account.Credentials.from_service_account_info(..., scopes=SCOPES)` and `googleapiclient.discovery.build`.
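The steps above can be sketched as a small client factory. This is a minimal sketch mirroring the `build_drive_clients` helper, not its actual implementation; the real signature in `references/google_drive_processing_example.py` may differ, and it assumes `google-auth` and `google-api-python-client` are installed.

```python
# Read-only scopes required by the skill's helpers.
SCOPES = [
    "https://www.googleapis.com/auth/drive.readonly",
    "https://www.googleapis.com/auth/spreadsheets.readonly",
    "https://www.googleapis.com/auth/presentations.readonly",
]

def build_drive_clients(service_account_info: dict):
    """Build read-only Drive, Sheets, and Slides clients from a
    service-account JSON dict (sketch; real helper may differ)."""
    # Imported lazily so the module loads even without the Google libraries.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    creds = service_account.Credentials.from_service_account_info(
        service_account_info, scopes=SCOPES
    )
    drive = build("drive", "v3", credentials=creds)
    sheets = build("sheets", "v4", credentials=creds)
    slides = build("slides", "v1", credentials=creds)
    return drive, sheets, slides
```

In a Streamlit app the dict would typically come from `st.secrets["gdrive_secrets"]`; elsewhere, `json.load(open(path))` works just as well.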
Folder listing pattern
- Pull `folder_id` from the secret, defaulting to the whole Drive when it is missing.
- Compose the Drive query (`"'<folder_id>' in parents"` plus MIME filters when needed). `src/show_sheet_explorer._fetch_sheet_metadata` shows the paging pattern when you need more than 200 files.
- Call `drive.files().list(..., includeItemsFromAllDrives=True, supportsAllDrives=True)` and capture `id`, `name`, and `mimeType`.
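Put together, the listing-plus-paging pattern can be sketched as a generator. The query string and `fields` follow the Drive v3 `files.list` API; the helper name itself is illustrative, not taken from the skill.

```python
def list_folder(drive, folder_id=None, page_size=200):
    """Yield {id, name, mimeType} dicts for a folder, following
    nextPageToken until all pages are consumed (illustrative helper)."""
    # No folder_id means "whole Drive": omit the parents filter entirely.
    query = f"'{folder_id}' in parents" if folder_id else None
    token = None
    while True:
        resp = drive.files().list(
            q=query,
            pageSize=page_size,
            fields="nextPageToken, files(id, name, mimeType)",
            includeItemsFromAllDrives=True,
            supportsAllDrives=True,
            pageToken=token,
        ).execute()
        yield from resp.get("files", [])
        token = resp.get("nextPageToken")
        if not token:  # last page reached
            break
```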
MIME routing matrix
Re-use the handlers implemented inside
references/google_drive_processing_example.py:
| MIME | Result |
|---|---|
| Google Sheets (`application/vnd.google-apps.spreadsheet`) | Dict of tab -> rows/DataFrames |
| Google Docs (`application/vnd.google-apps.document`) | Returns ordered text snippets |
| Google Slides (`application/vnd.google-apps.presentation`) | Saves an exported copy |
| Google Forms (`application/vnd.google-apps.form`) | None; emits a warning because Drive cannot export responses |
| Other Google-native types | Saves an exported copy |
| Everything else (PPTX, XLSX, PDF, etc.) | Saves raw bytes |
process_drive_file(...) already contains this matrix and returns a dict describing the work performed (artifact path + metadata). Re-use it instead of re-implementing the branching logic.
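For orientation, that branching can be sketched as a simple router. The route labels below are illustrative only (apart from `fetch_sheet_tables`, which the skill does expose); the real handlers live inside `process_drive_file`.

```python
# Well-known Google-native MIME types.
GOOGLE_SHEET = "application/vnd.google-apps.spreadsheet"
GOOGLE_DOC = "application/vnd.google-apps.document"
GOOGLE_SLIDES = "application/vnd.google-apps.presentation"
GOOGLE_FORM = "application/vnd.google-apps.form"

def route_mime(mime_type: str) -> str:
    """Map a Drive MIME type to the kind of processing it should
    receive (labels illustrative; see process_drive_file for the real matrix)."""
    routes = {
        GOOGLE_SHEET: "fetch_sheet_tables",  # dict of tab -> rows/DataFrames
        GOOGLE_DOC: "export_text",           # ordered text snippets
        GOOGLE_SLIDES: "export_file",        # saved export
        GOOGLE_FORM: "warn_unsupported",     # Drive cannot export responses
    }
    if mime_type in routes:
        return routes[mime_type]
    if mime_type.startswith("application/vnd.google-apps."):
        return "export_file"                 # other Google-native types
    return "download_raw"                    # PPTX, XLSX, PDF, etc.
```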
Recommended workflow
- Build Drive/Sheets/Slides clients once per request and pass them into helpers; cache expensive sheet reads with `@st.cache_data(ttl=3600)` if the UI displays them repeatedly.
- Iterate files, call the right handler, and collect structured outputs (dataframes, exported files). Persist exports on disk or keep them in memory (`BytesIO`) before attaching them to downstream tasks.
- When surfacing results in Streamlit, expose controls to refresh the cache, filter by sheet name, and show links using `st.data_editor` (see `show_sheet_explorer` for the pattern).
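The iterate-and-collect step above can be sketched as a small dispatch loop. The function name, handler map, and record shape are illustrative assumptions, not the skill's API; in practice `process_drive_file` plays the role of the handlers here.

```python
from io import BytesIO

def process_files(files, handlers, default_handler):
    """Dispatch each listed file to a handler chosen by MIME type and
    collect one structured record per file (illustrative sketch)."""
    outputs = []
    for f in files:
        handler = handlers.get(f["mimeType"], default_handler)
        outputs.append({
            "id": f["id"],
            "name": f["name"],
            "mimeType": f["mimeType"],
            "result": handler(f),  # e.g. DataFrame dict, path, or BytesIO
        })
    return outputs
```

Keeping raw downloads in `BytesIO` (rather than on disk) is convenient when the bytes are only needed to feed a downstream task in the same request.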
Example resources
`references/google_drive_processing_example.py` exposes reusable helpers plus `process_drive_folder(...)` and `fetch_sheet_tables(...)`. Import it directly or execute it as a module to download/export folders in other projects; no OpsDashboard dependencies remain.