Claude-skill-registry kaggle-api-expert

Expert agent for Kaggle API authentication, dataset management, and running Kaggle notebooks on Texas Tech HPCC. Specializes in connecting Jupyter notebooks to Kaggle API and submitting to code competitions. Always checks VPN connection first before HPCC operations.

install

source · Clone the upstream repo

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/kaggle-api-expert" ~/.claude/skills/majiayu000-claude-skill-registry-kaggle-api-expert && rm -rf "$T"

manifest: skills/data/kaggle-api-expert/SKILL.md

Kaggle API Expert

This agent specializes in Kaggle API operations, dataset management, and running Kaggle notebooks on Texas Tech HPCC (High Performance Computing Center). The agent understands authentication, dataset creation, competition submissions, and HPCC-specific configurations.

Purpose and Scope

Help users with:

Kaggle API Authentication: Setting up and configuring Kaggle API credentials
Dataset Management: Creating, uploading, and managing Kaggle datasets
HPCC Integration: Running Kaggle notebooks on Texas Tech HPCC infrastructure
Jupyter Integration: Connecting Jupyter notebooks to Kaggle API
Competition Submissions: Submitting to Kaggle code competitions
VPN Verification: Ensuring VPN connection before HPCC operations

Target Location

This skill is project-specific and should be stored at:

```
.cursor/skills/kaggle-api-expert/
```
(in the spring-2026 project)

Trigger Scenarios

Automatically apply this skill when users request:

Kaggle API setup or authentication
Creating or managing Kaggle datasets
Running Kaggle notebooks on HPCC
Connecting Jupyter notebooks to Kaggle API
Submitting to Kaggle competitions
HPCC Jupyter notebook setup

Prerequisites Check: VPN Connection

CRITICAL: Before any HPCC operations, always verify VPN connection:

Ask the user: "Are you connected to the Texas Tech VPN?"
Verify connection: User must be connected to TTU VPN to access HPCC resources
If not connected: Provide VPN connection instructions before proceeding

HPCC Access Requirements:

Texas Tech VPN connection (required)
TTU credentials for HPCC access
Jupyter notebook access on HPCC

Key Domain Knowledge

Kaggle API Authentication

The agent understands:

API Credentials: `kaggle.json` file format and location
Token Management: Using Kaggle username and API token
Environment Setup: Setting up Kaggle API in various environments
Authentication Methods: Credential file vs. environment variables

Dataset Operations

The agent understands:

Creating Datasets: Creating new datasets via API
Uploading Data: Uploading files and metadata
Version Management: Managing dataset versions
Privacy Settings: Public, private, and organization datasets

HPCC-Specific Knowledge

The agent understands:

HPCC Jupyter Setup: Accessing Jupyter notebooks on HPCC
VPN Requirements: Always check VPN before HPCC access
Resource Allocation: Understanding HPCC compute resources
Data Storage: HPCC filesystem and data management
Module Loading: Loading software modules on HPCC

Competition Submissions

The agent understands:

Downloading Competitions: Getting competition data
Creating Notebooks: Setting up Kaggle notebooks
Submitting Predictions: Submitting results to competitions
Leaderboard Access: Checking competition standings

Kaggle API Setup

Step 1: Verify VPN Connection (HPCC Only)

IMPORTANT: If working with HPCC, verify VPN connection first:

Before proceeding, please confirm:
- Are you connected to the Texas Tech VPN?
- Can you access HPCC resources?
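
Alongside asking the user, a quick reachability probe can serve as a rough proxy for the VPN check. The following is a minimal sketch, not an official TTU procedure: the function name `can_reach`, the default port 22 (SSH), and the placeholder host in the usage comment are all assumptions for illustration. A successful connection only suggests the network path is open; it does not prove the VPN is the reason.

```python
import socket


def can_reach(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures
        return False


# Hypothetical usage: substitute the HPCC login host from the TTU HPCC docs.
# if not can_reach("<hpcc-login-host>"):
#     print("HPCC is not reachable. Are you connected to the Texas Tech VPN?")
```

If the probe fails, fall back to the VPN connection instructions before attempting any HPCC operation.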

Step 2: Get Kaggle API Credentials

Log in to Kaggle: Visit https://www.kaggle.com/
Navigate to Account: Click your profile → Account tab
Create API Token: Scroll to "API" section → Click "Create New Token"
Download `kaggle.json`: Save the file (contains username and key)

Step 3: Install Kaggle API

# Install Kaggle API package
pip install kaggle

# Or using conda
conda install -c conda-forge kaggle

Step 4: Configure Credentials

Option 1: Credential File (Recommended)

# Create .kaggle directory
mkdir -p ~/.kaggle

# Move kaggle.json to .kaggle directory
mv ~/Downloads/kaggle.json ~/.kaggle/

# Set permissions (required by Kaggle)
chmod 600 ~/.kaggle/kaggle.json

Option 2: Environment Variables

export KAGGLE_USERNAME="your-username"
export KAGGLE_KEY="your-api-key"

Option 3: For HPCC Jupyter

When using Jupyter notebooks on HPCC, credentials should be placed in the home directory:

# On HPCC (after VPN connection)
mkdir -p ~/.kaggle
# Upload kaggle.json to ~/.kaggle/kaggle.json
chmod 600 ~/.kaggle/kaggle.json
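
To see which of the options above is actually in effect, a small resolver can help. This is a sketch under stated assumptions: `load_kaggle_credentials` is a hypothetical helper, not part of the Kaggle package, and the lookup order it uses (environment variables first, then `~/.kaggle/kaggle.json`) is a choice made for this example.

```python
import json
import os
from pathlib import Path


def load_kaggle_credentials(config_dir=None, env=None):
    """Return (username, key) from environment variables or kaggle.json.

    Returns None when neither source provides both values.
    """
    env = os.environ if env is None else env
    username, key = env.get("KAGGLE_USERNAME"), env.get("KAGGLE_KEY")
    if username and key:
        return username, key

    # Fall back to the credential file (default location: ~/.kaggle)
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    cred_file = config_dir / "kaggle.json"
    if not cred_file.is_file():
        return None
    data = json.loads(cred_file.read_text())
    if data.get("username") and data.get("key"):
        return data["username"], data["key"]
    return None
```

A `None` result means no complete credential source was found, which is worth checking before running any `kaggle` command.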

Step 5: Verify Installation

kaggle --version
kaggle competitions list
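
When `kaggle --version` fails, it helps to know whether the executable is on `PATH` at all. A minimal sketch using only the standard library follows; `find_cli` and its `search_path` parameter are hypothetical names introduced for this example.

```python
import shutil


def find_cli(name: str = "kaggle", search_path=None):
    """Return the full path of an executable, or None if it cannot be found.

    search_path defaults to the PATH environment variable.
    """
    return shutil.which(name, path=search_path)


# Example: report where the CLI lives, or hint at the usual fix
location = find_cli()
if location is None:
    print("kaggle not found on PATH; it may be in ~/.local/bin after a --user install")
else:
    print(f"kaggle found at {location}")
```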

Creating Datasets

Basic Dataset Creation

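# Create a new dataset (the folder must contain dataset-metadata.json)
kaggle datasets create -p /path/to/dataset

# Upload subdirectories as zip archives (directories are skipped by default)
kaggle datasets create -p /path/to/dataset -r zip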

Dataset Structure

dataset-directory/
├── data.csv
├── dataset-metadata.json
└── README.md

dataset-metadata.json example:

{
  "title": "Dataset Title",
  "id": "username/dataset-name",
  "licenses": [{"name": "CC0-1.0"}]
}
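
The metadata file can also be generated from a script, which avoids hand-editing JSON. The sketch below writes the same three fields shown above; `write_dataset_metadata` and its parameter names are hypothetical and chosen for illustration.

```python
import json
from pathlib import Path


def write_dataset_metadata(folder, title, owner, slug, license_name="CC0-1.0"):
    """Write dataset-metadata.json into folder and return its path."""
    metadata = {
        "title": title,
        "id": f"{owner}/{slug}",  # same "username/dataset-name" form as above
        "licenses": [{"name": license_name}],
    }
    path = Path(folder) / "dataset-metadata.json"
    path.write_text(json.dumps(metadata, indent=2) + "\n")
    return path
```

For example, `write_dataset_metadata("dataset-directory", "Dataset Title", "username", "dataset-name")` produces the file shown in the example above, assuming the folder already exists.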

Uploading Dataset

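# Create and upload dataset
kaggle datasets create -p /path/to/dataset -r zip
# Upload a new version of an existing dataset
kaggle datasets version -p /path/to/dataset -m "Version notes"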

HPCC Jupyter Setup

Prerequisites (Check First!)

CRITICAL: Before HPCC operations:

✅ VPN Connection: User must be connected to Texas Tech VPN
✅ HPCC Access: User must have HPCC account and credentials
✅ Jupyter Access: User must have Jupyter notebook access on HPCC

Accessing HPCC Jupyter

Connect to VPN: Use Texas Tech VPN client
Access HPCC Portal: Visit HPCC Jupyter Portal (or check HPCC documentation)
Launch Jupyter: Start Jupyter notebook session
Verify Network: Ensure Kaggle API access from Jupyter environment

Installing Kaggle API in HPCC Jupyter

# In Jupyter notebook cell
!pip install kaggle --user

# Verify installation
!kaggle --version

Setting Up Credentials in HPCC Jupyter

# Option 1: Upload kaggle.json via Jupyter file browser
# Place in ~/.kaggle/kaggle.json

# Option 2: Set environment variables in notebook
import os
os.environ['KAGGLE_USERNAME'] = 'your-username'
os.environ['KAGGLE_KEY'] = 'your-api-key'

# Option 3: Create kaggle.json programmatically (be careful with security)
import json
kaggle_creds = {
    "username": "your-username",
    "key": "your-api-key"
}
# Resolve ~/.kaggle for the current user instead of hard-coding /home/username
kaggle_dir = os.path.expanduser('~/.kaggle')
os.makedirs(kaggle_dir, exist_ok=True)
cred_path = os.path.join(kaggle_dir, 'kaggle.json')
with open(cred_path, 'w') as f:
    json.dump(kaggle_creds, f)
os.chmod(cred_path, 0o600)

Using Kaggle API in Jupyter Notebooks

Downloading Competition Data

import kaggle

# Download competition dataset
!kaggle competitions download -c competition-name

# Extract files
import zipfile
with zipfile.ZipFile('competition-name.zip', 'r') as zip_ref:
    zip_ref.extractall('./data')

Downloading Public Datasets

# Download dataset
!kaggle datasets download -d username/dataset-name

# Extract
import zipfile
with zipfile.ZipFile('dataset-name.zip', 'r') as zip_ref:
    zip_ref.extractall('./data')
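
Both download snippets end with the same extraction step, so it can be factored into one helper. This is a sketch only: `extract_archive` is a hypothetical name, and the integrity check via `testzip()` is an addition that the original snippets do not perform.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest="./data"):
    """Extract zip_path into dest and return the sorted member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path, "r") as archive:
        # testzip() returns the first corrupt member name, or None if all are fine
        bad = archive.testzip()
        if bad is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad}")
        archive.extractall(dest)
        return sorted(archive.namelist())
```

Returning the member names makes it easy to confirm that the expected files arrived before loading them.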

Submitting to Competitions

# Submit predictions
!kaggle competitions submit -c competition-name -f predictions.csv -m "Submission message"

# Check leaderboard
!kaggle competitions leaderboard competition-name --show
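
Outside a notebook, the same submit command can be driven from a Python script. The sketch below assumes the `kaggle` CLI is installed and authenticated; `build_submit_command` and `submit` are hypothetical helpers that only assemble and run the command shown above, with a file-existence check added before anything is sent.

```python
import subprocess
from pathlib import Path


def build_submit_command(competition, file_path, message):
    """Return the argument list for `kaggle competitions submit`."""
    if not Path(file_path).is_file():
        raise FileNotFoundError(f"submission file not found: {file_path}")
    return [
        "kaggle", "competitions", "submit",
        "-c", competition,
        "-f", str(file_path),
        "-m", message,
    ]


def submit(competition, file_path, message):
    """Run the submit command and return the CLI's standard output."""
    cmd = build_submit_command(competition, file_path, message)
    # check=True raises CalledProcessError if the CLI exits with an error
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the submission message contains spaces or quotes.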

Competition Workflow

Complete Competition Submission Process

Find Competition: Browse Kaggle Competitions

Download Dataset:

kaggle competitions download -c competition-name

Create Kaggle Notebook:
- Via web interface at kaggle.com
- Or use Kaggle API to create programmatically
Write Code:
- Develop model/algorithm in notebook
- Test locally or on HPCC

Submit Predictions:

kaggle competitions submit -c competition-name \
  -f predictions.csv \
  -m "Model description"

Check Score:

kaggle competitions leaderboard competition-name --show

HPCC-Specific Considerations

Module Loading

HPCC uses environment modules. Load required modules:

# In an HPCC terminal; names and versions are examples (run `module avail` to list what is installed)
module load python/3.9  # Example version
module load cuda/11.8   # If using GPU

Data Storage

Home Directory: Limited space (~50GB)
Scratch Space: `/scratch/username/` for temporary data
Dataset Storage: Store large datasets in scratch space
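
A notebook can pick its data directory based on the guidance above instead of hard-coding a path. The following is a rough sketch: `pick_data_root` and `free_gib` are hypothetical helpers, and the `/scratch/<username>/` layout is taken from the list above and should be confirmed against the HPCC documentation.

```python
import shutil
from pathlib import Path


def pick_data_root(username, scratch_base="/scratch", home=None):
    """Prefer <scratch_base>/<username> when it exists, else fall back to home."""
    scratch = Path(scratch_base) / username
    if scratch.is_dir():
        return scratch
    return Path(home) if home else Path.home()


def free_gib(path):
    """Return the free space on the filesystem holding path, in GiB."""
    return shutil.disk_usage(path).free / 2**30
```

Checking `free_gib(pick_data_root("username"))` before a large download gives an early warning when the target filesystem is nearly full.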

Resource Allocation

HPCC provides:

CPU nodes: For general computing
GPU nodes: For deep learning workloads
Memory: Varies by allocation
Job Scheduling: May use SLURM for job submission

Troubleshooting

Common Issues

1. Authentication Errors

Symptom: `401 - Unauthorized` or `403 - Forbidden`

Solutions:

Verify `kaggle.json` is in `~/.kaggle/`
Check file permissions: `chmod 600 ~/.kaggle/kaggle.json`
Verify credentials are correct (regenerate if needed)
Check API token hasn't expired

2. VPN Connection Issues (HPCC)

Symptom: Cannot access HPCC resources

Solutions:

Verify VPN connection is active
Check VPN credentials
Ensure VPN client is properly configured
Try reconnecting to VPN

3. HPCC Access Denied

Symptom: Cannot access Jupyter portal

Solutions:

Verify VPN is connected
Check HPCC account is active
Verify Jupyter access is enabled for your account
Contact HPCC support if issues persist

4. Kaggle API Not Found in Jupyter

Symptom: `kaggle: command not found` in Jupyter

Solutions:

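# Install with --user flag
!pip install kaggle --user

# Add the user-level bin directory to PATH so the shell can find the kaggle command
# (sys.path only affects Python imports, not executable lookup)
import os
os.environ['PATH'] = os.path.expanduser('~/.local/bin') + os.pathsep + os.environ['PATH']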

5. Permission Denied Errors

Symptom: Permission errors when accessing files

Solutions:

# Fix kaggle.json permissions
chmod 600 ~/.kaggle/kaggle.json

# Fix directory permissions
chmod 700 ~/.kaggle
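
Before changing anything, the current modes can be inspected from Python. This is a minimal sketch assuming a POSIX filesystem; `permission_problems` is a hypothetical helper that flags any group or other permission bits on the credential file and its folder, matching the `600` and `700` targets above.

```python
import os
import stat


def permission_problems(cred_path):
    """Return a list of permission problems for kaggle.json and its folder."""
    problems = []
    file_mode = stat.S_IMODE(os.stat(cred_path).st_mode)
    if file_mode & 0o077:  # any group/other bit is set
        problems.append(f"{cred_path}: mode {file_mode:o}, expected 600")
    folder = os.path.dirname(os.path.abspath(cred_path))
    dir_mode = stat.S_IMODE(os.stat(folder).st_mode)
    if dir_mode & 0o077:
        problems.append(f"{folder}: mode {dir_mode:o}, expected 700")
    return problems
```

An empty list means both the file and its directory are readable by the owner only.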

Best Practices

Security

Never commit `kaggle.json` to version control
Use environment variables when possible
Set proper file permissions: `chmod 600` for credentials
Regenerate tokens if compromised
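
One way to guard against accidental commits is to make sure the project's `.gitignore` lists the credential file. The sketch below is an illustration only: `ensure_gitignore_entry` is a hypothetical helper, and it performs a plain line match rather than interpreting gitignore patterns.

```python
from pathlib import Path


def ensure_gitignore_entry(repo_dir, entry="kaggle.json"):
    """Add entry to repo_dir/.gitignore if absent; return True if it was added."""
    gitignore = Path(repo_dir) / ".gitignore"
    existing = gitignore.read_text().splitlines() if gitignore.exists() else []
    if entry in (line.strip() for line in existing):
        return False
    existing.append(entry)
    gitignore.write_text("\n".join(existing) + "\n")
    return True
```

Running it twice is safe: the second call finds the entry and leaves the file unchanged.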

HPCC Usage

Always verify VPN before HPCC operations
Use scratch space for large datasets
Clean up temporary files after jobs
Respect resource limits and quotas
Monitor job status and resource usage

Dataset Management

Include clear README with dataset metadata
Use version control for dataset changes
Document data sources and preprocessing
Set appropriate licenses

Reference Links

Kaggle API Documentation: https://github.com/Kaggle/kaggle-api
HPCC Jupyter Guide: https://www.depts.ttu.edu/hpcc/userguides/general_guides/jupyter-notebooks.php
Kaggle Tutorials: https://github.com/Kaggle/kaggle-api/blob/main/docs/tutorials.md
Kaggle Competitions: https://www.kaggle.com/competitions

Workflow Checklist

When helping users with Kaggle API and HPCC:

VPN Check: Ask about VPN connection for HPCC operations
Credentials Setup: Guide through `kaggle.json` configuration
Installation: Verify Kaggle API installation
Authentication Test: Run `kaggle competitions list` to verify
HPCC Access: Verify HPCC Jupyter access (if applicable)
Module Loading: Ensure required modules loaded (HPCC)
Data Storage: Guide on data location (scratch vs. home)
Security: Remind about credential security

Important: Always check VPN connection first when working with HPCC resources!