Awesome-omni-skill juicefs-skill
Work with JuiceFS, a high-performance POSIX file system for cloud-native environments. Use when dealing with distributed file systems, object storage backends (S3, Azure, GCS), metadata engines (Redis, MySQL, TiKV), or when users mention JuiceFS, cloud storage, big data, or ML training storage.
git clone https://github.com/diegosouzapw/awesome-omni-skill
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data-ai/juicefs-skill" ~/.claude/skills/diegosouzapw-awesome-omni-skill-juicefs-skill && rm -rf "$T"
JuiceFS Skill
Prerequisites
JuiceFS Client Installation
The initialization script can install JuiceFS automatically if needed.
Standard Installation (Recommended)
curl -sSL https://d.juicefs.com/install | sh -
This installs to /usr/local/bin/juicefs (accessible system-wide).
Manual Installation
    wget https://github.com/juicedata/juicefs/releases/latest/download/juicefs-linux-amd64.tar.gz
    tar -zxf juicefs-linux-amd64.tar.gz
    sudo install juicefs /usr/local/bin/
Verify Installation
juicefs version
Using the Initialization Script
The initialization script will:
- Check if JuiceFS is in your PATH
- Offer to install it automatically if not found
- Guide you through the process
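The PATH check described above can be sketched in a few lines of shell. This is a minimal illustration, not the actual contents of the initialization script; the function name check_tool is hypothetical.

```shell
#!/bin/sh
# Report whether a command is available in PATH.
# Usage: check_tool [name]   (name defaults to juicefs)
# check_tool is a hypothetical helper, not part of JuiceFS or the init script.
check_tool() {
    ct_name="${1:-juicefs}"
    if ct_path=$(command -v "$ct_name" 2>/dev/null); then
        echo "found: $ct_path"
    else
        echo "not found: $ct_name"
        return 1
    fi
}
```

Calling check_tool with no argument looks for juicefs; a non-zero exit status signals that installation is still needed.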
Overview
JuiceFS is a high-performance POSIX file system designed for cloud-native environments. It separates data and metadata storage:
- Data: Stored in object storage (S3, GCS, Azure Blob, local disk, etc.)
- Metadata: Stored in databases (Redis, MySQL, PostgreSQL, TiKV, SQLite, etcd)
- Client: Mounts the file system and coordinates data/metadata
When to Use This Skill
Use this skill when:
- Setting up or managing JuiceFS file systems
- Integrating JuiceFS with Kubernetes, Hadoop, or Docker
- Optimizing JuiceFS performance for specific workloads
- Troubleshooting JuiceFS issues
- Migrating data to/from JuiceFS
- Configuring JuiceFS for big data, ML training, or shared storage
Core Concepts
Architecture
    ┌─────────────┐
    │   JuiceFS   │
    │   Client    │
    └──────┬──────┘
           │
        ┌──┴───────────┐
        │              │
    ┌───▼────┐     ┌───▼────────┐
    │Metadata│     │Object      │
    │Engine  │     │Storage     │
    │(Redis) │     │(S3, etc.)  │
    └────────┘     └────────────┘
Data Organization
- Files → Chunks (max 64 MiB) → Slices (variable) → Blocks (4 MiB) → Object Storage
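To make the layout concrete, the arithmetic can be sketched as below. It assumes one sequential write, so that each chunk holds a single slice, and it counts a partial trailing block as one block. The function name layout_for_size is hypothetical.

```shell
#!/bin/sh
# Estimate chunk and block counts for a file of SIZE_MIB mebibytes.
# Assumption: one sequential write, so each chunk holds a single slice.
layout_for_size() {
    lfs_size_mib=$1
    lfs_chunk_mib=64   # maximum chunk size stated above
    lfs_block_mib=4    # block size stated above
    # Ceiling division: round up to whole chunks and whole blocks.
    lfs_chunks=$(( (lfs_size_mib + lfs_chunk_mib - 1) / lfs_chunk_mib ))
    lfs_blocks=$(( (lfs_size_mib + lfs_block_mib - 1) / lfs_block_mib ))
    echo "chunks=$lfs_chunks blocks=$lfs_blocks"
}

layout_for_size 200   # chunks=4 blocks=50
```

A 200 MiB file therefore spans 4 chunks and is stored as 50 objects under these assumptions; random or overlapping writes create more slices and can change the block count.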
Metadata Engines
- Redis: Best for production, fast, supports Sentinel/Cluster for HA
- MySQL/PostgreSQL: Good for production with existing infrastructure
- TiKV: Excellent for large-scale distributed deployments
- SQLite: Simple, single-node, good for testing/development
- etcd: Small to medium scale
🔒 Security: Protecting Sensitive Credentials
IMPORTANT FOR AI AGENTS: When working with JuiceFS in AI agent environments, credentials (AK/SK, passwords) should NOT be exposed to the AI model to prevent data leakage.
SKILL Responsibility Boundary
What This SKILL Provides:
- Security guidance for AI agents working with JuiceFS
- Method to prevent AI agents from accessing sensitive credentials
- Secure initialization process with binary compilation
- Clear separation between admin setup (root) and agent usage (non-root)
What This SKILL Does NOT Handle:
- How AI agents are deployed or run
- Host system security configuration
- Network security setup
- General system administration
Design Philosophy: This SKILL assumes the AI agent runs as a non-root user and provides maximum isolation between the agent and sensitive information. Security recommendations under root/admin mode are ineffective as root has unrestricted access.
When Credential Protection is Required
Use the secure initialization approach when using:
- ✅ Object storage with access keys (S3, OSS, Azure Blob, GCS, etc.)
- ✅ Databases with passwords (Redis, MySQL, PostgreSQL with auth)
- ✅ Any configuration containing sensitive information
NOT required for:
- ❌ Local storage (--storage file) + SQLite3 (no password)
- ❌ Unauthenticated metadata engines
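The rule above can be approximated with a small heuristic. This is a sketch only: the function name needs_secure_init is hypothetical, and the check does not know about setups such as IAM roles, where object storage is used without static keys.

```shell
#!/bin/sh
# Heuristic: succeed (exit 0) when a setup likely involves credentials.
# Arguments: STORAGE type and META_URL, as they would be passed to juicefs format.
# needs_secure_init is a hypothetical helper for illustration.
needs_secure_init() {
    nsi_storage=$1
    nsi_meta=$2
    # Any object storage other than local "file" usually needs access keys.
    if [ "$nsi_storage" != "file" ]; then
        return 0
    fi
    # A metadata URL with a "user:password@" or ":password@" part carries a secret.
    case "$nsi_meta" in
        *://*:*@*) return 0 ;;
    esac
    return 1
}
```

For example, needs_secure_init file sqlite3:///tmp/jfs.db fails (no protection needed), while needs_secure_init s3 redis://localhost:6379/1 succeeds because S3 normally requires access keys.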
Secure Initialization Process
Do not run juicefs format and juicefs mount directly when doing so would expose credentials; use the initialization script instead.
IMPORTANT: The initialization script MUST be run with root/administrator privileges (sudo)
Why root is required:
- To install shc (Shell Script Compiler) if not present
- To compile scripts into secure binaries
- To set proper ownership and permissions
- To ensure AI agent user cannot access credentials
Run the initialization script:
    # MUST run as root/admin
    sudo ./scripts/juicefs-init.sh
    # Script will prompt for AI agent username
Re-running the script: The script is designed to be re-runnable and will:
- Detect and prompt before overwriting existing binary
- Check if filesystem already exists (skip formatting if so)
- Allow you to update configuration without reformatting
This interactive script will:
- Prompt for AI agent username
- Prompt for all sensitive configuration (AK/SK, passwords, URLs)
- Install shc (Shell Script Compiler) if not present
- Format the filesystem if needed
- Generate wrapper script with embedded credentials
- Compile wrapper into binary using shc
- Name binary after filesystem for easy identification
- Verify binary functionality
- Clean up intermediate files (wrapper script, C source)
- Set proper permissions and ownership (root:AI_AGENT_USER group, 750)
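Mode 750 with root as owner and the agent's group as group means the owner has full access, the group can read and execute, and everyone else has no access. The sketch below demonstrates the resulting permission string on a throwaway file; perm_string is a hypothetical helper, and the chown step is shown only as a comment because it needs root.

```shell
#!/bin/sh
# Print the permission string of a path (for example -rwxr-x--- for mode 750).
# perm_string is a hypothetical helper for illustration.
perm_string() {
    ls -ld "$1" | cut -c1-10
}

# Demonstration on a throwaway file; the real binary is created by the init script.
demo_file=$(mktemp)
chmod 750 "$demo_file"
# As root one would also run: chown root:aiagent "$demo_file"
perm_string "$demo_file"    # -rwxr-x---
rm -f "$demo_file"
```

Running perm_string on the generated binary in juicefs-scripts/ is a quick way to confirm that the permissions match what the script is expected to set.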
Generated binary (in juicefs-scripts/ directory):
- <filesystem-name>: Compiled binary wrapper (e.g., prod-data)
The binary:
- Contains embedded credentials (compiled into binary format, obfuscated)
- Accepts any JuiceFS command and parameters
- Named after filesystem for easy identification and management
- One filesystem = one binary program
- Owned by root, executable by AI agent user
AI Agent usage:
    # Switch to AI agent user first
    su - aiagent

    # Show available commands
    ./juicefs-scripts/myfs

    # Mount filesystem
    ./juicefs-scripts/myfs mount /mnt/jfs

    # Mount with custom options
    ./juicefs-scripts/myfs mount --cache-size 204800 /mnt/jfs

    # Check filesystem status
    ./juicefs-scripts/myfs status

    # Show statistics
    ./juicefs-scripts/myfs stats /mnt/jfs

    # Unmount filesystem
    ./juicefs-scripts/myfs umount /mnt/jfs

    # Run benchmark
    ./juicefs-scripts/myfs bench /mnt/jfs
Example: Secure Setup Flow
Step 1: Admin initializes (one-time setup as root):
    cd /path/to/juicefs-skill
    sudo ./scripts/juicefs-init.sh

    # Enter AI agent username: aiagent
    # Follow prompts to enter:
    # - Filesystem name: prod-data
    # - Redis: localhost:6379 with password
    # - S3 bucket + AWS credentials
    # - Compression option (for format)

    # Binary compiled: juicefs-scripts/prod-data
    # Owned by root, executable by aiagent
Step 2: AI agent user uses the binary:
    # Switch to AI agent user
    su - aiagent

    # Mount filesystem (specify mount point at runtime)
    ./juicefs-scripts/prod-data mount /mnt/jfs

    # Check status
    ./juicefs-scripts/prod-data status

    # Work with filesystem
    ls /mnt/jfs
    cp data.csv /mnt/jfs/

    # Unmount when done
    ./juicefs-scripts/prod-data umount /mnt/jfs
For AI Agents: Detection and Guidance
When a user asks to format or mount JuiceFS:
Check if using sensitive resources:
# User mentions: S3, OSS, Azure, GCS, Redis password, MySQL password, etc.
Provide guidance:
    ⚠️ SECURITY WARNING: This operation requires sensitive credentials.

    To prevent credential exposure to the AI model:

    1. Please run the initialization script yourself (MUST use sudo):
       sudo ./scripts/juicefs-init.sh
       # Script will prompt for AI agent username

    2. Follow the prompts to configure your filesystem
       - The script will install shc if needed
       - Script will compile credentials into a binary using shc
       - Binary will be named after your filesystem
       - Binary owned by root, executable by AI agent

    3. Once complete, I can help you use the generated binary:
       - Show commands: ./juicefs-scripts/<name>
       - Mount: ./juicefs-scripts/<name> mount <mountpoint>
       - Status: ./juicefs-scripts/<name> status
       - Unmount: ./juicefs-scripts/<name> umount <mountpoint>

    This keeps your AK/SK and passwords secure from the AI model.
    The binary contains compiled credentials that cannot be read with simple commands.

    Note: Root privileges are required for shc installation, binary compilation,
    and setting proper ownership/permissions.
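The detection step can be sketched as a keyword match on the request text. The keyword list mirrors the examples given for this step and is not exhaustive; short keywords such as "oss" can also match unrelated words, so treat a hit as a prompt to ask the user rather than as proof. The function name mentions_sensitive is hypothetical.

```shell
#!/bin/sh
# Succeed when the request text mentions a resource that usually needs credentials.
# mentions_sensitive is a hypothetical helper; the keyword list is illustrative only.
mentions_sensitive() {
    ms_text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$ms_text" in
        *s3*|*oss*|*azure*|*gcs*|*password*|*"access key"*) return 0 ;;
    esac
    return 1
}
```

If mentions_sensitive succeeds for a format or mount request, the agent shows the warning and asks the user to run the initialization script instead of collecting credentials itself.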
Insecure Setup (Local Development Only)
For local development without sensitive data:
    # This is safe for AI agents - no credentials involved
    juicefs format \
        --storage file \
        --bucket /tmp/jfs-storage \
        sqlite3:///tmp/jfs.db \
        dev-fs

    juicefs mount sqlite3:///tmp/jfs.db /mnt/jfs-dev
Essential Commands
1. Format a File System
Create a new JuiceFS file system:
    # Basic format with Redis and S3
    juicefs format \
        --storage s3 \
        --bucket https://mybucket.s3.amazonaws.com \
        redis://localhost:6379/1 \
        my-juicefs

    # With compression
    juicefs format \
        --storage s3 \
        --bucket https://mybucket.s3.amazonaws.com \
        --compress lz4 \
        redis://localhost:6379/1 \
        my-juicefs

    # Local development with SQLite
    juicefs format \
        --storage file \
        --bucket /data/storage \
        sqlite3://myjfs.db \
        dev-fs
2. Mount a File System
    # Basic mount
    juicefs mount redis://localhost:6379/1 /mnt/jfs

    # Production mount with cache optimization
    juicefs mount \
        --cache-dir /ssd/cache \
        --cache-size 204800 \
        --writeback \
        -d \
        redis://localhost:6379/1 \
        /mnt/jfs

    # Mount with prefetch for read-heavy workloads
    juicefs mount \
        --cache-dir /nvme/cache \
        --cache-size 409600 \
        --prefetch 3 \
        redis://localhost:6379/1 \
        /mnt/jfs
Key Mount Options:
- --cache-dir: Cache directory (default: ~/.juicefs/cache)
- --cache-size: Cache size in MiB (default: 102400 = 100 GiB)
- --writeback: Enable write-back cache for better write performance
- --prefetch N: Prefetch N blocks in parallel when reading
- --buffer-size: Total read/write buffer size in MiB (default: 300)
- -d: Run in background (daemon mode)
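Because --cache-size is given in MiB, sizes planned in GiB have to be multiplied by 1024. A small helper (hypothetical name gib_to_mib) makes the conversion explicit:

```shell
#!/bin/sh
# Convert a size in GiB to the MiB value expected by --cache-size.
# gib_to_mib is a hypothetical helper for illustration.
gib_to_mib() {
    echo $(( $1 * 1024 ))
}

gib_to_mib 100   # 102400
gib_to_mib 200   # 204800
```

The result can be passed inline, for example --cache-size "$(gib_to_mib 200)".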
3. Unmount
    # Graceful unmount
    juicefs umount /mnt/jfs

    # Force unmount
    juicefs umount -f /mnt/jfs
4. Sync Data
    # jfs:// takes the name of an environment variable that holds the metadata URL
    export SRC_META=redis://localhost:6379/1
    export DST_META=redis://localhost:6379/2

    # Sync local to JuiceFS
    juicefs sync /local/path/ jfs://SRC_META/remote/path/

    # Sync between JuiceFS file systems
    juicefs sync jfs://SRC_META/src/ jfs://DST_META/dst/

    # Sync from S3 to JuiceFS
    juicefs sync s3://bucket/path/ /mnt/jfs/path/

    # Dry run
    juicefs sync --dry /source/ /dest/
5. Status and Monitoring
    # Show file system status
    juicefs status redis://localhost:6379/1

    # Real-time statistics
    juicefs stats /mnt/jfs

    # Profile operations
    juicefs profile /mnt/jfs

    # Benchmark
    juicefs bench /mnt/jfs
6. Configuration
    # View configuration
    juicefs config redis://localhost:6379/1

    # Set trash retention
    juicefs config redis://localhost:6379/1 --trash-days 7

    # Set capacity quota
    juicefs config redis://localhost:6379/1 --capacity 1048576
7. Maintenance
    # Garbage collection check (reports leaked objects, deletes nothing)
    juicefs gc redis://localhost:6379/1

    # Actual garbage collection (deletes leaked objects)
    juicefs gc redis://localhost:6379/1 --delete

    # Dump metadata for backup
    juicefs dump redis://localhost:6379/1 backup.json

    # Load metadata from backup
    juicefs load redis://localhost:6379/1 backup.json
8. S3 Gateway
    # Start S3-compatible gateway
    export MINIO_ROOT_USER=admin
    export MINIO_ROOT_PASSWORD=12345678
    juicefs gateway redis://localhost:6379/1 localhost:9000
Configuration by Workload
Big Data Processing (Hadoop/Spark)
    juicefs mount \
        --cache-dir /ssd/cache \
        --cache-size 204800 \
        --writeback \
        redis://redis:6379/1 \
        /mnt/jfs
Machine Learning Training
    juicefs mount \
        --cache-dir /nvme/cache \
        --cache-size 409600 \
        --prefetch 3 \
        --buffer-size 600 \
        redis://redis:6379/1 \
        /mnt/ml-data
Shared Development Environment
    juicefs mount \
        --cache-size 102400 \
        redis://redis:6379/1 \
        /mnt/shared
Backup/Archive (Write-heavy)
    juicefs mount \
        --writeback \
        --buffer-size 600 \
        redis://redis:6379/1 \
        /mnt/backup
Kubernetes Integration
Basic PersistentVolume
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: juicefs-pv
    spec:
      capacity:
        storage: 10Pi
      volumeMode: Filesystem
      accessModes:
        - ReadWriteMany
      persistentVolumeReclaimPolicy: Retain
      csi:
        driver: csi.juicefs.com
        volumeHandle: juicefs-volume
        fsType: juicefs
        nodePublishSecretRef:
          name: juicefs-secret
          namespace: default
Troubleshooting
Mount Fails
- Check metadata engine:

      # For Redis
      redis-cli -h localhost -p 6379 ping

- Check credentials: Verify access keys for object storage

- Check logs:

      tail -f /var/log/juicefs.log
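When the metadata URL is only available as a string, the host and port for the connectivity check can be extracted with shell parameter expansion. This sketch handles only the simple redis://host:port/db form, with no credentials and no Sentinel or Cluster address list; the function name redis_host_port is hypothetical.

```shell
#!/bin/sh
# Extract "host port" from a simple redis://host:port/db URL.
# Assumption: no credentials and no Sentinel/Cluster address list in the URL.
# redis_host_port is a hypothetical helper for illustration.
redis_host_port() {
    rhp_rest=${1#*://}          # drop the scheme
    rhp_addr=${rhp_rest%%/*}    # drop the /db suffix
    rhp_host=${rhp_addr%%:*}
    rhp_port=${rhp_addr##*:}
    # Fall back to the default Redis port when the URL has none.
    if [ "$rhp_port" = "$rhp_addr" ]; then
        rhp_port=6379
    fi
    echo "$rhp_host $rhp_port"
}

redis_host_port redis://localhost:6379/1   # localhost 6379

# Possible use with redis-cli:
#   set -- $(redis_host_port redis://localhost:6379/1)
#   redis-cli -h "$1" -p "$2" ping
```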
Slow Performance
- Check cache hit rate:

      juicefs stats /mnt/jfs

- Increase cache:

      juicefs umount /mnt/jfs
      juicefs mount --cache-size 204800 redis://localhost:6379/1 /mnt/jfs

- Enable prefetch for sequential reads:

      juicefs mount --prefetch 3 redis://localhost:6379/1 /mnt/jfs
No Space Left on Device
- Clean cache:

      rm -rf ~/.juicefs/cache/*

- Increase free-space-ratio:

      juicefs mount --free-space-ratio 0.2 redis://localhost:6379/1 /mnt/jfs
Common Patterns
Production Setup with HA
    # Format with Redis Sentinel
    # URL form: redis://MASTER_NAME,SENTINEL_ADDR[,SENTINEL_ADDR]:SENTINEL_PORT/DB
    juicefs format \
        --storage s3 \
        --bucket https://prod-bucket.s3.amazonaws.com \
        redis://mymaster,sentinel1,sentinel2,sentinel3:26379/1 \
        prod-fs

    # Mount with optimized settings
    juicefs mount \
        --cache-dir /ssd/cache \
        --cache-size 204800 \
        --writeback \
        -d \
        redis://mymaster,sentinel1,sentinel2,sentinel3:26379/1 \
        /mnt/jfs
Development Setup
    # Format with SQLite (local)
    juicefs format \
        --storage file \
        --bucket /tmp/jfs-storage \
        sqlite3:///tmp/jfs.db \
        dev-fs

    # Mount
    juicefs mount sqlite3:///tmp/jfs.db /mnt/jfs-dev
Data Migration
    # Step 1: Mount source and destination
    juicefs mount -d redis://source:6379/1 /mnt/source
    juicefs mount -d redis://dest:6379/1 /mnt/dest

    # Step 2: Sync data
    juicefs sync /mnt/source/ /mnt/dest/

    # Or sync without mounting (jfs:// takes the name of an environment
    # variable that holds the metadata URL)
    export SRC_META=redis://source:6379/1
    export DST_META=redis://dest:6379/1
    juicefs sync jfs://SRC_META/ jfs://DST_META/
Performance Tuning Quick Guide
| Workload | Cache Size | Cache Dir | Extra Options |
|---|---|---|---|
| Read-heavy | 200-400GB | SSD/NVMe | |
| Write-heavy | 100-200GB | SSD | |
| ML Training | 400GB+ | NVMe | |
| Mixed | 100-200GB | SSD | Default |
| Small files | 100GB | SSD | |
Security Best Practices
- 🔒 Protect credentials in AI agent environments:
  - Use ./scripts/juicefs-init.sh to create compiled binary with embedded credentials
  - The script uses shc (Shell Script Compiler) to protect sensitive information
  - Binary is named after filesystem for easy management
  - Credentials are compiled into binary format (obfuscated by shc)
  - This prevents AI models from easily accessing AK/SK, passwords, and sensitive URLs
  - See the "Security: Protecting Sensitive Credentials" section above for details
- Enable encryption: juicefs format --encrypt-secret redis://localhost:6379/1 secure-fs
- Use TLS for metadata engine: Connect via rediss:// instead of redis://
- Use HTTPS for object storage: Always use HTTPS endpoints
- IAM roles: Use IAM roles instead of static access keys when possible
- Network isolation: Use VPC/private networks for production
Advanced Security Recommendations
For production environments requiring maximum security:
1. Secret Management Services:
- AWS Secrets Manager / Parameter Store
- HashiCorp Vault
- Azure Key Vault
- Benefits: Centralized rotation, auditing, time-limited access
2. IAM-Based Authentication:
- AWS: Use IAM roles with EC2 instance profiles
- Azure: Use Managed Identity
- GCP: Use Workload Identity
- Benefits: No static credentials, automatic rotation
3. Certificate-Based Authentication:
- Use TLS client certificates for Redis/databases
- Benefits: No passwords to protect, automatic validation
4. Configuration File Encryption:
- age (modern encryption tool)
- SOPS (Secrets OPerationS)
- Benefits: Version-controllable configs, separate key management
See scripts/SECURITY_MODEL.md for detailed implementation guidance.
Environment Variables
The initialization script does NOT export sensitive environment variables. Instead, credentials are compiled into secure binaries.
For reference, JuiceFS supports these environment variables:
    # Custom cache (✓ Safe - no credentials)
    export JUICEFS_CACHE_DIR=/ssd/cache

    # Debug logging (✓ Safe - no credentials)
    export JUICEFS_LOGLEVEL=debug

    # AWS credentials (⚠️ NOT RECOMMENDED - exposes to AI agent)
    # export AWS_ACCESS_KEY_ID=your-key
    # export AWS_SECRET_ACCESS_KEY=your-secret

    # Redis password (⚠️ NOT RECOMMENDED - exposes to AI agent)
    # export REDIS_PASSWORD=your-password
Recommended approach: Use the initialization script which compiles credentials into binaries rather than using environment variables.
Quick Decision Trees
Choosing a Metadata Engine
- Redis: Fast, production-ready, supports HA (Sentinel/Cluster)
- MySQL/PostgreSQL: Already have infrastructure, need SQL features
- TiKV: Large scale, need horizontal scalability
- SQLite: Development, testing, single node
- etcd: Small to medium scale, already using etcd
Choosing Cache Size
- Working set < 100GB: 100GB cache (102400 MiB)
- Working set 100-500GB: 200-400GB cache
- Working set > 500GB: 400GB+ cache
- Rule of thumb: 10-20% of working set size
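The working-set tiers above can be written as a small lookup. This sketch returns the lower bound of each tier in MiB and treats exactly 500 GB as part of the middle tier; both choices are assumptions, and the function name recommend_cache_mib is hypothetical.

```shell
#!/bin/sh
# Map a working-set size in GB to a starting --cache-size value in MiB.
# Assumption: each tier returns its lower bound; tune upward as needed.
# recommend_cache_mib is a hypothetical helper for illustration.
recommend_cache_mib() {
    rcm_ws_gb=$1
    if [ "$rcm_ws_gb" -lt 100 ]; then
        echo 102400     # 100 GB tier
    elif [ "$rcm_ws_gb" -le 500 ]; then
        echo 204800     # lower end of the 200-400 GB tier
    else
        echo 409600     # 400 GB and up
    fi
}

recommend_cache_mib 50    # 102400
recommend_cache_mib 800   # 409600
```

The value is only a starting point; the cache hit rate reported by juicefs stats is the better guide once the workload is running.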
References
For detailed information, see the references:
- Comprehensive Reference - Complete JuiceFS documentation
- Quick Start Guide - Task patterns and troubleshooting flowcharts
- Table of Contents - Index of all topics
Resources
- Official Documentation: https://juicefs.com/docs/community/introduction
- GitHub Repository: https://github.com/juicedata/juicefs
- Quick Start: https://juicefs.com/docs/community/quick_start_guide
- Command Reference: https://juicefs.com/docs/community/command_reference
- Community: https://github.com/juicedata/juicefs/discussions
Installation
    # Linux AMD64
    curl -sSL https://d.juicefs.com/install | sh -

    # macOS (Homebrew)
    brew install juicefs

    # Docker
    docker pull juicedata/juicefs
Tips for AI Agents
- Always check metadata engine connectivity first
- Cache is critical - allocate sufficient space on fast storage
- Use --writeback for write-heavy, --prefetch for read-heavy workloads
- Monitor with juicefs stats regularly
- Test with juicefs bench before production
- Plan for metadata engine HA in production
- Use compression (--compress lz4) to reduce costs
- Enable trash (--trash-days 7) for safety
- Run juicefs gc regularly
- Keep JuiceFS client updated