Awesome-omni-skill dataflow

Kailash DataFlow - zero-config database framework with automatic model-to-node generation. Use when asking about 'database operations', 'DataFlow', 'database models', 'CRUD operations', 'bulk operations', 'database queries', 'database migrations', 'multi-tenancy', 'multi-instance', 'database transactions', 'PostgreSQL', 'MySQL', 'SQLite', 'MongoDB', 'pgvector', 'vector search', 'document database', 'RAG', 'semantic search', 'existing database', 'database performance', 'database deployment', 'database testing', or 'TDD with databases'. DataFlow is NOT an ORM - it generates 11 workflow nodes per SQL model, 8 nodes for MongoDB, and 3 nodes for vector operations.

install
source · Clone the upstream repo
git clone https://github.com/diegosouzapw/awesome-omni-skill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/development/dataflow-integrum-global" ~/.claude/skills/diegosouzapw-awesome-omni-skill-dataflow-edfbef && rm -rf "$T"
manifest: skills/development/dataflow-integrum-global/SKILL.md
source content

Kailash DataFlow - Zero-Config Database Framework

DataFlow is a zero-config database framework built on Kailash Core SDK that automatically generates workflow nodes from database models.

Overview

  • Automatic Node Generation: 11 nodes per model (@db.model decorator)
  • Multi-Database Support: PostgreSQL, MySQL, SQLite (SQL) + MongoDB (Document) + pgvector (Vector Search)
  • Enterprise Features: Multi-tenancy, multi-instance isolation, transactions
  • Zero Configuration: String IDs preserved, deferred schema operations
  • Developer Experience: Enhanced errors (DF-XXX codes), strict mode validation, debug agent, CLI tools

Quick Start

DataFlow nodes follow the canonical 4-parameter pattern from /01-core-sdk.

from dataflow import DataFlow
from kailash.workflow.builder import WorkflowBuilder
from kailash.runtime.local import LocalRuntime

# Initialize DataFlow
db = DataFlow(connection_string="postgresql://user:pass@localhost/db")

# Define model (generates 11 nodes automatically)
@db.model
class User:
    id: str  # String IDs preserved
    name: str
    email: str

# Use generated nodes in workflows
workflow = WorkflowBuilder()
workflow.add_node("User_Create", "create_user", {
    "data": {"name": "John", "email": "john@example.com"}
})

# Execute with context manager (recommended for resource cleanup)
with LocalRuntime() as runtime:
    results, run_id = runtime.execute(workflow.build())
    user_id = results["create_user"]["result"]  # Access pattern

Generated Nodes (11 per model)

Each @db.model class generates:

  1. {Model}_Create - Create single record
  2. {Model}_Read - Read by ID
  3. {Model}_Update - Update record
  4. {Model}_Delete - Delete record
  5. {Model}_List - List with filters
  6. {Model}_Upsert - Insert or update (atomic)
  7. {Model}_Count - Efficient COUNT(*) queries
  8. {Model}_BulkCreate - Bulk insert
  9. {Model}_BulkUpdate - Bulk update
  10. {Model}_BulkDelete - Bulk delete
  11. {Model}_BulkUpsert - Bulk upsert
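
For example, a few of these nodes can be combined in one workflow using the same add_node and result-access patterns as the Quick Start. This is a minimal sketch: the exact shapes of the data and filter parameters are assumptions, not confirmed DataFlow schemas.

from dataflow import DataFlow
from kailash.workflow.builder import WorkflowBuilder
from kailash.runtime.local import LocalRuntime

db = DataFlow(connection_string="postgresql://user:pass@localhost/db")

@db.model
class User:
    id: str
    name: str
    email: str

workflow = WorkflowBuilder()

# Bulk insert several records (list-of-dicts payload is an assumption)
workflow.add_node("User_BulkCreate", "seed_users", {
    "data": [
        {"id": "u1", "name": "Ada", "email": "ada@example.com"},
        {"id": "u2", "name": "Lin", "email": "lin@example.com"},
    ]
})

# List matching records, then count all rows (filter schema is an assumption)
workflow.add_node("User_List", "list_users", {"filter": {"name": "Ada"}})
workflow.add_node("User_Count", "count_users", {})

with LocalRuntime() as runtime:
    results, run_id = runtime.execute(workflow.build())
    matching = results["list_users"]["result"]
    total = results["count_users"]["result"]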

Critical Rules

  • ✅ String IDs preserved (no UUID conversion)
  • ✅ Deferred schema operations (safe for Docker/FastAPI)
  • ✅ Multi-instance isolation (one DataFlow per database)
  • ✅ Result access: results["node_id"]["result"]
  • ❌ NEVER use truthiness checks on filter/data parameters (an empty dict {} is falsy)
  • ❌ ALWAYS use key existence checks: if "filter" in kwargs instead of if kwargs.get("filter") (see the sketch after this list)
  • ❌ NEVER use direct SQL when DataFlow nodes exist
  • ❌ NEVER use SQLAlchemy/Django ORM alongside DataFlow
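
The truthiness rules above matter because an empty dict {} is a legitimate value ("no filter constraints") yet evaluates as falsy in Python, so kwargs.get("filter") silently drops it. A framework-independent illustration (the describe_filter function is hypothetical, purely to show the difference):

def describe_filter(**kwargs):
    # WRONG: {} is falsy, so an explicitly passed empty filter is treated as absent
    wrong = "filter provided" if kwargs.get("filter") else "no filter"
    # RIGHT: key existence distinguishes "not provided" from "provided but empty"
    right = "filter provided" if "filter" in kwargs else "no filter"
    return wrong, right

print(describe_filter(filter={}))  # ('no filter', 'filter provided')
print(describe_filter())           # ('no filter', 'no filter')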

Reference Documentation

Getting Started

Core Operations

Advanced Features

Developer Experience Tools

  • dataflow-strict-mode - Build-time validation (4-layer, OFF/WARN/STRICT)
  • dataflow-debug-agent - Intelligent error analysis (5-stage pipeline)
  • ErrorEnhancer - Automatic error enhancement (40+ DF-XXX codes)
  • Inspector API - Self-service debugging (18 introspection methods)
  • CLI Tools - dataflow-validate, dataflow-analyze, dataflow-debug (5 commands)

Troubleshooting

Database Support Matrix

Database     Type       Nodes/Model   Driver
PostgreSQL   SQL        11            asyncpg
MySQL        SQL        11            aiomysql
SQLite       SQL        11            aiosqlite
MongoDB      Document   8             Motor
pgvector     Vector     3             pgvector

Not an ORM: DataFlow generates workflow nodes, not ORM models. Uses string-based result access and integrates with Kailash's workflow execution model.
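
As a sketch of targeting different backends, each database gets its own DataFlow instance (per the multi-instance isolation rule above). The PostgreSQL URL mirrors the Quick Start; the other connection-string forms are assumptions based on the drivers in the table, not confirmed DataFlow syntax.

from dataflow import DataFlow

# One DataFlow instance per database (multi-instance isolation)
pg_db = DataFlow(connection_string="postgresql://user:pass@localhost/app")  # asyncpg
my_db = DataFlow(connection_string="mysql://user:pass@localhost/app")       # aiomysql (URL form assumed)
lite_db = DataFlow(connection_string="sqlite:///./app.db")                  # aiosqlite (URL form assumed)
doc_db = DataFlow(connection_string="mongodb://localhost:27017/app")        # Motor (URL form assumed)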

Integration Patterns

With Nexus (Multi-Channel)

from dataflow import DataFlow
from nexus import Nexus

db = DataFlow(connection_string="...")
@db.model
class User:
    id: str
    name: str

# Auto-generates API + CLI + MCP
nexus = Nexus(db.get_workflows())
nexus.run()  # Instant multi-channel platform

With Core SDK (Custom Workflows)

from dataflow import DataFlow
from kailash.workflow.builder import WorkflowBuilder

db = DataFlow(connection_string="...")
# Use db-generated nodes in custom workflows
workflow = WorkflowBuilder()
workflow.add_node("User_Create", "user1", {...})

When to Use This Skill

Use DataFlow when you need to:

  • Perform database operations in workflows
  • Generate CRUD APIs automatically (with Nexus)
  • Implement multi-tenant systems
  • Work with existing databases
  • Build database-first applications
  • Handle bulk data operations

Related Skills

Support

For DataFlow-specific questions, invoke:

  • dataflow-specialist - DataFlow implementation and patterns
  • testing-specialist - DataFlow testing strategies (NO MOCKING policy)
  • framework-advisor - Choose between Core SDK and DataFlow