Claude-skill-registry classdiagram-to-neo4j

Extract entities, properties, and relationships from UML class diagrams (images) and populate a Neo4j graph database. Supports TMF-style diagrams, schema diagrams, and other UML class diagrams. Uses vision models for extraction and generates Cypher queries to populate Neo4j.

install

source · Clone the upstream repo

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/classdiagram-to-neo4j" ~/.claude/skills/majiayu000-claude-skill-registry-classdiagram-to-neo4j && rm -rf "$T"

manifest: skills/data/classdiagram-to-neo4j/SKILL.md

Class Diagram to Neo4j Extraction Skill

Overview

This skill extracts structured data from UML class diagrams (images) and populates Neo4j graph databases. It's designed for:

TMF (TM Forum) API specification diagrams
UML class diagrams
Entity-relationship diagrams
Schema diagrams

Workflow

1. Image Analysis

Use vision models (GPT-4 Vision, Claude Vision, etc.) to analyze diagram images
Extract text, boxes, lines, and relationships
Identify entities, properties, and relationships

2. Structured Extraction

Parse entities (classes) with their properties
Extract relationships (associations, inheritance, etc.)
Capture cardinality and relationship metadata
Handle color coding and visual indicators

3. Data Normalization

Convert to structured format (YAML/JSON)
Normalize entity names and types
Standardize relationship types
Handle references and aliases
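
As a rough illustration of the normalization step, the sketch below shows one way entity names and relationship types might be canonicalized. The function names and the `RELATIONSHIP_ALIASES` table are hypothetical and are not taken from the skill's code; the target conventions (PascalCase entity names, snake_case relationship types) are inferred from the examples in this document.

```python
import re

# Hypothetical alias table: maps UML wording to one canonical relationship type.
RELATIONSHIP_ALIASES = {
    "association": "associates_with",
    "inheritance": "inherits_from",
    "generalization": "inherits_from",
    "aggregation": "aggregates",
    "composition": "composed_of",
}


def normalize_entity_name(raw: str) -> str:
    """Join the alphanumeric words of a label into a PascalCase identifier."""
    words = re.findall(r"[A-Za-z0-9]+", raw)
    # Upper-case only the first letter so existing inner capitals survive.
    return "".join(word[0].upper() + word[1:] for word in words)


def normalize_relationship_type(raw: str) -> str:
    """Map a relationship label to an alias, or fall back to snake_case."""
    key = raw.strip().lower()
    if key in RELATIONSHIP_ALIASES:
        return RELATIONSHIP_ALIASES[key]
    return re.sub(r"[^a-z0-9]+", "_", key).strip("_")
```

With these rules, a label such as "product offering" and an already clean "ProductOffering" both resolve to the same entity name, which keeps later `MERGE` operations from creating near-duplicate nodes.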

4. Neo4j Population

Generate Cypher queries
Create nodes with properties
Create relationships with metadata
Handle constraints and indexes

Usage Patterns

Pattern 1: Direct Image → Neo4j

from classdiagram_to_neo4j import extract_and_populate

# Extract from image and populate Neo4j
extract_and_populate(
    image_path="diagrams/product_offering.png",
    neo4j_uri="bolt://localhost:7687",
    neo4j_user="neo4j",
    neo4j_password="password"
)

Pattern 2: Extract → Review → Populate

from classdiagram_to_neo4j import extract_diagram, populate_neo4j

# Step 1: Extract to JSON/YAML
data = extract_diagram(
    image_path="diagrams/product_offering.png",
    output_format="json",
    output_path="extracted.json"
)

# Step 2: Review/edit JSON if needed
# ... manual review ...

# Step 3: Populate Neo4j
populate_neo4j(
    data=data,
    neo4j_uri="bolt://localhost:7687",
    neo4j_user="neo4j",
    neo4j_password="password"
)

Pattern 3: Batch Processing

from classdiagram_to_neo4j import extract_diagram, populate_neo4j

# Process multiple diagrams
diagrams = [
    "diagrams/product_offering.png",
    "diagrams/category.png",
    "diagrams/pricing.png"
]

for diagram_path in diagrams:
    data = extract_diagram(diagram_path, output_format="json")
    populate_neo4j(
        data=data,
        neo4j_uri="bolt://localhost:7687",
        neo4j_user="neo4j",
        neo4j_password="password"
    )

Diagram Types Supported

TMF-Style Diagrams

ProductOffering hub diagrams
Category relationships
Specification diagrams
Reference entity diagrams

UML Class Diagrams

Classes with attributes
Associations with multiplicities
Inheritance hierarchies
Aggregations and compositions

Schema Diagrams

Database schemas
API schemas
Domain models

Extraction Process

Step 1: Vision Analysis

The vision model analyzes the image and extracts:

Entities: Boxes/classes with names
Properties: Attributes within entities
Relationships: Lines/arrows between entities
Metadata: Cardinality, roles, types
Visual Indicators: Colors, borders, dashed lines

Step 2: Structured Output

Extracted data is normalized into:

meta:
  source: "diagrams/product_offering.png"
  extracted_at: "2024-01-01T00:00:00Z"
  diagram_type: "uml_class"

entities:
  ProductOffering:
    label: "ProductOffering"
    properties:
      - name: "id"
        type: "string"
        required: true
      - name: "name"
        type: "string"
        required: true
      - name: "isBundle"
        type: "boolean"
        required: false

relationships:
  - from: "ProductOffering"
    to: "ProductSpecification"
    type: "has_specification"
    cardinality: "0..1"
    direction: "out"
    properties:
      role: null

Step 3: Neo4j Population

Generates Cypher queries:

// Create schema block
MERGE (sb:SchemaBlock {id: 'tmf620_productoffering'})
SET sb.title = 'ProductOffering Diagram',
    sb.artifact = 'diagrams/productoffering.png';

// Create entities with FQN
MERGE (e:Entity {fqn: 'tmf620_productoffering#ProductOffering'})
SET e.name = 'ProductOffering',
    e.specId = 'tmf620_productoffering',
    e.kind = 'Entity';

// Create fields
MERGE (f:Field {fqn: 'tmf620_productoffering#ProductOffering.name'})
SET f.name = 'name',
    f.type = 'string',
    f.required = true;

// Link field to entity
MATCH (e:Entity {fqn: 'tmf620_productoffering#ProductOffering'})
MATCH (f:Field {fqn: 'tmf620_productoffering#ProductOffering.name'})
MERGE (e)-[:HAS_FIELD]->(f);

// Create relationships
MATCH (from:Entity {fqn: 'tmf620_productoffering#ProductOffering'})
MATCH (to:Entity {fqn: 'tmf620_productoffering#ProductSpecification'})
MERGE (from)-[r:RELATES_TO {
    type: 'has_specification',
    fromCardinality: '0..1',
    toCardinality: '1',
    direction: 'out'
}]->(to);
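
The queries above can also be produced programmatically. The following is a minimal sketch, assuming the structured format from Step 2; `build_statements` is a hypothetical helper and not the skill's actual generator. It emits parameterized Cypher rather than interpolating values into query text, and it stores the single `cardinality` value from the extracted data as one relationship property, since the mapping to `fromCardinality`/`toCardinality` is not specified here.

```python
def build_statements(spec_id: str, data: dict) -> list[tuple[str, dict]]:
    """Return (cypher, parameters) pairs for entities, fields and relationships."""
    statements = []
    for name, entity in data.get("entities", {}).items():
        entity_fqn = f"{spec_id}#{name}"
        statements.append((
            "MERGE (e:Entity {fqn: $fqn}) "
            "SET e.name = $name, e.specId = $specId, e.kind = 'Entity'",
            {"fqn": entity_fqn, "name": name, "specId": spec_id},
        ))
        for prop in entity.get("properties", []):
            # One :Field node per property, linked with HAS_FIELD.
            statements.append((
                "MATCH (e:Entity {fqn: $entityFqn}) "
                "MERGE (f:Field {fqn: $fieldFqn}) "
                "SET f.name = $name, f.type = $type, f.required = $required "
                "MERGE (e)-[:HAS_FIELD]->(f)",
                {
                    "entityFqn": entity_fqn,
                    "fieldFqn": f"{entity_fqn}.{prop['name']}",
                    "name": prop["name"],
                    "type": prop.get("type"),
                    "required": prop.get("required", False),
                },
            ))
    for rel in data.get("relationships", []):
        statements.append((
            "MATCH (a:Entity {fqn: $fromFqn}) "
            "MATCH (b:Entity {fqn: $toFqn}) "
            "MERGE (a)-[r:RELATES_TO {type: $type}]->(b) "
            "SET r.cardinality = $cardinality, r.direction = $direction",
            {
                "fromFqn": f"{spec_id}#{rel['from']}",
                "toFqn": f"{spec_id}#{rel['to']}",
                "type": rel["type"],
                "cardinality": rel.get("cardinality"),
                "direction": rel.get("direction"),
            },
        ))
    return statements
```

Each pair can be passed to the Neo4j Python driver as `session.run(query, params)`. Because every statement uses `MERGE`, re-running the same extraction should not create duplicate nodes.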

Key Features

1. Scalable Data Model

Uses stable labels (`:Entity`, `:RefType`, `:SchemaBlock`) instead of per-class labels
Uses FQN (Fully Qualified Name) for entity identity: `<specId>#<entityName>`
Uses a generic `RELATES_TO` relationship type with a `type` property
Avoids label explosion and supports namespacing

See `references/SCALABLE_RELATIONSHIP_MODEL.md`.
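
To make the FQN convention concrete, here is a small sketch of building and parsing these identifiers. The helper names are hypothetical; the `<specId>#<entityName>` format and the `.<fieldName>` suffix for fields follow the Cypher examples in this document. The sketch assumes that spec IDs and entity names contain neither `#` nor `.`.

```python
from typing import NamedTuple, Optional


class Fqn(NamedTuple):
    spec_id: str
    entity: str
    field: Optional[str] = None


def make_fqn(spec_id: str, entity: str, field: Optional[str] = None) -> str:
    """Build '<specId>#<entityName>' or '<specId>#<entityName>.<fieldName>'."""
    base = f"{spec_id}#{entity}"
    return f"{base}.{field}" if field else base


def parse_fqn(fqn: str) -> Fqn:
    """Split an FQN back into its parts; raise ValueError if malformed."""
    spec_id, sep, rest = fqn.partition("#")
    if not sep or not spec_id or not rest:
        raise ValueError(f"not a valid FQN: {fqn!r}")
    entity, _, field = rest.partition(".")
    return Fqn(spec_id, entity, field or None)
```

Because the spec ID is part of the identity, two diagrams that both define `ProductOffering` produce distinct nodes instead of silently merging.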

2. Provenance Tracking

Tracks source diagram via `SchemaBlock` nodes
Uses FQN for entity identity (supports multiple versions)
Maintains extraction metadata (`specId`, `extracted_at`)
Links entities to schema blocks via `CONTAINS_ENTITY`

3. Conflict Resolution

Handles duplicate entities
Merges properties intelligently
Resolves relationship conflicts
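
The merge policy itself is not spelled out here, so the following is only one possible reading of "merging properties", offered as a sketch. The `merge_properties` function and its rules are assumptions: existing values win, missing values are filled in from the incoming extraction, and a property counts as required if either side marks it so.

```python
def merge_properties(existing: list[dict], incoming: list[dict]) -> list[dict]:
    """Merge two property lists by name without mutating the inputs."""
    merged = {prop["name"]: dict(prop) for prop in existing}
    for prop in incoming:
        current = merged.get(prop["name"])
        if current is None:
            # New property: take it as extracted.
            merged[prop["name"]] = dict(prop)
            continue
        # Known property: only fill in values that are missing so far.
        for key, value in prop.items():
            if current.get(key) is None:
                current[key] = value
        # Assumed rule: required if either extraction says so.
        current["required"] = bool(current.get("required")) or bool(prop.get("required"))
    return list(merged.values())
```

A stricter policy could instead record conflicting types as extraction issues rather than resolving them silently.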

4. Validation

Validates extracted data structure before population
Checks for missing required fields
Verifies relationship consistency
Validates cardinality formats
Can be disabled with the `--no-validate` flag
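
The checks listed above could look roughly like the sketch below. It is an illustration under assumptions, not the skill's validator: `validate_extraction` is a hypothetical name, the accepted cardinality forms (`1`, `*`, `0..1`, `1..*`) are the usual UML multiplicities, and the rule that both relationship ends must appear under `entities` may be too strict for diagrams that use reference types.

```python
import re

# Accepts "1", "*", "0..1", "1..*" and similar UML multiplicities.
CARDINALITY_RE = re.compile(r"^(\d+|\*)(\.\.(\d+|\*))?$")


def validate_extraction(data: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    entities = data.get("entities")
    if not isinstance(entities, dict) or not entities:
        errors.append("missing or empty 'entities'")
        entities = {}
    for name, entity in entities.items():
        for prop in entity.get("properties", []):
            for key in ("name", "type"):
                if key not in prop:
                    errors.append(f"{name}: property missing '{key}'")
    for i, rel in enumerate(data.get("relationships", [])):
        if not rel.get("type"):
            errors.append(f"relationships[{i}]: missing 'type'")
        for end in ("from", "to"):
            target = rel.get(end)
            if target not in entities:
                errors.append(f"relationships[{i}]: unknown {end} entity {target!r}")
        card = rel.get("cardinality")
        if card is not None and not CARDINALITY_RE.match(str(card)):
            errors.append(f"relationships[{i}]: bad cardinality {card!r}")
    return errors
```

Returning a list instead of raising on the first problem lets a reviewer see every issue in an extraction at once.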

5. Property Persistence

Properties are stored as `:Field` nodes
Fields linked to entities via `HAS_FIELD` relationships
Property metadata (type, required, default) fully persisted

Configuration

Vision Model Settings

vision:
  provider: "openai"  # or "anthropic"
  model: "gpt-4o"  # or "claude-3-5-sonnet-20241022"
  max_tokens: 8000
  temperature: 0.1
  use_structured_output: true  # Uses JSON mode when available

Neo4j Settings

neo4j:
  uri: "bolt://localhost:7687"
  user: "neo4j"
  password: "password"
  database: "neo4j"
  create_constraints: true
  create_indexes: true
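
What `create_constraints` and `create_indexes` actually create is not listed here, so the following sketch only suggests a plausible set based on the data model in this document: uniqueness on the `fqn` and `id` keys used in the `MERGE` statements, plus a relationship property index on `RELATES_TO.type`. The constraint and index names are made up. The `FOR ... REQUIRE` constraint syntax assumes Neo4j 4.4 or later; older servers use `ON ... ASSERT`.

```python
CONSTRAINT_STATEMENTS = [
    "CREATE CONSTRAINT entity_fqn_unique IF NOT EXISTS "
    "FOR (e:Entity) REQUIRE e.fqn IS UNIQUE",
    "CREATE CONSTRAINT field_fqn_unique IF NOT EXISTS "
    "FOR (f:Field) REQUIRE f.fqn IS UNIQUE",
    "CREATE CONSTRAINT schemablock_id_unique IF NOT EXISTS "
    "FOR (sb:SchemaBlock) REQUIRE sb.id IS UNIQUE",
]

INDEX_STATEMENTS = [
    # Relationship property index (requires Neo4j server 4.3+).
    "CREATE INDEX relates_to_type IF NOT EXISTS "
    "FOR ()-[r:RELATES_TO]-() ON (r.type)",
]


def schema_statements(create_constraints: bool = True,
                      create_indexes: bool = True) -> list[str]:
    """Select the DDL statements matching the two config flags."""
    statements = []
    if create_constraints:
        statements.extend(CONSTRAINT_STATEMENTS)
    if create_indexes:
        statements.extend(INDEX_STATEMENTS)
    return statements


def apply_schema(session, **flags) -> int:
    """Run each statement on an open driver session; return how many ran."""
    statements = schema_statements(**flags)
    for statement in statements:
        session.run(statement)
    return len(statements)
```

The `IF NOT EXISTS` clause makes the statements safe to run on every population pass.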

Extraction Settings

extraction:
  include_properties: true
  include_methods: false
  normalize_names: true
  handle_references: true
  extract_cardinality: true

Output Formats

YAML Format

See `schema_examples/tmf620/productoffering_hub.core.example.yaml` for an example.

JSON Format

{
  "meta": {
    "source": "diagrams/product_offering.png",
    "extracted_at": "2024-01-01T00:00:00Z"
  },
  "entities": {
    "ProductOffering": {
      "label": "ProductOffering",
      "properties": [...]
    }
  },
  "relationships": [...]
}

Cypher Format

See `schema_examples/neo4j/tmf620_productoffering_scalable_model.cypher` for an example.

Integration with Existing Tools

With TMF MCP Builder

import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent / "scripts"))

from extract_and_populate import extract_and_populate
from neo4j import GraphDatabase

# Extract and populate
extract_and_populate(
    image_path="diagrams/tmf620_productoffering.png",
    neo4j_password="password"
)

# Query for relevant subgraph
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    result = session.run("""
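        // r is a list of relationships in a variable-length match, so filter each hop
        MATCH (e:Entity {name: 'ProductOffering'})-[r:RELATES_TO*1..2]->(related)
        WHERE all(rel IN r WHERE rel.type IN ['has_specification', 'has_price'])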
        RETURN e, r, related
    """)
    # Process results...
driver.close()

Best Practices

Pre-process Images
- Ensure high resolution
- Remove noise and artifacts
- Standardize format (PNG preferred)
Validate Extraction
- Review extracted YAML/JSON
- Verify entity names
- Check relationship cardinalities
Incremental Updates
- Use merge strategies
- Track changes
- Maintain provenance
Query Optimization
- Create indexes on common properties
- Use relationship type filters
- Limit hop depth
Error Handling
- Handle missing entities
- Validate relationships
- Log extraction issues

Examples

See the `examples/` directory for:

Simple UML class diagram extraction
TMF ProductOffering diagram extraction
Batch processing example
Custom extraction rules

References

`references/SCALABLE_RELATIONSHIP_MODEL.md` - Relationship modeling approach
`references/VISION_EXTRACTION_PROMPTS.md` - Vision model prompts
`NEO4J_REQUIREMENTS.md` - Neo4j server version requirements
`schema_examples/neo4j/` - Example Cypher scripts

Neo4j Server Requirements

Important: Relationship property indexes require Neo4j server version 4.3+.

The `requirements.txt` specifies the Python driver version, not the server version
Check your Neo4j server version: `neo4j version` or `CALL dbms.components()`
See `NEO4J_REQUIREMENTS.md` for full compatibility details
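
The version check can also be done from Python before any relationship property index is created. This is a minimal sketch: the helper names are hypothetical, while `CALL dbms.components()` and the 4.3 threshold come from the note above.

```python
import re

# Returns the server version string, for example "5.13.0".
VERSION_QUERY = (
    "CALL dbms.components() YIELD name, versions "
    "RETURN versions[0] AS version"
)


def parse_version(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a Neo4j version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_relationship_indexes(version: str) -> bool:
    """True when the server is at least 4.3."""
    return parse_version(version) >= (4, 3)
```

With the Neo4j Python driver, the version string can be read with `session.run(VERSION_QUERY).single()["version"]` and passed to `supports_relationship_indexes`.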

Troubleshooting

Common Issues

Low Extraction Quality
- Increase image resolution
- Use a more capable vision model
- Provide more context in prompts
Missing Relationships
- Check diagram clarity
- Verify relationship detection logic
- Review extraction output
Neo4j Population Errors
- Check constraints
- Verify relationship types
- Review Cypher syntax
Performance Issues
- Batch operations
- Use transactions
- Create indexes
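
For the performance advice above, one common approach is to send rows in batches through a single `UNWIND` query, with one transaction per batch. The sketch below assumes the `:Entity` model from this document and a Neo4j Python driver 5.x session (`execute_write`; the 4.x equivalent is `write_transaction`). The function names and the default batch size are illustrative choices.

```python
from itertools import islice

UPSERT_ENTITIES = (
    "UNWIND $rows AS row "
    "MERGE (e:Entity {fqn: row.fqn}) "
    "SET e.name = row.name, e.specId = row.specId"
)


def chunked(rows, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


def _write_batch(tx, batch):
    # One round trip per batch instead of one per entity.
    tx.run(UPSERT_ENTITIES, rows=batch).consume()


def upsert_entities(session, rows, batch_size=500):
    """Write rows in batches, one transaction each; return the batch count."""
    batches = 0
    for batch in chunked(rows, batch_size):
        session.execute_write(_write_batch, batch)
        batches += 1
    return batches
```

Each row is a dict with `fqn`, `name` and `specId` keys. Combined with a uniqueness constraint on `fqn`, the `MERGE` lookup stays index-backed as the graph grows.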

Future Enhancements

Support for sequence diagrams
Support for activity diagrams
Multi-page diagram handling
Automatic relationship inference
Diagram versioning and diff