Claude-skill-registry classdiagram-to-neo4j
Extract entities, properties, and relationships from UML class diagrams (images) and populate a Neo4j graph database. Supports TMF-style diagrams, schema diagrams, and other UML class diagrams. Uses vision models for extraction and generates Cypher queries to populate Neo4j.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/classdiagram-to-neo4j" ~/.claude/skills/majiayu000-claude-skill-registry-classdiagram-to-neo4j && rm -rf "$T"
skills/data/classdiagram-to-neo4j/SKILL.md
Class Diagram to Neo4j Extraction Skill
Overview
This skill extracts structured data from UML class diagrams (images) and populates Neo4j graph databases. It's designed for:
- TMF (TM Forum) API specification diagrams
- UML class diagrams
- Entity-relationship diagrams
- Schema diagrams
Workflow
1. Image Analysis
- Use vision models (GPT-4 Vision, Claude Vision, etc.) to analyze diagram images
- Extract text, boxes, lines, and relationships
- Identify entities, properties, and relationships
2. Structured Extraction
- Parse entities (classes) with their properties
- Extract relationships (associations, inheritance, etc.)
- Capture cardinality and relationship metadata
- Handle color coding and visual indicators
3. Data Normalization
- Convert to structured format (YAML/JSON)
- Normalize entity names and types
- Standardize relationship types
- Handle references and aliases
4. Neo4j Population
- Generate Cypher queries
- Create nodes with properties
- Create relationships with metadata
- Handle constraints and indexes
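Step 3 of the workflow (data normalization) can be sketched as two small pure functions. This is a minimal sketch, not the skill's actual implementation: the function names normalize_entity_name and normalize_relationship_type and the alias table are hypothetical, and the PascalCase/snake_case conventions are assumed from the example names used elsewhere in this document.

```python
import re

# Hypothetical alias table: maps spelling variants a vision model might emit
# to one canonical relationship type. Not taken from the skill itself.
RELATIONSHIP_ALIASES = {
    "hasspecification": "has_specification",
    "has_spec": "has_specification",
}


def normalize_entity_name(raw: str) -> str:
    """Drop whitespace and punctuation and return a PascalCase entity name."""
    words = re.findall(r"[A-Za-z0-9]+", raw)
    return "".join(word[0].upper() + word[1:] for word in words)


def normalize_relationship_type(raw: str) -> str:
    """Return a snake_case relationship type, resolving known aliases."""
    # Insert an underscore at each camelCase boundary (e.g. hasSpec -> has_Spec).
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", raw.strip())
    # Collapse every run of non-alphanumeric characters into one underscore.
    snake = re.sub(r"[^A-Za-z0-9]+", "_", snake).strip("_").lower()
    return RELATIONSHIP_ALIASES.get(snake, snake)
```

Keeping normalization free of I/O makes it easy to run over the extracted YAML/JSON during the review step, before anything is written to Neo4j.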
Usage Patterns
Pattern 1: Direct Image → Neo4j
    from classdiagram_to_neo4j import extract_and_populate

    # Extract from image and populate Neo4j
    extract_and_populate(
        image_path="diagrams/product_offering.png",
        neo4j_uri="bolt://localhost:7687",
        neo4j_user="neo4j",
        neo4j_password="password"
    )
Pattern 2: Extract → Review → Populate
    from classdiagram_to_neo4j import extract_diagram, populate_neo4j

    # Step 1: Extract to JSON/YAML
    data = extract_diagram(
        image_path="diagrams/product_offering.png",
        output_format="json",
        output_path="extracted.json"
    )

    # Step 2: Review/edit JSON if needed
    # ... manual review ...

    # Step 3: Populate Neo4j
    populate_neo4j(
        data=data,
        neo4j_uri="bolt://localhost:7687",
        neo4j_user="neo4j",
        neo4j_password="password"
    )
Pattern 3: Batch Processing
    from classdiagram_to_neo4j import extract_diagram, populate_neo4j

    # Process multiple diagrams
    diagrams = [
        "diagrams/product_offering.png",
        "diagrams/category.png",
        "diagrams/pricing.png"
    ]

    for diagram_path in diagrams:
        data = extract_diagram(diagram_path, output_format="json")
        populate_neo4j(
            data=data,
            neo4j_uri="bolt://localhost:7687",
            neo4j_user="neo4j",
            neo4j_password="password"
        )
Diagram Types Supported
TMF-Style Diagrams
- ProductOffering hub diagrams
- Category relationships
- Specification diagrams
- Reference entity diagrams
UML Class Diagrams
- Classes with attributes
- Associations with multiplicities
- Inheritance hierarchies
- Aggregations and compositions
Schema Diagrams
- Database schemas
- API schemas
- Domain models
Extraction Process
Step 1: Vision Analysis
The vision model analyzes the image and extracts:
- Entities: Boxes/classes with names
- Properties: Attributes within entities
- Relationships: Lines/arrows between entities
- Metadata: Cardinality, roles, types
- Visual Indicators: Colors, borders, dashed lines
Step 2: Structured Output
Extracted data is normalized into:
    meta:
      source: "diagrams/product_offering.png"
      extracted_at: "2024-01-01T00:00:00Z"
      diagram_type: "uml_class"
    entities:
      ProductOffering:
        label: "ProductOffering"
        properties:
          - name: "id"
            type: "string"
            required: true
          - name: "name"
            type: "string"
            required: true
          - name: "isBundle"
            type: "boolean"
            required: false
    relationships:
      - from: "ProductOffering"
        to: "ProductSpecification"
        type: "has_specification"
        cardinality: "0..1"
        direction: "out"
        properties:
          role: null
Step 3: Neo4j Population
Generates Cypher queries:
    // Create schema block
    MERGE (sb:SchemaBlock {id: 'tmf620_productoffering'})
    SET sb.title = 'ProductOffering Diagram',
        sb.artifact = 'diagrams/productoffering.png';

    // Create entities with FQN
    MERGE (e:Entity {fqn: 'tmf620_productoffering#ProductOffering'})
    SET e.name = 'ProductOffering',
        e.specId = 'tmf620_productoffering',
        e.kind = 'Entity';

    // Create fields
    MERGE (f:Field {fqn: 'tmf620_productoffering#ProductOffering.name'})
    SET f.name = 'name',
        f.type = 'string',
        f.required = true;

    // Link field to entity
    MATCH (e:Entity {fqn: 'tmf620_productoffering#ProductOffering'})
    MATCH (f:Field {fqn: 'tmf620_productoffering#ProductOffering.name'})
    MERGE (e)-[:HAS_FIELD]->(f);

    // Create relationships
    MATCH (from:Entity {fqn: 'tmf620_productoffering#ProductOffering'})
    MATCH (to:Entity {fqn: 'tmf620_productoffering#ProductSpecification'})
    MERGE (from)-[r:RELATES_TO {
      type: 'has_specification',
      fromCardinality: '0..1',
      toCardinality: '1',
      direction: 'out'
    }]->(to);
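One way to produce queries like these from the extracted data is to build parameterized statements rather than interpolating values into Cypher strings. The following is a minimal sketch under stated assumptions: build_statements is a hypothetical helper, not the skill's API; the input dict is assumed to follow the structured output shown in Step 2; and the single cardinality value from that output is stored as one property, since the document does not say how it maps to fromCardinality and toCardinality.

```python
from typing import Any


def build_statements(data: dict[str, Any], spec_id: str) -> list[tuple[str, dict[str, Any]]]:
    """Return (cypher, parameters) pairs for entities, fields, and relationships."""
    statements: list[tuple[str, dict[str, Any]]] = []

    for name, entity in data.get("entities", {}).items():
        entity_fqn = f"{spec_id}#{name}"
        statements.append((
            "MERGE (e:Entity {fqn: $fqn}) "
            "SET e.name = $name, e.specId = $specId, e.kind = 'Entity'",
            {"fqn": entity_fqn, "name": name, "specId": spec_id},
        ))
        # One :Field node per property, linked to its entity via HAS_FIELD.
        for prop in entity.get("properties", []):
            statements.append((
                "MATCH (e:Entity {fqn: $entityFqn}) "
                "MERGE (f:Field {fqn: $fieldFqn}) "
                "SET f.name = $name, f.type = $type, f.required = $required "
                "MERGE (e)-[:HAS_FIELD]->(f)",
                {
                    "entityFqn": entity_fqn,
                    "fieldFqn": f"{entity_fqn}.{prop['name']}",
                    "name": prop["name"],
                    "type": prop.get("type"),
                    "required": bool(prop.get("required", False)),
                },
            ))

    # Generic RELATES_TO edges; the semantic type lives in a property.
    for rel in data.get("relationships", []):
        statements.append((
            "MATCH (a:Entity {fqn: $fromFqn}) "
            "MATCH (b:Entity {fqn: $toFqn}) "
            "MERGE (a)-[r:RELATES_TO {type: $type}]->(b) "
            "SET r.cardinality = $cardinality, r.direction = $direction",
            {
                "fromFqn": f"{spec_id}#{rel['from']}",
                "toFqn": f"{spec_id}#{rel['to']}",
                "type": rel["type"],
                "cardinality": rel.get("cardinality"),
                "direction": rel.get("direction", "out"),
            },
        ))
    return statements
```

Each pair can then be passed to the Neo4j Python driver as session.run(query, params). Using parameters avoids quoting problems when entity or property names contain apostrophes.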
Key Features
1. Scalable Data Model
- Uses stable labels (:Entity, :RefType, :SchemaBlock) instead of per-class labels
- Uses FQN (Fully Qualified Name) for entity identity: <specId>#<entityName>
- Uses a generic RELATES_TO relationship type with a type property
- Avoids label explosion and supports namespacing
- See references/SCALABLE_RELATIONSHIP_MODEL.md
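The FQN convention can be illustrated with a few helpers. This is a sketch only: the function names are hypothetical, and the field form <specId>#<entityName>.<fieldName> is inferred from the Cypher example in this document rather than from a stated rule.

```python
from typing import Optional


def entity_fqn(spec_id: str, entity_name: str) -> str:
    """Build an entity FQN of the form <specId>#<entityName>."""
    return f"{spec_id}#{entity_name}"


def field_fqn(spec_id: str, entity_name: str, field_name: str) -> str:
    """Build a field FQN by appending .<fieldName> to the entity FQN."""
    return f"{entity_fqn(spec_id, entity_name)}.{field_name}"


def parse_fqn(fqn: str) -> tuple[str, str, Optional[str]]:
    """Split an FQN into (spec_id, entity_name, field_name or None)."""
    spec_id, sep, rest = fqn.partition("#")
    if not sep or not spec_id or not rest:
        raise ValueError(f"not a valid FQN: {fqn!r}")
    entity_name, _, field_name = rest.partition(".")
    return spec_id, entity_name, field_name or None
```

Because the spec identifier is part of the key, two diagrams that both define a ProductOffering entity yield distinct nodes instead of silently merging.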
2. Provenance Tracking
- Tracks source diagram via SchemaBlock nodes
- Uses FQN for entity identity (supports multiple versions)
- Maintains extraction metadata (specId, extracted_at)
- Links entities to schema blocks via CONTAINS_ENTITY
3. Conflict Resolution
- Handles duplicate entities
- Merges properties intelligently
- Resolves relationship conflicts
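The document does not specify how properties are merged, so the following is only one possible policy, shown as a sketch: the first value seen for a property wins, and later extractions only fill in keys that are missing or null. The function name merge_properties is hypothetical.

```python
from typing import Any


def merge_properties(
    existing: list[dict[str, Any]], incoming: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Merge two property lists by name; existing values win, gaps are filled."""
    # Copy each dict so the caller's data is never mutated.
    merged: dict[str, dict[str, Any]] = {p["name"]: dict(p) for p in existing}
    for prop in incoming:
        current = merged.setdefault(prop["name"], {})
        for key, value in prop.items():
            # Only fill keys that are absent or explicitly null.
            if current.get(key) is None:
                current[key] = value
    return list(merged.values())
```

A stricter policy, such as raising an error when two diagrams disagree on a property type, may be preferable when the diagrams come from different specification versions.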
4. Validation
- Validates extracted data structure before population
- Checks for missing required fields
- Verifies relationship consistency
- Validates cardinality formats
- Can be disabled with the --no-validate flag
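The checks listed above could look roughly like this. It is a minimal sketch, not the skill's validator: validate_extraction is a hypothetical name, the input is assumed to follow the structured output format shown earlier, and the accepted cardinality syntax (forms such as 1, *, 0..1, 1..*) is an assumption.

```python
import re
from typing import Any

# Assumed cardinality syntax: a number or *, optionally followed by ..<number or *>.
CARDINALITY_RE = re.compile(r"(\d+|\*)(\.\.(\d+|\*))?")


def validate_extraction(data: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: list[str] = []

    entities = data.get("entities")
    if not isinstance(entities, dict) or not entities:
        errors.append("no entities found")
        entities = {}

    # Every property needs at least a name.
    for name, entity in entities.items():
        for prop in entity.get("properties", []):
            if not prop.get("name"):
                errors.append(f"{name}: property without a name")

    for index, rel in enumerate(data.get("relationships", [])):
        # Both ends of a relationship must refer to an extracted entity.
        for end in ("from", "to"):
            if rel.get(end) not in entities:
                errors.append(f"relationship {index}: unknown '{end}' entity {rel.get(end)!r}")
        cardinality = rel.get("cardinality")
        if cardinality is not None and not CARDINALITY_RE.fullmatch(str(cardinality)):
            errors.append(f"relationship {index}: bad cardinality {cardinality!r}")

    return errors
```

Returning all problems at once, rather than raising on the first, suits the review step: a reviewer can fix the extracted YAML/JSON in one pass.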
5. Property Persistence
- Properties are stored as :Field nodes
- Fields are linked to entities via HAS_FIELD relationships
- Property metadata (type, required, default) is fully persisted
Configuration
Vision Model Settings
    vision:
      provider: "openai"  # or "anthropic"
      model: "gpt-4o"  # or "claude-3-5-sonnet-20241022"
      max_tokens: 8000
      temperature: 0.1
      use_structured_output: true  # Uses JSON mode when available
Neo4j Settings
    neo4j:
      uri: "bolt://localhost:7687"
      user: "neo4j"
      password: "password"
      database: "neo4j"
      create_constraints: true
      create_indexes: true
Extraction Settings
    extraction:
      include_properties: true
      include_methods: false
      normalize_names: true
      handle_references: true
      extract_cardinality: true
Output Formats
YAML Format
See schema_examples/tmf620/productoffering_hub.core.example.yaml for an example.
JSON Format
    {
      "meta": {
        "source": "diagrams/product_offering.png",
        "extracted_at": "2024-01-01T00:00:00Z"
      },
      "entities": {
        "ProductOffering": {
          "label": "ProductOffering",
          "properties": [...]
        }
      },
      "relationships": [...]
    }
Cypher Format
See schema_examples/neo4j/tmf620_productoffering_scalable_model.cypher for an example.
Integration with Existing Tools
With TMF MCP Builder
    import sys
    from pathlib import Path

    # Make the skill's scripts/ directory importable
    sys.path.insert(0, str(Path(__file__).parent / "scripts"))

    from extract_and_populate import extract_and_populate
    from neo4j import GraphDatabase

    # Extract and populate
    extract_and_populate(
        image_path="diagrams/tmf620_productoffering.png",
        neo4j_password="password"
    )

    # Query for relevant subgraph
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
    with driver.session() as session:
        # On a variable-length path r is a list of relationships,
        # so the type filter has to be applied to every hop.
        result = session.run("""
            MATCH (e:Entity {name: 'ProductOffering'})-[r:RELATES_TO*1..2]->(related)
            WHERE all(rel IN r WHERE rel.type IN ['has_specification', 'has_price'])
            RETURN e, r, related
        """)
        # Process results...
    driver.close()
Best Practices
- Pre-process Images
  - Ensure high resolution
  - Remove noise and artifacts
  - Standardize format (PNG preferred)
- Validate Extraction
  - Review extracted YAML/JSON
  - Verify entity names
  - Check relationship cardinalities
- Incremental Updates
  - Use merge strategies
  - Track changes
  - Maintain provenance
- Query Optimization
  - Create indexes on common properties
  - Use relationship type filters
  - Limit hop depth
- Error Handling
  - Handle missing entities
  - Validate relationships
  - Log extraction issues
Examples
See the examples/ directory for:
- Simple UML class diagram extraction
- TMF ProductOffering diagram extraction
- Batch processing example
- Custom extraction rules
References
- references/SCALABLE_RELATIONSHIP_MODEL.md: Relationship modeling approach
- references/VISION_EXTRACTION_PROMPTS.md: Vision model prompts
- NEO4J_REQUIREMENTS.md: Neo4j server version requirements
- schema_examples/neo4j/: Example Cypher scripts
Neo4j Server Requirements
Important: Relationship property indexes require Neo4j server version 4.3+.
- The requirements.txt specifies the Python driver version, not the server version
- Check your Neo4j server version: neo4j version or CALL dbms.components()
- See NEO4J_REQUIREMENTS.md for full compatibility details
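The version requirement can be checked programmatically before attempting to create relationship property indexes. This is a small sketch: supports_relationship_indexes is a hypothetical helper, and it assumes the version string starts with major.minor, as in the values reported by CALL dbms.components().

```python
import re


def supports_relationship_indexes(server_version: str) -> bool:
    """Return True if the server version is 4.3 or newer."""
    match = re.match(r"(\d+)\.(\d+)", server_version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {server_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison orders by major first, then minor.
    return (major, minor) >= (4, 3)
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would rank "4.10" below "4.3".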
Troubleshooting
Common Issues
- Low Extraction Quality
  - Increase image resolution
  - Use a better vision model
  - Provide more context in prompts
- Missing Relationships
  - Check diagram clarity
  - Verify relationship detection logic
  - Review extraction output
- Neo4j Population Errors
  - Check constraints
  - Verify relationship types
  - Review Cypher syntax
- Performance Issues
  - Batch operations
  - Use transactions
  - Create indexes
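For the performance items, batching usually means sending many rows per query with UNWIND instead of one query per entity. The sketch below shows the idea under stated assumptions: the chunked helper, the UPSERT_ENTITIES query, and the row keys (fqn, name, specId) are illustrative and not taken from the skill.

```python
from typing import Any, Iterator

# One round trip upserts a whole batch of entity rows.
UPSERT_ENTITIES = (
    "UNWIND $rows AS row "
    "MERGE (e:Entity {fqn: row.fqn}) "
    "SET e.name = row.name, e.specId = row.specId"
)


def chunked(rows: list[dict[str, Any]], size: int) -> Iterator[list[dict[str, Any]]]:
    """Yield consecutive slices of rows, each with at most size items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

With the Neo4j Python driver, each batch can be sent as session.run(UPSERT_ENTITIES, rows=batch). A uniqueness constraint or index on the fqn property keeps the MERGE lookups fast as the graph grows.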
Future Enhancements
- Support for sequence diagrams
- Support for activity diagrams
- Multi-page diagram handling
- Automatic relationship inference
- Diagram versioning and diff