Skillshub graph-schema

dot-skills Graph Database Schema Design Best Practices

install

source · Clone the upstream repo

git clone https://github.com/ComeOnOliver/skillshub

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/pproenca/dot-skills/graph-schema" ~/.claude/skills/comeonoliver-skillshub-graph-schema && rm -rf "$T"

manifest: skills/pproenca/dot-skills/graph-schema/SKILL.md


Comprehensive graph database data modeling guide for property graphs (Neo4j, Memgraph, Amazon Neptune, etc.). Contains 46 rules across 8 categories, prioritized by modeling impact from critical (entity classification, relationship design) to incremental (scale and evolution). Each rule includes detailed explanations, real-world Cypher examples comparing incorrect vs. correct models, and specific impact descriptions.

Philosophy: Data modeling correctness first, performance second. Always ask "what is the user trying to achieve?" before choosing structure.

When to Apply

Reference these guidelines when:

- Designing a new graph database schema from domain requirements
- Translating a relational schema to a graph model
- Deciding whether something should be a node, relationship, or property
- Reviewing an existing graph schema for modeling errors
- Refactoring a graph that produces awkward or slow queries
- Planning for schema evolution and data growth

Rule Categories by Priority


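Priority	Category	Impact	Prefix
1	Entity Classification	CRITICAL	`entity-`
2	Relationship Design	CRITICAL	`rel-`
3	Property Placement	HIGH	`prop-`
4	Query-Driven Refinement	HIGH	`query-`
5	Structural Patterns	HIGH	`pattern-`
6	Anti-Patterns	MEDIUM	`anti-`
7	Constraints & Integrity	MEDIUM	`constraint-`
8	Scale & Evolution	LOW-MEDIUM	`scale-`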

Quick Reference

1. Entity Classification (CRITICAL)

- `entity-events` - Model multi-participant events as first-class nodes
- `entity-shared-values` - Promote shared property values to nodes
- `entity-specific-labels` - Use specific labels over generic ones
- `entity-multi-label` - Qualify entities with multiple labels
- `entity-identity-state` - Separate identity from mutable state
- `entity-reify-actions` - Reify lifecycle actions into nodes
- `entity-avoid-god-nodes` - Avoid kitchen-sink entity nodes
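
To make the first entity rule concrete, here is a minimal Cypher sketch of a hiring event modeled both ways. The `Person`, `Company`, and `Employment` labels, the relationship types, and every property name are hypothetical and not taken from the reference files. The two statements are alternatives, so run one or the other.

```cypher
// Hypothetical domain: all labels, types, and properties are illustrative assumptions.

// Event squeezed into one relationship: the recruiter is only a string,
// so nothing else in the graph can connect to the hiring event.
CREATE (:Person {name: 'Ana'})-[:HIRED_BY {role: 'Engineer', recruiter: 'Ben'}]->(:Company {name: 'Acme'});

// Event as a first-class node: each participant gets its own relationship.
CREATE (ana:Person {name: 'Ana'}),
       (ben:Person {name: 'Ben'}),
       (acme:Company {name: 'Acme'}),
       (job:Employment {role: 'Engineer', startDate: date('2024-01-15')}),
       (ana)-[:HAS_EMPLOYMENT]->(job),
       (job)-[:AT_COMPANY]->(acme),
       (job)-[:RECRUITED_BY]->(ben);
```

In the second form, a query such as "who did Ben recruit?" becomes a plain traversal instead of a string comparison on a relationship property.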

2. Relationship Design (CRITICAL)

- `rel-specific-types` - Use specific relationship types over generic ones
- `rel-meaningful-direction` - Choose semantically meaningful direction
- `rel-naming-conventions` - Follow UPPER_SNAKE_CASE for relationship types
- `rel-no-redundant-reverse` - Don't create redundant reverse relationships
- `rel-properties-scope` - Put data on relationships only when it describes the connection
- `rel-single-semantic` - One relationship type per semantic meaning
- `rel-typed-over-filtered` - Prefer typed relationships over generic + property filter
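
The relationship rules can be illustrated with a short Cypher sketch. The `CONNECTED_TO` and `MANAGES` types, the `kind` property, and the names are assumptions made up for this example.

```cypher
// Hypothetical types and properties, for illustration only.

// Generic type plus property filter: every CONNECTED_TO relationship
// on the node is read first and filtered afterwards.
MATCH (p:Person {name: 'Ana'})-[r:CONNECTED_TO]->(other)
WHERE r.kind = 'manages'
RETURN other;

// Specific UPPER_SNAKE_CASE type: the pattern follows only MANAGES relationships.
MATCH (p:Person {name: 'Ana'})-[:MANAGES]->(report:Person)
RETURN report;

// No MANAGED_BY reverse relationship is needed:
// the same MANAGES relationship is read backwards by flipping the arrow.
MATCH (report:Person {name: 'Ben'})<-[:MANAGES]-(manager:Person)
RETURN manager;
```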

3. Property Placement (HIGH)

- `prop-no-foreign-keys` - Don't embed foreign keys as properties
- `prop-promote-to-node` - Promote frequently-queried values to nodes
- `prop-correct-data-types` - Use appropriate data types for properties
- `prop-no-arrays-for-connections` - Don't use property arrays when you need relationships
- `prop-relationship-vs-node-data` - Know when data belongs on relationship vs. node
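
A small Cypher sketch of the foreign-key and array rules follows. The `Customer` and `Order` labels, the `PLACED` and `FRIEND_OF` types, and the ids are hypothetical. The first two statements are alternative models of the same order.

```cypher
// Hypothetical labels, types, and ids, for illustration only.

// Foreign-key style: the link exists only as a value and cannot be traversed.
CREATE (:Order {id: 'o-1', customerId: 'c-42'});

// Relationship style: the connection is part of the graph.
CREATE (:Customer {id: 'c-42'})-[:PLACED]->(:Order {id: 'o-1'});

// The same reasoning applies to an id array such as friendIds: ['c-7', 'c-9'].
// One relationship per connection can be traversed directly.
MATCH (a:Customer {id: 'c-42'}), (b:Customer {id: 'c-7'})
MERGE (a)-[:FRIEND_OF]->(b);
```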

4. Query-Driven Refinement (HIGH)

- `query-critical-traversals` - Design for your most critical traversals first
- `query-shortcut-relationships` - Add shortcut relationships for frequent multi-hop queries
- `query-denormalize-reads` - Denormalize for read-heavy paths
- `query-filter-by-rel-props` - Use relationship properties to filter traversals
- `query-test-before-deploy` - Test model against real queries before deploying
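
One possible reading of the shortcut rule, sketched in Cypher. The order model and the `PURCHASED` shortcut type are assumptions invented for this example.

```cypher
// Hypothetical order model, for illustration only.

// Frequent two-hop question: which products has a customer bought?
MATCH (c:Customer {id: 'c-42'})-[:PLACED]->(:Order)-[:CONTAINS]->(p:Product)
RETURN DISTINCT p;

// Materialize a shortcut so the hot query becomes a single hop.
MATCH (c:Customer)-[:PLACED]->(:Order)-[:CONTAINS]->(p:Product)
MERGE (c)-[:PURCHASED]->(p);

// The read path now follows the shortcut directly.
MATCH (c:Customer {id: 'c-42'})-[:PURCHASED]->(p:Product)
RETURN p;
```

The trade-off is extra work on the write path: each new order also has to create or confirm the `PURCHASED` relationship, which is why a shortcut like this is usually reserved for traversals that are actually frequent.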

5. Structural Patterns (HIGH)

- `pattern-intermediary-nodes` - Use intermediary nodes for multi-entity relationships
- `pattern-hierarchy` - Model hierarchies with category nodes and depth relationships
- `pattern-linked-list` - Use linked lists for ordered sequences
- `pattern-timeline-tree` - Apply timeline trees for temporal data
- `pattern-fan-out` - Fan-out pattern for event streams and activity feeds
- `pattern-bipartite` - Use bipartite structure for many-to-many with context
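
As an example of one structural pattern, the linked list could be sketched in Cypher as follows. The `Series` and `Episode` labels and the `FIRST_EPISODE` and `NEXT` types are hypothetical.

```cypher
// Hypothetical series/episode domain, for illustration only.

// Ordered sequence as a linked list with an explicit head pointer.
CREATE (s:Series {title: 'Example Show'}),
       (e1:Episode {title: 'Pilot'}),
       (e2:Episode {title: 'Second'}),
       (e3:Episode {title: 'Third'}),
       (s)-[:FIRST_EPISODE]->(e1),
       (e1)-[:NEXT]->(e2),
       (e2)-[:NEXT]->(e3);

// Walk the sequence from the head; path length gives the position in the list.
MATCH path = (:Series {title: 'Example Show'})-[:FIRST_EPISODE]->(:Episode)-[:NEXT*0..]->(e:Episode)
RETURN e.title AS title
ORDER BY length(path);
```

The order lives in the structure itself, so inserting an episode means rewiring two `NEXT` relationships rather than renumbering a position property on every later node.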

6. Anti-Patterns (MEDIUM)

- `anti-join-table-nodes` - Don't model relational join tables as nodes
- `anti-generic-relationships` - Don't use generic RELATED_TO or CONNECTED relationships
- `anti-relational-porting` - Don't port relational schemas directly to graph
- `anti-over-modeling` - Don't make everything a node
- `anti-duplicate-data` - Don't duplicate data instead of creating relationships
- `anti-string-encoded-structure` - Don't encode structured data as delimited strings
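
A brief Cypher sketch of the delimited-string anti-pattern and one way around it. The `Product` and `Tag` labels, the `TAGGED_AS` type, and the values are made up for illustration. The two statements are alternative models.

```cypher
// Hypothetical product/tag domain, for illustration only.

// Anti-pattern: structure hidden in a delimited string.
CREATE (:Product {sku: 'p-1', tags: 'outdoor|camping|tent'});

// Alternative: each tag is a shared node that other products can point to.
CREATE (p:Product {sku: 'p-1'})
WITH p
UNWIND ['outdoor', 'camping', 'tent'] AS tagName
MERGE (t:Tag {name: tagName})
MERGE (p)-[:TAGGED_AS]->(t);
```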

7. Constraints & Integrity (MEDIUM)

- `constraint-unique-identifiers` - Define uniqueness constraints on natural identifiers
- `constraint-existence` - Use existence constraints for required properties
- `constraint-index-traversals` - Create indexes on traversal entry point properties
- `constraint-no-over-index` - Don't over-index; each index has a write cost
- `constraint-node-key` - Use composite node keys for natural multi-part identifiers
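
The constraint and index rules might look like this in practice. This sketch assumes Neo4j 5 syntax; Memgraph and Amazon Neptune use different commands or do not support all of these. The constraint names, labels, and properties are hypothetical.

```cypher
// Assumes Neo4j 5 syntax; names, labels, and properties are illustrative.

// Uniqueness constraint on a natural identifier.
CREATE CONSTRAINT customer_email_unique IF NOT EXISTS
FOR (c:Customer) REQUIRE c.email IS UNIQUE;

// Existence constraint for a required property (Enterprise Edition in Neo4j).
CREATE CONSTRAINT order_placed_at_exists IF NOT EXISTS
FOR (o:Order) REQUIRE o.placedAt IS NOT NULL;

// Composite node key for a multi-part identifier (Enterprise Edition in Neo4j).
CREATE CONSTRAINT flight_key IF NOT EXISTS
FOR (f:Flight) REQUIRE (f.number, f.departureDate) IS NODE KEY;

// Index on a property used to locate traversal entry points.
CREATE INDEX product_sku_index IF NOT EXISTS
FOR (p:Product) ON (p.sku);
```

Uniqueness and node key constraints are backed by an index in Neo4j, so a separate index on the same properties is not needed.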

8. Scale & Evolution (LOW-MEDIUM)

- `scale-supernode-mitigation` - Mitigate supernodes with fan-out or partitioning
- `scale-temporal-versioning` - Separate current state from historical state
- `scale-schema-migration` - Plan for label and relationship type evolution
- `scale-batch-refactoring` - Use APOC or batched queries for schema refactoring
- `scale-dense-node-detection` - Monitor and detect emerging supernodes
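
For the last two scale rules, a rough Cypher sketch. It assumes Neo4j 5 (`COUNT { }` subqueries, `elementId()`, and `CALL { } IN TRANSACTIONS`) and uses a plain batched query rather than APOC. The `User` and `Country` labels, the `LIVES_IN` type, the limit of 20, and the batch size of 1000 are arbitrary choices for illustration.

```cypher
// Assumes Neo4j 5; labels, limit, and batch size are illustrative.

// Dense node detection: list the nodes with the highest degree.
MATCH (n)
WITH n, COUNT { (n)--() } AS degree
ORDER BY degree DESC
LIMIT 20
RETURN labels(n) AS labels, elementId(n) AS id, degree;

// Batched refactoring: promote a country property to a shared node,
// committing every 1000 rows so the work does not pile up in one transaction.
MATCH (u:User)
WHERE u.country IS NOT NULL
CALL {
  WITH u
  MERGE (c:Country {code: u.country})
  MERGE (u)-[:LIVES_IN]->(c)
  REMOVE u.country
} IN TRANSACTIONS OF 1000 ROWS;
```

The batched statement has to run in an implicit (auto-commit) transaction, for example with the `:auto` prefix in Neo4j Browser. The degree scan touches every node, so on a large graph it is better suited to a scheduled check than to an interactive query.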

How to Use

Read individual reference files for detailed explanations and code examples:

- Section definitions - Category structure and impact levels
- Rule template - Template for adding new rules

Reference Files

File	Description
references/_sections.md	Category definitions and ordering
assets/templates/_template.md	Template for new rules
metadata.json	Version and reference information

Priority	Category	Impact	Prefix
1	Entity Classification	CRITICAL	`entity-`
2	Relationship Design	CRITICAL	`rel-`
3	Property Placement	HIGH	`prop-`
4	Query-Driven Refinement	HIGH	`query-`
5	Structural Patterns	HIGH	`pattern-`
6	Anti-Patterns	MEDIUM	`anti-`
7	Constraints & Integrity	MEDIUM	`constraint-`
8	Scale & Evolution	LOW-MEDIUM	`scale-`