Goblin-mode cypher-linguist
Neo4j and Cypher: graph schema design, query patterns, performance optimisation, PostgreSQL integration.
git clone https://github.com/JasonWarrenUK/goblin-mode
T=$(mktemp -d) && git clone --depth=1 https://github.com/JasonWarrenUK/goblin-mode "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/cypher-linguist" ~/.claude/skills/jasonwarrenuk-goblin-mode-cypher-linguist && rm -rf "$T"
skills/cypher-linguist/SKILL.mdNeo4j/Cypher Mastery
Comprehensive guide to Neo4j graph database and Cypher query language. Covers fundamental concepts, common patterns, performance optimization, schema design, and integration with PostgreSQL/Supabase.
When This Skill Applies
Use this skill when:
- Writing Cypher queries
- Designing graph schemas
- Optimizing graph traversals
- Building recommendation systems
- Modeling hierarchies or networks
- Integrating Neo4j with relational databases
- Questions about graph database patterns
Core Concepts
Nodes, Relationships, Properties
Nodes - Entities (nouns):
// Simple node CREATE (u:User) // Node with properties CREATE (u:User { id: 'user-123', name: 'Alice', email: 'alice@example.com' }) // Multiple labels CREATE (p:Person:Developer {name: 'Bob'})
Relationships - Connections (verbs):
// Simple relationship CREATE (a)-[:FOLLOWS]->(b) // Relationship with properties CREATE (a)-[:FOLLOWS {since: date(), strength: 'strong'}]->(b) // Relationship types are UPPERCASE by convention CREATE (a)-[:MEMBER_OF {role: 'admin'}]->(org)
Properties - Attributes (key-value pairs):
// Node properties { id: 'user-123', name: 'Alice', age: 30, verified: true, createdAt: datetime() } // Relationship properties { since: date(), weight: 0.85, type: 'professional' }
Graph Thinking
Relational mindset:
-- Joins and foreign keys SELECT * FROM users u JOIN follows f ON f.follower_id = u.id JOIN users u2 ON f.followed_id = u2.id WHERE u.id = '123';
Graph mindset:
// Pattern matching MATCH (u:User {id: '123'})-[:FOLLOWS]->(friend) RETURN friend;
Key difference: Relationships are first-class citizens in graphs.
Cypher Fundamentals
MATCH - Finding Patterns
Basic pattern:
// Find all users MATCH (u:User) RETURN u; // Find users with specific property MATCH (u:User {name: 'Alice'}) RETURN u; // Find users matching condition MATCH (u:User) WHERE u.age > 25 RETURN u;
Relationship patterns:
// Outgoing relationship MATCH (a)-[:FOLLOWS]->(b) RETURN a, b; // Incoming relationship MATCH (a)<-[:FOLLOWS]-(b) RETURN a, b; // Any direction MATCH (a)-[:FOLLOWS]-(b) RETURN a, b; // Multiple relationships MATCH (a)-[:FOLLOWS]->(b)-[:FOLLOWS]->(c) RETURN a, b, c; // Variable length MATCH (a)-[:FOLLOWS*1..3]->(b) RETURN a, b;
CREATE - Adding Data
Create nodes:
// Single node CREATE (u:User {id: 'user-123', name: 'Alice'}) RETURN u; // Multiple nodes CREATE (a:User {name: 'Alice'}), (b:User {name: 'Bob'}), (c:User {name: 'Charlie'});
Create relationships:
// Find existing nodes, create relationship MATCH (a:User {name: 'Alice'}) MATCH (b:User {name: 'Bob'}) CREATE (a)-[:FOLLOWS]->(b); // Create nodes and relationships together CREATE (a:User {name: 'Alice'})-[:FOLLOWS]->(b:User {name: 'Bob'});
MERGE - Create or Match
Create if not exists:
// Create user only if doesn't exist MERGE (u:User {id: 'user-123'}) ON CREATE SET u.name = 'Alice', u.createdAt = datetime() ON MATCH SET u.lastSeen = datetime() RETURN u; // Create relationship only if doesn't exist MATCH (a:User {id: 'user-123'}) MATCH (b:User {id: 'user-456'}) MERGE (a)-[r:FOLLOWS]->(b) ON CREATE SET r.since = datetime() RETURN r;
Important: MERGE matches on entire pattern:
// This matches on ALL properties MERGE (u:User {id: 'user-123', name: 'Alice'}) // Better: Match on unique constraint only MERGE (u:User {id: 'user-123'}) SET u.name = 'Alice'
SET - Updating Properties
// Set single property MATCH (u:User {id: 'user-123'}) SET u.name = 'Alicia' RETURN u; // Set multiple properties MATCH (u:User {id: 'user-123'}) SET u.name = 'Alicia', u.verified = true RETURN u; // Set properties from map MATCH (u:User {id: 'user-123'}) SET u += {name: 'Alicia', age: 31} RETURN u; // Add label MATCH (u:User {id: 'user-123'}) SET u:Verified RETURN u;
DELETE - Removing Data
// Delete node (only if no relationships) MATCH (u:User {id: 'user-123'}) DELETE u; // Delete node and all relationships MATCH (u:User {id: 'user-123'}) DETACH DELETE u; // Delete relationship only MATCH (a:User)-[r:FOLLOWS]->(b:User) WHERE a.id = 'user-123' AND b.id = 'user-456' DELETE r; // Delete properties MATCH (u:User {id: 'user-123'}) REMOVE u.age, u.verified RETURN u;
RETURN - Formatting Results
// Return nodes MATCH (u:User) RETURN u; // Return specific properties MATCH (u:User) RETURN u.id, u.name; // Alias properties MATCH (u:User) RETURN u.name AS userName, u.email AS userEmail; // Return count MATCH (u:User) RETURN count(u) AS totalUsers; // Return distinct MATCH (u:User)-[:FOLLOWS]->(friend) RETURN DISTINCT friend.name;
Common Patterns
Social Graph Patterns
Followers/Following:
// Get user's followers MATCH (follower:User)-[:FOLLOWS]->(u:User {id: $userId}) RETURN follower; // Get who user follows MATCH (u:User {id: $userId})-[:FOLLOWS]->(following) RETURN following; // Mutual follows (friends) MATCH (a:User {id: $userId})-[:FOLLOWS]->(b:User) MATCH (b)-[:FOLLOWS]->(a) RETURN b AS friend; // Follow suggestions (friends of friends, not already following) MATCH (u:User {id: $userId})-[:FOLLOWS]->()-[:FOLLOWS]->(suggestion) WHERE NOT (u)-[:FOLLOWS]->(suggestion) AND u <> suggestion RETURN DISTINCT suggestion LIMIT 10;
Blocking:
// Create block relationship MATCH (a:User {id: $userId}) MATCH (b:User {id: $blockUserId}) MERGE (a)-[:BLOCKED]->(b); // Get all users except blocked MATCH (u:User) WHERE NOT (:User {id: $currentUserId})-[:BLOCKED]->(u) AND NOT (u)-[:BLOCKED]->(:User {id: $currentUserId}) RETURN u;
Hierarchy Patterns
Organizational structure:
// Find all reports (direct and indirect) MATCH (manager:Person {id: $managerId})-[:MANAGES*]->(report:Person) RETURN report; // Find direct reports only MATCH (manager:Person {id: $managerId})-[:MANAGES]->(report:Person) RETURN report; // Find manager chain up to CEO MATCH path = (person:Person {id: $personId})-[:REPORTS_TO*]->(ceo:Person) WHERE NOT (ceo)-[:REPORTS_TO]->() RETURN nodes(path); // Find all people in same department MATCH (person:Person {id: $personId})-[:MEMBER_OF]->(dept:Department) MATCH (colleague:Person)-[:MEMBER_OF]->(dept) WHERE person <> colleague RETURN colleague;
Category hierarchies:
// Find all subcategories MATCH (parent:Category {id: $categoryId})-[:PARENT_OF*]->(child:Category) RETURN child; // Find path to root category MATCH path = (cat:Category {id: $categoryId})-[:CHILD_OF*]->(root:Category) WHERE NOT (root)-[:CHILD_OF]->() RETURN nodes(path);
Recommendation Patterns
Collaborative filtering:
// Users who liked similar items MATCH (u:User {id: $userId})-[:LIKED]->(item:Item) MATCH (item)<-[:LIKED]-(other:User) MATCH (other)-[:LIKED]->(recommendation:Item) WHERE NOT (u)-[:LIKED]->(recommendation) RETURN recommendation, count(*) AS score ORDER BY score DESC LIMIT 10; // Weighted recommendations MATCH (u:User {id: $userId})-[r1:LIKED]->(item:Item) MATCH (item)<-[r2:LIKED]-(other:User) MATCH (other)-[r3:LIKED]->(recommendation:Item) WHERE NOT (u)-[:LIKED]->(recommendation) WITH recommendation, sum(r1.weight * r2.weight * r3.weight) AS score RETURN recommendation ORDER BY score DESC LIMIT 10;
Content-based filtering:
// Items similar to liked items MATCH (u:User {id: $userId})-[:LIKED]->(item:Item) MATCH (item)-[:HAS_TAG]->(tag:Tag) MATCH (tag)<-[:HAS_TAG]-(similar:Item) WHERE NOT (u)-[:LIKED]->(similar) AND item <> similar RETURN similar, count(tag) AS commonTags ORDER BY commonTags DESC LIMIT 10;
Path Finding
Shortest path:
// Shortest path between two users MATCH path = shortestPath( (a:User {id: $userId1})-[:FOLLOWS*]-(b:User {id: $userId2}) ) RETURN path, length(path); // All shortest paths MATCH path = allShortestPaths( (a:User {id: $userId1})-[:FOLLOWS*]-(b:User {id: $userId2}) ) RETURN path;
Dijkstra's algorithm (weighted paths):
// Find cheapest route CALL gds.shortestPath.dijkstra.stream('graph', { sourceNode: $startNodeId, targetNode: $endNodeId, relationshipWeightProperty: 'cost' }) YIELD path, totalCost RETURN path, totalCost;
Access Control
Permission hierarchies:
// Check if user has permission MATCH (u:User {id: $userId})-[:HAS_ROLE]->(role:Role) MATCH (role)-[:HAS_PERMISSION*0..]->(permission:Permission {name: $permissionName}) RETURN count(permission) > 0 AS hasPermission; // Get all user permissions (including inherited) MATCH (u:User {id: $userId})-[:HAS_ROLE]->(role:Role) MATCH (role)-[:HAS_PERMISSION*0..]->(permission:Permission) RETURN DISTINCT permission;
Performance Optimization
Indexes and Constraints
Unique constraints (automatically create index):
// Unique user ID CREATE CONSTRAINT user_id_unique FOR (u:User) REQUIRE u.id IS UNIQUE; // Unique email CREATE CONSTRAINT user_email_unique FOR (u:User) REQUIRE u.email IS UNIQUE;
Regular indexes:
// Index on property CREATE INDEX user_name_index FOR (u:User) ON (u.name); // Composite index CREATE INDEX user_location_index FOR (u:User) ON (u.city, u.country); // Full-text search CREATE FULLTEXT INDEX user_search_index FOR (u:User) ON EACH [u.name, u.bio, u.email];
Use indexes:
// Full-text search CALL db.index.fulltext.queryNodes('user_search_index', 'Alice') YIELD node, score RETURN node, score;
Query Optimization
Use PROFILE to analyze:
PROFILE MATCH (u:User {id: $userId})-[:FOLLOWS*1..3]->(friend) RETURN friend;
Optimization tips:
1. Start with most specific nodes:
// ✗ Bad: Starts with all users MATCH (u:User)-[:FOLLOWS]->(friend:User {id: $friendId}) RETURN u; // ✓ Good: Starts with specific user MATCH (u:User)-[:FOLLOWS]->(friend:User) WHERE friend.id = $friendId RETURN u;
2. Limit relationship depth:
// ✗ Bad: Unbounded traversal MATCH (u:User {id: $userId})-[:FOLLOWS*]->(friend) RETURN friend; // ✓ Good: Bounded traversal MATCH (u:User {id: $userId})-[:FOLLOWS*1..3]->(friend) RETURN friend;
3. Use LIMIT early:
// ✓ Good: Limit before expensive operations MATCH (u:User) RETURN u ORDER BY u.createdAt DESC LIMIT 10;
4. Avoid Cartesian products:
// ✗ Bad: Creates cartesian product MATCH (a:User), (b:User) WHERE a.city = b.city RETURN a, b; // ✓ Good: Connect via relationship or property MATCH (a:User)-[:LIVES_IN]->(city:City)<-[:LIVES_IN]-(b:User) RETURN a, b;
Batch Operations
Bulk create:
// Create many nodes efficiently UNWIND $users AS userData MERGE (u:User {id: userData.id}) SET u.name = userData.name, u.email = userData.email; // Create many relationships UNWIND $follows AS follow MATCH (a:User {id: follow.followerId}) MATCH (b:User {id: follow.followedId}) MERGE (a)-[:FOLLOWS {since: follow.since}]->(b);
Use APOC for batching:
// Process in batches of 1000 CALL apoc.periodic.iterate( "MATCH (u:User) RETURN u", "SET u.processed = true", {batchSize: 1000} );
Schema Design
Modeling Guidelines
Nodes: Represent entities
(:User) (:Post) (:Comment) (:Tag)
Relationships: Represent connections
(:User)-[:POSTED]->(:Post) (:User)-[:COMMENTED]->(:Comment) (:Comment)-[:ON]->(:Post) (:Post)-[:TAGGED]->(:Tag)
Properties: Store attributes
// On nodes User {id, name, email, createdAt} // On relationships FOLLOWS {since, strength} LIKED {rating, timestamp}
When to Use Relationships vs Properties
Use relationship when:
- Connection between entities
- Need to query traversals
- Connection has properties
- Many-to-many relationship
// ✓ Good: Relationship (user:User)-[:LIKED {rating: 5}]->(post:Post)
Use property when:
- Simple value
- Doesn't need traversal
- One-to-one relationship
- Rarely queried independently
// ✓ Good: Property (:User {email: 'alice@example.com'})
Multiple Labels
Use multiple labels for:
- Shared behaviour
- Polymorphism
- Categorization
// User can be both Person and Developer CREATE (p:Person:Developer {name: 'Alice'}) // Query specific type MATCH (d:Developer) RETURN d; // Query any person MATCH (p:Person) RETURN p;
Integration with PostgreSQL/Supabase
Shared Primary Keys
Use same UUIDs:
// Create in PostgreSQL const { data: user } = await supabase .from('users') .insert({ id: userId, email: 'alice@example.com', name: 'Alice' }); // Create in Neo4j await neo4j.run(` CREATE (u:User { id: $userId, name: $name }) `, { userId, name: user.name });
Data Synchronization
Event-driven sync:
// PostgreSQL trigger → Sync to Neo4j supabase .from('users') .on('INSERT', async (payload) => { await neo4j.run(` MERGE (u:User {id: $id}) SET u.name = $name, u.email = $email `, payload.record); }) .subscribe();
Batch sync:
// Bulk sync from PostgreSQL to Neo4j const { data: users } = await supabase .from('users') .select('*'); await neo4j.run(` UNWIND $users AS userData MERGE (u:User {id: userData.id}) SET u.name = userData.name, u.email = userData.email `, { users });
Query Patterns
Hybrid queries:
// Get user from PostgreSQL const { data: user } = await supabase .from('users') .select('*') .eq('id', userId) .single(); // Get social graph from Neo4j const result = await neo4j.run(` MATCH (u:User {id: $userId}) MATCH (u)-[:FOLLOWS]->(following:User) MATCH (follower:User)-[:FOLLOWS]->(u) RETURN count(DISTINCT following) as followingCount, count(DISTINCT follower) as followerCount `, { userId }); // Combine results return { ...user, social: { followingCount: result.records[0].get('followingCount'), followerCount: result.records[0].get('followerCount') } };
Portfolio Evidence
KSBs Demonstrated:
- S6: Design and Implement Database Systems (graph modeling)
- S1: Analyse Requirements (choosing graph for relationships)
- S8: Create Analysis Artefacts (query optimization)
How to Document:
- Schema diagrams showing graph structure
- Query examples with performance comparisons
- Explain why graph chosen over relational
- Document traversal patterns
- Show integration with other databases
Success Criteria
Neo4j implementation is successful when:
- Queries leverage graph traversal strengths
- Indexes on frequently queried properties
- Bounded traversals (not unbounded
)* - Clear distinction between nodes/relationships/properties
- Integration with relational database clean
- Performance acceptable for use case
- Schema supports future requirements