Goblin-mode cypher-linguist

Neo4j and Cypher: graph schema design, query patterns, performance optimisation, PostgreSQL integration.

install
source · Clone the upstream repo
git clone https://github.com/JasonWarrenUK/goblin-mode
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/JasonWarrenUK/goblin-mode "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/cypher-linguist" ~/.claude/skills/jasonwarrenuk-goblin-mode-cypher-linguist && rm -rf "$T"
manifest: skills/cypher-linguist/SKILL.md
source content

Neo4j/Cypher Mastery

Comprehensive guide to Neo4j graph database and Cypher query language. Covers fundamental concepts, common patterns, performance optimization, schema design, and integration with PostgreSQL/Supabase.


When This Skill Applies

Use this skill when:

  • Writing Cypher queries
  • Designing graph schemas
  • Optimizing graph traversals
  • Building recommendation systems
  • Modeling hierarchies or networks
  • Integrating Neo4j with relational databases
  • Questions about graph database patterns

Core Concepts

Nodes, Relationships, Properties

Nodes - Entities (nouns):

// Simple node
CREATE (u:User)

// Node with properties
CREATE (u:User {
  id: 'user-123',
  name: 'Alice',
  email: 'alice@example.com'
})

// Multiple labels
CREATE (p:Person:Developer {name: 'Bob'})

Relationships - Connections (verbs):

// Simple relationship
CREATE (a)-[:FOLLOWS]->(b)

// Relationship with properties
CREATE (a)-[:FOLLOWS {since: date(), strength: 'strong'}]->(b)

// Relationship types are UPPERCASE by convention
CREATE (a)-[:MEMBER_OF {role: 'admin'}]->(org)

Properties - Attributes (key-value pairs):

// Node properties
{
  id: 'user-123',
  name: 'Alice',
  age: 30,
  verified: true,
  createdAt: datetime()
}

// Relationship properties
{
  since: date(),
  weight: 0.85,
  type: 'professional'
}

Graph Thinking

Relational mindset:

-- Joins and foreign keys
SELECT * FROM users u
JOIN follows f ON f.follower_id = u.id
JOIN users u2 ON f.followed_id = u2.id
WHERE u.id = '123';

Graph mindset:

// Pattern matching
MATCH (u:User {id: '123'})-[:FOLLOWS]->(friend)
RETURN friend;

Key difference: Relationships are first-class citizens in graphs.


Cypher Fundamentals

MATCH - Finding Patterns

Basic pattern:

// Find all users
MATCH (u:User)
RETURN u;

// Find users with specific property
MATCH (u:User {name: 'Alice'})
RETURN u;

// Find users matching condition
MATCH (u:User)
WHERE u.age > 25
RETURN u;

Relationship patterns:

// Outgoing relationship
MATCH (a)-[:FOLLOWS]->(b)
RETURN a, b;

// Incoming relationship
MATCH (a)<-[:FOLLOWS]-(b)
RETURN a, b;

// Any direction
MATCH (a)-[:FOLLOWS]-(b)
RETURN a, b;

// Multiple relationships
MATCH (a)-[:FOLLOWS]->(b)-[:FOLLOWS]->(c)
RETURN a, b, c;

// Variable length
MATCH (a)-[:FOLLOWS*1..3]->(b)
RETURN a, b;

CREATE - Adding Data

Create nodes:

// Single node
CREATE (u:User {id: 'user-123', name: 'Alice'})
RETURN u;

// Multiple nodes
CREATE
  (a:User {name: 'Alice'}),
  (b:User {name: 'Bob'}),
  (c:User {name: 'Charlie'});

Create relationships:

// Find existing nodes, create relationship
MATCH (a:User {name: 'Alice'})
MATCH (b:User {name: 'Bob'})
CREATE (a)-[:FOLLOWS]->(b);

// Create nodes and relationships together
CREATE (a:User {name: 'Alice'})-[:FOLLOWS]->(b:User {name: 'Bob'});

MERGE - Create or Match

Create if not exists:

// Create user only if doesn't exist
MERGE (u:User {id: 'user-123'})
ON CREATE SET u.name = 'Alice', u.createdAt = datetime()
ON MATCH SET u.lastSeen = datetime()
RETURN u;

// Create relationship only if doesn't exist
MATCH (a:User {id: 'user-123'})
MATCH (b:User {id: 'user-456'})
MERGE (a)-[r:FOLLOWS]->(b)
ON CREATE SET r.since = datetime()
RETURN r;

Important: MERGE matches on entire pattern:

// This matches on ALL properties
MERGE (u:User {id: 'user-123', name: 'Alice'})

// Better: Match on unique constraint only
MERGE (u:User {id: 'user-123'})
SET u.name = 'Alice'

SET - Updating Properties

// Set single property
MATCH (u:User {id: 'user-123'})
SET u.name = 'Alicia'
RETURN u;

// Set multiple properties
MATCH (u:User {id: 'user-123'})
SET u.name = 'Alicia', u.verified = true
RETURN u;

// Set properties from map
MATCH (u:User {id: 'user-123'})
SET u += {name: 'Alicia', age: 31}
RETURN u;

// Add label
MATCH (u:User {id: 'user-123'})
SET u:Verified
RETURN u;

DELETE - Removing Data

// Delete node (only if no relationships)
MATCH (u:User {id: 'user-123'})
DELETE u;

// Delete node and all relationships
MATCH (u:User {id: 'user-123'})
DETACH DELETE u;

// Delete relationship only
MATCH (a:User)-[r:FOLLOWS]->(b:User)
WHERE a.id = 'user-123' AND b.id = 'user-456'
DELETE r;

// Delete properties
MATCH (u:User {id: 'user-123'})
REMOVE u.age, u.verified
RETURN u;

RETURN - Formatting Results

// Return nodes
MATCH (u:User)
RETURN u;

// Return specific properties
MATCH (u:User)
RETURN u.id, u.name;

// Alias properties
MATCH (u:User)
RETURN u.name AS userName, u.email AS userEmail;

// Return count
MATCH (u:User)
RETURN count(u) AS totalUsers;

// Return distinct
MATCH (u:User)-[:FOLLOWS]->(friend)
RETURN DISTINCT friend.name;

Common Patterns

Social Graph Patterns

Followers/Following:

// Get user's followers
MATCH (follower:User)-[:FOLLOWS]->(u:User {id: $userId})
RETURN follower;

// Get who user follows
MATCH (u:User {id: $userId})-[:FOLLOWS]->(following)
RETURN following;

// Mutual follows (friends)
MATCH (a:User {id: $userId})-[:FOLLOWS]->(b:User)
MATCH (b)-[:FOLLOWS]->(a)
RETURN b AS friend;

// Follow suggestions (friends of friends, not already following)
MATCH (u:User {id: $userId})-[:FOLLOWS]->()-[:FOLLOWS]->(suggestion)
WHERE NOT (u)-[:FOLLOWS]->(suggestion)
  AND u <> suggestion
RETURN DISTINCT suggestion
LIMIT 10;

Blocking:

// Create block relationship
MATCH (a:User {id: $userId})
MATCH (b:User {id: $blockUserId})
MERGE (a)-[:BLOCKED]->(b);

// Get all users except blocked
MATCH (u:User)
WHERE NOT (:User {id: $currentUserId})-[:BLOCKED]->(u)
  AND NOT (u)-[:BLOCKED]->(:User {id: $currentUserId})
RETURN u;

Hierarchy Patterns

Organizational structure:

// Find all reports (direct and indirect)
MATCH (manager:Person {id: $managerId})-[:MANAGES*]->(report:Person)
RETURN report;

// Find direct reports only
MATCH (manager:Person {id: $managerId})-[:MANAGES]->(report:Person)
RETURN report;

// Find manager chain up to CEO
MATCH path = (person:Person {id: $personId})-[:REPORTS_TO*]->(ceo:Person)
WHERE NOT (ceo)-[:REPORTS_TO]->()
RETURN nodes(path);

// Find all people in same department
MATCH (person:Person {id: $personId})-[:MEMBER_OF]->(dept:Department)
MATCH (colleague:Person)-[:MEMBER_OF]->(dept)
WHERE person <> colleague
RETURN colleague;

Category hierarchies:

// Find all subcategories
MATCH (parent:Category {id: $categoryId})-[:PARENT_OF*]->(child:Category)
RETURN child;

// Find path to root category
MATCH path = (cat:Category {id: $categoryId})-[:CHILD_OF*]->(root:Category)
WHERE NOT (root)-[:CHILD_OF]->()
RETURN nodes(path);

Recommendation Patterns

Collaborative filtering:

// Users who liked similar items
MATCH (u:User {id: $userId})-[:LIKED]->(item:Item)
MATCH (item)<-[:LIKED]-(other:User)
MATCH (other)-[:LIKED]->(recommendation:Item)
WHERE NOT (u)-[:LIKED]->(recommendation)
RETURN recommendation, count(*) AS score
ORDER BY score DESC
LIMIT 10;

// Weighted recommendations
MATCH (u:User {id: $userId})-[r1:LIKED]->(item:Item)
MATCH (item)<-[r2:LIKED]-(other:User)
MATCH (other)-[r3:LIKED]->(recommendation:Item)
WHERE NOT (u)-[:LIKED]->(recommendation)
WITH recommendation,
     sum(r1.weight * r2.weight * r3.weight) AS score
RETURN recommendation
ORDER BY score DESC
LIMIT 10;

Content-based filtering:

// Items similar to liked items
MATCH (u:User {id: $userId})-[:LIKED]->(item:Item)
MATCH (item)-[:HAS_TAG]->(tag:Tag)
MATCH (tag)<-[:HAS_TAG]-(similar:Item)
WHERE NOT (u)-[:LIKED]->(similar)
  AND item <> similar
RETURN similar, count(tag) AS commonTags
ORDER BY commonTags DESC
LIMIT 10;

Path Finding

Shortest path:

// Shortest path between two users
MATCH path = shortestPath(
  (a:User {id: $userId1})-[:FOLLOWS*]-(b:User {id: $userId2})
)
RETURN path, length(path);

// All shortest paths
MATCH path = allShortestPaths(
  (a:User {id: $userId1})-[:FOLLOWS*]-(b:User {id: $userId2})
)
RETURN path;

Dijkstra's algorithm (weighted paths):

// Find cheapest route
CALL gds.shortestPath.dijkstra.stream('graph', {
  sourceNode: $startNodeId,
  targetNode: $endNodeId,
  relationshipWeightProperty: 'cost'
})
YIELD path, totalCost
RETURN path, totalCost;

Access Control

Permission hierarchies:

// Check if user has permission
MATCH (u:User {id: $userId})-[:HAS_ROLE]->(role:Role)
MATCH (role)-[:HAS_PERMISSION*0..]->(permission:Permission {name: $permissionName})
RETURN count(permission) > 0 AS hasPermission;

// Get all user permissions (including inherited)
MATCH (u:User {id: $userId})-[:HAS_ROLE]->(role:Role)
MATCH (role)-[:HAS_PERMISSION*0..]->(permission:Permission)
RETURN DISTINCT permission;

Performance Optimization

Indexes and Constraints

Unique constraints (automatically create index):

// Unique user ID
CREATE CONSTRAINT user_id_unique
FOR (u:User) REQUIRE u.id IS UNIQUE;

// Unique email
CREATE CONSTRAINT user_email_unique
FOR (u:User) REQUIRE u.email IS UNIQUE;

Regular indexes:

// Index on property
CREATE INDEX user_name_index
FOR (u:User) ON (u.name);

// Composite index
CREATE INDEX user_location_index
FOR (u:User) ON (u.city, u.country);

// Full-text search
CREATE FULLTEXT INDEX user_search_index
FOR (u:User) ON EACH [u.name, u.bio, u.email];

Use indexes:

// Full-text search
CALL db.index.fulltext.queryNodes('user_search_index', 'Alice')
YIELD node, score
RETURN node, score;

Query Optimization

Use PROFILE to analyze:

PROFILE
MATCH (u:User {id: $userId})-[:FOLLOWS*1..3]->(friend)
RETURN friend;

Optimization tips:

1. Start with most specific nodes:

// ✗ Bad: Starts with all users
MATCH (u:User)-[:FOLLOWS]->(friend:User {id: $friendId})
RETURN u;

// ✓ Good: Starts with specific user
MATCH (u:User)-[:FOLLOWS]->(friend:User)
WHERE friend.id = $friendId
RETURN u;

2. Limit relationship depth:

// ✗ Bad: Unbounded traversal
MATCH (u:User {id: $userId})-[:FOLLOWS*]->(friend)
RETURN friend;

// ✓ Good: Bounded traversal
MATCH (u:User {id: $userId})-[:FOLLOWS*1..3]->(friend)
RETURN friend;

3. Use LIMIT early:

// ✓ Good: Limit before expensive operations
MATCH (u:User)
RETURN u
ORDER BY u.createdAt DESC
LIMIT 10;

4. Avoid Cartesian products:

// ✗ Bad: Creates cartesian product
MATCH (a:User), (b:User)
WHERE a.city = b.city
RETURN a, b;

// ✓ Good: Connect via relationship or property
MATCH (a:User)-[:LIVES_IN]->(city:City)<-[:LIVES_IN]-(b:User)
RETURN a, b;

Batch Operations

Bulk create:

// Create many nodes efficiently
UNWIND $users AS userData
MERGE (u:User {id: userData.id})
SET u.name = userData.name, u.email = userData.email;

// Create many relationships
UNWIND $follows AS follow
MATCH (a:User {id: follow.followerId})
MATCH (b:User {id: follow.followedId})
MERGE (a)-[:FOLLOWS {since: follow.since}]->(b);

Use APOC for batching:

// Process in batches of 1000
CALL apoc.periodic.iterate(
  "MATCH (u:User) RETURN u",
  "SET u.processed = true",
  {batchSize: 1000}
);

Schema Design

Modeling Guidelines

Nodes: Represent entities

(:User)
(:Post)
(:Comment)
(:Tag)

Relationships: Represent connections

(:User)-[:POSTED]->(:Post)
(:User)-[:COMMENTED]->(:Comment)
(:Comment)-[:ON]->(:Post)
(:Post)-[:TAGGED]->(:Tag)

Properties: Store attributes

// On nodes
User {id, name, email, createdAt}

// On relationships
FOLLOWS {since, strength}
LIKED {rating, timestamp}

When to Use Relationships vs Properties

Use relationship when:

  • Connection between entities
  • Need to query traversals
  • Connection has properties
  • Many-to-many relationship
// ✓ Good: Relationship
(user:User)-[:LIKED {rating: 5}]->(post:Post)

Use property when:

  • Simple value
  • Doesn't need traversal
  • One-to-one relationship
  • Rarely queried independently
// ✓ Good: Property
(:User {email: 'alice@example.com'})

Multiple Labels

Use multiple labels for:

  • Shared behaviour
  • Polymorphism
  • Categorization
// User can be both Person and Developer
CREATE (p:Person:Developer {name: 'Alice'})

// Query specific type
MATCH (d:Developer)
RETURN d;

// Query any person
MATCH (p:Person)
RETURN p;

Integration with PostgreSQL/Supabase

Shared Primary Keys

Use same UUIDs:

// Create in PostgreSQL
const { data: user } = await supabase
  .from('users')
  .insert({
    id: userId,
    email: 'alice@example.com',
    name: 'Alice'
  });

// Create in Neo4j
await neo4j.run(`
  CREATE (u:User {
    id: $userId,
    name: $name
  })
`, { userId, name: user.name });

Data Synchronization

Event-driven sync:

// PostgreSQL trigger → Sync to Neo4j
supabase
  .from('users')
  .on('INSERT', async (payload) => {
    await neo4j.run(`
      MERGE (u:User {id: $id})
      SET u.name = $name, u.email = $email
    `, payload.record);
  })
  .subscribe();

Batch sync:

// Bulk sync from PostgreSQL to Neo4j
const { data: users } = await supabase
  .from('users')
  .select('*');

await neo4j.run(`
  UNWIND $users AS userData
  MERGE (u:User {id: userData.id})
  SET u.name = userData.name,
      u.email = userData.email
`, { users });

Query Patterns

Hybrid queries:

// Get user from PostgreSQL
const { data: user } = await supabase
  .from('users')
  .select('*')
  .eq('id', userId)
  .single();

// Get social graph from Neo4j
const result = await neo4j.run(`
  MATCH (u:User {id: $userId})
  MATCH (u)-[:FOLLOWS]->(following:User)
  MATCH (follower:User)-[:FOLLOWS]->(u)
  RETURN
    count(DISTINCT following) as followingCount,
    count(DISTINCT follower) as followerCount
`, { userId });

// Combine results
return {
  ...user,
  social: {
    followingCount: result.records[0].get('followingCount'),
    followerCount: result.records[0].get('followerCount')
  }
};

Portfolio Evidence

KSBs Demonstrated:

  • S6: Design and Implement Database Systems (graph modeling)
  • S1: Analyse Requirements (choosing graph for relationships)
  • S8: Create Analysis Artefacts (query optimization)

How to Document:

  • Schema diagrams showing graph structure
  • Query examples with performance comparisons
  • Explain why graph chosen over relational
  • Document traversal patterns
  • Show integration with other databases

Success Criteria

Neo4j implementation is successful when:

  • Queries leverage graph traversal strengths
  • Indexes on frequently queried properties
  • Bounded traversals (not unbounded
    *
    )
  • Clear distinction between nodes/relationships/properties
  • Integration with relational database clean
  • Performance acceptable for use case
  • Schema supports future requirements