Claude-skill-registry-data memory-design-patterns
Best practices for memory architecture design including user vs agent vs session memory patterns, vector vs graph memory tradeoffs, retention strategies, and performance optimization. Use when designing memory systems, architecting AI memory layers, choosing memory types, planning retention strategies, or when user mentions memory architecture, user memory, agent memory, session memory, memory patterns, vector storage, graph memory, or Mem0 architecture.
git clone https://github.com/majiayu000/claude-skill-registry-data
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry-data "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/memory-design-patterns" ~/.claude/skills/majiayu000-claude-skill-registry-data-memory-design-patterns && rm -rf "$T"
data/memory-design-patterns/SKILL.md
Memory Design Patterns
Production-ready memory architecture patterns for AI applications using Mem0. This skill provides comprehensive guidance on designing scalable, performant memory systems with proper isolation, retention strategies, and optimization techniques.
Instructions
Phase 1: Understand Memory Types
Mem0 provides three distinct memory scopes, each serving different purposes:
1. User Memory (Persistent Preferences & Profile)
Purpose: Long-term personal preferences, profile data, and user characteristics that persist across all interactions.
Use Cases:
- User preferences (dietary restrictions, communication style, language preferences)
- Personal information (location, occupation, family details)
- Long-term goals and interests
- Historical context that should persist indefinitely
Implementation:
    # Add user-level memory
    memory.add(
        "User prefers concise responses without technical jargon",
        user_id="customer_bob",
    )

    # Search user memories
    user_context = memory.search(
        "communication style",
        user_id="customer_bob",
    )
Key Characteristics:
- Persists indefinitely (or until explicitly deleted)
- Shared across all agents interacting with this user
- Should contain stable, long-term information
- Typically 10-50 memories per user
2. Agent Memory (Agent-Specific Context)
Purpose: Agent-specific knowledge, behaviors, and learned patterns that apply across all users interacting with this agent.
Use Cases:
- Agent capabilities and limitations
- Domain-specific knowledge
- Learned behaviors and patterns
- Agent-specific instructions and protocols
Implementation:
    # Add agent-level memory
    memory.add(
        "When handling refund requests, always check order date first",
        agent_id="support_agent_v2",
    )

    # Search agent memories
    agent_context = memory.search(
        "refund process",
        agent_id="support_agent_v2",
    )
Key Characteristics:
- Shared across all users interacting with this agent
- Contains agent-specific procedures and knowledge
- Moderate retention (days to months)
- Typically 50-200 memories per agent
3. Session/Run Memory (Temporary Conversation Context)
Purpose: Ephemeral context specific to a single conversation or task session.
Use Cases:
- Current conversation topic
- Temporary task context
- Session-specific state
- Short-term working memory
Implementation:
    # Add session-level memory
    memory.add(
        "Current issue: payment failed with error code 402",
        run_id="session_12345_20250115",
    )

    # Search session memories
    session_context = memory.search(
        "current issue",
        run_id="session_12345_20250115",
    )
Key Characteristics:
- Short-lived (minutes to hours)
- Isolated to specific conversation or task
- Should be cleaned up after session ends
- Typically 5-20 memories per session
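The three scopes above differ only in which identifier is passed to `memory.add` and `memory.search`. As a minimal sketch, the hypothetical `MemoryScope` helper below (not part of Mem0) collects the identifiers in one place and refuses to build an unscoped call. The class name and its `as_kwargs` method are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MemoryScope:
    """Hypothetical helper: holds the identifiers that select a memory scope."""

    user_id: Optional[str] = None
    agent_id: Optional[str] = None
    run_id: Optional[str] = None

    def as_kwargs(self) -> Dict[str, str]:
        """Return only the identifiers that are set, ready to splat into a call."""
        candidates = {
            "user_id": self.user_id,
            "agent_id": self.agent_id,
            "run_id": self.run_id,
        }
        kwargs = {name: value for name, value in candidates.items() if value}
        if not kwargs:
            # An unscoped call would mix memories from every user and session.
            raise ValueError("a memory call needs at least one scope identifier")
        return kwargs


user_scope = MemoryScope(user_id="customer_bob")
session_scope = MemoryScope(user_id="customer_bob", run_id="session_12345_20250115")

# Intended use (not executed here):
#   memory.search("communication style", **user_scope.as_kwargs())
print(user_scope.as_kwargs())
print(session_scope.as_kwargs())
```

Building the keyword arguments in one place makes it harder to forget a scope identifier on an individual call.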
Phase 2: Choose Storage Backend (Vector vs Graph)
Vector Memory (Default)
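How It Works: Each memory is converted to an embedding and stored in a vector database. Retrieval is a semantic similarity search that ranks stored embeddings by cosine distance to the query embedding.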
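To make the ranking step concrete, here is a minimal pure-Python sketch of cosine-based retrieval. The three-dimensional vectors are toy values invented for illustration (real embeddings have hundreds or thousands of dimensions), and `cosine_similarity` and `top_k` are hypothetical helper names, not Mem0 functions. Cosine distance is simply one minus the similarity computed here.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    stored: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Rank stored (text, vector) pairs by similarity to the query, best first."""
    scored = [(text, cosine_similarity(query, vector)) for text, vector in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings, invented for this example.
stored_memories = [
    ("prefers concise responses", [0.9, 0.1, 0.0]),
    ("is vegetarian", [0.0, 0.2, 0.9]),
    ("dislikes jargon", [0.8, 0.3, 0.1]),
]
query_vector = [1.0, 0.2, 0.0]

for text, score in top_k(query_vector, stored_memories):
    print(f"{score:.3f}  {text}")
```

A production vector database replaces the linear scan with an approximate nearest-neighbor index, but the scoring idea is the same.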
Strengths:
- Fast semantic search
- Excellent for unstructured data
- Low setup complexity
- Works out-of-the-box with Mem0
Weaknesses:
- Cannot query relationships
- No explicit entity connections
- Limited reasoning about connections
Best For:
- Simple preference storage
- Document/chunk retrieval
- Semantic search use cases
- Quick prototyping
Configuration:
    from mem0 import Memory

    # Default vector-only configuration
    memory = Memory()
Graph Memory (Advanced)
How It Works: Entities and relationships are stored in a graph database (Neo4j or Memgraph), which enables relationship traversal and complex queries.
Strengths:
- Explicit entity relationships
- Complex query capabilities
- Relationship reasoning
- Multi-hop traversal
Weaknesses:
- Requires graph database setup
- Higher infrastructure complexity
- Slower for pure semantic search
- More storage overhead
Best For:
- Multi-entity systems
- Relationship-heavy domains
- Complex reasoning requirements
- Enterprise knowledge graphs
Configuration:
    from mem0 import Memory
    from mem0.configs.base import MemoryConfig

    # Placeholder connection details; replace with your own Neo4j credentials
    config = MemoryConfig(
        graph_store={
            "provider": "neo4j",
            "config": {
                "url": "bolt://localhost:7687",
                "username": "neo4j",
                "password": "password",
            },
        }
    )

    memory = Memory(config)
Decision Matrix:
| Use Case | Vector | Graph |
|---|---|---|
| User preferences | ✅ Best | ⚠️ Overkill |
| Product recommendations | ✅ Best | ⚠️ Overkill |
| Customer support | ✅ Good | ✅ Better |
| Knowledge management | ⚠️ Limited | ✅ Best |
| Multi-tenant systems | ✅ Good | ✅ Best |
| Team collaboration | ⚠️ Limited | ✅ Best |
Phase 3: Design Retention Strategy
Use the retention strategy template:
bash scripts/generate-retention-policy.sh <memory-type> <retention-days>
Retention Guidelines
User Memory:
- Retention: Indefinite (with user control)
- Cleanup: User-initiated deletion only
- Archival: After 1 year of inactivity
- GDPR: Must support right to deletion
Agent Memory:
- Retention: 90-180 days typical
- Cleanup: Automatic based on relevance score
- Versioning: Keep agent version history
- Deprecation: Clear old agent memories on major updates
Session Memory:
- Retention: 1-24 hours
- Cleanup: Automatic after session end
- Conversion: Promote important memories to user/agent level
- Storage: Consider in-memory for very short sessions
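The retention windows above can be expressed as a small lookup plus an age check. This is a sketch only: `MAX_AGE` and `is_expired` are hypothetical names, and the sketch takes the upper bound of each stated range (180 days for agent memory, 24 hours for session memory) as the cutoff. User memory is treated as non-expiring because deletion is user-initiated.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

# Upper bounds taken from the guidelines above.
# None means "keep until the user deletes it".
MAX_AGE: Dict[str, Optional[timedelta]] = {
    "user": None,
    "agent": timedelta(days=180),
    "session": timedelta(hours=24),
}


def is_expired(
    memory_type: str,
    created_at: datetime,
    now: Optional[datetime] = None,
) -> bool:
    """Return True when a memory is older than the window for its type."""
    max_age = MAX_AGE[memory_type]  # raises KeyError for an unknown type
    if max_age is None:
        return False
    current = now if now is not None else datetime.now(timezone.utc)
    return current - created_at > max_age


reference = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
print(is_expired("session", reference - timedelta(hours=30), now=reference))
print(is_expired("agent", reference - timedelta(days=90), now=reference))
print(is_expired("user", reference - timedelta(days=900), now=reference))
```

Passing `now` explicitly keeps the check deterministic in tests and in scheduled cleanup jobs.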
Retention Implementation
Run the retention analyzer:
bash scripts/analyze-retention.sh <user_id_or_agent_id>
This script:
- Analyzes memory age and access patterns
- Identifies stale memories
- Suggests cleanup actions
- Generates retention reports
Phase 4: Implement Multi-Level Memory Pattern
Pattern: Combine all three memory types for comprehensive context.
Template: Use templates/multi-level-memory-pattern.py
Architecture:
Query Processing Flow:
1. Retrieve session context (immediate)
2. Retrieve user context (preferences)
3. Retrieve agent context (capabilities)
4. Merge contexts with priority weighting
5. Generate response with full context
Priority Weighting:
- Session: 40% weight (most relevant to current task)
- User: 35% weight (personalizes response)
- Agent: 25% weight (ensures consistent behavior)
Implementation:
    # Retrieve all context levels
    session_memories = memory.search(query, run_id=run_id)
    user_memories = memory.search(query, user_id=user_id)
    agent_memories = memory.search(query, agent_id=agent_id)

    # Weighted merge (merge_contexts is an application-level helper, not a Mem0 call)
    context = merge_contexts(
        session=session_memories,
        user=user_memories,
        agent=agent_memories,
        weights={"session": 0.4, "user": 0.35, "agent": 0.25},
    )
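One possible shape for the `merge_contexts` helper is sketched below. It assumes each search hit is a dict with a `"memory"` text and a `"score"` where higher means more relevant. Both the hit shape and the scoring direction are assumptions, and depending on the Mem0 version the hits may need to be unwrapped from a `"results"` key first. The `limit` parameter is also an addition for illustration.

```python
from typing import Any, Dict, List, Optional

DEFAULT_WEIGHTS = {"session": 0.4, "user": 0.35, "agent": 0.25}


def merge_contexts(
    session: List[Dict[str, Any]],
    user: List[Dict[str, Any]],
    agent: List[Dict[str, Any]],
    weights: Optional[Dict[str, float]] = None,
    limit: int = 5,
) -> List[Dict[str, Any]]:
    """Scale each hit's score by its level weight and return the best hits."""
    level_weights = weights if weights is not None else DEFAULT_WEIGHTS
    best: Dict[str, Dict[str, Any]] = {}
    for level, hits in (("session", session), ("user", user), ("agent", agent)):
        for hit in hits:
            text = hit["memory"]
            weighted = hit["score"] * level_weights[level]
            # If the same text appears at several levels, keep the strongest copy.
            if text not in best or weighted > best[text]["weighted_score"]:
                best[text] = {
                    "memory": text,
                    "level": level,
                    "weighted_score": weighted,
                }
    ranked = sorted(best.values(), key=lambda h: h["weighted_score"], reverse=True)
    return ranked[:limit]


merged = merge_contexts(
    session=[{"memory": "Current issue: payment failed with error code 402", "score": 0.9}],
    user=[{"memory": "User prefers concise responses without technical jargon", "score": 0.8}],
    agent=[{"memory": "When handling refund requests, always check order date first", "score": 0.7}],
)
for hit in merged:
    print(hit["level"], round(hit["weighted_score"], 3), hit["memory"])
```

Multiplying scores by a per-level weight is only one way to apply the priorities; a fixed quota of results per level would be a reasonable alternative.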
Phase 5: Optimize Performance
Vector Search Optimization
Run the performance analyzer:
bash scripts/analyze-memory-performance.sh <project_name>
Optimization Techniques:
- Limit Search Results:

      memories = memory.search(query, user_id=user_id, limit=5)

  - Default: 10 results
  - Recommended: 3-5 for chat, 10-20 for RAG
- Use Filters to Reduce Search Space:

      memories = memory.search(
          query,
          filters={
              "AND": [
                  {"user_id": "alex"},
                  {"agent_id": "support_agent"},
              ]
          },
      )

- Cache Frequently Accessed Memories:
  - Cache user preferences (rarely change)
  - Refresh cache every 5-10 minutes
  - Invalidate on explicit memory updates
- Batch Operations:

      # Add multiple memories in one call
      memory.add(messages, user_id=user_id)
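The caching advice in the list above (refresh every 5-10 minutes, invalidate on explicit updates) can be sketched with a small time-to-live cache. `TTLCache` is a hypothetical class written for this example. The `loader` callable stands in for whatever performs the real lookup, such as a `memory.search` call.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Hypothetical time-to-live cache for rarely changing memories."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value, calling loader() only when missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = loader()
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry, for example right after an explicit memory update."""
        self._entries.pop(key, None)


# Intended use (not executed here):
#   preferences = cache.get_or_load(
#       user_id, lambda: memory.search("preferences", user_id=user_id)
#   )
cache = TTLCache(ttl_seconds=300.0)
print(cache.get_or_load("customer_bob", lambda: ["prefers concise responses"]))
```

Keying the cache by `user_id` keeps cached entries scoped the same way the underlying memories are.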
Graph Query Optimization
For graph memory:
- Limit Traversal Depth: Max 2-3 hops
- Index Key Properties: user_id, agent_id, timestamps
- Use Relationship Filters: Reduce unnecessary traversals
- Monitor Query Performance: Track slow queries > 100ms
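To illustrate why a hop limit bounds the work, here is a minimal breadth-first traversal over an in-memory adjacency map. The graph contents and the `neighbors_within` function are invented for this sketch. In a real graph database the same limit would typically be expressed in the query itself, for example with a bounded variable-length path in Cypher.

```python
from collections import deque
from typing import Dict, List, Tuple

# node -> list of (relationship, neighbor); toy data for illustration only
Graph = Dict[str, List[Tuple[str, str]]]


def neighbors_within(graph: Graph, start: str, max_hops: int = 2) -> Dict[str, int]:
    """Return {node: hop_count} for every node reachable within max_hops of start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the depth limit
        for _relationship, neighbor in graph.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[start]
    return hops


toy_graph: Graph = {
    "alice": [("works_on", "project_x")],
    "project_x": [("owned_by", "team_payments")],
    "team_payments": [("managed_by", "carol")],
    "carol": [("reports_to", "dave")],
}

print(neighbors_within(toy_graph, "alice", max_hops=2))
print(neighbors_within(toy_graph, "alice", max_hops=3))
```

Each extra hop can multiply the number of visited nodes by the average fan-out, which is why a limit of 2-3 hops keeps queries predictable.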
Phase 6: Implement Cost Optimization
Run the cost analyzer:
bash scripts/analyze-memory-costs.sh <user_id> <date_range>
Cost Optimization Strategies:
- Deduplication: Remove similar/redundant memories

      bash scripts/deduplicate-memories.sh <user_id>

- Archival: Move old memories to cold storage
  - Active: Last 30 days (vector DB)
  - Archive: 30-180 days (compressed JSON)
  - Long-term: > 180 days (S3/cold storage)
- Compression: Use shorter embeddings for less critical memories
  - Critical: 1536 dimensions (for example, OpenAI text-embedding-3-small at its default size)
  - Standard: 768 dimensions (for example, a text-embedding-3 model shortened with the dimensions parameter)
  - Archival: 384 dimensions (lightweight model)
- Smart Pruning: Remove low-value memories
  - Score-based: Keep only high relevance scores
  - Access-based: Remove never-accessed memories
  - Importance-based: User/agent priority tagging
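Two of the strategies above, archival tiers and deduplication, are easy to sketch with the standard library. The function names and the 0.9 similarity threshold are assumptions. The sketch compares strings lexically with `difflib` purely as a stand-in; a production deduplicator would more likely compare embeddings. It does not describe how scripts/deduplicate-memories.sh works.

```python
from datetime import datetime, timedelta, timezone
from difflib import SequenceMatcher
from typing import List


def storage_tier(last_accessed: datetime, now: datetime) -> str:
    """Map a memory's age to the tiers listed above."""
    age = now - last_accessed
    if age <= timedelta(days=30):
        return "active"      # stays in the vector DB
    if age <= timedelta(days=180):
        return "archive"     # compressed JSON
    return "long_term"       # cold storage


def deduplicate(memories: List[str], threshold: float = 0.9) -> List[str]:
    """Keep the first occurrence of each group of near-identical memories."""
    kept: List[str] = []
    for text in memories:
        normalized = text.strip().lower()
        is_duplicate = any(
            SequenceMatcher(None, normalized, other.strip().lower()).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(text)
    return kept


reference = datetime(2025, 1, 15, tzinfo=timezone.utc)
print(storage_tier(reference - timedelta(days=10), reference))
print(storage_tier(reference - timedelta(days=90), reference))
print(deduplicate(["User is vegetarian", "user is vegetarian.", "User lives in Berlin"]))
```

The pairwise comparison is quadratic in the number of memories, which is acceptable at the per-user sizes mentioned earlier but would need an index at larger scale.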
Phase 7: Security and Isolation
Multi-Tenant Isolation
Pattern: Ensure complete data isolation between users/organizations.
Implementation:
    # Validate access before retrieval
    # (user_has_access is an application-defined check, not a Mem0 call)
    if not user_has_access(current_user_id, requested_user_id):
        raise PermissionError("Access denied")

    # Always scope by user_id or org_id
    memories = memory.search(
        query,
        filters={"user_id": requested_user_id},
    )
Security Checklist:
- ✅ Never allow cross-user memory access
- ✅ Validate all user_id parameters
- ✅ Implement org-level isolation for multi-tenant apps
- ✅ Audit memory access logs
- ✅ Encrypt sensitive memory content
- ✅ Support GDPR right to deletion
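The first two checklist items can be enforced in code by routing every lookup through a wrapper that checks access before it touches the store. The sketch below uses a hypothetical `InMemoryStore` as a stand-in backend (a real system would call the memory service) and a hypothetical `ScopedMemory` wrapper. The default policy, where users may only read their own memories, is an assumption.

```python
from typing import Callable, Dict, List, Optional


class InMemoryStore:
    """Stand-in backend for this sketch; keyword match instead of semantic search."""

    def __init__(self) -> None:
        self._by_user: Dict[str, List[str]] = {}

    def add(self, text: str, user_id: str) -> None:
        self._by_user.setdefault(user_id, []).append(text)

    def search(self, query: str, user_id: str) -> List[str]:
        return [t for t in self._by_user.get(user_id, []) if query.lower() in t.lower()]


def same_user_only(current_user_id: str, requested_user_id: str) -> bool:
    """Assumed default policy: a user can read only their own memories."""
    return current_user_id == requested_user_id


class ScopedMemory:
    """Hypothetical wrapper that validates access before every retrieval."""

    def __init__(
        self,
        store: InMemoryStore,
        current_user_id: str,
        user_has_access: Callable[[str, str], bool] = same_user_only,
    ) -> None:
        self._store = store
        self._current_user_id = current_user_id
        self._user_has_access = user_has_access

    def search(self, query: str, requested_user_id: Optional[str] = None) -> List[str]:
        target = requested_user_id or self._current_user_id
        if not self._user_has_access(self._current_user_id, target):
            raise PermissionError("Access denied")
        return self._store.search(query, user_id=target)


store = InMemoryStore()
store.add("Alice is allergic to peanuts", user_id="alice")
store.add("Bob prefers concise responses", user_id="bob")

bob_view = ScopedMemory(store, current_user_id="bob")
print(bob_view.search("prefers"))
```

A test that requests another user's memories and expects `PermissionError` is a cheap way to verify isolation on every build.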
Run the security audit:
bash scripts/audit-memory-security.sh
Decision Trees
When to Use Each Memory Type
Use the decision helper:
bash scripts/suggest-memory-type.sh "<use_case_description>"
Quick Reference:
- User dietary preferences → User Memory
- Agent's SOP for task X → Agent Memory
- Current conversation topic → Session Memory
- Customer support ticket details → Session Memory (promote to User if resolved)
- System capabilities → Agent Memory
- User's birthday → User Memory
Vector vs Graph Decision
Use the architecture advisor:
bash scripts/suggest-storage-architecture.sh "<project_description>"
Decision Criteria:
- Need relationship traversal? → Graph
- Pure semantic search? → Vector
- < 10,000 memories total? → Vector
- Complex entity relationships? → Graph
- Team/org hierarchies? → Graph
- Simple preference storage? → Vector
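The criteria above can be read as a simple rule: any relationship-oriented requirement points to graph, otherwise vector. The sketch below encodes that reading. `suggest_storage` is a hypothetical function, and letting relationship needs take precedence over the 10,000-memory guideline is an assumption; it is not a description of scripts/suggest-storage-architecture.sh.

```python
from typing import List, Tuple


def suggest_storage(
    needs_relationship_traversal: bool = False,
    complex_entity_relationships: bool = False,
    org_hierarchies: bool = False,
    total_memories: int = 0,
) -> Tuple[str, List[str]]:
    """Return (backend, reasons) following the decision criteria above."""
    reasons: List[str] = []
    if needs_relationship_traversal:
        reasons.append("relationship traversal")
    if complex_entity_relationships:
        reasons.append("complex entity relationships")
    if org_hierarchies:
        reasons.append("team/org hierarchies")
    if reasons:
        # Assumption: relationship needs outweigh the size guideline.
        return "graph", reasons
    if total_memories < 10_000:
        return "vector", ["fewer than 10,000 memories"]
    return "vector", ["pure semantic search"]


print(suggest_storage(total_memories=2_000))
print(suggest_storage(org_hierarchies=True, total_memories=2_000))
```

Returning the reasons alongside the recommendation makes the decision easy to record in an architecture note.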
Key Files
Scripts (all functional, not placeholders):
- scripts/generate-retention-policy.sh: Create retention policy configs
- scripts/analyze-retention.sh: Analyze memory age and access patterns
- scripts/analyze-memory-performance.sh: Performance profiling
- scripts/analyze-memory-costs.sh: Cost analysis and optimization suggestions
- scripts/deduplicate-memories.sh: Find and remove duplicate memories
- scripts/audit-memory-security.sh: Security compliance checking
- scripts/suggest-memory-type.sh: Interactive memory type advisor
- scripts/suggest-storage-architecture.sh: Architecture recommendation tool
Templates:
- templates/multi-level-memory-pattern.py: Complete implementation
- templates/retention-policy.yaml: Retention configuration
- templates/vector-only-config.py: Vector memory setup
- templates/graph-memory-config.py: Graph memory setup
- templates/hybrid-architecture.py: Vector + Graph combined
- templates/cost-optimization-config.yaml: Cost optimization settings
Examples:
- examples/customer-support-memory-architecture.md: Full implementation guide
- examples/multi-agent-collaboration.md: Shared memory patterns
- examples/e-commerce-personalization.md: Product recommendation memory
- examples/healthcare-assistant.md: HIPAA-compliant memory architecture
Best Practices
- Start Simple: Use vector-only with user + session memories
- Add Complexity as Needed: Only introduce graph when relationships matter
- Monitor Performance: Track memory retrieval times and costs
- Implement Retention Early: Don't let memory grow unbounded
- Test Isolation: Verify cross-user memory access is impossible
- Document Memory Schema: Track what memories mean and when they're used
- Version Agent Memories: Clear separation between agent versions
- Promote Important Memories: Session → User when patterns emerge
- Use Metadata: Tag memories with categories for better filtering
- Regular Audits: Monthly review of memory growth and costs
Troubleshooting
Slow Memory Retrieval:
- Reduce search limit
- Add more specific filters
- Check vector index performance
- Consider caching
High Costs:
- Run cost analyzer script
- Implement deduplication
- Review retention policy
- Archive old memories
Poor Search Results:
- Check embedding model quality
- Verify memory content is descriptive
- Use hybrid search (keyword + semantic)
- Add metadata for filtering
Memory Leakage Between Users:
- Run the security audit script immediately
- Review all memory queries for user_id filtering
- Check RLS policies if using custom backends
- Implement access logging
Plugin: mem0 | Version: 1.0.0 | Last Updated: 2025-10-27