Marketplace storage-debug-instrumentation
Add comprehensive debugging and observability tooling for backend storage layers (PostgreSQL, ChromaDB) and startup metrics. Includes storage drift detection, raw data inspection endpoints, and a Next.js admin dashboard.
install
source · Clone the upstream repo
git clone https://github.com/aiskillstore/marketplace
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/aiskillstore/marketplace "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/benderfendor/storage-debug-instrumentation" ~/.claude/skills/aiskillstore-marketplace-storage-debug-instrumentation && rm -rf "$T"
manifest: skills/benderfendor/storage-debug-instrumentation/SKILL.md
Storage Debug Instrumentation
Purpose
Enable rapid diagnosis of storage state, synchronization health, and backend performance bottlenecks by exposing:
- Raw article inspection from both PostgreSQL and ChromaDB
- Storage drift detection (missing/dangling entries)
- Detailed startup timeline breakdown (DB init, cache preload, vector store, RSS refresh)
- One-page debug dashboard consolidating all diagnostics
Scope
- Backend: app/services/startup_metrics.py, app/main.py, app/vector_store.py, app/database.py, app/api/routes/debug.py
- Frontend: frontend/lib/api.ts, frontend/app/debug/page.tsx
- No schema changes; purely additive instrumentation and debug routes
Workflow
1. Create startup metrics service
File: backend/app/services/startup_metrics.py
- Implement thread-safe StartupMetrics class to record phase timings
- Expose record_event(name, started_at, detail, metadata) for phase capture
- Support add_note(key, value) for arbitrary annotations
- Export singleton startup_metrics for app-wide use
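A minimal sketch of what such a service could look like, using only the standard library. The method names record_event, add_note, mark_app_started, and mark_app_completed come from this document; the field names in the stored events and the to_dict() snapshot helper are assumptions made for illustration.

```python
import threading
import time
from typing import Any, Dict, List, Optional


class StartupMetrics:
    """Thread-safe recorder for startup phase timings (illustrative sketch)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._events: List[Dict[str, Any]] = []
        self._notes: Dict[str, Any] = {}
        self._started_at: Optional[float] = None
        self._completed_at: Optional[float] = None

    def mark_app_started(self) -> None:
        with self._lock:
            self._started_at = time.time()

    def mark_app_completed(self) -> None:
        with self._lock:
            self._completed_at = time.time()

    def record_event(
        self,
        name: str,
        started_at: float,
        detail: Optional[str] = None,
        metadata: Optional[Dict[str, Any]] = None,
    ) -> None:
        """Record a finished phase; duration runs from started_at to now."""
        event = {
            "name": name,
            "started_at": started_at,
            "duration_seconds": max(0.0, time.time() - started_at),
            "detail": detail,
            "metadata": dict(metadata or {}),
        }
        with self._lock:
            self._events.append(event)

    def add_note(self, key: str, value: Any) -> None:
        with self._lock:
            self._notes[key] = value

    def to_dict(self) -> Dict[str, Any]:
        """Return a snapshot copy (hypothetical helper, not named in this document)."""
        with self._lock:
            return {
                "started_at": self._started_at,
                "completed_at": self._completed_at,
                "events": [dict(event) for event in self._events],
                "notes": dict(self._notes),
            }


# Module-level singleton shared across the app.
startup_metrics = StartupMetrics()
```

Holding the lock only around the list and dict mutations keeps contention low when several startup tasks report at once.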
2. Instrument vector store initialization
File: backend/app/vector_store.py
- Import startup_metrics
- In VectorStore.__init__(), wrap initialization with time.time() timer
- Record event with metadata: host, port, collection, documents
- Catch connection errors and annotate them
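The instrumentation described above might be sketched as follows. To keep the example runnable on its own, a module-level record_event function stands in for startup_metrics.record_event, and the connection is injected as a hypothetical connect callable; the event name, the "error" metadata key, and the choice to leave the store in a degraded state instead of re-raising are all assumptions.

```python
import time
from typing import Any, Callable, Dict, List, Optional

# Stand-in recorder so the sketch runs alone; real code would use startup_metrics.
EVENTS: List[Dict[str, Any]] = []


def record_event(name: str, started_at: float, detail: Optional[str] = None,
                 metadata: Optional[Dict[str, Any]] = None) -> None:
    EVENTS.append({
        "name": name,
        "duration_seconds": time.time() - started_at,
        "detail": detail,
        "metadata": metadata or {},
    })


class VectorStore:
    def __init__(self, host: str, port: int, collection_name: str,
                 connect: Callable[[str, int, str], Any]) -> None:
        started_at = time.time()
        metadata: Dict[str, Any] = {
            "host": host, "port": port, "collection": collection_name,
        }
        self.collection = None
        try:
            # `connect` is a hypothetical factory returning a collection-like
            # object that exposes count().
            self.collection = connect(host, port, collection_name)
            metadata["documents"] = self.collection.count()
            record_event("vector_store_init", started_at,
                         detail="connected", metadata=metadata)
        except Exception as exc:  # broad on purpose; real code may narrow this
            self.collection = None
            metadata["error"] = type(exc).__name__
            record_event("vector_store_init", started_at,
                         detail=f"connection failed: {exc}", metadata=metadata)
```

Recording the failure as an event, with the exception type in the metadata, keeps a failed connection visible in the startup timeline instead of only in the logs.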
3. Instrument FastAPI startup sequence
File: backend/app/main.py
- Call startup_metrics.mark_app_started() at beginning of on_startup()
- Wrap each phase (DB init, schedulers, cache preload, RSS refresh, migration) with record_event()
- Include metadata: cache_size, article_count, oldest_article_hours
- Call startup_metrics.mark_app_completed() at end
- Add app version notes via add_note()
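One way to keep the per-phase wrapping compact is a small context manager, sketched below. The timed_phase helper, the phase names, and the placeholder phase bodies are assumptions; a plain list stands in for startup_metrics so the block runs on its own, and on_startup is shown as a synchronous function for simplicity.

```python
import time
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List

# Stand-in for the startup_metrics singleton so the sketch is self-contained.
events: List[Dict[str, Any]] = []


@contextmanager
def timed_phase(name: str) -> Iterator[Dict[str, Any]]:
    """Time one startup phase; the yielded dict collects phase metadata."""
    started_at = time.time()
    metadata: Dict[str, Any] = {}
    detail = "ok"
    try:
        yield metadata
    except Exception as exc:
        detail = f"failed: {exc}"
        raise
    finally:
        # Runs on success and on failure, so every phase leaves a record.
        events.append({
            "name": name,
            "duration_seconds": time.time() - started_at,
            "detail": detail,
            "metadata": metadata,
        })


def on_startup() -> None:
    with timed_phase("db_init"):
        pass  # placeholder for database initialization

    with timed_phase("cache_preload") as meta:
        cache = ["a", "b", "c"]  # placeholder for the preloaded cache
        meta["cache_size"] = len(cache)

    with timed_phase("rss_refresh") as meta:
        meta["article_count"] = 0  # placeholder count
```

Because the event is appended in the finally clause, a phase that raises still shows up in the timeline with a "failed" detail before the exception propagates.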
4. Add database pagination helpers
File: backend/app/database.py
- Implement fetch_articles_page() to support:
  - Limit/offset pagination
  - Optional source filter
  - Missing-embeddings-only flag
  - Published date range filters
  - Sort direction (asc/desc)
  - Return oldest/newest timestamp bounds
- Implement fetch_article_chroma_mappings() to return all article→chroma ID mappings for drift analysis
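The filter handling could be sketched as below. This is not the project's query code: it uses the standard-library sqlite3 module as a stand-in for PostgreSQL so it can run anywhere, and the articles table with id, source, published, and chroma_id columns is an assumed schema (a NULL chroma_id is taken to mean "missing embedding").

```python
import sqlite3
from typing import Any, Dict, List, Optional

COLUMNS = ("id", "source", "published", "chroma_id")  # assumed schema


def fetch_articles_page(
    conn: sqlite3.Connection,
    limit: int = 50,
    offset: int = 0,
    source: Optional[str] = None,
    missing_embeddings_only: bool = False,
    published_after: Optional[str] = None,
    published_before: Optional[str] = None,
    sort_desc: bool = True,
) -> Dict[str, Any]:
    clauses: List[str] = []
    params: List[Any] = []
    if source is not None:
        clauses.append("source = ?")
        params.append(source)
    if missing_embeddings_only:
        clauses.append("chroma_id IS NULL")
    if published_after is not None:
        clauses.append("published >= ?")
        params.append(published_after)
    if published_before is not None:
        clauses.append("published <= ?")
        params.append(published_before)

    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    # Direction is picked from a fixed pair, never interpolated from raw input.
    direction = "DESC" if sort_desc else "ASC"

    rows = conn.execute(
        f"SELECT {', '.join(COLUMNS)} FROM articles{where} "
        f"ORDER BY published {direction} LIMIT ? OFFSET ?",
        [*params, limit, offset],
    ).fetchall()
    # Bounds and total cover the whole filtered set, not just this page.
    total, oldest, newest = conn.execute(
        f"SELECT COUNT(*), MIN(published), MAX(published) FROM articles{where}",
        params,
    ).fetchone()
    return {
        "total": total,
        "oldest": oldest,
        "newest": newest,
        "articles": [dict(zip(COLUMNS, row)) for row in rows],
    }
```

All user-supplied values travel as bound parameters; only the fixed clause strings and the ASC/DESC keyword are formatted into the SQL. PostgreSQL drivers such as psycopg use %s placeholders instead of ?, so the clause strings would need adjusting there.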
5. Add vector store pagination helpers
File: backend/app/vector_store.py
- Implement list_articles(limit, offset) to return paginated Chroma documents with metadata and previews
- Implement list_all_ids() to return all stored Chroma IDs for drift detection (used by /debug/storage/drift)
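A rough sketch of the two helpers, written as free functions over a collection object. It assumes the collection behaves like a Chroma collection whose get() accepts limit, offset, and include and returns a dict with "ids", "documents", and "metadatas"; the preview_chars and batch_size parameters and the returned field names are hypothetical.

```python
from typing import Any, Dict, List


def list_articles(collection: Any, limit: int = 50, offset: int = 0,
                  preview_chars: int = 200) -> List[Dict[str, Any]]:
    """Return one page of stored documents with metadata and a short preview."""
    result = collection.get(
        limit=limit, offset=offset, include=["documents", "metadatas"]
    )
    ids = result.get("ids") or []
    documents = result.get("documents") or [None] * len(ids)
    metadatas = result.get("metadatas") or [None] * len(ids)
    return [
        {
            "id": doc_id,
            "metadata": metadata or {},
            "preview": (document or "")[:preview_chars],
        }
        for doc_id, document, metadata in zip(ids, documents, metadatas)
    ]


def list_all_ids(collection: Any, batch_size: int = 1000) -> List[str]:
    """Collect every stored ID in batches to avoid one very large response."""
    ids: List[str] = []
    offset = 0
    while True:
        # An empty include list asks for IDs only (assumed Chroma behaviour).
        batch = collection.get(limit=batch_size, offset=offset, include=[])["ids"]
        ids.extend(batch)
        if len(batch) < batch_size:
            return ids
        offset += batch_size
```

Fetching IDs in batches keeps memory use predictable on large collections; the loop stops at the first short (or empty) batch.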
6. Expose debug API endpoints
File: backend/app/api/routes/debug.py
- Add GET /debug/startup → returns startup metrics timeline (events + notes)
- Add GET /debug/chromadb/articles → returns paginated raw Chroma entries with limit/offset
- Add GET /debug/database/articles → returns paginated Postgres rows with filters (source, embeddings, date range, sort)
- Add GET /debug/storage/drift → compares Chroma IDs vs Postgres mappings, returns missing/dangling counts + samples
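The comparison behind the drift endpoint can be sketched as a pure function, independent of the web framework. It assumes "missing" means a Chroma ID referenced by a Postgres row but absent from Chroma, and "dangling" means a Chroma ID that no Postgres row references; the function name compute_drift, the sample_size parameter, and the response keys are illustrative.

```python
from typing import Any, Dict, Iterable, Optional


def compute_drift(
    article_to_chroma: Dict[int, Optional[str]],
    chroma_ids: Iterable[str],
    sample_size: int = 20,
) -> Dict[str, Any]:
    """Compare Postgres article→Chroma ID mappings against stored Chroma IDs."""
    # Articles without a Chroma ID have no embedding yet, so they are skipped.
    expected = {cid for cid in article_to_chroma.values() if cid}
    stored = set(chroma_ids)

    missing = sorted(expected - stored)    # referenced in Postgres, absent in Chroma
    dangling = sorted(stored - expected)   # present in Chroma, unreferenced in Postgres

    return {
        "postgres_mapped_count": len(expected),
        "chroma_count": len(stored),
        "missing_in_chroma_count": len(missing),
        "dangling_in_chroma_count": len(dangling),
        "missing_in_chroma_sample": missing[:sample_size],
        "dangling_in_chroma_sample": dangling[:sample_size],
    }
```

For example, with mappings {1: "a", 2: "b", 3: None} and stored IDs ["b", "z"], the report lists "a" as missing and "z" as dangling. Returning capped samples alongside the counts keeps the response small even when drift is large.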
7. Add frontend API bindings
File: frontend/lib/api.ts
- Export types: StartupEventMetric, StartupMetricsResponse, ChromaDebugResponse, DatabaseDebugResponse, StorageDriftReport
- Export fetchers: fetchStartupMetrics(), fetchChromaDebugArticles(), fetchDatabaseDebugArticles(), fetchStorageDrift()
- Ensure snake_case→camelCase mapping for response fields
8. Build debug dashboard page
File: frontend/app/debug/page.tsx
- Create /debug route with multi-tab inspection UI
- Render startup timeline: phase name, duration, metadata badges (cache size, vectors, migrated records)
- Display Chroma browser: paginated table with ID, title, source, preview
- Display Postgres browser: paginated table with filters (source, date range, missing-embeddings-only flag)
- Display drift report: sample tables for missing-in-chroma and dangling-in-chroma entries
- Include summary cards for quick metrics (boot time, total articles, vector count, drift count)
Implementation checklist
- Create backend/app/services/startup_metrics.py
- Instrument backend/app/vector_store.py::VectorStore.__init__()
- Instrument backend/app/main.py::on_startup() (all phases)
- Add fetch_articles_page() and fetch_article_chroma_mappings() to backend/app/database.py
- Add list_articles() and list_all_ids() to backend/app/vector_store.py
- Add /debug/startup, /debug/chromadb/articles, /debug/database/articles, /debug/storage/drift to backend/app/api/routes/debug.py
- Add types and fetchers to frontend/lib/api.ts
- Create frontend/app/debug/page.tsx with dashboard layout
- Run uvx ruff check backend → all checks pass
- Test endpoints in curl or Postman to verify response structure
Verification checklist
- GET http://localhost:8000/debug/startup returns valid timeline with events and notes
- GET http://localhost:8000/debug/chromadb/articles?limit=50&offset=0 returns paginated Chroma docs
- GET http://localhost:8000/debug/database/articles?source=bbc&missing_embeddings_only=false filters correctly
- GET http://localhost:8000/debug/storage/drift compares counts and returns drift samples
- http://localhost:3000/debug loads without errors and displays all four sections
- Refresh button triggers all four API calls in parallel
- Pagination controls update limit/offset correctly
- Database filters (source, date range) update and refresh data
- Startup timeline shows non-zero phase durations if backend just started
Future enhancements
- Streaming startup metrics via SSE (live tail during boot)
- Export startup report as JSON/CSV for performance tracking over time
- Automated drift alerts (post to Slack/email if dangling > threshold)
- Performance graphs (startup time trends, article throughput)
- Sync-on-demand action (button to force vector store refresh for missing articles)