Claude-scientific-skills open-notebook
Self-hosted, open-source alternative to Google NotebookLM for AI-powered research and document analysis. Use when organizing research materials into notebooks, ingesting diverse content sources (PDFs, videos, audio, web pages, Office documents), generating AI-powered notes and summaries, creating multi-speaker podcasts from research, chatting with documents using context-aware AI, searching across materials with full-text and vector search, or running custom content transformations. Supports 16+ AI providers including OpenAI, Anthropic, Google, Ollama, Groq, and Mistral with complete data privacy through self-hosting.
git clone https://github.com/K-Dense-AI/scientific-agent-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/K-Dense-AI/scientific-agent-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/scientific-skills/open-notebook" ~/.claude/skills/k-dense-ai-claude-scientific-skills-open-notebook && rm -rf "$T"
scientific-skills/open-notebook/SKILL.mdOpen Notebook
Overview
Open Notebook is an open-source, self-hosted alternative to Google's NotebookLM that enables researchers to organize materials, generate AI-powered insights, create podcasts, and have context-aware conversations with their documents — all while maintaining complete data privacy.
Unlike Google's Notebook LM, which has no publicly available API outside of the Enterprise version, Open Notebook provides a comprehensive REST API, supports 16+ AI providers, and runs entirely on your own infrastructure.
Key advantages over NotebookLM:
- Full REST API for programmatic access and automation
- Choice of 16+ AI providers (not locked to Google models)
- Multi-speaker podcast generation with 1-4 customizable speakers (vs. 2-speaker limit)
- Complete data sovereignty through self-hosting
- Open source and fully extensible (MIT license)
Repository: https://github.com/lfnovo/open-notebook
Quick Start
Prerequisites
- Docker Desktop installed
- API key for at least one AI provider (or local Ollama for free local inference)
Installation
Deploy Open Notebook using Docker Compose:
# Download the docker-compose file curl -o docker-compose.yml https://raw.githubusercontent.com/lfnovo/open-notebook/main/docker-compose.yml # Set the required encryption key export OPEN_NOTEBOOK_ENCRYPTION_KEY="your-secret-key-here" # Launch the services docker-compose up -d
Access the application:
- Frontend UI: http://localhost:8502
- REST API: http://localhost:5055
- API Documentation: http://localhost:5055/docs
Configure AI Provider
After startup, configure at least one AI provider:
- Navigate to Settings > API Keys in the UI
- Add credentials for your preferred provider (OpenAI, Anthropic, etc.)
- Test the connection and discover available models
- Register models for use across the platform
Or configure via the REST API:
import requests BASE_URL = "http://localhost:5055/api" # Add a credential for an AI provider response = requests.post(f"{BASE_URL}/credentials", json={ "provider": "openai", "name": "My OpenAI Key", "api_key": "sk-..." }) credential = response.json() # Discover available models response = requests.post( f"{BASE_URL}/credentials/{credential['id']}/discover" ) discovered = response.json() # Register discovered models requests.post( f"{BASE_URL}/credentials/{credential['id']}/register-models", json={"model_ids": [m["id"] for m in discovered["models"]]} )
Core Features
Notebooks
Organize research into separate notebooks, each containing sources, notes, and chat sessions.
import requests BASE_URL = "http://localhost:5055/api" # Create a notebook response = requests.post(f"{BASE_URL}/notebooks", json={ "name": "Cancer Genomics Research", "description": "Literature review on tumor mutational burden" }) notebook = response.json() notebook_id = notebook["id"]
Sources
Ingest diverse content types including PDFs, videos, audio files, web pages, and Office documents. Sources are processed for full-text and vector search.
# Add a web URL source response = requests.post(f"{BASE_URL}/sources", data={ "url": "https://arxiv.org/abs/2301.00001", "notebook_id": notebook_id, "process_async": "true" }) source = response.json() # Upload a PDF file with open("paper.pdf", "rb") as f: response = requests.post( f"{BASE_URL}/sources", data={"notebook_id": notebook_id}, files={"file": ("paper.pdf", f, "application/pdf")} )
Notes
Create and manage notes (human or AI-generated) associated with notebooks.
# Create a human note response = requests.post(f"{BASE_URL}/notes", json={ "title": "Key Findings", "content": "TMB correlates with immunotherapy response in NSCLC...", "note_type": "human", "notebook_id": notebook_id })
Context-Aware Chat
Chat with your research materials using AI that cites sources.
# Create a chat session session = requests.post(f"{BASE_URL}/chat/sessions", json={ "notebook_id": notebook_id, "title": "TMB Discussion" }).json() # Send a message with context from sources response = requests.post(f"{BASE_URL}/chat/execute", json={ "session_id": session["id"], "message": "What are the key biomarkers for immunotherapy response?", "context": {"include_sources": True, "include_notes": True} })
Search
Search across all materials using full-text or vector (semantic) search.
# Vector search across the knowledge base results = requests.post(f"{BASE_URL}/search", json={ "query": "tumor mutational burden immunotherapy", "search_type": "vector", "limit": 10 }).json() # Ask a question with AI-powered answer answer = requests.post(f"{BASE_URL}/search/ask/simple", json={ "query": "How does TMB predict checkpoint inhibitor response?" }).json()
Podcast Generation
Generate professional multi-speaker podcasts from research materials with 1-4 customizable speakers.
# Generate a podcast episode job = requests.post(f"{BASE_URL}/podcasts/generate", json={ "notebook_id": notebook_id, "episode_profile_id": episode_profile_id, "speaker_profile_ids": [speaker1_id, speaker2_id] }).json() # Check generation status status = requests.get(f"{BASE_URL}/podcasts/jobs/{job['job_id']}").json() # Download audio when ready audio = requests.get( f"{BASE_URL}/podcasts/episodes/{status['episode_id']}/audio" )
Content Transformations
Apply custom AI-powered transformations to content for summarization, extraction, and analysis.
# Create a custom transformation transform = requests.post(f"{BASE_URL}/transformations", json={ "name": "extract_methods", "title": "Extract Methods", "description": "Extract methodology details from papers", "prompt": "Extract and summarize the methodology section...", "apply_default": False }).json() # Execute transformation on text result = requests.post(f"{BASE_URL}/transformations/execute", json={ "transformation_id": transform["id"], "input_text": "...", "model_id": "model_id_here" }).json()
Supported AI Providers
Open Notebook supports 16+ AI providers through the Esperanto library:
| Provider | LLM | Embedding | Speech-to-Text | Text-to-Speech |
|---|---|---|---|---|
| OpenAI | Yes | Yes | Yes | Yes |
| Anthropic | Yes | No | No | No |
| Google GenAI | Yes | Yes | No | Yes |
| Vertex AI | Yes | Yes | No | Yes |
| Ollama | Yes | Yes | No | No |
| Groq | Yes | No | Yes | No |
| Mistral | Yes | Yes | No | No |
| Azure OpenAI | Yes | Yes | No | No |
| DeepSeek | Yes | No | No | No |
| xAI | Yes | No | No | No |
| OpenRouter | Yes | No | No | No |
| ElevenLabs | No | No | Yes | Yes |
| Perplexity | Yes | No | No | No |
| Voyage | No | Yes | No | No |
Environment Variables
Key configuration variables for Docker deployment:
| Variable | Description | Default |
|---|---|---|
| Required. Secret key for encrypting stored credentials | None |
| SurrealDB connection URL | |
| Database namespace | |
| Database name | |
| Optional password protection for the UI | None |
API Reference
The REST API is available at
http://localhost:5055/api with interactive documentation at /docs.
Core endpoint groups:
- Notebook CRUD and source association/api/notebooks
- Source ingestion, processing, and retrieval/api/sources
- Note management/api/notes
- Chat session management/api/chat/sessions
- Chat message execution/api/chat/execute
- Full-text and vector search/api/search
- Podcast generation and management/api/podcasts
- Content transformation pipelines/api/transformations
- AI model configuration and discovery/api/models
- Provider credential management/api/credentials
For complete API reference with all endpoints and request/response formats, see
references/api_reference.md.
Architecture
Open Notebook uses a modern stack:
- Backend: Python with FastAPI
- Database: SurrealDB (document + relational)
- AI Integration: LangChain with the Esperanto multi-provider library
- Frontend: Next.js with React
- Deployment: Docker Compose with persistent volumes
Important Notes
- Open Notebook requires Docker for deployment
- At least one AI provider must be configured for AI features to work
- For free local inference without API costs, use Ollama
- The
must be set before first launch and kept consistent across restartsOPEN_NOTEBOOK_ENCRYPTION_KEY - All data is stored locally in Docker volumes for complete data sovereignty