# Harbor new-boost-module

**Install**

Source · clone the upstream repo:

```bash
git clone https://github.com/av/harbor
```

Claude Code · install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/av/harbor "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.agents/skills/new-boost-module" ~/.claude/skills/av-harbor-new-boost-module && rm -rf "$T"
```

Manifest: `.agents/skills/new-boost-module/SKILL.md`

# Creating Harbor Boost Modules

Harbor Boost is an optimizing LLM proxy with an OpenAI-compatible API. Modules are Python files that hook into the completion pipeline — they can rewrite prompts, inject system messages, chain multiple LLM calls, stream artifacts, or replace the completion entirely.

A module is activated by prefixing a model name with the module's `ID_PREFIX`. For example, a module with `ID_PREFIX = "mymod"` is invoked via the model name `mymod-llama3.1`.
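
For instance, any OpenAI-compatible client pointed at Boost can invoke the module. A minimal sketch, assuming Boost is reachable at the port used in the standalone Docker example later in this guide and that the API key is a placeholder:

```python
# Sketch: invoking a module through Boost's OpenAI-compatible API.
# The base URL and api_key are assumptions; the port matches the
# standalone Docker example later in this guide.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:34131/v1", api_key="sk-boost")
response = client.chat.completions.create(
    model="mymod-llama3.1",  # ID_PREFIX "mymod" + downstream model "llama3.1"
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```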

## Before You Write Code

1. Read the Custom Modules guide: `docs/5.2.1.-Harbor-Boost-Custom-Modules.md`
2. Scan existing modules in `services/boost/src/modules/` to find similar functionality you can reuse or learn from. There are 17+ built-in modules covering reasoning chains, prompt rewriting, structured output, artifacts, and more.
3. Read the Built-in Modules reference: `docs/5.2.3-Harbor-Boost-Modules.md`

Understanding the available primitives (`chat`, `llm`, `config`, `selection`, `log`) saves you from reinventing patterns that already exist.

## Module Structure

Every module is a single `.py` file with two required exports:

```python
ID_PREFIX = 'my_module'

async def apply(chat: 'Chat', llm: 'LLM'):
    # Module logic here
    pass
```

### Required Exports

| Export | Type | Purpose |
| --- | --- | --- |
| `ID_PREFIX` | `str` | Unique identifier. Becomes the model prefix (e.g., `mymod-llama3.1`). Use lowercase, short, memorable names. |
| `apply` | `async def (chat, llm)` | Entry point called for every matching completion request. |

### Recommended Exports

| Export | Type | Purpose |
| --- | --- | --- |
| `DOCS` | `str` | Markdown documentation shown in the Boost modules reference. Include a description, parameters, and usage examples. |
| `logger` | `Logger` | Created via `log.setup_logger(ID_PREFIX)` for consistent, filterable logging. |
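
Putting both tables together, a minimal skeleton might look like the following. The `log` import path is an assumption; mirror the exact imports used by an existing module in `services/boost/src/modules/`:

```python
# Minimal module skeleton combining required and recommended exports.
# The `log` import is an assumption based on the built-in modules;
# copy the exact imports from an existing module.
import log

ID_PREFIX = 'mymod'

DOCS = """Describe what the module does, its parameters, and usage examples."""

logger = log.setup_logger(ID_PREFIX)

async def apply(chat: 'Chat', llm: 'LLM'):
    logger.debug('mymod invoked')
    await llm.stream_final_completion()
```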

## Core Primitives

### `chat` — The Conversation

`Chat` is a linked list of `ChatNode` objects; `tail` is the most recent message.

```python
chat.tail.content      # last message text
chat.text()            # full conversation as plain text
chat.history()         # list of dicts from tail upward
chat.user("...")       # append a user message
chat.assistant("...")  # append an assistant message

# Insert a system message before the first message
chat.tail.ancestor().add_parent(
    ch.ChatNode(role="system", content="You are helpful.")
)

# Create a fresh chat for a side-conversation
side = ch.Chat.from_conversation([
    {"role": "user", "content": "Summarize this."}
])
```

### `llm` — The Downstream Model

`llm` talks to the actual LLM backend configured in Boost.

Output to the client (streamed in real time):

```python
await llm.emit_message("text")               # raw text chunk
await llm.emit_status("Working...")          # status indicator
await llm.emit_artifact("<html>...</html>")  # Open WebUI artifact
```

Internal completions (not streamed, returned to you):

```python
result = await llm.chat_completion(prompt="...", resolve=True)
result = await llm.chat_completion(messages=[...], resolve=True)
result = await llm.chat_completion(chat=some_chat, resolve=True)
# With structured output
result = await llm.chat_completion(prompt="...", schema=MyModel, resolve=True)
```
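
The `schema` argument constrains the completion to a model class. A sketch assuming Pydantic-style models; whether Boost expects Pydantic specifically is an assumption, so verify against a built-in module that uses structured output:

```python
# Assumption: schema models are Pydantic-style classes; verify against
# a built-in module that uses structured output.
from pydantic import BaseModel

class Verdict(BaseModel):
    label: str
    confidence: float

async def apply(chat, llm):
    result = await llm.chat_completion(
        prompt="Classify this message: {q}",
        q=chat.tail.content,
        schema=Verdict,
        resolve=True,
    )
    await llm.emit_message(str(result))
```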

Streamed completions (sent to client AND returned):

```python
text = await llm.stream_chat_completion(prompt="...")
text = await llm.stream_chat_completion(messages=[...])
```

Final completion (always reaches the client, even with intermediate output disabled):

```python
await llm.stream_final_completion()                          # from current chat
await llm.stream_final_completion(prompt="Custom prompt")
```

Boost parameters — clients can pass `@boost_*` params in the request body:

```python
llm.boost_params  # dict of params without the @boost_ prefix
```
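
For example, a client could attach an extra field to the request body and the module could read it back. The parameter name here is illustrative, not a built-in:

```python
# Illustrative parameter name, not a built-in one.
# Request body sent by the client:
#   {
#     "model": "mymod-llama3.1",
#     "messages": [...],
#     "@boost_style": "concise"
#   }
async def apply(chat, llm):
    style = llm.boost_params.get("style", "default")  # -> "concise"
    chat.tail.add_parent(
        ch.ChatNode(role="system", content=f"Answer in a {style} style.")
    )
    await llm.stream_final_completion()
```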

### `config` — Service Configuration

For configurable modules, define config entries in `services/boost/src/config.py` using the same pattern as existing modules. Config keys follow the `HARBOR_BOOST_<MODULE>_<PARAM>` convention and are exposed via the `harbor boost <module> <param>` CLI.
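
To make the convention concrete, here is how a hypothetical module and parameter map across the surfaces. The real config API lives in `services/boost/src/config.py`; this sketch only reads the raw environment variable:

```python
# Naming-convention illustration only; the real config API is in
# services/boost/src/config.py. For a module "mymod" with param "strat":
#   env var:  HARBOR_BOOST_MYMOD_STRAT
#   CLI:      harbor boost mymod strat <value>
import os

strat = os.environ.get("HARBOR_BOOST_MYMOD_STRAT", "match")  # assumed default
```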

### `selection` — Message Selection

For modules that operate on specific messages rather than the whole chat, use the selection module. It provides strategy-based filtering (`all`, `first`, `last`, `match`, etc.); see the `klmbr`, `eli5`, or `rcn` modules for usage patterns.
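
Conceptually, a strategy narrows the message list before the module acts on it. The sketch below is illustrative only and does not use the selection module's real function names:

```python
# Illustrative only: shows what strategy-based selection means, not the
# selection module's actual API. Copy real usage from klmbr/eli5/rcn.
def select(messages, strat="match", role=None, index=None):
    if strat == "all":
        return messages
    if strat == "first":
        return messages[:1]
    if strat == "last":
        return messages[-1:]
    # "match": filter by attributes, e.g. role=user, index=-1
    matched = [m for m in messages if role is None or m["role"] == role]
    if index is not None and matched:
        return [matched[index]]
    return matched

history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
    {"role": "user", "content": "explain transformers"},
]
select(history, strat="match", role="user", index=-1)  # last user message
```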

## Writing Good DOCS

The `DOCS` string is the user-facing documentation. It should include:

1. A brief description of what the module does and why it's useful
2. Parameter descriptions if the module is configurable
3. Usage examples showing `harbor boost` CLI commands and/or standalone Docker usage

Format example:

````python
DOCS = """
Explains complex questions simply before answering them, improving accuracy
on nuanced prompts.

**Parameters**

- `strat` - message selection strategy. Default: `match`
- `strat_params` - selection filter. Default: matches last user message

```bash
harbor boost modules add eli5
harbor boost eli5 strat match
harbor boost eli5 strat_params role=user,index=-1
```

Standalone:

```bash
docker run \\
  -e "HARBOR_BOOST_MODULES=eli5" \\
  -e "HARBOR_BOOST_OPENAI_URLS=http://172.17.0.1:11434/v1" \\
  -p 34131:8000 \\
  ghcr.io/av/harbor-boost:latest
```
"""
````
## Common Patterns

**Prompt injection** — add context before the final completion:
```python
async def apply(chat, llm):
    chat.tail.add_parent(ch.ChatNode(role="system", content="Extra context"))
    await llm.stream_final_completion()
```

**Multi-step reasoning** — chain internal completions, then produce a final answer:

```python
async def apply(chat, llm):
    await llm.emit_status("Analyzing...")
    analysis = await llm.chat_completion(prompt="Analyze: {q}", q=chat.tail.content, resolve=True)
    await llm.stream_final_completion(
        prompt="Given analysis:\n{analysis}\n\nAnswer: {q}",
        analysis=analysis,
        q=chat.tail.content,
    )
```

**Pass-through with side effects** — do something extra while preserving normal behavior:

```python
async def apply(chat, llm):
    logger.info(f"Processing: {chat.tail.content[:50]}")
    await llm.stream_final_completion()
```

## Checklist

Before considering the module done:

- `ID_PREFIX` is unique, lowercase, and concise
- `DOCS` export explains functionality, parameters, and includes usage examples
- `logger = log.setup_logger(ID_PREFIX)` is defined
- `async def apply(chat, llm)` is the entry point
- Module file is placed in `services/boost/src/modules/` (built-in) or `boost/src/custom_modules/` (custom)
- If configurable, config entries are added to `services/boost/src/config.py`
- Documentation in `docs/5.2.3-Harbor-Boost-Modules.md` is updated