Marketplace streaming-llm-responses

install

source · Clone the upstream repo

git clone https://github.com/aiskillstore/marketplace

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/aiskillstore/marketplace "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/asmayaseen/streaming-llm-responses" ~/.claude/skills/aiskillstore-marketplace-streaming-llm-responses && rm -rf "$T"

manifest: skills/asmayaseen/streaming-llm-responses/SKILL.md

Streaming LLM Responses

Build responsive, real-time chat interfaces with streaming feedback.

Quick Start

import { useChatKit } from "@openai/chatkit-react";

const chatkit = useChatKit({
  api: { url: API_URL, domainKey: DOMAIN_KEY },

  onResponseStart: () => setIsResponding(true),
  onResponseEnd: () => setIsResponding(false),

  onEffect: ({ name, data }) => {
    if (name === "update_status") updateUI(data);
  },
});

Response Lifecycle

User sends message
    ↓
onResponseStart() fires
    ↓
[Streaming: tokens arrive, ProgressUpdateEvents shown]
    ↓
onResponseEnd() fires
    ↓
UI unlocks, ready for next interaction

Core Patterns

1. Response Lifecycle Handlers

Lock UI during AI response to prevent race conditions:

function ChatWithLifecycle() {
  const [isResponding, setIsResponding] = useState(false);
  const lockInteraction = useAppStore((s) => s.lockInteraction);
  const unlockInteraction = useAppStore((s) => s.unlockInteraction);

  const chatkit = useChatKit({
    api: { url: API_URL, domainKey: DOMAIN_KEY },

    onResponseStart: () => {
      setIsResponding(true);
      lockInteraction(); // Disable map/canvas/form interactions
    },

    onResponseEnd: () => {
      setIsResponding(false);
      unlockInteraction();
    },

    onError: ({ error }) => {
      console.error("ChatKit error:", error);
      setIsResponding(false);
      unlockInteraction();
    },
  });

  return (
    <div>
      {isResponding && <LoadingOverlay />}
      <ChatKit control={chatkit.control} />
    </div>
  );
}

2. Client Effects (Fire-and-Forget)

Server sends effects to update client UI without expecting a response:

Backend - Streaming Effects:

from chatkit.types import ClientEffectEvent

async def respond(self, thread, item, context):
    # ... agent processing ...

    # Fire client effect to update UI
    yield ClientEffectEvent(
        name="update_status",
        data={
            "state": {"energy": 80, "happiness": 90},
            "flash": "Status updated!"
        }
    )

    # Another effect
    yield ClientEffectEvent(
        name="show_notification",
        data={"message": "Task completed!"}
    )

Frontend - Handling Effects:

const chatkit = useChatKit({
  api: { url: API_URL, domainKey: DOMAIN_KEY },

  onEffect: ({ name, data }) => {
    switch (name) {
      case "update_status":
        applyStatusUpdate(data.state);
        if (data.flash) setFlashMessage(data.flash);
        break;

      case "add_marker":
        addMapMarker(data);
        break;

      case "select_mode":
        setSelectionMode(data.mode);
        break;
    }
  },
});

3. Progress Updates

Show "Searching...", "Loading...", "Analyzing..." during long operations:

from chatkit.types import ProgressUpdateEvent

@function_tool
async def search_articles(ctx: AgentContext, query: str) -> str:
    """Search for articles matching the query."""

    yield ProgressUpdateEvent(message="Searching articles...")

    results = await article_store.search(query)

    yield ProgressUpdateEvent(message=f"Found {len(results)} articles...")

    for i, article in enumerate(results):
        if i % 5 == 0:
            yield ProgressUpdateEvent(
                message=f"Processing article {i+1}/{len(results)}..."
            )

    return format_results(results)

4. Thread Lifecycle Events

Track thread changes for persistence and UI updates:

const chatkit = useChatKit({
  api: { url: API_URL, domainKey: DOMAIN_KEY },

  onThreadChange: ({ threadId }) => {
    setThreadId(threadId);
    if (threadId) localStorage.setItem("lastThreadId", threadId);
    clearSelections();
  },

  onThreadLoadStart: ({ threadId }) => {
    setIsLoadingThread(true);
  },

  onThreadLoadEnd: ({ threadId }) => {
    setIsLoadingThread(false);
  },
});

5. Client Tools (State Query)

AI needs to read client-side state to make decisions:

Backend - Defining Client Tool:

@function_tool(name_override="get_selected_items")
async def get_selected_items(ctx: AgentContext) -> dict:
    """Get the items currently selected on the canvas.

    This is a CLIENT TOOL - executed in browser, result comes back.
    """
    yield ProgressUpdateEvent(message="Reading selection...")
    pass  # Actual execution happens on client

Frontend - Handling Client Tools:

const chatkit = useChatKit({
  api: { url: API_URL, domainKey: DOMAIN_KEY },

  onClientTool: ({ name, params }) => {
    switch (name) {
      case "get_selected_items":
        return { itemIds: selectedItemIds };

      case "get_current_viewport":
        return {
          center: mapRef.current.getCenter(),
          zoom: mapRef.current.getZoom(),
        };

      case "get_form_data":
        return { values: formRef.current.getValues() };

      default:
        throw new Error(`Unknown client tool: ${name}`);
    }
  },
});

Client Effects vs Client Tools

Type	Direction	Response Required	Use Case
Client Effect	Server → Client	No (fire-and-forget)	Update UI, show notifications
Client Tool	Server → Client → Server	Yes (return value)	Get client state for AI decision

Common Patterns by Use Case

Interactive Map/Canvas

onResponseStart: () => lockCanvas(),
onResponseEnd: () => unlockCanvas(),
onEffect: ({ name, data }) => {
  if (name === "add_marker") addMarker(data);
  if (name === "pan_to") panTo(data.location);
},
onClientTool: ({ name }) => {
  if (name === "get_selection") return getSelectedItems();
},

Form-Based UI

onResponseStart: () => setFormDisabled(true),
onResponseEnd: () => setFormDisabled(false),
onClientTool: ({ name }) => {
  if (name === "get_form_values") return form.getValues();
},

Game/Simulation

onResponseStart: () => pauseSimulation(),
onResponseEnd: () => resumeSimulation(),
onEffect: ({ name, data }) => {
  if (name === "update_entity") updateEntity(data);
  if (name === "show_notification") showToast(data.message);
},

Thread Title Generation

Dynamically update thread title based on conversation:

class TitleAgent:
    async def generate_title(self, first_message: str) -> str:
        result = await Runner.run(
            Agent(
                name="TitleGenerator",
                instructions="Generate a 3-5 word title.",
                model="gpt-4o-mini",  # Fast model
            ),
            input=f"First message: {first_message}",
        )
        return result.final_output

# In ChatKitServer
async def respond(self, thread, item, context):
    if not thread.title and item:
        title = await self.title_agent.generate_title(item.content)
        thread.title = title
        await self.store.save_thread(thread, context)

Anti-Patterns

Not locking UI during response - Leads to race conditions
Blocking in effects - Effects should be fire-and-forget
Heavy computation in onEffect - Use requestAnimationFrame for DOM updates
Missing error handling - Always handle onError to unlock UI
Not persisting thread state - Use onThreadChange to save context

Verification

Run:

python3 scripts/verify.py

Expected:

✓ streaming-llm-responses skill ready

If Verification Fails

Check: references/ folder has streaming-patterns.md
Stop and report if still failing

References

references/streaming-patterns.md - Complete streaming configuration