claude-code-plugins-plus-skills · perplexity-known-pitfalls

Install

Source · clone the upstream repo:
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills

Claude Code · install into ~/.claude/skills/:
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/perplexity-pack/skills/perplexity-known-pitfalls" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-skills-perplexity-known-pitfalls && rm -rf "$T"

Manifest: plugins/saas-packs/perplexity-pack/skills/perplexity-known-pitfalls/SKILL.md

Source Content

Perplexity Known Pitfalls

Overview

Real gotchas when integrating the Perplexity Sonar API. Perplexity uses an OpenAI-compatible chat endpoint but performs live web searches -- a fundamentally different paradigm from standard LLM completions. These pitfalls come from treating it like a regular offline chatbot.

Prerequisites

  • Perplexity API key configured
  • Understanding of OpenAI-compatible chat API format

Pitfalls

1. Using It as a Generic Chatbot

Perplexity searches the web per request. Using it for tasks that don't need web search wastes money.

# BAD: general chatbot (wastes a search query)
response = call_perplexity("Write me a haiku about cats")
# Costs $0.005+ for something any LLM can do offline

# GOOD: leverage web search capability
response = call_perplexity(
    "What are the latest Next.js 15 features released this month?",
    search_recency_filter="month"
)
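
The Python examples in this skill use a call_perplexity helper that is never defined here. A minimal sketch, assuming the standard openai package pointed at Perplexity's base URL (the helper's name and defaults are illustrative; Perplexity-specific search parameters are passed through extra_body since they are not standard OpenAI kwargs):

import os
from openai import OpenAI

# Hypothetical helper assumed by the examples in this skill.
client = OpenAI(
    api_key=os.environ["PERPLEXITY_API_KEY"],
    base_url="https://api.perplexity.ai",
)

def call_perplexity(prompt, model="sonar", max_tokens=1024, **search_params):
    # search_params (e.g. search_recency_filter) are Perplexity extensions,
    # so they travel via extra_body rather than as named arguments.
    return client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        extra_body=search_params,
    )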

2. Ignoring Citations

Perplexity returns [1], [2] markers in the text, plus a separate citations array. Ignoring them loses the key value prop.

data = response.model_dump()  # or response.json() for raw HTTP
answer = data["choices"][0]["message"]["content"]
citations = data.get("citations", [])  # NOT in choices — top-level field

# BAD: displaying raw markers
print(answer)  # "According to [1], Node.js 22 adds..."

# GOOD: replace markers with links (markers are 1-indexed into citations)
for i, url in enumerate(citations, 1):
    answer = answer.replace(f"[{i}]", f"[{i}]({url})")

3. Using Wrong SDK Import

There is no @perplexity/sdk or perplexity Python package. Use the standard OpenAI client.

// BAD — this package doesn't exist
import { PerplexityClient } from "@perplexity/sdk";

// GOOD — use OpenAI client with Perplexity base URL
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

4. Not Setting max_tokens

Without max_tokens, responses can be arbitrarily long, increasing costs unpredictably.

// BAD: no token limit — output cost can spike
await client.chat.completions.create({
  model: "sonar-pro",  // $15/M output tokens!
  messages: [{ role: "user", content: "Tell me about AI" }],
});

// GOOD: always set max_tokens
await client.chat.completions.create({
  model: "sonar-pro",
  messages: [{ role: "user", content: "Tell me about AI" }],
  max_tokens: 1024,
});

5. No Recency Filter for Time-Sensitive Queries

Without search_recency_filter, Perplexity may cite outdated articles.

# BAD: may return articles from any time period
response = call_perplexity("current Bitcoin price")

# GOOD: constrain to recent results
response = call_perplexity(
    "current Bitcoin price",
    search_recency_filter="day"  # hour | day | week | month
)

6. Sending Full Conversation History

Each message in the conversation may trigger new search queries. Sending 20 turns of history is expensive and slow.

# BAD: 20 turns of history = many search queries
messages = long_history + [{"role": "user", "content": "summarize"}]

# GOOD: summarize context, send focused query
messages = [
    {"role": "system", "content": "Answer based on web search."},
    {"role": "user", "content": f"Context: {summary}\nQuestion: {question}"}
]
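
Where a full summarization pass is overkill, simply trimming to the last few turns captures most of the savings. A minimal sketch (the helper name and turn cutoff are assumptions to tune per app):

# Hypothetical trimming helper: keep the system prompt plus the most
# recent turns so each request triggers as few searches as possible.
def trim_history(messages, max_turns=2):
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-2 * max_turns:]  # one turn = user + assistant pair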

7. Using sonar-pro for Simple Queries

sonar-pro costs 3-15x more than sonar. Using it for simple factual lookups wastes budget.

// BAD: sonar-pro for a trivial question
await client.chat.completions.create({
  model: "sonar-pro",  // $3 input + $15 output per M tokens
  messages: [{ role: "user", content: "What is the capital of France?" }],
});

// GOOD: match model to complexity
const model = isComplexQuery(query) ? "sonar-pro" : "sonar";
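
isComplexQuery is left undefined above. One possible routing heuristic, sketched in Python (the keyword list and length threshold are illustrative assumptions, not a Perplexity recommendation):

# Hypothetical router: cheap heuristic, not part of the Perplexity API.
COMPLEX_HINTS = ("compare", "analyze", "explain why", "trade-off", "step by step")

def is_complex_query(query: str) -> bool:
    # Long questions or analytical keywords get sonar-pro; everything
    # else falls through to the cheaper sonar model.
    q = query.lower()
    return len(q.split()) > 25 or any(hint in q for hint in COMPLEX_HINTS)

model = "sonar-pro" if is_complex_query(question) else "sonar"  # question: incoming user query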

8. Mixing Allowlist and Denylist in Domain Filter

search_domain_filter supports either an allowlist (include) or a denylist (exclude with a - prefix), but not both in the same request.

// BAD: mixing modes
search_domain_filter: ["python.org", "-reddit.com"]  // ERROR

// GOOD: pick one mode
search_domain_filter: ["python.org", "docs.python.org"]  // Allowlist
// OR
search_domain_filter: ["-reddit.com", "-quora.com"]  // Denylist

9. Not Caching Search Results

Every uncached call performs a web search. At scale, duplicate queries burn budget.

// BAD: same query hits API every time
app.get("/search", async (req, res) => {
  const result = await client.chat.completions.create({ ... });
  res.json(result);
});

// GOOD: cache by query hash
import { createHash } from "node:crypto";
import { LRUCache } from "lru-cache";

const cache = new LRUCache({ max: 1000, ttl: 3600_000 });  // 1h TTL
const hash = (q) => createHash("sha256").update(q).digest("hex");

app.get("/search", async (req, res) => {
  const key = hash(req.query.q);
  if (cache.has(key)) return res.json(cache.get(key));
  const result = await client.chat.completions.create({ ... });
  cache.set(key, result);
  res.json(result);
});

10. Wrong Base URL

The API is at api.perplexity.ai, not api.perplexity.com.

// BAD
baseURL: "https://api.perplexity.com"  // Wrong domain

// GOOD
baseURL: "https://api.perplexity.ai"   // Correct

Code Review Checklist

  • Uses openai package, not fake @perplexity/sdk
  • Base URL is https://api.perplexity.ai
  • max_tokens set on every request
  • Citations parsed from response.citations array
  • search_recency_filter used for time-sensitive queries
  • Caching implemented for repeated queries
  • Model routing: sonar for simple, sonar-pro for complex
  • Conversation history trimmed before sending
  • PII sanitized from queries
  • Domain filter uses only allowlist OR denylist, not both

Error Handling

| Pitfall | Impact | Detection |
| --- | --- | --- |
| No caching | 3-5x cost overrun | Check cache hit rate metric |
| Wrong model | Budget waste | Grep for sonar-pro in simple query paths |
| No max_tokens | Unpredictable costs | Grep for create() calls without max_tokens |
| PII in queries | Privacy violation | Run sanitization check in CI |
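
The checklist and table call for PII sanitization without showing it. A minimal regex-based sketch (the patterns and function name are illustrative and far from exhaustive -- a dedicated PII library is safer in production):

import re

# Hypothetical scrubber covering two common PII shapes; extend as needed.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[PHONE]"),
]

def sanitize_query(query: str) -> str:
    # Replace matches with placeholders before the query leaves your system.
    for pattern, placeholder in PII_PATTERNS:
        query = pattern.sub(placeholder, query)
    return query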

Output

  • Identified anti-patterns in existing code
  • Applied fixes for each pitfall
  • Code review checklist for ongoing quality

Resources