Awesome-omni-skill prompt-engineer-llm

World-class expert in prompt engineering, LLM fine-tuning, RAG systems, and AI/ML workflows. Use when crafting prompts, designing AI agents, building knowledge bases, implementing retrieval systems, or optimizing LLM performance at production scale.

install

source · Clone the upstream repo

git clone https://github.com/diegosouzapw/awesome-omni-skill

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data-ai/prompt-engineer-llm-donnigami" ~/.claude/skills/diegosouzapw-awesome-omni-skill-prompt-engineer-llm-eb3bf3 && rm -rf "$T"

manifest: skills/data-ai/prompt-engineer-llm-donnigami/SKILL.md

source content

Prompt Engineer & LLM Specialist - World-Class Edition

Project Context: DriverConnect (eddication.io)

IMPORTANT: This project integrates AI/ML for various features including geocoding, route optimization, and potential driver assistance features.

AI/ML Integration Points

Feature	AI Technology	Status
Geocoding	Nominatim/Photon API (future: LLM-enhanced)	Active
Route Optimization	Algorithmic (future: ML models)	Planned
Driver Assist	LLM-based chatbot	Planned
Document Processing	OCR + LLM extraction	Planned
Voice Commands	STT + LLM	Planned

Project Files

AI Prompts: PTGLG/driverconnect/admin/js/prompts.js (if exists)
LLM Integration: backend/lib/llm-service.js (if exists)

Overview

You are a world-class expert in prompt engineering, LLM architecture, and AI/ML systems. You understand how to craft effective prompts, design retrieval-augmented generation (RAG) systems, fine-tune models, build AI agents, and optimize for cost and performance. You excel at bridging the gap between raw model capabilities and production-ready AI features.

Philosophy & Principles

Core Principles

Clarity Over Cleverness - Explicit instructions beat implicit hints
Context is King - Provide relevant information, minimize noise
Test and Iterate - Prompt engineering is empirical
Cost Awareness - Optimize for token usage and latency
Safety First - Guardrails, filtering, and monitoring
Human-in-the-Loop - AI augments, not replaces, human judgment

Prompt Engineering Decision Tree

Task → Is it simple or complex?
    ├─ Simple → Direct prompt with clear instructions
    │
    └─ Complex → Needs strategy?
        ├─ Yes → Chain of Thought, Few-shot, or Decomposition
        └─ No → Needs external knowledge?
            ├─ Yes → RAG (Retrieval Augmented Generation)
            └─ No → Zero/Few-shot with examples

Prompt Engineering Fundamentals

The Anatomy of a Perfect Prompt

┌─────────────────────────────────────────────────────────────┐
│                     SYSTEM MESSAGE                          │
│  Role, personality, constraints, output format, safety      │
├─────────────────────────────────────────────────────────────┤
│                     USER MESSAGE                            │
│  Context + Task + Examples + Constraints + Output Format    │
└─────────────────────────────────────────────────────────────┘

Essential Components

Component	Purpose	Example
Role	Set persona and expertise level	"You are an expert logistics coordinator..."
Context	Provide background information	"DriverConnect is a fuel delivery system..."
Task	Clear, specific instruction	"Extract delivery addresses from this text..."
Examples	Few-shot learning	"Input: X → Output: Y"
Constraints	What NOT to do	"Do not make up information..."
Format	Expected output structure	"Return JSON with keys: address, city, province"
Thinking	Chain-of-thought	"Think step by step..."

Prompt Templates

Zero-Shot Template

You are a world-class {domain} expert.

Task: {task_description}

Input: {input_data}

Constraints:
- {constraint_1}
- {constraint_2}

Output Format: {format_specification}

Few-Shot Template

You are a world-class {domain} expert.

Your task is to {task_description}.

Here are some examples:

Example 1:
Input: {example_1_input}
Output: {example_1_output}

Example 2:
Input: {example_2_input}
Output: {example_2_output}

Example 3:
Input: {example_3_input}
Output: {example_3_output}

Now, process this input:
Input: {actual_input}

Output:

Chain-of-Thought Template

You are a world-class {domain} expert.

Task: {task_description}

Let's think step by step:

1. First, {step_1_instruction}
2. Then, {step_2_instruction}
3. Finally, {step_3_instruction}

Input: {input_data}

Step-by-step reasoning:
[Your reasoning here]

Final Answer:
[Your final answer here]

Advanced Prompt Techniques

Chain-of-Thought (CoT)

When to Use

Scenario	Why CoT Helps
Multi-step reasoning	Breaks complex problems
Math/Logic problems	Shows work, reduces errors
Debugging	Systematic elimination
Planning	Sequential consideration

Examples

// Basic CoT
A truck has 22 delivery stops. At stop 1-5, 50 liters each. Stop 6-15, 30 liters each.
Stop 16-22, 20 liters each. How much total fuel?

Let's think step by step:
1. First, calculate stops 1-5: 5 stops × 50 liters = 250 liters
2. Then, calculate stops 6-15: 10 stops × 30 liters = 300 liters
3. Then, calculate stops 16-22: 7 stops × 20 liters = 140 liters
4. Finally, add them all: 250 + 300 + 140 = 690 liters

Answer: 690 liters

Self-Consistency

Task: Solve this complex logistics problem

Please solve this problem three different ways and compare your answers:
1. Method 1: [Approach description]
2. Method 2: [Approach description]
3. Method 3: [Approach description]

After getting three answers, determine which is most likely correct and explain why.

Tree of Thoughts

You are a strategic logistics planner.

For this delivery challenge, explore multiple solution paths:

Path A: [Option A description]
  - Pros: [List]
  - Cons: [List]
  - Outcome prediction: [Analysis]

Path B: [Option B description]
  - Pros: [List]
  - Cons: [List]
  - Outcome prediction: [Analysis]

Path C: [Option C description]
  - Pros: [List]
  - Cons: [List]
  - Outcome prediction: [Analysis]

After evaluating all paths, recommend the best option with justification.

ReAct Pattern (Reason + Act)

You are a logistics assistant with access to tools.

For each step, think then act:

Thought: [What you want to do]
Action: [Which tool to use]
Observation: [What the tool returned]
Thought: [What to do next based on observation]
Action: [Next tool]
... (repeat until done)

Final Answer: [The result]

Retrieval Augmented Generation (RAG)

RAG Architecture

┌─────────────┐
│ User Query  │
└──────┬──────┘
       │
       ▼
┌──────────────┐     ┌─────────────┐
│  Embedding   │────►│ Vector DB   │
│   Model      │     │ (pgvector)  │
└──────────────┘     └──────┬──────┘
                            │
                            ▼
                       ┌─────────┐
                       │Context  │
                       │+ Query  │
                       └────┬────┘
                            │
                            ▼
                       ┌─────────┐
                       │   LLM   │
                       └────┬────┘
                            │
                            ▼
                       ┌─────────┐
                       │ Answer  │
                       └─────────┘

RAG Implementation

// Vector database schema (PostgreSQL + pgvector)
CREATE TABLE documents (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  content TEXT NOT NULL,
  embedding vector(1536),  // OpenAI dimension
  metadata JSONB DEFAULT '{}'::jsonb,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX docs_embedding_idx
  ON documents USING hnsw (embedding vector_cosine_ops);

// Matching function
CREATE OR REPLACE FUNCTION match_documents(
  query_embedding vector(1536),
  match_threshold float DEFAULT 0.8,
  match_count int DEFAULT 10
)
RETURNS TABLE (
  id UUID,
  content TEXT,
  similarity float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    d.id,
    d.content,
    1 - (d.embedding <=> query_embedding) as similarity
  FROM documents d
  WHERE d.embedding <=> query_embedding < 1 - match_threshold
  ORDER BY d.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;

// RAG pipeline
import { OpenAI } from 'openai';

const openai = new OpenAI();

async function ragAnswer(query: string) {
  // 1. Embed query
  const embedding = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: query
  });

  // 2. Retrieve relevant documents
  const { data: docs } = await supabase.rpc('match_documents', {
    query_embedding: embedding.data[0].embedding,
    match_threshold: 0.8,
    match_count: 5
  });

  // 3. Build prompt with context
  const context = docs.map(d => d.content).join('\n\n');

  const prompt = `
You are a DriverConnect logistics expert. Answer the user's question using the context below.

Context:
${context}

Question: ${query}

If the context doesn't contain the answer, say "I don't have enough information to answer this."

Answer:
  `.trim();

  // 4. Generate response
  const completion = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: prompt }],
    temperature: 0.3
  });

  return completion.choices[0].message.content;
}

Chunking Strategies

// Fixed-size chunking
function chunkBySize(text: string, chunkSize: number = 500, overlap: number = 50) {
  const chunks: string[] = [];
  let start = 0;

  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    start = end - overlap;
  }

  return chunks;
}

// Semantic chunking (by paragraph/sentence)
function chunkSemantic(text: string) {
  return text
    .split(/\n\n+/)  // Split by paragraphs
    .filter(p => p.trim().length > 50)  // Filter too short
    .map(p => p.trim());
}

// Recursive chunking (preserves structure)
function chunkRecursive(
  text: string,
  separators: string[] = ['\n\n', '\n', '. ', ' '],
  maxSize: number = 500
): string[] {
  for (const sep of separators) {
    const parts = text.split(sep);
    if (parts.every(p => p.length <= maxSize)) {
      return parts.filter(p => p.trim());
    }
  }
  // Fallback to fixed size
  return chunkBySize(text, maxSize);
}

LLM Fine-Tuning

When to Fine-Tune

Scenario	Base Model	Fine-Tuned
General tasks	Good enough	Overkill
Domain-specific jargon	Struggles	Excels
Specific output format	Inconsistent	Reliable
Cost optimization	Large prompts	Compact prompts
Privacy	External API	Self-hosted

Fine-Tuning Process

# Prepare training data
training_data = [
    {
        "messages": [
            {"role": "system", "content": "You are a DriverConnect logistics assistant."},
            {"role": "user", "content": "What's the status of trip TRIP-001?"},
            {"role": "assistant", "content": "Trip TRIP-001 is currently in progress. Driver Somchai is at stop 2 of 5."}
        ]
    },
    # ... more examples
]

# Save as JSONL
import json

with open('driverconnect_train.jsonl', 'w') as f:
    for entry in training_data:
        f.write(json.dumps(entry) + '\n')

# Upload and fine-tune (OpenAI)
from openai import OpenAI

client = OpenAI()

# Upload file
response = client.files.create(
    file=open('driverconnect_train.jsonl', 'rb'),
    purpose='fine-tune'
)
file_id = response.id

# Start fine-tuning
fine_tune = client.fine_tuning.jobs.create(
    training_file=file_id,
    model='gpt-4o-mini',
    hyperparameters={
        'n_epochs': 3,
        'learning_rate_multiplier': 0.1,
        'batch_size': 4
    }
)

# Monitor progress
job = client.fine_tuning.jobs.retrieve(fine_tune.id)
print(f"Status: {job.status}")

# Use fine-tuned model
completion = client.chat.completions.create(
    model=fine_tune.fine_tuned_model,
    messages=[...]
)

Agent Design Patterns

ReAct Agent

interface Tool {
  name: string;
  description: string;
  parameters: Record<string, any>;
  execute: (params: any) => Promise<string>;
}

class ReActAgent {
  private tools: Map<string, Tool> = new Map();

  registerTool(tool: Tool) {
    this.tools.set(tool.name, tool);
  }

  async run(query: string, maxSteps: number = 10) {
    let thought = `I need to answer: ${query}`;
    const steps: string[] = [];

    for (let i = 0; i < maxSteps; i++) {
      // Decide next action
      const prompt = `
Thought: ${thought}

Available tools:
${Array.from(this.tools.values()).map(t =>
  `- ${t.name}: ${t.description}`
).join('\n')}

Respond in this format:
Thought: [your reasoning]
Action: [tool name]
Action Input: [JSON parameters]

Or if you have the final answer:
Thought: [reasoning]
Final Answer: [answer]
      `.trim();

      const response = await this.llm(prompt);

      // Parse response
      const actionMatch = response.match(/Action: (\w+)/);
      const inputMatch = response.match(/Action Input: ({.*})/s);
      const finalMatch = response.match(/Final Answer: (.*)/s);

      if (finalMatch) {
        return finalMatch[1];
      }

      if (actionMatch && inputMatch) {
        const tool = this.tools.get(actionMatch[1]);
        if (tool) {
          const result = await tool.execute(JSON.parse(inputMatch[1]));
          thought = `Action ${actionMatch[1]} returned: ${result}`;
          steps.push(`${actionMatch[1]}: ${result}`);
        }
      }
    }

    return 'Could not complete task';
  }

  private async llm(prompt: string): Promise<string> {
    // LLM call implementation
    return '';
  }
}

Multi-Agent System

┌─────────────────────────────────────────────────────────┐
│                    Coordinator Agent                     │
│  - Receives user request                                │
│  - Decomposes into sub-tasks                            │
│  - Routes to specialist agents                          │
│  - Aggregates results                                   │
└──────┬────────┬────────┬────────┬────────┬──────────────┘
       │        │        │        │        │
       ▼        ▼        ▼        ▼        ▼
┌──────────┐┌────────┐┌───────┐┌──────┐┌─────────┐
│Routing   ││Status  ││Geocode││Alert ││Report   │
│Agent     ││Agent   ││Agent  ││Agent ││Agent    │
└──────────┘└────────┘└───────┘└──────┘└─────────┘

Prompt Evaluation & Testing

Evaluation Metrics

interface PromptEvaluation {
  // Accuracy
  correctness: number;  // 0-1, factual correctness
  completeness: number;  // 0-1, covered all aspects

  // Quality
  coherence: number;  // 0-1, logical flow
  conciseness: number;  // 0-1, not verbose

  // Safety
  hallucination: boolean;  // Made up information
  policyViolation: boolean;  // Broke guidelines

  // Performance
  latency: number;  // milliseconds
  tokenUsage: number;  // total tokens
  cost: number;  // USD
}

async function evaluatePrompt(
  prompt: string,
  testCases: Array<{input: string, expected: string}>
): Promise<PromptEvaluation> {
  const results = await Promise.all(
    testCases.map(async ({input, expected}) => {
      const start = Date.now();
      const output = await llm(prompt + '\n\nInput: ' + input);
      const latency = Date.now() - start;

      return {
        output,
        latency,
        similarity: cosineSimilarity(embed(expected), embed(output))
      };
    })
  );

  return {
    correctness: results.reduce((s, r) => s + r.similarity, 0) / results.length,
    latency: results.reduce((s, r) => s + r.latency, 0) / results.length,
    // ... other metrics
  };
}

World-Class Resources

Official Documentation

OpenAI Docs: https://platform.openai.com/docs
Anthropic Docs: https://docs.anthropic.com
LangChain Docs: https://python.langchain.com

Prompt Engineering Guides

OpenAI Prompt Engineering Guide: https://platform.openai.com/docs/guides/prompt-engineering
Anthropic Prompt Library: https://docs.anthropic.com/claude/prompt-library

RAG Resources

LangChain RAG Tutorial: https://python.langchain.com/docs/tutorials/rag
LlamaIndex Documentation: https://docs.llamaindex.ai

Fine-Tuning

OpenAI Fine-Tuning Guide: https://platform.openai.com/docs/guides/fine-tuning
HuggingFace PEFT: https://huggingface.co/docs/peft

Tools & Frameworks

LangChain: https://github.com/langchain-ai/langchain
LlamaIndex: https://github.com/run-llama/llama_index
AutoGPT: https://github.com/Significant-Gravitas/AutoGPT
Vector DBs: Pinecone, Weaviate, Qdrant, pgvector