Awesome-Agent-Skills-for-Empirical-Research llm-aiops-guide

Papers on LLMs for IT operations and AIOps research

install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/domains/cs/llm-aiops-guide" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-llm-aiops-guide && rm -rf "$T"
manifest: skills/43-wentorai-research-plugins/skills/domains/cs/llm-aiops-guide/SKILL.md
source content

LLM for AIOps Guide

Overview

A curated collection of research on applying LLMs to AIOps (AI for IT operations), covering log analysis, anomaly detection, incident management, root cause analysis, and automated remediation. It tracks how foundation models are turning traditional rule-based operations tooling into adaptive systems. It is aimed at CS researchers working at the intersection of systems, NLP, and operations.

Research Areas

LLM for AIOps
├── Log Analysis
│   ├── Log parsing (template extraction)
│   ├── Anomaly detection (from log sequences)
│   ├── Log summarization
│   └── Root cause from logs
├── Incident Management
│   ├── Incident triage and routing
│   ├── Severity classification
│   ├── Similar incident retrieval
│   └── Resolution recommendation
├── Root Cause Analysis
│   ├── Topology-aware diagnosis
│   ├── Multi-signal correlation
│   └── Causal inference
├── Monitoring & Alerting
│   ├── Metric anomaly detection
│   ├── Alert correlation
│   ├── Noise reduction
│   └── Capacity planning
└── Automated Remediation
    ├── Runbook generation
    ├── Script generation
    ├── Self-healing systems
    └── Change impact analysis
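
To make the "Log parsing (template extraction)" entry concrete, here is a minimal Python sketch of the task itself: variable fields in raw log lines are masked so that lines produced by the same logging statement collapse into one template. The masking rules, the placeholder tokens (`<IP>`, `<HEX>`, `<NUM>`), and the function names are assumptions made for illustration. This is a plain regex baseline, not the method of any paper listed in this guide; LLM-based parsers aim to infer such templates without hand-written rules.

```python
import re
from collections import Counter

# Hypothetical masking rules, applied in order. These are illustrative only;
# real parsers learn or infer which tokens are variable.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"), "<IP>"),  # IPv4, optional port
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),                   # hex literals
    (re.compile(r"\b\d+\b"), "<NUM>"),                              # standalone integers
]


def extract_template(line: str) -> str:
    """Replace variable fields in one log line with placeholder tokens."""
    for pattern, token in _MASKS:
        line = pattern.sub(token, line)
    return line


def group_by_template(lines):
    """Count how many log lines map to each extracted template."""
    return Counter(extract_template(line) for line in lines)


if __name__ == "__main__":
    sample = [
        "Connection from 10.0.0.1:5432 closed after 120 ms",
        "Connection from 10.0.0.2:5433 closed after 98 ms",
        "Disk sda1 usage at 91 percent",
    ]
    for template, count in group_by_template(sample).items():
        print(count, template)
```

Template counts like these are a common input to the downstream tasks in the tree, for example flagging rarely seen templates as anomaly candidates.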

Key Papers

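| Paper    | Year | Focus                                  |
|----------|------|----------------------------------------|
| LogPPT   | 2023 | Few-shot log parsing with prompt tuning |
| OpsEval  | 2024 | Benchmark for evaluating LLMs in AIOps |
| D-Bot    | 2024 | LLM-based database diagnosis           |
| RCAgent  | 2024 | Agent for root cause analysis          |
| LogAgent | 2024 | Autonomous log analysis agent          |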

Use Cases

  1. Literature tracking: Follow LLM-AIOps research evolution
  2. System design: Learn intelligent operations patterns
  3. Benchmark comparison: Evaluate AIOps approaches
  4. Research planning: Identify under-explored AIOps problems
  5. Industry applications: Bridge research to production AIOps

References