Awesome-Agent-Skills-for-Empirical-Research llm-aiops-guide

Papers on LLMs for IT operations and AIOps research

install

source · Clone the upstream repo

git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/domains/cs/llm-aiops-guide" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-llm-aiops-guide && rm -rf "$T"

manifest: skills/43-wentorai-research-plugins/skills/domains/cs/llm-aiops-guide/SKILL.md

LLM for AIOps Guide

Overview

A curated collection of research on applying LLMs to AI for IT operations (AIOps): log analysis, anomaly detection, incident management, root cause analysis, and automated remediation. It tracks how foundation models are moving operations tooling from static, rule-based pipelines toward adaptive systems. Relevant for CS researchers working at the intersection of systems, NLP, and operations.

Research Areas

LLM for AIOps
├── Log Analysis
│   ├── Log parsing (template extraction)
│   ├── Anomaly detection (from log sequences)
│   ├── Log summarization
│   └── Root cause from logs
├── Incident Management
│   ├── Incident triage and routing
│   ├── Severity classification
│   ├── Similar incident retrieval
│   └── Resolution recommendation
├── Root Cause Analysis
│   ├── Topology-aware diagnosis
│   ├── Multi-signal correlation
│   └── Causal inference
├── Monitoring & Alerting
│   ├── Metric anomaly detection
│   ├── Alert correlation
│   ├── Noise reduction
│   └── Capacity planning
└── Automated Remediation
    ├── Runbook generation
    ├── Script generation
    ├── Self-healing systems
    └── Change impact analysis
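
To make the "Log parsing (template extraction)" entry concrete, here is a minimal sketch of what template extraction produces. It is a plain regex heuristic for illustration only, not the method of LogPPT or any other paper listed in this guide. The masking rules, placeholder names, and sample log lines are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical masking rules (assumptions for illustration).
# Order matters: IP addresses are masked before plain numbers.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def extract_template(line: str) -> str:
    """Replace variable-looking tokens with placeholders to get a template."""
    for pattern, placeholder in MASKS:
        line = pattern.sub(placeholder, line)
    return line


def count_templates(lines):
    """Group raw log lines by their extracted template and count each group."""
    return Counter(extract_template(line) for line in lines)


# Made-up sample log lines, not taken from any real dataset.
sample_logs = [
    "Connection from 10.0.0.5 closed after 120 ms",
    "Connection from 10.0.0.9 closed after 87 ms",
    "Disk usage at 91 percent on node 3",
]

templates = count_templates(sample_logs)
for template, count in templates.most_common():
    print(count, template)
```

The first two sample lines collapse into one template, which is the property that downstream steps such as anomaly detection over log sequences typically rely on. LLM-based parsers aim to recover such templates without hand-written rules like the ones assumed here.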

Key Papers

Paper	Year	Focus
LogPPT	2023	Few-shot log parsing with prompt tuning
OpsEval	2024	Benchmark for evaluating LLMs in AIOps
D-Bot	2024	LLM-based database diagnosis
RCAgent	2024	Agent for root cause analysis
LogAgent	2024	Autonomous log analysis agent

Use Cases

Literature tracking: Follow LLM-AIOps research evolution
System design: Learn intelligent operations patterns
Benchmark comparison: Evaluate AIOps approaches
Research planning: Identify under-explored AIOps problems
Industry applications: Bridge research to production AIOps