# Awesome-Agent-Skills-for-Empirical-Research: ai-security-papers-guide

AI security papers from the top-4 security conferences.

## Install

Clone the upstream repo:

```bash
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
```

Claude Code: install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) &&
  git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" &&
  mkdir -p ~/.claude/skills &&
  cp -r "$T/skills/43-wentorai-research-plugins/skills/domains/cs/ai-security-papers-guide" \
    ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-ai-security-paper &&
  rm -rf "$T"
```

Manifest: `skills/43-wentorai-research-plugins/skills/domains/cs/ai-security-papers-guide/SKILL.md`
# AI Security Papers Guide (BIG4 Venues)
## Overview
A curated collection of AI security papers from the top-4 security conferences: IEEE S&P, ACM CCS, USENIX Security, and NDSS. The collection covers adversarial attacks, model stealing, data poisoning, privacy attacks, deepfake detection, and LLM security. It is organized by year and venue and includes only peer-reviewed work from these four venues.
## Venues
| Venue | Full Name | Focus |
|---|---|---|
| S&P | IEEE Symposium on Security and Privacy | Broad security + privacy |
| CCS | ACM Conference on Computer and Communications Security | Systems security |
| USENIX | USENIX Security Symposium | Systems + network security |
| NDSS | Network and Distributed System Security Symposium | Network + distributed-system security |
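If you script against the collection, a small lookup table mirroring the venue table above can normalize venue tags. This is a minimal sketch: the `VENUES` dict and `expand_venue` helper are illustrative, not part of the upstream skill.

```python
# Hypothetical venue metadata mirroring the table above.
VENUES = {
    "S&P": "IEEE Symposium on Security and Privacy",
    "CCS": "ACM Conference on Computer and Communications Security",
    "USENIX": "USENIX Security Symposium",
    "NDSS": "Network and Distributed System Security Symposium",
}

def expand_venue(tag: str) -> str:
    """Resolve a short venue tag like 'S&P 2024' to its full name."""
    abbrev = tag.rsplit(" ", 1)[0]  # drop a trailing year if present
    return VENUES.get(abbrev, tag)

print(expand_venue("S&P 2024"))
# -> IEEE Symposium on Security and Privacy
```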
## Topic Categories
```
AI Security (BIG4)
├── Adversarial ML
│   ├── Evasion attacks (adversarial examples)
│   ├── Poisoning attacks (backdoors, trojans)
│   ├── Model stealing (extraction, distillation)
│   └── Defenses (certified robustness, detection)
├── Privacy Attacks
│   ├── Membership inference
│   ├── Model inversion
│   ├── Attribute inference
│   └── Training data extraction
├── LLM Security
│   ├── Prompt injection
│   ├── Jailbreaking
│   ├── Data leakage
│   └── Alignment attacks
├── Deepfakes
│   ├── Generation methods
│   ├── Detection techniques
│   └── Watermarking
└── Federated Learning Security
    ├── Byzantine attacks
    ├── Gradient leakage
    └── Secure aggregation
```
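The same taxonomy can be expressed as a plain nested structure for tagging papers programmatically. The `TAXONOMY` dict and `category_of` helper below are a hedged sketch, not an artifact shipped with the guide.

```python
# The taxonomy above as a nested dict, handy for tagging paper records.
TAXONOMY = {
    "Adversarial ML": [
        "Evasion attacks", "Poisoning attacks",
        "Model stealing", "Defenses",
    ],
    "Privacy Attacks": [
        "Membership inference", "Model inversion",
        "Attribute inference", "Training data extraction",
    ],
    "LLM Security": [
        "Prompt injection", "Jailbreaking",
        "Data leakage", "Alignment attacks",
    ],
    "Deepfakes": [
        "Generation methods", "Detection techniques", "Watermarking",
    ],
    "Federated Learning Security": [
        "Byzantine attacks", "Gradient leakage", "Secure aggregation",
    ],
}

def category_of(subtopic: str) -> str | None:
    """Return the top-level category containing a subtopic, if any."""
    for category, subtopics in TAXONOMY.items():
        if subtopic in subtopics:
            return category
    return None

print(category_of("Jailbreaking"))  # -> LLM Security
```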
## Key Papers by Year
```python
# Recent highlights
papers_2024_2025 = [
    {"title": "Not What You've Signed Up For: "
              "Compromising Real-World LLM-Integrated Applications",
     "venue": "S&P 2024", "topic": "LLM security"},
    {"title": "Prompt Stealing Attacks Against "
              "Text-to-Image Generation Models",
     "venue": "S&P 2024", "topic": "Prompt extraction"},
    {"title": "Backdoor Attacks on Language Models",
     "venue": "CCS 2024", "topic": "NLP backdoors"},
    {"title": "Membership Inference in LLMs",
     "venue": "USENIX 2024", "topic": "Privacy"},
]

for p in papers_2024_2025:
    print(f"[{p['venue']}] {p['title']}")
    print(f"  Topic: {p['topic']}")
```
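To slice the list further, the same `papers_2024_2025` records can be grouped by any field. Grouping by venue, as sketched below, is one illustrative option.

```python
from collections import defaultdict

# Group the papers_2024_2025 records from the snippet above by venue.
by_venue = defaultdict(list)
for p in papers_2024_2025:
    by_venue[p["venue"]].append(p["title"])

for venue, titles in sorted(by_venue.items()):
    print(venue)
    for title in titles:
        print(f"  - {title}")
```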
## Research Trends
### Emerging Areas (2024-2025)

1. **LLM security** — Jailbreaking, prompt injection, agent attacks
2. **Supply chain attacks** — Poisoned models, malicious packages
3. **Multi-modal attacks** — Cross-modal adversarial examples
4. **Agent security** — Attacks on LLM-based autonomous systems
5. **Watermarking** — LLM output detection, IP protection
6. **Unlearning** — Machine unlearning verification and attacks
## Use Cases
- Security research: Find state-of-the-art attack/defense methods
- Threat modeling: Understand AI system vulnerabilities
- Literature review: Systematic coverage of BIG4 AI security; see the search sketch below
- Course material: Graduate-level AI security curriculum
- Red teaming: Learn evaluation techniques for AI systems
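For the literature-review use case, a keyword filter over the record fields is often enough. A minimal sketch, reusing the `papers_2024_2025` list from the Key Papers section; the `search` helper is hypothetical.

```python
def search(papers, keyword):
    """Case-insensitive substring match against title and topic fields."""
    kw = keyword.lower()
    return [p for p in papers
            if kw in p["title"].lower() or kw in p["topic"].lower()]

# Example: pull everything touching LLMs out of the 2024-2025 highlights.
for hit in search(papers_2024_2025, "llm"):
    print(f"[{hit['venue']}] {hit['title']}")
```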