Claude-skill-registry r-anti-slop

install

source · Clone the upstream repo

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/anti-slop" ~/.claude/skills/majiayu000-claude-skill-registry-r-anti-slop && rm -rf "$T"

manifest: skills/data/anti-slop/SKILL.md

R Anti-Slop: Stop Writing `df <- data`

When to Use This

Use this for:

✓ Any R code leaving your machine (analysis, packages, scripts)
✓ AI-generated code review (catches `df`, `result`, missing `::`)
✓ CRAN submissions (reviewers tend to push back on generic documentation)
✓ Team code standards

Skip for:

Quick console experiments (though habits form fast)
Legacy code you can't touch
Bioconductor or other style guides that override this

Quick Example

Before (AI Slop):

# Load the library
library(dplyr)

# Read the data
df <- read.csv("data.csv")

# Filter the data
result <- df %>% filter(x > 0)

After (Anti-Slop):

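# return() is only valid inside a function, so the steps are wrapped in one
# (the function and argument names are illustrative)
load_active_customers <- function(customer_path = "data/customers.csv") {
  customer_data <- readr::read_csv(customer_path)

  active_customers <- customer_data |>
    dplyr::filter(status == "active", revenue > 0)

  return(active_customers)
}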

What changed:

✓ Descriptive names (`customer_data` not `df`)
✓ Namespace qualification (`dplyr::`, `readr::`)
✓ Native pipe (`|>` not `%>%`)
✓ No obvious comments
✓ Explicit return

When to Use What

| If you need to... | Do this | Details |
|---|---|---|
| Name variables | Use `snake_case`, no `df`/`data`/`result` | reference/naming.md |
| Call tidyverse functions | Always use `::` (e.g., `dplyr::filter()`) | reference/tidyverse.md |
| Return from function | Always explicit `return()` statement | reference/naming.md |
| Write pipe chains | Use `\|>`, break at 8+ operations | reference/tidyverse.md |
| Document functions | Specific `@param`, `@return`, no circular text | reference/documentation.md |
| Handle missing data | Explicit strategy + report data loss | reference/statistical-rigor.md |
| Validate data | Check assumptions with `stopifnot()` | reference/statistical-rigor.md |
| Format code | Use `styler::style_file()` | reference/tidyverse.md |
| Check code quality | Use `lintr::lint()` | reference/tidyverse.md |
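
The "Handle missing data" and "Validate data" rows have no example elsewhere in this document, so here is a minimal base R sketch. The data frame, its column names, and the choice of complete-case analysis are assumptions made for illustration; reference/statistical-rigor.md may recommend a different strategy.

```r
# Hypothetical input: one row per customer, some revenue values missing.
customer_revenue <- data.frame(
  customer_id = 1:6,
  revenue = c(120, NA, 75, 0, NA, 310)
)

# Check assumptions before transforming.
stopifnot(
  is.data.frame(customer_revenue),
  c("customer_id", "revenue") %in% names(customer_revenue),
  !anyDuplicated(customer_revenue$customer_id)
)

# Explicit strategy (assumed here): complete-case analysis on revenue.
rows_before <- nrow(customer_revenue)
complete_revenue <- customer_revenue[!is.na(customer_revenue$revenue), ]
rows_dropped <- rows_before - nrow(complete_revenue)

# Report the data loss instead of dropping rows silently.
message(sprintf(
  "Dropped %d of %d rows (%.1f%%) with missing revenue",
  rows_dropped, rows_before, 100 * rows_dropped / rows_before
))

# Validate again after the transformation.
stopifnot(
  !anyNA(complete_revenue$revenue),
  all(complete_revenue$revenue >= 0)
)
```

Counting rows before and after the filter makes the loss visible in the log, which is the point of "report data loss": a reader can judge whether the dropped share is acceptable.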

Core Workflow

5-Step Quality Check

Namespace qualification - All external functions use `::`

# Good
dplyr::filter(data, x > 0)
# Bad
filter(data, x > 0)

Explicit returns - Every function has `return()`

# Good
add_one <- function(x) {
  incremented <- x + 1
  return(incremented)
}
# Bad
add_one <- function(x) {
  x + 1
}

Naming conventions - All objects use `snake_case`

# Good
customer_lifetime_value <- calculate_clv(customer_transactions)
# Bad
df <- calculate_clv(data)
customerLifetimeValue <- calculate_clv(data)

Documentation quality - No generic descriptions

# Good
#' @param deaths Data frame with `age_group` and `count` columns
# Bad
#' @param data The data

Code formatting - Run styler and lintr

styler::style_file("script.R")
lintr::lint("script.R")

Quick Reference Checklist

Before committing R code, verify:

All external functions qualified with `::`
All functions have explicit `return()`
All objects use `snake_case`
No generic names (`df`, `data`, `result`, `temp`)
Pipes (`|>`) have a space before them and end the line
Long pipelines (>8 ops) broken into named steps
Complex operations have WHY comments
Data validated after transformations
Seeds set before random operations
Uncertainty reported (SE, CI) for statistical models
No `attach()` calls
No right-hand assignment (`->`)
Roxygen documentation is specific
Examples are realistic and run
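
Two checklist items, "Seeds set before random operations" and "Uncertainty reported (SE, CI)", are not illustrated elsewhere in this document. The sketch below shows one way to satisfy both using only base R and stats. The simulated data, the variable names, and the seed value are assumptions chosen for the example.

```r
# Set the seed before any random operation so the run is reproducible.
set.seed(2024)

# Simulated data (hypothetical): revenue rises with ad spend, plus noise.
ad_spend <- runif(100, min = 0, max = 50)
revenue <- 200 + 3 * ad_spend + rnorm(100, sd = 20)
campaign_data <- data.frame(ad_spend = ad_spend, revenue = revenue)

revenue_model <- lm(revenue ~ ad_spend, data = campaign_data)

# Report estimates together with standard errors and 95% confidence intervals.
coefficient_table <- summary(revenue_model)$coefficients
confidence_intervals <- confint(revenue_model, level = 0.95)

uncertainty_report <- data.frame(
  term = rownames(coefficient_table),
  estimate = coefficient_table[, "Estimate"],
  std_error = coefficient_table[, "Std. Error"],
  ci_lower = confidence_intervals[, 1],
  ci_upper = confidence_intervals[, 2],
  row.names = NULL
)
print(uncertainty_report)
```

Keeping the estimate, standard error, and interval in one table makes it harder to quote a point estimate without its uncertainty.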

Common Workflows

Workflow 1: Clean Up AI-Generated R Script

Context: AI generated an analysis script with generic patterns.

Steps:

Run detection script

Rscript toolkit/scripts/detect_slop.R analysis.R --verbose

Fix high-priority issues first

# Replace df, data, result with descriptive names
# Before
df <- readr::read_csv("data.csv")
result <- df %>% filter(x > 0)

# After
customer_data <- readr::read_csv("data/customers.csv")
active_customers <- customer_data |> dplyr::filter(status == "active")

Add namespace qualification

# Before
data %>% filter(x > 0) %>% summarize(mean(y))

# After
data |>
  dplyr::filter(x > 0) |>
  dplyr::summarize(mean_y = mean(y))

Add explicit returns

# Before
calculate_rate <- function(numerator, denominator) {
  numerator / denominator
}

# After
calculate_rate <- function(numerator, denominator) {
  rate <- numerator / denominator
  return(rate)
}

Break long pipes

# Before (12 operations in one chain)
result <- data |>
  filter(...) |> mutate(...) |> group_by(...) |>
  summarize(...) |> arrange(...) |> [7 more ops]

# After
clean_data <- data |>
  dplyr::filter(!is.na(value)) |>
  dplyr::mutate(category = categorize(value))

summary_stats <- clean_data |>
  dplyr::group_by(category) |>
  dplyr::summarize(mean_val = mean(value))

Format and validate

styler::style_file("analysis.R")
lintr::lint("analysis.R")

Expected outcome: Score drops from 60+ to <20
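
This document does not show what the detection script checks, so the following is a hypothetical, much smaller detector that illustrates the kind of pattern matching involved. It is not the registry's `detect_slop.R`. The function name, the pattern list, and the report layout are all assumptions.

```r
# Hypothetical minimal detector; NOT the registry's detect_slop.R.
find_slop_lines <- function(code_lines) {
  slop_patterns <- c(
    generic_name = "^\\s*(df|data|result|temp)\\s*(<-|=)(?!=)",
    magrittr_pipe = "%>%",
    right_assignment = "->>?\\s*[A-Za-z.]",
    attach_call = "\\battach\\("
  )

  # One small data frame of matches per pattern, then stack them.
  findings <- lapply(names(slop_patterns), function(pattern_name) {
    matched_lines <- grep(slop_patterns[[pattern_name]], code_lines, perl = TRUE)
    data.frame(
      line = matched_lines,
      pattern = rep(pattern_name, length(matched_lines)),
      stringsAsFactors = FALSE
    )
  })

  slop_report <- do.call(rbind, findings)
  slop_report <- slop_report[order(slop_report$line), ]
  rownames(slop_report) <- NULL
  return(slop_report)
}

sample_script <- c(
  "df <- read.csv(\"data.csv\")",
  "result <- df %>% filter(x > 0)",
  "customer_data <- readr::read_csv(\"data/customers.csv\")"
)
slop_report <- find_slop_lines(sample_script)
print(slop_report)
```

Line-based regular expressions will also flag matches inside strings and comments, so a report like this is a starting point for review and not a verdict.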

Workflow 2: Fix Generic Package Documentation

Context: R package has generic roxygen documentation.

Steps:

Identify generic patterns

# Bad
#' Process Data
#'
#' @description This function processes the data.
#' @param data The data.
#' @return The result.

Make description specific

# Good
#' Calculate age-adjusted mortality rates
#'
#' Computes mortality rates per 100,000 population, standardized to the
#' 2000 US Census age distribution using direct standardization.

Describe parameter structure

# Good
#' @param deaths Data frame with columns `age_group` and `count`.
#' @param population Data frame with columns `age_group` and `pop_size`.

Specify return value

# Good
#' @return A tibble with columns:
#'   \describe{
#'     \item{county}{County FIPS code}
#'     \item{rate}{Age-adjusted rate per 100,000}
#'     \item{se}{Standard error of the rate}
#'   }

Add realistic examples

# Good
#' @examples
#' counties <- data.frame(
#'   county = c("A", "B"),
#'   deaths = c(150, 200),
#'   population = c(50000, 80000)
#' )
#'
#' adjust_rates(counties, rate_per = 100000)
#' #> # A tibble: 2 x 3
#' #>   county  rate    se
#' #> 1 A       312.  25.4
#' #> 2 B       258.  18.2

Expected outcome: Documentation that teaches, not restates
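
To show the pieces from the steps above on a single function, here is a sketch of a fully documented helper. It is a hypothetical `calculate_crude_rate()`, not the `adjust_rates()` function from the example: it computes crude rates only, with no age standardization, and its Poisson standard error is an assumption of this sketch.

```r
#' Calculate crude mortality rates per population unit
#'
#' Divides death counts by population size and scales to `rate_per`.
#' Standard errors assume deaths follow a Poisson distribution. This is a
#' crude rate; it does not apply age standardization.
#'
#' @param county_counts Data frame with columns `county` (character),
#'   `deaths` (non-negative count), and `population` (positive count).
#' @param rate_per Population unit for the rate. Defaults to 100,000.
#'
#' @return A data frame with one row per county and columns:
#'   \describe{
#'     \item{county}{County identifier, copied from the input}
#'     \item{rate}{Crude deaths per `rate_per` population}
#'     \item{se}{Poisson standard error of the rate}
#'   }
#'
#' @examples
#' counties <- data.frame(
#'   county = c("A", "B"),
#'   deaths = c(150, 200),
#'   population = c(50000, 80000)
#' )
#' calculate_crude_rate(counties, rate_per = 100000)
calculate_crude_rate <- function(county_counts, rate_per = 100000) {
  stopifnot(
    is.data.frame(county_counts),
    c("county", "deaths", "population") %in% names(county_counts),
    county_counts$deaths >= 0,
    county_counts$population > 0
  )

  crude_rates <- data.frame(
    county = county_counts$county,
    rate = county_counts$deaths / county_counts$population * rate_per,
    se = sqrt(county_counts$deaths) / county_counts$population * rate_per
  )
  return(crude_rates)
}

counties <- data.frame(
  county = c("A", "B"),
  deaths = c(150, 200),
  population = c(50000, 80000)
)
county_rates <- calculate_crude_rate(counties)
```

Each `@param` names the columns the function actually reads, and `@return` names the columns it produces, so the documentation and the `stopifnot()` checks describe the same contract.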

Workflow 3: Prepare Package for CRAN

Context: Final checks before CRAN submission.

Steps:

Run all quality checks

# Standard checks
devtools::check()

# Anti-slop checks
lapply(list.files("R", full.names = TRUE), function(f) {
  system(paste("Rscript toolkit/scripts/detect_slop.R", f))
})

Fix documentation
- Check all `@param` descriptions are specific
- Verify `@examples` run and are realistic
- Ensure `@return` describes structure

Validate code quality

# Format all files
styler::style_dir("R/")

# Check lints
lintr::lint_package()

Check CRAN-specific requirements
- Validate URLs in DESCRIPTION and documentation
- Check examples run in < 5 seconds
- Verify package structure meets CRAN standards

Expected outcome: Clean `R CMD check` with no slop patterns

Mandatory Rules Summary

1. Namespace Qualification

ALWAYS use `::` for external packages

Exceptions (don't need `::`):

Base R: `mean()`, `sum()`, `log()`, etc.
stats: `lm()`, `glm()`, `t.test()`, etc.
utils: `head()`, `tail()`, `str()`, etc.
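
One reason for the rule is name masking: the same function name can exist in several packages, and a bare call resolves to whichever one comes first on the search path. The sketch below uses `filter` as the example. It qualifies the stats and utils calls explicitly so the demonstration is unambiguous, and the `monthly_sales` values are made up.

```r
# `filter` exists in stats (attached by default) and in dplyr (if attached).
# utils::find() lists every attached package that defines the name.
filter_locations <- utils::find("filter")
print(filter_locations)

# Without dplyr attached, a bare filter() call is stats::filter(), which
# applies a linear filter to a numeric series. It does not subset rows.
monthly_sales <- c(10, 20, 30, 40)
moving_average <- stats::filter(monthly_sales, rep(1 / 2, 2), sides = 1)
print(moving_average)
```

Writing `dplyr::filter()` removes the dependence on which packages happen to be attached, and on the order they were attached in.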

2. Explicit Returns

ALWAYS use `return()` - never implicit

3. Naming: snake_case

All objects use `snake_case`

Variables: `customer_data` not `customerData` or `df`
Functions: `calculate_rate` not `calculateRate`
Arguments: `input_data` not `inputData`

4. Native Pipe

Prefer `|>` over `%>%` (unless R < 4.1)
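
A short sketch of how the native pipe behaves, for readers moving from `%>%`. It needs R 4.1 or later, and the variable names are illustrative. The native pipe's `_` placeholder is not shown because it needs R 4.2 or later.

```r
# The left-hand side becomes the first argument of the call on the right.
root_total <- c(1, 4, 9) |> sqrt() |> sum()

# The right-hand side must be a call, so an anonymous function is wrapped
# in parentheses and then called.
squared_total <- c(1, 2, 3) |> (\(values) values^2)() |> sum()
```

Unlike `%>%`, the native pipe has no `.` placeholder and does not accept a bare function name on the right-hand side, so `x |> sqrt` is a syntax error where `x %>% sqrt` works.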

5. No Generic Names

Never use: `df`, `data`, `result`, `temp` (except standard math notation)

Tidyverse Philosophy

Follow the Tidyverse Style Guide as the primary reference:

Design for humans - Code should be readable and intuitive
Reuse existing data structures - Work with tibbles and data frames
Compose simple functions with pipes - Build complexity through composition
Embrace functional programming - Functions are first-class objects

See reference/tidyverse.md for complete tidyverse conventions.

Resources & Advanced Topics

Reference Files

reference/naming.md - Complete naming conventions and forbidden patterns
reference/tidyverse.md - Pipe conventions, formatting, ggplot2 standards
reference/documentation.md - Roxygen2, vignettes, README quality
reference/statistical-rigor.md - Validation, uncertainty, reproducibility
reference/forbidden-patterns.md - Complete antipattern catalog

Related Skills

text/anti-slop - For cleaning prose in documentation
quarto/anti-slop - For cleaning vignettes and documentation

Tools

`styler::style_file()` - Auto-format code
`lintr::lint()` - Check code quality
`Rscript toolkit/scripts/detect_slop.R` - Detect AI patterns

Integration with Posit Skills

This skill focuses on code quality and avoiding generic patterns.

Use together with Posit skills for complete coverage:

| Task | Use This Skill | + Posit Skill |
|---|---|---|
| Write error messages | r/anti-slop (quality) | + r-lib/cli (structure) |
| Write tests | r/anti-slop (code quality) | + r-lib/testing (test patterns) |
| Prepare for CRAN | r/anti-slop (no slop) | + r-lib/cran-extrachecks (requirements) |
| Document lifecycle | r/anti-slop (doc quality) | + r-lib/lifecycle (deprecation) |
