Agent-almanac generate-statistical-tables
git clone https://github.com/pjt222/agent-almanac
T=$(mktemp -d) && git clone --depth=1 https://github.com/pjt222/agent-almanac "$T" && mkdir -p ~/.claude/skills && cp -r "$T/i18n/caveman/skills/generate-statistical-tables" ~/.claude/skills/pjt222-agent-almanac-generate-statistical-tables-32e49e && rm -rf "$T"
i18n/caveman/skills/generate-statistical-tables/SKILL.mdGenerate Statistical Tables
Create publication-ready statistical tables for reports and manuscripts.
When to Use
- Creating descriptive statistics tables
- Formatting regression or ANOVA output
- Building correlation matrices
- Producing APA-style tables for academic papers
- Generating tables for Quarto/R Markdown documents
Inputs
- Required: Statistical analysis results (model objects, summary data)
- Required: Output format (HTML, PDF, Word)
- Optional: Style guide (APA, journal-specific)
- Optional: Table numbering scheme
Procedure
Step 1: Choose Table Package
| Package | Best for | Formats |
|---|---|---|
| HTML, general-purpose | HTML, PDF, Word |
| LaTeX/PDF documents | PDF, HTML |
| Word documents | Word, PDF, HTML |
| Clinical/statistical summaries | All via gt/flextable |
Expected: A table package selected based on the output format and use case. The chosen package is installed and loadable.
On failure: If the required package is not installed, run
install.packages("gt") (or the appropriate package). For gtsummary, both gt and gtsummary must be installed.
Step 2: Descriptive Statistics Table
library(gt) descriptives <- data |> group_by(group) |> summarise( n = n(), M = mean(score, na.rm = TRUE), SD = sd(score, na.rm = TRUE), Min = min(score, na.rm = TRUE), Max = max(score, na.rm = TRUE) ) gt(descriptives) |> tab_header( title = "Table 1", subtitle = "Descriptive Statistics by Group" ) |> fmt_number(columns = c(M, SD), decimals = 2) |> fmt_number(columns = c(Min, Max), decimals = 1) |> cols_label( group = "Group", n = md("*n*"), M = md("*M*"), SD = md("*SD*") )
Expected: A
gt table object with formatted means, SDs, and counts grouped by category. Column headers use proper statistical notation (italicized M, SD, n).
On failure: If
group_by() produces unexpected results, verify the grouping variable exists and has the expected levels. If fmt_number() throws an error, ensure the target columns contain numeric data.
Step 3: Regression Results Table
model <- lm(outcome ~ predictor1 + predictor2 + predictor3, data = data) library(gtsummary) tbl_regression(model) |> bold_p() |> add_glance_source_note( include = c(r.squared, adj.r.squared, nobs) ) |> modify_header(label = "**Predictor**") |> modify_caption("Table 2: Regression Results")
Expected: A
gtsummary regression table with bolded p-values, model fit statistics (R-squared, N) in a source note, and a descriptive caption.
On failure: If
tbl_regression() fails, verify the input is a model object (e.g., lm, glm). If add_glance_source_note() errors, check that broom can tidy the model: broom::glance(model).
Step 4: Correlation Matrix
library(gt) cor_matrix <- cor(data[, c("var1", "var2", "var3", "var4")], use = "pairwise.complete.obs") # Format lower triangle cor_matrix[upper.tri(cor_matrix)] <- NA as.data.frame(cor_matrix) |> tibble::rownames_to_column("Variable") |> gt() |> fmt_number(decimals = 2) |> sub_missing(missing_text = "") |> tab_header(title = "Table 3", subtitle = "Correlation Matrix")
Expected: A lower-triangle correlation matrix rendered as a
gt table with blanked upper triangle, two decimal places, and a clear caption.
On failure: If
sub_missing() does not blank the upper triangle, verify that NA values were set correctly with cor_matrix[upper.tri(cor_matrix)] <- NA. If variables are non-numeric, cor() will fail; filter to numeric columns first.
Step 5: ANOVA Table
aov_result <- aov(score ~ group * condition, data = data) library(gtsummary) tbl_anova <- broom::tidy(aov_result) |> gt() |> fmt_number(columns = c(sumsq, meansq, statistic), decimals = 2) |> fmt_number(columns = p.value, decimals = 3) |> cols_label( term = "Source", df = md("*df*"), sumsq = md("*SS*"), meansq = md("*MS*"), statistic = md("*F*"), p.value = md("*p*") ) |> tab_header(title = "Table 4", subtitle = "ANOVA Results")
Expected: A formatted ANOVA table with Source, df, SS, MS, F, and p columns. Interaction terms are clearly labeled and p-values are formatted to three decimal places.
On failure: If
broom::tidy(aov_result) produces unexpected columns, verify the model is an aov object. For Type III sums of squares, use car::Anova(model, type = 3) instead of base aov().
Step 6: Save Tables
# Save as HTML gtsave(my_table, "table1.html") # Save as Word gtsave(my_table, "table1.docx") # Save as PNG image gtsave(my_table, "table1.png") # For LaTeX/PDF (kableExtra) kableExtra::save_kable(kable_table, "table1.pdf")
Expected: Table saved to the specified file format (HTML, Word, PNG, or PDF). The output file opens correctly in the appropriate application.
On failure: If
gtsave() fails for Word format, ensure the webshot2 package is installed. For PDF output via kableExtra, ensure a LaTeX distribution (TinyTeX or MiKTeX) is installed.
Step 7: Embed in Quarto Document
```{r} #| label: tbl-descriptives #| tbl-cap: "Descriptive Statistics by Group" gt(descriptives) |> fmt_number(columns = c(M, SD), decimals = 2) ``` See @tbl-descriptives for summary statistics.
Expected: The table renders inline in the Quarto document with a cross-referenceable label (
@tbl-*) and a proper caption. The table adapts to the document's output format automatically.
On failure: If the table does not render, verify the chunk label starts with
tbl- for Quarto cross-referencing. If formatting is lost in PDF, switch from gt to kableExtra for LaTeX-based output.
Validation
- Table renders correctly in target format (HTML, PDF, Word)
- Numbers are formatted consistently (decimal places, alignment)
- Statistical notation follows the style guide (italicized, proper symbols)
- Table has a clear caption and numbering
- Column headers are meaningful
- Notes/footnotes explain abbreviations or significance markers
Common Pitfalls
- gt in PDF: gt has limited PDF support. Use kableExtra for LaTeX-heavy documents.
- Rounding inconsistency: Always use
(gt) orfmt_number()
rather thanformat()
for displayround() - Missing values display: Configure with
in gt orsub_missing()options(knitr.kable.NA = "") - Wide tables in PDF: Tables exceeding page width need
or font size reductionlandscape() - APA number formatting: No leading zero for values bounded by 1 (p-values, correlations): ".03" not "0.03"
Related Skills
- tables within APA manuscriptsformat-apa-report
- embedding tables in reportscreate-quarto-report
- tables that adapt to parametersbuild-parameterized-report