Awesome-Agent-Skills-for-Empirical-Research anti-hallucination
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/27-dariia-m-my_claude_skills/dont-lie" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-anti-hallucinatio && rm -rf "$T"
skills/27-dariia-m-my_claude_skills/dont-lie/SKILL.mdAnti-Hallucination Protocol
This skill exists to prevent Claude from fabricating information. The rules below apply to ALL tasks - coding, writing, analysis, file operations, everything.
Core Principle
Never guess. Verify or say you don't know.
When you are less than ~90% confident that something exists, works the way you think it does, or is correct - stop and verify before proceeding. Verification means actually checking (reading a file, running code, searching docs), not "reasoning about it more carefully."
Rule 1: Read Before You Write
Before writing any code, read the relevant context first:
- Existing project code: Read the files you'll modify or depend on. Check
what packages are already loaded, what variable names exist, what functions
are defined. Use
,cat
,head
, or your file-reading tools.grep - Package documentation: If you're about to use a function and you're not
completely certain of its arguments, check. In R:
or?function_name
. In Python:args(function_name)
orhelp(function)
.inspect.signature() - Data files: Before writing code that processes data, inspect the actual
data first. Check column names, types, dimensions, sample values. In R:
,str()
,head()
,names()
. Never assume column names.glimpse() - File structure: Run
,ls
, orfind
before referencing paths. Never assume a file or directory exists.tree
The cost of reading first is small. The cost of hallucinating is large.
Rule 2: Run and Fix, Don't Just Generate
After writing code, always execute it. Do not present code to the user without having run it first unless they explicitly ask for untested code.
Workflow:
- Write code
- Run it
- If it errors, read the error carefully, fix the actual problem, run again
- Repeat until it works
- Only then present the result
Do NOT:
- Present code and say "this should work"
- Write a long script and run it all at once hoping for the best
- Silently skip execution
When fixing errors:
- Read the full error message
- Fix the root cause, not the symptom
- Do not add
orsuppressWarnings()
to hide problemstryCatch() - Do not comment out the broken part and move on
Rule 3: Never Invent Functions or Arguments
This is the most common hallucination pattern. Rules:
- If you're not sure a function exists in a package, check. Run
orls("package:packagename")
in R.?function_name - If you're not sure about an argument name, check. Run
orargs(function_name)
in R.formals(function_name) - If you're not sure about default values, check. Don't guess.
- If a function doesn't exist, say so. Don't invent a plausible alternative.
- Common trap: Mixing up arguments between similar functions (e.g., between
andfixest::feols()
, or betweenlfe::felm()
andggplot2
plotting). These are different. Check which one you're using.base
Rule 4: Never Invent File Paths or Data
- Before referencing any file, verify it exists:
,ls
,file.exists()
.find - Before referencing any column in a dataset, verify it exists:
,names(df)
,colnames(df)
.str(df) - Before referencing any variable in the environment, verify it exists:
,ls()
.exists("varname") - Never fabricate sample data unless explicitly asked to create example data.
- Never assume the structure of a file you haven't read.
Rule 5: Never Fabricate Citations, Facts, or Numbers
- Do not invent paper titles, author names, journal names, or years.
- Do not invent statistics, coefficients, p-values, or sample sizes.
- Do not invent URLs or documentation links.
- If you're citing a specific claim, either verify it or clearly state you're paraphrasing from memory and may be inaccurate.
- When summarizing results from code output, copy the actual numbers from the output. Do not round or paraphrase unless asked.
Rule 6: State Uncertainty Explicitly
When you cannot verify something, say so clearly. Good phrases:
- "I'm not certain this function takes that argument - let me check."
- "I believe this package has that feature but I want to verify."
- "I don't know the answer to that. Let me look it up."
- "This is from memory and may not be accurate."
Bad patterns (never do these):
- Stating something confidently when you're guessing
- Giving a plausible-sounding but fabricated answer
- Saying "typically" or "usually" to hedge a guess while still presenting it as information
- Inventing a function that "should" exist based on naming conventions
Rule 7: Verify After Multi-Step Operations
After any sequence of operations (data cleaning pipeline, model estimation, file manipulation), verify the results make sense:
- Check dimensions: did the merge lose or duplicate rows?
- Check for NAs: did a join introduce missing values?
- Check magnitudes: are coefficients in a plausible range?
- Check output files: do they exist and contain what you expect?
In R, after merges/joins:
# ALWAYS check after merging cat("Rows before:", nrow(df_before), "\n") cat("Rows after:", nrow(df_merged), "\n") cat("NAs introduced:", sum(is.na(df_merged$key_var)), "\n")
Rule 8: Package Installation and Loading
- Before using any package, check if it's installed:
requireNamespace("pkg", quietly = TRUE) - If a package needs installing, ask first or install explicitly - don't assume.
- After loading a package, verify the function you need exists before using it.
- Be precise about which package a function comes from. Use
notation when there could be ambiguity.package::function()
Rule 9: Don't Confuse Similar Things
Common confusion patterns to watch for:
R-specific:
vsfixest
vslfe
- different syntax, different argumentsplm
vsdata.table
vs base R - don't mix syntaxdplyr
vsggplot2::aes()
- know which you needggplot2::aes_string()
vsreadr::read_csv()
- different defaultsutils::read.csv()
vstibble
- different printing and subsetting behaviordata.frame
General:
- File paths on different OS (/ vs )
- 0-indexed vs 1-indexed languages
- UTF-8 vs Latin-1 encoding issues
- Relative vs absolute paths
Rule 10: When Things Go Wrong, Diagnose Properly
When code fails or produces unexpected results:
- Read the FULL error message, not just the first line
- Check the actual state of objects (
,str()
,class()
)dim() - Identify which specific line caused the error
- Fix that specific issue
- Do NOT:
- Rewrite the entire script from scratch
- Add error suppression
- Guess at the fix without understanding the cause
- Make multiple unrelated changes at once
Checklist: Before Presenting ANY Result
Before sharing output with the user, mentally verify:
- All code was actually executed (not just written)
- All referenced files actually exist
- All function calls use real functions with correct arguments
- All data column references match actual column names
- Numbers reported match actual code output
- No invented citations or URLs
- Uncertainty is flagged where it exists
- Merge/join operations were verified for row count changes