Agent-almanac setup-gxp-r-project

Set Up GxP R Project

Create an R project structure that meets GxP regulatory requirements for validated computing.

When to Use

  • Starting an R analysis project in a regulated environment (pharma, biotech, medical devices)
  • Setting up R for use in clinical trial analysis
  • Creating a validated computing environment for regulatory submissions
  • Implementing 21 CFR Part 11 or EU Annex 11 requirements

Inputs

  • Required: Project scope and regulatory framework (FDA, EMA, or both)
  • Required: R version and package versions to validate
  • Required: Validation strategy (risk-based approach)
  • Optional: Existing SOPs for computerized systems
  • Optional: Quality management system integration requirements

Procedure

Step 1: Create Validated Project Structure

gxp-project/
├── R/                          # Analysis scripts
│   ├── 01_data_import.R
│   ├── 02_data_processing.R
│   └── 03_analysis.R
├── validation/                 # Validation documentation
│   ├── validation_plan.md      # VP: scope, strategy, roles
│   ├── risk_assessment.md      # Risk categorization
│   ├── iq/                     # Installation Qualification
│   │   ├── iq_protocol.md
│   │   └── iq_report.md
│   ├── oq/                     # Operational Qualification
│   │   ├── oq_protocol.md
│   │   └── oq_report.md
│   ├── pq/                     # Performance Qualification
│   │   ├── pq_protocol.md
│   │   └── pq_report.md
│   └── traceability_matrix.md  # Requirements to tests mapping
├── tests/                      # Automated test suite
│   ├── testthat.R
│   └── testthat/
│       ├── test-data_import.R
│       └── test-analysis.R
├── data/                       # Input data (controlled)
│   ├── raw/                    # Immutable raw data
│   └── derived/                # Processed datasets
├── output/                     # Analysis outputs
├── docs/                       # Supporting documentation
│   ├── sop_references.md       # Links to relevant SOPs
│   └── change_log.md           # Manual change documentation
├── renv.lock                   # Locked dependencies
├── DESCRIPTION                 # Project metadata
├── .Rprofile                   # Session configuration
└── CLAUDE.md                   # AI assistant instructions

Expected: The complete directory structure exists with `R/`, `validation/` (including `iq/`, `oq/`, `pq/` subdirectories), `tests/testthat/`, `data/raw/`, `data/derived/`, `output/`, and `docs/` directories.

On failure: If directories are missing, create them with `mkdir -p`. Verify you are in the correct project root. For existing projects, create only the missing directories rather than overwriting existing structure.
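The directory setup and the "Expected" check above can be sketched as a small shell script. This is a minimal sketch, not part of the original skill: the function names `create_gxp_dirs` and `check_gxp_dirs` are hypothetical, and the directory list is copied from the tree in this step. Because `mkdir -p` leaves existing directories alone, the script is safe to run in an existing project.

```shell
#!/usr/bin/env bash

# Create the GxP project directories under the given root (default: gxp-project).
# mkdir -p is idempotent: existing directories and their contents are untouched.
create_gxp_dirs() {
  local root="${1:-gxp-project}"
  mkdir -p \
    "$root/R" \
    "$root/validation/iq" \
    "$root/validation/oq" \
    "$root/validation/pq" \
    "$root/tests/testthat" \
    "$root/data/raw" \
    "$root/data/derived" \
    "$root/output" \
    "$root/docs"
}

# Print every expected directory that is missing; return 1 if any is missing.
check_gxp_dirs() {
  local root="${1:-gxp-project}"
  local missing=0
  local d
  for d in R validation/iq validation/oq validation/pq tests/testthat \
           data/raw data/derived output docs; do
    if [ ! -d "$root/$d" ]; then
      echo "missing: $root/$d"
      missing=1
    fi
  done
  return "$missing"
}

create_gxp_dirs "gxp-project"
check_gxp_dirs "gxp-project" && echo "structure OK"
```

The script only creates directories. Files such as `renv.lock`, `DESCRIPTION`, and the validation documents are produced in the later steps.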

Step 2: Create Validation Plan

Create `validation/validation_plan.md`:

# Validation Plan

## 1. Purpose
This plan defines the validation strategy for [Project Name] using R [version].

## 2. Scope
- R version: 4.5.0
- Packages: [list with versions]
- Analysis: [description]
- Regulatory framework: 21 CFR Part 11 / EU Annex 11

## 3. Risk Assessment Approach
Using GAMP 5 risk-based categories:
- Category 3: Non-configured products (R base)
- Category 4: Configured products (R packages with default settings)
- Category 5: Custom applications (custom R scripts)

## 4. Validation Activities
| Activity | Category 3 | Category 4 | Category 5 |
|----------|-----------|-----------|-----------|
| IQ | Required | Required | Required |
| OQ | Reduced | Standard | Enhanced |
| PQ | N/A | Standard | Enhanced |

## 5. Roles and Responsibilities
- Validation Lead: [Name]
- Developer: [Name]
- QA Reviewer: [Name]
- Approver: [Name]

## 6. Acceptance Criteria
All tests must pass with documented evidence.

Expected: `validation/validation_plan.md` is complete with scope, GAMP 5 risk categories, validation activities matrix, roles and responsibilities, and acceptance criteria. The plan references the specific R version and regulatory framework.

On failure: If the regulatory framework is unclear, consult the organization's QA department for applicable SOPs. Do not proceed with validation activities until the plan is reviewed and approved.

Step 3: Lock Dependencies with renv

# Initialize renv with exact versions
renv::init()

# Install specific validated versions
renv::install("dplyr@1.1.4")
renv::install("ggplot2@3.5.0")

# Snapshot
renv::snapshot()

The `renv.lock` file serves as the controlled package inventory.

Expected: `renv.lock` exists with exact version numbers for all required packages. `renv::status()` reports no issues. Every package version is pinned (e.g., `dplyr@1.1.4`), not floating.

On failure: If `renv::install()` fails for a specific version, check that the version exists in the CRAN archive. Use `renv::install("package@version", repos = "https://packagemanager.posit.co/cran/latest")` for archived versions.
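As a quick cross-check outside R, the pins recorded in `renv.lock` can be listed from the command line. The sketch below assumes the pretty-printed lockfile layout that renv writes, with one `"Package"` and one `"Version"` key per line in each package record. The helper names `list_locked_packages` and `is_pinned` are hypothetical. `renv::status()` remains the authoritative check; this only makes the pinned versions easy to review.

```shell
#!/usr/bin/env bash

# Print "package@version" for every package record in a pretty-printed renv.lock.
# The top-level R version entry has no "Package" key, so it is skipped.
list_locked_packages() {
  awk '
    $1 == "\"Package\":" { pkg = $2; gsub(/[",]/, "", pkg) }
    $1 == "\"Version\":" && pkg != "" {
      ver = $2; gsub(/[",]/, "", ver)
      print pkg "@" ver
      pkg = ""
    }
  ' "$1"
}

# Succeed only if the lockfile pins the given package at exactly the given version.
is_pinned() {
  list_locked_packages "$1" | grep -qxF "$2@$3"
}

# Demo on a small sample lockfile (illustrative content only).
lock=$(mktemp)
cat > "$lock" <<'EOF'
{
  "R": {
    "Version": "4.5.0"
  },
  "Packages": {
    "dplyr": {
      "Package": "dplyr",
      "Version": "1.1.4",
      "Source": "Repository"
    }
  }
}
EOF

list_locked_packages "$lock"
is_pinned "$lock" dplyr 1.1.4 && echo "dplyr pin matches the validated version"
rm -f "$lock"
```

For the sample lockfile, the listing prints `dplyr@1.1.4`. In a real project, point both helpers at the project's own `renv.lock` and compare against the versions named in the validation plan.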

Step 4: Implement Version Control

git init

# Enable signed commits before the first commit so that every commit,
# including the initial one, is traceable
git config user.signingkey YOUR_GPG_KEY
git config commit.gpgsign true

git add .
git commit -m "Initial validated project structure"

Expected: The project is under git version control with signed commits enabled. The initial commit contains the validated project structure and `renv.lock`.

On failure: If GPG signing fails, verify the GPG key is configured with `gpg --list-secret-keys`. For environments without GPG, document the deviation and use unsigned commits with manual audit trail entries in `docs/change_log.md`.
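Where unsigned commits are an accepted deviation, the manual audit trail entry can be appended by a small helper. This is a sketch under stated assumptions: the function name `log_unsigned_commit` and the entry layout (date, commit, status, reason, user) are hypothetical and should be replaced by whatever format the applicable SOP prescribes.

```shell
#!/usr/bin/env bash

# Append one deviation entry for an unsigned commit to the change log.
# Arguments: log file path, commit id, free-text reason.
# The entry layout is illustrative, not a mandated format.
log_unsigned_commit() {
  local log_file="$1"
  local commit_id="$2"
  local reason="$3"
  mkdir -p "$(dirname "$log_file")"
  printf '%s\n' \
    "- $(date -u +%Y-%m-%d) | commit $commit_id | UNSIGNED | $reason | ${USER:-unknown}" \
    >> "$log_file"
}

# Example: record the deviation for the current HEAD when run inside a git repo.
if commit_id=$(git rev-parse --short HEAD 2>/dev/null); then
  log_unsigned_commit docs/change_log.md "$commit_id" "GPG unavailable on this host"
fi
```

To see which commits carry a signature, `git log --format='%h %G?'` prints a status letter per commit, where `N` means no signature.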

Step 5: Create IQ Protocol

Create `validation/iq/iq_protocol.md`:

# Installation Qualification Protocol

## Objective
Verify that R and required packages are correctly installed.

## Test Cases

### IQ-001: R Version Verification
- **Requirement**: R 4.5.0 installed
- **Procedure**: Execute `R.version.string`
- **Expected**: "R version 4.5.0 (date)"
- **Result**: [ PASS / FAIL ]

### IQ-002: Package Installation Verification
- **Requirement**: All packages in renv.lock installed
- **Procedure**: Execute `renv::status()`
- **Expected**: "No issues found"
- **Result**: [ PASS / FAIL ]

### IQ-003: Package Version Verification
- **Requirement**: Installed package versions match renv.lock
- **Procedure**: Execute `installed.packages()[, c("Package", "Version")]`
- **Expected**: Versions match renv.lock exactly
- **Result**: [ PASS / FAIL ]

Expected: `validation/iq/iq_protocol.md` contains test cases for R version verification, package installation verification, and package version verification, each with clear expected results and pass/fail fields.

On failure: If the IQ protocol template does not match organizational SOP requirements, adapt the format while retaining the required fields (requirement, procedure, expected result, actual result, pass/fail). Consult QA for approved templates.

Step 6: Write Automated OQ/PQ Tests

# tests/testthat/test-analysis.R
test_that("primary analysis produces validated results", {
  # Known input -> known output (double programming validation)
  test_data <- read.csv(test_path("fixtures", "validation_dataset.csv"))

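  # primary_analysis() is assumed to be defined in R/03_analysis.R and
  # loaded (e.g., via source()) before this test runs
  result <- primary_analysis(test_data)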

  # Compare against independently calculated expected values
  expect_equal(result$estimate, 2.345, tolerance = 1e-3)
  expect_equal(result$p_value, 0.012, tolerance = 1e-3)
  expect_equal(result$ci_lower, 1.234, tolerance = 1e-3)
})

Expected: Automated test files exist in `tests/testthat/` covering OQ (operational verification of each function) and PQ (end-to-end validation against independently calculated reference values). Tests use explicit numeric tolerances.

On failure: If reference values are not yet available from independent calculation (e.g., SAS), create placeholder tests with `skip("Awaiting independent reference values")` and document them in the traceability matrix.

Step 7: Create Traceability Matrix

# Traceability Matrix

| Req ID | Requirement | Test ID | Test Description | Status |
|--------|-------------|---------|------------------|--------|
| REQ-001 | Import CSV data correctly | OQ-001 | Verify data dimensions and types | PASS |
| REQ-002 | Calculate primary endpoint | PQ-001 | Compare against reference results | PASS |
| REQ-003 | Generate report output | PQ-002 | Verify report contains all sections | PASS |

Expected: `validation/traceability_matrix.md` links every requirement to at least one test case, and every test case is linked to a requirement. No orphaned requirements or tests.

On failure: If requirements are untested, create test cases for them or document a risk-based justification for exclusion. If tests have no linked requirement, either link them to an existing requirement or remove them as out-of-scope.
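The orphan check described above can be partly automated. The sketch below assumes the five-column pipe table shown in this step and flags rows that have a requirement ID but no test ID, or a test ID but no requirement ID. The function name `check_traceability` is hypothetical. It only inspects the matrix file, so tests that exist in `tests/testthat/` but never appear in the matrix still need a manual review.

```shell
#!/usr/bin/env bash

# Scan a markdown traceability matrix and report rows where the requirement
# or the test is missing. Exits non-zero when any gap is found.
check_traceability() {
  awk -F'|' '
    BEGIN { bad = 0 }
    NF < 6 { next }                     # not a five-column table row
    $2 ~ /Req ID/ { next }              # header row
    $2 ~ /^[ \t]*-+[ \t]*$/ { next }    # dashed separator row
    {
      req = $2; tid = $4
      gsub(/^[ \t]+|[ \t]+$/, "", req)  # trim surrounding whitespace
      gsub(/^[ \t]+|[ \t]+$/, "", tid)
      if (req != "" && tid == "") { print "untested requirement: " req; bad = 1 }
      if (req == "" && tid != "") { print "orphaned test: " tid; bad = 1 }
    }
    END { exit bad }
  ' "$1"
}

# Demo: one complete row and one hypothetical requirement with no test yet.
matrix=$(mktemp)
cat > "$matrix" <<'EOF'
| Req ID | Requirement | Test ID | Test Description | Status |
|--------|-------------|---------|------------------|--------|
| REQ-001 | Import CSV data correctly | OQ-001 | Verify data dimensions and types | PASS |
| REQ-004 | Hypothetical requirement with no test yet | | | |
EOF

if ! check_traceability "$matrix"; then
  echo "traceability gaps found"
fi
rm -f "$matrix"
```

For the sample matrix, the check reports `untested requirement: REQ-004` and then the gap message. In a project, run it against `validation/traceability_matrix.md`.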

Validation

  • Project structure follows documented template
  • renv.lock contains all dependencies with exact versions
  • Validation plan is complete and approved
  • IQ protocol executes successfully
  • OQ test cases cover all configured functionality
  • PQ tests validate against independently computed results
  • Traceability matrix links requirements to tests
  • Change control process is documented

Common Pitfalls

  • Using `install.packages()` without version pinning: Always use renv with locked versions.
  • Missing audit trail: Every change must be documented. Use signed git commits.
  • Over-validating: Apply a risk-based approach. Not every CRAN package needs Category 5 validation.
  • Forgetting system-level qualification: The OS and the R installation need IQ too.
  • No independent verification: PQ should compare against results computed independently (SAS, manual calculation).

Related Skills

  • write-validation-documentation - detailed validation document creation
  • implement-audit-trail - electronic records and audit trails
  • validate-statistical-output - double programming and output validation
  • manage-renv-dependencies - dependency locking for validated environments