Awesome-omni-skill Node Tuning Helper Scripts

Generate Tuned manifests and evaluate node tuning snapshots.

Install

Clone the upstream repo:

    git clone https://github.com/diegosouzapw/awesome-omni-skill

Claude Code: install into `~/.claude/skills/`:

    T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/development/node-tuning-helper-scripts-majiayu000" ~/.claude/skills/diegosouzapw-awesome-omni-skill-node-tuning-helper-scripts && rm -rf "$T"

Manifest: `skills/development/node-tuning-helper-scripts-majiayu000/SKILL.md`

Source content

Node Tuning Helper Scripts

Detailed instructions for invoking the helper utilities that back the `/node-tuning` commands:

  • `generate_tuned_profile.py` renders Tuned manifests (`tuned.openshift.io/v1`).
  • `analyze_node_tuning.py` inspects live nodes or sosreports for tuning gaps.

When to Use These Scripts

  • Translate structured command inputs into Tuned manifests for the Node Tuning Operator.
  • Iterate on generated YAML outside the assistant or integrate the generator into automation.
  • Analyze CPU isolation, IRQ affinity, huge pages, sysctl values, and networking counters from live clusters or archived sosreports.

Prerequisites

  • Python 3.8 or newer (`python3 --version`).
  • A repository checkout so the scripts under `plugins/node-tuning/skills/scripts/` are accessible.
  • Optional: the `oc` CLI when validating or applying manifests.
  • Optional: an extracted sosreport directory when running the analysis script offline.
  • Optional (remote analysis): `oc` CLI access plus a valid `KUBECONFIG` when capturing `/proc` / `/sys` data or a sosreport via `oc debug node/<name>`. The sosreport workflow pulls the `registry.redhat.io/rhel9/support-tools` image (override with `--toolbox-image` or `TOOLBOX_IMAGE`) and requires registry access. HTTP(S) proxy environment variables from the host are forwarded automatically when present, but using a proxy is optional.
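
For remote analysis, the environment can be staged up front. A minimal sketch; the kubeconfig path, mirror image, proxy, and node name below are placeholder assumptions:

    # Point oc at the target cluster (placeholder path)
    export KUBECONFIG=~/.kube/prod

    # Optional: override the toolbox image, e.g. with a mirrored copy
    export TOOLBOX_IMAGE=registry.example.com/rhel9/support-tools:latest

    # Optional: proxy settings; forwarded automatically when present
    export HTTPS_PROXY=http://proxy.example.com:3128

    # Sanity-check access before invoking the helpers
    oc whoami && oc get node worker-rt-0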

Script: `generate_tuned_profile.py`

Implementation Steps

  1. Collect Inputs

    • `--profile-name`: Tuned resource name.
    • `--summary`: `[main]` section summary.
    • Repeatable options: `--include`, `--main-option`, `--variable`, `--sysctl`, `--section` (`SECTION:KEY=VALUE`).
    • Target selectors: `--machine-config-label key=value`, `--match-label key[=value]`.
    • Optional: `--priority` (default 20), `--namespace`, `--output`, `--dry-run`.
    • Use `--list-nodes` / `--node-selector` to inspect nodes and `--label-node NODE:KEY[=VALUE]` (plus `--overwrite-labels`) to tag machines.
  2. Inspect or Label Nodes (optional)

    # List all worker nodes
    python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py --list-nodes --node-selector "node-role.kubernetes.io/worker" --skip-manifest
    
    # Label a specific node for the worker-hp pool
    python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
      --label-node ip-10-0-1-23.ec2.internal:node-role.kubernetes.io/worker-hp= \
      --overwrite-labels \
      --skip-manifest
    
  3. Render the Manifest

    python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
      --profile-name "$PROFILE" \
      --summary "$SUMMARY" \
      --sysctl net.core.netdev_max_backlog=16384 \
      --match-label tuned.openshift.io/custom-net \
      --output .work/node-tuning/$PROFILE/tuned.yaml
    
    • Omit `--output` to write `<profile-name>.yaml` in the current directory.
    • Add `--dry-run` to print the manifest to stdout.
  4. Review Output

    • Inspect the generated YAML for accuracy.
    • Optionally format with `yq` or open in an editor for readability.
  5. Validate and Apply

    • Dry-run: `oc apply --dry-run=client -f <manifest>`.
    • Apply: `oc apply -f <manifest>`.
    • An end-to-end sketch follows this list.
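
Putting steps 3–5 together, a minimal end-to-end sketch (the profile name and sysctl value reuse the step 3 example):

    PROFILE=custom-net
    OUT=.work/node-tuning/$PROFILE/tuned.yaml

    python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
      --profile-name "$PROFILE" \
      --summary "Custom networking profile" \
      --sysctl net.core.netdev_max_backlog=16384 \
      --match-label tuned.openshift.io/custom-net \
      --output "$OUT"

    # Validate client-side, then apply once the YAML looks right
    oc apply --dry-run=client -f "$OUT"
    oc apply -f "$OUT"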

Error Handling

  • Missing required options raise `ValueError` with descriptive messages.
  • The script exits non-zero when no target selector (`--machine-config-label` or `--match-label`) is supplied.
  • Invalid key/value or section inputs identify the failing argument explicitly.
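
Since failures surface as a non-zero exit status, wrappers can guard on it directly. A minimal sketch, assuming `$PROFILE`, `$SUMMARY`, and `$OUT` are set as in the steps above:

    if ! python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
        --profile-name "$PROFILE" \
        --summary "$SUMMARY" \
        --match-label tuned.openshift.io/custom-net \
        --output "$OUT"; then
      echo "manifest generation failed; check the reported argument" >&2
      exit 1
    fi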

Examples

python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
  --profile-name realtime-worker \
  --summary "Realtime tuned profile" \
  --include openshift-node --include realtime \
  --variable isolated_cores=1 \
  --section 'bootloader:cmdline_ocp_realtime=+systemd.cpu_affinity=${not_isolated_cores_expanded}' \
  --machine-config-label machineconfiguration.openshift.io/role=worker-rt \
  --priority 25 \
  --output .work/node-tuning/realtime-worker/tuned.yaml

python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
  --profile-name openshift-node-hugepages \
  --summary "Boot time configuration for hugepages" \
  --include openshift-node \
  --section bootloader:cmdline_openshift_node_hugepages="hugepagesz=2M hugepages=50" \
  --machine-config-label machineconfiguration.openshift.io/role=worker-hp \
  --priority 30 \
  --output .work/node-tuning/openshift-node-hugepages/hugepages-tuned-boottime.yaml
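
For orientation, the hugepages invocation above should render a Tuned resource of roughly the following shape (a sketch based on the `tuned.openshift.io/v1` API; exact field order and the default namespace depend on the script). Piping it through a client-side dry run is a quick structural check:

    cat <<'EOF' | oc apply --dry-run=client -f -
    apiVersion: tuned.openshift.io/v1
    kind: Tuned
    metadata:
      name: openshift-node-hugepages
      namespace: openshift-cluster-node-tuning-operator
    spec:
      profile:
      - name: openshift-node-hugepages
        data: |
          [main]
          summary=Boot time configuration for hugepages
          include=openshift-node
          [bootloader]
          cmdline_openshift_node_hugepages=hugepagesz=2M hugepages=50
      recommend:
      - machineConfigLabels:
          machineconfiguration.openshift.io/role: worker-hp
        priority: 30
        profile: openshift-node-hugepages
    EOF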

Script: `analyze_node_tuning.py`

Purpose

Inspect either a live node (`/proc`, `/sys`) or an extracted sosreport snapshot for tuning signals (CPU isolation, IRQ affinity, huge pages, sysctl state, networking counters) and emit actionable recommendations.

Usage Patterns

  • Live node analysis
    python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py --format markdown
    
  • Remote analysis via oc debug
    python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
      --node worker-rt-0 \
      --kubeconfig ~/.kube/prod \
      --format markdown
    
  • Collect sosreport via oc debug and analyze locally
    python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
      --node worker-rt-0 \
      --toolbox-image registry.example.com/support-tools:latest \
      --sosreport-arg "--case-id=01234567" \
      --sosreport-output .work/node-tuning/sosreports \
      --format json
    
  • Offline sosreport analysis
    python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
      --sosreport /path/to/sosreport-2025-10-20
    
  • Automation-friendly JSON
    python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
      --sosreport /path/to/sosreport \
      --format json --output .work/node-tuning/node-analysis.json
    

Implementation Steps

  1. Select data source
    • Provide `--node <name>` (with optional `--kubeconfig` / `--oc-binary`). By default the helper runs `sosreport` remotely from inside the RHCOS toolbox container (`registry.redhat.io/rhel9/support-tools`). Override the image with `--toolbox-image`, extend the sosreport command with `--sosreport-arg`, or disable the curated OpenShift flags via `--skip-default-sosreport-flags`. Pass `--no-collect-sosreport` to fall back to the direct `/proc` snapshot mode.
    • Provide `--sosreport <dir>` for archived diagnostics; detection finds the embedded `proc/` and `sys/` directories.
    • Omit both switches to query the live filesystem (defaults to `/proc` and `/sys`).
    • Override paths with `--proc-root` or `--sys-root` when the layout differs.
  2. Run analysis
    • The script parses `cpuinfo`, kernel cmdline parameters (`isolcpus`, `nohz_full`, `tuned.non_isolcpus`), default IRQ affinities, huge page counters, sysctl values (net, vm, kernel), transparent hugepage settings, `netstat`/`sockstat` counters, and `ps` snapshots (when available in a sosreport).
  3. Review the report
    • Markdown output groups findings by section (System Overview, CPU & Isolation, Huge Pages, Sysctl Highlights, Network Signals, IRQ Affinity, Process Snapshot) and lists recommendations.
    • JSON output contains the same information in structured form for pipelines or dashboards; see the sketch after this list.
  4. Act on recommendations
    • Apply Tuned profiles, MachineConfig updates, or manual sysctl/irqbalance adjustments.
    • Feed actionable items back into `/node-tuning:generate-tuned-profile` to codify the desired state.
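
Once the JSON report exists on disk, recommendations can be pulled out for pipelines. A minimal sketch, assuming the report mirrors the markdown sections with a top-level list of recommendation strings (the `recommendations` key is a guess; check the actual schema):

    python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
      --sosreport /path/to/sosreport \
      --format json --output .work/node-tuning/node-analysis.json

    # Print one recommendation per line (key name assumed)
    jq -r '.recommendations[]?' .work/node-tuning/node-analysis.json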

Error Handling

  • Missing `proc/` or `sys/` directories trigger descriptive errors.
  • Unreadable files are skipped gracefully and noted in observations where relevant.
  • Non-numeric sysctl values are flagged for manual investigation.

Example Output (Markdown excerpt)

# Node Tuning Analysis

## System Overview
- Hostname: worker-rt-1
- Kernel: 4.18.0-477.el8
- NUMA nodes: 2
- Kernel cmdline: `BOOT_IMAGE=... isolcpus=2-15 tuned.non_isolcpus=0-1`

## CPU & Isolation
- Logical CPUs: 32
- Physical cores: 16 across 2 socket(s)
- SMT detected: yes
- Isolated CPUs: 2-15
...

## Recommended Actions
- Configure net.core.netdev_max_backlog (>=32768) to accommodate bursty NIC traffic.
- Transparent Hugepages are not disabled (`[never]` not selected). Consider setting to `never` for latency-sensitive workloads.
- 4 IRQs overlap isolated CPUs. Relocate interrupt affinities using tuned profiles or irqbalance.

Follow-up Automation Ideas

  • Persist JSON results in `.work/node-tuning/<host>/analysis.json` for historical tracing.
  • Gate upgrades by comparing recommendations across nodes.
  • Integrate with CI jobs that validate cluster tuning post-change.
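
A starting point for the per-node tracking above: loop over nodes, store one report each, and diff the recommendation lists (the node names and the `recommendations` key are assumptions):

    for NODE in worker-rt-0 worker-rt-1; do
      mkdir -p ".work/node-tuning/$NODE"
      python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
        --node "$NODE" \
        --format json \
        --output ".work/node-tuning/$NODE/analysis.json"
    done

    # Compare recommendations across the two nodes (key name assumed)
    diff <(jq -r '.recommendations[]?' .work/node-tuning/worker-rt-0/analysis.json) \
         <(jq -r '.recommendations[]?' .work/node-tuning/worker-rt-1/analysis.json)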