## Install

### Source

Clone the upstream repo:

```bash
git clone https://github.com/plurigrid/asi
```

### Claude Code

Install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/kscale-kinfer" ~/.claude/skills/plurigrid-asi-kscale-kinfer && rm -rf "$T"
```

Manifest: `skills/kscale-kinfer/SKILL.md`
# K-Scale kinfer Skill
"The K-Scale model export and inference tool"
## Trigger Conditions
- User asks about deploying RL policies to real robots
- Questions about ONNX model inference, Rust ML runtime
- Policy execution on embedded systems
- Real-time neural network inference
## Overview
kinfer is K-Scale's model inference engine for deploying trained policies:
- Model Loading: ONNX format support via `ort` (ONNX Runtime)
- Real-time Execution: Rust implementation for low latency
- Logging: NDJSON telemetry for debugging
- Integration: Seamless connection with KOS firmware
## Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│                      kinfer Inference Pipeline                      │
│                                                                     │
│  ┌──────────────┐     load       ┌──────────────┐                   │
│  │  ONNX Model  │───────────────▶│   Runtime    │                   │
│  │   (.onnx)    │                │  (ort-sys)   │                   │
│  └──────────────┘                └──────┬───────┘                   │
│                                         │                           │
│  ┌──────────────┐     step       ┌──────┴───────┐   output          │
│  │ Observation  │───────────────▶│  Inference   │──────────▶ Action │
│  │  (sensors)   │                │    Engine    │                   │
│  └──────────────┘                └──────────────┘                   │
│                                         │                           │
│                                         ▼                           │
│                                  ┌──────────────┐                   │
│                                  │    Logger    │                   │
│                                  │   (NDJSON)   │                   │
│                                  └──────────────┘                   │
└─────────────────────────────────────────────────────────────────────┘
```
## Key Features
### 1. Single Tokio Runtime
```rust
use lazy_static::lazy_static;
use tokio::runtime::Runtime;

// Efficient async execution with GIL management: one shared runtime
// instead of spawning a new one per call.
lazy_static! {
    static ref RUNTIME: Runtime = Runtime::new().unwrap();
}
```
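A hedged usage sketch of the shared-runtime pattern, assuming the `tokio` and `lazy_static` crates: synchronous callers (for example, entering from Python through the bindings) block on the single global runtime. `run_inference` is a hypothetical placeholder, not a kinfer API.

```rust
use lazy_static::lazy_static;
use tokio::runtime::Runtime;

lazy_static! {
    static ref RUNTIME: Runtime = Runtime::new().unwrap();
}

// Hypothetical async inference call, for illustration only.
async fn run_inference(obs: Vec<f32>) -> Vec<f32> {
    obs // a real implementation would await the ONNX session here
}

// Synchronous entry point: every caller shares the one runtime,
// avoiding per-call runtime startup cost.
fn step_blocking(obs: Vec<f32>) -> Vec<f32> {
    RUNTIME.block_on(run_inference(obs))
}

fn main() {
    let action = step_blocking(vec![0.0; 32]);
    println!("action length: {}", action.len());
}
```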
### 2. Pre-fetch Inputs
```rust
// Minimize latency by preparing inputs ahead of time
fn step_and_take_action(&mut self, observation: &[f32]) -> Vec<f32> {
    // Pre-fetch next input while processing current
    // ...
}
```
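A self-contained sketch of the idea, not kinfer's actual implementation: a producer thread keeps the next observation staged in a bounded channel, so sensor reads overlap with inference on the current one. `fetch_observation` and `infer` are placeholders.

```rust
use std::sync::mpsc;
use std::thread;

fn fetch_observation() -> Vec<f32> {
    vec![0.0; 32] // placeholder sensor read
}

fn infer(obs: &[f32]) -> Vec<f32> {
    obs.iter().map(|x| x * 0.5).collect() // placeholder forward pass
}

fn main() {
    // Capacity-1 channel: the producer always has one observation staged.
    let (tx, rx) = mpsc::sync_channel::<Vec<f32>>(1);

    thread::spawn(move || {
        // Producer: read sensors ahead of time; blocks while the buffer is full.
        while tx.send(fetch_observation()).is_ok() {}
    });

    for _ in 0..10 {
        let obs = rx.recv().expect("producer alive"); // already pre-fetched
        let action = infer(&obs);
        let _ = action; // send to actuators here
    }
}
```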
### 3. NDJSON Logging
```rust
use std::fs::File;
use std::sync::mpsc::Sender;

// Async logging thread for telemetry
struct Logger {
    file: File,
    tx: Sender<LogEntry>,
}
```
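A self-contained sketch of the pattern, assuming the `serde` (with derive) and `serde_json` crates: a dedicated thread drains a channel and appends one JSON object per line, keeping file I/O off the control loop. The field names in `LogEntry` are illustrative, not kinfer's actual telemetry schema.

```rust
use std::fs::File;
use std::io::{BufWriter, Write};
use std::sync::mpsc::{self, Sender};
use std::thread;

use serde::Serialize;

#[derive(Serialize)]
struct LogEntry {
    t_ms: u64,       // timestamp, hypothetical field
    latency_us: u64, // per-step latency, hypothetical field
}

// Spawn the writer thread; returns the sender used by the control loop.
fn spawn_logger(path: &str) -> Sender<LogEntry> {
    let file = File::create(path).expect("open log file");
    let (tx, rx) = mpsc::channel::<LogEntry>();
    thread::spawn(move || {
        let mut out = BufWriter::new(file);
        // Drain the channel; each entry becomes one JSON line (NDJSON).
        for entry in rx {
            serde_json::to_writer(&mut out, &entry).expect("serialize");
            out.write_all(b"\n").expect("newline");
        }
    });
    tx
}

fn main() {
    let log = spawn_logger("telemetry.ndjson");
    log.send(LogEntry { t_ms: 0, latency_us: 850 }).unwrap();
    // Dropping `log` closes the channel and ends the writer thread.
}
```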
## Language & Stack
- Primary: Rust (performance-critical)
- ML Runtime: ONNX Runtime (`ort`, `ort-sys`)
- Async: Tokio for non-blocking I/O
- Bindings: Python via PyO3
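Since the stack lists Python bindings via PyO3, here is a minimal sketch of the pattern, assuming a recent PyO3 (0.21+ Bound API); it illustrates how a step function could be exposed while releasing the GIL, and is not kinfer's actual binding code.

```rust
use pyo3::prelude::*;

// Hypothetical binding: expose a blocking step() to Python,
// releasing the GIL for the duration of inference.
#[pyfunction]
fn step(py: Python<'_>, obs: Vec<f32>) -> PyResult<Vec<f32>> {
    py.allow_threads(|| {
        // Placeholder for the real ONNX forward pass.
        Ok(obs.iter().map(|x| x * 0.5).collect())
    })
}

#[pymodule]
fn kinfer_demo(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(step, m)?)
}
```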
## GF(3) Trit Assignment
- Trit: -1 (MINUS)
- Role: Verification/Validation (inference must be correct)
- Color: #6E5FE4
- URI: skill://kscale-kinfer#6E5FE4
## Balanced Triads
- kscale-kinfer (-1) ⊗ kscale-ksim (0) ⊗ onnx-export (+1) = 0 ✓
- kscale-kinfer (-1) ⊗ rust-ml (0) ⊗ policy-training (+1) = 0 ✓
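A tiny sketch of the check, under the stated convention that trits live in {-1, 0, +1} and a triad balances when its sum is 0 in GF(3):

```rust
// Verify that each triad of trits sums to 0 modulo 3.
fn balanced(triad: [i32; 3]) -> bool {
    triad.iter().sum::<i32>().rem_euclid(3) == 0
}

fn main() {
    assert!(balanced([-1, 0, 1])); // kscale-kinfer ⊗ kscale-ksim ⊗ onnx-export
    assert!(balanced([-1, 0, 1])); // kscale-kinfer ⊗ rust-ml ⊗ policy-training
    println!("both triads balance ✓");
}
```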
## Key Contributors
| Contributor | Focus Areas |
|---|---|
| b-vm | Step function, command names |
| codekansas | Performance, refactoring |
| WT-MM | Logging, env variables |
| alik-git | NDJSON logging, plotting |
| nfreq | Tokio runtime, GIL management |
## Example Usage
```python
import kinfer

# Load model
model = kinfer.load_model("walking_policy.onnx")

# Get observation from sensors
obs = get_sensor_data()

# Run inference
action = model.step(obs)

# Apply to actuators
apply_action(action)
```
## Rust API
```rust
use kinfer::InferenceEngine;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut engine = InferenceEngine::load("policy.onnx")?;
    loop {
        let obs = get_observation();                    // read sensors
        let action = engine.step_and_take_action(&obs); // run the policy
        send_to_actuators(&action);                     // drive actuators
    }
}
```
## ACSet Schema
```julia
@present SchKinfer(FreeSchema) begin
  # Objects
  Model::Ob     # ONNX model
  Tensor::Ob    # Input/output tensors
  Runtime::Ob   # ONNX Runtime session
  LogEntry::Ob  # Telemetry records

  # Morphisms (inference pipeline)
  load::Hom(Model, Runtime)     # Model → Runtime loading
  input::Hom(Tensor, Runtime)   # Observation → Runtime
  output::Hom(Runtime, Tensor)  # Runtime → Action
  step::Hom(Tensor, Tensor)     # obs → action (composition)

  # Morphisms (logging)
  log::Hom(Runtime, LogEntry)   # Runtime → Telemetry

  # Attributes
  Shape::AttrType
  Dtype::AttrType
  Latency::AttrType
  shape::Attr(Tensor, Shape)
  dtype::Attr(Tensor, Dtype)
  latency::Attr(Runtime, Latency)

  # Key constraint: deterministic inference
  # step = output ∘ input (functorial)
  # Same input → same output (reproducibility)
end
```
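Spelled out, the constraint in the closing comments says that `step` is the composite of the two pipeline morphisms:

```latex
\mathrm{step} = \mathrm{output} \circ \mathrm{input} : \mathrm{Tensor} \to \mathrm{Tensor}
```

so two runs on the same input tensor must yield the same action (reproducibility).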
## References
- kscalelabs/kinfer - Main repository (17 stars)
- kscalelabs/kinfer-sim - Simulation visualization
- ONNX Runtime - Inference backend