Install
Source · Clone the upstream repo:

```sh
git clone https://github.com/SharpAI/DeepCamera
```

Claude Code · Install into ~/.claude/skills/:

```sh
T=$(mktemp -d) && git clone --depth=1 https://github.com/SharpAI/DeepCamera "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/detection/yolo-detection-2026" ~/.claude/skills/sharpai-deepcamera-yolo-detection-2026 && rm -rf "$T"
```

Manifest: skills/detection/yolo-detection-2026/SKILL.md
YOLO 2026 Object Detection
Real-time object detection using the latest YOLO 2026 models. Detects the 80 COCO object classes, including people, vehicles, animals, and everyday objects. Outputs bounding boxes with class labels and confidence scores.
Model Sizes
| Size | Speed | Accuracy | Best For |
|---|---|---|---|
| nano | Fastest | Good | Real-time on CPU, edge devices |
| small | Fast | Better | Balanced speed/accuracy |
| medium | Moderate | High | Accuracy-focused deployments |
| large | Slower | Highest | Maximum detection quality |
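The size is chosen in the skill config. A minimal sketch, reusing the `model_size` and `fps` keys from the Auto Start example below (the values here are illustrative):

```yaml
model_size: small   # one of: nano | small | medium | large
fps: 5              # frames per second to process
```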
Hardware Acceleration
The skill uses `env_config.py` to automatically detect your hardware and convert the model to the fastest format for your platform. Conversion happens once during deployment and is cached.
| Platform | Backend | Optimized Format | Compute Units | Expected Speedup |
|---|---|---|---|---|
| NVIDIA GPU | CUDA | TensorRT | GPU | ~3-5x |
| Apple Silicon (M1+) | MPS | CoreML | Neural Engine (NPU) | ~2x |
| Intel CPU/GPU/NPU | OpenVINO | OpenVINO IR | CPU/GPU/NPU | ~2-3x |
| AMD GPU | ROCm | ONNX Runtime | GPU | ~1.5-2x |
| CPU (any) | CPU | ONNX Runtime | CPU | ~1.5x |
Apple Silicon Note: Detection defaults to `cpu_and_ne` (CPU + Neural Engine), keeping the GPU free for LLM/VLM inference. Set `compute_units: all` to include the GPU if you are not running a local LLM.
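For illustration, this setting maps naturally onto the `ComputeUnit` enum in coremltools when the converted Core ML model is loaded. The sketch below shows that mapping under the assumption that the skill resolves the config value along these lines; the function name and dictionary are not the skill's actual code:

```python
import coremltools as ct

# Map the skill's compute_units config value onto coremltools' enum.
# "cpu_and_ne" keeps the GPU free for co-resident LLM/VLM inference.
_COMPUTE_UNITS = {
    "cpu_and_ne": ct.ComputeUnit.CPU_AND_NE,  # default on Apple Silicon
    "all": ct.ComputeUnit.ALL,                # CPU + GPU + Neural Engine
    "cpu_only": ct.ComputeUnit.CPU_ONLY,      # assumed config value
}

def load_coreml(model_path: str, compute_units: str = "cpu_and_ne"):
    """Load a converted Core ML model pinned to the requested compute units."""
    return ct.models.MLModel(model_path, compute_units=_COMPUTE_UNITS[compute_units])
```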
How It Works
- `deploy.sh` detects your hardware via `env_config.HardwareEnv.detect()`
- Installs the matching `requirements_{backend}.txt` (e.g. CUDA → includes `tensorrt`)
- Pre-converts the default model to the optimal format
- At runtime, `detect.py` loads the cached optimized model automatically
- Falls back to PyTorch if optimization fails (sketched below)

Set `use_optimized: false` to disable auto-conversion and use raw PyTorch.
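A minimal sketch of the runtime loading and fallback logic, assuming the Ultralytics YOLO API; the artifact file names and the mapping below are illustrative, not the skill's actual cache layout:

```python
from pathlib import Path
from ultralytics import YOLO

# Hypothetical backend -> pre-converted artifact mapping; the real cache
# layout is produced by deploy.sh and is not specified here.
CACHED_ARTIFACTS = {
    "cuda": "yolo2026n.engine",              # TensorRT
    "mps": "yolo2026n.mlpackage",            # Core ML
    "openvino": "yolo2026n_openvino_model",  # OpenVINO IR directory
    "rocm": "yolo2026n.onnx",                # ONNX Runtime
    "cpu": "yolo2026n.onnx",
}

def load_model(backend: str, use_optimized: bool = True) -> YOLO:
    """Return the cached optimized model for this backend, else raw PyTorch."""
    artifact = CACHED_ARTIFACTS.get(backend)
    if use_optimized and artifact and Path(artifact).exists():
        try:
            # Ultralytics dispatches on the artifact type (.engine, .onnx, ...)
            return YOLO(artifact)
        except Exception:
            pass  # cached artifact unusable -> fall through to PyTorch
    return YOLO("yolo2026n.pt")  # PyTorch fallback
```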
Auto Start
Set `auto_start: true` in the skill config to start detection automatically when Aegis launches. The skill will begin processing frames from the selected camera immediately.

```yaml
auto_start: true
model_size: nano
fps: 5
```
Performance Monitoring
The skill emits `perf_stats` events every 50 frames with aggregate timing:

```json
{
  "event": "perf_stats",
  "total_frames": 50,
  "timings_ms": {
    "inference": {"avg": 3.4, "p50": 3.2, "p95": 5.1},
    "postprocess": {"avg": 0.15, "p50": 0.12, "p95": 0.31},
    "total": {"avg": 3.6, "p50": 3.4, "p95": 5.5}
  }
}
```
Protocol
Communicates via JSON lines over stdin/stdout.
Aegis → Skill (stdin)
{"event": "frame", "frame_id": 42, "camera_id": "front_door", "timestamp": "...", "frame_path": "/tmp/aegis_detection/frame_front_door.jpg", "width": 1920, "height": 1080}
Skill → Aegis (stdout)
{"event": "ready", "model": "yolo2026n", "device": "mps", "backend": "mps", "format": "coreml", "gpu": "Apple M3", "classes": 80, "fps": 5} {"event": "detections", "frame_id": 42, "camera_id": "front_door", "timestamp": "...", "objects": [ {"class": "person", "confidence": 0.92, "bbox": [100, 50, 300, 400]} ]} {"event": "perf_stats", "total_frames": 50, "timings_ms": {"inference": {"avg": 3.4}}} {"event": "error", "message": "...", "retriable": true}
Bounding Box Format
[x_min, y_min, x_max, y_max] — pixel coordinates (xyxy).
Stop Command
{"command": "stop"}
Installation
The `deploy.sh` bootstrapper handles everything: Python environment, GPU backend detection, dependency installation, and model optimization. No manual setup required.

```sh
./deploy.sh
```
Requirements Files
| File | Backend | Key Deps |
|---|---|---|
| `requirements_cuda.txt` | NVIDIA | `torch` (cu124), `tensorrt` |
| `requirements_mps.txt` | Apple | `torch`, `coremltools` |
| `requirements_openvino.txt` | Intel | `torch`, `openvino` |
| `requirements_rocm.txt` | AMD | `torch` (rocm6.2), `onnxruntime` |
| `requirements_cpu.txt` | CPU | `torch` (cpu), `onnxruntime` |