git clone https://github.com/plurigrid/asi
T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/asi/skills/entropy-sequencer" ~/.claude/skills/plurigrid-asi-entropy-sequencer-8e972d && rm -rf "$T"
entropy-sequencer
Layer 5: Interaction Interleaving for Maximum Information Gain
bmorphism Contributions
"universal topos construction for social cognition and democratization of mathematical approach to problem-solving to all" — Plurigrid: the story thus far
Active Inference as Information Maximization: The entropy-sequencer implements the core Active Inference principle from Active Inference in String Diagrams: agents select actions that maximize expected information gain (epistemic value) while minimizing surprise (pragmatic value).
String Diagram Pattern:
    Perception ─┬→ Entropy Estimation ──┐
                │                       ↓
    Action ←────┴─ Max Information ←─ Sequence Optimizer
This bidirectional loop embodies bmorphism's principle that "all is bidirectional" — perception informs action, action generates new percepts.
Temperature-Aware Sequencing: When T→0 (low temperature), favor exploitation of known high-information patterns. When T→∞ (high temperature), explore uniformly. This mirrors the Langevin dynamics exploration-exploitation trade-off.
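One common way to realize this trade-off is Boltzmann (softmax) sampling over per-interaction gains. The sketch below is illustrative only: `boltzmann_weights` and `sample_index` are hypothetical names, not part of the entropy-sequencer API, and the gains are made-up numbers.

```python
import math
import random
from typing import List, Sequence


def boltzmann_weights(gains: Sequence[float], temperature: float) -> List[float]:
    """Selection probabilities proportional to exp(gain / T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(gains)
    # Subtract the maximum before exponentiating for numerical stability.
    unnormalized = [math.exp((g - top) / temperature) for g in gains]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


def sample_index(gains: Sequence[float], temperature: float, rng: random.Random) -> int:
    """Draw one candidate index according to the Boltzmann weights."""
    weights = boltzmann_weights(gains, temperature)
    return rng.choices(range(len(gains)), weights=weights, k=1)[0]


gains = [0.2, 1.5, 0.9]                             # hypothetical information gains
cold = boltzmann_weights(gains, temperature=0.01)   # nearly all mass on the best gain
hot = boltzmann_weights(gains, temperature=1000.0)  # close to uniform
```

At low temperature the weights collapse onto the highest-gain candidate (exploitation); at high temperature they approach a uniform distribution (exploration).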
Version: 1.1.0
Trit: 0 (Ergodic - coordinates information flow)
Bundle: core
Soatto's Actionable Information Framework
The core information-theoretic foundation from [Soatto & Chiuso]:
    argmax_u I(ξ; I^{t+1}) = argmax_u [H(I_{t+1} | I^t, u) - H(I_{t+1} | ξ, u)]
                                        └───────┬───────┘    └───────┬───────┘
                                        what we learn from   residual noise
                                        action u             (white, isotropic)
Key insight: Given the scene ξ, nuisances become invertible. The residual uncertainty is "white, independent, isotropic" — justifying GF(3) as the minimal observable after all structured nuisances are factored out.
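A small discrete example may make the objective concrete. It assumes a two-state scene with a uniform prior and two hypothetical actions (`look_closer` and `look_away`, both invented for illustration). For brevity it drops the history I^t and computes the gain of each action as the marginal observation entropy minus the residual entropy once the scene is known:

```python
import math
from typing import Dict, List


def entropy(p: List[float]) -> float:
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def expected_information_gain(prior: List[float], likelihood: List[List[float]]) -> float:
    """H(I | u) - H(I | xi, u) for a single action u.

    prior[s]         = p(xi = s)
    likelihood[s][o] = p(I = o | xi = s, u)
    """
    n_obs = len(likelihood[0])
    marginal = [
        sum(prior[s] * likelihood[s][o] for s in range(len(prior)))
        for o in range(n_obs)
    ]
    h_marginal = entropy(marginal)
    h_residual = sum(prior[s] * entropy(likelihood[s]) for s in range(len(prior)))
    return h_marginal - h_residual


prior = [0.5, 0.5]
actions: Dict[str, List[List[float]]] = {
    "look_closer": [[0.9, 0.1], [0.1, 0.9]],  # observation depends on the scene
    "look_away": [[0.5, 0.5], [0.5, 0.5]],    # observation is pure noise
}
gains = {u: expected_information_gain(prior, lik) for u, lik in actions.items()}
best_action = max(gains, key=gains.get)
```

The uninformative action gains nothing because its observations are all residual noise, so the argmax selects the action whose outcome depends on the scene.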
Nuisance Factorization (φ^)
    def nuisance_invariant_representation(observation: Observation) -> Sufficient:
        """
        φ^(I) = sufficient statistic after factoring invertible nuisances

        In DGA detection: ELMo embedding = φ^(domain_string)
        In CRDT: operation trit = φ^(edit_sequence)
        In games: strategy signature = φ^(action_history)
        """
        # Remove invertible nuisances (viewpoint, lighting, surface form)
        canonical = canonicalize(observation)

        # What remains: minimal sufficient statistic
        return project_to_sufficient(canonical)
Line Damage as Observable
Following Nørretranders' User Illusion inversion:
    Passive: World → Observation → Agent   (user illusion)
    Active:  Agent ⊛ World → Δ(World)      (line damage = the observable)
The observation IS the interaction trace, not pixels rendered from latent state.
Open Games Strategy Integration
Entropy-sequencer as strategy optimizer in compositional game theory:
           ┌───────────────────────────┐
    I^t  ──│                           │─→ u* (optimal action)
           │  Entropy-Sequencer Game   │
    R   ←──│                           │←── H(I_{t+1}|I^t, u)
           └───────────────────────────┘
Play/Coplay Structure
    entropyGame :: OpenGame History () Action InformationGain
    entropyGame = Game {
      -- Forward: select action maximizing expected entropy
      play = \history -> argmax_u $ expectedEntropy history u,

      -- Backward: propagate information gain as utility
      coplay = \history action -> conditionalEntropy (observe action) history,

      -- Equilibrium: greedy max-entropy is Nash when agents share scene ξ
      equilibrium = \history -> maxEntropyAction history == nashEquilibrium history
    }
Compositional Sequence Optimization
    -- Sequential composition: each step conditions on accumulated context
    fullSequence :: OpenGame () () [Action] TotalGain
    fullSequence = foldr (;) idGame (replicate n entropyGame)
      where
        -- Nash equilibrium of composed game = greedy max-entropy sequence
        -- (proven via backward induction on information gain)
Multi-Agent Entropy Games
    from __future__ import annotations  # Agent, Scene, Strategy are defined elsewhere

    from typing import List, Set


    class MultiAgentEntropyGame:
        """
        Coalition formation through information sharing.

        GF(3) roles:
          +1 (Generator):   Proposes high-entropy actions
           0 (Coordinator): Evaluates joint information gain
          -1 (Validator):   Prunes redundant/low-gain actions
        """

        def nash_equilibrium(self, agents: List[Agent], scene: Scene) -> Strategy:
            """
            At equilibrium: no agent can unilaterally increase
            joint information gain by changing their action.

            H(I_{t+1}|I^t, u*) ≥ H(I_{t+1}|I^t, u) for all deviations u
            """
            strategies = {}
            for agent in agents:
                trit = agent.gf3_role
                if trit == +1:    # Generator
                    strategies[agent] = self.max_entropy_proposal(agent)
                elif trit == 0:   # Coordinator
                    strategies[agent] = self.evaluate_joint_gain(agents)
                else:             # Validator: prunes the strategies collected so far
                    strategies[agent] = self.prune_redundant(strategies)
            return strategies

        def shapley_information_value(self, agent: Agent, coalition: Set) -> float:
            r"""
            Agent's marginal contribution to coalition's information gain.

            φ_i = Σ_{S⊆N\{i}} (|S|!(n-|S|-1)!/n!) [v(S∪{i}) - v(S)]

            where v(S) = H(I|actions of S) - H(I|ξ)
            """
            return self._shapley_sum(agent, coalition, self._info_gain_value)

## Overview

Entropy-sequencer arranges interaction sequences to maximize learning efficiency. Instead of chronological replay, it reorders interactions to maximize information gain at each step, enabling 3x faster pattern learning.

**NEW (Langevin Integration)**: Temperature-aware sequencing that respects Langevin dynamics and Fokker-Planck convergence analysis. Temperature from Langevin analysis directly controls the noise scale in sequence optimization.

## Capabilities

### 1. arrange-by-max-entropy

Reorder interactions to maximize information content.

```python
from entropy_sequencer import MaxEntropyArranger

arranger = MaxEntropyArranger(seed=0xf061ebbc2ca74d78)

optimal_sequence = arranger.arrange(
    interactions=all_interactions,
    strategy="greedy_information_gain",
    lookahead=5
)
# Returns sequence where each step maximizes new information
```
1b. arrange-with-temperature-awareness (NEW)
Reorder interactions respecting Langevin temperature dynamics.
    # Temperature from Langevin analysis affects noise scale
    optimal_sequence = arranger.arrange_temperature_aware(
        interactions=all_interactions,
        temperature=0.01,                  # From Langevin analysis
        maximize_gradient_alignment=True,  # Color-gradient correlation
        fokker_planck_mixing_time=500      # Estimated convergence time
    )

    # Temperature directly controls exploration vs exploitation:
    # - Low T (0.001): Sharp basin exploration
    # - Medium T (0.01): Balanced exploration
    # - High T (0.1): Broad exploration
2. calculate-information-gain
Compute information gain for a sequence ordering.
    def information_gain(sequence: List[Interaction]) -> float:
        """
        I(S) = Σ H(X_i | X_1, ..., X_{i-1})

        Where H is conditional entropy - how surprising
        each interaction is given what came before.
        """
        total_gain = 0.0
        context = []

        for interaction in sequence:
            surprise = conditional_entropy(interaction, context)
            total_gain += surprise
            context.append(interaction)

        return total_gain
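The `conditional_entropy` helper is not defined in this document. One crude stand-in, offered here only as an assumption, is a compression-length difference: the extra bytes zlib needs once the new interaction is appended to the context. The sketch treats interactions as plain strings and repeats a string-typed `information_gain` so the block is self-contained.

```python
import zlib
from typing import List


def compressed_size(text: str) -> int:
    """Length in bytes of the zlib-compressed UTF-8 encoding."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def conditional_entropy(interaction: str, context: List[str]) -> float:
    """Approximate H(x | context) in bytes as C(context + x) - C(context).

    This is a rough proxy, not a calibrated entropy estimate.
    """
    joined = "\n".join(context)
    with_new = joined + "\n" + interaction if context else interaction
    return float(compressed_size(with_new) - compressed_size(joined))


def information_gain(sequence: List[str]) -> float:
    """Sum of per-step surprise, each step conditioned on everything before it."""
    total_gain = 0.0
    context: List[str] = []
    for interaction in sequence:
        total_gain += conditional_entropy(interaction, context)
        context.append(interaction)
    return total_gain


seen = "the quick brown fox jumps over the lazy dog"
novel = "seed 0xf061 picks beam width 100 at T=0.01"
repeat_cost = conditional_entropy(seen, [seen])   # repeating known text is cheap
novel_cost = conditional_entropy(novel, [seen])   # unseen text costs more bytes
```

Under this proxy, a repeated interaction adds only a few bytes while a novel one adds many, which is the ordering the scoring function relies on.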
3. permutation-search
Search promising permutations efficiently.
    # Don't enumerate all n! permutations - use heuristics
    search = PermutationSearch(
        strategy="beam",        # beam search
        beam_width=100,
        scoring_fn=information_gain,
        seed=0xf061ebbc2ca74d78
    )

    best_ordering = search.find_best(
        interactions=interactions,
        max_iterations=1000
    )
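As a rough illustration of what a beam search over orderings might look like, the sketch below keeps the `beam_width` best partial orderings at each step. It does not reproduce `PermutationSearch`. `beam_search_ordering` and the toy scorer `adjacent_novelty` are hypothetical names, and the scorer (tokens each item adds relative to the previous item) is an assumption chosen because it is order-sensitive.

```python
from typing import Callable, List, Sequence, Tuple


def adjacent_novelty(sequence: Sequence[str]) -> float:
    """Toy score: count the tokens each item adds relative to the previous item."""
    score = 0.0
    previous: set = set()
    for item in sequence:
        tokens = set(item.split())
        score += len(tokens - previous)
        previous = tokens
    return score


def beam_search_ordering(
    items: Sequence[str],
    scoring_fn: Callable[[Sequence[str]], float],
    beam_width: int = 3,
) -> List[str]:
    """Grow orderings one item at a time, keeping the top `beam_width` prefixes."""
    beams: List[Tuple[int, ...]] = [()]
    for _ in range(len(items)):
        candidates = []
        for order in beams:
            used = set(order)
            for idx in range(len(items)):
                if idx in used:
                    continue
                extended = order + (idx,)
                score = scoring_fn([items[i] for i in extended])
                candidates.append((score, extended))
        # Highest score first; ties fall back to index order for determinism.
        candidates.sort(key=lambda c: (-c[0], c[1]))
        beams = [order for _, order in candidates[:beam_width]]
    return [items[i] for i in beams[0]]


items = ["git push", "git pull", "cargo build", "cargo test"]
best = beam_search_ordering(items, adjacent_novelty, beam_width=3)
```

Each step scores at most `beam_width * n` candidates, so the search costs O(n² · beam_width) scoring calls instead of n! for exhaustive enumeration.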
4. predictability-score
Measure how predictable a sequence is (lower = more entropic).
    predictability = calculate_predictability(sequence)

    # Returns:
    # - autocorrelation: How much each step predicts the next
    # - topic_clustering: Are similar topics grouped? (high = predictable)
    # - temporal_monotonicity: Is it chronological? (high = predictable)
    # - overall_score: Combined predictability [0, 1]
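The document does not say how these components are computed. The sketch below shows one plausible reading of two of them, assuming each interaction is a `(timestamp, topic)` pair and that the overall score is a plain average. The function name `predictability_sketch` and both formulas are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Interaction = Tuple[int, str]  # (timestamp, topic): an assumed shape


def predictability_sketch(sequence: List[Interaction]) -> Dict[str, float]:
    """Fraction of adjacent pairs that share a topic or keep time order."""
    pairs = list(zip(sequence, sequence[1:]))
    if not pairs:
        return {"topic_clustering": 0.0, "temporal_monotonicity": 0.0, "overall_score": 0.0}
    topic_clustering = sum(a[1] == b[1] for a, b in pairs) / len(pairs)
    temporal_monotonicity = sum(a[0] <= b[0] for a, b in pairs) / len(pairs)
    return {
        "topic_clustering": topic_clustering,
        "temporal_monotonicity": temporal_monotonicity,
        # Unweighted mean of the two components; the real weighting is unknown.
        "overall_score": (topic_clustering + temporal_monotonicity) / 2,
    }


chronological = [(1, "github"), (2, "github"), (3, "bluesky"), (4, "bluesky")]
shuffled = [(3, "bluesky"), (1, "github"), (4, "bluesky"), (2, "github")]
chrono_scores = predictability_sketch(chronological)
shuffled_scores = predictability_sketch(shuffled)
```

With these toy inputs the chronological, topic-grouped sequence scores higher than the shuffled one, matching the "lower = more entropic" convention.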
Interleaving Strategies
Sequential (Baseline)
    Post 1 → Post 2 → Post 3 → Post 4 → Post 5
    Predictability: 0.85 (high - chronological)
Entropy-Maximized
    Post 5 → Post 1 → Post 3 → Post 2 → Post 4
    Predictability: 0.23 (low - each step surprising)
    Information Gain: 3.2x baseline
Topic-Switched
    GitHub → Bluesky → Web → GitHub → Bluesky
    Predictability: 0.45 (medium - forced context switches)
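A topic-switched ordering like the one above can be produced by a round-robin over per-topic queues. This is a minimal sketch under the assumption that each interaction is a `(topic, payload)` pair. `topic_switched` is a hypothetical helper, not a documented function of this skill.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

Post = Tuple[str, str]  # (topic, payload): an assumed shape


def topic_switched(interactions: List[Post]) -> List[Post]:
    """Round-robin across topics, preserving the original order within each topic."""
    queues: Dict[str, Deque[Post]] = {}
    for post in interactions:
        queues.setdefault(post[0], deque()).append(post)

    ordered: List[Post] = []
    while queues:
        # Iterate over a snapshot of the keys so exhausted topics can be dropped.
        for topic in list(queues):
            ordered.append(queues[topic].popleft())
            if not queues[topic]:
                del queues[topic]
    return ordered


posts = [
    ("GitHub", "g1"), ("GitHub", "g2"),
    ("Bluesky", "b1"), ("Bluesky", "b2"),
    ("Web", "w1"),
]
switched = topic_switched(posts)
topics = [topic for topic, _ in switched]
```

Topics are visited in order of first appearance, so the context switches are forced but the result stays deterministic.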
Network-Flow
    User1 mentions → User2 replies → User3 quotes
    Predictability: 0.55 (follows social graph)
DuckDB Integration
    -- Store entropy-optimized sequences
    CREATE TABLE optimized_sequences (
        sequence_id VARCHAR PRIMARY KEY,
        original_order VARCHAR[],
        optimized_order VARCHAR[],
        information_gain FLOAT,
        predictability_score FLOAT,
        strategy VARCHAR,
        seed UBIGINT,  -- unsigned: the default seed 0xf061ebbc2ca74d78 exceeds the signed BIGINT range
        created_at TIMESTAMP
    );

    -- Query: Get best sequences for training
    SELECT * FROM optimized_sequences
    WHERE information_gain > 2.0
    ORDER BY information_gain DESC
    LIMIT 100;
GF(3) Triad Integration
| Trit | Skill | Role |
|---|---|---|
| -1 | three-match | Reduces/validates sequence constraints |
| 0 | entropy-sequencer | Coordinates optimal ordering |
| +1 | triad-interleave | Generates interleaved streams |
Conservation: (-1) + (0) + (+1) = 0 ✓
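The conservation rule is easy to check mechanically. A minimal sketch, assuming trits are stored as a mapping from skill name to an integer in {-1, 0, +1} (`is_gf3_balanced` is a hypothetical helper):

```python
from typing import Dict


def is_gf3_balanced(trits: Dict[str, int]) -> bool:
    """True when the trits sum to 0 modulo 3."""
    for name, trit in trits.items():
        if trit not in (-1, 0, 1):
            raise ValueError(f"{name}: trit must be -1, 0, or +1, got {trit}")
    return sum(trits.values()) % 3 == 0


triad = {"three-match": -1, "entropy-sequencer": 0, "triad-interleave": +1}
balanced = is_gf3_balanced(triad)
```

Because the check is modulo 3, three skills with the same nonzero trit also pass it, while a pair such as (+1, +1) does not.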
Algorithm: Greedy Information Gain
    def greedy_max_entropy(interactions: List, seed: int) -> List:
        """O(n²) greedy algorithm for entropy maximization."""
        rng = SplitMix64(seed)  # NOTE: constructed but not used in this greedy pass
        remaining = set(range(len(interactions)))
        sequence = []
        context = []

        while remaining:
            best_idx = None
            best_gain = -float('inf')

            # Sorted so ties resolve to the lowest index, independent of set ordering
            for idx in sorted(remaining):
                gain = conditional_entropy(interactions[idx], context)
                if gain > best_gain:
                    best_gain = gain
                    best_idx = idx

            sequence.append(interactions[best_idx])
            context.append(interactions[best_idx])
            remaining.remove(best_idx)

        return sequence
Configuration
    # entropy-sequencer.yaml
    search:
      strategy: beam  # greedy, beam, genetic, simulated_annealing
      beam_width: 100
      max_iterations: 1000

    scoring:
      entropy_weight: 1.0
      diversity_weight: 0.3
      topic_switch_bonus: 0.2

    reproducibility:
      seed: 0xf061ebbc2ca74d78
      deterministic: true
Example Workflow
    # 1. Load interactions
    just entropy-load interactions.duckdb

    # 2. Optimize sequence
    just entropy-optimize --strategy beam --lookahead 5

    # 3. Compare to baseline
    just entropy-compare --baseline chronological

    # 4. Export for training
    just entropy-export optimized_sequence.json
Polysemy as Effect Chaining
Connection to context-sensitive embeddings (ELMo, polysemous effects):
    NLP (ELMo)                      Effect Systems (Polysemy)
    ──────────────────────────────────────────────────────
    "bow" → embedding(context)      Embed → handler(context)
    biLM forward/backward           Effect stack (outer→inner)
    Polyseme disambiguation         Effect interpretation
DGA Detection Application (Koh & Rhodes, arXiv:1811.08705):
    # The semantic signature IS the entropy signature
    def dga_entropy_signature(domain: str) -> float:
        """
        DGA domains have anomalous H(I_{t+1}|I^t, u) because:
        - Legitimate: words contextually valid → low conditional entropy
        - DGA: pseudorandom concatenation → high conditional entropy

        ELMo embedding = φ^(domain) = nuisance-invariant representation
        Classifier learns: p(DGA | φ^) via semantic entropy
        """
        words = wordninja.split(domain)
        embeddings = elmo.embed(words)
        return conditional_entropy_chain(embeddings)
Effect Chaining = biLM Context Propagation:
    Forward:  effect₁ ; effect₂ ; effect₃     → accumulated context
    Backward: handler₃ ∘ handler₂ ∘ handler₁  ← interpretation stack

    # Order matters (non-commutative):
    runState (runError m) ≠ runError (runState m)
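The non-commutativity can be seen in the shape of the results. The Python sketch below does not model Polysemy itself. It only mimics the two result shapes that arise when state and error handlers are stacked in different orders: `(state, Either)` when the state survives a throw, and `Either (state, value)` when it does not. The tiny op language (`put`, `throw`) and both function names are invented for illustration.

```python
from typing import List, Tuple

Op = Tuple[str, object]  # ("put", new_state) or ("throw", message)


def run_keeping_state(program: List[Op], state: object):
    """Mimics the (s, Either e a) shape: the state reached before a throw survives."""
    for name, arg in program:
        if name == "put":
            state = arg
        elif name == "throw":
            return state, ("Left", arg)
        else:
            raise ValueError(f"unknown op: {name}")
    return state, ("Right", None)


def run_discarding_state(program: List[Op], state: object):
    """Mimics the Either e (s, a) shape: a throw discards the accumulated state."""
    for name, arg in program:
        if name == "put":
            state = arg
        elif name == "throw":
            return ("Left", arg)
        else:
            raise ValueError(f"unknown op: {name}")
    return ("Right", (state, None))


program: List[Op] = [("put", 1), ("throw", "boom")]
kept = run_keeping_state(program, 0)       # state 1 is still visible after the error
dropped = run_discarding_state(program, 0)  # only the error remains
```

The same program yields different observable results under the two stackings, which is why handler order has to be fixed before interpretations can be compared.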
Disentangled Representations
From Higgins et al. symmetry-based disentanglement:
    G = G_h × G_v × G_c   (horizontal × vertical × hue)
            ↓ abelianize
    G/[G,G] → Z₃          (GF(3) quotient)
The trit is the minimal disentangled factor — the irreducible quantum of "something happened" after factoring out all structured nuisances.
Hedges' 4-Kind Lattice
The entropy flow has temporal direction (from `bidirectional-lens-logic`):
    Kind = (covariant, contravariant)

    H(I_{t+1} | I^t, u)  : Covariant (+1)     — forward prediction
    H(I_{t+1} | ξ, u)    : Contravariant (-1) — backward from scene
    I(ξ; I_{t+1})        : Invariant (⊗)      — linear combination
    Unit                 : Bivariant (0)      — coordinator
Tensor product determines information flow composition:
    Tensor : Ty (covx, conx) -> Ty (covy, cony) -> Ty (covx && covy, conx && cony)

    -- When forward entropy (+1) tensors with backward entropy (-1):
    -- Result is INVARIANT (linear) — must consume exactly once
The two NotIntro rules explain why +1 and -1 generators/validators have different operational semantics even when balanced:
    NotIntroCov : {a : Ty (True, con)} -> Term (a :: as) Unit -> Term as (Not a)
    NotIntroCon : {a : Ty (cov, True)} -> Term (a :: as) Unit -> Term as (Not a)
    -- Both valid for bivariant types, but DIFFERENT RESULTS!
Related Skills
- `triad-interleave` - Generates base interleaved streams
- `agent-o-rama` (Layer 4) - Consumes optimized sequences
- `gay-mcp` - Deterministic seeding
- `three-match` - Constraint validation
- `open-games` - Play/coplay strategy structure
- `polysimy-effect-chains` - Effect interpretation verification
- `cybernetic-immune` - Reafference for self/non-self via entropy
- `bidirectional-lens-logic` - 4-kind lattice foundation
References
- Soatto & Chiuso, "Visual Representations: Defining Properties and Deep Approximations"
- Koh & Rhodes, "Inline Detection of Domain Generation Algorithms with Context-Sensitive Word Embeddings" (arXiv:1811.08705)
- Ghani, Hedges et al., "Compositional Game Theory" (arXiv:1603.04641)
- Higgins et al., "Symmetry-Based Disentangled Representation Learning"
- Nørretranders, "The User Illusion"
- Friston, "Active Inference and Free Energy"
Scientific Skill Interleaving
This skill connects to the K-Dense-AI/claude-scientific-skills ecosystem:
Scientific Computing
- scipy [○] via bicomodule
- Hub for numerical/scientific computation
Bibliography References
- `general`: 734 citations in bib.duckdb
Cat# Integration
This skill maps to Cat# = Comod(P) as a bicomodule in the equipment structure:
    Trit: 0 (ERGODIC)
    Home: Prof
    Poly Op: ◁
    Kan Role: Adj
    Color: #26D826
GF(3) Naturality
The skill participates in triads satisfying:
(-1) + (0) + (+1) ≡ 0 (mod 3)
This ensures compositional coherence in the Cat# equipment structure.
Forward Reference
- unified-reafference (temporal entropy coordination)