Asi entropy-sequencer

Layer 5: Interaction Interleaving for Maximum Information Gain

install

source · Clone the upstream repo

git clone https://github.com/plurigrid/asi

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/asi/skills/entropy-sequencer" ~/.claude/skills/plurigrid-asi-entropy-sequencer-8e972d && rm -rf "$T"

manifest: plugins/asi/skills/entropy-sequencer/SKILL.md

source content

entropy-sequencer

Layer 5: Interaction Interleaving for Maximum Information Gain

bmorphism Contributions

"universal topos construction for social cognition and democratization of mathematical approach to problem-solving to all" — Plurigrid: the story thus far

Active Inference as Information Maximization: The entropy-sequencer implements the core Active Inference principle from Active Inference in String Diagrams: agents select actions that maximize expected information gain (epistemic value) while minimizing surprise (pragmatic value).

String Diagram Pattern:

Perception ─┬→ Entropy Estimation ──┐
            │                        ↓
Action ←────┴─ Max Information ←─ Sequence Optimizer

This bidirectional loop embodies bmorphism's principle that "all is bidirectional" — perception informs action, action generates new percepts.

Temperature-Aware Sequencing: When T→0 (low temperature), favor exploitation of known high-information patterns. When T→∞ (high temperature), explore uniformly. This mirrors the Langevin dynamics exploration-exploitation trade-off.

Version: 1.1.0
Trit: 0 (Ergodic - coordinates information flow)
Bundle: core

Soatto's Actionable Information Framework

The core information-theoretic foundation from [Soatto & Chiuso]:

argmax_u I(ξ; I^{t+1}) = argmax_u [H(I_{t+1} | I^t, u) - H(I_{t+1} | ξ, u)]
                                   └───────┬───────┘   └───────┬───────┘
                                   what we learn from    residual noise
                                   action u              (white, isotropic)

Key insight: Given the scene ξ, nuisances become invertible. The residual uncertainty is "white, independent, isotropic" — justifying GF(3) as the minimal observable after all structured nuisances are factored out.

Nuisance Factorization (φ^)

def nuisance_invariant_representation(observation: Observation) -> Sufficient:
    """
    φ^(I) = sufficient statistic after factoring invertible nuisances
    
    In DGA detection: ELMo embedding = φ^(domain_string)
    In CRDT: operation trit = φ^(edit_sequence)
    In games: strategy signature = φ^(action_history)
    """
    # Remove invertible nuisances (viewpoint, lighting, surface form)
    canonical = canonicalize(observation)
    # What remains: minimal sufficient statistic
    return project_to_sufficient(canonical)

Line Damage as Observable

Following Nørretranders' User Illusion inversion:

Passive:  World → Observation → Agent    (user illusion)
Active:   Agent ⊛ World → Δ(World)       (line damage = the observable)

The observation IS the interaction trace, not pixels rendered from latent state.

Open Games Strategy Integration

Entropy-sequencer as strategy optimizer in compositional game theory:

        ┌───────────────────────────┐
   I^t ─│                           │─→ u* (optimal action)
        │   Entropy-Sequencer Game  │
   R ←──│                           │←── H(I_{t+1}|I^t, u)
        └───────────────────────────┘

Play/Coplay Structure

entropyGame :: OpenGame History () Action InformationGain
entropyGame = Game {
  -- Forward: select action maximizing expected entropy
  play = \history -> argmax_u $ expectedEntropy history u,
  
  -- Backward: propagate information gain as utility
  coplay = \history action -> conditionalEntropy (observe action) history,
  
  -- Equilibrium: greedy max-entropy is Nash when agents share scene ξ
  equilibrium = \history -> maxEntropyAction history == nashEquilibrium history
}

Compositional Sequence Optimization

-- Sequential composition: each step conditions on accumulated context
fullSequence :: OpenGame () () [Action] TotalGain
fullSequence = foldr (;) idGame (replicate n entropyGame)
  where
    -- Nash equilibrium of composed game = greedy max-entropy sequence
    -- (proven via backward induction on information gain)

Multi-Agent Entropy Games

class MultiAgentEntropyGame:
    """
    Coalition formation through information sharing.
    
    GF(3) roles:
      +1 (Generator): Proposes high-entropy actions
       0 (Coordinator): Evaluates joint information gain
      -1 (Validator): Prunes redundant/low-gain actions
    """
    
    def nash_equilibrium(self, agents: List[Agent], scene: Scene) -> Strategy:
        """
        At equilibrium: no agent can unilaterally increase 
        joint information gain by changing their action.
        
        H(I_{t+1}|I^t, u*) ≥ H(I_{t+1}|I^t, u) for all deviations u
        """
        strategies = {}
        for agent in agents:
            trit = agent.gf3_role
            if trit == +1:  # Generator
                strategies[agent] = self.max_entropy_proposal(agent)
            elif trit == 0:  # Coordinator
                strategies[agent] = self.evaluate_joint_gain(agents)
            else:  # Validator
                strategies[agent] = self.prune_redundant(strategies)
        return strategies
    
    def shapley_information_value(self, agent: Agent, coalition: Set) -> float:
        """
        Agent's marginal contribution to coalition's information gain.
        
        φ_i = Σ_{S⊆N\{i}} (|S|!(n-|S|-1)!/n!) [v(S∪{i}) - v(S)]
        
        where v(S) = H(I|actions of S) - H(I|ξ)
        """
        return self._shapley_sum(agent, coalition, self._info_gain_value)  

## Overview

Entropy-sequencer arranges interaction sequences to maximize learning efficiency. Instead of chronological replay, it reorders interactions to maximize information gain at each step, enabling 3x faster pattern learning.

**NEW (Langevin Integration)**: Temperature-aware sequencing that respects Langevin dynamics and Fokker-Planck convergence analysis. Temperature from Langevin analysis directly controls the noise scale in sequence optimization.

## Capabilities

### 1. arrange-by-max-entropy

Reorder interactions to maximize information content.

```python
from entropy_sequencer import MaxEntropyArranger

arranger = MaxEntropyArranger(seed=0xf061ebbc2ca74d78)

optimal_sequence = arranger.arrange(
    interactions=all_interactions,
    strategy="greedy_information_gain",
    lookahead=5
)

# Returns sequence where each step maximizes new information

1b. arrange-with-temperature-awareness (NEW)

Reorder interactions respecting Langevin temperature dynamics.

# Temperature from Langevin analysis affects noise scale
optimal_sequence = arranger.arrange_temperature_aware(
    interactions=all_interactions,
    temperature=0.01,  # From Langevin analysis
    maximize_gradient_alignment=True,  # Color-gradient correlation
    fokker_planck_mixing_time=500     # Estimated convergence time
)

# Temperature directly controls exploration vs exploitation:
# - Low T (0.001): Sharp basin exploration
# - Medium T (0.01): Balanced exploration
# - High T (0.1): Broad exploration

2. calculate-information-gain

Compute information gain for a sequence ordering.

def information_gain(sequence: List[Interaction]) -> float:
    """
    I(S) = Σ H(X_i | X_1, ..., X_{i-1})
    
    Where H is conditional entropy - how surprising each
    interaction is given what came before.
    """
    total_gain = 0.0
    context = []
    
    for interaction in sequence:
        surprise = conditional_entropy(interaction, context)
        total_gain += surprise
        context.append(interaction)
    
    return total_gain

3. permutation-search

Search promising permutations efficiently.

# Don't enumerate all n! permutations - use heuristics
search = PermutationSearch(
    strategy="beam",     # beam search
    beam_width=100,
    scoring_fn=information_gain,
    seed=0xf061ebbc2ca74d78
)

best_ordering = search.find_best(
    interactions=interactions,
    max_iterations=1000
)

4. predictability-score

Measure how predictable a sequence is (lower = more entropic).

predictability = calculate_predictability(sequence)

# Returns:
# - autocorrelation: How much each step predicts the next
# - topic_clustering: Are similar topics grouped? (high = predictable)
# - temporal_monotonicity: Is it chronological? (high = predictable)
# - overall_score: Combined predictability [0, 1]

Interleaving Strategies

Sequential (Baseline)

Post 1 → Post 2 → Post 3 → Post 4 → Post 5
Predictability: 0.85 (high - chronological)

Entropy-Maximized

Post 5 → Post 1 → Post 3 → Post 2 → Post 4
Predictability: 0.23 (low - each step surprising)
Information Gain: 3.2x baseline

Topic-Switched

GitHub → Bluesky → Web → GitHub → Bluesky
Predictability: 0.45 (medium - forced context switches)

Network-Flow

User1 mentions → User2 replies → User3 quotes
Predictability: 0.55 (follows social graph)

DuckDB Integration

-- Store entropy-optimized sequences
CREATE TABLE optimized_sequences (
    sequence_id VARCHAR PRIMARY KEY,
    original_order VARCHAR[],
    optimized_order VARCHAR[],
    information_gain FLOAT,
    predictability_score FLOAT,
    strategy VARCHAR,
    seed BIGINT,
    created_at TIMESTAMP
);

-- Query: Get best sequences for training
SELECT * FROM optimized_sequences
WHERE information_gain > 2.0
ORDER BY information_gain DESC
LIMIT 100;

GF(3) Triad Integration

Trit	Skill	Role
-1	three-match	Reduces/validates sequence constraints
0	entropy-sequencer	Coordinates optimal ordering
+1	triad-interleave	Generates interleaved streams

Conservation: (-1) + (0) + (+1) = 0 ✓

Algorithm: Greedy Information Gain

def greedy_max_entropy(interactions: List, seed: int) -> List:
    """O(n²) greedy algorithm for entropy maximization."""
    rng = SplitMix64(seed)
    remaining = set(range(len(interactions)))
    sequence = []
    context = []
    
    while remaining:
        best_idx = None
        best_gain = -float('inf')
        
        for idx in remaining:
            gain = conditional_entropy(interactions[idx], context)
            if gain > best_gain:
                best_gain = gain
                best_idx = idx
        
        sequence.append(interactions[best_idx])
        context.append(interactions[best_idx])
        remaining.remove(best_idx)
    
    return sequence

Configuration

# entropy-sequencer.yaml
search:
  strategy: beam  # greedy, beam, genetic, simulated_annealing
  beam_width: 100
  max_iterations: 1000

scoring:
  entropy_weight: 1.0
  diversity_weight: 0.3
  topic_switch_bonus: 0.2

reproducibility:
  seed: 0xf061ebbc2ca74d78
  deterministic: true

Example Workflow

# 1. Load interactions
just entropy-load interactions.duckdb

# 2. Optimize sequence
just entropy-optimize --strategy beam --lookahead 5

# 3. Compare to baseline
just entropy-compare --baseline chronological

# 4. Export for training
just entropy-export optimized_sequence.json

Polysemy as Effect Chaining

Connection to context-sensitive embeddings (ELMo, polysemous effects):

NLP (ELMo)                    Effect Systems (Polysemy)
──────────────────────────────────────────────────────
"bow" → embedding(context)    Embed → handler(context)
biLM forward/backward         Effect stack (outer→inner)
Polyseme disambiguation       Effect interpretation

DGA Detection Application (Koh & Rhodes, arXiv:1811.08705):

# The semantic signature IS the entropy signature
def dga_entropy_signature(domain: str) -> float:
    """
    DGA domains have anomalous H(I_{t+1}|I^t, u) because:
    - Legitimate: words contextually valid → low conditional entropy
    - DGA: pseudorandom concatenation → high conditional entropy
    
    ELMo embedding = φ^(domain) = nuisance-invariant representation
    Classifier learns: p(DGA | φ^) via semantic entropy
    """
    words = wordninja.split(domain)
    embeddings = elmo.embed(words)
    return conditional_entropy_chain(embeddings)

Effect Chaining = biLM Context Propagation:

Forward:  effect₁ ; effect₂ ; effect₃  →  accumulated context
Backward: handler₃ ∘ handler₂ ∘ handler₁  ←  interpretation stack

# Order matters (non-commutative):
runState (runError m) ≠ runError (runState m)

Disentangled Representations

From Higgins et al. symmetry-based disentanglement:

G = G_h × G_v × G_c        (horizontal × vertical × hue)
    ↓ abelianize
G/[G,G] → Z₃              (GF(3) quotient)

The trit is the minimal disentangled factor — the irreducible quantum of "something happened" after factoring out all structured nuisances.

Hedges' 4-Kind Lattice

The entropy flow has temporal direction (from

bidirectional-lens-logic

Kind = (covariant, contravariant)

H(I_{t+1} | I^t, u)  : Covariant   (+1)  — forward prediction
H(I_{t+1} | ξ, u)    : Contravariant (-1) — backward from scene  
I(ξ; I_{t+1})        : Invariant   (⊗)   — linear combination
Unit                 : Bivariant   (0)   — coordinator

Tensor product determines information flow composition:

Tensor : Ty (covx, conx) -> Ty (covy, cony)
      -> Ty (covx && covy, conx && cony)

-- When forward entropy (+1) tensors with backward entropy (-1):
-- Result is INVARIANT (linear) — must consume exactly once

The two NotIntro rules explain why +1 and -1 generators/validators have different operational semantics even when balanced:

NotIntroCov : {a : Ty (True, con)} -> Term (a :: as) Unit -> Term as (Not a)
NotIntroCon : {a : Ty (cov, True)} -> Term (a :: as) Unit -> Term as (Not a)
-- Both valid for bivariant types, but DIFFERENT RESULTS!

Related Skills

```
triad-interleave
```
- Generates base interleaved streams
```
agent-o-rama
```
(Layer 4) - Consumes optimized sequences
```
gay-mcp
```
- Deterministic seeding
```
three-match
```
- Constraint validation
```
open-games
```
- Play/coplay strategy structure
```
polysimy-effect-chains
```
- Effect interpretation verification
```
cybernetic-immune
```
- Reafference for self/non-self via entropy
```
bidirectional-lens-logic
```
- 4-kind lattice foundation

References

Soatto & Chiuso, "Visual Representations: Defining Properties and Deep Approximations"
Koh & Rhodes, "Inline Detection of Domain Generation Algorithms with Context-Sensitive Word Embeddings" (arXiv:1811.08705)
Ghani, Hedges et al., "Compositional Game Theory" (arXiv:1603.04641)
Higgins et al., "Symmetry-Based Disentangled Representation Learning"
Nørretranders, "The User Illusion"
Friston, "Active Inference and Free Energy"

Scientific Skill Interleaving

This skill connects to the K-Dense-AI/claude-scientific-skills ecosystem:

Scientific Computing

scipy [○] via bicomodule
- Hub for numerical/scientific computation

Bibliography References

```
general
```
: 734 citations in bib.duckdb

Cat# Integration

This skill maps to Cat# = Comod(P) as a bicomodule in the equipment structure:

Trit: 0 (ERGODIC)
Home: Prof
Poly Op: ◁
Kan Role: Adj
Color: #26D826

GF(3) Naturality

The skill participates in triads satisfying:

(-1) + (0) + (+1) ≡ 0 (mod 3)

This ensures compositional coherence in the Cat# equipment structure.

Forward Reference

unified-reafference (temporal entropy coordination)