Asi vertex-protein-bisimulation
Protein folding as compositional game on Vertex AI. GameOpt combinatorial Bayesian optimization over residue positions, bisimulation on conformational trajectories, monad-bayes posterior over folding pathways.
install
source · Clone the upstream repo
git clone https://github.com/plurigrid/asi
Claude Code · Install into ~/.claude/skills/
```shell
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/plurigrid/asi "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/skills/vertex-protein-bisimulation" \
       ~/.claude/skills/plurigrid-asi-vertex-protein-bisimulation \
  && rm -rf "$T"
```
manifest: skills/vertex-protein-bisimulation/SKILL.md
vertex-protein-bisimulation Skill
description: > Protein folding as compositional game on Vertex AI. GameOpt combinatorial Bayesian optimization over residue positions, bisimulation on conformational trajectories, monad-bayes posterior over folding pathways. Use when applying game-theoretic optimization to protein design, checking bisimulation equivalence of folding trajectories, or running AlphaFold/ESMFold batch prediction on Vertex AI.
vertex-protein-bisimulation
Folding funnel = payoff landscape. Minimal frustration = Nash equilibrium.
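The tagline can be made concrete: a sequence is minimally frustrated when no single-residue substitution lowers its energy, which is exactly a Nash equilibrium of the residue game. A minimal sketch, where `energy` is a stand-in for a real folding score (not part of the skill):

```python
def is_nash_equilibrium(seq: str, energy, alphabet: str = "ACDEFGHIKLMNPQRSTVWY") -> bool:
    """True iff no unilateral single-residue substitution strictly lowers energy."""
    e0 = energy(seq)
    for i in range(len(seq)):
        for aa in alphabet:
            if aa != seq[i] and energy(seq[:i] + aa + seq[i + 1:]) < e0:
                return False  # profitable deviation found: a frustrated position
    return True

# Stub energy for illustration: counts alanines, so any A-free sequence is an equilibrium
count_a = lambda s: s.count("A")
```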
Architecture
```
Basin-Hedges (ParaLens 6-wire)
|
+-- GameOpt layer (Bal, Sessa, Mutny, Krause 2024)
|     Residue positions = players
|     Amino acid identities = strategies
|     Upper confidence bound equilibria guide search
|     Counterfactual gating prunes combinatorial space
|
+-- Vertex AI Pipeline (compute backend)
|     AlphaFold v2 batch: KFP pipeline, 3 phases
|       CPU (MSA) -> GPU (predict) -> GPU (relax)
|     ESMFold: single-seq, no MSA, 10-30x faster
|       HuggingFace: facebook/esmfold_v1
|     Batch prediction: 50% cost discount, 24hr
|
+-- Bisimulation on Folding
      Two trajectories bisimilar iff same native state
      despite different intermediate conformations
      Stochastic process algebra on Markov state models
      CellValue lattice: Nothing = unfolded, Value = native,
                         Contradiction = misfolded aggregate
```
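The CellValue lattice at the leaves of the tree can be sketched as a three-valued join semilattice. The names below (`CellValue`, `join`) are illustrative, not an API exported by the skill:

```python
from enum import Enum

class CellValue(Enum):
    """Lattice over folding outcomes, ordered NOTHING < VALUE < CONTRADICTION."""
    NOTHING = "unfolded"         # bottom: no commitment yet
    VALUE = "native"             # a committed native structure
    CONTRADICTION = "misfolded"  # top: conflicting commitments (aggregate)

def join(a: CellValue, b: CellValue) -> CellValue:
    """Least upper bound: NOTHING is the identity, disagreement escalates."""
    if a == b:
        return a
    if a == CellValue.NOTHING:
        return b
    if b == CellValue.NOTHING:
        return a
    # Distinct non-bottom values disagree: misfolded aggregate
    return CellValue.CONTRADICTION
```

Aggregating evidence from two trajectory endpoints is then a fold with `join`, starting from `NOTHING`.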
monad-bayes Integration
```haskell
-- Posterior over folding pathways
foldingPathway :: MonadMeasure m => Sequence -> m Structure
foldingPathway seq = do
  -- Prior: Ramachandran angles per residue
  angles <- replicateM (length seq) $ do
    phi <- uniform (-pi) pi
    psi <- uniform (-pi) pi
    return (phi, psi)
  -- Energy function as likelihood
  let energy = forceField seq angles
  factor (Exp (negate energy / kT))
  -- Return the structure implied by the sampled angles
  return (buildStructure seq angles)

-- GameOpt: combinatorial optimization as open game
proteinGame :: OpenGame Stochastic [AminoAcid] Energy
proteinGame = sequentialCompose residueGames
  where
    residueGames = map residueChoice [1 .. nPositions]
    residueChoice i = decision ("residue_" ++ show i) aminoAcids ucbPayoff
```
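For readers without a Haskell toolchain, the same posterior can be approximated with plain self-normalized importance sampling: a uniform Ramachandran prior, with `exp(-E/kT)` weights playing the role of `factor`. The toy force field below is a stand-in for `forceField`, not the skill's energy model:

```python
import math
import random

KT = 1.0  # reduced temperature units (assumption for this sketch)

def toy_force_field(angles):
    """Placeholder energy: squared deviation from an alpha-helical (phi, psi)."""
    phi0, psi0 = -1.05, -0.79  # roughly (-60, -45) degrees, in radians
    return sum((phi - phi0) ** 2 + (psi - psi0) ** 2 for phi, psi in angles)

def folding_pathway_posterior(seq_len, n_samples=5000, seed=0):
    """Prior: uniform angles per residue; weight: exp(-E/kT), as in `factor`."""
    rng = random.Random(seed)
    samples, weights = [], []
    for _ in range(n_samples):
        angles = [(rng.uniform(-math.pi, math.pi), rng.uniform(-math.pi, math.pi))
                  for _ in range(seq_len)]
        samples.append(angles)
        weights.append(math.exp(-toy_force_field(angles) / KT))
    total = sum(weights)
    return samples, [w / total for w in weights]

samples, weights = folding_pathway_posterior(seq_len=3)
# Posterior mean of phi for residue 0, which should sit near the helical basin
mean_phi0 = sum(w * s[0][0] for s, w in zip(samples, weights))
```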
Concrete Affordances
AlphaFold Batch Workflow on Vertex AI
Run the 3-phase AlphaFold pipeline (MSA on CPU, prediction on GPU, relaxation on GPU) via Vertex AI Pipelines. See also the vertex-ai-protein-interleave skill for full gcloud project setup.
```shell
# Prerequisites:
#   gcloud auth login
#   gcloud config set project YOUR_PROJECT_ID
#   gcloud services enable aiplatform.googleapis.com lifesciences.googleapis.com

# 1. Build the AlphaFold container (one-time)
gcloud builds submit \
  --tag gcr.io/${GOOGLE_CLOUD_PROJECT}/alphafold-batch:latest \
  --timeout=3600s \
  /Users/alice/v/asi/skills/vertex-ai-protein-interleave/docker/

# 2. Submit a batch prediction pipeline
#    Input: FASTA file in GCS; Output: PDB structures in GCS
gcloud ai custom-jobs create \
  --region=us-central1 \
  --display-name="alphafold-batch-$(date +%Y%m%d-%H%M%S)" \
  --worker-pool-spec="\
machine-type=n1-standard-8,\
accelerator-type=NVIDIA_TESLA_A100,\
accelerator-count=1,\
replica-count=1,\
container-image-uri=gcr.io/${GOOGLE_CLOUD_PROJECT}/alphafold-batch:latest" \
  --args="\
--fasta_paths=gs://${GOOGLE_CLOUD_PROJECT}-alphafold/input/sequences.fasta,\
--output_dir=gs://${GOOGLE_CLOUD_PROJECT}-alphafold/output/,\
--model_preset=monomer,\
--db_preset=reduced_dbs,\
--max_template_date=2026-03-01"

# 3. Monitor the job
gcloud ai custom-jobs list --region=us-central1 \
  --filter="displayName~alphafold-batch" --limit=5
gcloud ai custom-jobs describe JOB_ID --region=us-central1
```
ESMFold Single-Sequence Prediction (Python)
Faster alternative when MSA is unnecessary:
```python
# pip install torch transformers
from transformers import EsmForProteinFolding, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("facebook/esmfold_v1")
model = EsmForProteinFolding.from_pretrained("facebook/esmfold_v1")
model = model.eval()

sequence = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"
inputs = tokenizer([sequence], return_tensors="pt", add_special_tokens=False)
with torch.no_grad():
    output = model(**inputs)

# Extract pLDDT confidence and PDB string
plddt = output["plddt"].mean().item()
pdb_str = model.output_to_pdb(output)[0]

print(f"Mean pLDDT: {plddt:.1f}")
with open("/tmp/predicted.pdb", "w") as f:
    f.write(pdb_str)
print("Structure written to /tmp/predicted.pdb")
```
GameOpt over Residue Positions (Python)
Combinatorial Bayesian optimization treating residue positions as players in a game. Each position's best response under the upper confidence bound acquisition is found by enumerating the 20 amino acids:
```python
# pip install numpy
import numpy as np

# 20 standard amino acids, N mutable positions
AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")
N_POSITIONS = 5

# Simulated energy function (replace with an AlphaFold/ESMFold pLDDT call)
def energy_fn(encoding: np.ndarray) -> float:
    """Negative pLDDT proxy. encoding: (N_POSITIONS,) floats in [0, 19]."""
    rounded = np.round(encoding).astype(int) % 20
    # Placeholder: pairwise contact energy
    return sum(abs(rounded[i] - rounded[i + 1]) * 0.1 for i in range(len(rounded) - 1))

# UCB acquisition: f(x) - kappa * sigma(x)
# In GameOpt each position is a "player" choosing an amino acid
def ucb_acquisition(x, kappa=2.0, gp_mean=None, gp_std=None):
    """Upper confidence bound for game-theoretic BO."""
    mu = gp_mean(x) if gp_mean else energy_fn(x)
    sigma = gp_std(x) if gp_std else 0.5  # prior uncertainty
    return mu - kappa * sigma

# Per-player (per-residue) best response by enumeration
def gameopt_step(current: np.ndarray, position: int) -> int:
    """Find best amino acid at `position` holding others fixed (Nash best response)."""
    best_aa, best_val = 0, float('inf')
    for aa_idx in range(20):
        candidate = current.copy()
        candidate[position] = aa_idx
        val = ucb_acquisition(candidate)
        if val < best_val:
            best_val = val
            best_aa = aa_idx
    return best_aa

# Iterative best-response loop (converges to a Nash equilibrium)
state = np.random.randint(0, 20, size=N_POSITIONS).astype(float)
for iteration in range(50):
    for pos in range(N_POSITIONS):
        state[pos] = gameopt_step(state, pos)
    e = energy_fn(state)
    if iteration % 10 == 0:
        seq = ''.join(AMINO_ACIDS[int(s) % 20] for s in state)
        print(f"Iter {iteration}: sequence={seq} energy={e:.4f}")

final_seq = ''.join(AMINO_ACIDS[int(s) % 20] for s in state)
print(f"GameOpt result: {final_seq} energy={energy_fn(state):.4f}")
```
Bisimulation Check on Folding Trajectories (Python)
Check whether two MD folding trajectories reach bisimilar native states:
```python
# pip install mdtraj numpy
import mdtraj as md

def rmsd_bisimulation(traj_a_path: str, traj_b_path: str,
                      threshold_nm: float = 0.3, top: str | None = None) -> dict:
    """
    Two trajectories are bisimilar iff their final frames have RMSD < threshold.
    This implements the stochastic process algebra check from the architecture doc:
    bisimilar <=> same native state despite different intermediates.
    `top` is a topology file (e.g. a PDB), required for formats like XTC/DCD.
    """
    traj_a = md.load(traj_a_path, top=top) if top else md.load(traj_a_path)
    traj_b = md.load(traj_b_path, top=top) if top else md.load(traj_b_path)
    # Compare final frames (md.rmsd superposes optimally before measuring)
    final_a = traj_a[-1]
    final_b = traj_b[-1]
    rmsd = md.rmsd(final_b, final_a)[0]
    return {
        "bisimilar": rmsd < threshold_nm,
        "rmsd_nm": float(rmsd),
        "threshold_nm": threshold_nm,
        "traj_a_frames": traj_a.n_frames,
        "traj_b_frames": traj_b.n_frames,
    }

# Usage (topology file name is illustrative):
# result = rmsd_bisimulation("fold_run1.xtc", "fold_run2.xtc",
#                            threshold_nm=0.3, top="native_topology.pdb")
# print(result)
```
Key Papers
- GameOpt (2024): arxiv.org/abs/2409.18582
- Bayesian Open Games (Bolt, Hedges, Zahn 2019): arxiv.org/abs/1910.03656
- MELD Bayesian protein (PNAS): doi.org/10.1073/pnas.1506788112
- AMix-1 Bayesian Flow Networks (2025): protein foundation model
GF(3) Trit Classification
| Component | Trit | Role |
|---|---|---|
| ESMFold/AlphaFold prediction | +1 | Generation |
| GameOpt equilibrium search | 0 | Coordination |
| Bisimulation equivalence check | -1 | Validation |
Conservation: +1 + 0 + (-1) = 0
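The conservation identity can be checked mechanically. A minimal sketch (dictionary keys are descriptive labels, not identifiers from the skill):

```python
# Trit assignment from the classification table above
trits = {
    "prediction": +1,         # ESMFold/AlphaFold: generation
    "equilibrium_search": 0,  # GameOpt: coordination
    "bisimulation_check": -1, # validation
}

# Conservation: contributions cancel, so the sum is 0 (hence 0 in GF(3))
total = sum(trits.values())
```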
Edges in Interactome TUI
- -> monad-bayes (w=0.65, Bayesian structure posterior)
- -> geomstats (w=0.60, protein manifold geometry)
- -> bisimulation-game (w=0.90, conformational bisimulation)
- -> zubyul/Nikolova_lab (w=0.70, transcription factor bridge)
Trit: 0 (ERGODIC)