wave 5: predictability min/max + unsupervised features (4 stubs) #9
Merged
0bserver07 merged 5 commits into main on May 8, 2026
… codes
Implementation of the LOCOCODE / flat-minimum-search proxy from
Hochreiter & Schmidhuber (1999, NC 11). Tied k×k autoencoder on whitened
sparse Laplacian mixtures, MSE + L1 activity penalty + weight decay.
Headline (seed 0, k=8, n=2000, 200 epochs, 0.18 s training):
LOCOCODE Amari = 0.093 (kurtosis 2.61)
PCA Amari = 0.388 (kurtosis 1.08)
FastICA Amari = 0.022 (kurtosis 3.22)
LOCOCODE crosses cleanly from PCA-quality to ICA-family quality: 4× lower
Amari than PCA, super-Gaussian codes, and a recovered demixer that is
near-permutation up to small off-diagonal cross-talk. The plateau at
~0.10 Amari is the L1-saturation gap to higher-order-moment ICA;
documented as an Open Question.
Files:
- lococode_ica.py: data, model, train, PCA + FastICA baselines, Amari
- visualize_lococode_ica.py: 5 PNGs in viz/
- make_lococode_ica_gif.py: 528 KB GIF, 41 frames
- README.md: 8 sections including paper-vs-ours deviations
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
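The objective and the headline metric above fit in a short numpy sketch. Everything below is a hedged illustration: the penalty coefficients and the Amari normalisation are my assumptions, not the PR's exact settings.

```python
import numpy as np

def lococode_loss(x, W, lam_l1=1e-2, lam_wd=1e-4):
    # Tied k x k autoencoder: code h = x W^T, reconstruction x_hat = h W.
    # Loss = reconstruction MSE + L1 activity penalty on the code
    # + weight decay (coefficients here are illustrative).
    h = x @ W.T
    x_hat = h @ W
    return (np.mean((x - x_hat) ** 2)
            + lam_l1 * np.mean(np.abs(h))
            + lam_wd * np.sum(W ** 2))

def amari_index(P):
    # Amari distance of P = W_demix @ A_mix: 0 iff P is a scaled
    # permutation, 1 at maximal cross-talk (one common normalisation;
    # the PR's exact variant may differ).
    P = np.abs(P)
    K = P.shape[0]
    rows = (P / P.max(axis=1, keepdims=True)).sum(axis=1) - 1
    cols = (P / P.max(axis=0, keepdims=True)).sum(axis=0) - 1
    return (rows.sum() + cols.sum()) / (2 * K * (K - 1))

rng = np.random.default_rng(0)
x = rng.laplace(size=(2000, 8))          # stand-in for the whitened mixtures
W = rng.normal(scale=0.1, size=(8, 8))
loss0 = lococode_loss(x, W)              # loss at a random initialisation
perfect = amari_index(np.eye(8))         # 0.0 for an exact permutation
```

A lower `amari_index` of the estimated demixer times the true mixing matrix is what the 0.093-vs-0.388 headline compares.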
…l inputs
Encoder + decoder + K per-unit predictors trained adversarially in pure numpy
(Adam, manual gradients). K=4, D=8 with linear Gaussian mixing of independent
+/-1 factors converges in ~3 s on an M-series laptop:
L_recon = 0.0026
L_pred = 0.2500 (= chance for sigmoid against balanced binary target)
pMI = 9.6e-05 nats
bit_acc = 100% modulo permutation+sign on 4096 held-out samples
seeds = 8/8 reach 100% bit accuracy at 2000 steps
Files:
predictability_min_binary_factors.py
make_predictability_min_binary_factors_gif.py
visualize_predictability_min_binary_factors.py
predictability_min_binary_factors.gif (567 KB, well below 2 MB target)
viz/{training_curves,pairwise_mi_init_vs_final,code_vs_factor_mi,code_distribution}.png
results.json
README.md (8 sections)
Removed problem.py stub.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
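As a sanity check on the adversarial scheme described above, here is a minimal numpy sketch of the predictability side of the game, with closed-form least-squares predictors standing in for the PR's Adam-trained sigmoid predictors (a simplifying assumption): a redundant code is easy to predict, a factorial one is not.

```python
import numpy as np

def fit_predictors(y):
    # Inner minimisation: for each code unit k, fit a linear predictor
    # of y[:, k] from the remaining units (least squares stands in for
    # the PR's manually-differentiated MLP predictors).
    return [np.linalg.lstsq(np.delete(y, k, axis=1), y[:, k], rcond=None)[0]
            for k in range(y.shape[1])]

def predictability(y):
    # Mean squared error of the best linear predictors. The encoder in
    # predictability minimization is trained to make this LARGE,
    # i.e. to make code units mutually unpredictable (factorial).
    P = fit_predictors(y)
    errs = [np.mean((y[:, k] - np.delete(y, k, axis=1) @ P[k]) ** 2)
            for k in range(y.shape[1])]
    return float(np.mean(errs))

rng = np.random.default_rng(0)
y_factorial = rng.choice([-1.0, 1.0], size=(1024, 4))  # independent bits
y_redundant = y_factorial.copy()
y_redundant[:, 1] = y_factorial[:, 0]                  # unit 1 copies unit 0

low = predictability(y_redundant)    # redundancy -> predictable
high = predictability(y_factorial)   # factorial -> near-unpredictable
```

The K=4, D=8 run above drives the code toward the `y_factorial` regime, which is why L_pred sits at chance (0.25 for sigmoid predictors on balanced bits).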
…tability max
Two MLPs each see one view of synthetic binary stereo (16 dims = 8 shared
+ 8 view-specific distractors per view) and train cooperatively under the
Becker-Hinton 1992 IMAX objective
I(yL;yR) = 0.5 log(var(yL+yR)/var(yL-yR))
to recover a hidden binary depth bit.
Headline (seed 0, 200 epochs, ~0.1 s on M-series CPU): held-out depth
recovery 1.000, IMAX I = 7.598 nats. 8-seed mean held-out recovery 0.997
(min 0.994). Shuffled negative control (no shared depth): 0.513 (chance).
Files:
predictable_stereo.py (model + IMAX loss + closed-form gradient + training
+ held-out eval + multi-seed sweep + --shuffled control)
visualize_predictable_stereo.py (5 PNGs to viz/)
make_predictable_stereo_gif.py (51 frames, 844 KB)
run.json
README.md (8 sections)
Pure numpy + matplotlib. Deterministic under --seed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
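The IMAX objective quoted above is one line of numpy. The toy below is a hedged reconstruction of the idea (shared ±1 depth bit plus independent per-view noise), not the PR's exact 16-dim generator:

```python
import numpy as np

def imax(yL, yR, eps=1e-8):
    # Becker-Hinton IMAX: I = 0.5 * log(var(yL + yR) / var(yL - yR)).
    # Large when the two module outputs agree on a shared signal up to
    # independent noise; eps guards the var = 0 case (my addition).
    return 0.5 * np.log(np.var(yL + yR) / (np.var(yL - yR) + eps))

rng = np.random.default_rng(0)
depth = rng.choice([-1.0, 1.0], size=4096)   # hidden binary depth bit
yL = depth + 0.1 * rng.normal(size=4096)     # each "module output" sees the
yR = depth + 0.1 * rng.normal(size=4096)     # bit through its own noise

i_shared = imax(yL, yR)                      # large: views share the bit
i_shuffled = imax(yL, rng.permutation(yR))   # ~0: the shuffled control
```

This mirrors the PR's negative control: permuting one view destroys the shared factor, so the variance of the sum and of the difference coincide and I collapses toward zero.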
…al-image patches
Wave 5 stub for Schmidhuber, Eldracher, Foltin (1996) "Semilinear
predictability minimization produces well-known feature detectors".
Implementation: linear encoder W (M=16, orthonormal rows) + linear
predictor on standardised squared codes z = (y^2 - mu)/sigma. The
squaring is the one nonlinearity ("semilinear"); encoder ascends
L_pred, predictor descends. With Stiefel constraint + z-standardisation
the PM minimax stays bounded and converges in 2500 steps / 1.2 s.
Synthetic dataset: 1/f^2 pink-noise images + random oriented bars,
ZCA-whitened 8x8 patches.
Headline (seed 0, 1.2 s wallclock):
- 12/16 filters with FFT orientation concentration > 0.5 (oriented bars)
- 16/16 filters with concentration > 0.4
- mean code excess kurtosis 19.96 (random projection: 2.95)
- bit-identical across two runs of the same seed
- 12-15/16 oriented across seeds 0..4 (median 14/16)
Visual signature reproduces the V1 simple-cell template; PCA baseline on
the same data gives global Fourier modes (not oriented), as expected.
Files: model + train + eval (semilinear_pm_image_patches.py),
8 static PNGs (visualize_*.py), 1.1 MB GIF of filter evolution.
Pure numpy + matplotlib; --grad-check matches numerical gradients to <1e-9.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
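Two of the moving parts named above, the z-standardisation of squared codes and the Stiefel (orthonormal-rows) constraint, can be sketched as follows; the QR retraction is a standard choice and an assumption on my part, the PR may project back to the Stiefel manifold differently:

```python
import numpy as np

def standardised_squared_codes(Y, eps=1e-8):
    # The one "semilinear" nonlinearity: square the linear codes, then
    # standardise per unit, z = (y^2 - mu) / sigma; predictors act on z.
    S = Y ** 2
    return (S - S.mean(axis=0)) / (S.std(axis=0) + eps)

def stiefel_retract(W):
    # Pull W (M, D) back to orthonormal rows after a gradient step via
    # QR of W^T; the diagonal sign fix keeps the retraction from
    # flipping filters (standard trick, not necessarily the PR's).
    Q, R = np.linalg.qr(W.T)
    Q = Q * np.sign(np.diag(R))
    return Q.T

rng = np.random.default_rng(0)
W = stiefel_retract(rng.normal(size=(16, 64)))       # M=16 filters, 8x8 patches
z = standardised_squared_codes(rng.normal(size=(512, 16)))
```

Keeping the rows orthonormal and the squared codes standardised is what keeps the encoder-ascends / predictor-descends minimax bounded.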
Octopus merge of 4 wave-5 stubs per SPEC issue #1.
- wave-5-local/predictability-min-binary-factors: predictability minimization on synthetic factorial binary patterns (1992)
- wave-5-local/predictable-stereo: predictability maximization (Becker-Hinton-style IMAX) on synthetic binary stereo (1993)
- wave-5-local/semilinear-pm-image-patches: semilinear PM on synthetic natural-image patches (1996)
- wave-5-local/lococode-ica: tied autoencoder + L1 sparsity on synthetic sparse data (1999)
All 4 verified by a separate audit subagent: numpy-only, deterministic, branch protocol followed (no wave-5-local on remote), all 8 README sections, algorithmic faithfulness confirmed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Audit Report — PR #9 wave 5 (4 stubs)
Wave 5 verdict: APPROVE. Independent review by a separate Explore subagent.
Per-stub verdicts
Cross-cut findings
Algorithmic faithfulness (2 deep dives)
Reproduce results (3 spot-checks): all identical across reruns. All wallclocks well under the 5-minute budget.
agent-0bserver07 (Claude Code) on behalf of Yad — wave-5 audit subagent
0bserver07 added a commit that referenced this pull request on May 8, 2026
wave 5: predictability min/max + unsupervised features (4 stubs)
Wave 5 — predictability min/max + unsupervised feature extraction
Four stubs implementing Schmidhuber's 1992-1999 unsupervised-coding lineage per SPEC issue #1. Octopus-merged from 4 local-only wave-5-local/<slug> branches:
- predictability-min-binary-factors
- predictable-stereo
- semilinear-pm-image-patches
- lococode-ica
Audit verdict (separate Explore subagent)
APPROVE across all 4 stubs.
No wave-5-local/* branches on origin. No __pycache__ committed. Commits authored as agent-0bserver07 <agent-0bserver07@users.noreply.github.com>.
Per-stub deviations (in each stub's §Deviations)
Citation gaps
All 4 source papers are retrievable. Some implementation details (exact dataset compositions, optimizer hyperparams) are reconstructed from secondary sources where the originals don't pin them down — flagged in §Open questions per SPEC's methodological caveat.
Wave 0 → 1 → 2 → 3 → 4 → 5 progress
7 + 5 + 5 + 5 + 4 = 26/50 v1 stubs done (52%). 4 waves remaining = 24 stubs.
agent-0bserver07 (Claude Code) on behalf of Yad