How compressible is the signal, and where does the redundancy come from?
We measure Shannon entropy at multiple block sizes, a Lempel-Ziv compression ratio (via zlib), and mutual information at lags 1 and 8. Together these probe the signal's intrinsic complexity from three complementary angles: distributional (are byte patterns uniform?), algorithmic (can a compressor find structure?), and temporal (does the past predict the future?).
Lempel-Ziv compression ratio: compressed_size / original_size. A score of 1.0 means incompressible (Wichmann-Hill, MINSTD, XorShift32 — good PRNGs are incompressible); 0.0 means trivially compressible (constants). Logistic period-2 scores 0.002, since a signal alternating between two values compresses almost completely. Because the compressed length upper-bounds the shortest description the compressor can find, the ratio is a practical proxy for Kolmogorov complexity.
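The ratio can be sketched with the standard library's zlib. The framework's exact compressor settings, and whether it clips the small header overhead that can push random data slightly above 1.0, are assumptions here:

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """zlib-compressed size over original size (sketch; compression level
    and normalization details of the actual framework are assumed)."""
    if not data:
        return 0.0
    return len(zlib.compress(data, 9)) / len(data)

# Constants compress almost completely; PRNG-like bytes barely at all.
print(compression_ratio(b"\xff" * 4096))    # close to 0
print(compression_ratio(os.urandom(4096)))  # about 1 (header overhead can push it slightly above)
```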
Mutual information between consecutive bytes: how much does byte t tell you about byte t+1? Logistic periodic orbits score 1.0 (each byte perfectly predicts the next). L-System Dragon Curve scores 0.0 (its symbolic dynamics are unpredictable one step ahead despite being deterministic). This separates "locally predictable" from "locally random" deterministic systems.
Mutual information at lag 8. Rule 30 scores 0.0 (no 8-step memory), while logistic period-2 still scores 1.0 (period divides 8). The comparison between lag-1 and lag-8 mutual information reveals the memory timescale: signals where MI_8 ≈ MI_1 have long memory; signals where MI_8 ≪ MI_1 have short memory.
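Both lag metrics can share one plug-in estimator over byte pairs. This is a minimal sketch: the framework's normalization of MI into [0, 1] is an assumption (for the period-2 example the raw value in bits already lands at 1):

```python
import math
from collections import Counter

def mutual_information(data: bytes, lag: int = 1) -> float:
    """Plug-in MI (in bits) between byte t and byte t+lag.
    Any [0, 1] normalization used by the framework is assumed, not shown."""
    pairs = list(zip(data, data[lag:]))
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                # counts of (byte_t, byte_{t+lag})
    px = Counter(x for x, _ in pairs)     # marginal counts of byte_t
    py = Counter(y for _, y in pairs)     # marginal counts of byte_{t+lag}
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

# Period-2 signal: each byte determines the next, and the period divides 8,
# so MI is ~1 bit at both lags.
period2 = bytes([10, 200] * 2048)
print(mutual_information(period2, lag=1))  # ≈ 1.0
print(mutual_information(period2, lag=8))  # ≈ 1.0
```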
Past-future mutual information: the total shared information between the past and future halves of the signal. Hilbert walk (0.95) and sawtooth (0.95) maximize this: their deterministic structure creates maximum past-future coupling. Random steps (0.94) is also high — random-walk integration creates long-range correlations even from IID increments.

block_entropy_2 / block_entropy_4 — Shannon entropy of byte pairs and 4-grams, normalized. PRNGs and white noise score ~1.0 (all patterns equally likely); constants and periodic orbits score 0.0. The drop from block_entropy_2 to block_entropy_4 measures how much additional structure emerges at longer pattern lengths.
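Block entropy can be sketched over overlapping n-grams. The normalization here (the theoretical maximum of 8·block bits) is an assumption, and finite-sample effects cap long-block estimates below 1.0 even for ideal PRNGs:

```python
import math
from collections import Counter

def block_entropy(data: bytes, block: int = 2) -> float:
    """Shannon entropy of overlapping `block`-byte patterns, in bits,
    divided by the theoretical maximum 8*block (normalization assumed)."""
    grams = [data[i:i + block] for i in range(len(data) - block + 1)]
    n = len(grams)
    if n == 0:
        return 0.0
    counts = Counter(grams)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return h / (8 * block)

# One pattern only -> zero entropy; two alternating patterns -> ~1 bit of 16.
print(block_entropy(b"\x00" * 4096, block=2))        # 0.0
print(block_entropy(bytes([1, 2] * 2048), block=2))  # ≈ 0.0625
```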
**block_entropy_2 — top and bottom scorers**

| Source | Domain | Value |
|---|---|---|
| Wichmann-Hill | binary | 0.9986 |
| XorShift32 | binary | 0.9986 |
| White Noise | noise | 0.9986 |
| ··· | ··· | ··· |
| Constant 0xFF | noise | 0.0000 |
| Constant 0x00 | noise | 0.0000 |
| Collatz Gap Lengths | number_theory | 0.0000 |
**block_entropy_4 — top and bottom scorers**

| Source | Domain | Value |
|---|---|---|
| MINSTD (Park-Miller) | binary | 0.8602 |
| Pi Digits | number_theory | 0.8602 |
| glibc LCG | binary | 0.8602 |
| ··· | ··· | ··· |
| Constant 0xFF | noise | 0.0000 |
| Constant 0x00 | noise | 0.0000 |
| Collatz Gap Lengths | number_theory | 0.0000 |
**Compression ratio — top and bottom scorers**

| Source | Domain | Value |
|---|---|---|
| Wichmann-Hill | binary | 1.0000 |
| MINSTD (Park-Miller) | binary | 1.0000 |
| XorShift32 | binary | 1.0000 |
| ··· | ··· | ··· |
| Constant 0xFF | noise | 0.0000 |
| Constant 0x00 | noise | 0.0000 |
| Logistic r=3.2 (Period-2) | chaos | 0.0018 |
**Past-future mutual information — top and bottom scorers**

| Source | Domain | Value |
|---|---|---|
| Hilbert Walk | exotic | 0.9515 |
| Sawtooth Wave | waveform | 0.9491 |
| Random Steps | exotic | 0.9352 |
| ··· | ··· | ··· |
| Constant 0xFF | noise | 0.0000 |
| Constant 0x00 | noise | 0.0000 |
| Collatz Gap Lengths | number_theory | 0.0000 |
**Mutual information at lag 1 — top and bottom scorers**

| Source | Domain | Value |
|---|---|---|
| Logistic r=3.83 (Period-3 Window) | chaos | 1.0000 |
| Logistic r=3.74 (Period-5 Window) | chaos | 1.0000 |
| Logistic r=3.2 (Period-2) | chaos | 1.0000 |
| ··· | ··· | ··· |
| Constant 0xFF | noise | 0.0000 |
| Constant 0x00 | noise | 0.0000 |
| L-System (Dragon Curve) | exotic | 0.0000 |
**Mutual information at lag 8 — top and bottom scorers**

| Source | Domain | Value |
|---|---|---|
| Logistic r=3.2 (Period-2) | chaos | 1.0000 |
| Logistic r=3.5 (Period-4) | chaos | 1.0000 |
| Logistic r=3.83 (Period-3 Window) | chaos | 1.0000 |
| ··· | ··· | ··· |
| Constant 0xFF | noise | 0.0000 |
| Constant 0x00 | noise | 0.0000 |
| Rule 30 | exotic | 0.0000 |
Information Theory metrics are the framework's workhorse for separating noise from structure. Compression ratio alone separates PRNGs (incompressible) from all other sources. Mutual information at multiple lags provides the temporal skeleton that static entropy measures miss. In the atlas, Information Theory drives the distributional view's separation between C1 (compressible, high MI: oscillators and periodic chaos) and C5 (incompressible, zero MI: noise and PRNGs).