How the distribution of values differs from uniform, and how stable that distribution is across the signal.
The data are binned into a 32-bin histogram, which is treated as a probability distribution. Three questions follow. How far is this distribution from uniform (optimal transport cost)? How concentrated is it (peak height)? Does the first half of the signal look like the second half (self-similarity)? Shannon entropy of the same histogram is reported alongside as a classical measure of spread.
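The binning step can be sketched as follows. This is a minimal illustration, not the atlas implementation: the function name `to_distribution` and the value range [0, 256) are assumptions, chosen for byte-valued signals. Only the bin count of 32 comes from the text.

```python
import numpy as np

N_BINS = 32  # bin count stated in the text


def to_distribution(values, lo=0.0, hi=256.0, n_bins=N_BINS):
    """Histogram `values` over [lo, hi] and normalize to probabilities.

    The default range is an assumption (byte-valued input); the source
    does not say how the histogram range is chosen.
    """
    counts, _ = np.histogram(np.asarray(values, dtype=float),
                             bins=n_bins, range=(lo, hi))
    total = counts.sum()
    if total == 0:
        raise ValueError("no samples fall inside the histogram range")
    return counts / total


# Every byte value exactly once: 8 values land in each bin, so the
# histogram is flat and each bin holds probability 1/32.
p = to_distribution(np.arange(256))
print(p.shape, p.sum())
```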
Concentration is the peak bin height times the number of bins. A value of 1.0 means uniform: De Bruijn scores exactly 1.0 because its construction guarantees every byte pattern appears equally often. Values above 1.0 mean the distribution has a spike, up to a maximum of 32 when all mass sits in a single bin. Collatz gap lengths (31.8), Rainfall (31.5), and Forest fire (29.4) are the most concentrated signals in the atlas; their heavy-tailed distributions pile most of their mass into the lowest bin.
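As a quick check of the two extremes, here is a small sketch of the peak-times-bins calculation. The function name `concentration` is illustrative, and the input is assumed to be an already normalized histogram.

```python
import numpy as np


def concentration(p):
    """Peak bin probability times the number of bins."""
    p = np.asarray(p, dtype=float)
    return float(p.max() * p.size)


uniform = np.full(32, 1.0 / 32)  # flat histogram
spike = np.zeros(32)
spike[0] = 1.0                   # all mass in the lowest bin

print(concentration(uniform))  # 1.0
print(concentration(spike))    # 32.0
```

Read this way, a score of 31.8 corresponds to a peak bin holding roughly 99% of the samples (31.8 / 32).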
Distance from uniform is the earth mover's distance between the histogram and the uniform distribution: the minimum amount of work (mass moved times distance moved) needed to make the histogram flat. Collatz gap lengths (0.48) and Rainfall (0.48) are farthest from uniform. Neural net pruned weights (0.46) are close behind, because pruning creates a spike at zero. De Bruijn scores 0.00 (already uniform).
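In one dimension the earth mover's distance reduces to the summed absolute difference between the two cumulative distributions, scaled by the bin spacing. The sketch below assumes the bins are spread over a unit interval (adjacent bins are 1/32 apart); the source does not state this normalization, and the names `emd_1d` and `dist_from_uniform` are illustrative.

```python
import numpy as np


def emd_1d(p, q):
    """1-D earth mover's distance between histograms on the same bins.

    Assumes equally spaced bins over a unit interval, so moving mass
    one bin over costs 1/n. This scaling is an assumption.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # |CDF_p - CDF_q| summed over bins, times the bin spacing 1/n
    return float(np.abs(np.cumsum(p - q)).sum() / p.size)


def dist_from_uniform(p):
    """Earth mover's distance from the flat histogram."""
    p = np.asarray(p, dtype=float)
    return emd_1d(p, np.full(p.size, 1.0 / p.size))


spike = np.zeros(32)
spike[0] = 1.0  # all mass in the lowest bin
print(dist_from_uniform(spike))               # about 0.484
print(dist_from_uniform(np.full(32, 1 / 32)))  # 0.0
```

Under this assumed scaling, a histogram with all of its mass in the lowest bin scores 15.5 / 32 = 0.484375, which sits close to the 0.48 reported for Collatz gap lengths and Rainfall. That agreement is suggestive but does not confirm the normalization the atlas uses.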
Entropy is the Shannon entropy of the 32-bin histogram, in bits. De Bruijn scores 5.0, the maximum for 32 bins (log2(32) = 5 bits, a perfectly flat distribution); circle map quasiperiodic and phyllotaxis sit just below it at 4.9996. Collatz gap lengths scores 0.05 (almost all mass in one bin). This is the classical measure of distributional spread, computed here on the same histogram as the Wasserstein metrics.
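The entropy calculation is standard; a minimal sketch follows, with `shannon_entropy_bits` as an illustrative name and a normalized histogram assumed as input. Empty bins are skipped, following the usual convention that 0 · log 0 = 0.

```python
import numpy as np


def shannon_entropy_bits(p):
    """Shannon entropy in bits of a normalized histogram."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]  # empty bins contribute nothing
    return float((nz * np.log2(1.0 / nz)).sum())


uniform = np.full(32, 1.0 / 32)
spike = np.zeros(32)
spike[0] = 1.0

print(shannon_entropy_bits(uniform))  # 5 bits, the maximum for 32 bins
print(shannon_entropy_bits(spike))    # 0 bits, all mass in one bin
```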
Self-similarity is one minus the earth mover's distance between the first-half and second-half histograms. A value of 1.0 means the distribution is perfectly stable over time (logistic period-4, logistic period-2, De Bruijn). Hilbert walk scores 0.60 because its deterministic sweep produces different distributions in the first and second halves. This catches nonstationarity that entropy and concentration miss: a signal can have high entropy overall but low self_similarity if its distribution drifts.
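The drift case can be made concrete with two synthetic signals that share the same overall histogram. The sketch below is illustrative only: the helper names, the byte range [0, 256), and the 1/n bin spacing in the distance are assumptions, not details taken from the atlas code.

```python
import numpy as np


def histogram_probs(values, n_bins=32, lo=0.0, hi=256.0):
    """Normalized histogram; the byte range is an assumption."""
    counts, _ = np.histogram(values, bins=n_bins, range=(lo, hi))
    return counts / counts.sum()


def emd_1d(p, q):
    """1-D earth mover's distance, bins spaced 1/n apart (assumed)."""
    return float(np.abs(np.cumsum(p - q)).sum() / p.size)


def entropy_bits(p):
    nz = p[p > 0]
    return float((nz * np.log2(1.0 / nz)).sum())


def self_similarity(values, n_bins=32, lo=0.0, hi=256.0):
    """One minus the EMD between first-half and second-half histograms."""
    values = np.asarray(values, dtype=float)
    half = values.size // 2
    first = histogram_probs(values[:half], n_bins, lo, hi)
    second = histogram_probs(values[half:], n_bins, lo, hi)
    return 1.0 - emd_1d(first, second)


# Same overall histogram, different ordering in time.
stationary = np.tile(np.arange(256), 16)            # full range throughout
drifting = np.concatenate([
    np.repeat(np.arange(128), 16),                  # low values first
    np.repeat(np.arange(128, 256), 16),             # high values second
])

for name, signal in [("stationary", stationary), ("drifting", drifting)]:
    h = entropy_bits(histogram_probs(signal))
    s = self_similarity(signal)
    print(f"{name}: entropy={h:.3f} bits, self_similarity={s:.3f}")
```

Both signals reach the maximum entropy of 5 bits, yet only the stationary one keeps a self-similarity of 1.0; the drifting one drops to 0.5 under these assumptions, because its two halves occupy disjoint sets of bins.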

Concentration, highest and lowest values:

| Source | Domain | Value |
|---|---|---|
| Collatz Gap Lengths | number_theory | 31.8133 |
| Rainfall (ORD Hourly) | climate | 31.4687 |
| Forest Fire | exotic | 29.4376 |
| ··· | | |
| Constant 0xFF | noise | 0.0000 |
| Constant 0x00 | noise | 0.0000 |
| De Bruijn Sequence | number_theory | 1.0000 |

Distance from uniform, highest and lowest values:

| Source | Domain | Value |
|---|---|---|
| Collatz Gap Lengths | number_theory | 0.4842 |
| Rainfall (ORD Hourly) | climate | 0.4830 |
| Neural Net (Pruned 90%) | binary | 0.4643 |
| ··· | | |
| Constant 0xFF | noise | 0.0000 |
| Constant 0x00 | noise | 0.0000 |
| De Bruijn Sequence | number_theory | 0.0000 |

Entropy, highest and lowest values:

| Source | Domain | Value |
|---|---|---|
| De Bruijn Sequence | number_theory | 5.0000 |
| Circle Map Quasiperiodic | chaos | 4.9996 |
| Phyllotaxis | bio | 4.9996 |
| ··· | | |
| Constant 0xFF | noise | 0.0000 |
| Constant 0x00 | noise | 0.0000 |
| Collatz Gap Lengths | number_theory | 0.0517 |

Self-similarity, highest and lowest values:

| Source | Domain | Value |
|---|---|---|
| Logistic r=3.5 (Period-4) | chaos | 1.0000 |
| Logistic r=3.2 (Period-2) | chaos | 1.0000 |
| De Bruijn Sequence | number_theory | 1.0000 |
| ··· | | |
| Constant 0xFF | noise | 0.0000 |
| Constant 0x00 | noise | 0.0000 |
| Hilbert Walk | exotic | 0.5974 |

Wasserstein self_similarity is the distributional lens's nonstationarity detector. Signals that change character midstream (sensor drift, regime switches, concatenated recordings) score low on self_similarity while potentially scoring high on all other distributional metrics. In the atlas, Wasserstein's concentration axis separates the heavy-tailed cluster (Collatz, rainfall, forest fire) from the uniform-distribution cluster (PRNGs, De Bruijn). Self_similarity provides an orthogonal axis that catches temporal instability, which no metric computed from a single whole-signal histogram can see.