How gappy the signal's ternary address space is.
Each byte is re-interpreted as a base-3 address into the unit interval, the way the classical Cantor set construction works: each bit selects "left third" or "right third," skipping the middle. The resulting coordinates cluster around Cantor-set-like positions if the data has ternary self-similarity, or spread uniformly if it doesn't. The geometry then measures the gap structure of these coordinates.
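The embedding described above can be sketched as follows. This is a minimal illustration, not the atlas implementation: the function name `cantor_coordinate` is hypothetical, and the most-significant-bit-first ordering is an assumption, since the text does not state the bit order.

```python
def cantor_coordinate(byte: int) -> float:
    """Map one byte to a point in [0, 1) via a Cantor-style ternary address.

    Each bit becomes one ternary digit: 0 (left third) or 2 (right third).
    The digit 1 (middle third) is never used.
    Assumption: the most significant bit supplies the first ternary digit.
    """
    numerator = 0
    for k in range(8):
        bit = (byte >> (7 - k)) & 1          # walk bits from MSB to LSB
        numerator = numerator * 3 + 2 * bit  # append ternary digit 0 or 2
    return numerator / 3 ** 8                # scale the 8-digit address into [0, 1)


def embed(data: bytes) -> list[float]:
    """Embed every byte of a signal, preserving temporal order."""
    return [cantor_coordinate(b) for b in data]
```

Under this bit order, 0x7F lands just below 1/3 and 0x80 lands at 2/3, so no byte maps into the open middle third, mirroring the first step of the classical construction.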
Fraction of distinct embedded coordinates relative to total data length. Sprott-B, Projectile, and Damped Pendulum score 0.016 (many repeated coordinates — smooth dynamics map many different byte values to the same Cantor address). Fibonacci word scores 0.0001 (its binary structure creates extreme degeneracy in ternary representation). High coverage means the data explores many distinct positions in the Cantor embedding; low coverage means it collapses to a dust-like subset.
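As a rough sketch of this ratio, assuming the coordinates have already been computed and are compared for exact equality (the function name `coverage` is hypothetical):

```python
def coverage(coords: list[float]) -> float:
    """Fraction of distinct coordinates relative to the number of samples.

    Assumption: two coordinates count as the same only if they are exactly
    equal, which holds when they come from a deterministic byte-to-address map.
    """
    if not coords:
        return 0.0
    return len(set(coords)) / len(coords)
```

Because a byte can take at most 256 values, this ratio is bounded above by 256 divided by the signal length. For a hypothetical 16,384-byte signal that bound is 256/16384 ≈ 0.0156, and a signal that uses only two byte values would score 2/16384 ≈ 0.0001.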
The largest gap between adjacent sorted coordinates. L-System Dragon, Morse code, and Rule 110 all reach roughly 1.0 (a gap spanning nearly the full interval: the data avoids an entire region of the ternary address space). Collatz gap lengths scores 0.005 (tiny gaps, nearly uniform coverage). A large max_gap means the data has a forbidden zone in its ternary structure, like the middle third removed in the classical Cantor construction.
Average spacing between consecutive sorted coordinates. Accel walk, Kepler exoplanet, and Zipf distribution cluster at 6.1e-5 (tightly packed — many distinct coordinates with small gaps). Collatz gap lengths scores 7.8e-7 (extremely dense). Mean gap complements max_gap: a signal can have large max_gap but small mean_gap if it has one big hole and is densely packed everywhere else.
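Both gap measures can be read off the same sorted list. The sketch below is illustrative only: the name `gap_stats` is hypothetical, and keeping duplicate coordinates (which contribute zero-length gaps to the mean) is an assumption, since the text does not say whether duplicates are removed first.

```python
def gap_stats(coords: list[float]) -> tuple[float, float]:
    """Return (max_gap, mean_gap) between adjacent sorted coordinates.

    Assumption: duplicates are kept, so repeated coordinates add zero gaps.
    Gaps to the interval boundaries 0 and 1 are not counted.
    """
    xs = sorted(coords)
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    if not gaps:                      # fewer than two samples: no gaps at all
        return 0.0, 0.0
    return max(gaps), sum(gaps) / len(gaps)
```

For example, `gap_stats([0.0, 0.01, 0.02, 0.98, 0.99, 1.0])` gives a max gap of about 0.96 but a mean gap of about 0.2: one big hole, dense packing elsewhere. With this definition a constant signal has no nonzero gaps, so both values are 0.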
Average lag-1 autocorrelation across the 8 bit planes of the byte stream. Logistic period-2 and constants score 1.0 (each bit plane is perfectly correlated). L-System Dragon and De Bruijn score 0.0001 (bit planes are uncorrelated — the low-order bits change unpredictably). This detects temporal structure in the binary representation that the ternary Cantor embedding misses. Evolved via ShinkaEvolve.
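One way this average could be computed is sketched below. Two choices here are assumptions made so the sketch agrees with the scores quoted above, not details confirmed by the source: a zero-variance bit plane is scored as 1.0 (so constants score 1.0), and the absolute value of each plane's autocorrelation is taken (so a strictly alternating period-2 signal scores close to 1.0 and not close to -1.0). The function names are hypothetical.

```python
def lag1_autocorr(bits: list[int]) -> float:
    """Lag-1 autocorrelation of a 0/1 sequence (standard biased estimator)."""
    n = len(bits)
    if n < 2:
        return 1.0
    mean = sum(bits) / n
    var = sum((b - mean) ** 2 for b in bits)
    if var == 0:
        return 1.0  # assumption: a constant plane counts as perfectly correlated
    cov = sum((bits[i] - mean) * (bits[i + 1] - mean) for i in range(n - 1))
    return cov / var


def bitplane_autocorr(data: bytes) -> float:
    """Mean absolute lag-1 autocorrelation over the 8 bit planes of a byte stream."""
    scores = []
    for plane in range(8):
        bits = [(byte >> plane) & 1 for byte in data]  # extract one bit plane
        scores.append(abs(lag1_autocorr(bits)))        # assumption: use |r|
    return sum(scores) / 8
```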
Shannon entropy of the gap sizes between consecutive Cantor coordinates. DNA Human scores 1.0 (maximally diverse jump sizes). Gzip (0.054) and Pi Digits (0.055) have the lowest entropy: their gap sizes are nearly uniform, meaning the Cantor coordinates change by approximately the same amount at each step. High jump entropy means the signal's ternary representation has diverse inter-step structure. Evolved via ShinkaEvolve.
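A hedged sketch of a jump-entropy measure follows. The source does not specify how jump sizes are discretized or how the score is normalized to [0, 1], so the histogram binning, the `bins` parameter, and the division by `log2(bins)` are all assumptions for illustration; `jump_entropy` is a hypothetical name.

```python
import math
from collections import Counter


def jump_entropy(coords: list[float], bins: int = 16) -> float:
    """Normalized Shannon entropy of |jump| sizes between consecutive samples.

    Assumptions: coordinates lie in [0, 1], jump magnitudes are bucketed into
    `bins` equal-width bins, and the entropy is divided by log2(bins) so the
    result falls in [0, 1].
    """
    jumps = [abs(b - a) for a, b in zip(coords, coords[1:])]
    if not jumps:
        return 0.0
    # Clamp so that a jump of exactly 1.0 falls into the last bin.
    counts = Counter(min(int(j * bins), bins - 1) for j in jumps)
    total = len(jumps)
    entropy = sum((c / total) * math.log2(total / c) for c in counts.values())
    return entropy / math.log2(bins)
```

With this sketch, a signal whose coordinate always moves by the same amount puts every jump in one bin and scores 0, while jumps spread evenly over all bins score 1.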
| Source | Domain | Value |
|---|---|---|
| Constant 0x00 | noise | 1.0000 |
| Logistic r=3.2 (Period-2) | chaos | 1.0000 |
| PID Controller | exotic | 0.9945 |
| ··· | ||
| L-System (Dragon Curve) | exotic | 0.0001 |
| De Bruijn Sequence | number_theory | 0.0001 |
| Categorical Sensor | exotic | 0.0053 |
| Source | Domain | Value |
|---|---|---|
| Projectile with Drag | motion | 0.0156 |
| Shuffled Blocks | exotic | 0.0156 |
| Kicked Rotor | quantum | 0.0156 |
| ··· | ||
| Constant 0xFF | noise | 0.0001 |
| Fibonacci Word | exotic | 0.0001 |
| Square Wave | waveform | 0.0001 |
| Source | Domain | Value |
|---|---|---|
| DNA Chimp | bio | 1.0000 |
| DNA Human | bio | 1.0000 |
| DNA Phage Lambda | bio | 1.0000 |
| ··· | ||
| Gzip (level 9) | binary | 0.0538 |
| Pi Digits | number_theory | 0.0553 |
| XorShift32 | binary | 0.0556 |
| Source | Domain | Value |
|---|---|---|
| Fibonacci Word | exotic | 0.9998 |
| Symbolic Lorenz | exotic | 0.9998 |
| Pulse-Width Modulation | waveform | 0.9998 |
| ··· | ||
| Constant 0xFF | noise | 0.0000 |
| Collatz Gap Lengths | number_theory | 0.0053 |
| Poisson Counts | exotic | 0.0095 |
| Source | Domain | Value |
|---|---|---|
| Kepler Exoplanet | astro | 0.0001 |
| Sprott-B | chaos | 0.0001 |
| Tohoku Aftershock Intervals | geophysics | 0.0001 |
| ··· | ||
| Constant 0xFF | noise | 0.0000 |
| Collatz Gap Lengths | number_theory | 0.0000 |
| Poisson Counts | exotic | 0.0000 |
Cantor Set geometry detects structure in the low bits of byte values — the ternary address depends on the full bit pattern, not just the magnitude. Signals with repetitive low-bit patterns (periodic orbits, symbolic dynamics) produce degenerate Cantor embeddings with extreme gaps. In the atlas, the combination of low coverage and high max_gap identifies signals whose byte values avoid specific ternary regions, which is a different kind of regularity than the distributional uniformity measured by Torus or Wasserstein.