How gappy the signal's ternary address space is.
Each byte is re-interpreted as a base-3 address into the unit interval, the way the classical Cantor set construction works: each bit selects "left third" or "right third," skipping the middle. The resulting coordinates cluster around Cantor-set-like positions if the data has ternary self-similarity, or spread uniformly if it doesn't. The geometry then measures the gap structure of these coordinates.
Fraction of distinct embedded coordinates relative to total data length. Sprott-B, Projectile, and Damped Pendulum score 0.016 (many repeated coordinates — smooth dynamics map many different byte values to the same Cantor address). Fibonacci word scores 0.0001 (its binary structure creates extreme degeneracy in ternary representation). High coverage means the data explores many distinct positions in the Cantor embedding; low coverage means it collapses to a dust-like subset.
The largest gap between adjacent sorted coordinates. L-System Dragon, Morse code, and Rule 110 all hit 1.0 (a gap spanning the full interval — the data avoids an entire region of the ternary address space). Collatz gap lengths scores 0.005 (tiny gaps, nearly uniform coverage). A large max_gap means the data has a forbidden zone in its ternary structure, like the middle third removed in the classical Cantor construction.
Average spacing between consecutive sorted coordinates. Accel walk, Kepler exoplanet, and Zipf distribution cluster at 6.1e-5 (tightly packed — many distinct coordinates with small gaps). Collatz gap lengths scores 7.8e-7 (extremely dense). Mean gap complements max_gap: a signal can have large max_gap but small mean_gap if it has one big hole and is densely packed everywhere else.
Average lag-1 autocorrelation across the 8 bit planes of the byte stream. Logistic period-2 and constants score 1.0 (each bit plane is perfectly correlated). L-System Dragon and De Bruijn score 0.0001 (bit planes are uncorrelated — the low-order bits change unpredictably). This detects temporal structure in the binary representation that the ternary Cantor embedding misses. Evolved via ShinkaEvolve.
Shannon entropy of the gap sizes between consecutive Cantor coordinates. DNA Human scores 1.0 (maximally diverse jump sizes). Gzip (0.053) and Pi Digits (0.055) have the lowest entropy — their gap sizes are nearly uniform, meaning the Cantor coordinates change by approximately the same amount at each step. High jump entropy means the signal's ternary representation has diverse inter-step structure. Evolved via ShinkaEvolve.
| Source | Domain | Value |
|---|---|---|
| Logistic r=3.2 (Period-2) | chaos | 1.0000 |
| Prime Indicator | number_theory | 0.9997 |
| OTOC Growth | quantum | 0.9973 |
| ··· | ||
| L-System (Dragon Curve) | exotic | 0.0001 |
| De Bruijn Sequence | number_theory | 0.0001 |
| Ramanujan Tau | number_theory | 0.0024 |
| Source | Domain | Value |
|---|---|---|
| Logistic Chaos | chaos | 0.0156 |
| Henon Map | chaos | 0.0156 |
| Tent Map | chaos | 0.0156 |
| ··· | ||
| Mian-Chowla | number_theory | 0.0001 |
| Fibonacci Word | exotic | 0.0001 |
| Pell Word | exotic | 0.0001 |
| Source | Domain | Value |
|---|---|---|
| DNA Dog | bio | 1.0000 |
| DNA Chimp | bio | 1.0000 |
| DNA Human | bio | 1.0000 |
| ··· | ||
| Gzip (level 9) | binary | 0.0533 |
| ln(2) Digits | number_theory | 0.0549 |
| Euler-Mascheroni γ Digits | number_theory | 0.0553 |
| Source | Domain | Value |
|---|---|---|
| L-System (Dragon Curve) | exotic | 0.9998 |
| Fibonacci Word | exotic | 0.9998 |
| Pell Word | exotic | 0.9998 |
| ··· | ||
| Prime Indicator | number_theory | 0.0003 |
| Partition Function | number_theory | 0.0015 |
| Human Proteome | bio | 0.0128 |
| Source | Domain | Value |
|---|---|---|
| Speech "Five" | speech | 0.0001 |
| Speech "Zero" | speech | 0.0001 |
| Bearing Ball | bearing | 0.0001 |
| ··· | ||
| Prime Indicator | number_theory | 0.0000 |
| Partition Function | number_theory | 0.0000 |
| DNA SARS-CoV-2 | bio | 0.0000 |
Cantor Set geometry detects structure in the low bits of byte values — the ternary address depends on the full bit pattern, not just the magnitude. Signals with repetitive low-bit patterns (periodic orbits, symbolic dynamics) produce degenerate Cantor embeddings with extreme gaps. In the atlas, the combination of low coverage and high max_gap identifies signals whose byte values avoid specific ternary regions, which is a different kind of regularity than the distributional uniformity measured by Torus or Wasserstein.