What the digit-by-digit "best rational approximation" structure of each sample looks like.
For every sample value x ∈ (0, 1), computes the simple continued-fraction expansion x = 1/(a₁ + 1/(a₂ + 1/(a₃ + ...))) and aggregates the partial quotients across all samples. The math is classical: Khintchine's theorem says the geometric mean of partial quotients of almost every real converges to K₀ ≈ 2.685; the Gauss-Kuzmin distribution says P(a₁ = k) = log₂((k+1)²/(k·(k+2))) for uniformly sampled x. Sources whose value distribution matches "uniform real" reproduce these reference values; sources with arithmetic or algebraic structure deviate in characteristic ways.
Total-variation distance between the empirical distribution of first partial quotients P(a₁ = k) and the theoretical Gauss-Kuzmin distribution. All 8 DNA sources plus Codon Usage all score exactly 0.8924 — alphabet restriction (DNA is 4 letters) collapses the CF distribution to one specific shape regardless of which organism. Poker Hands (0.84) scores similarly. White Noise (0.08) and other uniform-byte sources sit near the theoretical floor. F-stat 9.7 puts it in the top tier of atlas discriminators. The metric saturates on small alphabets — it can detect "this is 4-letter" but can't distinguish humans from viruses.
Average number of partial quotients computed before float-precision exhausts the expansion. uint8 sources (Pi Digits, e Digits, AES Encrypted) all hit ~4.8 because input precision is only 8 bits. Float-precision sources like Lorenz, Brownian, and Stern-Brocot reach the cap of 20. Effectively an encoding-richness axis, orthogonal (r ≈ 0) to the Khintchine-norm metrics.
P(a₁ = 1), the fraction of samples whose continued fraction starts [0; 1, ...]. Theoretical Gauss-Kuzmin value is log₂(4/3) ≈ 0.415. Stern-Brocot Walk scores 0.97 (extreme — the source is by construction a tour of rationals organized by CF complexity, so its values are dominated by small-quotient patterns). von Mangoldt Function scores 0.99 (its sparse log-of-prime values land in CF-flavored regions). DNA scores 0.0 (the 4 specific values produce no a₁ = 1 patterns). This is the only metric in the atlas that directly probes mediant-tree / Sturmian structure.
Geometric mean of all partial quotients across all samples, log-scaled. Khintchine's theorem predicts log K₀ ≈ 0.988 for almost every real. Heavy-tailed empirical distributions show up here: Rainfall (5.09) is the extreme outlier — its event-driven sparse signal produces anomalous partial quotients. Devil's Staircase (1.93) reflects its Cantor-measure values' anomalously large partial quotients (a known property of fractal-measure-distributed reals).
| Source | Domain | Value |
|---|---|---|
| Accel Sit | motion | 0.8135 |
| Lotka-Volterra | bio | 0.7205 |
| English Literature | speech | 0.7063 |
| ··· | ||
| Pi Digits | number_theory | 0.0830 |
| ln(2) Digits | number_theory | 0.0831 |
| Bzip2 (level 1) | binary | 0.0836 |
| Source | Domain | Value |
|---|---|---|
| Ambient Microseism | geophysics | 1.9653 |
| Earthquake Depths | geophysics | 1.9611 |
| IMS Bearing Failed | bearing | 1.9266 |
| ··· | ||
| Speech "Five" | speech | 0.4985 |
| Speech "Nine" | speech | 0.5622 |
| Accel Walk | motion | 0.5637 |
Continued Fraction adds a Diophantine-approximation axis the atlas lacked. Its strongest discrimination is alphabet-restricted sources (DNA, codons, dice rolls, poker hands) on gauss_kuzmin_distance, and number-theoretic sources (Stern-Brocot Walk, von Mangoldt, Champernowne) on small_quotient_fraction. The geometry does not discriminate between transcendental constants and PRNGs at byte resolution (Pi, e, AES, White Noise are all uniformly distributed over n/255 fractions, producing identical CF stats) — a limitation that reinforces the broader transcendental-as-atlas-noise finding rather than overturning it.