Continued Fraction

Khintchine-Lévy deviation and Liouville-class partial-quotient outliers on continuous signals
distributional · dim: 1D partial-quotient sequence · 2 metrics

What It Measures

What the digit-by-digit "best rational approximation" structure of each sample looks like.

For every sample value x ∈ (0, 1), computes the simple continued-fraction expansion x = 1/(a₁ + 1/(a₂ + 1/(a₃ + ...))) and aggregates the partial quotients across all samples. The math is classical: Khintchine's theorem says the geometric mean of the partial quotients of almost every real converges to K₀ ≈ 2.685; the Gauss-Kuzmin distribution gives the limiting law of the partial quotients of a uniformly sampled x, P(aₙ = k) → log₂((k+1)²/(k·(k+2))) as n → ∞. (The first quotient of an exactly uniform x follows the nearby law 1/(k·(k+1)), so a₁ only approximates Gauss-Kuzmin.) Sources whose value distribution matches "uniform real" land close to these reference values; sources with arithmetic or algebraic structure deviate in characteristic ways.
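The expansion and the Gauss-Kuzmin reference described above can be sketched as follows. This is a minimal illustration, not the atlas's implementation: the function names, the `max_depth` cap, and the `eps` stopping tolerance are assumptions chosen for the example.

```python
import math

def cf_partial_quotients(x, max_depth=20, eps=1e-9):
    """Partial quotients a1, a2, ... of x in (0, 1).

    Stops at max_depth terms or once the remainder is too small to
    invert reliably in floating point (eps is an assumed tolerance).
    """
    quotients = []
    r = x
    for _ in range(max_depth):
        if r < eps:
            break
        inv = 1.0 / r
        a = math.floor(inv)      # next partial quotient
        quotients.append(a)
        r = inv - a              # fractional remainder, again in [0, 1)
    return quotients

def gauss_kuzmin_pmf(k):
    """Gauss-Kuzmin probability log2((k+1)^2 / (k (k+2))) for k >= 1."""
    return math.log2((k + 1) ** 2 / (k * (k + 2)))

print(cf_partial_quotients(0.75))              # 3/4 = 1/(1 + 1/3)
print(cf_partial_quotients((5 ** 0.5 - 1) / 2, max_depth=10))
print(round(gauss_kuzmin_pmf(1), 3))
```

The golden-ratio conjugate (√5 − 1)/2 is a convenient sanity check because all of its partial quotients are 1.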

Metrics

gauss_kuzmin_distance

Total-variation distance between the empirical distribution of first partial quotients P(a₁ = k) and the theoretical Gauss-Kuzmin distribution. All 8 DNA sources and Codon Usage score exactly 0.8924: alphabet restriction (DNA has 4 letters) collapses the CF distribution to one specific shape regardless of organism. Poker Hands (0.84) scores similarly. White Noise (0.08) and other uniform-byte sources sit near the theoretical floor. An F-statistic of 9.7 puts it in the top tier of atlas discriminators. The metric saturates on small alphabets: it can detect "this is 4-letter" but cannot distinguish humans from viruses.
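A total-variation distance of this kind can be sketched as below. The truncation at `k_max` with a single lumped tail bin, and the function names, are assumptions made for the example; the atlas may bin the distribution differently.

```python
import math
from collections import Counter

def gauss_kuzmin_pmf(k):
    """Gauss-Kuzmin probability of partial quotient k (k >= 1)."""
    return math.log2(1.0 + 1.0 / (k * (k + 2)))

def gauss_kuzmin_distance(samples, k_max=1000):
    """Total-variation distance between empirical P(a1 = k) and Gauss-Kuzmin.

    Quotients above k_max are lumped into one tail bin (an assumed choice).
    """
    a1 = [math.floor(1.0 / x) for x in samples if 0.0 < x < 1.0]
    counts = Counter(min(a, k_max + 1) for a in a1)
    n = len(a1)
    total_abs_diff = 0.0
    theory_mass = 0.0
    for k in range(1, k_max + 1):
        p = gauss_kuzmin_pmf(k)
        theory_mass += p
        total_abs_diff += abs(counts.get(k, 0) / n - p)
    # Tail bin: everything with a1 > k_max
    total_abs_diff += abs(counts.get(k_max + 1, 0) / n - (1.0 - theory_mass))
    return 0.5 * total_abs_diff

# A source that only ever emits one value puts all mass on a single a1,
# so the distance is 1 minus the Gauss-Kuzmin mass at that quotient.
constant_source = [0.3] * 100      # hypothetical data, a1 = 3 for every sample
print(gauss_kuzmin_distance(constant_source))
```

The constant-source case shows the saturation mechanism in its simplest form: once the set of possible values is tiny, the distance is fixed by which quotients those values map to.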

cf_depth_mean

Average number of partial quotients computed before float-precision exhausts the expansion. uint8 sources (Pi Digits, e Digits, AES Encrypted) all hit ~4.8 because input precision is only 8 bits. Float-precision sources like Lorenz, Brownian, and Stern-Brocot reach the cap of 20. Effectively an encoding-richness axis, orthogonal (r ≈ 0) to the Khintchine-norm metrics.
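Why low-resolution inputs terminate early can be illustrated with exact rational arithmetic. This sketch swaps in Python's `Fraction` type for the float loop described above, and assumes uint8 samples are normalized as n/255; the `cf_depth` name and the cap argument are illustrative only.

```python
from fractions import Fraction

def cf_depth(x, cap=20):
    """Number of partial quotients of a rational x in (0, 1), capped at `cap`."""
    depth = 0
    r = Fraction(x)
    while r != 0 and depth < cap:
        inv = 1 / r
        a = inv.numerator // inv.denominator   # floor of the reciprocal
        r = inv - a                            # exact remainder
        depth += 1
    return depth

# Every value on the assumed uint8 grid n/255 has a short, finite expansion,
# because the Euclidean algorithm on a denominator of 255 stops quickly.
grid_depths = [cf_depth(Fraction(n, 255)) for n in range(1, 255)]
print(max(grid_depths), sum(grid_depths) / len(grid_depths))
```

Because the depth is bounded by the size of the denominator, this quantity tracks how many bits the encoding carries rather than anything about the signal's dynamics.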

small_quotient_fraction

P(a₁ = 1), the fraction of samples whose continued fraction starts [0; 1, ...]. Theoretical Gauss-Kuzmin value is log₂(4/3) ≈ 0.415. Stern-Brocot Walk scores 0.97 (extreme — the source is by construction a tour of rationals organized by CF complexity, so its values are dominated by small-quotient patterns). von Mangoldt Function scores 0.99 (its sparse log-of-prime values land in CF-flavored regions). DNA scores 0.0 (the 4 specific values produce no a₁ = 1 patterns). This is the only metric in the atlas that directly probes mediant-tree / Sturmian structure.
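Since a₁ = ⌊1/x⌋, the condition a₁ = 1 holds exactly when x lies in (1/2, 1), so the metric reduces to the share of samples in the upper half of the unit interval. A minimal sketch, with a hypothetical function name and made-up sample values:

```python
import math

def small_quotient_fraction(samples):
    """Fraction of samples in (0, 1) whose first partial quotient is 1."""
    valid = [x for x in samples if 0.0 < x < 1.0]
    hits = sum(1 for x in valid if math.floor(1.0 / x) == 1)
    return hits / len(valid)

samples = [0.6, 0.7, 0.2, 0.3]             # hypothetical data
print(small_quotient_fraction(samples))     # two of four samples exceed 1/2
print(math.log2(4 / 3))                     # Gauss-Kuzmin reference value
```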

log_khintchine_mean

Geometric mean of all partial quotients across all samples, log-scaled. Khintchine's theorem predicts log K₀ ≈ 0.988 for almost every real. Heavy-tailed empirical distributions show up here: Rainfall (5.09) is the extreme outlier — its event-driven sparse signal produces anomalous partial quotients. Devil's Staircase (1.93) reflects its Cantor-measure values' anomalously large partial quotients (a known property of fractal-measure-distributed reals).
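The log of a geometric mean is the arithmetic mean of the logs, which is how a metric like this can be computed in practice. The sketch below is an assumption-laden illustration (function names, depth cap, and tolerance are not from the atlas); it also evaluates the series ln K₀ = Σ log₂(1 + 1/(k(k+2))) · ln k to recover the reference value.

```python
import math

def partial_quotients(x, max_depth=20, eps=1e-9):
    """Float continued-fraction expansion of x in (0, 1)."""
    out, r = [], x
    while len(out) < max_depth and r > eps:
        inv = 1.0 / r
        a = math.floor(inv)
        out.append(a)
        r = inv - a
    return out

def log_khintchine_mean(samples, max_depth=20):
    """Mean natural log of all partial quotients pooled across samples."""
    logs = [math.log(a)
            for x in samples if 0.0 < x < 1.0
            for a in partial_quotients(x, max_depth)]
    return sum(logs) / len(logs)

def ln_khintchine_reference(n_terms=200_000):
    """Truncated series for ln K0 (converges slowly, so many terms are used)."""
    return sum(math.log2(1.0 + 1.0 / (k * (k + 2))) * math.log(k)
               for k in range(2, n_terms + 1))

# sqrt(2) - 1 = [0; 2, 2, 2, ...], so its log-mean is ln 2, well below ln K0.
print(log_khintchine_mean([math.sqrt(2) - 1], max_depth=10))
print(ln_khintchine_reference())
```

Quadratic irrationals such as √2 − 1 have periodic expansions and are among the measure-zero exceptions to Khintchine's theorem, which makes them useful test inputs.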

Atlas Rankings

gauss_kuzmin_distance

Source               Domain           Value
Accel Sit            motion           0.8135
Lotka-Volterra       bio              0.7205
English Literature   speech           0.7063
···
Pi Digits            number_theory    0.0830
ln(2) Digits         number_theory    0.0831
Bzip2 (level 1)      binary           0.0836

log_khintchine_mean

Source               Domain           Value
Ambient Microseism   geophysics       1.9653
Earthquake Depths    geophysics       1.9611
IMS Bearing Failed   bearing          1.9266
···
Speech "Five"        speech           0.4985
Speech "Nine"        speech           0.5622
Accel Walk           motion           0.5637

When It Lights Up

Continued Fraction adds a Diophantine-approximation axis the atlas lacked. Its strongest discrimination is alphabet-restricted sources (DNA, codons, dice rolls, poker hands) on gauss_kuzmin_distance, and number-theoretic sources (Stern-Brocot Walk, von Mangoldt, Champernowne) on small_quotient_fraction. The geometry does not discriminate between transcendental constants and PRNGs at byte resolution (Pi, e, AES, White Noise are all uniformly distributed over n/255 fractions, producing identical CF stats) — a limitation that reinforces the broader transcendental-as-atlas-noise finding rather than overturning it.
