Continued Fraction

Khintchine-Lévy deviation and Liouville-class partial-quotient outliers on continuous signals

distributional · dim 1D partial-quotient sequence · 2 metrics

What It Measures

What the digit-by-digit "best rational approximation" structure of each sample looks like.

For every sample value x ∈ (0, 1), computes the simple continued-fraction expansion x = 1/(a₁ + 1/(a₂ + 1/(a₃ + ...))) and aggregates the partial quotients across all samples. The math is classical: Khintchine's theorem says that, for almost every real, the geometric mean of the partial quotients converges to K₀ ≈ 2.685; the Gauss-Kuzmin theorem says that, for uniformly sampled x, P(aₙ = k) → log₂((k+1)²/(k·(k+2))) as n → ∞. Sources whose value distribution matches "uniform real" land near these reference values; sources with arithmetic or algebraic structure deviate in characteristic ways.
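The expansion and the reference distribution can be sketched in a few lines. This is a minimal illustration, not the atlas implementation: the names partial_quotients and gauss_kuzmin_pmf, the tolerance eps, and the stopping rule are all assumptions made for the example.

```python
import math

def partial_quotients(x, max_depth=20, eps=1e-9):
    """Partial quotients [a1, a2, ...] of x in (0, 1).

    Hypothetical helper: stops after max_depth terms or once the
    remainder drops below eps (an assumed tolerance).
    """
    quotients = []
    for _ in range(max_depth):
        if x < eps:              # remainder indistinguishable from zero
            break
        inv = 1.0 / x
        a = math.floor(inv)      # next partial quotient
        quotients.append(a)
        x = inv - a              # fractional part feeds the next step
    return quotients

def gauss_kuzmin_pmf(k):
    """Gauss-Kuzmin probability of a partial quotient equal to k."""
    return math.log2((k + 1) ** 2 / (k * (k + 2)))

print(partial_quotients(0.75))   # 0.75 = 1/(1 + 1/3)
print(gauss_kuzmin_pmf(1))       # log2(4/3)
```

The float-based loop loses accuracy with depth, so deep quotients of irrational inputs should be read as approximate.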

Metrics

gauss_kuzmin_distance

Total-variation distance between the empirical distribution of first partial quotients P(a₁ = k) and the theoretical Gauss-Kuzmin distribution. The 8 DNA sources and Codon Usage all score exactly 0.8924: alphabet restriction (DNA has 4 letters) collapses the CF distribution to one specific shape regardless of organism. Poker Hands (0.84) scores similarly. White Noise (0.08) and other uniform-byte sources sit near the floor. An F-stat of 9.7 puts it in the top tier of atlas discriminators. The metric saturates on small alphabets: it can detect "this is 4-letter" but cannot distinguish humans from viruses.
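A total-variation distance of this kind might be computed as follows. This is a sketch under stated assumptions: the function name mirrors the metric but the code is not taken from the atlas, and the truncation point k_max with a single lumped tail bin is a choice made for the example.

```python
import math
from collections import Counter

def gauss_kuzmin_pmf(k):
    """Gauss-Kuzmin probability of a partial quotient equal to k."""
    return math.log2((k + 1) ** 2 / (k * (k + 2)))

def gauss_kuzmin_distance(samples, k_max=50):
    """Total-variation distance between empirical a1 and Gauss-Kuzmin.

    Assumption: quotients above k_max are lumped into one tail bin.
    """
    firsts = [math.floor(1.0 / x) for x in samples if 0.0 < x < 1.0]
    if not firsts:
        return float("nan")
    n = len(firsts)
    counts = Counter(min(a, k_max + 1) for a in firsts)
    total = 0.0
    tail_theory = 1.0            # probability mass left for k > k_max
    for k in range(1, k_max + 1):
        p = gauss_kuzmin_pmf(k)
        tail_theory -= p
        total += abs(counts.get(k, 0) / n - p)
    total += abs(counts.get(k_max + 1, 0) / n - tail_theory)
    return 0.5 * total

# A source stuck on one value is almost maximally far from the reference.
print(gauss_kuzmin_distance([0.3] * 100))
# An evenly spaced grid on (0, 1) stands in for a uniform source.
print(gauss_kuzmin_distance([(i + 0.5) / 100000 for i in range(100000)]))
```

The uniform grid does not reach zero. The first quotient of a uniform x follows 1/(k(k+1)) rather than the Gauss-Kuzmin law, which leaves a gap of roughly 0.085; that may be why uniform sources sit near 0.08 rather than at 0.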

cf_depth_mean

Average number of partial quotients computed before floating-point precision exhausts the expansion. uint8 sources (Pi Digits, e Digits, AES Encrypted) all reach only ~4.8 because the input precision is only 8 bits. Float-precision sources like Lorenz, Brownian, and Stern-Brocot reach the cap of 20. Effectively an encoding-richness axis, orthogonal (r ≈ 0) to the Khintchine-norm metrics.
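The contrast between 8-bit and full-precision inputs can be illustrated with a short depth counter. This is only a sketch: the stopping tolerance eps and the mapping of bytes to n/255 are assumptions, so the printed mean is not expected to match the atlas figure exactly.

```python
import math

def cf_depth(x, max_depth=20, eps=1e-9):
    """Number of partial quotients extracted before the remainder vanishes.

    Hypothetical helper: eps is an assumed tolerance, and max_depth
    matches the cap of 20 quoted above.
    """
    depth = 0
    while depth < max_depth and x > eps:
        inv = 1.0 / x
        x = inv - math.floor(inv)
        depth += 1
    return depth

# 8-bit inputs are rationals n/255, so their expansions terminate early.
uint8_depths = [cf_depth(n / 255) for n in range(1, 255)]
mean_uint8 = sum(uint8_depths) / len(uint8_depths)

# An irrational value at float precision runs into the depth cap.
irrational_depth = cf_depth((math.sqrt(5) - 1) / 2)

print(mean_uint8, irrational_depth)
```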

small_quotient_fraction

P(a₁ = 1), the fraction of samples whose continued fraction starts [0; 1, ...]. The theoretical Gauss-Kuzmin value is log₂(4/3) ≈ 0.415. Stern-Brocot Walk scores 0.97, an extreme value: the source is by construction a tour of rationals organized by CF complexity, so its values are dominated by small-quotient patterns. von Mangoldt Function scores 0.99 (its sparse log-of-prime values land in CF-flavored regions). DNA scores 0.0 (the 4 specific values produce no a₁ = 1 patterns). This is the only metric in the atlas that directly probes mediant-tree / Sturmian structure.
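Since a₁ = ⌊1/x⌋, the condition a₁ = 1 holds exactly when x > 1/2, so this metric amounts to the share of samples in the upper half of the unit interval. A minimal sketch, with a hypothetical function name and made-up sample values:

```python
import math

def small_quotient_fraction(samples):
    """Fraction of samples in (0, 1) whose first partial quotient is 1."""
    firsts = [math.floor(1.0 / x) for x in samples if 0.0 < x < 1.0]
    if not firsts:
        return float("nan")
    return sum(a == 1 for a in firsts) / len(firsts)

# Illustrative values only: two lie above 1/2, two below.
print(small_quotient_fraction([0.6, 0.9, 0.3, 0.1]))  # 0.5
```

A source whose values never exceed 1/2 scores 0.0 on this metric no matter how its values are otherwise distributed.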

log_khintchine_mean

Logarithm of the geometric mean of all partial quotients across all samples, i.e. the mean of log aᵢ. Khintchine's theorem predicts log K₀ ≈ 0.988 for almost every real. Heavy-tailed empirical distributions show up here: Rainfall (5.09) is the extreme outlier, because its event-driven sparse signal produces anomalous partial quotients. Devil's Staircase (1.93) reflects the anomalously large partial quotients of its Cantor-measure values (a known property of fractal-measure-distributed reals).
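The log of a geometric mean is the arithmetic mean of the logs, and the reference value follows from the Gauss-Kuzmin weights as log K₀ = Σ ln(k) · log₂(1 + 1/(k(k+2))). The sketch below computes both. The function names, the tolerance eps, and the number of series terms are assumptions for illustration, not the atlas code.

```python
import math

def partial_quotients(x, max_depth=20, eps=1e-9):
    """Partial quotients of x in (0, 1); eps is an assumed tolerance."""
    quotients = []
    for _ in range(max_depth):
        if x < eps:
            break
        inv = 1.0 / x
        a = math.floor(inv)
        quotients.append(a)
        x = inv - a
    return quotients

def log_khintchine_mean(samples, max_depth=20):
    """Mean of ln(a) over every partial quotient of every sample."""
    logs = [math.log(a)
            for x in samples if 0.0 < x < 1.0
            for a in partial_quotients(x, max_depth)]
    return sum(logs) / len(logs) if logs else float("nan")

def log_khintchine_reference(terms=200_000):
    """Truncated series for ln K0 using Gauss-Kuzmin weights."""
    ln2 = math.log(2.0)
    return sum(math.log(k) * math.log1p(1.0 / (k * (k + 2))) / ln2
               for k in range(2, terms + 1))

print(log_khintchine_reference())   # approaches ln K0, about 0.988
print(log_khintchine_mean([0.5]))   # single quotient 2, so ln 2
```

The series converges slowly, so the truncated sum sits slightly below the limit.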

Atlas Rankings

gauss_kuzmin_distance
Source	Domain	Value
Accel Sit	motion	0.8135
Lotka-Volterra	bio	0.7205
English Literature	speech	0.7063
···
Pi Digits	number_theory	0.0830
ln(2) Digits	number_theory	0.0831
Bzip2 (level 1)	binary	0.0836

log_khintchine_mean
Source	Domain	Value
Ambient Microseism	geophysics	1.9653
Earthquake Depths	geophysics	1.9611
IMS Bearing Failed	bearing	1.9266
···
Speech "Five"	speech	0.4985
Speech "Nine"	speech	0.5622
Accel Walk	motion	0.5637

When It Lights Up

Continued Fraction adds a Diophantine-approximation axis the atlas lacked. Its strongest discrimination is on alphabet-restricted sources (DNA, codons, dice rolls, poker hands) via gauss_kuzmin_distance, and on number-theoretic sources (Stern-Brocot Walk, von Mangoldt, Champernowne) via small_quotient_fraction. The geometry does not discriminate between transcendental constants and PRNGs at byte resolution: Pi, e, AES, and White Noise are all uniformly distributed over n/255 fractions, producing identical CF stats. This limitation reinforces the broader transcendental-as-atlas-noise finding rather than overturning it.
