Zipf–Mandelbrot (16-bit)

Zipf exponent, vocabulary richness, frequency decay
distributionaldim Linguistic7 metrics

What It Measures

The same vocabulary structure as the 8-bit version, but at the resolution of byte pairs.

Packs each pair of consecutive bytes into a 16-bit symbol (65,536 possible values) using a sliding window, then performs the same Zipf-Mandelbrot analysis. The larger alphabet dramatically increases sensitivity to sequential structure: two signals with identical 8-bit histograms can have completely different 16-bit Zipf profiles if their byte-to-byte transitions differ.

Metrics

zipf_alpha

Zipf exponent at 16-bit resolution. Poker hands (2.74) and Sensor event streams (2.56) still top the ranking, but with lower exponents than the 8-bit version — the larger vocabulary dilutes the concentration. Collatz gap lengths drops from 3.62 (8-bit) to 2.39 (16-bit), indicating its dominance is mainly a single-byte phenomenon. Signals where alpha is similar at both resolutions have concentration that extends to pair structure.

zipf_r_squared

Fit quality at 16-bit resolution. Collatz gap lengths (0.99), Poker hands (0.98), and Continued fractions (0.98) score highest — their frequency decay is genuinely power-law even at pair resolution. This is more discriminating than the 8-bit version: many signals that fit Zipf at byte level fall apart at the pair level because their byte transitions are not power-law distributed.

mandelbrot_q

The same plateau parameter. Solar wind IMF and Solar wind speed still score 10.0 — their distributional plateau persists at pair resolution. Devil's staircase drops from nonzero (8-bit) to 0.0 (16-bit): its plateau structure is a single-byte phenomenon that doesn't extend to pairs. Tidal gauge (10.0) joins the top at 16-bit — its slow tidal oscillations create repeated byte pairs that flatten the top of the frequency curve.

gini_coefficient

Frequency inequality at pair resolution. Rainfall (0.97), Forest fire (0.95), and Neural net pruned (0.91) remain the most unequal. The pruned network drops from 0.94 (8-bit) to 0.91 (16-bit) — its zero-dominated weight distribution is slightly less extreme when viewed as pairs. Logistic period-3 scores 0.0 (its 3 repeated values produce 3 repeated pairs, all equally common).

hapax_ratio

Fraction of 16-bit symbols appearing exactly once. Champernowne (0.95) tops the ranking — its digit-concatenation construction creates an enormous number of unique byte pairs. Middle-Square (0.93) and Intermittent silence (0.90) follow. The hapax explosion at 16-bit resolution is expected: with 65,536 possible symbols and finite data, many pairs will be seen only once. The signals at the bottom (Sawtooth wave at 0.0) are those whose pair vocabulary is so constrained that every pair repeats.

Atlas Rankings

bigram_predictability
SourceDomainValue
Logistic r=3.2 (Period-2)chaos4.0000
Constant 0xFFnoise4.0000
Logistic r=3.83 (Period-3 Window)chaos4.0000
···
Wichmann-Hillbinary0.0104
XorShift32binary0.0105
White Noisenoise0.0105
entropy_nonstationarity
SourceDomainValue
Nikkei Returnsfinancial1.1226
NASDAQ Returnsfinancial1.1039
NYSE Returnsfinancial1.0961
···
Constant 0xFFnoise0.0000
Logistic r=3.2 (Period-2)chaos0.0000
Logistic r=3.83 (Period-3 Window)chaos0.0000
gini_coefficient
SourceDomainValue
Rainfall (ORD Hourly)climate0.9724
Forest Fireexotic0.9512
Neural Net (Pruned 90%)binary0.9092
···
Logistic r=3.83 (Period-3 Window)chaos0.0000
Logistic r=3.2 (Period-2)chaos0.0000
Logistic r=3.5 (Period-4)chaos0.0000
hapax_ratio
SourceDomainValue
Champernownenumber_theory0.9435
Middle-Square (von Neumann)binary0.9345
Intermittent Silenceexotic0.9033
···
Constant 0xFFnoise0.0000
Sawtooth Wavewaveform0.0000
Logistic r=3.83 (Period-3 Window)chaos0.0000
mandelbrot_q
SourceDomainValue
Spike Trainexotic10.0000
Accel Jogmotion10.0000
Accel Walkmotion10.0000
···
El Centro 1940geophysics0.0000
Rainfall (ORD Hourly)climate0.0000
Sandpileexotic0.0000
zipf_alpha
SourceDomainValue
Poker Handsexotic2.7400
Sensor Event Streamexotic2.5753
Categorical Sensorexotic2.5011
···
De Bruijn Sequencenumber_theory0.0001
Gray Code Counterexotic0.0001
Logistic r=3.74 (Period-5 Window)chaos0.0002
zipf_r_squared
SourceDomainValue
Collatz Gap Lengthsnumber_theory0.9911
Poker Handsexotic0.9796
Continued Fractionsnumber_theory0.9769
···
De Bruijn Sequencenumber_theory0.0053
Gray Code Counterexotic0.0058
Middle-Square (von Neumann)binary0.3477

When It Lights Up

The 16-bit version's primary value over the 8-bit is detecting transition structure. A signal with a flat byte histogram (high 8-bit entropy, low 8-bit alpha) can still have extreme 16-bit concentration if certain byte transitions dominate. Comparing alpha and hapax between the two resolutions reveals how much of the signal's vocabulary structure is local (single-byte) versus sequential (pair). In the atlas, Champernowne's jump from low hapax at 8-bit (it uses all digits) to 0.95 hapax at 16-bit (its digit pairs are almost all unique) is the clearest example: its structure is entirely in the sequence of bytes, not in their individual distribution.

Open in Atlas
← Zipf–Mandelbrot (8-bit)Cantor Set →