Friday, February 22, 2013

Harder than Diehard: The Scintillating Entropy Test

With apologies to the late great George Marsaglia, whose Diehard Tests are a cornerstone in the quest for credible pseudorandomness...

The Second Law of Thermodynamics, which has more to do with math than physics, states that "the entropy of an isolated system never decreases", to quote Wikipedia. But this law applies to physical systems under the classical assumption of continuity among infinite particles, as distinct from quantization among finitely many particles. A Marsaglia oscillator is definitely the latter variety, its state space being constrained by the number of bits in its iterand.

We all know that such finite discrete systems must eventually fall into a limit cycle. Indeed, one of the beauties of Marsaglia oscillators is that their limit cycle lengths can be predicted exactly without actually running an entire cycle.

So what this means is that if we start with a value on the limit cycle, then we will eventually return to the same value, given enough time. Returning to the thermodynamic metaphor, it's as though you have an egg on the table. It then rolls off and splats on the floor. The next day, you find that the egg has suddenly cleaned itself up and reassembled itself on the table.

What would one expect to see on the way to the egg splatting? First, the speed of the egg would increase. Then, as it hit the floor, cracks would spread throughout the shell, migrating from the point of impact, wider and wider, until the shell lost structural integrity and the yolk spewed out all over the place. In other words, a monotonic increase in entropy from start to finish, right in tune with the Second Law.

We expect to see the same in a Marsaglia oscillator. The trouble is, we know that, eventually, the system will return to its original seed, provided that the seed is indeed on a limit cycle (which is usually the case). How does this occur? Does the egg go from complete splat to happily sitting back on the table, instantaneously? Not exactly. Believe it or not, what we observe is entropy scintillation. Just like a star on the horizon, which rapidly dims and flares with changing atmospheric phenomena, the entropy comes and goes, then comes back again. (I'm not referring to the fact that the scintillation of a star can be an entropy source, which is clearly true; I'm saying that the degree of entropy itself is scintillating.)

We can see this in practice. Here's one good way to measure entropy accrual in any N-bit integer oscillator:



1. Pick a random N-bit seed from a uniform distribution. Call it S0.

2. Copy S0 to S1.

3. Flip a bit of S1 at random, so S0 and S1 differ by a bit.

4. Iterate S0 and S1 separately, but after each stage, take their xor. (This is called the "xor compensator".) Count the number of 0-to-1 or 1-to-0 transitions, T, in the xor. (Consider the N bits to be arranged in a loop, so that the maximum value of T is N, not (N-1).) [EDIT: On second thought, it's probably better to think of the xor compensator as having bit (negative 1) as 0. That way, (all 1s) is considered to have 1 transition, which is more entropic than (all 0s), which has 0 transitions. (And moreover, the number of transitions can now be odd as well as even.) But this doesn't change the results much, apart from making them a little more accurate.]
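The transition count in step 4 has a neat branch-free form: xor the word with a one-bit rotation of itself, and count the 1 bits in the result. Here's a minimal Python sketch, using the circular convention from the main text (not the bracketed bit-negative-1 variant):

```python
def count_transitions(x, n):
    """Count 0-to-1 and 1-to-0 transitions in the n-bit word x,
    treating the bits as arranged in a loop (circular convention)."""
    # Rotate x right by one bit within n bits, then xor with the
    # original: each 1 bit in the result marks a pair of adjacent
    # bits that differ, i.e. one transition.
    rotated = (x >> 1) | ((x & 1) << (n - 1))
    return bin(x ^ rotated).count("1")

# A word with a single 1 bit has exactly 2 transitions, matching the
# expected count right after the bit flip in step 3.
print(count_transitions(1 << 37, 512))   # 2
print(count_transitions(0b01010101, 8))  # 8 (alternating: the maximum)
print(count_transitions(0xFF, 8))        # 0 (all 1s, circular convention)
```

Under this convention the count is always even, which is exactly what the bracketed EDIT above trades away in exchange for distinguishing (all 1s) from (all 0s).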



So after step 3 and before we iterate at all, we expect to find 2 transitions in the xor, namely, one 0-to-1 and one 1-to-0, because the xor must contain a single 1 bit, corresponding to the bit that was flipped in #3. After each iteration of this oscillator, we should expect to find more and more transitions. If we run many trials from #1 onwards, then, after a certain number of iterations, the average number of transitions should be N/2. (This is easily demonstrated by substituting a true random number generator for the pseudorandom iterator.) This comes from the fact that a TRNG has a 50/50 chance of generating the next bit different from the current bit.
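That N/2 baseline is easy to check empirically: if each bit is an independent coin flip, adjacent bits differ with probability 1/2, so a random N-bit word averages N/2 transitions. A quick sketch, using Python's seeded `random` module as a stand-in for a TRNG (the seed and trial count are arbitrary choices of mine):

```python
import random

def count_transitions(x, n):
    # Circular transition count: xor the word with a 1-bit rotation of itself.
    rotated = (x >> 1) | ((x & 1) << (n - 1))
    return bin(x ^ rotated).count("1")

N = 64
rng = random.Random(2013)  # seeded stand-in for a TRNG, for reproducibility
trials = 2000
mean = sum(count_transitions(rng.getrandbits(N), N)
           for _ in range(trials)) / trials
print(mean)  # close to N/2 = 32
```

Because the words are independent draws rather than successive iterates, the mean sits at N/2 from the very first "iteration", which is the behavior a pseudorandom oscillator is being measured against.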

In a perfect TRNG, we hit 50/50 after the first iteration, because there is no iterator! With a pseudorandom iterator, however, we expect to have to "rev it up", like a car starting on a cold day, only approaching the N/2 threshold after some number of iterations. The natural implication is that, if we simply iterate a few times between producing numbers, instead of just once, then we can approach the N/2 limit as closely as desired.

Unfortunately, Marsaglia oscillators (and, I suspect, various other wannabe random iterators) don't behave like this. Yes, they do approach the N/2 limit in 10 or 20 iterations. But then they back off, while the egg puts itself (somewhat) back together again! What we have, most certainly, is scintillating entropy. This is a smoking gun for pseudorandomness.

Let's look at the data. I happen to be investigating a 512-bit Marsaglia oscillator with B=(2^256) and A=(B-0xE66FAD12). So perfect entropy means averaging 256 transitions in the xor of the iterands. (Too many transitions are just as bad as too few: 01010101... is not very entropic!) However, that's not what happens, as you can see below. Entropy maxes out after 10 iterations, then backs off, only to return 6 iterations later:



0: 2
1: 11.0839
2: 37.1358
3: 68.3862
4: 96.3759
5: 125.272
6: 154.67
7: 181.451
8: 205.462
9: 232.128
10: 255.925
11: 264.856
12: 265.29
13: 263.62
14: 259.082
15: 257.353
16: 256.811
17: 255.263
18: 257.525
19: 255.262
20: 249.468
21: 249.762
22: 253.123
23: 256.236
24: 258.244



(The "2" next to "0" (iterations) is just a sanity check. It's correct, meaning that we do indeed always begin with 2 seeds which differ by a single bit.) Now perhaps you're thinking "These numbers are all pretty close to 256 after the 10th iteration. The error is just noise." But you would be wrong: I seed the program with the Jytter TRNG every time, but I get (very nearly) the same numbers, with maximum entropy after iterations 10, 16, 19, and so on. Remember that I'm starting from random values. It's incredible that this pattern exists at all, let alone so many standard deviations out after millions of tests starting from different seeds. What you see above is the Marsaglia system genuinely reversing its entropy, to some extent. (Yes, it's true that my definition of entropy as bit transitions is by no means the ultimate definition, but I think you'll see some form of scintillation, which persists when averaged over many seeds, no matter how you choose to quantify entropy.)

But this should not be surprising, because after all, we know that the oscillator must eventually return to its point of origin, to the extent that such point is on the limit cycle. It would be more surprising if it did so as a singular event, as opposed to the culmination of entropy scintillation throughout the entire cycle.

A TRNG should hit the N/2 limit after a single iteration. Any oscillator which fails this test, remaining defiantly aloof from N/2 despite many trials, is a pseudorandom generator or a broken TRNG. The tantalizing hypothesis is that this remains true, regardless of how many "revving" iterations the generator is allowed. (A really smart PRNG would know exactly where its "entropic nodes" were, and produce numbers at only those points, for example, 10 and 16 and 19 above. But then we would need to worry about correlations between iterations 10 and 16, etc., which is why this test is so diehard.)

I wonder if this has any security ramifications for existing infrastructure.

But more to the point, I wonder if this has physical equivalents. Perhaps physical systems can oscillate in and out of chaos, despite having essentially no net energy exchange with the environment. Meteorologists say that there are clouds in the sky because energy from the sun drives the processes which form them. But could clouds exist if the atmosphere were a closed system? Could we oscillate between periods of uniformly distributed gases, and organized thunderstorms? If we view the entire atmosphere as a cellular automaton, of the sort described in A New Kind of Science, then I think the answer would be yes...
