LMD7, whose C source code is
here, is a kilobit hash which operates on 32-kilobit blocks, and largely resembles
LMD6,
but it's about twice as fast. (I won't repeat all the discussion of how
LMD6 was evolved, but it's worth a read if you want to understand LMD7,
because this article will focus only the differences.)
[Sorry about the excessive whitespace which Blogspot injects at random just to be annoying.]
Here's the algo:
j=((2^15)/N)-1
x=X0
c=C0
y=Y0
d=D0
FOR (i=0 to (j-1),i+=2){
x=x xor y
c=c xor d
p=A*(x xor L[N, i])+(c xor L[N, i+1])
x=p MOD 2^N
c=p/(2^N)
q=B*(y xor L[N, i xor ((j+1)/2)])+(d xor L[N, (i+1) xor ((j+1)/2)])
y=q MOD 2^N
d=q/(2^N)
}
p=p+(y*(2^N))+d
p=p MOD 2^(2N)
z=p xor M
where:
j is the maximum index of the block, each item of which is N bits wide.
N is half the bit width of the hash, which is 512 because an LMD7 hash is 1024 bits wide.
X0,
C0, Y0, and D0 are the N-bit hash seeds, which are intended to be
implied by a cryptographic context, and which are neither used for any
other purpose nor stored anywhere. In the C source code, they appear in
memory in the aforementioned order at p_base on input to the hash. Note
that we have 4N bits going into a hash which will ultimately be only 2N
bits wide.
x, c, y, and d are also N bits wide. (x, c)
and (y, d) function in a manner similar to (x, c) in a Marsaglia
oscillator. (x is at the memory address immediately below c.)
p
is the first oscillator state, 2N bits wide. It's more complicated than
a Marsaglia oscillator because its design is a product of evolution,
but it contains the components of one.
A is the first
Marsaglia oscillator's multiplier which, in terms of the source code
constant LMD7_D0, is (2^512-(LMD7_D0*(2^32))). LMD7_D0=0xD1AEF329.
L
is the array of (j+1) N-bit words whose hash to evaluate. (In all
cases, L is 2^15 bits long. It's just a question of how wide we consider
a word to be, and thus how many words L contains.) Note that L is
indexed from 1 through 64 in Mathematica, but from 0 through 63 in the
algo above.
q is the second oscillator state, 2N bits wide.
B
is the second Marsaglia oscillator's multiplier which, in terms of the
source code constant LMD7_D1, is (2^512-(LMD7_D1*(2^32))).
LMD7_D1=0xE5467E8F.
M is a xor mask, essentially a
cryptographically generated nonce, which will mask the hash. (Encrypting
the hash with this mask would appear to be overkill, but it's dirt
cheap compared to encrypting and taking the hash of the payload, so why
not.)
z is the hash (the digest) of L, which is 2N bits wide.
And here is a Mathematica example, using exactly the same seeds and hash as over
here,
which also appears in demo.c in the source code. Cut and paste into the
Mathematica Kernel command prompt, without any carriage returns that
this website might have injected. Then hit Enter to execute it.
A=(2^512-((2^32)*16^^D1AEF329));B=(2^512-((2^32)*16^^E5467E8F));n=512;t=2^n;u=2^(n+n);X0=1;C0=2;Y0=3;D0=4;M=0;j=((2^15)/n)-1;L=Array[#&,j+1];x=X0;c=C0;y=Y0;d=D0;For[i=0,i<j,i+=2,x=BitXor[x,y];c=BitXor[c,d];p=(A*BitXor[x,L[[i+1]]])+BitXor[c,L[[i+2]]];x=Mod[p,t];c=BitShiftRight[p,n];q=(B*BitXor[y,L[[BitXor[i,((j+1)/2)]+1]]])+BitXor[d,L[[BitXor[i,((j+1)/2)]+2]]];y=Mod[q,t];d=BitShiftRight[q,n]];p=Mod[p+BitShiftLeft[y,n]+d,u];z=BitXor[p,M];Print[BaseForm[z,16]];
The above code, when executed in the Mathematica Kernel, should produce:
23f02b7e5b0d92db672706d46d00b7e19bf25a746dffe8c4afcf7c4d71bdb4df8128af2d41a40605a2c158cb4ea33e776b7da1877f1053bf9fa2d5c40dd01c5a19b4aa63034626d7f6adff9fa871b5709878115ec2e4c5eb1caa4b86d8dea9b28caea6278db03a6b03c6bebaad49cf3279fb724a653225613d9f6422ce63a70e
Thus
LMD7 involves 2 oscillators of large mutually prime period --
((A*(2^511))-1) and ((B*(2^511))-1) -- starting half a block apart and
rotating through all words in a block, with feedback moving from
oscillator B to oscillator A after each step (but not in the other
direction). Because a block is 2^15 bits long, and an LMD7 is 2^10 bits
long, that means each oscillator iterates 32 times before the hash is
issued.
Thus immediately after oscillator A processes word 0, oscillator B processes word 16; immediately after
oscillator A processes word 31, oscillator B processes word 15.
Thus in the weakest case, a given word is iterated through one of the
oscillators only once, while being iterated 17 times through the other.
The
main reason why LMD7 is about twice as fast as LMD6 is that the word
width of the former is twice as long as the latter. In other words,
while LMD6 only xors X with the plaintext, LMD7 xors both X and C.
Normally, this would be unsafe because the high bit of a Marsaglia
oscillator is
biased. But in this case, the bias of each oscillator is 1 part in O(2^(512-64)), or O(10^134). Is that exploitable? Good luck.
One
obvious requirement is that both oscillators come up to essentially
maximum entropy in at most 17 iterations. That way, when one oscillator
is at its weakest (i.e. a single digesting iteration remaining), the
other is sufficiently strong to pick up the slack. To that end, LMD7_D0
and LMD7_D1 have been chosen so that the
scintillating entropy profile
of the oscillators ends up at 511 bit transitions after 17 iterations,
in both cases. In other words, starting from 2 1024-bit random seeds
which differ by only a bit, the xor of the 17-iterated versions of these
seeds will itself contain 511 bit transitions -- trivially shy of
maximum entropy. But there are 2 such oscillators, which after digesting
the block, are combined by crossing high and low 512-bit parts, using
carry-propagated addition, to yield a 1024-bit hash which is for all
practical purposes unbiased. And by the way, this is where the factor of
2^32 comes in A and B: without this factor, the oscillator doesn't
entropize fast enough. But with it, we can use simple 32-to-64-bit
multiplies to rapidly iterate each oscillator.
In the
interest of creating a compelling case for the security of LMD7, perhaps
the most important consideration is demonstrating that setting a single
1 bit in a block consisting of all 0s produces a change to the hash of
such complexity as to appear random. This "change" is what I call the
"xor
compensator" of the 2 hashes, because when this value is xored to the
hash of the poisoned block, it morphs into the hash of the legitimate
block. Statistically speaking, this means that if we do 2^2N such tests,
and thereby generate as many xor compensators, that we should expect to
encounter (R*(2^2N)) unique xor compensators, where R is the
reflexive density constant.
(By
the way, xor is obviously not the only way one could compensate a hash.
Addition, modulo multiplication, and an infinite number of other
functions are candidates as well. But xor tends to be a good proxy for
addition: if the xor compensator shows dangerous skews, then chances are
that the addition compensator does so as well, and vice versa. Bias in
other compensator spaces are an interesting open question, and I don't
pretend to be
proving anything in this regard.)
The only minor problem, of course, is that 2N=1024. This forces us into a regime of
extrapolatory algogenesis,
which is about the best we can do, absent a proof of complexity, which
has thus far eluded all hash algos ever invented. We can always to a
more thorough job in this regard, but 2 cases are explored below, one
with N=8 and the other with N=9. In other words, we are scaling LMD7
down from a 1024-bit hash to a 16-or-18-bit hash, repsectively. The algo
is otherwise unchanged, including the number of words per block, 64.
Mainly,
we're looking for evidence that setting a single 1 in a block of 0s
with 2^2N random seeds creates a xor compansator set which is compliant
with R, as described above. Any more complicated change to any more
complicated block might lead us to unjustifiably optimistic assumptions
about the strength LMD7.
The Marsaglia A and B values
in this case will be (t-46) and (t-52), where t=2^8, as you can see in
the code below. (By the way, (A*(2^8)-1) and (A*(2^7)-1) are both prime
as required, and similarly for B.) So here's the Mathematica code which
will do this (remove any carriage returns injected by this blog
website, then paste into the Mathematica Kernel command prompt). (I
realize that the code looks horrible, which is because it's only
intended for verification purposes; the C is much more elegant, I
promise!)
n=8;t=2^n;A=(t-46);B=(t-52);u=2^(n+n);v=t-1;K=Array[#*0&,u];R=0;S=0;j=63;h=(j+1)/2;L=Array[#&,j+1];For[w=0,w<u,w++,X0=RandomInteger[v];C0=RandomInteger[v];Y0=RandomInteger[v];D0=RandomInteger[v];x=X0;c=C0;y=Y0;d=D0;For[i=0,i<=j,i+=2,x=BitXor[x,y];c=BitXor[c,d];p=(A*BitXor[x,L[[i+1]]])+BitXor[c,L[[i+2]]];x=Mod[p,t];c=BitShiftRight[p,n];q=(B*BitXor[y,L[[BitXor[i,h]+1]]])+BitXor[c,L[[BitXor[i,h]+2]]];y=Mod[q,t];d=BitShiftRight[q,n]];p=p+(y*t)+d;p=Mod[p,u];z0=p;F=RandomInteger[j]+1;G=RandomInteger[n-1];L[[F]]=BitShiftLeft[1,G];x=X0;c=C0;y=Y0;d=D0;For[i=0,i<=j,i+=2,x=BitXor[x,y];c=BitXor[c,d];p=(A*BitXor[x,L[[i+1]]])+BitXor[c,L[[i+2]]];x=Mod[p,t];c=BitShiftRight[p,n];q=(B*BitXor[y,L[[BitXor[i,h]+1]]])+BitXor[c,L[[BitXor[i,h]+2]]];y=Mod[q,t];d=BitShiftRight[q,n]];p=p+(y*t)+d;p=Mod[p,u];z1=p;L[[F]]=F;z=BitXor[z0,z1];If[K[[z]]==0,R++];K[[z]]++;If[K[[z]]>S,S=K[[z]]];If[BitAnd[w,16^^FFF]==0,Print["w=",BaseForm[w,16]]]];Print["R=",R/(2.^(n+n))];Print["R_ideal=",1-Exp[-1.]];Print["population_max=",S];Print["population_max_density_log2=",Log[2.,u/S]];
The
output might take 10 minutes to generate. It looks like this (but might
slightly differ from yours, on account of the RandomInteger[] function
being nondeterministic):
w=0
16
.
.
.
w=f000
16
R=0.607376
R_ideal=0.632121
population_max=8
population_max_density_log2=13.
(The
W's are just progress indicators, and the "16"s are just Mathematica's
obnoxious way of telling us that the result is in hexadecimal.) The
important part is at the bottom: This was a 16-bit hash (because A and B
both have 8 bits), which as you can see from the W's, we've tested 2^16
times. On each test, we hashed a block of 0s against the same block
with a randomly selected bit set to 1, starting from the same random
seed in both cases. We then xor the resulting hashes. Finally, we set
the bit corresponding to this xor-of-hashes (Z) in a table (K) of 2^16
xor compensator population counts, initially all 0s.
When
the experiment is done, the fraction of nonzeros we expect to find in
the table is R, or 0.632121... As you can see, this hash performed well,
but not great, coming in at 0.61. That means that the xor compensator
is modestly biased, given a single bit flip in the simplest possible
plaintext configuration, which is a block of 0s. You could say that
we're measuring the "ground state entropy" of the hash.
As
to population_max_density_log2, that's the log2 of
((2^2N)/population_max), or log2(65536/8) in this case. This represents
the log2 rarity of the most popular xor compensator. It's subject to a
lot of noise, but it functions like a canary in a coal mine: if it's too
small, then the most popular xor compensator can be easily guessed by
examining test cases involving known seeds. For example, if it were 3,
that would mean that 1 in 8 xor compensators were the same value. 13 is
probably fine for a 16-bit hash.
In fact, what
is the
statistically expected value of the maximum population? Combinatorics is informative:
Let
Q be the probability of one particular integer in the population map
(K, in the code above) being incremented ("hit") T times, i.e. of the
corresponding xor compensator occurring T times, assuming that the xor
compensators are distributed randomly because the hash is "hard".
Let U=2^(2N), where 2N is the number of bits in the hash.
Then:
Q=((1/U)^T)*((1-(1/U))^(U-T))*(U nCr T)
because:
((1/U)^T) is the probability of being hit T times, at exactly the iteration numbers predicted.
((1-(1/U))^(U-T)) is the probability of not being hit (U-T) times, at exactly the iterations numbers predicted.
(U nCr T) is the number of possible combinations involving T hits over U attempts.
So then:
P=1-((1-Q)^U)
is the probability that at least one integer will be hit exactly T times.
because:
(1-Q) is the probability that a given integer is not hit exactly T times.
((1-Q)^U) is the probability that no integer will be hit exactly T times.
So (1-((1-Q)^U)) is the probability that at least one integer will be hit exactly T times.
If
P=0.5, then T is the expected maximum population. Since T is discrete,
we don't expect to hit P=0.5 exactly. That's OK, so long as T ends up in
a state which is "reasonably probable". T=1000 with N=8, for example,
would
not be reasonably probable. That would be bad because it
would suggest that the xor compensator could easily be guessed based on
statistical analysis involving a random set of known seeds.
So in this case, we have T=8 and U=2^16. Find P with Mathematica:
t=8;n=8;u=2^(n+n);p=(1/u)^t*(1-(1/u))^(u-t)*Binomial[u,t];p=1-((1.-p)^u);Print["p=",p];Print["p_reciprocal_log2=",-Log[2.,p]];
which produces:
p=0.449961
p_reciprocal_log2=1.15213
Let's
kick it up a notch, as Emeril Lagasse would say. Try N=9, meaning an
18-bit hash, with A=(t-17) and B=(t-47), where t=2^9 and the usual
primality requirements are satisfied:
n=9;t=2^n;A=(t-17);B=(t-47);u=2^(n+n);v=t-1;K=Array[#*0&,u];R=0;S=0;j=63;h=(j+1)/2;L=Array[#&,j+1];For[w=0,w<u,w++,X0=RandomInteger[v];C0=RandomInteger[v];Y0=RandomInteger[v];D0=RandomInteger[v];x=X0;c=C0;y=Y0;d=D0;For[i=0,i<=j,i+=2,x=BitXor[x,y];c=BitXor[c,d];p=(A*BitXor[x,L[[i+1]]])+BitXor[c,L[[i+2]]];x=Mod[p,t];c=BitShiftRight[p,n];q=(B*BitXor[y,L[[BitXor[i,h]+1]]])+BitXor[c,L[[BitXor[i,h]+2]]];y=Mod[q,t];d=BitShiftRight[q,n]];p=p+(y*t)+d;p=Mod[p,u];z0=p;F=RandomInteger[j]+1;G=RandomInteger[n-1];L[[F]]=BitShiftLeft[1,G];x=X0;c=C0;y=Y0;d=D0;For[i=0,i<=j,i+=2,x=BitXor[x,y];c=BitXor[c,d];p=(A*BitXor[x,L[[i+1]]])+BitXor[c,L[[i+2]]];x=Mod[p,t];c=BitShiftRight[p,n];q=(B*BitXor[y,L[[BitXor[i,h]+1]]])+BitXor[c,L[[BitXor[i,h]+2]]];y=Mod[q,t];d=BitShiftRight[q,n]];p=p+(y*t)+d;p=Mod[p,u];z1=p;L[[F]]=F;z=BitXor[z0,z1];If[K[[z]]==0,R++];K[[z]]++;If[K[[z]]>S,S=K[[z]]];If[BitAnd[w,16^^FFF]==0,Print["w=",BaseForm[w,16]]]];Print["R=",R/(2.^(n+n))];Print["R_ideal=",1-Exp[-1.]];Print["population_max=",S];Print["population_max_density_log2=",Log[2.,u/S]];
...and the results:
R=0.627625
R_ideal=0.632121
population_max=8
population_max_density_log2=15.
Thus
R is closer to R_ideal. This isn't an accident. In part, it comes from
the fact that we're using less biased oscillators, i.e. A and B values
which are proportionally closer to 2^N in with N=9 than N=8. You can
imagine how this would work out if the bias were 1 part in O(10^134), as
with LMD7: R should be indistinct, in practice, from R_ideal.
And what about population_max_density_log2? This time, N=9 but still T=8:
t=8;n=9;u=2^(n+n);p=(1/u)^t*(1-(1/u))^(u-t)*Binomial[u,t];p=1-((1.-p)^u);Print["p=",p];Print["p_reciprocal_log2=",-Log[2.,p]];
which produces:
p=0.908519
p_reciprocal_log2=0.138411
Which
means it's 90% likely that we'll find at least one population of 8
(i.e. T=8) in the map -- a tad high, when we're looking for something
nearer 50%. But it turns out that T=9 has a 23% chance of the same. So
T=8 is unsurprising.
What this all points to, thus far,
is that the xor compensator appears to be randomly distributed, i.e.
"mini LMD7" is hard. Extrapolation suggests, then, that LMD7 is also
hard, as measured by the same tests.
But it behooves us to pound harder on the most weakly protected bit in the block.
The
weakest bit in the block is probably bit ((2^14)-1), as measured on
average over a wide variety of seeds. This bit is weakly protected
because it's the last bit to be integrated into the B oscillator, and is
thus subject to the least number of entropizing iterations after the
fact. Additionally, while the state of the B oscillator feeds into the A
oscillator (via xor at the top of the loop), the reverse isn't true, so
B probably cannot entropize faster than A. We return to the N=8 case,
with the difference that we'll only compare a block of 0s to the same
block with bit ((2^14)-1) set ("L[[(j+1)/2]]=BitShiftLeft[1,n-1]"):
n=8;t=2^n;A=(t-46);B=(t-52);u=2^(n+n);v=t-1;K=Array[#*0&,u];R=0;S=0;j=63;h=(j+1)/2;L=Array[#*0&,j+1];For[w=0,w<u,w++,X0=RandomInteger[v];C0=RandomInteger[v];Y0=RandomInteger[v];D0=RandomInteger[v];x=X0;c=C0;y=Y0;d=D0;For[i=0,i<=j,i+=2,x=BitXor[x,y];c=BitXor[c,d];p=(A*BitXor[x,L[[i+1]]])+BitXor[c,L[[i+2]]];x=Mod[p,t];c=BitShiftRight[p,n];q=(B*BitXor[y,L[[BitXor[i,h]+1]]])+BitXor[c,L[[BitXor[i,h]+2]]];y=Mod[q,t];d=BitShiftRight[q,n]];p=p+(y*t)+d;p=Mod[p,u];z0=p;L[[(j+1)/2]]=BitShiftLeft[1,n-1];x=X0;c=C0;y=Y0;d=D0;For[i=0,i<=j,i+=2,x=BitXor[x,y];c=BitXor[c,d];p=(A*BitXor[x,L[[i+1]]])+BitXor[c,L[[i+2]]];x=Mod[p,t];c=BitShiftRight[p,n];q=(B*BitXor[y,L[[BitXor[i,h]+1]]])+BitXor[c,L[[BitXor[i,h]+2]]];y=Mod[q,t];d=BitShiftRight[q,n]];p=p+(y*t)+d;p=Mod[p,u];z1=p;L[[(j+1)/2]]=0;z=BitXor[z0,z1];If[K[[z]]==0,R++];K[[z]]++;If[K[[z]]>S,S=K[[z]]];If[BitAnd[w,16^^FFF]==0,Print["w=",BaseForm[w,16]]]];Print["R=",R/(2.^(n+n))];Print["R_ideal=",1-Exp[-1.]];Print["population_max=",S];Print["population_max_density_log2=",Log[2.,u/S]];
The results are somewhat disturbing:
R=0.521194
R_ideal=0.632121
population_max=11
population_max_density_log2=12.5406
R
is badly skewed away from R_ideal. And population_max is also elevated,
with a P value of 1 part in 6x10^(-4). So the xor compensators do not
appear randomly distributed. On the other hand, R is still large enough
to create intractability when N=512, and population_max is still small
enough to be safe, in the sense that population_max_density_log2 exceeds
N, which means that a executing a
birthday attack on
the hash is still easier than guessing the correct xor compensator,
even if the correct answer happens to be the most popular one. In other
words, the xor compensators have a density which is (exponentially)
larger than the square root of the hash space -- even in this uberweak
case.
So do things improve when N increases to 9? Let's try:
n=9;t=2^n;A=(t-17);B=(t-47);u=2^(n+n);v=t-1;K=Array[#*0&,u];R=0;S=0;j=63;h=(j+1)/2;L=Array[#*0&,j+1];For[w=0,w<u,w++,X0=RandomInteger[v];C0=RandomInteger[v];Y0=RandomInteger[v];D0=RandomInteger[v];x=X0;c=C0;y=Y0;d=D0;For[i=0,i<=j,i+=2,x=BitXor[x,y];c=BitXor[c,d];p=(A*BitXor[x,L[[i+1]]])+BitXor[c,L[[i+2]]];x=Mod[p,t];c=BitShiftRight[p,n];q=(B*BitXor[y,L[[BitXor[i,h]+1]]])+BitXor[c,L[[BitXor[i,h]+2]]];y=Mod[q,t];d=BitShiftRight[q,n]];p=p+(y*t)+d;p=Mod[p,u];z0=p;L[[(j+1)/2]]=BitShiftLeft[1,n-1];x=X0;c=C0;y=Y0;d=D0;For[i=0,i<=j,i+=2,x=BitXor[x,y];c=BitXor[c,d];p=(A*BitXor[x,L[[i+1]]])+BitXor[c,L[[i+2]]];x=Mod[p,t];c=BitShiftRight[p,n];q=(B*BitXor[y,L[[BitXor[i,h]+1]]])+BitXor[c,L[[BitXor[i,h]+2]]];y=Mod[q,t];d=BitShiftRight[q,n]];p=p+(y*t)+d;p=Mod[p,u];z1=p;L[[(j+1)/2]]=0;z=BitXor[z0,z1];If[K[[z]]==0,R++];K[[z]]++;If[K[[z]]>S,S=K[[z]]];If[BitAnd[w,16^^FFF]==0,Print["w=",BaseForm[w,16]]]];Print["R=",R/(2.^(n+n))];Print["R_ideal=",1-Exp[-1.]];Print["population_max=",S];Print["population_max_density_log2=",Log[2.,u/S]];
The results, averaged over 2^18 trials, are much more consistent with randomness:
R=0.627911
R_ideal=0.632121
population_max=8
population_max_density_log2=15.
T=8
at N=9 is, as discussed above, utterly unremarkable. So the statistics
of the system are indeed starting to whiten (sic) out.
Results
may vary, and it's probably true that, given enough computer power, we
would observe scintillating entropy on the way to N=512, but it appears
that the LMD7 xor compensator space is asymptotically random. In other
words, LMD7 would appear to be hard -- even if left unencrypted (which
is not recommended, out of an abundance of caution) -- to the extent
that its seeds are opaque. But prove me wrong if you can.