Friday, March 16, 2012

LMD2: Breaking the Megabyte Barrier

Perhaps this sounds like the title to an article in an issue of Byte Magazine from 1988. But I'm not talking about memory capacity. The focus of this post is LMD2, an algorithm closely related to LMD, but with the capacity to offer comparable 2-bit error protection strength over a full megabyte, instead of slightly less than that. This is significant because it permits the use of megabyte block size for storage or transmission protocols, without having to worry about 1-bit or 2-bit errors (except in the negligible case where an error bit in the message compensates for an error bit in the LMD2, which appears to have odds of 1:2^64 in the long message limit).

The LMD2 algo is identical to LMD, with the exception of the seed values and the multiplier. Specifically:


ITERATE_NO_ZERO_CHECK means:

p = 0xFE001000 * x + c
x <- p MOD 2^32
c <- p / 2^32 (Shift p right by 32.)

(x0, c0) = (
0x129E5CFA, 0xC97A34B3)

Here are the first few values:

(x1,c1) = (0xBB49D4B3, 0x1279216A)
(x2, c2) = (0x49C4516A, 0xB9D34CBE)
(x3, c3) = (0x2AE9ECBE, 0x4930CD64)

which implies that:

(LMD2 of null) = 0x12AB02173D8849B8


You might reasonably wonder why we multiply by 0xFE001000 on each every iteration. This value is chosen so that (0xFE001000 * 2^32 - 1) and (0xFE001000 * 2^31 - 1) are both prime. This is the same constraint satisfied by 0x7FFFFDCD in LMD. This requirement comes from the multiply-with-carry generator previously discussed.

The cycle length is thus (0xFE001000 * 2^31 - 1), or 9,151,323,238,909,870,079 -- about double that of LMD. This is just under 2^63, which implies that a complete cycle of 32-bit "x" outputs would exceed the size of a modern 64-bit memory space. The reason LMD was limited to half as much was some doubt on my part as to whether or not it was safe to use a coefficient with MSB 31, as opposed to MSB 30. That doubt has now been removed.

But there's something else which is special about 0xFE001000: it's easier to multiply by, which may accelerate the process on certain microprocessor architectures:


0xFE001000 * x = (x * 2^32) + (x * 2^12) - (x * 2^25)


Apart from the coefficient, LMD2 has a longer unique shiftoid sequence length. Specifically, the first 263,837 x outputs (excluding the seeds themselves) constitute unique shiftoids. Because each "x" is 32 bits, this implies error coverage for up to 1,055,348 bytes -- enough to cover a 1MiB data block plus 6772 bytes of metadata.

The noise quality appears to be similar to LMD.

All of these features suggest that LMD2 should replace LMD in practice.

Sorry again for the incorrigible font problems. Blogger sucks, but it's free.

No comments:

Post a Comment