How bitcoin uses hashcash

hashcash

(For background on hashcash, there is a site with FAQ, paper, code etc).

Difficulty in hashcash is very simple, it is the number of leading bits that are 0 when the SHA1 hash output is viewed in binary.

echo -n 1:20:130502:adam@cypherspace.org::+V6MTRWjWO8npEog:1Q5h | sha1 0000068826a6ed96459563ccbca90bb3e016014a and you can see 5 hex 0s = 20 bits. (And actually the next hex nibble 6 in binary is 0110, so its actually 21 bits. However difficulty is defined as the min of the leading bits and the length field included in the stamp, so the sender does not benefit from luck). The visual ability to read the bits is a design feature - hashcash appears in mail headers, and people may want to check them with sha1sum tool, or shell scripts. (Version 0 hashcash format and hash definition was not nearly so nice!)

Hashcash was expected to adjust difficulty every 6months or year, manually based on observed moore's law estimates. It was proposed that this be measured against a benchmark machine by some vague/undefined human consensus process and broadcast via signed difficulty update headers, and/or long TTL DNS TXT records.

In fact even the difficulty headers & DNS TXT records did not get implemented, and it was left to the users to adapt. This was because hashcash adoption on the server side was primarily spamassassin, and it used hashcash with a soft scoring; it was fairly forgiving - the user could put a higher difficulty to give themselves more -ve spam scoring, and adapt that over time as they felt the need. So that was more distributed, and self-organizing so I left it there.

bitcoin hashcash changes

Bitcoin has automatic inflation control which is Satoshi Nakamoto's extra innovation that helped make the leap from non-reusable proof-of-work, to virtual scarcity backed currency. Note there were some precursor ideas, discussions and implemented systems and some of them got startlingly close, but they didnt quite put it together. Anwyay that is a story for another page on some 1999 cypherpunks list posts on distributed ecash.

Bitcoin aims for a smoother inflation control and adjusts inflation every 2 weeks automatically.

Bitcoin makes three changes in the way it uses hashcash:

hash update bitcoin uses SHA256 rather than SHA1 (SHA256 didnt exist back in 1997, and its now conservative to use SHA256 in new systems)
extra security bitcoin uses double SHA256 ie SHA256( SHA256( x ) ) to effectively double the rounds, for extra security margin
fractional bits hashcash difficulty can only double or halve; bitcoin needs more fine grained control, because inflation is adjusted more frequently, so it defines fractional difficulty.

Fractional bits explained: instead of looking for k 0-bits, the bitcoin fractional bit extension looks for a hash that is less than 1/2^k (when viewed as a fixed point number). So for example 20.5-bits challenge would be to find a hash when (viewed as a fixed-point hexadecimal) that is less than .00000B504h (to 4 sf). When difficulty k is an integer this is the same as the base hashcash challenge.

some bit-twiddling

That would have been the end of the story but there are two or three implementation artefacts in the way bitcoin expresses difficulty. Implementation artefact one: the number of bits of precision used for bitcoins fractional difficulty varies between 23-bits and 16-bits depending on how close to a multiple of 8-bits the difficulty is (more detail on why this is below). While it seems odd, it is fully safe even with 11-bit difficulty precision (you want difficulty bit precision >= log₂( interval ) and interval = 2016 with bitcoins parameters -- the number of 10 minute coinbases within a 2 week period).

Bitcoin difficulty isnt an absolute number of hashes, but rather a relative difficulty, the number of times harder than a defined base difficulty. The defined base difficulty is unfortunately not a power of 2¹⁰ nor an even quite a power of 2 though it is very close (0.0015% discrepancy). The reason for this leads on to the next implementation detail.

Deeper bit-twiddling (here's why base difficulty is not quite a power of 2): bitcoin for compactness stores the difficulty (which is a 256-bit number mostly full of 0s) as a 32-bit custom floating point format, 8-bit exponent and 23-bit mantissa (plus an unused sign bit). And actually can only hold multiples of 8 (the exponent is multipled by 8 before use). Consequently (back to the variable precision part) the mantissa varies in size between 23-bits and 16-bits (1-bit bigger than 23-bits and the mantissa is shifted right 8-bits, 1-bit smaller than 16-bits and the mantissa is shifted left 8-bits). It would have been more space efficient, and allowed constant 23/24-bit precision if the 8-bit exponent had been in bits not bytes (not multipled by 8 before use). Anyway the minimum difficulty is 16-bit mantissa FFFFh and exponent 4 (4 * 8-bits = 32-bits due to the byte interpretation on the exponent value) so the base difficulty is .FFFFh/2^32 = .00000000FFFF000 so that is a little more than 32-bits of difficulty at -log₂(.FFFFh/2^32)=32.00002201394726 (and it doesnt look better in hex 20.000171552EFD6). (Technically bitcoin treats the target and hash for less-than testing as 256 bit integers, and not as fixed point numbers. Consequently the exponent of the minimum difficulty is actually encoded as 33-4 = 29 = 1Dh, not 4; I just found the fixed point view of things more intuitive when I was reading about and figuring this stuff out.)

There is a claim, apparently, that mining pools set share difficulty to an integral number of bits (the hashcash challenge), to make them faster to verify. However that doesnt seem quite right to me because the number of instructions is the same (intel CMP, JL, CMP, JL vs CMP, JL, AND, JE or if there is no split over a word boundary CMP, JL in both cases). And even if the instructions differ in a cycle or two, it hardly matters for a simple test that you only do 1 in 2³² times. Though I do prefer 2³² as a base difficulty as its simpler.

simplifying

Anyway the discrepency and complicated story is kind of inconvenient, but the best thing to do is ignore it as the discrepancy is very small, this by treating bitcoin difficulty as if base difficulty is 32-bits exactly (2^32). Then current difficulty can be converted to log₂ difficulty by bits = log₂( difficulty ) + 32. And bitcoin difficulty can be converted into expected hashes by multiplying by 4 and treating as Gigahashes expected hashes = difficulty * 4 GH (all within 0.0015%).

The discrepancy calculation:

Comments, html bugs to me (Adam Back) at <adam@cypherspace.org>