A crash course on hash algorithms
A hash is a one-way function; there is no practical way to take a hash value and work backwards to an input. The hash value has a fixed length defined by the algorithm, whereas the input can be any length. You can create a SHA-256 hash value, 256 bits long, for a single letter or for the entire works of Shakespeare.
Some hash examples using SHA-256 include:
- The ASCII letter a (lowercase):
ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb
- The ASCII letter A (uppercase):
559aead08264d5795d3909718cdd05abd49572e84fe55590eef31a88a08fdffd
- Shakespeare's The Tragedy of Titus Andronicus (entire play):
02b8d381c9e39d6189efbc9a42511bbcb2d423803bb86c28ae248e31918c3b9a
- Shakespeare's The Tragedy of Titus Andronicus but with a single word misspelled:
4487eba46b2327cfb59622a6b8984a74f1e1734285e4f8093fe242c885b4aadb
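If you'd like to check the two single-letter values yourself, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex is my own, chosen for illustration; the play itself isn't included here, so only the one-letter inputs are reproduced.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a lowercase hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# The ASCII letters are passed as single bytes (0x61 for "a", 0x41 for "A").
print(sha256_hex(b"a"))
print(sha256_hex(b"A"))
```

The two printed lines should match the values listed for lowercase a and uppercase A.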
With these examples, you can see the fundamental nature of a hash algorithm at work. The output is fixed length; in these examples, the output is 64 hexadecimal characters long. (A single hexadecimal character represents 4 bits; 256 divided by 4 yields 64 characters.) A SHA-256 hash is always 64 hexadecimal characters, no matter the length of the input, even if the length is zero! Yes, there's even a hash value for literally nothing. It's still 64 characters for massive inputs, like Shakespeare's Titus Andronicus, which runs to 1.19 million characters. When it comes to the security application of hashing, one critical feature is that changing a single character in a Shakespeare play radically changed the hash value. This is due to a principle in cryptography called the avalanche effect: a small change to the input should flip roughly half of the output bits, so the new hash looks unrelated to the old one. It's a core feature of secure algorithms.
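Both properties, the fixed output length and the avalanche effect, can be observed with a short sketch like the one below. The function names are hypothetical, and the one-million-byte input is just a stand-in for a large document, not the text of the play.

```python
import hashlib


def sha256_as_int(data: bytes) -> int:
    """Interpret the 32-byte SHA-256 digest of `data` as one big integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 256 output bits differ between two hashes."""
    return bin(sha256_as_int(a) ^ sha256_as_int(b)).count("1")


# Fixed length: empty, tiny, and large inputs all give 64 hex characters.
for sample in (b"", b"a", b"a" * 1_000_000):
    print(len(sample), "bytes in ->", len(hashlib.sha256(sample).hexdigest()), "hex characters out")

# Avalanche effect: "a" and "A" differ by a single input bit (0x61 vs 0x41),
# yet a large share of the 256 output bits change.
print(differing_bits(b"a", b"A"), "of 256 output bits differ")
```

For a well-designed hash, you'd expect the last number to land somewhere near 128, which is half of the output bits.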
Let's suppose that a bad guy has captured a hash representing my password. Thanks to the avalanche effect, he has no way of knowing, merely by hashing his guesses, whether he is getting close to the actual value. He could be a single character off and the hash would look radically different. I know what the hacker in you is thinking, though: "mathematically speaking, as long as the fixed-length one-way function accepts inputs of arbitrary length, there will always be some pair of inputs that hash to the same output." Brilliant point, and you're right: there are only 2^256 possible SHA-256 outputs but an unlimited number of possible inputs, so some inputs must share an output. This is called a collision. The primary goal of any secure hashing algorithm design is to reduce the risk of collisions. Mathematically speaking, you can't eliminate them; you can only make them so hard to find that an attacker may as well just try to guess the target input.
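To get a feel for why collisions must exist, and why output size matters, here is a toy sketch that keeps only the first 16 bits of each SHA-256 digest. This truncated hash is purely my own illustration, not anything a real system uses: with only 65,536 possible outputs, a collision turns up almost immediately, whereas nobody has publicly demonstrated one against the full 256 bits.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, bits: int) -> int:
    """Toy hash: keep only the top `bits` bits of the SHA-256 digest."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)


def find_collision(bits: int) -> tuple:
    """Hash b"0", b"1", b"2", ... until two inputs share a truncated hash."""
    seen = {}  # truncated hash value -> first input that produced it
    for i in count():
        candidate = str(i).encode("ascii")
        h = truncated_hash(candidate, bits)
        if h in seen:
            return seen[h], candidate
        seen[h] = candidate


first, second = find_collision(16)
print(first, "and", second, "both hash to", hex(truncated_hash(first, 16)))
```

Each extra bit of output doubles the number of possible hash values, so the same brute-force search that finishes instantly at 16 bits becomes hopeless long before you reach 256.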
Now, it's best to not go too deep into the rabbit hole of hashing when discussing Windows security, because in classic Microsoft form, they just had to do things their way. A Windows hash, from any point in the history of the operating system, is no ordinary hash.