Modern Big Data Processing with Hadoop

Hashing

This is also a cryptography-based technique, in which the original data is converted to an irreversible form. In mathematical form, a hash function H maps the original data to a fixed-size hash value: hash = H(data).

Here, unlike in the case of encryption, we cannot use the output to discover what the input is.

Let's see a few examples to understand this better:

Input      Output                                                             Method
10-point   7d862a9dc7b743737e39dd0ea3522e9f                                   MD5
10th       8d9407b7f819b7f25b9cfab0fe20d5b3                                   MD5
10-point   c10154e1bdb6ea88e5c424ee63185d2c1541efe1bc3d4656a4c3c99122ba9256   SHA256
10th       5b6e8e1fcd052d6a73f3f0f99ced4bd54b5b22fd4f13892eaa3013ca65f4e2b5   SHA256

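We can see that the output size depends on the hashing algorithm used: MD5 produces a 128-bit digest (32 hexadecimal characters), while SHA256 produces a 256-bit digest (64 hexadecimal characters). Another thing to note is that a given hash function produces the same output size irrespective of the input size.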
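To check the fixed-output-size property for yourself, here is a minimal sketch using Python's standard hashlib module. The helper name hex_digest and the choice of UTF-8 encoding are assumptions made for illustration; they are not taken from the book.

```python
import hashlib


def hex_digest(text: str, algorithm: str) -> str:
    """Return the hexadecimal digest of `text` under the named hash algorithm."""
    # Hash functions operate on bytes, so encode the string first (UTF-8 assumed).
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Two short inputs from the table, plus a much longer one for comparison.
    for text in ("10-point", "10th", "a much longer input " * 100):
        for algorithm in ("md5", "sha256"):
            digest = hex_digest(text, algorithm)
            # Each hexadecimal character encodes 4 bits of the digest.
            print(f"{algorithm:7} {len(digest) * 4:3} bits  {digest}  <- {text[:10]!r}")
```

Running the script prints one line per input and algorithm. The digest length should stay at 128 bits for MD5 and 256 bits for SHA256 regardless of how long the input is, and hashing the same input twice should give the same digest.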