Pratt Primality Certificates

The previous post implicitly asserted that J = 8675309 is a prime number. Suppose you wanted proof that this number is prime.

You could get some evidence that J is probably prime by demonstrating that

2^(J-1) = 1 mod J.

You could do this in Python by running the following [1].

    >>> J = 8675309
    >>> assert( pow(2, J-1, J) == 1 )

This shows J is a probable prime to base 2.

If you want more evidence, you could also show J is a probable prime to base 3.

    >>> assert( pow(3, J-1, J) == 1 )

But no matter how many bases you try, you won’t have proof that J is prime, only evidence. There are pseudoprimes, (rare) composite numbers that satisfy the necessary-but-not-quite-sufficient conditions of Fermat’s primality test.

Primality certificates

A primality certificate is a proof that a number is prime. To be practical, a certificate must be persuasive and efficient to compute.

We could show that J is not divisible by any integer from 2 up to √J. That would actually be practical because J is not that large.

    >>> for n in range(2, int(J**0.5)+1):
    ...     assert(J % n > 0)

But we’d like to use J to illustrate a method that scales to much larger numbers than J.

Pratt certificates

Pratt primality certificates are based on a theorem by Lucas [2] that says a number n is prime if there exists a number a such that two conditions hold:

a^(n-1) = 1 mod n,

and

a^((n-1)/p) ≠ 1 mod n

for all primes p that divide n-1.

How do you find a? See this post.
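In short, if n really is prime then any primitive root of n works, and trying small candidates in order usually succeeds quickly. Here's a minimal brute-force sketch (my own illustration, not from the linked post):

    def find_witness(n, factors):
        # Search for a base a satisfying both of Lucas's conditions,
        # given the distinct prime factors of n - 1.
        for a in range(2, n):
            if pow(a, n - 1, n) != 1:
                return None   # Fermat's condition fails, so n is composite
            if all(pow(a, (n - 1) // p, n) != 1 for p in factors):
                return a

    # e.g. find_witness(8675309, [2, 2168827]) returns 2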

Example

To find a Pratt certificate for J, we have to factor J-1. I assert that

J-1 = 8675308 = 2² × 2168827

and that 2168827 is prime. Here’s verification that Lucas’ theorem holds:

    >>> assert( pow(2, (J-1)//2, J) != 1 )
    >>> assert( pow(2, (J-1)//2168827, J) != 1 )

What’s that you say? You’re not convinced that 2168827 is prime? Well then, we’ll come up with a Pratt certificate for 2168827.

Pratt certificates are generally recursive. To prove that p is prime, we have to factor p-1, then prove all the claimed prime factors of p-1 are prime, etc. The recursion ends when it gets down to some set of accepted primes.

Now I assert that

    2168827 - 1 = 2168826 = 2 × 3 × 11 × 17 × 1933

and that all these numbers are prime. I’ll assume you’re OK with that, except you’re skeptical that 1933 is prime.

The following code is proof that 2168827 is prime, assuming 1933 is prime.

    >>> m = 2168827
    >>> assert( pow(3, m-1, m) == 1 )
    >>> for p in [2, 3, 11, 17, 1933]:
    ...     assert( pow(3, (m-1)//p, m) != 1 )

Finally, we’ll prove that 1933 is prime.

You can verify that

    1933 - 1 = 1932 = 2² × 3 × 7 × 23

and I assume you’re convinced that each of these factors is prime.

    >>> m = 1933
    >>> assert( pow(5, m-1, m) == 1 )
    >>> for p in [2, 3, 7, 23]:
    ...     assert( pow(5, (m-1)//p, m) != 1 )

Pratt certificates can be written in a compact form that verification software can read. Here I’ve made the process more chatty just to illustrate the parts.
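To make the parts concrete, here is a minimal sketch of a recursive verifier in Python. The certificate format, a dictionary mapping each prime p to a witness and the distinct prime factors of p − 1, is my own invention for illustration, not a standard format.

    cert = {
        8675309: (2, [2, 2168827]),
        2168827: (3, [2, 3, 11, 17, 1933]),
        1933:    (5, [2, 3, 7, 23]),
    }

    def verify(p, cert, accepted=frozenset([2, 3, 7, 11, 17, 23])):
        if p in accepted:
            return True
        a, primes = cert[p]
        if pow(a, p - 1, p) != 1:              # Fermat condition fails
            return False
        m = p - 1
        for q in primes:
            if pow(a, (p - 1) // q, p) == 1:   # Lucas condition fails
                return False
            if not verify(q, cert, accepted):  # recurse on each factor
                return False
            while m % q == 0:                  # divide q out of p - 1
                m //= q
        return m == 1                          # the factor list was complete

    assert verify(8675309, cert)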

Update: Here’s a visualization of the process above, drawing arrows from each prime p to the prime factors of p-1.

In this post we’ve agreed to recognize 2, 3, 7, 11, 17, and 23 as primes. But in principle you only have to assume 2 is prime: the recursion could continue until every branch ends at 2. That would make a software implementation more elegant, but it would make the example tedious for human consumption.

Efficiency

A primality certificate does not have to be efficient to produce, though of course that would be nice. It has to be efficient to verify. You could imagine that the prime number salesman has more compute power available than the prime number customer. In the example above, I used a computer to generate the Pratt certificate, but it wouldn’t be unreasonable to verify the certificate by hand.

The brute force certificate above, trying all divisors up to √p, takes on the order of √p calculations to verify. A Pratt certificate, however, takes about

4 log₂ p

calculations. So verifying a 10-digit prime requires on the order of 100 calculations rather than on the order of 100,000 calculations.

Atkin-Goldwasser-Kilian-Morain certificates

Producing Pratt certificates for very large numbers is difficult. Other certificate methods, like Atkin-Goldwasser-Kilian-Morain certificates, scale up better, though they are more complicated to describe because they involve elliptic curves.

Just as Pratt took a characterization of primes by Lucas and turned it into a practical certification method, Atkin and Morain took a characterization of primes by Goldwasser and Kilian, one involving elliptic curves, and turned it into an efficient certification method.

These certificates have the same recursive nature as Pratt certificates: proving that a number is prime requires proving that another (smaller) number is prime.

Update: More on elliptic curve primality proofs.

***

[1] This is a more efficient calculation than it seems. It can be done quickly using fast exponentiation. Note that it is not necessary to compute 2^(J-1) per se; we can carry out every intermediate calculation mod J.

[2] Lucas was French, and so the final s in his name is silent.

The Orange Book

I was spelunking around in Unicode and saw that there’s an emoji for orange book, U+1F4D9.
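In Python, for instance, you can produce the character from its code point:

    >>> print("\U0001F4D9")
    📙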

[Image: orange book emoji as rendered on Linux]

As is often the case, the emoji renders differently in different contexts. The image above is from my Linux desktop and the image below is from my MacBook. I tried creating an image on my Windows box, but it didn’t have a glyph for this character.

[Image: orange book emoji as rendered by Apple]

When I saw “orange book” my first thought was the US Department of Defense publication Trusted Computer System Evaluation Criteria (TCSEC), aka DoDD 5200.28-STD, commonly known in security circles as the Orange Book. For a split second I thought “Is this book so famous that there’s an emoji for it?!” But of course that’s not the case.

There are emoji for several colors of books, and there are several books in the DOD “Rainbow Series” publications, such as the Green Book for password management. (There’s also an emoji for green book: U+1F4D7.)

TCSEC

So what is The Orange Book? Here’s a photo of the cover.

[Image: cover of the Department of Defense Trusted Computer System Evaluation Criteria]

Here’s how Bruce Schneier describes the book in Applied Cryptography.

The NCSC publishes the infamous “Orange Book.” Its actual title is the Department of Defense Trusted Computer System Evaluation Criteria, but that’s a mouthful to say and the book has an orange cover. The Orange Book attempts to define security requirements, gives computer manufacturers an objective way to measure the security of their systems, and guides them as to what to build into their secure products. It focuses on computer security and doesn’t really say a lot about cryptography.

The Orange Book is now deprecated in favor of The Common Criteria for Information Technology Security Evaluation, ISO/IEC 15408.


Repunits: primes and passwords

A repunit is a number whose base 10 representation consists entirely of 1s. The number consisting of n 1s is denoted R_n.

Repunit primes

A repunit prime is, unsurprisingly, a repunit number which is prime. The most obvious example is R_2 = 11. Until recently the repunit numbers confirmed to be prime were R_n for n = 2, 19, 23, 317, 1031. Now the case for n = 49081 has been confirmed.

R_49081 = (10^49081 − 1)/9 = 111…1 (49,081 ones)
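You can check the smaller cases yourself. Here's a sketch using sympy's isprime, an assumption on my part; any primality test will do for these sizes, though confirming R_49081 took far more specialized effort.

    >>> from sympy import isprime
    >>> def repunit(n): return (10**n - 1) // 9
    ...
    >>> all(isprime(repunit(n)) for n in [2, 19, 23, 317])   # n = 1031 works too, but slowly
    True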

Here is the announcement. The date posted at the top of the page is from March this year, but I believe the announcement is new. Maybe the author edited an old page and didn’t update the shown date.

Repunit passwords

Incidentally, I noticed a lot of repunits when I wrote about bad passwords a few days ago. That post explored a list of commonly used but broken passwords. This is the list of passwords that password cracking software will try first. The numbers R_n are part of the list for the following values of n:

1–45, 47–49, 51, 53–54, 57–60, 62, 67, 70, 72, 77, 82, 84, 147

So 46 is the smallest value of n such that R_n is not on the list. I would not recommend using R_46 as a password, but surprisingly there are a lot of worse choices.

The bad password file is sorted in terms of popularity, and you might expect repunits to appear in the file in order, i.e. shorter sequences first. That is sorta true overall. But you can see streaks in the plot below showing multiple runs where longer passwords are more common than shorter passwords.

Exploring bad passwords

If your password is in the file rockyou.txt then it’s a bad password. Password cracking software will find it instantly. (Use long, randomly generated passwords; staying off the list of worst passwords is necessary but not sufficient for security.)

The rockyou.txt file currently contains 14,344,394 bad passwords. I poked around in the file and this post reports some things I found.

To make things more interesting, I made myself a rule that I could only use command line utilities.

Pure numeric passwords

I was curious how many of these passwords consisted only of digits so I ran the following.

    grep -P '^\d+$' rockyou.txt | wc -l

This says 2,346,744 of the passwords only contain digits, about 1 in 6.

Digit distribution

I made a file of digits appearing in the passwords

    grep -o -P '\d' rockyou.txt > digits

and looked at the frequency of digits.

    for i in 0 1 2 3 4 5 6 7 8 9
    do 
        grep -c $i digits
    done

This is what I got:

    5740291
    6734380
    5237479
    3767584
    3391342
    3355180
    3118364
    3100596
    3567258
    3855490

The digits are distributed more evenly than I would have expected. 1’s are more common than other digits, but only about twice as common as the least common digits.

Longest bad passwords

How long is the longest bad password? The command

    wc -L rockyou.txt

shows that one line in the file is 285 characters long. What is this password? The command

    grep -P '.{285}' rockyou.txt

shows that it’s some HTML code. Nice try whoever thought of that, but you’ve been pwned.

A similar search for all-digit passwords shows that the longest numeric passwords are 255 digits long. One of these is a string of 255 zeros.
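Here's the same search done in Python rather than at the command line (a sketch; I decode as latin-1 because rockyou.txt contains bytes that aren't valid UTF-8):

    >>> import re
    >>> numeric = [w.strip() for w in open('rockyou.txt', encoding='latin-1')
    ...            if re.fullmatch('[0-9]+', w.strip())]
    >>> max(len(w) for w in numeric)
    255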

Dictionary words

A common bit of advice is to not choose passwords that can be found in a dictionary. That’s good advice as far as it goes, but it doesn’t go very far.

I used the comm utility to see how many bad passwords are not in the dictionary by running

    comm -23 sorted dict | wc -l

and the answer was 14,310,684. Nearly all the bad passwords are not in a dictionary!

(Here sorted is a sorted version of the rockyou.txt file; I believe the file is initially sorted by popularity, worst passwords first. The comm utility complained that my system dictionary isn’t sorted, which I found odd, but I sorted it to make comm happy and dict is the sorted file.)

Curiously, the command

    comm -13 sorted dict | wc -l

shows there are 70,624 words in the dictionary (specifically, the american-english file on my Linux box) that are not on the bad password list.

Smallest ‘good’ numeric password

What is the smallest number not in the list of pure numeric passwords? The following command strips leading zeros from purely numeric passwords, sorts the results as numbers, removes duplicates, and stores the results in a file called nums.

    grep -P '^\d+$' rockyou.txt | sed 's/^0\+//' | sort -n | uniq > nums

The file nums begins with a blank line, which I removed with sed.

    sed -i 1d nums

Next I used awk to print instances where the line number does not match the line in the file nums.

    awk '{if (NR-$0 < 0) print $0 }' nums | less

The first number this prints is 61. This means that the first line is 1, the second line is 2, and so on, but the 60th line is 61. That means 60 is missing. The file rockyou.txt does not contain 60. You can verify this: the command

    grep '^60$' rockyou.txt

returns nothing. 60 is the smallest number not in the bad password file. There are passwords that contain ’60’ as a substring, but just 60 as a complete password is not in the file.
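Here's a cross-check of the same result in Python (again a sketch, with latin-1 decoding for the same reason as above):

    >>> import re
    >>> nums = {int(w) for w in open('rockyou.txt', encoding='latin-1')
    ...         if re.fullmatch('[0-9]+', w.strip())}
    >>> n = 1
    >>> while n in nums: n += 1
    ...
    >>> n
    60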


More on attacking combination locks

A couple weeks ago I wrote about how De Bruijn sequences can be used to attack locks where there is no “enter” key, i.e. the lock will open once the right symbols have been entered.

Here’s a variation on this theme: what about locks that let you press more than one button at a time? [1]

[Image: lock that lets you push more than one button at a time]

You could just treat this as if it were a keypad with more buttons. For example, suppose you can push either one or two buttons at a time on the lock pictured above. Then you could treat this as a lock with 15 buttons: the five actual buttons and 10 virtual buttons corresponding to the ten ways you could choose 2 buttons out of 5 to press at the same time.
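The count of 15 is easy to verify with itertools, treating each allowed chord of buttons as one virtual symbol (a sketch; the button labels are hypothetical):

    >>> from itertools import combinations
    >>> buttons = '12345'
    >>> symbols = [''.join(c) for r in (1, 2) for c in combinations(buttons, r)]
    >>> len(symbols)
    15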

Maybe a lock would let you press even more buttons at once. I don’t think I’ve ever used a combination that required more than two buttons at once, but in principle you could push up to five buttons at once in the photo above, for a total of 31 possible button combinations. And since 31 is prime, you could use the algorithm here to generate the corresponding De Bruijn sequence.

If you know how long the password is, you can try the De Bruijn sequence for passwords of that length. But what if you don’t know the length of the password a priori? You could try the De Bruijn sequence for passwords of length 1, then length 2, then length 3, etc. But is there a more efficient way?

If there are k button combinations and the password has length n, then the optimal solution is to start entering a De Bruijn sequence B(k, n) until the lock opens. If you know the password has length no more than 4, then you could try a B(k, 4) sequence, and if the password is actually shorter than 4, say length 3, you’ll still open it.

But what if you’ve tried a B(k, 4) sequence and it turns out the password has length 5? You could do better than starting over with a B(k, 5) sequence, because some of the substrings in that B(k, 5) sequence will have already been tried. But how could you do this systematically? If you don’t know the length of the password, how could you do better than trying B(k, 1), then B(k, 2), then B(k, 3) etc.?


[1] For this post, I’m assuming the lock will open as soon as you enter the right sequence of buttons. For example, if the pass code is 345 and you enter 12345 the lock will open. I don’t know whether these locks work that way. Maybe you have to turn the handle, which would effectively act as an enter key. But maybe there’s a way to listen to the lock so that you’ll know when the combination has been entered, before you twist the handle.

Update: According to the first comment below, locks of the kind featured in the photo only allow each button to be used once. That puts severe limits on the number of possible combinations, and the approach outlined here would be unnecessary. However, the post brings up more general questions that could be useful in another setting.

Cracking pass codes with De Bruijn sequences

[Image: keypad]

Suppose you have a keypad that will unlock a door as soon as it sees a specified sequence of four digits. There’s no “enter” key to mark the end of a four-digit sequence, so the four digits could come at any time, though they have to be consecutive. So, for example, if the pass code is 9235 and you start entering 1139235…, the lock will open as soon as you enter the 5.

How long would it take to attack such a lock by brute force? There are 10^4 possible 4-digit codes, so you could enter

000000010002…99989999

until the lock opens, but there’s a more efficient way. It’s still brute force, but not quite as brute.

The sequence above could be made more efficient. For example, it tests for codes 0000 and 0001 by entering both full codes. But entering 00001 would test for both. This takes advantage of the fact that the code could start anywhere, not just at every fourth position.

De Bruijn sequences

The optimal solution to this problem uses a De Bruijn sequence B(k, n), the shortest sequence of symbols from an alphabet of size k that contains every possible subsequence of length n. In the example above, n = 4 and k = 10. A sequence in B(k, n) has length k^n. Using a De Bruijn sequence, you could break the lock above by entering at most 10,000 digits rather than 40,000 as in the most naive approach. (That’s almost true. We’ve left out a detail that we’ll get to shortly.)

Update: See this post on generating De Bruijn sequences.

Binary example

Before we go any further, let’s look at a couple examples of De Bruijn sequences. First, let’s use k = 2 and n = 3. Then

00010111

is a B(2,3) sequence. You can see that this contains 000, 001, 010, and 011. But what about 100? Here’s the detail we left out: In order to contain every possible subsequence, you have to consider the De Bruijn sequence as a cycle. So if we use the last bit and then start over with the first two bits, now we have 100. We also have to wrap around to find 110.
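If you'd like to generate such sequences yourself, here's a compact Python sketch of one standard construction, which concatenates Lyndon words in lexicographic order:

    def de_bruijn(k, n):
        # Concatenate, in lexicographic order, the Lyndon words over
        # the alphabet 0..k-1 whose lengths divide n.
        a = [0] * (n + 1)
        seq = []
        def db(t, p):
            if t > n:
                if n % p == 0:
                    seq.extend(a[1:p+1])
            else:
                a[t] = a[t-p]
                db(t + 1, p)
                for j in range(a[t-p] + 1, k):
                    a[t] = j
                    db(t + 1, t)
        db(1, 1)
        return ''.join(str(d) for d in seq)

For example, de_bruijn(2, 3) returns '00010111', the sequence above, and mapping the digits 0–3 to the letters ACGT in de_bruijn(4, 3) gives exactly the DNA sequence in the next example.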

DNA example

Next, here’s an example with k = 4 and n = 3. The following is a De Bruijn sequence covering all triples of the DNA bases A, C, G, T.

AAACAAGAATACCACGACTAGCAGGAGTATCATGATTCCCGCCTCGGCGTCTGCTTGGGTGTTT

You could specify any triple by giving its starting location in the sequence: AAA starts in position 1, AAC in position 2, AAG in position 4, etc. Note that TTA starts in position 63 and wraps around. TAA starts in position 64 and also wraps around.

De Bruijn sequences are as short as possible for the problem they solve. For example, the sequence above has 64 characters. Since each of the 64 possible triples corresponds to a starting location, the sequence could not have any fewer than 64 characters.

Brute force with and without an enter key

The De Bruijn sequence has k^n symbols, but to attack a pass code of length n on an alphabet of k symbols we might have to enter k^n + n − 1 symbols because of wrapping the sequence back to the beginning. Using De Bruijn sequences cuts the work of a brute force attack by a factor of about n; it would be exactly n if it weren’t for the need to possibly wrap around and reuse the first few symbols.

In our keypad example, an attack on 4-digit pass codes might need to enter as many as 10,003 digits. If someone added an “enter” key to the keypad and required the user to enter exactly four digits at a time, this would increase the effort of a brute force attack by a factor of approximately 4, from 10,003 keys to 40,000 keys.

But this requires the user to enter five keys rather than four, the last one being the enter key. If the designer increased the pass code length to five digits that could occur at any time, then a brute force attack using De Bruijn sequences would require up to 100,004 keys. Increasing the pass code length by one increases the difficulty of a brute force attack more than requiring a terminator key would.

This is true in general when the alphabet size k is larger than the pass code length n. Suppose you have an alphabet of size k and are using pass codes of length n with no terminator. Requiring a terminator multiplies the difficulty of a brute force attack by n, but requiring one more character in pass codes multiplies the difficulty by k [1].

Increasing the alphabet size also increases security. So instead of using #, for example, as an enter key, a designer could use it as a possible pass code symbol. It could still appear at the end of a pass code, but it could also appear anywhere else, increasing the number of possible pass codes.


[1] To be precise, a brute force attack on keys of length n using De Bruijn sequences takes k^n + n − 1 symbols. Adding a terminator changes the brute force difficulty to n k^n. Requiring one more symbol in a pass code instead changes it to k^(n+1) + n.

So roughly n k^n versus k k^n. If k > n, multiplying by k makes a bigger increase than multiplying by n.

Making public keys factorable with Rowhammer

The security of RSA encryption depends on the fact that the product of two large primes is difficult to factor. So if p and q are large primes, say 2048 bits each, then you can publish n = pq with little fear that someone can factor n to recover p and q.

But if you can change n by a tiny amount, you may make it much easier to factor. The Rowhammer attack does this by causing DRAM memory to flip bits. Note that we’re not talking about breaking someone’s encryption in the usual sense. We’re talking about secretly changing their encryption to a system we can break.

To illustrate on a small scale what a difference changing one bit can make, let p = 251 and q = 643.  Then n = pq = 161393. If we flip the last bit of n we get m = 161392. Although n is hard to factor by hand because it has no small factors, m is easy to factor, and in fact

161392 = 2⁴ × 7 × 11 × 131.
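Here's a small-scale version of that experiment sketched in Python with sympy (the post used Mathematica; sympy is my substitution):

    >>> from sympy import factorint
    >>> n = 251 * 643
    >>> for i in range(n.bit_length()):
    ...     m = n ^ (1 << i)     # flip bit i of n
    ...     print(i, m, factorint(m))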

For a larger example, I generated two 100-bit random primes in Mathematica

p = 1078376712338123201911958185123
q = 1126171711601272883728179081277

and was able to have it factor n = pq in about 100 seconds. But Mathematica was able to factor n-1 in a third of a second.

So far we have looked at flipping the least significant bit. But Rowhammer isn’t that precise. It might flip some other bit.

If you flip any bit of a product of two large primes, you’re likely to get an easier factoring problem, though the difficulty depends on the number you start with and which bit you flip. To illustrate this, I flipped each of the bits one at a time and measured how long it took to factor the result.

The median time to factor n with one bit flipped was 0.4 seconds. Here’s a plot of the factoring times as a function of which bit was flipped.

[Image: plot of factoring times for the corrupted product of primes]

The plot shows about 80% of the data. Twenty percent of the time the value was above 11 seconds, and the maximum value was 74 seconds. So in every case flipping one bit made the factorization easier, usually quite a lot easier, but only a little easier in the worst case.

To verify that the results above were typical, I did a second experiment. This time I generated a sequence of pairs of random 100-bit primes. I factored their product, then factored the product with a randomly chosen bit flipped. Here are the factoring times in seconds.

    |----------+---------|
    | Original | Flipped |
    |----------+---------|
    |  117.563 |   3.828 |
    |  108.672 |   4.875 |
    |   99.641 |   0.422 |
    |  103.031 |   0.000 |
    |   99.188 |   0.000 |
    |  102.453 |   0.234 |
    |   79.594 |   0.094 |
    |   91.031 |   0.875 |
    |   64.313 |   0.000 |
    |   95.719 |   6.500 |
    |  131.125 |   0.375 |
    |   97.219 |   0.000 |
    |   98.828 |   0.203 |
    |----------+---------|

By the way, we started this post by talking about maliciously flipping a bit. The same thing can happen without a planned attack if memory is hit by a cosmic ray.


Microsoft replacing SHA-1

According to this article, Microsoft is patching Windows 7 and Windows Server 2008 to verify SHA-2 hashes of updates. These older versions of Windows have been using SHA-1, while newer versions are already using SHA-2.

This is a good move, but unnecessary. Here’s what I mean by that. The update was likely unnecessary for reasons I’ll explain below, but it was easy to do, and it increased consistency across Microsoft’s product line. It’s also good PR.

What are SHA-1 and SHA-2?

Let’s back up a bit. SHA-1 and SHA-2 are secure hash functions [1]. They take a file, in this case a Microsoft software update, and return a relatively small number, small relative to the original file size. In the case of SHA-1, the result is 160 bits (20 bytes).  They’re designed so that if a file is changed, the function value is nearly certain to change. That is, it’s extremely unlikely that a change to the file would not result in a change to the hash value.
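Here's a small illustration with Python's hashlib; the input string is just a stand-in for an update file.

    >>> import hashlib
    >>> data = b'contents of a software update'
    >>> len(hashlib.sha1(data).hexdigest())    # 160 bits = 40 hex characters
    40
    >>> len(hashlib.sha256(data).hexdigest())  # 256 bits = 64 hex characters
    64
    >>> hashlib.sha256(data).hexdigest() == hashlib.sha256(data + b'!').hexdigest()
    False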

The concern isn’t accidental changes. The probability of accidentally producing two files with the same hash function value is tiny as I show here.

The concern is a clever attacker who could modify the software update in such a way that the hash function remains unchanged, bypassing the hash as a security measure. That would be harder to do with SHA-2 than with SHA-1, hence Microsoft’s decision years ago to move to SHA-2 for new versions of the operating system, and its recent decision to make the change retroactive.

How hard is it to produce collisions?

By a collision we mean two files that hash to the same value. It’s obvious from the pigeon hole principle [2] that collisions are possible, but how hard are they to produce deliberately?

Google demonstrated two years ago that it could produce two PDF files with the same SHA-1 hash value. But doing so required over 6,500 years of CPU time running in parallel [3]. Also, Google started with a file designed to make collisions possible. According to their announcement,

We started by creating a PDF prefix specifically crafted to allow us to generate two documents with arbitrary distinct visual contents, but that would hash to the same SHA-1 digest.

It would be harder to start with a specified input, such as a software update file and generate a collision. It would be harder still to generate a collision that had some desired behavior.

According to this page, it’s known how to tamper with two files simultaneously so that they will have the same SHA-1 hash values. This is what Google did, at the cost of thousands of CPU years. But so far, nobody has been able to start with a given file and create another file with the same SHA-1 value. (Update: Now they have!)

As I said at the beginning, it made sense for Microsoft to decide to move from SHA-1 to SHA-2 because the cost of doing so was small. But the use of SHA-1 hash codes is probably not the biggest security risk in Windows 7.


[1] SHA-1 is a hash function, but SHA-2 is actually a family of hash functions: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256. All are believed to provide at least 112 bits of security, while SHA-1 provides less than 63.

The SHA-x functions output x bits. The SHA-x/y functions use x bits of internal state and output y bits. To be consistent with this naming convention, SHA-1 should be called SHA-160.

[2] The pigeon hole principle says that if you put more than n things into n boxes, one of the boxes has to have more than one thing. If you hash files of more than n bits to n-bit numbers, at least two files have to go to the same value.

[3] If you were to rent this much CPU time in the cloud at 5 cents per CPU hour, it would cost about $2,800,000. If the only obstacle were the cost of computing resources, someone might be willing to pay that to tamper with a Microsoft update. (Update: with a new algorithm announced in January 2020, the estimated cost has dropped to $45,000.)

Hash function menagerie

Here’s an oversimplified survey of cryptographic hash functions: Everyone used to use MD5, now they use some variation on SHA.

There’s some truth to that. MD5 was very popular, and remains popular years after it was proven insecure. And now variations on SHA like SHA1 and SHA256 are commonly used. But there are a lot more cryptographic hash functions in common use.

Python support

If Python’s hashlib is a reliable guide, the most common hashing algorithms are

  • MD5
  • SHA1
  • SHA224
  • SHA256
  • SHA384
  • SHA512

because these are the six algorithms guaranteed to be supported on every platform, as listed in the algorithms_guaranteed attribute in hashlib.

The algorithms_available attribute in hashlib includes additional algorithms available in a particular installation. On the computer I’m using at the moment, it lists 18 hash algorithms in addition to those on the guaranteed list.
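You can inspect both sets yourself. (Note these are attributes, not methods, and on recent Python versions algorithms_guaranteed has grown to include SHA-3 and BLAKE2 variants, so the exact output depends on your installation.)

    import hashlib
    print(sorted(hashlib.algorithms_guaranteed))     # portable across platforms
    print(sorted(hashlib.algorithms_available
                 - hashlib.algorithms_guaranteed))   # installation-specific extras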

Mathematica support

Mathematica supports the hash functions on hashlib’s guaranteed list, and a few more:

  • Adler32
  • CRC32
  • MD2
  • MD4
  • RIPEMD160
  • RIPEMD160SHA256
  • SHA256SHA256

The first two hashes, Adler32 and CRC32, were never intended to be secure. They were designed simply as error detection codes, not to be tamper-resistant. As the names imply, MD2 and MD4 were predecessors to MD5.

The hash that Mathematica calls RIPEMD160SHA256 is SHA256 followed by RIPEMD160, i.e. RIPEMD160 applied to the SHA256 digest. The reason this combination gets a name of its own is that it is used in Bitcoin. Finally, SHA256SHA256 is simply SHA256 applied twice.
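Both constructions are easy to reproduce with hashlib (a sketch; note that 'ripemd160' is available only when the underlying OpenSSL build provides it):

    import hashlib

    def hash160(data):
        # RIPEMD160 applied to a SHA256 digest, as in Bitcoin addresses
        return hashlib.new('ripemd160', hashlib.sha256(data).digest()).hexdigest()

    def sha256d(data):
        # SHA256 applied twice
        return hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()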

The long tail

The hash functions mentioned above are the most commonly used, but there are hundreds of others in common use. The hashcat password cracking tool lists 260 kinds of hash functions it can attack.

Some of these hash functions are fundamental algorithms, such as Whirlpool and variations of GOST. Some are combinations of primitive functions, such as salted or iterated variations. Many of them are vendor and product specific. For example, hashcat lists nine different hashing algorithms associated with various versions of Microsoft Office, six algorithms for Cisco products, five algorithms for SAP, etc.

It’s interesting to speculate on why there are so many custom hash functions: hashing technology progress, differing emphases on security and speed, not-invented-here syndrome, etc.

Security by variety

There’s something going on that isn’t exactly security-by-obscurity, i.e. relying on keeping your encryption algorithm a secret. The hashing algorithms for all the products mentioned above are well known, but there may be some small advantage to being a little bit off the beaten path.

People have built special hardware and software for attacking popular hashing algorithms, and doing something a little different could prevent this from working on your hash. Of course doing something a little different could also introduce a weakness you didn’t anticipate. Creating your own encryption algorithm is a bad idea unless you’re an expert, and often even if you are an expert. But making a new hash function by combining secure primitives is not as dangerous as creating your own encryption algorithm.
