How Diffie-Hellman Fails in Practice

beero1000 · May 22, 2015

In other words, how the NSA (or every determined nation-state) is currently breaking internet cryptography.

https://weakdh.org/imperfect-forward-secrecy.pdf said:
...
We go on to consider Diffie-Hellman with 768- and 1024-bit groups. A small number of fixed or standardized groups are in use by millions of TLS, SSH, and VPN servers. Performing precomputations on a few of these groups would allow a passive eavesdropper to decrypt a large fraction of Internet traffic. In the 1024-bit case, we estimate that such computations are plausible given nation-state resources, and a close reading of published NSA leaks shows that the agency’s attacks on VPNs are consistent with having achieved such a break. We conclude that moving to stronger key exchange methods should be a priority for the Internet community.
...

http://www.scottaaronson.com/blog/?p=2293 said:
These authors study vulnerabilities in Diffie-Hellman key exchange, the “original” (but still widely-used) public-key cryptosystem, one that predates even RSA. Diffie-Hellman is the thing where Alice and Bob first agree on a huge prime number p and a number g, then Alice picks a secret a and sends Bob ga (mod p), and Bob picks a secret b and sends Alice gb (mod p), and then Alice and Bob can both compute (ga)b=(gb)a=gab (mod p), but an eavesdropper who’s listening in only knows p, g, ga (mod p), and gb (mod p), and one can plausibly conjecture that it’s hard from those things alone to get gab (mod p). So then Alice and Bob share a secret unknown to the eavesdropper, which they didn’t before, and they can use that secret to start doing cryptography.

As far as anyone knows today, the best way to break Diffie-Hellman is simply by calculating discrete logarithms: that is, solving the problem of recovering a given only g and h=ga (mod p). At least on a classical computer, the fastest known algorithm for discrete logarithms (over fields of prime order) is the number field sieve (NFS). Under plausible conjectures about the distribution of “smooth” numbers, NFS uses time that grows like exp((1.923+o(1))(log p)1/3(log log p)2/3), where the exp and logs are base e (and yes, even the lower-order stuff like (log log p)2/3 makes a big difference in practice). Of course, once you know the running time of the best-known algorithm, you can then try to choose a key size (that is, a value of log(p)) that’s out of reach for that algorithm on the computing hardware of today.

(Note that the recent breakthrough of Antoine Joux, solving discrete log in in quasipolynomial time in fields of small characteristic, also relied heavily on NFS-like ideas. But there are no improvements from this yet for the “original” discrete log problem, over prime fields.)

But there’s one crucial further fact, which has been understood for at least a decade by theoretical cryptographers, but somehow was slow to filter out to the people who deploy practical cryptosystems. The further fact is that in NFS, you can arrange things so that almost all the discrete-logging effort depends only on the prime number p, and not at all on the specific numbers g and h for which you’re trying to take the discrete log. After this initial “precomputation” step, you then have a massive database that you can use to speed up the “descent” step: the step of solving of ga=h (mod p), for any (g,h) pair that you want.

It’s a little like the complexity class P/poly, where a single, hard-to-compute “advice string” unlocks exponentially many inputs once you have it. (Or a bit more precisely, one could say that NFS reveals that exponentiation modulo a prime number is sort of a trapdoor one-way function, except that the trapdoor information is subexponential-size, and given the trapdoor, inverting the function is still subexponential-time, but a milder subexponential than before.)

The kicker is that, in practice, a large percentage of all clients and servers that use Diffie-Hellman key exchange use the same few prime numbers p. This means that, if you wanted to decrypt a large fraction of all the traffic encrypted with Diffie-Hellman, you wouldn’t need to do NFS over and over: you could just do it for a few p’s and cache the results. This fact can singlehandedly change the outlook for breaking Diffie-Hellman.

...

As you’d expect, many servers today are configured more intelligently, and will only agree to 1024-bit keys. But even there, Adrian et al. found that a large fraction of servers rely on just a single 1024-bit prime (!), and many of the ones that don’t rely on just a few other primes. Adrian et al. estimate that, for a single 1024-bit prime, doing the NFS precomputation would take about 45 million years using a single core—or to put it more ominously, 1 year using 45 million cores. If you built special-purpose hardware, that could go down by almost two orders of magnitude, putting the monetary cost at a few hundred million dollars, completely within the reach of a sufficiently determined nation-state. Once the precomputation was done, and the hundreds or thousands of petabytes of output stored in a data center somewhere, computing a particular discrete log would then take about 30 days using 1 core, or mere minutes using a supercomputer. Once again, none of this is assuming any algorithmic advances beyond what’s publicly known. (Of course, it’s possible that the NSA also has some algorithmic advances; even modest ones could obviate the need for special-purpose hardware.)

Pay no attention to the man behind the curtain...

NobleSavage · May 22, 2015

Yeah, I was reading about that here https://www.schneier.com/blog/archives/2015/05/the_logjam_and_.html

If you are doing something like trying to buy Yellowcake you should know better than to use a $10 a month VPN. Then again, Snowden used https://www.hidemyass.com/ and he is still alive and free.

beero1000 · May 22, 2015

NobleSavage said:
Yeah, I was reading about that here https://www.schneier.com/blog/archives/2015/05/the_logjam_and_.html

If you are doing something like trying to buy Yellowcake you should know better than to use a $10 a month VPN. Then again, Snowden used https://www.hidemyass.com/ and he is still alive and free.

Sieving for a couple of primes allowing them to defeat significant fractions of all HTTPS, VPN, and SSH traffic is a bit more significant than beating a $10 a month VPN. Of course, this analysis only uses public knowledge techniques, so even if your system is set up properly, they could very well even have advances that let them beat keys larger than 1024.

rjh01 · May 22, 2015

If you really want to send secret messages, then invent your own method. That way anyone trying to break it would have to start from scratch. Just make sure that there are no weaknesses. You might want to also try to break it yourself just to be sure.

NobleSavage · May 22, 2015

rjh01 said:
If you really want to send secret messages, then invent your own method. That way anyone trying to break it would have to start from scratch. Just make sure that there are no weaknesses. You might want to also try to break it yourself just to be sure.

That's generally a very bad idea. From Phil Zimmermann's (PGP creator) Introduction to Cryptography (Page 54):

When I was in college in the early 70s, I devised what I believed was a brilliant encryption scheme. A simple pseudorandom number stream was added to the plaintext stream to create ciphertext. This would seemingly thwart any frequency analysis of the ciphertext, and would be uncrackable even to the most resourceful government intelligence agencies. I felt so smug about my achievement.

Years later, I discovered this same scheme in several introductory cryptography texts and tutorial papers. How nice. Other cryptographers had thought of the same scheme. Unfortunately, the scheme was presented as a simple homework assignment on how to use elementary cryptanalytic techniques to trivially crack it. So much for my brilliant scheme.

From this humbling experience I learned how easy it is to fall into a false sense of security when devising an encryption algorithm. Most people don’t realize how fiendishly difficult it is to devise an encryption algorithm that can withstand a prolonged and determined attack by a resourceful opponent.

If I had the money to hire a few professional cryptographers I might have them roll-their-own for my hypothetical clandestine project. Isn't D-H key length limited to 1024 because of standards? That seem like a good way to go about it: have the pros get around whatever is blocking the longer keys. So, roll your own standards, but not algo. Around 2013 I started using a 2048 RSA key, because of rumors about the crackability of 1024. But, RSA is in bed with the NSA so that is no good.

rjh01 · May 24, 2015

NobleSavage said:
That's generally a very bad idea. From Phil Zimmermann's (PGP creator) Introduction to Cryptography (Page 54):

When I was in college in the early 70s, I devised what I believed was a brilliant encryption scheme. A simple pseudorandom number stream was added to the plaintext stream to create ciphertext. This would seemingly thwart any frequency analysis of the ciphertext, and would be uncrackable even to the most resourceful government intelligence agencies. I felt so smug about my achievement.

Years later, I discovered this same scheme in several introductory cryptography texts and tutorial papers. How nice. Other cryptographers had thought of the same scheme. Unfortunately, the scheme was presented as a simple homework assignment on how to use elementary cryptanalytic techniques to trivially crack it. So much for my brilliant scheme.

From this humbling experience I learned how easy it is to fall into a false sense of security when devising an encryption algorithm. Most people don’t realize how fiendishly difficult it is to devise an encryption algorithm that can withstand a prolonged and determined attack by a resourceful opponent.

Click to expand...

If I had the money to hire a few professional cryptographers I might have them roll-their-own for my hypothetical clandestine project. Isn't D-H key length limited to 1024 because of standards? That seem like a good way to go about it: have the pros get around whatever is blocking the longer keys. So, roll your own standards, but not algo. Around 2013 I started using a 2048 RSA key, because of rumors about the crackability of 1024. But, RSA is in bed with the NSA so that is no good.

Anyone writing their own cipher would have to know what they are doing and so not fall into the trap that you mention. There are a few good ones around.

beero1000 · May 24, 2015

rjh01 said:
NobleSavage said:

That's generally a very bad idea. From Phil Zimmermann's (PGP creator) Introduction to Cryptography (Page 54):

If I had the money to hire a few professional cryptographers I might have them roll-their-own for my hypothetical clandestine project. Isn't D-H key length limited to 1024 because of standards? That seem like a good way to go about it: have the pros get around whatever is blocking the longer keys. So, roll your own standards, but not algo. Around 2013 I started using a 2048 RSA key, because of rumors about the crackability of 1024. But, RSA is in bed with the NSA so that is no good.

Click to expand...

Anyone writing their own cipher would have to know what they are doing and so not fall into the trap that you mention. There are a few good ones around.

Lol, I thought you were joking in that first post. 'Write your own crypto' is close to being the worst advice ever...

bilby · May 24, 2015

beero1000 said:
rjh01 said:

Anyone writing their own cipher would have to know what they are doing and so not fall into the trap that you mention. There are a few good ones around.

Click to expand...

Lol, I thought you were joking in that first post. 'Write your own crypto' is close to being the worst advice ever...

I understand that a large number of design flaws rendered the de Havilland Comet unsafe as a jet airliner.

In light of these problems, perhaps the best option for intercontinental flight is to design your own aircraft.

Of course, you would need to know what you are doing, and avoid the problems that other designs have experienced.

skepticalbip · May 24, 2015

There seems to have been a few "codes" that were undecipherable. Egyptian hieroglyphs resisted all attempts to read them until the Rosetta Stone was found that gave the same text in three writing styles, two that had been deciphered and Egyptian hieroglyphics. The Voynich manuscript has baffled the world's best code breakers for a few hundred years so far. But the funny one (to me) is that the Germans during WWII were never able to break an extant language - that used by our "Navajo code talkers".

beero1000 · May 24, 2015

skepticalbip said:
There seems to have been a few "codes" that were undecipherable. Egyptian hieroglyphs resisted all attempts to read them until the Rosetta Stone was found that gave the same text in three writing styles, two that had been deciphered and Egyptian hieroglyphics. The Voynich manuscript has baffled the world's best code breakers for a few hundred years so far. But the funny one (to me) is that the Germans during WWII were never able to break an extant language - that used by our "Navajo code talkers".

The interesting thing is that all of your examples aren't codes, they are/were unknown languages. Those are indeed significantly harder to understand.

skepticalbip · May 24, 2015

beero1000 said:
skepticalbip said:

There seems to have been a few "codes" that were undecipherable. Egyptian hieroglyphs resisted all attempts to read them until the Rosetta Stone was found that gave the same text in three writing styles, two that had been deciphered and Egyptian hieroglyphics. The Voynich manuscript has baffled the world's best code breakers for a few hundred years so far. But the funny one (to me) is that the Germans during WWII were never able to break an extant language - that used by our "Navajo code talkers".

Click to expand...

The interesting thing is that all of your examples aren't codes, they are/were unknown languages. Those are indeed significantly harder to understand.

But then the most simplistic code is a substitution code so essentially a different language is a "code" where words are substituted rather than letters and, of course, different grammatical rules make it more difficult to "break'.

beero1000 · May 24, 2015

skepticalbip said:
beero1000 said:

The interesting thing is that all of your examples aren't codes, they are/were unknown languages. Those are indeed significantly harder to understand.

Click to expand...

But then the most simplistic code is a substitution code so essentially a different language is a "code" where words are substituted rather than letters and, of course, different grammatical rules make it more difficult to "break'.

Not really. Or at least, I feel the need to explicitly distinguish between "codes" and "ciphers". There is some overlap, but in general: Codes work by mapping the meaning of plaintext words/phrases onto encoded words/phrases and ciphers work by modifying the symbols that make up the plaintext without reference to meaning.

In general, this often makes ciphers easier to attack, especially if the structure of the plaintext language is known. Modern ciphers work really hard to hide their underlying structure behind (pseudo)randomness in order to avoid these kinds of attacks.

On the other hand, codes can leave plenty of structure in the codetext without being insecure. That's because decoding them isn't a matter of rearranging letters or symbols but rather finding the right mapping of meaning. Without external information, they are essentially impossible to decode.

For example: How would anyone decode "The king wears a fur coat" without any external information? I could mean anything from "Attack at dawn" to "The king wears a fur coat". Knowing the structure of the plaintext language (English) offers no help at all. On the other hand "Nggnpx ng qnja" isn't too hard to crack if you know it started off as English...

The issue with using codes as cryptography are the same as one-time pads, so nowadays ciphers are more practical. On the other hand, decoding a secret/dead language is pretty close to being impossible unless you can find clear references in meaning in the coded text. External information is vital - we need to know what topic the code is referring to in order to have any chance of figuring out the original message.

Bomb#20 · May 24, 2015

skepticalbip said:
But the funny one (to me) is that the Germans during WWII were never able to break an extant language - that used by our "Navajo code talkers".

Not that funny. We sent them all to the Pacific to fight the Japanese. We used Comanche code talkers against the Germans. But the operation in Europe was much smaller scale. We never really trusted the Germans not to be able to break the codes, because we'd already used code talkers in WWI, and the Germans remembered, and they sent anthropologists to the U.S. in the 1930s to study American Indian languages.

skepticalbip · May 24, 2015

Bomb#20 said:
skepticalbip said:

But the funny one (to me) is that the Germans during WWII were never able to break an extant language - that used by our "Navajo code talkers".

Click to expand...

Not that funny. We sent them all to the Pacific to fight the Japanese. We used Comanche code talkers against the Germans. But the operation in Europe was much smaller scale. We never really trusted the Germans not to be able to break the codes, because we'd already used code talkers in WWI, and the Germans remembered, and they sent anthropologists to the U.S. in the 1930s to study American Indian languages.

AHA! Thanks. I always like to learn something especially if it corrects what I incorrectly thought was right.

Peez · May 25, 2015

beero1000 said:
skepticalbip said:

There seems to have been a few "codes" that were undecipherable. Egyptian hieroglyphs resisted all attempts to read them until the Rosetta Stone was found that gave the same text in three writing styles, two that had been deciphered and Egyptian hieroglyphics. The Voynich manuscript has baffled the world's best code breakers for a few hundred years so far. But the funny one (to me) is that the Germans during WWII were never able to break an extant language - that used by our "Navajo code talkers".

Click to expand...

The interesting thing is that all of your examples aren't codes, they are/were unknown languages. Those are indeed significantly harder to understand.

The moral of this story is: If you wish to transmit information secretly, invent your own language.

Peez

Malintent · May 26, 2015

Language is a method of conveying information to as many people as possible in a way that is clearly understood and conveys the message originally intended by the speaker.
Encryption is a method of hiding information from as many people as possible, except those the message was specifically intended for.

So... "Language" and "encryption" are nothing alike... pretty close to opposites.

skepticalbip · May 26, 2015

Malintent said:
Language is a method of conveying information to as many people as possible in a way that is clearly understood and conveys the message originally intended by the speaker.
Encryption is a method of hiding information from as many people as possible, except those the message was specifically intended for.

So... "Language" and "encryption" are nothing alike... pretty close to opposites.

In some cases, yes. In some cases, no.

Cryptolects have probably been around as long as languages. Children learn "pig-Latin" and other "languages" for the explicit purpose of trying to keep their conversations secret from those they don't think understand. Many ritualistic groups use invented languages in rituals that only the initiates understand to keep outsiders outside. OTOH, ASKII is a code (encryption) that is used for the purpose of having as many people as possible being able to communicate.

http://en.wikipedia.org/wiki/Language_game
“Language game
A language game (also called secret language or ludling or argot) is a system of manipulating spoken words to render them incomprehensible to the untrained ear. Language games are used primarily by groups attempting to conceal their conversations from others.”

http://en.wikipedia.org/wiki/Cant_(language)
“ Cant language
A cant (or cryptolect) is the jargon or argot of a group, often employed to exclude or mislead people outside the group.”

barbos · May 26, 2015

I am confused about OP.
First they seems to imply that current implementations are a bit compromised because they (re)use the same prime numbers?
Is that true?
If it is true, does 45 million years single core assumes this fact or it is for uncompromised case?

beero1000 · May 26, 2015

barbos said:
I am confused about OP.
First they seems to imply that current implementations are a bit compromised because they (re)use the same prime numbers?
Is that true?
If it is true, does 45 million years single core assumes this fact or it is for uncompromised case?

It is 45 million core-years to pre-compute the values for a 1024 bit prime, using standard theory and hardware. Once that computation is done, any cryptographic protocol that uses the discrete-log problem (DH, PGP, TLS, HTTPS) for that specific prime is cracked.

Another way to say it: if anyone wants to read your (uncompromised) 1024 bit encrypted files, it might cost them months and many millions of dollars, but they could do it.

The main compromising vulnerability is that even though many protocols use varying keys, they often use the same primes repeatedly, so the NSA only needs to do this for a few primes to read a large fraction of encrypted internet traffic.

barbos · May 26, 2015

beero1000 said:
barbos said:

I am confused about OP.
First they seems to imply that current implementations are a bit compromised because they (re)use the same prime numbers?
Is that true?
If it is true, does 45 million years single core assumes this fact or it is for uncompromised case?

Click to expand...

It is 45 million core-years to pre-compute the values for a 1024 bit prime, using standard theory and hardware. Once that computation is done, any cryptographic protocol that uses the discrete-log problem (DH, PGP, TLS, HTTPS) for that specific prime is cracked.

Another way to say it: if anyone wants to read your (uncompromised) 1024 bit encrypted files, it might cost them months and many millions of dollars, but they could do it.

On top of the 45mil core*years?
I think I can live with that. I mean can make my browser to generate new set of primes for every fake https gmail session and it will cost them millions each time.

The main compromising vulnerability is that even though many protocols use varying keys, they often use the same primes repeatedly, so the NSA only needs to do this for a few primes to read a large fraction of encrypted internet traffic.

I can see why servers would like to reuse it, but client (desktop) computers can afford to generate new ones each time.

How Diffie-Hellman Fails in Practice

Veteran Member

Veteran Member

Veteran Member

Member

Veteran Member

Member

Veteran Member

Fair dinkum thinkum

Contributor

Veteran Member

Contributor

Veteran Member

Contributor

Contributor

Member

Veteran Member

Contributor

Contributor

Veteran Member

Contributor