blog/content/posts/2018-10-26-twenty-years-of-rsa-attacks.md

173 lines
13 KiB
Markdown
Raw Normal View History

2018-10-19 16:10:28 +00:00
+++
2021-06-30 06:03:48 +00:00
title = "twenty years of attacks on rsa with examples"
2018-10-26 06:50:37 +00:00
date = 2018-10-26
2021-07-06 05:52:05 +00:00
toc = true
2021-07-06 21:32:34 +00:00
tags = ["crypto"]
languages = ["python"]
2018-10-19 16:10:28 +00:00
+++
2021-07-06 21:57:42 +00:00
There's [a great paper][1] I found by Dan Boneh from 1998 highlighting the weaknesses of the RSA cryptosystem. I found this paper to be a particularly enlightening read (and interestingly enough, it's been 20 years since that paper!), so here I'm going to reiterate some of the attacks described in the paper, but using examples with numbers in them. <!--more-->
2018-10-19 16:10:28 +00:00
2021-07-06 21:57:42 +00:00
(Also please excuse the lack of proper formatting, I've yet to figure out how to get Gutenberg to accept Latex)
2018-10-19 16:10:28 +00:00
That being said, I _am_ going to skip over the primer of how the RSA cryptosystem works, since there's already a great number of resources on how to do that.
2021-07-06 21:57:42 +00:00
### factoring large integers
2018-10-19 16:10:28 +00:00
Obviously this is a pretty bruteforce-ish way to crack the cryptosystem, and probably won't work in time for you to see the result, but can still be considered an attack vector. This trick works by just factoring the modulus, N. With N, finding the private exponent d from the public exponent e is a piece of cake.
Let's choose some small numbers to demonstrate this one (you can follow along in a Python REPL if you want):
```py
>>> N = 881653369
>>> e = 17
>>> c = 875978376
```
N is clearly factorable in this case, and we can use resources like [msieve][7] or [factordb][2] to find smaller primes in this case. Since we know now that `N = 20717 * 42557`, we can find the totient of N:
```py
>>> p = 20717
>>> q = 42557
>>> tot = (p - 1) * (q - 1)
881590096
```
Now all that's left is to discover the private exponent and solve for the original message! (you can find the modular inverse function I used [here][3])
```py
>>> d = modinv(e, tot)
51858241
>>> pow(c, d, N)
31337
```
And that's it! Now let's look at some more sophisticated attacks...
2021-07-06 21:57:42 +00:00
## elementary attacks
2018-10-19 16:10:28 +00:00
These attacks are related to the _misuse_ of the RSA system. (if you can't tell, I'm mirroring the document structure of the original paper)
2021-07-06 21:57:42 +00:00
### common modulus
2018-10-19 16:10:28 +00:00
My cryptography professor gave this example as well. Suppose there was a setup in which the modulus was reused, maybe for convenience (although I suppose with libraries today, it'd actually be more _inconvenient_ to reuse the key). Key pairs would be issued to different users and they would share public keys with each other and keep private keys to themselves.
The problem here is if you have a key pair, and you got someone else's public key, you could easily derive the private key by just factoring the modulus. Let's see how this works with a real example now.
2018-10-26 06:50:37 +00:00
Since this is a big problem if you were to really use this cryptosystem, I'll be using actual keys from an actual crypto library instead of the small numbers like in the first example to show that this works on 2048-bit RSA. The library is called [PyCrypto][4], and if you're planning on doing anything related to crypto with Python, it's a good tool to have with you. For now, I'm going to generate a 2048-bit key (by the way, in practice you probably shouldn't be using 2048-bit keys anymore, I'm just trying to spare my computer here).
2018-10-19 16:10:28 +00:00
```py
>>> from Crypto.PublicKey import RSA
>>> k1 = RSA.generate(2048)
2018-10-26 06:49:33 +00:00
<_RSAobj @0x7f3d3226dfd0 n(2048),e,d,p,q,u,private>
2018-10-19 16:10:28 +00:00
```
Now, normally when you generate a new key, it'd generate a new modulus. For the sake of this common modulus attack, we'll force the new key to use the same modulus. This also means we'll have to choose an exponent e other than the default choice of 65537 (see [this link][5] for documentation):
```py
2018-10-26 06:45:44 +00:00
>>> N = k1.p * k1.q
2018-10-26 06:49:33 +00:00
29977270253913673973269594877868500604696844309480395834898813292056864035968758602074842333119394545818563664205865827843973433118231606201251719390934610989873635763197929136439794366715495587924829697045618064595517091398323127000591150167969423793125376862942962617933168868125721044755585292104012767604921511927694421931531763256179277376290836490302585046803170658011843375751827334637689505406974645481089358325805114205957009910758378725866614617688361814922628596814445370820099034880786971816556547138716303030977389113515312289367195090368607322922710704592536914377782096784092012774047931602714559411641
2018-10-26 06:45:44 +00:00
>>> e = k1.e
2018-10-26 06:49:33 +00:00
65537
2018-10-26 06:45:44 +00:00
>>> d = k1.d
2018-10-26 06:49:33 +00:00
15565200260470091881477501931717765645013182095721628848830000114674199708256113134107524142907363428287225581416015506594787249272629252596585055146773790032720599834991872233759704632573379913049026195290680640250863651116064783079834540016568221344526961094787464713454198443832494032866744158338151738236661515444305521301583312800890473043854752775780731961801793612989845832052044110301479536119434333369042172368546513808726742737729539432085793131998509039970952524552914892677427673231515899625998973161553704772256496315467235759715665448324408858980400807019213972046972829905566822336304711418843041721957
2018-10-19 16:10:28 +00:00
>>> e2 = 65539
>>> d2 = modinv(e2, (k1.p - 1) * (k1.q - 1))
2018-10-26 06:49:33 +00:00
28155004966198083605557147846430301877082565365203402029588435163682086478799751838610856433805281302245406343554098644058282620662395619703047797297929171630352487059669029554823105971149580111303390225692229359101863845359614581890498607677708812792166993283364928728648227920436362454567967968010840963546889938282011875589987758165583590886451185216017928261116297436515322115306907044332595229241201447860504794919920665520170088035323466070517987985855014612353911537010064927051269052451478774966384895845225295261610911375622081716902881447610710645142912550905885899057916649884624811336671599114611316629599
2018-10-26 06:45:44 +00:00
>>> k2 = RSA.construct((N, e2, d2))
2018-10-26 06:49:33 +00:00
<_RSAobj @0x7f3d31c7c5f8 n(2048),e,d,p,q,u,private>
2018-10-19 16:10:28 +00:00
```
Ok, now we have two keys, `k1` and `k2`. Now I'll show how using only the public and private key of `k1` (assuming this is the pair that we got legitimately from the crypto operator), and the public key of `k2`, which is tied to the same modulus, we can find the private key of `k2`.
2018-10-26 06:45:44 +00:00
To do this, we'll try to find the roots of the equation `f(x) = x^2 - (p + q)x + pq`. You'll find that for values of `p` and `q`, this will produce `f(p) = p^2 - p^2 - qp + pq`, and `f(q) = q^2 - pq - q^2 + pq`. We know that `N = pq`. How can we find `p + q`? Since `phi(N) = (p - 1)(q - 1) = pq - p - q + 1`, we can find that `phi(N) = N - (p + q) + 1`, so `p + q = N - phi(N) + 1`. Now we need to use `e` and `d` to estimate `phi(N)`. Recall that `ed = 1 mod phi(N)`. This is equivalent to saying `ed = 1 + k*phi(N)`. Then `(ed - 1) / phi(N) = k`.
2018-10-19 16:10:28 +00:00
2018-10-26 06:45:44 +00:00
It turns out that `k` is extremely close to `ed/N`: `ed/N = (1 + k*phi(N)) / N = 1/N + k*phi(N)/N`. `1/N` is basically 0, and `phi(N)` is very close to `N`, so it shouldn't change the value of `k` by very much. We now use `ed/N` to estimate `k`: `phi(N) = (ed - 1) / (ed / N)`.
2018-10-19 16:10:28 +00:00
2018-10-26 06:45:44 +00:00
```py
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 1000
>>> k = round(Decimal(e) * Decimal(d) / Decimal(N))
2018-10-26 06:49:33 +00:00
34029
2018-10-26 06:45:44 +00:00
>>> phi = (Decimal(e) * Decimal(d) - 1) / Decimal(k)
2018-10-26 06:49:33 +00:00
Decimal('29977270253913673973269594877868500604696844309480395834898813292056864035968758602074842333119394545818563664205865827843973433118231606201251719390934610989873635763197929136439794366715495587924829697045618064595517091398323127000591150167969423793125376862942962617933168868125721044755585292104012767604575090001864613992237960887242026855773279634028088706121371418922552125986506064146112561599205615974813154971272528592745144988174228621487749404677959591894452249599588096076892574585613962026186332366180174253118634077603697727952204486962202338916762987146793208323561031870496718547544796269555861921652')
2018-10-26 06:45:44 +00:00
```
Then we can get `p + q` through the formula mentioend above:
```py
>>> B = Decimal(N) - phi + 1
2018-10-26 06:49:33 +00:00
Decimal('346421925829807939293802368937250520517556856274496340681799239089291249765321270491576943807769029506276203354532585613211864922584150104378865213010402223028176347214857274743206460295173009790370214772536128777858755035911614561414990603406404984005947717445743706054221064913595294226503135333158697489990')
2018-10-26 06:45:44 +00:00
>>> C = Decimal(N)
```
2018-10-19 16:10:28 +00:00
2018-10-26 06:45:44 +00:00
Check to make sure B and C are integers. If they're not, try using a higher precision in `getcontext().prec`. Now solve the quadratic equation:
2018-10-19 16:10:28 +00:00
2018-10-26 06:45:44 +00:00
```py
>>> p = (B + (B * B - 4 * C).sqrt()) / Decimal(2)
2018-10-26 06:49:33 +00:00
Decimal('178187650567807686297508761669341068026596182918164336679269778091413760248796912297951278062644499145975246732979455707116872915963269648808994075794761810506203681312867668286737214808081540392248516550834072470288052831951959306342657446325786002900014749794262752196461389552859745880480150585554246119623')
2018-10-26 06:45:44 +00:00
>>> q = (B - (B * B - 4 * C).sqrt()) / Decimal(2)
2018-10-26 06:49:33 +00:00
Decimal('168234275262000252996293607267909452490960673356332004002529460997877489516524358193625665745124530360300956621553129906094992006620880455569871137215640412521972665901989606456469245487091469398121698221702056307570702203959655255072333157080618981105932967651480953857759675360735548346022984747604451370367')
2018-10-26 06:45:44 +00:00
>>> p * q == N
True
```
We've successfully recovered `p` and `q` from just `N`, `e`, and `d`!
2018-10-19 16:10:28 +00:00
2021-07-06 21:57:42 +00:00
### blinding
2018-10-19 16:10:28 +00:00
2018-10-26 06:45:44 +00:00
This attack is actually about RSA _signatures_ (which uses the opposite keys as encryption: private for signing and public for verifying), and shows how you can compute the signature of a message M using the signature of a derived message M'.
2018-10-19 16:10:28 +00:00
Suppose Marvin wants Bob to sign the following message: `"I (Bob) owes Marvin $100,000 USD"`. Marvin hands this to Bob saying something like, "I'll just need you to sign this with your private key." Let's generate Bob's private key:
```py
>>> from Crypto.Util.number import bytes_to_long, long_to_bytes
>>> from Crypto.PublicKey import RSA
>>> bob = RSA.generate(2048)
<_RSAobj @0x7f4309521128 n(2048),e,d,p,q,u,private>
>>> M = b"I (Bob) owes Marvin $100,000 USD"
```
Obviously, Bob, an intellectual, will refuse to sign the message. However, suppose Marvin now transforms his message into a more innocent looking one. He does this by turning M into `M' = (r^e)*M mod N` where r is an integer that's coprime to N:
```py
>>> from random import randint
>>> N = bob.p * bob.q # this is publicly available knowledge
>>> r = 19
>>> Mp = long_to_bytes((pow(r, bob.e, N) * bytes_to_long(M)) % N)
b'7\x90\xbc\xf9%T\xa9\xee\xf4\xe3?>]\x88\xcd\xb4\xd6D#\xfc\xcb\x0fd\xf0\x8e\xbc>\n\x06\xcd\x0f\x89\x0bp\xa7o\xd6\x02\xa6\xa7\x81\xd8\n\xae\xfb\x08\xaa|\xbd.\xc9E\xf1|\x86\xcaZ\xaa\xd4L\xafaA\x0c}\x84\x04\n\xa4\xa5\x80\xecX<\xe0\xb5\xf6\xfb\xe3\xcc\xd5BD7\xdc\xaep\x7f\xe9vi\xabB\xe2\xadE\xa41K\xc6\xb7\xae\x01\xcb\x04C\xaf\x8b\x17\x83\xffX7z\xb1\xbf\xceF\xafN(x\x00\x9f\xe1kV\xee\x0b\xbd\xc3H\r\xee9\x81\x16\xb2\x10hb.\x90\x08\xe42$Q\x92Ew+\xe1@\xf9\x17%\xce/\xbd\x00\xad\xe2\x12\x01\x93\x8b\xc4\x1bx\xe6H?\x15\xdfPE@\xf9j\xe3\xb7\x9e\xa0\x86\xd1\xd3\xb6[\xf7q\xf1\x95N\xd3>/\x06\x80\xc7\xa3\x8a\xcbDy\xc6v\x01P\x14\xa9Be\xf7~p\xc5\xaa\xac\xa0\xaf\xbe#\xe5\x18\xc6\x1d\xd5\x14\xc1\xbbYXD\x0c\x91{\xc0s\xde]\x18Z\x8bSk\x07k\xb6\x9a\xa5`Iqe~'
```
Now he asks Bob to sign this more... innocently-looking message. Without questioning, Bob, an intellectual, signs his life away. Let's say he produces a signature `S' = (M'^d) = (r^e * M)^d = r^(ed) * M^d = r * M^d mod N`.
```py
>>> Sp, = bob.sign(Mp, 0)
4222298342813922437811434251340999736739055616654488323193778229765071846717137952694561809398626068283668428796351354154566771597532278827070832905206221261994843265685464173739776886856384806238418884247949451413559988796455422271296883338455956330421559319009950760931899199217936823999874162064553735563087382870564193673989865778229832918474778963380170967676966373703157629615331081637805594392084045827925764529711433584853942576464491576212176547485726609891593617931393545058401472883178443786988683045423150809606471425615670582973274971087459634959553685559458456237617436410759134193279063427911112115134
```
Now, all Marvin has to do is multiply by the modular inverse of r, to obtain `M^d`, the signature of the original message:
```py
>>> S = (Sp * modinv(r, N)) % N
6137678992536399703654836416525985142902780822513172949427421060785532284955531529418529725602418902796840570634560123808769013384654624916503940938715718120521434666716675795201896105310462331838807171312705686415521871046533303776516500490921892398440988515777575520183847518597482163414665355222659603386541869176930658730416118799866012276767364050134126722746224706026850062367243018313483359694686773566231956425606553198607719740067340776177716443517567144901614253170719278035838849363127850910135864099535083004590180745762100334268408681888925040382341592080592207557742366581814701422371311084081150092871
```
Sure enough, if you try to verify the "original" signature against the original message, it checks out.
```py
>>> bob.verify(M, (S,))
True
```
Marvin has now successfully tricked Bob into signing his life away.
2018-10-26 06:53:31 +00:00
This post is a work in progress.. I'll update it as I add more.
2018-10-19 16:10:28 +00:00
[1]: https://crypto.stanford.edu/~dabo/papers/RSA-survey.pdf
[2]: http://factordb.com/
[3]: https://stackoverflow.com/a/9758173
[4]: https://github.com/dlitz/pycrypto
[5]: https://www.dlitz.net/software/pycrypto/api/current/Crypto.PublicKey.RSA.RSAImplementation-class.html#construct
[6]: https://crypto.stackexchange.com/a/14713
[7]: https://github.com/radii/msieve