You want this: Encrypt a message to somebody else – using information that is publicly available. Somebody else should then be able to decrypt the message, using only information they have; nobody else should be able to read this information.
The public key cryptography algorithm RSA does achieve this. This article is my way of thinking about it – in particular ‘plausibilizing’ the related theorems from number theory. RSA is named after Rivest, Adleman, Shamir – authors of this 1978 paper: A Method for Obtaining Digital Signatures and Public-Key Cryptosystems.
Public key cryptography provides computational security: It is not impossible to decrypt the message without the secret information, it is ‘just’ very, very hard. You need a calculation that cannot be reverted easily. RSA uses the factorization of a large integer into prime numbers.
The original plain text (message M) and the cipher text (C) are both represented as (series of) integer number(s). Then the operations of encryption and decryption are functions that act on these (blocks of) number(s). RSA employs three different numbers to define these functions – two of them public and one secret. These numbers have to meet some requirements so that the following will work in the computationally secure way:
e is called the public key exponent,
d the private key exponent,
and n the public key modulus,
and the pairs (e,n) and (d,n) the public and private key, respectively. Both functions take the source text (plain or cipher) as an input, raise the corresponding number to the power of one of the exponents. The result is divided by the modulus n (operation mod n): The numbers on the left and the right side are considered equal if a division by n results in the same remainder:
Encryption: C = Me (mod n)
Decryption: M = Cd (mod n)
The future owner of a key pair creates the three numbers in this way:
Two different prime numbers p, q are generated (secretly), and n is computed as their product:
n = p . q
Then we need a special function of n called Euler’s Totient function Φ(n) – the number of all numbers smaller or equal than n whose greatest common factors with n is 1. If n would be a prime number, this Φ function would be n minus 1: the number of all positive integers smaller than the prime itself are relatively prime to it. Also in the chosen case of a product of primes the result is simple: You start from the theoretical maximum (if n would be prime – the number n minus 1), then subtract all multiples of either prime factor p or q. There are p multiples of q, and q multiples of p. But each set of multiples includes n = pq itself. You need to make sure that you do not subtract pq too often as this is not included in the total we start from (pq – 1):
Φ(n) = (pq – 1) – (p – 1) – (q – 1) = pq – p – q + 1 = (p – 1)(q – 1)
Φ is used to defined a relation between public and private key exponents: These are the multiplicative inverses of each other, with respect to a modulus of Φ(n). The ‘normal’ inverse of a number is its reciprocal value, a number times its inverse is 1. In modular arithmetic, the same relation holds, but only ‘up to’ a factor of the chosen modulus. You want a product of numbers whose product is a multiple of Φ(n) plus 1.
If the product of e and d is divided by Φ(n), the remainder should be 1. The multiplicative inverse operation only makes sense if the number under consideration – the one we try to find an inverse for – has no common factors with the modulus, I think it can be seen like this: If two integer numbers a and b are multiplicative inverses to each other modulo z, then ab = kz +1, k being some integer. If a and z share a common factor f, then a = lf and z = mf – thus lfb = kmf + 1. All numbers here are integers, so [some integral multiple of an integer f] would be equal to [some other integral multiple f] + only 1 … which is impossible.
So we have a constraint for the private number you start from:
d is chosen as a random large integer, relatively prime with Φ(n) and the keys are related by
e . d = 1 (mod Φ(n))
As 1 is the remainder of the division by Φ(n), this can also be expressed as:
e . d = 1 + k Φ(n) = 1 + (p – 1)(q – 1)
After d has been chosen, the public exponent e can be calculated as the multiplicative inverse of d modulo Φ(n).
The message M is broken into blocks of numbers, each smaller than n, and M is also designed not to share common factors with n (so that the hopefully recovered message will be unique despite the modulo operation). Using the proposed encryption function (to be scrutinized for its usefulness) the cipher text is C = Me (mod n). We apply the decryption function to this: Raise to the power of d, calculate (mod n) – [Me (mod n)]d (mod n). The exponents used in these functions are integer numbers, any of the power functions (mod n) could be written as a product of a number of factors M or C. But a product of integers (mod n) is the same as the product of each factors (mod n) – which can be seen from calculating either product for two arbitrary integers, say: a = x n + r and b = y n + s. The product of remainders, a(mod n) times b(mod) is rs, but if you multiply a and b first, the same remainder is obtained as the other terms are proportional to n or the square of n. We can always ‘pull’ the mod n operators to the right and ‘consolidate’ them:
Cd (mod n) = [Me (mod n)]d (mod n) = Med (mod n)
The product of ed has just been defined before in terms of Φ(n), while creating e from d. The rule for products (mod n) is applied again, and mod n is applied ‘separately’ to two factors in a product. One is then M (mod n). As n is larger than M (per deliberate design), M ‘really’ / ‘unambiguously’ is the remainder of its division by n: M (mod n) = M.
Med (mod n) = M1+k(Φ(n)) (mod n) = M (mod n) . Mk(Φ(n)) (mod n)
= M . Mk(Φ(n)) (mod n)
There is one remaining problem when trying to turn this into the message M: The expression highlighted in red has to be 1 – only then we would have recovered the message M from the cipher text C:
M . Mk(Φ(n)) (mod n) = M . (MΦ(n) (mod n))k = M . 1k
It is indeed true that MΦ(n) (mod n) = 1 if M and n are relatively prime – this is the theorem of Euler-Fermat!
My way to ‘prove’ it, that is, understand why this is plausible:
Look at the series of all numbers that are relatively prime to n, Φ(n) elements in total. For example, if n = 4, Φ(4) = 2, the series of numbers 1 and 3. Every number in this series is multiplied by another integer that has no common factors with n (this integer is M in our case). If the series { 1 , 3 } is e.g. multiplied by 3, the resulting series – before ‘reducing it’ through the mod n operation would be { 3 , 9 }. Applying (mod 4), it becomes { 3 , 1 } – so this is the same series, but the elements have been shuffled (a permutation). It would not work for the number of 2 – the result would be { 2 , 6 } before ‘reduction’, thus { 2 , 2 } when taken modulo 4. Multiplying by any number from the series itself will always ‘translate’ the series to a permuted version of it. If you take a relatively prime integer larger than n, say 5, the effect is the same: { 1 , 3 } is multiplied to become { 5 , 15 }, the remainders are { 1 , 3 }. 5 has the same effect as 1 as 1 = 5 modulo 4 – as the product of two numbers each taken (mod n) is the same as the product (mod n).
Now multiply all the numbers in the series from 1 to Φ(n) and compare it to the product of the numbers, each multiplied by one of the valid (relatively prime) integers – it has to be the same product again, just modulo n. Using the example again: 1 x 3 = [(1 x 5) x (3 x 5)] (mod 4). But the common factor 1 x 3 – the number on the left side, can also be pulled out of the expression on the right side. What remains is: 5 – the chosen ‘integer message’, multiplied with itself as many times as there are numbers in the series – Φ(4) = 2 times and taken (mod 4) is equal 1. But multiplying Φ(4) by itself is the definition of calculating a number to the power of Φ(4).
Thus employing the theorem of Euler-Fermat it is indeed true, that the message M can be obtained from the cipher text C via
Cd (mod n)= M