Cryptographic functions have the property that they are much harder to invert than they are to compute. We generally aim for functions that are NP-hard to invert, such that for big enough inputs it would take a prohibitively long time to invert them.
There are many cryptographic functions, almost all relying on mathematical principles that most of you have not learned. That said, there are a few classes of functions that are worth understanding even if the details of their implementation is more complicated than we want to discuss.
A cryptographically secure hash (there are other, hashes used elsewhere in computing that are not secure) has the following properties:
Going form big message to hash is fast and easy
Finding a message that would give a particular hash is very hard
The SHA-1 hash of Hi!
is eb8e0e5fcbd4675e9e6fc6f770a170e56bd5923b
The SHA-1 hash of Hi.
is 0693b2ba3945a014125952e12afe4cd5a1519161
A symmetric cipher is a pir of functions, one to encrypt and one to decrypt. These look quite similar in most ways
A message of arbitrary type and length; typically provided as a list of bytes.
For encryption, this is the plaintext
– the message we want to send. For decryption, this is the ciphertext
– it looks random to everyone except the intended recipient.
A key, which is an arbitrary secret number known only to the two communicating parties.
For encryption, the ciphertext.
For decryption, the plaintext.
If you and I know the same key, we can encrypt messages efficiently and decrypt what the other encrypted.
If you and I do not know the same key, encrypting or decrypting or guessing the key is very hard.
Diffie-Helman key exchange is a way of having two people agree on a single random number without anyone else being able to figure out what number they came up with even if they hear everything both of them say to each other.
There is some difference of opinion about the justice of this name. Whitfield Diffie and Martin Hellman published a paper describing it in 1976, but Hellman later stated the idea in the paper was that of their graduate student Ralph Merkle. When Diffie and Hellman were given the Turing award in 2015 (the top prize in computer science), the press photo for the award announcement was a clipped version of a photo that, if unclipped, included Merkle.
The core work of Diffie-Hellman is an operator in a cyclic group, which is similar to the idea of associative operators. As long as and the inverse of is hard to compute, we can exchange keys as follows:
A signature is very much like a hash, but it also has two keys: one used by the signer (called the private key) and another shared publicly with everyone (called the public key). Both keys are needed to encrypt, but only the public key is needed to decrypt.
To create a signature, we do this:
To check a signature, we do this:
Note that public and private key pairs are potentially valuable in many contexts, forming what is called an asymmetric cipher. However, they tend to be quite limited in how they can be securely used, often working only on fixed-size input (like the outputs of a hash), and some only work reliably in the context of a signature.
To provide confidential communication with a single other party,
Note that this provides confidentiality (no one can listen in), but does not provide authentication: you know you’re talking to just one person but don’t know who that person is.
Suppose I want you be confident that the message I’m sending you has not been tampered with by others; and that you know my public key. I share the message and my signature of it; you can check the signature to verify that the message has not been changed.
Suppose I want to know what your public key is. I can’t just ask you: someone could change your reply. Instead, I ask for a certificate signed by someone I already trust and have a public key for, called a certificate authority.
The certificate has four important pieces of information:
The expiration date is very important: cryptography is hard to invert, but not impossible, so we need to make keys expire so enough that finding an inversion can’t be done in time.
Certificate authorities are important in this process: if I trust an authority I shouldn’t or don’t have the right key for the authority then someone I don’t trust can pretend to be someone, giving me a forged certificate with their own key instead of the correct key of entity I wnated to contact.
The public keys of several certificate authority companies are shipped with your computer. Those companies have a financial interest in never having been found to give bad certificates, giving reasonable assurance that they should be trusted.
When I log into some system, I send them my username and password. They need to compare those to the information they have on file. But if they have my password stored then they become an attractive mark for others wishing to steal my information.
Instead, they can store the hash of my password. If the password I type hashes to the hash they have on file, I must have typed the right password.
But hashes are predictable, meaning if someone finds a password with the same hash they can break my password, leading to a similar desire to steal those files.
To prevent that, we store not just the hash of the password as-is, but rather a hash of the a longer message containing both password and some additional data. Ideally that is two different pieces of data: one, called the salt, is different for each user and stored in the file of hashed passwords; the other, called the pepper, is the same for all users but kept secret and not stored anywhere.
Instead of having you send me a password, I can send you random value and ask you to add your password to it and send me back the resulting hash. If you and I both know your password, we will get the same hash doing this but others listening in won’t learn what that password was.
If instead of us both knowing your password I know your public key, I can send you a random value and you can send back your a signature of that value. This way your secret key never needs to be shared with anyone, not even initially when creating your account.
These kinds of techniques are used by tools such as chip-based credit cards and highly secure 2-factor authentication.
If you go to a site that tells you you can log in with another site, what happens?
The service provider site (the one you are visiting)
sends you a list of identity providers (ones you may have and account with) that it trusts.
You
There are more complicated protocols that let the service provider to ask the identity provider more questions about you, but the main parts of the flow are the same.
Suppose I visit Slack, which has me log in with Google, and use a security key as my 2-factor authentication with Google.
I visit https://slack.com
I now know I’m talking to Slack and not someone else, and can talk with them securely.
The site asks me to pick a login provider from a list it supports
I pick Google and am redirected to their login page.
This is HTTPS, so I repeat step 1 with this page.
I now know I’m talking to Google and not someone else, and can talk with them securely.
To log in
Google now knows who I am.
Google gives me a certificate of my identity to share with Slack
I send the certificate with Slack
Slack now knows who I am.