Non-mathematical Overview of Common Cryptography Concepts

This is my best attempt at explaining concepts that will be practically useful later on. My hope is that once the concepts gel logically, security "best practices" will be easier to follow. Ultimately cryptography is about keeping something secret (encryption), authenticating that somebody is who they say they are (authentication), or both. Each concept is ultimately a way to do one of those two things.

Symmetric Encryption

The simplest method for both authentication and encryption is effective symmetric encryption. The two parties that want to communicate with each other:

Public/Private Key Cryptography for Secrecy

One interesting area of math is one-way functions. Basically, let's say you have a pair of numbers, Public_Key and Private_Key. You have a number X that you want to keep secret. A magic mathematical function can apply Public_Key to X, where f(Public_Key, X) = Y. This same function goes backwards with the private key, i.e. f(Private_Key, Y) = X. The important part is that f(Public_Key, Y) does not give you back X. If somebody is hanging out and overhears Y, they can't go back to X. Doing f(Public_Key, Y) just gives this listener gobbledygook.
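To make the "magic function" a little less magic, here is a minimal sketch using textbook RSA with tiny numbers. The primes 61 and 53, the exponent 17, and the secret 65 are illustrative assumptions, and representing each key as an (exponent, modulus) pair is just one convenient way to write it down. Real keys are hundreds of digits long and real systems add padding, so treat this as a toy and not something to deploy.

```python
# Toy numbers only: real keys are hundreds of digits long.
p, q = 61, 53                       # two secret primes (assumed for illustration)
n = p * q                           # the modulus, part of both keys
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (needs Python 3.8+)

Public_Key = (e, n)
Private_Key = (d, n)


def f(key, number):
    """The 'magic function': raise number to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


X = 65                           # the secret
Y = f(Public_Key, X)             # what an eavesdropper may overhear
recovered = f(Private_Key, Y)    # the private key holder gets X back
garbage = f(Public_Key, Y)       # the eavesdropper's attempt does not give X

print("Y =", Y)
print("recovered =", recovered)
print("garbage =", garbage)
```

With numbers this small an attacker could simply factor n and work out the private key. The security of the real thing rests on that step being infeasible for very large numbers.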

The upshot is that the public key for anyone, the CIA, Citibank, whoever, can be put on a billboard and broadcast to everyone, while the private key is kept as secret as possible. Historically X would be the symmetric key used by both sides to encrypt the actual message. A symmetric key is sent because symmetric algorithms are almost always more efficient than asymmetric algorithms.
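A rough sketch of that two-step idea follows: the sender picks a random symmetric key, sends it as X under the public key, and encrypts the actual message with the symmetric key. Everything here is an assumption made for illustration: the two primes (small, publicly known Mersenne primes), the exponent 65537, and the helper `keystream_xor`, which is a homemade toy cipher built from SHA-256 and stands in for a real symmetric algorithm such as AES.

```python
import hashlib
import secrets

# Toy key pair built from publicly known primes. Never do this for real.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with SHA-256(key + counter) blocks.

    The same call encrypts and decrypts, because XOR undoes itself.
    """
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + (i // 32).to_bytes(8, "big")).digest()
        out.extend(a ^ b for a, b in zip(data[i:i + 32], block))
    return bytes(out)


# Sender: knows only the public key (E, N).
message = b"Meet me at the billboard at noon."
sym_key = secrets.token_bytes(16)          # random 128-bit symmetric key
X = int.from_bytes(sym_key, "big")         # the key, viewed as a number
Y = pow(X, E, N)                           # f(Public_Key, X)
ciphertext = keystream_xor(sym_key, message)

# Receiver: holds the private key (D, N) and receives Y and ciphertext.
received_key = pow(Y, D, N).to_bytes(16, "big")   # f(Private_Key, Y)
plaintext = keystream_xor(received_key, ciphertext)

print(plaintext.decode())
```

The slow asymmetric step touches only 16 bytes, while the fast symmetric step handles the message itself, however long it is.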

Hashing and Public/Private Key Cryptography for Authentication

The functions in the previous section also work the other way. Let's say you get three numbers: Public_Key, X and Y. There is a Private_Key number, which you don't have, that corresponds to Public_Key as in the last section. The following is true about these three numbers:

Remember that the Private_Key is kept very, very secret. If only, say, Citibank has access to Private_Key, then only Citibank could have created Y out of X. When you're given the three numbers Public_Key, X and Y, you know for sure that Citibank has done the mathematical operation f(Private_Key, X) to get Y. This mathematical operation is therefore similar to signing a check or a contract with your signature that nobody else can recreate. Creating Y out of X is therefore called a signature.
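As a small sketch of signing a bare number, assume f is the toy modular exponentiation from textbook RSA and that the checker simply tests whether f(Public_Key, Y) gives back X. The primes, the exponent, the value 1234, and the helper name `verify` are all made up for illustration.

```python
# Toy key pair; the numbers are assumptions chosen to be easy to read.
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))   # Python 3.8+

Public_Key = (e, n)
Private_Key = (d, n)


def f(key, number):
    """Raise number to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


def verify(public_key, x, y):
    """Anyone holding the public key can check that y is a signature of x."""
    return f(public_key, y) == x


X = 1234
Y = f(Private_Key, X)    # only the private key holder can compute this

print(verify(Public_Key, X, Y))             # genuine signature
print(verify(Public_Key, X, (Y + 1) % n))   # forged signature
print(verify(Public_Key, X + 1, Y))         # signature reused on another number
```

Only the first check passes. A forger who lacks Private_Key has no practical way to find a Y that matches a chosen X once the numbers are realistically large.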

But why would it matter that Citibank did this mathematical operation only they could do? The fact that Citibank signed a number like 638,382,742 doesn't mean anything, because the number 638,382,742 means nothing in the real world. However, there is a way to turn messages into numbers called hashing. A hashing algorithm takes a message, such as "John Q. Borrower's bank balance on 4/5/23 is $10,000" and turns it into a very long number. A very close message, such as "Jon Q. Borrower's bank balance on 4/5/23 is $10,000" will give a completely different number. In fact, it must be infeasible to find two different messages that produce the same number.
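Here is a short sketch of that behavior using SHA-256 from Python's standard library. SHA-256 is my choice for the example, not something the discussion above depends on, and `message_to_number` is a made-up helper name.

```python
import hashlib


def message_to_number(message: str) -> int:
    """Hash a message with SHA-256 and read the 32-byte digest as an integer."""
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


a = message_to_number("John Q. Borrower's bank balance on 4/5/23 is $10,000")
b = message_to_number("Jon Q. Borrower's bank balance on 4/5/23 is $10,000")

print(a)
print(b)
print("same number?", a == b)
```

The two messages differ by one letter, yet the two printed numbers share no visible pattern. Hashing the same message again always gives the same number.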

To order the operations more concretely, let's say a loan officer wants to check John Q. Borrower's bank balance (with John's permission given to Citibank). The following steps would be done over an insecure wire that anyone could eavesdrop on. Doing it over an insecure line would be bad for John's privacy, but the loan officer could still authenticate Citibank's message.
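One way the exchange might look in code is sketched below: Citibank hashes the statement and signs the hash, and the loan officer recomputes the hash and checks it against the signature using Citibank's public key. The key is a toy built from two publicly known Mersenne primes, SHA-256 is an assumed choice of hash, and the function names are hypothetical. Real signature schemes also pad the hash before signing.

```python
import hashlib

# Citibank's toy key pair. The primes are public knowledge, so this is
# illustration only and offers no actual security.
P, Q = 2**127 - 1, 2**521 - 1
N = P * Q
E = 65537                              # public: the loan officer knows (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))      # private: only Citibank knows D


def hash_to_number(message: str) -> int:
    """Turn a message into a number via SHA-256 (always smaller than N here)."""
    return int.from_bytes(hashlib.sha256(message.encode("utf-8")).digest(), "big")


def sign(message: str) -> int:
    """Citibank's side: apply the private key to the hash of the message."""
    return pow(hash_to_number(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Loan officer's side: apply the public key and compare with the hash."""
    return pow(signature, E, N) == hash_to_number(message)


statement = "John Q. Borrower's bank balance on 4/5/23 is $10,000"
signature = sign(statement)            # sent in the clear next to the statement

tampered = "John Q. Borrower's bank balance on 4/5/23 is $90,000"

print(verify(statement, signature))    # the genuine statement checks out
print(verify(tampered, signature))     # an altered statement does not
```

An eavesdropper sees both the statement and the signature, which is the privacy problem mentioned above, but cannot alter the balance without the check failing.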

Authenticating Public Keys

One question has lurked under this asymmetric key talk. When you encrypt communications using a public key, or use a public key to authenticate a signature, how do you know the matching private key is held only by the person or entity you think you're talking to?

The issue is that authenticating that someone is who they say they are is in fact very difficult. It can lead, ultimately, to philosophical questions of why society does not completely fall apart into a Hobbesian State of Man against Man.

Let's just use an example and say you are a Citibank customer and know that citi.com is their website. Maybe you saw citi.com in their physical branch and assume the physical branch is actually part of Citibank. You also buy a computer from Best Buy, which comes loaded with Windows and Microsoft Edge. Edge's program will include a number of public keys for certificate authorities. Very roughly (and honestly wrong but simplified), when you go to https://citi.com, then:
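A rough sketch of the trust chain, equally simplified: a certificate authority signs a statement tying a domain name to a public key, and the browser, which shipped with the authority's public key, checks that signature before trusting the site's key. The certificate text format, the function names, and the toy keys built from known Mersenne primes are all assumptions for illustration. Real certificates use the X.509 format and involve many more checks.

```python
import hashlib


def make_keys(p, q, e=65537):
    """Build a toy (public, private) key pair from two primes (Python 3.8+)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)


def hash_to_number(text: str) -> int:
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def sign(private_key, text: str) -> int:
    d, n = private_key
    return pow(hash_to_number(text), d, n)


def verify(public_key, text: str, signature: int) -> bool:
    e, n = public_key
    return pow(signature, e, n) == hash_to_number(text)


def certificate_text(domain: str, public_key) -> str:
    """Hypothetical certificate wording; real certificates are X.509 structures."""
    return f"{domain} uses public key {public_key}"


# The certificate authority's keys; the public half ships with the browser.
ca_public, ca_private = make_keys(2**127 - 1, 2**521 - 1)
trusted_ca_keys = [ca_public]

# The bank's keys, and the certificate the authority issues for them.
site_public, _site_private = make_keys(2**61 - 1, 2**89 - 1)
cert_signature = sign(ca_private, certificate_text("citi.com", site_public))


def browser_accepts(domain: str, presented_key, signature: int) -> bool:
    """Accept the key only if a trusted authority signed this exact pairing."""
    text = certificate_text(domain, presented_key)
    return any(verify(ca, text, signature) for ca in trusted_ca_keys)


# An impostor presents its own key along with the bank's real certificate.
impostor_public, _ = make_keys(2**107 - 1, 2**89 - 1)

print(browser_accepts("citi.com", site_public, cert_signature))
print(browser_accepts("citi.com", impostor_public, cert_signature))
```

The browser never needs prior contact with the bank. It only needs to already trust the authority's public key, which is why that key arrives with the software.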

There is a whole lot left out. First, what does Big Certificate Authority do exactly? Well, they make some assumptions in validating citi.com. What if Citibank loses the private key? Well, all that stuff is just going to be left out. My only hope, if anyone non-technical reads this, is that there is some sense of how the cryptographic backbone of the Internet works.