In this deep-dive article, we talk about one fundamental technology behind cryptocurrencies — cryptography — covering the basics of encryption, hash functions, and ciphers as an introduction to cryptography in action.
Introduction to Cryptography
Cryptography, in simplest terms, is to make communications secret. The ‘secret’ here means that, even in the presence of an eavesdropper who can monitor all communications, the intended message can still be delivered to the receiver while kept secret from others.
Cryptography often uses mathematical techniques to achieve desirable properties for secrecy. It is often associated with the process of ordinary plaintext being converted to ciphertext, which is text that only the intended receiver can decode. The process that converts plaintext to ciphertext is known as encryption, while the process of conversion from ciphertext to plaintext is called decryption. A cipher (or cypher) is an algorithm for performing encryption or decryption — a series of well-defined steps that can be followed as a procedure.
Cryptography is not something new, nor just for cryptocurrencies. It is widely used in daily life where information needs to be protected, like banking transaction passwords, email account passwords, and e-commerce transactions.
Blockchain and cryptocurrencies use cryptography in multiple ways — for wallets, transactions, security, and privacy-preserving protocols. For instance:
- Public-key cryptography used in a transaction
- Hashing for Bitcoin mining
- Merkle Trees for transaction positioning in blocks
We can almost say that if you don’t understand cryptography, you don’t understand cryptocurrencies. To better learn, let’s dive deep into the world of cryptography.
Symmetric and Asymmetric Encryption and Hash Functions
Classification of cryptographic algorithms is based on the number of keys employed for encryption and decryption: symmetric encryption, asymmetric encryption, and hash functions.
Each has specific applications that are irreplaceable to each other. For example, asymmetric encryption is needed to generate the private/public key pairs, while hash functions are needed to produce unique digital fingerprints.
Symmetric encryption uses a single key for both encryption and decryption. The message sender uses the key to encrypt the plaintext and sends the ciphertext to the receiver, who applies the same key to decrypt the message and recover the plaintext. Symmetric key systems are faster and simpler (when compared to asymmetric key systems), but they don’t solve the key-exchange problem between sender and receiver if the key is not known in advance.
Symmetric cryptography schemes are further divided into block ciphers and stream ciphers.
Block ciphers break the input into fixed-size (e.g., 128 bits) blocks, and each of the blocks is processed by several functions with the secret key. The algorithm determines the block length, key, and functions used in the process.
Some commonly used block ciphers include:
Data Encryption Standard (DES)
- Block length = 64 bits
- Key length = 56 bits
Created at IBM, DES was one of the most popular block symmetric ciphers in the early 1970s and is one of the most thoroughly examined encryption algorithms. It was adopted as a federal standard by the National Bureau of Standards (US) in 1976, and included in ANSI standards as the Data Encryption Algorithm for the private sector in 1981.
At the beginning of the twenty-first century, DES was considered insecure, mainly due to its relatively short secret key length, making it vulnerable to brute-force attacks.
Advanced Encryption Standard (AES)
- Block length = 128 bits
- Key length = 128, 192, or 256 bits
AES, a modern block symmetric cipher, is one of the most popular ciphers in the world. Developed in 1997 by Vincent Rijmen and Joan Daemen, it was later approved as a federal encryption standard in the United States in 2002.
AES is considered a solid and secure cipher. Over the years, there have been several attacks against different AES implementations, but they concern special cases and are not considered to be a threat to the AES algorithm itself.
Triple DES (3DES)
- Block length = 64 bits
- Key length = 56, 112, or 168 bits
3DES cipher was developed because DES encryption, invented in the early 1970s and protected by a 56-bit key, turned out to be too weak and easy to break using modern computers of that time.
Stream ciphers are more flexible than block ciphers, as they’re designed to encrypt individual characters (usually binary digits) of a plaintext message one at a time using an encryption transformation that varies with time. By contrast, block ciphers tend to encrypt plaintext blocks using a fixed encryption transformation simultaneously.
Overall, the hardware in stream ciphers is faster than block ciphers. Stream ciphers are also more appropriate and, in some cases, mandatory (e.g., in some telecommunications applications) when buffering is limited or when characters must be individually processed as they are received. Because they have limited or no error propagation, stream ciphers may also be advantageous in situations where transmission errors are highly probable.
Some popular stream ciphers include:
- Key length = up to 2,048 bits
RC4 is a variable key-size stream cipher with byte-oriented operations that is widely used in popular protocols. For example, it is used to protect Internet traffic (Transport Layer Security, known as TLS) or to protect wireless networks (Wired Equivalent Privacy, or WEP).
One-Time Pad (OTP)
- Key length = message length
The theory behind the One-Time Pad (OTP) is that the key must have at least the same length as the message (the plaintext) and consist of truly random numbers. Each letter of the plaintext is ‘mixed’ with one element from the OTP. This results in a ciphertext that has no relation with the plaintext when the key is unknown. At the receiving end, the same OTP is used to retrieve the original plaintext.
- Key length = 32 bytes
Salsa20 is a cipher submitted to the eSTREAM project, running from 2004 to 2008, which was supposed to promote the development of stream ciphers. It’s considered to be a well-designed and efficient algorithm. There aren’t any known and effective attacks on the family of Salsa20 ciphers.
Asymmetric encryption is also known as Public Key Cryptography (PKC). It uses two different keys for encryption and decryption. The key that needs to be kept secret is called the private key, while the key that doesn’t is called the public key. For instance, if A wants to send a message to B and ensure that B will be the only person able to comprehend the message, A can encrypt the message using the public key such that only B can decrypt the message using the private key.
First described publicly by Stanford University professor Martin Hellman and graduate student Whitfield Diffie in 1976, asymmetric encryption is described as a two-key cryptosystem in which two parties could engage in secure communication over a non-secure communications channel without having to share a common secret key.
While the public key and private key are different, they are mathematically related. But the mathematical relationship is usable only upon encryption and decryption, and others cannot derive the private key even if they know the public key.
Asymmetric cryptography algorithms are widely used in cryptocurrencies. For example, the wallet address is created from a public key, and only those who have the private key are able to use the money inside.
Some popular asymmetric encryption schemes include:
The public key and private key are not chosen arbitrarily. They need to be generated through specified procedures. The public key consists of two large integers (e, n), and the private key consists of two large integers (d, n). The three numbers e, d, and n are related in a special way, requiring a bit more mathematics:
Let’s assume you’re using an encoder that turns all English plaintext into Arabic numbers. For example, ‘Hey! Hey! Hey!’ is translated into ‘7! 7! 7!’ and ‘bitconnnnnnnnect’ is translated into ‘83333331’. Now, Alice wants to send the message ‘Hello’ (which is translated into ‘2’) to Bob.
The easiest method for Alice to send the message to Bob is by directly telling Bob ‘2’. But Alice and Bob don’t want their message to be seen by anyone else. Luckily, Bob knows RSA. He has created a public/private key pair and asked Alice to encode the message ‘2’ using the public key he provided.
Alice then encrypts the message ‘2’ using Bob’s public key (5, 14), and the encrypted message becomes ‘4’, meaning ‘Translate Server Error’. Even if an eavesdropper saw their message, they would have no idea why they are sending ‘4’ out.
Bob knows the ‘4’ is not Alice’s true message. He then decrypts the message with his private key, which he hasn’t told anyone (not even Alice). The private key is (11, 14), and by applying some decrypting procedures, Bob finds out the true message is ‘2’.
Let’s illustrate in a table how the above process works:
From Bob’s side, he only receives the encrypted message ‘4’. But by applying the decryption with his private key, he has successfully recovered the true message ‘2’.
So, how do we get the public key (5, 14) and private key (11, 14)?
This process is called RSA key generation.
For those who are interested, check out RSA encryption.
Elliptic Curve Cryptography (ECC)
ECC is an alternative asymmetric encryption algorithm to RSA. Similarly, it allows users to create a public/private key pair, but the algorithms/procedures used are different than RSA.
For example, for the elliptic curve (see below), the formula is y2 = x3 + ax + b, where a and b are the coefficients to determine the shape of the curve.
We start with a specific point on the curve. Next, we use a function (called the dot function) to find a new point. We repeat the dot function to hop around the curve until finally ending up at our last point. Let’s walk through the algorithm:
This is a great trapdoor function because, if one knows the location of the starting point (A) and the number of hops required to get to the ending point (E), it is very easy to find the ending point. On the other hand, if only the locations of the starting point and ending point are known, it is almost impossible to find how many hops it took to get there. So, in this example, we can define public key and private key as:
- Public Key: Starting point A, ending point E
- Private Key: Number of hops from A to E
To be more general, for a given elliptic curve (as shown below), the public key is (P, G), where P is the start point and G is the end point (a special pre-defined [constant] EC point), and the private key is k (k is an integer and P = k * G).
Diffie-Hellman Key Exchange Algorithm
The Diffie-Hellman algorithm is used to establish a shared secret that can be used for private communications while exchanging cryptographic keys over a public network.
Traditionally, encrypted communication between two parties required exchanging keys by some secure physical channel, such as paper key lists transported by a trusted courier. The Diffie-Hellman key exchange method allows for two parties with no prior knowledge of each other to jointly establish a shared secret key over an insecure channel. This key can then be used to encrypt subsequent communications using a symmetric key cipher (as discussed above).
To simplify the algorithm, we use the following image to present the procedure of this encryption. Please note that the letter P at the beginning is selected randomly in an elliptic curve, and G is a primitive root of P. The private keys of a and b are chosen randomly.
Alice and Bob can now use the same secret key (3) to encrypt the message between them without knowing this key in advance.
Hash functions, also called message digests and one-way encryption, compress plaintext into a fixed-length text, called hash value (or digest). It is impossible to recover the hash value back into the plaintext.
A hash function should fullfil three security properties:
It is computationally infeasible to find two different input strings that, if applied to the hash function, have the same output.
It can be expressed like:
It is difficult to find x and y, where H(x) = H(y), but x is not equal to y.
Please note that there have to be collisions since we are moving from a much larger input space t (any strings) to a fixed output space (2^256).
Application: Hash as a Message Digest. If we know that H(x) = H(y), it is safe to assume that x = y.
Example: To recognise a file that we saw before, remember its hash. This works because the hash is small.
Given H(x), it is infeasible to find x. This can be explained as, if given the hashed version of x, we will not find x.
Application of Hiding Property: Commitment
This is the digital analogy of taking a value (a number), sealing it in an envelope, and putting that envelope on the table — where anyone can see it — thus committing to what’s in the envelope and its secret from everyone else. Later, the envelope can be opened and the value taken out, but it’s sealed. So commit to a value and reveal it later.
The commitment algorithm does two things:
1. Commit to a message: Returns two values — a commitment and a key. The commitment is the same as the envelope put on the table; the key is necessary to open it.
2. Later, we reveal to someone the original message and let them verify that it corresponds to the original, providing them with the commitment and the key.
If someone wants to target a certain hash function, then gain some value of y. If part of the input is chosen randomly, it’s difficult to find another value to target the hash function value.
Given an output y of the hash function, if k is chosen from a random distribution, it is infeasible to find an x, such that the hash of k|x (k concatenated with x) is y: H(k|x) = y.
Popular Hash Functions
Message Digest (MD)
The MD family comprises hash functions MD2, MD4, MD5, and MD6. It was adopted as Internet Standard RFC 1321 and is a 128-bit hash function.
MD5 digests have been widely used in the software world to provide assurance about the integrity of transferred files. For example, file servers often provide a pre-computed MD5 checksum for the files so that a user can compare the checksum of the downloaded file to it.
Secure Hash Function (SHA)
The family of SHA consists of four SHA algorithms: SHA-0, SHA-1, SHA-2, and SHA-3. Though from the same family, they are structurally different.
SHA-0 was published by the US National Institute of Standards and Technology (NIST) in 1993. It had few weaknesses but did not become very popular.
Later, in 1995, SHA-1 was designed to correct the alleged weaknesses of SHA-0. Collisions against the full SHA-1 algorithm can be produced using the shattered attack, and the hash function should be considered broken.
The SHA-2 family has four further SHA variants: SHA-224, SHA-256, SHA-384, and SHA-512, depending on the number of bits in their hash value. SHA-256 and SHA-512 are commonly used. SHA-512 is more secure than SHA-256 and is often faster than SHA-256 on 64-bit machines.
In October 2012, the NIST chose the Keccak algorithm as the new SHA-3 standard. Keccak offers many benefits, such as efficient performance and good resistance to attacks.
RIPEMD is a family of cryptographic hash functions that includes RIPEMD, RIPEMD-128, and RIPEMD-160. There also exist 256- and 320-bit versions of this algorithm. RIPEMD-160 is an improved version and the most widely used version in the family.
Final Words: Applying Cryptography in Practise
Cryptography has a surprisingly long history, but one with seemingly endless possibilities for future usage. The process is the base of current-day cryptocurrency and wallets.
For more info about the different kinds of crypto wallets, read What Is a Crypto Wallet? A Beginner’s Guide.
Due Diligence and Do Your Own Research
All examples listed in this article are for informational purposes only. You should not construe any such information or other material as legal, tax, investment, financial, cybersecurity, or other advice. Nothing contained herein shall constitute a solicitation, recommendation, endorsement, or offer by Crypto.com to invest, buy, or sell any coins, tokens, or other crypto assets. Returns on the buying and selling of crypto assets may be subject to tax, including capital gains tax, in your jurisdiction. Any descriptions of Crypto.com products or features are merely for illustrative purposes and do not constitute an endorsement, invitation, or solicitation.
Past performance is not a guarantee or predictor of future performance. The value of crypto assets can increase or decrease, and you could lose all or a substantial amount of your purchase price. When assessing a crypto asset, it’s essential for you to do your research and due diligence to make the best possible judgement, as any purchases shall be your sole responsibility.