Bitcoin from Scratch: The Math Behind the Cryptography
“We have elected to put our money and faith in a mathematical framework that is free of politics and human error.” - Tyler Winklevoss
Intro
This is the first part in a series to re-implement Bitcoin using C++. To build a solid foundation we need to understand the math in Bitcoin’s core. In this post, we will be covering the math behind Bitcoin’s cryptography.
But I have to warn you, I don’t have a formal academic background in math. If you do, you should probably search for a deeper and more technical explanation. The way I approached the mathematical side of Bitcoin Core was to get a high-level overview of the concepts and, most importantly, to answer the question of “why?” Why exactly do we use these mathematical creatures in blockchain cryptography?
Most of what you’ll read here is me trying to make sense of these mathematical concepts intuitively and drawing conclusions without formal proof. Are you okay with that? You better be! I’ve tried to link as many resources as possible for further digging, and I encourage you to do so. Dig deep!
Fields
Our first introduction to mathematics was through real numbers and the basic operations of addition and multiplication. These operations felt concrete and real, but then we encounter algebra where letters represent unknown variables in equations. Abstract algebra is a continuation of this line of thought, but more abstract.
As awesome as algebra is, it is limited to a known family of mathematical objects (real numbers, complex numbers, polynomials…) that are not suitable for representing complex computational entities. We need a more general, broad, and abstract way to translate the world into mathematical objects.
To that end, an algebraic structure is simply a set of elements with one or more rules defined on them. For example, consider a set of colours (red, blue) and a rule that states that when you mix red and blue in any order, you get purple. This is an algebraic structure: a set of elements (colours) with a rule defined on them (mixing colours).
We humans (or them mathematicians) define what elements should be in the set and what rules should be defined on them. We call these rules axioms. Unsurprisingly, it turns out that many natural phenomena obey our algebraic structures and we can use those to model them.
Some examples of algebraic structures include:
Briefly, the difference between these structures lies in the rules defined on the elements:
- If you can add and subtract elements, you have a group
- If you can add, subtract, and multiply elements, you have a ring
- If you can add, subtract, multiply, and divide elements, you have a field
One thing to note is that in abstract algebra, we define subtraction as the “addition of the negative” and call it the additive inverse: a + (-a) = 0
The same goes for multiplication. We define division as “multiplication by the multiplicative inverse”: a * 1/a = 1
You may be asking why we don’t just define subtraction and division as we’re used to. The truth is, I don’t know why. But I guess that this shift in thinking helps think more abstractly about the properties of the elements in the field.
Now, for a formal definition of a field:
A field is a set of elements with two binary operations defined on them, addition and multiplication, such that:
- Addition is commutative and associative
- Multiplication is commutative and associative
- Addition is distributive over multiplication
- There exists an additive identity element 0 such that for all
ainF,a + 0 = a - There exists an additive inverse element -a such that for all
ainF,a + (-a) = 0 - There exists a multiplicative identity element 1 such that for all
ainF,a * 1 = a - There exists a multiplicative inverse element 1/a such that for all
ainF,a * 1/a = 1
Some examples of fields are real numbers R, complex numbers C, and rational numbers Q—meaning all elements in those fields adhere to the axioms defined above.
Other sets like integers Z and natural numbers N are not fields because they don’t obey all the axioms. Specifically, integers don’t have a multiplicative inverse element and natural numbers don’t possess an additive inverse element (they don’t have negative numbers).
Galois Fields
We took this whole detour to get the foundation first! As you may guess, Galois Fields (named after the guy who discovered them, Évariste Galois—beside being math savvy this guy was a wild fella who died in his early 20s in a duel fight, check him out here) are a family of fields differentiated from fields like real numbers and complex numbers by the fact that they’re finite and they possess a unique and fascinating property.
In cryptography, you almost always want to deal with finite algebraic structures. My reasoning behind this is to create a deterministic environment for your computations.
So what’s this unique and fascinating property about Galois fields?
A field is a Galois field if it has P^m elements, where P is a prime number and m is a positive integer.
Some examples of Galois fields are:
GF(2³) = {0,1,2,3,4,5,6,7}(8 elements)GF(3²) = {0,1,2,3,4,5,6,7,8}(9 elements)GF(5²) = {0,1,...,24}(25 elements)
How to compute in a Galois field?
In finite fields, addition, subtraction, multiplication, and division are performed in a similar way to how they are performed in a set of real numbers. However, the main difference is that in a finite field—and make sure to remember this—the result of any of these operations must always be an element of the field.
For example, if we have a finite field with the elements {0, 1, 2, 3, 4}, we can perform addition in the same way we do with real numbers. If we want to add 4 + 3, we would calculate 4 + 3 = 7. However, since 7 is not an element of our finite field, we need to “wrap around” and find the remainder when 7 is divided by the number of elements in the field:
4 + 3 = 7 mod 5 = 2
Multiplication in a finite field works similarly. If we want to multiply 2 × 4, we calculate 2 × 4 = 8. Since 8 is not an element of our finite field:
2 × 4 = 8 mod 5 = 3
Division in finite fields
With division things get more interesting. In regular arithmetic, if we want to perform 6/3, we know that division is the inverse of multiplication, so we can rewrite this as 6 * (1/3). The inverse of 3 is 1/3 because 3 * 1/3 = 1.
Now what about division in a finite field? Suppose we want to perform 3/7 in the finite field of prime order 19. We know that division in a finite field is closed, which means the result of any operation must always be an element of the field.
So how do we find the inverse of the divisor in a finite field? One way is by using Fermat’s little theorem. This theorem states that if p is a prime number and a is an integer not divisible by p, then:
a^(p-1) ≡ 1 (mod p)
This means if we raise a to the power of p-2, we get its multiplicative inverse!
For example, in a finite field of prime order 19, to calculate 2/7:
- First calculate the inverse of 7:
7^(-1) = 7^(19-2) = 7^17 - Then:
2/7 = (2 × 7^17) mod 19 = 3
Not as intuitive, hmmm?
For large prime orders (and that’s what we deal with most in cryptography), finding the inverse via modular exponentiation can be computationally expensive. We’ll use binary exponentiation to speed things up.
Implementation in C++
My initial folder structure looks like this:
.
├── Makefile
├── README.md
├── crypto
│ ├── FieldElem.cpp
│ └── includes
│ └── FieldElem.hpp
└── main.cpp
I represented a field element in FieldElem.hpp as:
#pragma once
#include <boost/multiprecision/cpp_int.hpp>
using namespace boost::multiprecision;
class FieldElem
{
public:
FieldElem(cpp_int num, cpp_int prime);
FieldElem(const FieldElem &other);
FieldElem &operator=(const FieldElem &other);
~FieldElem();
// getters
cpp_int const &getNum() const;
cpp_int const &getPrime() const;
// overloads
bool operator==(const FieldElem &other) const;
bool operator!=(const FieldElem &other) const;
FieldElem operator+(const FieldElem &other) const;
FieldElem operator*(const FieldElem &other) const;
FieldElem operator/(const FieldElem &other) const;
FieldElem pow(cpp_int exponent) const;
private:
cpp_int num;
cpp_int prime;
};
Two points worth highlighting:
- We’re using arbitrary length integers from the boost library since we’ll deal with pretty big numbers
- We check in our constructor if num is in the range from 0 to prime-1 and throw an exception if not
The constructor and basic operators:
FieldElem::FieldElem(cpp_int num, cpp_int prime) : num(num), prime(prime)
{
if (num >= prime || num < 0)
{
throw std::invalid_argument("Num is not in field range 0 to prime-1");
}
}
bool FieldElem::operator==(const FieldElem &other) const
{
return (num == other.num && prime == other.prime);
}
bool FieldElem::operator!=(const FieldElem &other) const
{
return !(*this == other);
}
Now for the arithmetic operators—this is where it gets interesting:
FieldElem FieldElem::operator+(const FieldElem &other) const
{
if (prime != other.prime)
{
throw std::invalid_argument("Cannot add two numbers in different Fields");
}
FieldElem res = *this;
res.num = (res.num + other.num) % prime;
return res;
}
FieldElem FieldElem::operator*(const FieldElem &other) const
{
if (prime != other.prime)
{
throw std::invalid_argument("Cannot multiply two numbers in different Fields");
}
FieldElem res = *this;
res.num = (res.num * other.num) % prime;
return res;
}
FieldElem FieldElem::operator/(const FieldElem &other) const
{
if (prime != other.prime)
{
throw std::invalid_argument("Cannot divide two numbers in different Fields");
}
FieldElem res = *this;
// Using Fermat's little theorem: a^(-1) = a^(p-2) mod p
res.num = (res.num * other.pow(prime - 2).num) % prime;
return res;
}
The pow method uses binary exponentiation to reduce complexity from O(n) to O(log n):
FieldElem FieldElem::pow(cpp_int exponent) const
{
cpp_int n = exponent % (prime - 1);
if (n < 0)
{
n += prime - 1;
}
// Binary exponentiation: O(log n) instead of O(n)
cpp_int res = 1;
cpp_int x = num;
while (n > 0)
{
if (n % 2 == 1)
{
res = (res * x) % prime;
}
x = (x * x) % prime;
n /= 2;
}
FieldElem result(res, prime);
return result;
}
What’s Next
And with that, I’ll see you in the next part where we’ll be talking about Elliptic Curves—the mathematical structures that make Bitcoin’s public-key cryptography possible.