PROLOGUE (abridged)

At the moment I’m still reading up on general topology and working on the article, 0a explains: general topology. At some point in time I decided to have set theory, which initially was one of its segments, in a separate article instead. And just to make things more interesting, I figured I should “dive in” slightly deeper and give an axiomatic view on some of the concepts.

So here you go.


We can't be absolutely certain that anything is real. Did the past actually exist? That may appear to be an absurd question to ask. But how can you be so sure that events you remember really did occur? You can argue that you are able see their consequences in the present, or other people would agree with you that they happened. But that is based on the assumption that reality did not just come into existence a moment ago in a predefined state, together with memory of a non-existed past implanted in your mind.

As a human being, we live our lives presuming that some things are real just because it is more convenient to do so, despite the fact that there is no way to absolutely prove that they are true without making more assumptions.

We do that too in mathematics. An axiomatic system is made up of a collection of mathematical statements (known as axioms) that are defined to be true, so that we can use them to further define mathematical objects (e.g. number, function) and prove statements which we believe to be true (such as the Riemann hypothesis).

Sometimes, an axiom can appear to be stating the obvious. Here is an example:

This is an axiom in Frege’s axiomatic system on propositional calculus.

"$\Rightarrow$" means imply. "$\neg$" means not.

We can read it as:

When I said that “” means imply, it is more of a suggestion on how we should interpret it. After all, “” is nothing more than a symbol for a mathematical concept that exists in a rather abstract way. It doesn’t really have a “meaning” per se.

But since we are humans, it is often useful to “impose” meanings upon mathematical concepts (e.g. “+” means addition). Just bear in mind that when we want to closely examine these concepts, we should take a step back and forget what we know, and perceive them in terms of axioms.

The axiom above is defining a fundamental property about the mathematical concept associated with the symbol “”.

So what exactly is an axiomatic system?

An axiomatic system is a list of undefined terms (each to represent some mathematical object) with a list of axioms that express relations between these terms.

When we have a collection of mathematical objects that are to be represented by these terms, and we start interpreting them (with regard to the axioms), it is said that we have a model of the axiomatic system.

● ● ●

There are many different axiomatic systems that formalize set theory. Before getting started on these axiomatic systems, let’s first look at set theory in a more “naive” way without concerning too much about the formalism.

What is set theory?

Set theory is considerably the heart of modern mathematics. As suggested by its name, set theory is all about the mathematical objects known as sets.

A set can be thought as a collection of things. We refer to these things as the elements of the set.

When talking about the elements in a set, our only concern is their existence. So we don’t really care about the number of identical elements. It either exists, or it doesn’t. The concept of quantity is not important to us.

Neither do we care about the order they are in.

This whole idea of existence being our only concern for elements in a set can be seen in the axiom of extensionality.

The axiom of extensionality is an axiom that’s been used in a couple of axiomatic systems on set theory, including the famous Zermelo–Fraenkel set theory (ZF), von Neumann–Bernays–Gödel set theory (NGB) and New Foundations (NF).

To understand the symbolism, just keep in mind that

"$\in$" means is an element of (or in). "$\forall$" means for any (or for all, or for every, etc). "$\Leftrightarrow$" is the "two-ways imply", it means if and only if.

Thus the axiom can be read as:

Basically what it’s saying is that two sets ( and ) are equivalent as long as every element in is in , and every element in is also in . This appears to be a rather obvious thing to say. But it is necessary to have an axiom like this serving as a foundation for the mathematical idea of set.

With this axiom, we can prove that by stating that each element in is also in , and vice versa: so by the axiom, the ordering of elements does nothing to how we perceive a set.

The same goes to the quantity:

Note: in the above statement works as a variable, in this case it is representing any element of a certain set. Since for any element in , it is also in , and vice versa, so is equivalent to .

More on the axiom of extensionality

An interesting consequence of the axiom of extensionality is that it implies a universe where everything is built up with sets. A universe where the notion of urelements no longer makes sense.

What is a urelement?

A urelements (also called “atom”) is a mathematical objects that is an element of some set, but is not a set itself.

For example, natural numbers are often considered to be urelements. (Provided that you don’t define natural numbers to be sets of certain structure.) Since urelements are not sets, they contain no elements. But they can still be different from one another by their properties:

The axiom of extensionality expresses that the statement

is true for all objects A and B in the universe (hence the universal quantification, ). As long as we have two mathematical objects that contain no elements, they would be identical by definition.

Simply put it, this axiom implies that every mathematical object in the universe is distinguished only by their elements. So you can’t have mathematical objects that contain no element but are different. This is what I mean by the notion of urelements no longer makes sense, and this is where things get really interesting.

In such universe, empty set is the building block of everything. Let’s say we have a set . Here is an example of what and can be:

By the axiom of extensionality, we can see that:

No matter what mathematical objects we come up with, if we look at their elements, the elements of their elements, the elements of the elements of their elements … we will eventually arrive at an empty set. Here is an example

In this universe, natural numbers would have to be built up with empty set too. This is Zermelo’s construction/definition of natural numbers:

We can say that this is the universe the axiom of extensionality has beautifully entailed.

“It is said that the world is empty, the world is empty, lord. In what respect is it said that the world is empty?” The Buddha replied, “Insofar as it is empty of a self or of anything pertaining to a self: Thus it is said, Ānanda, that the world is empty.”

I was exposed to ideas in Buddhism when I was young. Just pointing out an interesting resemblance here.

● ● ●

Defining a set with a statement

Rather than explicitly writing down its elements like this: , a set can also be defined this way:

The statement above can be read as:

Since is the set of all real numbers, and is the set of all rational numbers, here is the set of all irrational numbers.

On set operators

Besides the set operator, which is the primary operator in set theory, there are other set operators too. And they can all be defined in terms of .

Union ( ): getting all elements from two sets.

Intersection ( ): getting elements two sets have in common.

Difference ( ): getting elements from one set, that are not in another sets.

Hence, in the example above, can expressed as a difference:

Complement ( ): getting all elements in the universe that are not in some set.

So can also be defined as a complement, if we define to be the universe:

What is a subset?

A set is a subset of another set when all its elements are in the other set ( Denoted as )

In the example above, is a subset of . We often use “” to denote this relationship:

A subset of a set can be viewed as a set we obtain from selecting elements from another set satisfying some statement.

In ZF, the axiom of specification states that a set of such selection alway exists.

We can read it as:

$\exists$ means there exists. $\land$ means and. $\phi(x)$ here is known as a meta-variable to indicate some statement about $x$.

We often refer to an axiom with meta-variable as an axiom schema. Axiom schema can generate an infinite number of axioms because we can make an axiom out of it by putting any statement into the variable.

In programming, you can visualize an axiom schema as a simply function that returns an axiom:

function make_an_axiom(phi){
     return "Forall A, there exists B such that for all x, if x is in B, then x is in A and " + phi;
 var axiom_schema = make_an_axiom;
 var axiom1 = axiom_schema("x is a negative number");
 // axiom1 would be: "Forall A, there exists B such that for all x, if x is in B, then x is in A and x is a negative number"
 var axiom2 = axiom_schema("x is a prime number");
 // axiom1 would be: "Forall A, there exists B such that for all x, if x is in B, then x is in A and x is a prime number"

In the previous example, is the statement . So elements satisfying this statement, namely the irrational numbers, would be selected.

A set is always a subset of itself. Selecting all elements from a set gives us back the original set.

is the general notation for subset. It can be used for all subsets, while can only be used when the subset is not the set itself.

Empty set is the subset of all sets. Selecting no elements from a set and we would get an empty set.

Axiom of specification in Cantor's Paradise

Cantor is consideredly the founder of modern set theory, due to his 1874’s paper which illustrated a fundamental concept about infinite sets. Cantor’s Paradise is the name for the set theory came up by Cantor in the era before there were any axiomatic systems on set theory.

Cantor developed his theory of set in what we called an “intuitive” approach: he did not formalize the mathematical concepts with formal system such as first-order logic (which is what’s used in ZF and many other axiomatic systems). His set theory is in a sense a paradise due to its simplicity and straightforwardness. A paradise where, back in the early 20th century, many people had comfortably settled in with no plan to leave, even though at that time it was becoming clearer and clearer that such approach to developing a theory of set would result in paradoxes.

Here is the concept of “specification” in Cantor’s paradise formalized into an axiom for comparison with the axiom of specification in ZF.

Axiom of specification in Cantor’s paradise

Axiom of specification in ZF

Apparently, Cantor’s version allows a set to be constructed by all elements in the universe that satisfied . It does not restrict the elements of selection be in a particular set (hence instead of ).

This would lead to what's known as the Russell Paradox.

Let’s say we call all sets that don’t contain themselves normal set, and define a set , that contains all normal sets. We would realize that if is itself a normal set, must contain itself. But that would make no longer a normal set (since a normal set does not contain itself).

So we conclude that shouldn’t contain itself. And that would mean is a normal set…

We would end up having this absurd statement about :

And it’s derivable from the axiom of specification that exists.

The Barber paradox is an alternative form of the Russell Paradox. Instead of talking about a set that contains all the sets that don’t contain themselves, the Barber paradox talks about a barber who only shaves men who do not shave themselves.

In ZF, its axiom of specification avoided this paradox by restricting the selection process from all elements in the universe to only elements in a certain set.

So this axiom of specification only guarantees the existence of a set made up of elements from a set that’s already defined.

We can’t just “squeeze” into before is defined. So can’t contain . The paradox can’t occur.

On the other hand, to avoid this paradox, Russell invented a theory of “type” (and included it in Principia Mathematica, a book he co-wrote with Whitehead). It basically states that every set has a “type number” based on what it contains.

In this universe, urelements exist. And they exist with type number 0. Sets containing urelements are type 1 objects. Sets containing type 1 objects are type 2 objects. And so on. We can only define a set by first having objects of lower types. This hierarchy of types prevent a set from containing itself because self-referencing is not possible in a system like this.

Von Neumann–Bernays–Gödel set theory (NGB) extended ZF by introducing the concept of class. Class is basically a collection of things, just like sets in ZF. Sets in NGB are defined to be classes that are elements of other classes. So we end up having two types of classes: set and “proper class”. A “proper class” is simply a class that is not an element of any class. “Proper classes” can contain all sets that satisfy some statement. This would not result in Russell Paradox becaues “proper class” is by definition not a set. Just as we can’t define a set to contain all sets in the universe which satisfy some statement, we can’t define a class to contain all classes which satisfy some statement.

● ● ●

Author's Note

This marks the end of our journey into the world of axiomatic systems on set theory (which are often referred to as “axiomatic set theories”, since each of them builds up a slightly different theory of set).

For the maths students who happen to be reading this article, you may find this rather short journey unsatisfactory. Many interesting things are not covered - the Gödel’s incompleteness theorem, Skolem’s paradox, the axiom of choice, well-foundedness, Aczel’s anti-foundation axiom (AFA), etc - basically stuff that you would expect to see in textbooks on axiomatic set theory and logic. Please keep in mind that this is in no mean an attempt to be a comprehensive and thorough guide on axiomatic systems.

This article is more for those who would like to take a glimpse into what an axiomatic system is and how a theory of set can be constructed from axioms. If you are one of these curious mind, I hope you are enjoying your expedition so far ;)

We shall now proceed to other concepts in set theory.

● ● ●

What is an ordered pair?

An ordered pair is a mathematical object that contains 2 elements wherein there is an order.

So unlike sets, ordered pairs are distinguishable by the order of their elements.

Formally, an ordered pair is defined to be a set of this structure:

As you can see, only when , which means .

The set of all possible ordered pairs between two sets is known as the Cartesian Product.

It is often denoted with a cross (like the one used for multiplication).

What is a tuple?

Tuple is the generalization of ordered pair. It is a mathematical object containing elements wherein there’s an order.

and here are 3-tuples. An ordered pair is a 2-tuple.

A -tuple can be defined recursively like this

when we have one element, for simplicity, we would define it as .

As we can see, the definition of 2-tuple (an ordered pair) above can be derived from the recursive definition.

This is actually known as Kuratowski definition of ordered pair. (The recursive defintion of -tuple is a generalization of it.)

There’re other ways of defining an ordered pair too. Here is Hausdorff’s definition that uses natural number:

We can once again generalize it for the definition of -tuple:

One with a curious mind may ask, which of these definitions should be used? And my answer to her or him is: it’s all up to you. It really doesn’t matter which one you pick. You can just come up with your own definition if you like. What’s important is to have mathematical objects that are distinguishable not only by the elements in them, but also the order the elements are in.

What is a relation?

A relation is basically a set of -tuples, each formed by elements from sets.

Here, is a binary relation between and . We call it a binary relation when it’s between 2 sets.

A Cartesian Product, for example, is also a binary relation. Actually, any binary relation between 2 sets is a subset of their Cartesian Product. above is a subset of the Cartesian Product of and .

Here is an example of relation between 3 sets.

What is a function?

A function is a relation in which no two -tuples have their first element(s) the same.

above is a 1-ary (or single-input) function. For any 1-ary function, the first element (which plays the role of “input”) has to be unique.

If we are to have a 2-ary function, our first 2 elements in each tuple must then be unique.

Often, we would write:

to express that a particular -tuple exists in the function.

For example:

And often, we use “” to express which sets the function is a relation between.

Here (whose elements play the role of “input”) is called the domain, while is called the codomain.

It is always true that for every element in the domain, there exists a tuple whose first element is that element. But it is not necessary the case that for every element in the co-domain there exists a tuple whose second element is that element. Take any for example.

For function with more than 1 input, the domain would be expressed as a Cartesian product of two or more sets:

Normally, we would define a function with a statement.

This can be translated into

So to be more specific, we can state that above maps from the set of real numbers to itself:

Or the set of integers to itself:

In the case, we would get because is not an integer, it is not in the domain.

Functions can be classified into 4 types:

  1. not injective & not surjective
  2. injective & not surjective
  3. surjective & not injective
  4. injective & surjective

When a function is injective, each element in X is mapped to a unique element in Y.

We often refer to the set of elements being mapped to as image. (An image is always a subset of the codomain)

When a function is surjective, each element in Y is mapped by a element in X.

For a surjective function, the codomain is equivalent to the image.

If a function is both subjective and injective, we call it bijective. In a bijective function, each element in is mapped to a unique element in and no element in is “unmapped”.

A function only has an inverse (often devoted as of ) if it is bijective.

● ● ●

What is a cardinal number?

A cardinal number of a set can be viewed as the number of elements in the set. (denoted with )

The idea of “number” gets really fussy when we have sets that contain infinitely many elements. So formally, we say that two sets have the same cardinal number (or the same cardinality) only when there is a bijective function between them.

That’s to say, when , we can construct a set of ordered pairs, each made up of a unique element from and individually, for all elements in and .

On the idea of countable, infinite sets and their cardinal numbers

A set is considered to be “countable” when it has the same cardinality as a subset of .

Or, to put it another way, a set is countable when there is an injective function from it to .

It’s pretty obvious that all finite sets (sets with finite number of elements) are countable.

Other than , there’re sets containing infinitely many elements (often referred to as infinite sets) that are countable too. The set of all integers, , for example, is countable. And interestingly, the two sets have the same cardinality.

is the symbol used to represent this cardinal number. (As we can see, the cardinal number of infinite sets can no longer to represent by a natural number.)

To prove that , we only need to show that there’s a bijective function from to . And this is the bijective function:

This is bijective because we can just keep feeding in natural number to this function for it to output every integer:

Every natural number is mapped to precisely one integer. All integers are mapped as there’re infinitely many natural numbers.

Actually, all infinite subsets of has as the cardinal number.

But for any two infinite sets, do they always have the same cardinal number? Interestingly, the answer is no. In this sense, there are some infinities that are bigger than other infinities.

Infinite sets with a bigger cardinal number than are “uncountable” or not “countable” (by definition).

The idea of “uncountable” can be demonstrated in what’s known as Cantor’s diagonal argument.

Let’s say we have an infinite set which contains all the different binary (every digit is either 0 or 1) strings of infinite length.

Now let’s say we have another set, , that contains all binary strings of infinite length enumerated by a function like this:

Apparently, this is a bijective function from to . has the cardinal number .

Now we can take the 1st digit of the 1st element, , flip it to a different value (0 to 1 or 1 to 0), take the 2nd digit of the 2nd element, , does the same thing to it … and get the -th digit from every element to construct a binary string. We would end up with an infinitely long binary string that is different from every infinitely long string in .

. And the same thing can be done to every set whose element is enumerated by . So we conclude that no matter how these binary strings are listed (using a bijective function with domain ), we would always be able to construct a new permutation that’s not in the list. In other words, enumeration (using ) cannot capture every permutation of infinitely long strings. We can’t list down every element in , the set of all possible binary strings of infinite length.

In this sense, is “uncountable”. There exists no bijective function from to .

On power sets

A power set of a set is the set of all its subsets. (Denote with )

A power set of a set with a cardinal number has a cardinal number .

For this reason, sometimes, a power set of set X is denoted as .

Cantor’s theorem states that a power set of a set always has a bigger cardinal number than the set.

This is true for infinite sets too. So there are infinitely many different-sizes infinite sets.

The chapter below has been edited on 19th Feb, 4:08 pm EST time. Great thanks to some useful comments on Reddit.

On Aleph number and Beth number

The cardinal number of infinite sets are often expressed in Aleph number ($\aleph_n$) or Beth number ($\beth_n$) .

in the example above is the smallest Aleph number.

So an infinite set is “uncountable” when its cardinality is not .

is simply defined to be the next cardinal number bigger than

That is to say, there is no cardinal number between and .

Meanwhile, the smallest Beth number is , and by definition it is equivalent to .

Here is the recursive definitions for Beth numbers with bigger ordinals (bigger subscript ),

has as the cardinality since there is a bijective function between and .

The continuum hypothesis (CH) states that

This basically “closes the gap” between and .

this was added in on 20th Feb Just roughly 70 years ago, Gödel showed that CH cannot be disproved in ZF & ZFC (ZF with the axiom of choice).

Two decades later, Cohen showed that no contradiction would arise if CH is defined to be false in ZF & ZFC.

CH is therefore considered to be independent of ZF and ZFC.

- The End -

many typos & careless mistakes were corrected and the last chapter was updated again on 20th Feb, EST 5:15 am. Huge thanks to f_of_g from Reddit for thoroughly pointing them out. Truly appreciate that.

The last chapter was later on edited again on 24th Feb, EST time 4:25pm