 Oddtown Chronicles

Hola!

Today, I will be talking about a classic puzzle that uses linear algebra to attack a combinatorics problem. I first learned of this puzzle during our research group seminar here at UNM, where Jared presented this problem to us as an exercise. Later on, I read through a couple of formalizations for the same, and will now be presenting my take on the problem taking helpful references from [MIT307], [UC726] and [UC07]. My interest in this problem originates from the fact that upon first hearing the problem statement, it doesn’t strike to be something that can be made easier using linear algebra, however, the approach baffles my mind every time I think about it. Let’s dive in.

The problem, which is referred to as Elwyn Berlekamp Theorem in [UC726], is as follows: In Oddtown, there are $n$ citizens and $m$ clubs satisfying the rules that each club has an odd number of members and each pair of clubs shares an even number of members. We then need to show that $m \leq n$, i.e. the number of clubs cannot be more than the number of citizens.

Before proceeding to the proof, I must point out that this problem has a close connection to Fisher’s inequality [MIT307], which is similar to a slight modification of the original problem. Apart from this, it is related to the study of designs, set systems with special intersection patterns. [MIT307] shows how such a system can be used to construct a graph on $n$ vertices, which does not have any clique or independent set of size $\omega \left( n^{1/3} \right)$. I will briefly talk about these results towards the end of this post.

Let us now focus on proving that the number of clubs in Oddtown cannot exceed the number of citizens. Formally, let $\mathcal{C} = \{ C_1, \dots, C_m \}$ be the set of clubs in Oddtown and $\mathcal{P} = \{ P_1,\dots,P_n \}$ be the set of citizens (people). We start by paying attention to the fact the no two clubs are allowed to share an odd number of members. The entire problem is full of even-odd type of constraints, which suggests that we should attack this problem using some form of parity checking. The easiest such setting is to work in the field with characteristic 2. In other words, we will perform all addition and multiplication operations modulo 2 from now on.

Within this setting, for each club $C_i \in \mathcal{C}$, define its membership vector $v_i = \left[ v_{i,1},\dots,v_{i,n} \right] \in \mathbb{Z}_2^n$, where $v_{i,j} = 1$ if citizen $j$ is a member of club $i$, and 0 otherwise. Intuitively, this is similar to each club maintaining a ledger in which all the citizen names are listed and only those names are marked that correspond to member citizens for that club. These ledgers must satisfy the property that for any pair of clubs, if we compare their ledgers, the number of common members must be even (which allows no common members as well). Mathematically, we can represent this constraint using dot product of the membership vectors of the clubs.

Notice that if the number of citizens common to any given pair of clubs is even, then in our field, the dot product of the membership vectors of these clubs will be zero. Hence, in the magical world of modulo 2, all clubs have membership vectors that are orthogonal to each other.  More importantly, note that none of these vectors is identically zero since the number of members in each club must be odd. Hence, the puzzle now reduces to asking the maximum number of pairwise independent vectors that exist in $\mathbb{Z}_2^n$. This is where our friend, linear algebra, kicks in.

Form a matrix $M \in \mathbb{Z}_2^{n \times m}$ whose columns are the membership vectors of the clubs. Note that since all columns are pairwise independent and none of them are identically zero, the rank of this matrix is exactly $m$. However, we know that the rank of any matrix must never exceed the number of rows or the number of columns. Voila! We’re done. The number of rows in this matrix is $n$, which immediately proves that $m \leq n$. Try holding your right ear using your left hand while swirling your arm around the back of your head. I felt exactly like that my first time with this puzzle!

Another way to see this without forming the matrix is by noticing that the number of pairwise independent vectors in $\mathbb{Z}_2^n$ cannot be more than the dimension of $\mathbb{Z}_2^n$, which is $n$. Hence, the same result as above. No matter how you prove it, we always obtain the condition we wanted to prove. See! that’s why we must befriend linear algebra.

A small technicality. Can we have exactly $n$ clubs, if not more? The answer is, trivially, yes. Since the number of common members to any pair of clubs must be even, set it to zero! Have a dedicated club for every man and you’re done. If you call this cheating, think about this. How cool will be it to have your own clubhouse?!

Maximality

Given this construction above, we can prove some more exciting results about the fate of people in Oddtown. Specifically, we can prove that for every $n$, we can always save the Mayor’s money by building no more than two clubs that satisfy all the constraints. Furthermore, we can prove that this system is maximal, in the sense that no more clubs can be added without violating at least one of the conditions. Let’s see how.

So the task at hand is to divide the $n$ citizens into 2 clubs, say $C_1$ and $C_2$, such that (1) the number of citizens common to both the clubs is even, (2) the number of citizens in each club is odd and 3) adding even one more club will violate (1) and (2). If $n$ is even, then one way to divide the citizens is to allocate one citizen to the first club and the remaining citizens to the second club. This satisfies (1) and (2). To see if this satisfies (3), add another club, say $C_3$ to the town. Since this club must have even number of common members with the other two clubs,  the number of common members between $C_3$ and $C_1$ must be zero. This requires all the members in $C_3$ to also belong to $C_2$, which immediately gives a contradiction since (2) will require this number of common members to be even while (1) requires them to be odd. Hence, this distribution of citizens into two clubs is maximal.

What happens when $n$ is odd? Trivially, put all the members into one club and the problem is solved. The second club is not needed at all. The addition of any more clubs will not be possible because of the same argument as above. Hence, one club is maximal in this case. Thus, in both the cases, no more than two clubs are required for a maximal membership of citizens.

Fisher’s inequality

We now discuss a result that is closely related to a slight modification of the rules in Oddtown. We remove the restriction on the size of the clubs and require that every two clubs share a fixed number $k$ of members. We assume that if two clubs have exactly the same members, then they are the same. Fisher’s inequality then states that the number of non-empty clubs is at most $n$, similar to the result above. The proof of this inequality is slightly involved, although the basic principle is the same. We consider the membership vectors of the clubs and prove them to be linearly independent in some field, which in this case will be the field of real numbers $\mathbb{R}^n$.

To see how this proof works, an important observation needs to be made : There is at most one club with exactly $k$ members. Wondering why? Well, let’s assume otherwise and try to get a contradiction. Let there be at least two clubs with exactly $k$ members. Then each of these must have the same members by the condition of the problem. This contradicts the fact that these clubs are distinct (because of our assumption) and hence, we have proved that at most one club can have exactly $k$ members.

Now, with this critical observation in hand, we proceed with the proof as follows. Let $\mathcal{C} = \{ C_1,\dots,C_m \}$ be the clubs of size $|C_1|,\dots,|C_m|$, respectively (assuming each of these is non zero). The size here refers to the number of members in the clubs. Represent by $C_i \cap C_j$ the set of common members in clubs $C_i$ and $C_j$. Then, according to the given problem, $\left| C_i \cap C_j \right| = k$ for each pair $i,j \in \{ 1,\dots,m \}$ with $i \neq j$. We define the membership vectors of each club similar to the proof above as $v_i = \{ v_{i,1},\dots,v_{i,n} \}$, where $v_{i,j} = 1$ if citizen $j$ belongs to club $i$ and 0 otherwise. Clearly, in $\mathbb{R}^n$, we have the dot product of any $v_i$ and $v_j$ to be $k$, if $i \neq j$. All we have to do now is to prove that these membership vectors are linearly independent in $\mathbb{R}^n$.

To see this, assume they are not. Then, using standard practice in such proofs, assume there exist real numbers $\alpha_1,\dots,\alpha_m$ such that $\sum_{i=1}^m \alpha_i v_i = \mathbf{0}$, where $\mathbf{0}$ is the zero vector in $\mathcal{R}^n$. Then, we must also have $\parallel \sum_{i=1}^m \alpha_i v_i \parallel_2 = 0$, where $\parallel v \parallel_2$ denotes the 2-norm of the vector $v$. Hence, we have $0 = \left( \sum_{i=1}^m \alpha_i v_i \right) \cdot \left(\sum_{i=1}^m \alpha_i v_i \right)$, since the 2-norm can be written as the dot product of the vector with itself. We can rewrite this sum as $0 = k \left( \sum_{i=1}^m \alpha_i \right)^2 + \sum_{i=1}^m \alpha_i^2 \left( |C_i| - k \right)$. Using our critical observation now, each of the two terms on the right of this equation are non-negative, which implies that they are both identically zero.

Hence, we have (1) $\left( \sum_{i=1}^m \alpha_i \right)^2 = 0$, which implies $\sum_{i=1}^m \alpha_i = 0$, and (2) $\sum_{i=1}^m \alpha_i^2 \left( |C_i| - k \right) = 0$. From (2), since at most one club, say $C_r$ can have exactly $k$ members, each $\alpha_i = 0$ whenever $i \neq r$. However, then, from (1), $\sum_{i=1}^m \alpha_i = \alpha_r = 0$, which contradicts the fact that the membership vectors are linearly dependent. Hence, Fisher’s inequality holds.

Fisher’s inequality has a number of cool applications. Two interesting examples are presented in [MIT307], where they prove the following : (1) For a fixed $k$, let $G$ be a graph whose vertices are triples $T \in {[k] \choose 3}$ and $\{ A, B \}$ is an edge if $|A \cap B| = 1$. Then $G$ does not contain any clique or independent set of size more than $k$. (2) Suppose $P$ is a set of $n$ points in the plane, not all on one line. Then pairs of points from $P$ define at least $n$ distinct lines.

I hope you liked this post of mine. My next post will likely be on another interesting puzzle, maybe from one of the CACM journals. Until then, stay tuned. Ciao!

References

[MIT307] http://math.mit.edu/~fox/MAT307-lecture15.pdf