# Notorious Coins

Have you ever read a problem in an exam and never forgotten it, because it gives you chills and makes you want to solve it every time you come across it? Well, the geek in me found one such problem in the final exam I took with my advisor Jared in the course CS561 (Data Structures and Algorithms) at UNM. I wanted to share this awesome problem with you, with the hope that you will find it as challenging and intriguing as I did. So here is what it says.

You and an opponent are playing a game using a row of n coins of values $v_1, \dots, v_n$ where $n$ is even. Players take turns selecting either the first or last coin from the row, removing it from the row, and receiving the value of the coin. Assume you play first. Following are some examples, assuming optimal play for both players:

• 2, 4, 8, 10 : You can collect maximum value 14 (10 + 4).
• 8, 20, 3, 2 : You can collect maximum value 22 (2 + 20).

Assume that your opponent does not always play optimally. In particular, if $k$ coins remain, then they choose the optimal move with probability $p_k$ (for example $p_k$ may decrease as $k$ grows). Describe an algorithm to optimize your expected winnings.

Sounds challenging, right! This was one of the bonus problems in the final exam, which still gives me chills every time I attack it. However, with a very clever use of dynamic programming, this problem can be attacked in a nice way. I will focus first on the easy case, where $p_k = 1$, meaning the opponent always plays optimally. For this easy case, before jumping to the dynamic programming solution, let us first try to convince ourselves why a greedy approach won't work (this is standard practice with dynamic programming problems).

## Greedy (non-optimal) solution for $p_k = 1$

To see why a greedy solution won’t work for this problem, we have to find an example in which the greedy solution fails to provide an optimal solution. I will formalize my proof once we understand this example.

So, the problem at hand is to devise a sequence of coin values on which the greedy approach fails. Recall that the greedy approach picks the coin of highest value from either end of the sequence, alternating between the players. Assuming I go first, consider the sequence $S = 8, 20, 3, 2$, the same sequence as in the example above. The greedy approach makes me pick coins with values 8 and 3, making my total value 11. However, as pointed out above, the optimal value is 22, which is significantly higher. KMN!

So why does the greedy approach not work? To see this, we use the trick of constructing, for any given value $v$, a sequence $S$ that gives me a total of $v$ if I use the greedy approach, but a value $v' > v$ under optimal play. This proves that we will need to dig a bit deeper to solve this problem. Here is how this construction works.

Let $v > 4$ be given as the value. (Note: the lower bound on $v$ is just a technicality; the solution generalizes easily to all $v > 0$, although I will spare you that discussion here. Also, I am assuming that coin values can be arbitrary rational numbers and not necessarily integers. However, I again claim that extending the solution to integer coin values is fairly trivial, so I leave that discussion out of this post.)

Consider this sequence : $\left[ \frac{v}{2} -1, \frac{v}{2} - 1, \frac{3v}{4}, \frac{v}{2}+1 \right]$. For example, given $v = 40$, the sequence we are constructing is $[19, 19, 30, 21]$. Using the greedy solution, the total value I obtain is 21 + 19 = 40, as required. However, an optimal strategy will yield me a value of 19 + 30 = 49. Generalizing this, the greedy strategy yields a total value of $\frac{v}{2} +1 +\frac{v}{2} -1 = v$, while the optimal play yields $\frac{v}{2} -1 + \frac{3v}{4} = \frac{5v}{4} - 1$, which is more than $v$ when $v > 4$. So, what on Earth is this optimal strategy!!
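
To sanity-check the counterexample, here is a quick sketch (my own illustration, with a made-up name `greedy_total`) in which both players greedily grab the larger end coin and I move first:

```python
def greedy_total(coins):
    """My total when both players greedily take the larger end coin (I go first)."""
    coins = list(coins)
    mine, my_turn = 0, True
    while coins:
        # greedy choice: whichever end coin is larger (left wins ties)
        pick = coins.pop(0) if coins[0] >= coins[-1] else coins.pop()
        if my_turn:
            mine += pick
        my_turn = not my_turn
    return mine
```

On $[8, 20, 3, 2]$ this returns 11, and on the constructed sequence $[19, 19, 30, 21]$ it returns 40, matching the values above.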

[Enter dynamic programming.]

## Dynamic programming solution for $p_k = 1$

In cases of optimization like these, if greedy doesn't work, dynamic has to. Of course, this comes at the cost of efficiency and the requirement of precomputing the entire solution up front, but hey, correctness is way more important than efficiency here. So, let's take a deep breath and dive in.

Let $n > 0$ and $S = [v_1,\dots,v_n]$ be a sequence of coin values. Assuming I play first, define $f(i,j)$ to be the maximum value of the coins I can obtain using an optimal play in the sequence $[v_i,\dots,v_j]$ only. Clearly, if $i > j$, then $f(i,j) = 0$ and when $i = j$, then $f(i,j) = v_i$. Another easy case is $f(i,i+1) = \max \{ v_i, v_{i+1} \}$. We are interested in the case when $i < j-1$. Let me first state the solution and then justify it:

$$f(i,j) = \max \left\{ v_i + \min \{ f(i+2,j),\, f(i+1,j-1) \},\; v_j + \min \{ f(i+1,j-1),\, f(i,j-2) \} \right\}.$$

High five if you just rolled your eyes! Understanding this solution isn’t hard if you keep in mind that both players are playing optimally. This means both of them are trying to maximize their total values. Hence, when I am presented with a choice of two coins, I need to look ahead into what possible strategies my opponent can play next, and then choose a move that maximizes my gain over all those strategies. The max in front of the solution justifies this fact. I can either choose coin $i$ to gain value $v_i$ or I can choose coin $j$ to gain value $v_j$. This choice will depend on what my opponent may do next. Let us discuss these two cases separately.

Case I : If I choose coin $i$, my opponent has to choose a coin from $[v_{i+1},\dots,v_j]$. If he chooses $v_{i+1}$, I will then be making my choice from $[v_{i+2},\dots,v_j]$, in which case I gain $f(i+2,j)$, by definition. Else, if my opponent chooses $v_j$, I will have to make my choice from $[v_{i+1},\dots,v_{j-1}]$, in which case I gain $f(i+1,j-1)$. Since the play is optimal for both of us, I assume the worst case for his choice and take the minimum over these two cases. Hence, the equation on the top.

Case II : Works similarly. If I choose coin $j$, my opponent chooses from $[v_i,\dots,v_{j-1}]$, and I gain $v_j$ plus the minimum of $f(i+1,j-1)$ and $f(i,j-2)$, since he will again leave me the worse of the two subsequences.

Clever, right! The beauty of this solution lies in the fact that even without knowing what my opponent will choose, just knowing his strategy lets me compute exactly what he will go for. Note that using techniques like memoization, I can precompute my optimal value in $\mathcal{O}(n^2)$ time, since there are $\mathcal{O}(n^2)$ subproblems $f(i,j)$ and each takes constant time given the smaller ones. Once I have this precomputed solution, all I need to do is make my evil laugh! Bwahahaha!!!
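
To make the recurrence concrete, here is a short memoized sketch (my own illustration, not the exam's reference solution):

```python
from functools import lru_cache

def max_value(coins):
    """Maximum total the first player can guarantee when both sides play optimally."""
    n = len(coins)

    @lru_cache(maxsize=None)
    def f(i, j):
        if i > j:
            return 0
        if i == j:
            return coins[i]
        if i == j - 1:
            return max(coins[i], coins[j])
        # Case I: take coin i; the optimal opponent leaves me the worse subgame.
        take_left = coins[i] + min(f(i + 2, j), f(i + 1, j - 1))
        # Case II: take coin j.
        take_right = coins[j] + min(f(i + 1, j - 1), f(i, j - 2))
        return max(take_left, take_right)

    return f(0, n - 1)
```

On the examples above, `max_value([2, 4, 8, 10])` returns 14 and `max_value([8, 20, 3, 2])` returns 22, and on the constructed sequence `max_value([19, 19, 30, 21])` returns 49.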

## Not-so-optimal opponent

All good so far, but beware! Common sense is not so common in this world. Finding such an intelligent player may be rarer than we want. So, why not give the poor opponent some slack and allow him to deviate from optimality with some probability? More formally, let $p_k$ be the probability that the opponent plays optimally when $k$ coins remain, and suboptimally otherwise.

An important assumption in computing my expected gain here is that I know $p_k$ ahead of time for all $k$. If not, I may be in trouble. We can argue about how practical it is to assume this, but we can always engineer the game to make this assumption hold: I will simply let my opponent flip a coin in which the probability of heads is $p_k$ when he has to choose from $k$ coins. Refer to my previous post to see that I can always have such a coin with me.

Now, assuming that I have found a friend who is ready to bow to my rules of this game, how can I compute my expected value at the end of the game? Well, not so tricky this time! Whenever the opponent deviates from optimality, the min in the equation is simply replaced by a max, and we weight the two outcomes by their probabilities. Voila! Here we go. Thus, our final solution is that $\mathbb{E}(f(i,j)) = 0$ when $i > j$, it is $v_i$ when $i=j$, and $\max \{ v_i, v_{i+1} \}$ when $i = j-1$. Remember that I am still playing optimally. For $i < j-1$, noting that the opponent faces $j-i$ coins after my pick, we now have the following. (Here, $\mathbb{E}(f(i,j))$ stands for the expected value of $f(i,j)$.)

$$\mathbb{E}(f(i,j)) = \max \Big\{ v_i + p_{j-i} \min \{ \mathbb{E}(f(i+2,j)), \mathbb{E}(f(i+1,j-1)) \} + (1-p_{j-i}) \max \{ \mathbb{E}(f(i+2,j)), \mathbb{E}(f(i+1,j-1)) \},\; v_j + p_{j-i} \min \{ \mathbb{E}(f(i+1,j-1)), \mathbb{E}(f(i,j-2)) \} + (1-p_{j-i}) \max \{ \mathbb{E}(f(i+1,j-1)), \mathbb{E}(f(i,j-2)) \} \Big\}$$
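
Here is how this expected-value recurrence might look in code; a sketch under the assumption that $p$ is supplied as a mapping $k \mapsto p_k$ (an interface I am inventing for illustration):

```python
from functools import lru_cache

def expected_winnings(coins, p):
    """Expected value I collect playing optimally against an opponent who,
    facing k coins, plays optimally with probability p[k] (hypothetical mapping)
    and takes the other end otherwise."""
    n = len(coins)

    @lru_cache(maxsize=None)
    def E(i, j):
        if i > j:
            return 0.0
        if i == j:
            return float(coins[i])
        if i == j - 1:
            return float(max(coins[i], coins[j]))
        k = j - i  # coins remaining when the opponent moves

        def after_opponent(a, b):
            # optimal play leaves me min(a, b); the deviation leaves me max(a, b)
            return p[k] * min(a, b) + (1 - p[k]) * max(a, b)

        return max(coins[i] + after_opponent(E(i + 2, j), E(i + 1, j - 1)),
                   coins[j] + after_opponent(E(i + 1, j - 1), E(i, j - 2)))

    return E(0, n - 1)
```

As a sanity check, with $p_k = 1$ for all $k$ this reduces to the optimal-play recurrence: on $[8, 20, 3, 2]$ it gives 22.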

Double high five if you rolled your eyes again! One more crazy step? Why assume that I play optimally? I have to be realistic, right? Similar to what I am assuming for my opponent, let me assume that with probability $q_k$, I play optimally when given $k$ coins, and suboptimally otherwise. I will then use a similar trick: I take the max in front of the equation above with probability $q_{j-i+1}$, and the min otherwise.

We're done! So much for choosing coins! As you saw, dynamic programming has the potential to make us rich, of course at the cost of going through the math. An interesting question before I close this post: what if I do not know $p_k$ and/or $q_k$? As stated earlier, I can engineer my opponent to play such that I know the probability of optimal play every time he makes a choice, but to be more considerate of his feelings, I need to make myself more adaptive to his moves. Every time I realize that he is not picking the coin which the optimal strategy suggests, I will have to revise my estimate of his probability of erring on the side of non-optimality. We have to agree that such a strategy will only work when the number of coins is large to begin with, in which case I can hope to come close to guessing his exact probability of erring. In any case, realism is not a friend of mathematics, especially when it comes to modeling human behavior.

Thanks for reading this post. Stay tuned for my next post, which is likely to be on some cool application of linear algebra in a puzzle type setting. Until then, ciao!

# “Irrational” Coin Flips

Recently I came across this very interesting problem: How can you simulate an unbiased coin flip with a biased coin? And what about the other way round?

Although the problem is not new (to me as well), this time I found myself with a good solution, which I thought would be nice to share here. I will provide my implementation (in Python 3) and also discuss some theoretical aspects of the more interesting question: What is the minimum number of biased coin flips required to generate an unbiased flip? I will be referring to the paper [MU08] for this.

## Unbiased flips from biased coins

So, let's talk about the easy case first: given a biased coin, how do I simulate an unbiased flip? A slight formalization: let a coin flip be either $0$ or $1$, where $Pr(1) = p$ and $Pr(0) = q = 1-p$ for some $p \in (0,1)$. A flip is said to be unbiased if $p = 1/2$, and biased otherwise.

We have two sub-cases here: first, when $p$ is not known to us, and second, when it is. For the first case, we have to find a way to simulate an unbiased flip without making any assumption about $p$ in our algorithm. An important motivation for dealing with such a case is working with sources of randomness whose accuracy is not known. In this regard, I will briefly discuss the following question: how can we deal with biased coin flips on a bit-precision computer? In other words, $p$ is never really a perfect real number on modern computers, due to finite precision; we can only represent it to within the smallest floating-point value the machine's word size allows. So, how does our solution change in such a situation? It turns out that the solution for unknown $p$ works here as well. Try reasoning this out yourself (or leave a comment otherwise) after reading the algorithm below.

Consider flipping the biased coin twice. Note that among the four possible outcomes, the events $01$ and $10$ are equally likely. Each occurs with probability $pq$. Thus, the following algorithm immediately becomes a candidate solution to our problem.

```python
def unbiasedFlipCase1():
    # flip the biased coin twice
    (x, y) = (flip(), flip())
    if x == y:
        # 00 and 11 give no usable symmetry; discard and retry
        return unbiasedFlipCase1()
    else:
        # 01 and 10 are equally likely, each with probability pq
        return x
```

Here, $\texttt{flip()}$ is a function that returns $1$ with probability $p$ and $0$, otherwise. Recall that $p$ is unknown to us, and we don’t care about it either as long as we are assured that all calls to $\texttt{flip()}$ are identical and independent of all previous calls. To see why this solution works, it will suffice if we show that $Pr(\texttt{unbiasedFlipCase1() = 1}) = 1/2$. This can be easily seen through the geometric sum $\sum_{k \geq 0} \left( p^2 + q^2 \right)^k pq = \frac{pq}{1 - \left( p^2 + q^2 \right)} = \frac{1}{2}$.

Neat, right! So the next time someone flips a coin for you to make some decision and you are not sure you trust the coin, flip it twice, do as the above algorithm suggests, and you can always be sure of an unbiased outcome. One can also see that the expected number of recursive calls before this function returns is $\sum_{k \geq 0}2k \left( p^2 + q^2 \right)^k pq = \frac{p^2 + q^2}{2pq} = O\left(\frac{1}{p(1-p)}\right)$. Can we do better than this? It turns out that if $p$ is unknown, we probably cannot.

Let us now turn our focus to the known-$p$ case. Wondering why I am discussing this after the seemingly harder case? The reason(s) will be apparent shortly. Note that when we know $p$, we can take advantage of this knowledge to produce an unbiased bit with a smaller expected number of flips than the above algorithm, which also works perfectly fine here. A little math before we proceed with this thought: the variance of the number of recursive calls of the above function is easily calculated to be $\frac{p^2 + q^2}{4(pq)^2}$. Hence, by the Chebyshev bound (see [AS72]), the probability that the number of recursive calls differs from the mean by more than $k$ is at most $\frac{p^2 + q^2}{4(pqk)^2}$, which decreases quadratically in $k$. Hence, it is unlikely that this number deviates too much from the mean, implying that the solution is efficient. But is it the most efficient we can get? The answer is negative. We can do much better than this. Let's see how.

Recall the concepts of entropy and information from any of your undergraduate probability theory classes. The entropy of a biased coin flip is given by $H(p) = -p\log p - (1-p)\log (1-p)$, which denotes the average information gained from one flip of this biased coin. You can now see where I am going with this. [MU08] shows that this information is the most we can extract from a biased coin, which implies that about $\left \lceil \frac{1}{H(p)} \right \rceil$ flips of the biased coin are needed to produce one unbiased flip. (Note that [MU08] was not the first to prove this result; however, I find the discussion in this report very approachable.)

For $p = \frac{1}{2}$, $H(p)$ attains its maximum value of $1$, which gives $\left \lceil \frac{1}{H(\frac{1}{2})} \right \rceil = 1$. Hence, one coin flip suffices, which matches intuition. As $p$ deviates in either direction, $H(p)$ decreases and hence the number of biased coin flips increases. Once we have these flip outcomes, a decision is made to return $0$ or $1$ using the Advanced Multilevel algorithm from [MU08]. Comparing $\left \lceil \frac{1}{H(p)} \right \rceil$ with $\left \lceil \frac{p^2 + q^2}{2pq} \right \rceil = \left \lceil \frac{2p^2 - 2p + 1}{2p(1-p)} \right \rceil$, here is the plot I obtained.
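
The comparison behind the plot is easy to recompute; a minimal sketch of the two quantities (the function names are my own):

```python
import math

def H(p):
    # binary entropy of a coin with Pr(1) = p
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def flips_entropy(p):
    # ceil(1 / H(p)): the entropy-based flip count
    return math.ceil(1 / H(p))

def flips_rejection(p):
    # ceil((p^2 + q^2) / (2pq)): expected cost of the pair-rejection scheme above
    q = 1 - p
    return math.ceil((p * p + q * q) / (2 * p * q))
```

For instance, at $p = 0.1$ the entropy bound gives 3 flips while the rejection scheme needs 5, and at $p = \frac{1}{2}$ both give 1.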

As you can see, the entropy-based solution never requires more biased coin flips. Here, I have only varied $p$ in increments of $\frac{1}{100}$, but the graph remains similar for finer granularities as well. However, the entropy-based algorithm is quite complicated to understand (at least for me). If you have any interest in coding theory or information theory, help me understand it!

## Biased flips from an unbiased coin

Let us now look at the case of simulating a biased coin flip from an unbiased one. On the face of it, this problem seems to be more of a mathematical puzzle than one having any perceivable real-world application, but that's not even remotely true. A ton of applications require us to make decisions that are not equally favorable to all possible outcomes. A very simple example is the following: suppose we want to simulate the sum of a pair of six-sided dice. Clearly, all outcomes here are not equally likely, and it becomes increasingly impractical to throw two dice every time we want a sample. If only we had a way around!

Unbiased coin flips to the rescue! It turns out that any biased coin flip can be simulated by an appropriate number of unbiased coin flips. The only caveat here is that the unbiased coin flips must be perfectly random, or else we land in the territory of another very interesting question: given a biased coin with $Pr(1) = p_1$, what is the minimum number of coin flips required to generate a biased flip with $Pr(1) = p_2$ for some $p_2 \neq p_1$? Although I will not spend much time on this question in this post and will assume that perfect unbiased coin flips are available to me, a brief answer is to consider the maximum amount of information about the coin to be simulated that a single flip of the simulating coin provides. Then, the problem becomes very similar to the technique discussed in [MU08], as described above.

So, how should we pray to the perfection of our unbiased flip in order to be bestowed with a biased Gold coin? Let us discuss an easy-to-understand algorithm to begin with. Recall that if we were instead given a uniform random number generator (RNG), say $\texttt{rand()}$, which produces a real number in the range $(0,1)$ uniformly at random, we could easily simulate a biased flip with the following algorithm.

```python
def biasedFlipFromRNG(p):
    # a uniform draw lands at or below p with probability exactly p
    u = rand()
    if u <= p:
        return 1
    else:
        return 0
```

This works because the probability that a uniformly generated random number is no more than $p$ is exactly $p$. However, the world is not so generous to bestow us with such a Godly RNG. Lose no hope! There is an easy fix.

We can simulate the algorithm above exactly by treating the sequence of $0$s and $1$s generated by repeatedly flipping our unbiased coin as the binary representation of a special number, which I call $\tilde{u}$. I use the tilde sign to indicate that it is just an estimate of the actual $u$ in the algorithm above. With this approach, all we have to do is flip the unbiased coin, say $n$ times (starting with $n=1$, of course), record the outcomes in an ordered tuple $(u_1, \dots, u_n)$, and then obtain our estimate at the end of the $n^{\text{th}}$ iteration as $\tilde{u}_n = \sum_{i = 1}^n u_i 2^{-i}$. Now comes the trick! Let $p_n \leq p$ be the number whose binary expansion matches that of $p$ in the first $n$ bits and is all zeros afterwards. For example, if $p = 0.125$ and $n = 2$, then $p_n = 0$, since the first two bits in the binary expansion of $p$ are zero. Compare $\tilde{u}_n$ with $p_n$. If it so happens that $\tilde{u}_n = p_n$, then repeat this process with $n+1$. Otherwise, return $1$ if $\tilde{u}_n < p_n$ and $0$ if not. The rationale behind this approach is that once $\tilde{u}_n$ is strictly less than $p_n$, no further bits can push $\tilde{u}$ above $p$; similarly, once $\tilde{u}_n$ exceeds $p_n$, then $\tilde{u}$ is guaranteed to exceed $p$. Only when the two are equal can we make no decision. The algorithm to do the above is summarized in the code below.

```python
def getBinDigit(i, p, q):
    # i-th bit of the binary expansion of p, given that q holds
    # the value of the first i-1 bits
    if q + 2**(-i) <= p:
        return 1
    else:
        return 0

def biasedCoinFlip(p):
    flip = 1   # index of the current bit
    value = 0  # value of the first (flip - 1) bits of p
    while True:
        unbiasedFlip = unbiasedCoinFlip()           # next bit of u~
        decimalDigit = getBinDigit(flip, p, value)  # next bit of p
        if decimalDigit:
            value = value + 2**(-flip)
        if unbiasedFlip < decimalDigit:
            # u~ is now guaranteed to be below p
            return 1
        if unbiasedFlip > decimalDigit:
            # u~ is now guaranteed to exceed p
            return 0
        flip = flip + 1
```

Here, the function $\texttt{unbiasedCoinFlip()}$ simulates the unbiased coin for us. Note that in the code above, I do not produce the entire tuple of bits for $p_n$ every time. Instead, I generate bits only if I require them. A natural question that arises here is the expected number of iterations of the $\texttt{while}$ loop before a bit is returned by the function. But first, let us convince ourselves that the probability that this function returns $1$ is indeed $p$.

To prove this, let us assume that the binary expansion of $p$ has bits $b_1,b_2,\dots$. Note that we are only considering the bits after the decimal point, since $p < 1$. Now, the question becomes the following. Given some $n \geq 1$ and a sequence of unbiased bits $u_1,u_2,\dots,u_n$, what is the probability of the event $\{ u_i = b_i \quad \forall 1 \leq i \leq n \}$? Clearly, this is equal to $2^{-n}$, since $p$ is fixed. Hence, with probability $2^{-n}$, we enter into yet another iteration of the $\texttt{while}$ loop. Once we are done with these iterations, we return $1$ if $\sum_{i=1}^{n+1} u_i 2^{-i} < \sum_{i=1}^{n+1} b_i 2^{-i}$, which is the same as $u_{n+1} < b_{n+1}$. This happens with probability $0$ if $b_{n+1} = 0$ and probability $\frac{1}{2}$ otherwise. Thus, the probability that we return $1$ after $k \geq 0$ iterations of the $\texttt{while}$ loop can be written as $\frac{2^{-k}b_{k+1}}{2} = b_{k+1}2^{-(k+1)}$.

We are almost there! All that remains to be done is to sum the above expression for $k \geq 0$, which gives $\sum_{k \geq 0}b_{k+1}2^{-(k+1)} = p$, by definition. Hence, the algorithm is correct. Phew! One last thing before I wrap up. How efficient is this algorithm? We can calculate the expected number of loop iterations before returning $1$ using the expression above as $\sum_{k \geq 0} kb_{k+1}2^{-(k+1)}$. Wait! This seems like a tricky sum. However, we can compute an upper bound for it. Writing $k = (k+1)-1$ and using $b_{k+1} \leq 1$ for all $k \geq 0$, we get that the average number of loop iterations before the algorithm returns a bit is no more than $2-p$. This may not be tight, but at least it tells us that, on average, the algorithm returns a bit in fewer than two iterations, which is pretty fast if you ask me! A tighter analysis may be non-trivial since I am not aware of any closed-form expression for $b_{k+1}$ in terms of $k$ and $p$. If you do, please let me know!
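
Spelled out, that bound is:

$$\sum_{k \geq 0} k\,b_{k+1} 2^{-(k+1)} = \sum_{k \geq 0} (k+1)\,b_{k+1} 2^{-(k+1)} - \sum_{k \geq 0} b_{k+1} 2^{-(k+1)} \leq \sum_{k \geq 0} (k+1)\,2^{-(k+1)} - p = 2 - p.$$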

So people! I hope you enjoyed reading this post as much as I enjoyed writing it. Leave your comments below and I will try my best to address them. I have of course left out a lot of details and other cool algorithms for the problems above, but I confess that what I presented here is what I understand best. I would love to hear about more techniques. Do stay tuned for my next post, in which I will talk about one of my other favorite topics: dynamic programming. Until then, ciao!

## References

[MU08] Mitzenmacher, Michael. “Tossing a biased coin.” (2008).

[AS72] Abramowitz, M. and Stegun, I. A. (Eds.). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing. New York: Dover, p. 11, 1972.

# Mid South Theory Day 2016

Hey all!

I will be attending the Mid South Theory Day 2016 at Louisiana State University on 9th December, 2016 to give a talk on our latest result in the field of secure multiparty interactive computation. Stay tuned for an awesome algorithm that compiles a noise-free (asynchronous) protocol into a robust protocol that can tolerate a bit-flipping oblivious adversary with an unbounded budget, with high probability, while incurring an overhead that is within log factors of the conjectured optimal. I will upload the slides here shortly.