# CMU Randomized Algorithms

Randomized Algorithms, Carnegie Mellon: Spring 2011

## Notes on Lecture 6

1. The Derandomization for Independent Sets

Here’s the derandomization of the probabilistic argument for Turan’s theorem. It’s just a little trickier than I’d figured, but it also leads to a different algorithm! (See Section 2.)

Let’s recall the randomized version: Given graph ${G}$ and permutation ${\pi}$, call a vertex ${v}$ a winner if it appears before all its neighbors in the order ${\pi}$. Let ${W_v}$ be the indicator of ${v}$ being a winner according to permutation ${\pi}$, and hence ${E_{\pi}[W_v] = 1/(d_G(v) + 1)}$. Now the number of winners ${W = \sum_{v \in V} W_v}$, and hence ${E_{\pi}[W] = \sum_{v \in V} 1/(d_G(v) +1)}$. The greedy algorithm which considers the vertices in the order ${\pi}$ will at least pick all the winners. Hence,

$\displaystyle E_\pi[Greedy] \geq E_\pi[W] = \sum_v 1/(d_v +1).$

The issue now is that we now don’t have an exact expression for the expected size of the independent set found by greedy that we can track, we only have a lower bound on it, namely the expected number of winners. But that’s OK: we’ll just track this lower bound instead, which we will call the “pessimistic estimator”. Let’s now pin this down.

First, given any set of vertices ${S}$, if we run the random permutation algorithm on the induced graph ${G[S]}$, we can compute the expected number of winners exactly — it’s just ${\sum_{v \in S} 1/(d_{G[S]}(v) + 1)}$, where ${d_{G[S]}(v)}$ is the degree of ${v}$ in graph ${G[S]}$.

So let’s do the following: having added some nodes ${I_t \subseteq V}$ to our independent set, let ${R_t = V \setminus (I_t \cup N(I_t))}$ be the “rest” of the nodes which are not already chosen in ${I_t}$ nor already ruled out because they are adjacent to some node in ${I_t}$. Now define the “pessimistic estimator” to be

$\displaystyle \begin{array}{rcl} \mu(t) &=& |I_t| + E[ \text{ number of winners on } G[R_t]\; ] \\ &=& |I_t| + \sum_{v \in R_t} \frac{1}{d_{G[R_t]}(v) + 1}. \end{array}$

Note that we start off with ${I_0}$ empty, then we get ${\mu(0) = \sum_v 1/(d_G(v) + 1) \geq n/(d_{avg} + 1)}$. And finally, ${R_t = \emptyset}$, then this is just ${|I_t|}$. So if we could ensure that at every step the value never decreases, we’d be done. One can do it in two ways (at least), here’s the approach some of you mentioned at the end of lecture.

1.1. Showing a Good Vertex to Add

Consider all the possible vertices in ${R_t}$: we’ll show one of them can be added to ${I_t}$ to cause ${\mu(t+1) \geq \mu(t)}$. For vertex ${x}$, let ${N^+_x = (\{x\} \cup N(x)) \cap R_t}$ be its neighbors (including itself) in ${R_t}$. If we add in ${x}$ to the I.S., the new value would be

$\displaystyle \begin{array}{rcl} \mu(I_t \cup \{x\}) &=& |I_t| + 1 + \sum_{v \in R_t \setminus N^+(x)} \frac{1}{d_{G[R_t \setminus N^+(x)]}(v) + 1} \\ &\geq& |I_t| + 1 + \sum_{v \in R_t} \frac{1}{d_{G[R_t]}(v) + 1} - \sum_{v \in N^+(x)} \frac{1}{d_{G[R_t]}(v) + 1}. \end{array}$

The inequality is because the degrees in ${G[R_t \setminus N^+(x)]}$ can never be larger than those in ${G[R_t]}$. Now rewriting,

$\displaystyle \mu(I_t \cup \{x\}) - \mu(t) \geq 1 - \sum_{v \in N^+(x)} \frac{1}{d_{G[R_t]}(v) + 1}. \ \ \ \ \ (1)$

If we average over all ${x \in R_t}$, and we get

$\displaystyle E[\mu(I_t \cup \{x\})] - \mu(t) \geq 1 - \frac{1}{R_t} \sum_{x \in R_t} \sum_{v \in N^+(x)} \frac{1}{d_{G[R_t]}(v) + 1}, \ \ \ \ \ (2)$

which is non-negative since for each ${v}$ (of which there are at most ${|R_t|}$, there are ${d_{G[R_t]}(v) + 1}$ many ${x}$‘s. And hence, there is at least one ${x}$ to add so that ${\mu(I_t \cup \{x\} \geq mu(I_t)}$.

The part that originally threw me off in the calculations was that averaging over the ${x}$‘s could potentially lead to a greater potential ${E[\mu(I_t \cup \{x\})]}$ than what we started off ${\mu(I_t)}$. But that is only better for us, so all’s well.

1.2. So To Summarize

The final idea is this: for the probabilistic analysis, we didn’t know how to compute the expectation we wanted (i.e., the expected number of vertices picked by random-permutation greedy). So we used a lower bound—the expected number of winners.

And then, that lower bound was useful for the derandomization. We show that we could add in one vertex at a time to the independent set ${I_t}$ while never decreasing ${|I_t|}$ plus the expected number of winners in the rest.

2. BTW, A Simpler Algorithm

If we go back and stare at inequality (1) for a few minutes, we realize that this analysis also gives an analysis for the following greedy algorithm:

While there is a vertex left in the graph, add the vertex with lowest remaining degree into ${I}$, and remove it and all its neighbors.

(Do you see why?) We’ve shown that this will give an independent set of size at least ${n/(\overline{d} + 1)}$.

3. Pessimistic Estimators for Low Congestion

The general method of pessimistic estimators is attributed to Prabhakar Raghavan, whose book is the text for the course, and who’s currently the head of Yahoo! Research. (He did this work while he was a graduate student in the mid-80s.)

His original paper used pessimistic estimators to derandomize, among other problems, the randomized-rounding algorithm for low-congestion routing example we saw in the previous lecture (due to Raghavan and Thompson). Aravind Srinivasan has some great notes on the derandomization; here’s a high-level summary.

Prabhakar’s use of these estimators is very neat, since looking at the proof it’s not obvious what to use as the pessimistic estimator, since there’s the Chernoff bound, which seems somewhat mysterious and awkward to handle. So he unrolls the first few steps of the Chernoff-bound calculation, which you recall goes as follows: if ${X_e}$ is the congestion on an edge ${e}$, then

$\displaystyle \begin{array}{rcl} \sum_e \Pr[ X_e > k C ] &=& \sum_e \Pr[ X_e - kC > 0 ] \\ &= &\sum_e \Pr[ k^{X_e - kC} > 1 ] \\ &\leq &\sum_e E[ k^{X_e - kC}] \\ & &\ldots \\ &\leq &\sum_e 1/m^c \leq 1/m^{c-1}.\end{array}$

where the sum over edges represents the union bound, we use Markov’s inequality in the middle, all the m.g.f. calculations are omitted here, and ${k > 0}$ is suitably chosen to ensure the bound of ${1/m^c}$.

He then notes that the expression ${\sum_e E[ k^{X_e - kC}]}$ upper bounds the actual probability of error, and also can be tracked efficiently. In particular, fixing flow-paths for some of the source-sink pairs, we can calculate the expectation over random choices for the rest of the pairs, since these ${X_e}$‘s are sums of independent random variables. And we can make choices that causes this quantity to not increase. So if the estimator starts off below ${1/m^{c-1} < 1}$, we ensure it always stays below that quantity.

And since the final probabilities are integers ${0}$ or ${1}$ (since we’ve chose one flow-path for each source-sink pair, either an edge has more than ${kC}$ congestion, or it does not), and they sum to less than ${1}$, they must all be zero. This gives the deterministic algorithm with congestion at most ${kC}$.

Again, Aravind’s notes, or the original paper, give you more details, and a more general properties that suffice for pessimistic estimators.