Randomized Algorithms, Carnegie Mellon: Spring 2011
Belated Notes on Lecture #2
(Sorry for the delayed notes: it made sense to wait until we’d seen Chernoff bounds before I posted these. — AG)
We saw in Lecture #2 that the expected number of comparisons for the randomized version of Quicksort was at most $2nH_n \approx 2n\ln n$; the proof was via defining just the right random variables $X_{ij}$ ("do we compare the $i$-th and $j$-th smallest elements at any step?"): we could compute that $E[X_{ij}] = \frac{2}{j-i+1}$, and hence the expected number of comparisons is $\sum_{i<j} \frac{2}{j-i+1} = O(n \log n)$.
However, it often helps to get a quick ballpark estimate before going for the best analysis: here's a quick-and-dirty way to see that the number of comparisons for Quicksort is $O(n \log n)$ with probability at least $1 - n^{-c}$ for some constant $c$. Not only does this imply that the expected number of comparisons is $O(n \log n)$ (why?), but also that the number of comparisons is very unlikely to be much larger than the expectation.
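Before the analysis, here is a quick empirical sanity check (not part of the original notes): a minimal randomized Quicksort that only counts comparisons, averaged over a few runs and set against the $2n\ln n$ bound from class.

```python
import random
import math

def quicksort_comparisons(arr):
    """Comparisons made by randomized Quicksort (pivot chosen uniformly)."""
    if len(arr) <= 1:
        return 0
    pivot = random.choice(arr)
    # Partitioning compares each of the other len(arr) - 1 elements to the pivot.
    left = [x for x in arr if x < pivot]
    right = [x for x in arr if x > pivot]
    return (len(arr) - 1) + quicksort_comparisons(left) + quicksort_comparisons(right)

n = 1000
trials = 50
avg = sum(quicksort_comparisons(list(range(n))) for _ in range(trials)) / trials
print(f"average comparisons: {avg:.0f}, bound 2n ln n = {2 * n * math.log(n):.0f}")
```

The average lands comfortably below $2n\ln n$, as the exact expectation $2(n+1)H_n - 4n$ predicts.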
Here’s the theorem for today:

Theorem 1. With probability at least $1 - 1/n^2$, the number of comparisons made by randomized Quicksort on an input of $n$ distinct elements is at most $8n\log_{4/3} n = O(n \log n)$.
First, let level-$0$ comparisons be those done when we partition the entire input into two parts, level-$1$ comparisons be those made when we partition these two parts into two parts each, and so on. Note that the number of comparisons made in level $i$ is at most $n$, so it suffices to show that the number of levels before all the parts have only a single element each is $O(\log n)$ with high probability.
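The claim that the recursion bottoms out within $O(\log n)$ levels can be simulated directly. The sketch below (my addition, not from the notes) tracks only subproblem sizes, since the depth distribution depends only on the rank of each random pivot; it compares the observed depth against $8\log_{4/3} n$, one concrete constant of the kind that falls out of such an analysis.

```python
import random
import math

def quicksort_depth(size):
    """Depth of the randomized Quicksort recursion tree on `size` distinct
    elements. Only subproblem sizes matter, so we track sizes, not elements."""
    if size <= 1:
        return 0
    r = random.randrange(size)  # rank of the uniformly random pivot
    return 1 + max(quicksort_depth(r), quicksort_depth(size - 1 - r))

n = 1024
depths = [quicksort_depth(n) for _ in range(200)]
bound = 8 * math.log(n, 4/3)
print(f"max depth over 200 runs: {max(depths)}, 8 log_(4/3) n = {bound:.1f}")
```

Even the maximum over 200 runs sits far below the bound, hinting at how much slack a quick-and-dirty argument leaves on the table.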
Let’s prove this claim. Consider a single element $e$. Let $S_i(e)$ be the set containing $e$ at level $i$. Note that $S_0(e)$ is the entire input. We call level $i$ $e$-good if either $|S_i(e)| = 1$, or the level-$i$ pivot falls in the middle half of $S_i(e)$ and hence causes $|S_{i+1}(e)|$ to be at most $\frac{3}{4}|S_i(e)|$. Since we pick a pivot uniformly at random, the chance of any level being $e$-good is at least $1/2$, independent of all other levels. Hence, the probability of seeing at least $L$ $e$-good levels within the top $k$ levels is at least the probability of seeing $L$ heads in $k$ flips of an unbiased coin. (I.e., the number of $e$-good levels stochastically dominates a $\mathrm{Bin}(k, 1/2)$ random variable.)
Note that if there are $\log_{4/3} n$ $e$-good levels, then since a good level reduces the size of the set by at least a factor of $3/4$, the element $e$ must lie in a set of unit size. Setting $k = 8\log_{4/3} n$, the chance of seeing fewer than $\log_{4/3} n = k/8$ good levels is at most the chance of seeing fewer than $k/8$ heads when we expect to see $\mu = k/2$ heads. By a multiplicative Chernoff bound on the lower tail, the chance of this is at most
$$\Pr\big[\mathrm{Bin}(k, 1/2) < \tfrac{\mu}{4}\big] \le \exp\Big(-\tfrac{(3/4)^2 \mu}{2}\Big) = \exp\big(-\tfrac{9}{8}\log_{4/3} n\big) \le \frac{1}{n^3}.$$
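As a numeric check (my addition), one can compare the exact binomial lower tail against the Chernoff bound and the $n^{-3}$ target for a concrete $n$:

```python
import math

def binom_lower_tail(k, p, m):
    """Exact Pr[Bin(k, p) < m] by summing the pmf."""
    return sum(math.comb(k, j) * p**j * (1 - p)**(k - j) for j in range(m))

n = 1024
L = math.ceil(math.log(n, 4/3))  # good levels needed to shrink the set to size 1
k = 8 * L                        # total levels we examine
mu = k / 2
exact = binom_lower_tail(k, 0.5, L)
chernoff = math.exp(-((3/4)**2) * mu / 2)  # Pr[X < mu/4] <= exp(-delta^2 mu / 2)
print(f"exact tail = {exact:.3e}, Chernoff bound = {chernoff:.3e}, n^-3 = {n**-3:.3e}")
```

The exact tail is far smaller than the Chernoff estimate, which in turn beats $n^{-3}$: the bound is loose but more than good enough for the union bound that follows.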
Thus the branch of the recursion containing the element $e$ terminates before $8\log_{4/3} n$ levels except with probability at most $1/n^3$. There are $n$ possible elements, and we can take a trivial union bound over these elements: with probability at least $1 - 1/n^2$, the Quicksort recursion must have bottomed out before $8\log_{4/3} n$ levels.
Note that this shows the total number of comparisons is bounded by $8n\log_{4/3} n = O(n \log n)$ with probability $1 - 1/n^2$.
But what about the expected number of comparisons? One can bound this in several ways. Here’s one: with probability $1 - 1/n^2$ the “good event” happens and the number of comparisons is at most $8n\log_{4/3} n$. If the good event does not happen (w.p. $1/n^2$), we know that Quicksort takes at most $\binom{n}{2} \le n^2$ comparisons in the worst case, and hence the expected number of comparisons is at most
$$8n\log_{4/3} n + \frac{1}{n^2} \cdot n^2 = O(n \log n).$$
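To see how negligible the failure term is, here is a tiny numeric check (my addition, using a good-event bound of $8n\log_{4/3} n$ and a failure probability of $1/n^2$ as the assumed constants):

```python
import math

n = 1024
good_term = 8 * n * math.log(n, 4/3)     # comparisons if the good event happens
bad_term = (1 / n**2) * (n * (n - 1) / 2)  # worst case weighted by failure prob.
print(f"good-event term: {good_term:.0f}, bad-event term: {bad_term:.2f}")
```

The bad-event contribution is less than one comparison in expectation, so it disappears into the $O(n \log n)$ term.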
This is much worse than the bound of $2n\ln n$ we showed in class, but it has two advantages: (a) it was quick and dirty, as promised, and (b) it also shows a concentration result—Quicksort is unlikely to do many more comparisons than you’d expect.
In fact, one can prove much better bounds: a result of McDiarmid and Hayward from SODA 1992 shows that if $X$ is the r.v. denoting the number of comparisons for Quicksort, then $X$ falls outside the interval $(1 \pm \varepsilon)\,E[X]$, for fixed $\varepsilon \in (0, 1]$, with probability
$$n^{-2\varepsilon \ln\ln n\,(1 + o(1))}.$$
That’s pretty well-concentrated! (One class project could be to understand this paper, and the concentration results they use.)