on the webpage, or here. Good luck!

Update: Remember it’s due 48 hours after you start, or Friday May 6, 11:59pm, whichever comes first.

Fixes: 1(b): “feasible solution to the LP plus the odd cycle inequalities.” And you only need to show the value is $m(1 - O(\varepsilon))$ whp.

Thanks for the great presentations yesterday, everyone!

The final will be posted on the course webpage by Friday 4/29 evening at the latest; I will post something on the blog once we’ve done so. You can take it in any contiguous 48-hour period of your choice: just download it when you are ready, and hand in your solutions within 48 hours of that. Slip it under my door (preferably), or email it to me otherwise. We’ll stop accepting solutions at 11:59pm on Friday 5/6.

Hey, it may be useful for today’s lecture if you have a quick read over Karger’s randomized algorithm for min-cuts. Just the basic algorithm and analysis—we won’t need the improved Karger-Stein variant.
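If a refresher in code helps, here is a minimal sketch of the basic contraction algorithm (not the Karger-Stein variant). It assumes a connected graph on vertices 0..n-1 given as an edge list; the function name `karger_min_cut` and its parameters are made up for illustration. Processing the edges in a uniformly random order and contracting whenever an edge joins two different super-nodes is equivalent to repeatedly contracting a uniformly random remaining edge.

```python
import random

def karger_min_cut(n, edges, trials, seed=0):
    """Smallest cut found over `trials` independent runs of random contraction."""
    rng = random.Random(seed)
    best = len(edges)
    for _ in range(trials):
        parent = list(range(n))  # union-find forest over super-nodes

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        order = edges[:]
        rng.shuffle(order)
        components = n
        for u, v in order:
            if components == 2:
                break
            ru, rv = find(u), find(v)
            if ru != rv:  # skip self-loops of the contracted graph
                parent[ru] = rv
                components -= 1
        # Edges whose endpoints ended up in different super-nodes form the cut.
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = min(best, cut)
    return best

# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(karger_min_cut(6, edges, trials=200))
```

A single run finds a fixed min cut with probability at least 2/(n(n-1)), which is why the sketch repeats the contraction many times and keeps the best cut seen.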

HW#6 is out, it’s a short one. Due next Wednesday April 27th.

Today we talked about random walks on graphs, and the result that in any connected undirected graph G, for any given start vertex u, the expected time for a random walk to visit all the nodes of G (called the cover time of the graph) is at most 2m(n-1), where n is the number of vertices of G and m is the number of edges.
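To get a feel for the bound, here is a small simulation sketch that estimates the cover time of a path and compares it with 2m(n-1). The helper names (`cover_time`, `average_cover_time`, `path_graph`), the choice of graph, and the number of trials are all assumptions made for illustration.

```python
import random

def cover_time(adj, start, rng):
    """Steps taken by one random walk from `start` until every vertex is visited."""
    seen = {start}
    cur = start
    steps = 0
    while len(seen) < len(adj):
        cur = rng.choice(adj[cur])  # move to a uniformly random neighbor
        seen.add(cur)
        steps += 1
    return steps

def average_cover_time(adj, start, trials=2000, seed=0):
    """Average of `trials` independent cover-time samples."""
    rng = random.Random(seed)
    return sum(cover_time(adj, start, rng) for _ in range(trials)) / trials

def path_graph(n):
    """Adjacency lists of the path 0 - 1 - ... - (n-1)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

n = 8
adj = path_graph(n)
m = sum(len(nbrs) for nbrs in adj.values()) // 2
bound = 2 * m * (n - 1)
estimate = average_cover_time(adj, start=0)
print(f"estimated cover time {estimate:.1f}, bound 2m(n-1) = {bound}")
```

Starting from an endpoint of the path, covering the graph is the same as reaching the other endpoint, so the estimate should land near $(n-1)^2$, comfortably below the bound.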

In the process, we proved that for any G, if we think of the walk as at any point in time being on some edge heading in some direction, then each edge/direction pair is equally likely, with probability 1/(2m), under the stationary distribution. (Actually, since we didn’t need to, we didn’t prove that the stationary distribution is unique. However, if G is connected, it is not hard to prove this by contradiction.) We then used that to prove that the expected gap between successive traversals of any given directed edge (u,v) is 2m. See the notes.
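The stationarity claim is easy to check exactly on a small example. The sketch below treats the state of the walk as the directed edge it is currently traversing, pushes the uniform distribution through one step, and confirms nothing changes. The example graph (a triangle with a pendant vertex) and the name `edge_walk_step` are just illustrations.

```python
from fractions import Fraction

def edge_walk_step(adj, dist):
    """Push a distribution over directed edges (u, v) through one step of the walk."""
    new = {e: Fraction(0) for e in dist}
    for (u, v), p in dist.items():
        # After traversing u -> v, the walk leaves v along a uniformly random edge.
        for w in adj[v]:
            new[(v, w)] += p / len(adj[v])
    return new

# Triangle 0-1-2 with a pendant vertex 3 attached to 2, so m = 4.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
directed = [(u, v) for u in adj for v in adj[u]]
uniform = {e: Fraction(1, len(directed)) for e in directed}

assert len(directed) == 8                      # 2m directed edges
assert edge_walk_step(adj, uniform) == uniform  # uniform is stationary
```

The reason it works: the mass flowing into (v,w) is the sum over the deg(v) edges entering v of (1/(2m)) · (1/deg(v)), which is again 1/(2m).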

We also gave a few examples to show this is existentially tight. For instance, on a line (n vertices, n-1 edges) we have an expected $\Omega(n^2)$ time to reach the other end of the line. Also on a “lollipop graph” (a clique of n/2 vertices connected to a line of n/2 vertices), the expected time to get from the clique to the end of the line is $\Omega(n^3)$. Since this is not in the notes, here is a quick argument.

First of all, if you are in the clique, then each step has roughly 2/n probability of taking you to the node connecting the clique and the handle (let’s call this “node 0”). When you are at node 0, your next step has probability 2/n of taking you to the next point on the handle (let’s call this “node 1”). So, when you are in the clique, it takes $\Omega(n^2)$ steps in expectation to get to node 1.

Now, think about the following experiment. Say you go into a fair casino with 1 dollar and bet on fair games (betting a dollar each time) until either you lose all your money or you have n/2 dollars in your pocket. Whenever you lose all your money, you go back to your car to get another dollar, and if you get n/2 dollars, you go upstairs to the restaurant. What is the expected number of trips to your car before you go to the restaurant? The answer is n/2 because it’s a fair casino so in expectation, all the money in your pocket when you head to the restaurant came from your car.

Now, think of node 0 as your car, node 1 as entering the casino, and the end of the lollipop stick (node n/2) as the restaurant. In expectation the walk steps from node 0 to node 1 about n/2 times before it reaches the end of the stick, and each such step costs $\Omega(n^2)$ expected time, for $\Omega(n^3)$ in total.
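The casino claim can be sanity-checked by simulation. In the sketch below, `target` plays the role of n/2, and we count every dollar fetched from the car, including the one you walk in with; the function name, the value of `target`, and the trial count are assumptions for illustration.

```python
import random

def dollars_from_car(target, rng):
    """Dollars fetched from the car (including the first) before the bankroll hits `target`."""
    fetched = 1  # the dollar you walk in with
    money = 1
    while money < target:
        money += 1 if rng.random() < 0.5 else -1  # one fair one-dollar bet
        if money == 0:
            fetched += 1  # broke: back to the car for another dollar
            money = 1
    return fetched

rng = random.Random(0)
target = 6
trials = 20000
average = sum(dollars_from_car(target, rng) for _ in range(trials)) / trials
print(f"average dollars fetched: {average:.2f} (target = {target})")
```

Each visit to the casino with a single dollar reaches `target` before going broke with probability 1/`target`, so the count is geometric and its mean should come out close to `target`.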

We ended our discussion by talking about resistive networks, and using the connection to give another proof of the cover time of a graph. In particular, we have $C_{uv} = 2m R_{uv}$ where $C_{uv}$ is the commute-time between u and v, and $R_{uv}$ is the effective resistance between u and v.
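The identity $C_{uv} = 2m R_{uv}$ can be verified exactly on a small graph. The sketch below assumes unit resistors on every edge and computes both sides by solving linear systems over the Laplacian with the row and column of v removed; all function names and the example graph are illustrative, and the solver is plain Gauss-Jordan elimination over exact fractions.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A assumed nonsingular)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

def grounded_laplacian(adj, ground):
    """Graph Laplacian with the row and column of `ground` deleted."""
    nodes = [x for x in adj if x != ground]
    L = [[len(adj[x]) if x == y else -adj[x].count(y) for y in nodes] for x in nodes]
    return nodes, L

def hitting_time(adj, u, v):
    """Expected steps to reach v from u, from h(x) = 1 + average of h over neighbors of x."""
    nodes, L = grounded_laplacian(adj, v)
    h = solve(L, [len(adj[x]) for x in nodes])
    return h[nodes.index(u)]

def effective_resistance(adj, u, v):
    """Voltage at u when unit current enters at u and leaves at v, with v grounded."""
    nodes, L = grounded_laplacian(adj, v)
    phi = solve(L, [1 if x == u else 0 for x in nodes])
    return phi[nodes.index(u)]

# Triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
m = sum(len(nbrs) for nbrs in adj.values()) // 2
commute = hitting_time(adj, 0, 3) + hitting_time(adj, 3, 0)
resistance = effective_resistance(adj, 0, 3)
print(commute, 2 * m * resistance)
```

On this graph the resistance between 0 and 3 is a one-ohm edge in parallel with a two-ohm path, in series with the pendant edge, i.e. 2/3 + 1 = 5/3.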

In this lecture we talked about Rademacher bounds in machine learning. These are never worse and sometimes can be quite a bit tighter than VC-dimension bounds. Rademacher bounds say that to bound how much you are overfitting (the gap between your error on the training set and your true error on the distribution), you can do the following. See how much you would overfit on random labels (how much better than 50% is the empirical error of the best function in your class when you give random labels to your dataset) and then double that quantity (and add a low-order term). See the notes.
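As a concrete illustration of the "overfit to random labels" quantity, here is a Monte Carlo sketch for one simple class: threshold classifiers on the line, which label a point +1 if it is at least the threshold and -1 otherwise. The class, the function name, and the sample sizes are assumptions chosen for illustration; the points are assumed distinct. The code measures the best achievable correlation with random ±1 labels, which is exactly twice the best advantage over 50% accuracy.

```python
import random

def empirical_rademacher_thresholds(xs, trials=2000, seed=0):
    """Estimate E_sigma[max_h (1/n) sum_i sigma_i h(x_i)] over threshold classifiers."""
    rng = random.Random(seed)
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    total = 0.0
    for _ in range(trials):
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        # Start with the threshold below every point: h labels everything +1.
        corr = sum(sigma)
        best = corr
        # Sweep the threshold upward; each point passed flips from +1 to -1.
        for i in order:
            corr -= 2 * sigma[i]
            best = max(best, corr)
        total += best / n
    return total / trials

small = empirical_rademacher_thresholds(list(range(10)))
large = empirical_rademacher_thresholds(list(range(200)))
print(f"n = 10: {small:.3f}, n = 200: {large:.3f}")
```

With more data the class has a harder time fitting noise, so the estimate for 200 points should be noticeably smaller than the one for 10 points.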

HW5 has been posted; it’s due on Monday April 4th.

Update: for Exercise #2, the expression $E_{T \gets \mathcal{D}} [\max_{v \in V} d(v, F_T)]$ is indeed correct. (It is $d$ and not $d_T$: you want to show that the solution $F_T$ found using the tree is lousy when used on the original metric.)

A simpler problem, if you’re stuck, is the furthest pair problem. Here, you are given a metric and want to output a pair of points whose distance is the largest. A natural (yet lousy) algorithm would be: embed the metric into a random tree while maintaining distances in expectation, find a furthest pair in the tree, and output this pair. Show an example where this algorithm sucks.