Lecture #16: Distance-preserving trees (part II)
1. Embeddings into Distributions over Trees
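In this section, we prove the following theorem using tree embeddings (and then, in the following section, we improve the bound further to $O(\log n)$).
Theorem 1 Given any metric $(V, d)$ with $|V| = n$ and aspect ratio $\Delta$, there exists an efficiently sampleable distribution $\mathcal{D}$ over trees on $V$ such that for all $u, v \in V$:
- For all $T \in \mathrm{Support}(\mathcal{D})$, $d_T(u,v) \geq d(u,v)$, and
- $E_{T \sim \mathcal{D}}[d_T(u,v)] \leq O(\log n \log \Delta) \cdot d(u,v)$.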
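To prove this theorem, we will use the idea of a low diameter decomposition. Given a metric space $(V, d)$ on $|V| = n$ points and a parameter $r > 0$, a (randomized) low-diameter decomposition is an efficiently sampleable probability distribution over partitions of $V$ into $S_1 \uplus S_2 \uplus \cdots \uplus S_t$ such that
- (Low Radius/Diameter) For all $S_i$, there exists $c_i \in V$ such that for all $u \in S_i$, $d(c_i, u) \leq r/2$. Hence, for any $u, v \in S_i$, $d(u,v) \leq r$.
- (Low Cutting Probability) For each pair $u, v$, $\Pr[u, v \text{ lie in different parts}] \leq \beta \cdot \frac{d(u,v)}{r}$ with $\beta = O(\log n)$.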
We’ll show how to construct such a decomposition in the next section (next lecture), and use such a decomposition to prove Theorem 1.
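Consider the following recursive algorithm TreeEmbed$(U, i)$, which takes as input a pair $(U, i)$ where $U \subseteq V$ is a set of vertices of diameter at most $2^i$, and returns a rooted tree $(T, \mathrm{root})$.
- Apply the low-diameter decomposition to $(U, d)$ with the parameter $r = 2^{i-1}$ to get the partition $S_1, \ldots, S_t$.
- Recurse: Let $(T_j, \mathrm{root}_j) \leftarrow$ TreeEmbed$(S_j, i-1)$. As a base case, when $S_j$ is a single point, simply return that point.
- For every tree $T_j$ with $j > 1$, add the edge $(\mathrm{root}_1, \mathrm{root}_j)$ with length $2^i$. This is a new tree which we denote $T$.
- Return the tree/root pair $(T, \mathrm{root}_1)$.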
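The recursion above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper `ball_carving` is a stand-in low-diameter decomposition (random-radius ball carving around randomly ordered centers), and the names `tree_embed`, `tree_distances`, `dist`, and `edges` are hypothetical, not part of the lecture. The sketch assumes all distances between distinct points are at least 1, so that the recursion bottoms out in single points.

```python
import random

def ball_carving(points, dist, r, rng):
    """Stand-in decomposition (an assumption): parts of radius at most r/2."""
    radius = rng.uniform(r / 4, r / 2)
    order = list(points)
    rng.shuffle(order)
    unassigned = set(points)
    parts = []
    for w in order:
        ball = [v for v in points if v in unassigned and dist(w, v) <= radius]
        if ball:
            parts.append(ball)
            unassigned.difference_update(ball)
    return parts

def tree_embed(points, i, dist, rng, edges):
    """Return the root of a tree on `points` (diameter at most 2**i).

    Tree edges are appended to `edges` as (u, v, length) triples.
    """
    if len(points) == 1:               # base case: a single point
        return points[0]
    parts = ball_carving(points, dist, 2 ** (i - 1), rng)
    roots = [tree_embed(part, i - 1, dist, rng, edges) for part in parts]
    for root_j in roots[1:]:           # join every other subtree to the first root
        edges.append((roots[0], root_j, 2 ** i))
    return roots[0]

def tree_distances(nodes, edges):
    """All-pairs distances in the tree given by `edges`."""
    adj = {v: [] for v in nodes}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist_t = {}
    for s in nodes:
        dist_t[s, s] = 0
        stack = [s]
        while stack:
            u = stack.pop()
            for v, w in adj[u]:
                if (s, v) not in dist_t:
                    dist_t[s, v] = dist_t[s, u] + w
                    stack.append(v)
    return dist_t
```

For eight points on a line with unit spacing, calling `tree_embed(list(range(8)), 3, lambda a, b: abs(a - b), random.Random(0), edges)` fills `edges` with the seven edges of a tree, and `tree_distances` can then be used to compare tree distances against the original ones.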
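Recall that since the low diameter decomposition is randomized, this algorithm defines a distribution over trees on $U$. To build the tree for $V$, we first rescale so that for all distinct $u, v \in V$, $d(u,v) \geq 1$, and $\max_{u,v} d(u,v) \leq 2^{\delta}$ with $\delta = \lceil \log_2 \Delta \rceil$. We define the distribution $\mathcal{D}$ as the one obtained by calling TreeEmbed$(V, \delta)$.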
Lemma 2 For all $x, y \in V$, $d_T(x,y) \geq d(x,y)$ for all $T \in \mathrm{Support}(\mathcal{D})$.
Proof: Fix $x$ and $y$, and let $i$ be such that $d(x,y) \in (2^{i-1}, 2^i]$. Consider the invocation of TreeEmbed$(U, i)$ such that $x \in U$. First, we examine the case in which $y \in U$. By the definition of the low diameter decomposition, since $d(x,y) > 2^{i-1}$, $x$ and $y$ will fall into separate parts of the partition obtained in Step 1, and so we will have $d_T(x,y) \geq 2^i$, the length of the edge placed between different subtrees. In the case in which $y \notin U$, it must be that $x$ and $y$ have been separated at a higher level $i' > i$ of the recursion, are consequently separated by a higher subtree edge, and hence $d_T(x,y) \geq 2^{i'} > 2^i \geq d(x,y)$.
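Lemma 3 For all $x, y \in V$, $E_{T \sim \mathcal{D}}[d_T(x,y)] \leq O(\log n \log \Delta) \cdot d(x,y)$.
Proof: We begin with two easy subclaims. Suppose $(T, \mathrm{root}) \leftarrow$ TreeEmbed$(U, i)$:
- Claim 1: $d_T(\mathrm{root}, x) \leq 2^{i+1}$ for all $x \in U$. By induction, $x$ lies in some piece $S_j$ of the partition having diameter at most $2^{i-1}$ and hence inductively is at distance at most $2^i$ from its root $\mathrm{root}_j$. That root is connected to the root $\mathrm{root}$ by an intertree edge of weight $2^i$, giving us $2^{i+1}$ in total.
- Claim 2: If $x, y \in U$, then $d_T(x,y) \leq 2 \cdot 2^{i+1}$. From the previous claim, each of $x$ and $y$ is at distance at most $2^{i+1}$ from $\mathrm{root}$, distances are symmetric, and the triangle inequality applies.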
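We now have from the definition:
$$E[d_T(x,y)] \leq \sum_{i=0}^{\delta} \Pr[(x,y) \text{ first separated at level } i] \cdot 4 \cdot 2^i \leq \sum_{i=0}^{\delta} \beta \, \frac{d(x,y)}{2^{i-1}} \cdot 4 \cdot 2^i = (\delta + 1) \cdot 8\beta \cdot d(x,y),$$
where the first inequality follows from our subclaims, and the second follows from the cutting property of the low diameter decomposition with $r = 2^{i-1}$. Setting $\beta = O(\log n)$ and $\delta = O(\log \Delta)$ completes the proof.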
The two lemmas above prove Theorem 1. How do we implement these low diameter decompositions? And how can we get the promised $O(\log n)$? Keep reading…
2. Low Diameter Decompositions
Recall the definition of a (randomized) low-diameter decomposition from above: given a metric $(V, d)$ and a bound $r$, we want a partition with pieces of radius at most $r/2$, and want vertices $x, y$ to be separated with “small” probability $\beta \cdot \frac{d(x,y)}{r}$ (i.e., proportional to their distance, and inversely proportional to $r$).
Before we proceed, think about how you’d get such a decomposition for a line metric, or a tree metric, with $\beta = O(1)$; moreover, you cannot hope to get subconstant $\beta$ even for the line. So the theorem says that general graphs lose a factor of $O(\log n)$ more, which is not bad at all! (And this factor is existentially optimal; we will show a tight example.)
2.1. Algorithm I: Geometric Region Growing
To make our life easier, we’ll assume that all distances in the metric are at least $r/n^2$. (We can enforce this via a pre-processing step without much effort; I’ll come back to it.)
The algorithm Geom-Regions$(V, d, r)$ just picks a “truncated” geometric distance $R$, carves out a piece of radius $R$ around some vertex, and repeats until the metric is eaten up.
- Choose $R \sim \mathrm{Geom}\left(\frac{4 \ln n}{r}\right)$; if $R > r/2$, then set $R \leftarrow r/2$.
- Pick an arbitrary vertex $u \in V$, and set $S_1 \leftarrow \{v \in V \mid d(u,v) \leq R\}$.
- Return $\{S_1\} \cup$ Geom-Regions$(V \setminus S_1, d, r)$.
Clearly, the radius bound is maintained by the fact that $R \leq r/2$ with probability $1$.
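A minimal Python sketch of this region-growing procedure, assuming the geometric variable counts the tails seen before the first heads of a coin with heads probability $p = \min(1, 4 \ln n / r)$. The function names `truncated_geometric` and `geom_regions` are hypothetical, and the recursion is unrolled into a loop.

```python
import math
import random

def truncated_geometric(p, cap, rng):
    """Tails before the first heads of a bias-p coin, truncated at `cap`."""
    tails = 0
    while tails < cap and rng.random() >= p:   # rng.random() >= p means "tails"
        tails += 1
    return min(tails, cap)

def geom_regions(points, dist, r, rng):
    """Carve `points` into balls of radius at most r/2; returns (center, ball) pairs."""
    n = len(points)
    p = min(1.0, 4 * math.log(n) / r) if n > 1 else 1.0
    remaining = list(points)
    parts = []
    while remaining:
        radius = truncated_geometric(p, r / 2, rng)
        center = remaining[0]                  # an arbitrary remaining vertex
        ball = [v for v in remaining if dist(center, v) <= radius]
        parts.append((center, ball))
        remaining = [v for v in remaining if dist(center, v) > radius]
    return parts
```

Since the sampled radius never exceeds `r / 2`, every returned ball satisfies the radius bound regardless of the coin flips; only the cutting probability depends on the randomness.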
What’s the chance that $x, y$ lie in separate parts? Let’s view this process as picking a vertex $u$ and starting with a ball of radius zero around it; then we flip a coin with bias $p = \frac{4 \ln n}{r}$, increasing the radius by one after each tails, until either we see a heads or we reach $r/2$ tails, when we cut out the piece. And then we pick another vertex, and repeat the process.
Consider the first time when one of $x, y$ lies in the current ball. Note that either this ball will eventually contain both of them, or will separate them. And to separate them, it must make a cut within the next $d(x,y)$ steps. The chance of this is at most the chance of seeing a heads from a bias-$p$ coin in $d(x,y)$ steps, plus the chance that a $\mathrm{Geom}(p)$ r.v. sees more than $r/2$ tails in a row. Using a naive union bound for the former, we get
$$d(x,y) \cdot p + (1-p)^{r/2} \leq \frac{4 \ln n}{r} \, d(x,y) + e^{-pr/2} = 4 \ln n \cdot \frac{d(x,y)}{r} + \frac{1}{n^2}.$$
We now use the fact that all distances are at least $r/n^2$ to claim that $\frac{1}{n^2} \leq \frac{d(x,y)}{r}$, and hence the probability that $x, y$ are separated is at most $(4 \ln n + 1) \cdot \frac{d(x,y)}{r}$, which proves the second property of the decomposition.
Finally, the loose ends: to enforce the minimum-distance condition that $d(x,y) \geq r/n^2$, just think of the metric as a complete graph with edge-lengths $\ell_{xy} = d(x,y)$, contract all edges $(x,y)$ with $d(x,y) < r/n^2$, and recompute edge lengths to get the new metric $d'$. Running the decomposition Geom-Regions$(V, d', r/2)$ on this shrunk metric, and then unshrinking the edges, will ensure that each pair is separated with probability either $0$ (if it has length $< r/n^2$), or probability at most $(8 \ln n + 2) \cdot \frac{d(x,y)}{r}$. And finally, since the output had radius at most $r/4$ according to $d'$, and any path has at most $n$ nodes so its length can change by at most $n \cdot r/n^2 \leq r/4$ for $n \geq 4$, the new radius is at most $r/4 + r/4 = r/2$.
Another advantage of this shrinking preprocessing: a pair $x, y$ is separated only when $d(x,y) \geq r/n^2$, and it is separated for sure when $d(x,y) > r$. Using this observation in the calculation from the previous section can change the $O(\log n \log \Delta)$ to just $O(\log n \cdot \min(\log n, \log \Delta))$. But to get the ultimate $O(\log n)$ guarantee, we’ll need a different decomposition procedure.
2.2. Algorithm II: The CKR Decomposition
The procedure for the decomposition, CKR-Decomposition$(V, d, r)$, is a little less intuitive, but very easy to state:
- Choose $R \in [r/4, r/2]$ uniformly at random.
- Choose a random permutation $\pi : V \to V$ uniformly at random.
- Consider the vertices one by one, in the order given by $\pi$. When we consider $w$, we assign all the yet-unassigned vertices $v$ with $d(w, v) \leq R$ to $w$’s partition.
For example, suppose the ordering given by $\pi$ is $v_1, v_2, v_3, v_4$. The figure below illustrates the coverage when the vertices are visited by this process.
This construction directly implies the low-radius property, restated in the following claim.
Lemma 5 (Low Radius) The output of the algorithm has the property that for all $S_i$, there exists $c_i \in V$ such that for all $u \in S_i$, $d(c_i, u) \leq r/2$.
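The three steps of the procedure, and the low-radius property, can be sketched in Python as below. This is an illustrative sketch only: the name `ckr_decomposition` and the representation of the output as a map from each vertex to the center that captured it are assumptions made here.

```python
import random

def ckr_decomposition(points, dist, r, rng):
    """Map each vertex to the center that captured it (radius at most r/2)."""
    radius = rng.uniform(r / 4, r / 2)     # R chosen uniformly from [r/4, r/2]
    order = list(points)
    rng.shuffle(order)                     # the random permutation pi
    center_of = {}
    for w in order:                        # visit vertices in the order given by pi
        for v in points:
            if v not in center_of and dist(w, v) <= radius:
                center_of[v] = w           # v joins w's part
    return center_of
```

Every vertex is assigned by the time it is visited itself (it lies at distance $0$ from itself), and each vertex is within the sampled radius, hence within $r/2$, of its center.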
The real work is in showing that each pair $(x, y)$ is separated with small probability. Before proving this, let us state two definitions useful for the proof. For the analysis only: suppose we re-number the vertices $w_1, w_2, \ldots, w_n$ in order of the distance from the closer of $x, y$.
- (Settling) At some time instant in this procedure, one (or both) of $x$ or $y$ gets assigned to some $w_i$. We say that $w_i$ settles the pair $(x, y)$.
- (Cutting) At the moment the pair is settled, if only one vertex of this pair is assigned, then we say that $w_i$ cuts the pair $(x, y)$.
According to these definitions, each pair is settled at exactly one time instant in the procedure, and it may or may not be cut at that time. Of course, once the pair is settled (with or without being cut), it is never cut in the future.
Now to bound the separation probability. Consider $w_j$, and let $a_j = d(w_j, x)$ and $b_j = d(w_j, y)$. Assume $a_j < b_j$ (the other case is identical). If $w_j$ cuts $(x, y)$ when the random values are $R$ and $\pi$, the following two properties must hold:
- The random variable $R$ must lie in the interval $[a_j, b_j]$ (else either none or both endpoints of $(x, y)$ would get marked).
- The node $w_j$ must come before $w_1, \ldots, w_{j-1}$ in the permutation $\pi$.
Suppose not, and one of them came before $w_j$ in the permutation. Since all these vertices are closer to the pair than $w_j$ is, for the current value of $R$ they would have settled the pair (capturing one or both of the endpoints) at some previous time point, and hence $w_j$ would not settle, and hence not cut, the pair $(x, y)$.
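With these two properties, and using $b_j - a_j \leq d(x,y)$ (the triangle inequality) together with the independence of $R$ and $\pi$, we establish
$$\Pr[(x,y) \text{ is separated}] \leq \sum_{j} \Pr[w_j \text{ cuts } (x,y)] \leq \sum_{j} \Pr[R \in [a_j, b_j]] \cdot \Pr[w_j \text{ precedes } w_1, \ldots, w_{j-1} \text{ in } \pi] \leq \sum_{j=1}^{n} \frac{d(x,y)}{r/4} \cdot \frac{1}{j} = \frac{4\, d(x,y)}{r} H_n = O(\log n) \cdot \frac{d(x,y)}{r}.$$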
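But we wanted to do better than that! No worries, the fix is easy, but clever. First, note that if $d(x,y) \geq r/8$ then the probability of separating $x, y$ is at most $1 \leq 8\, d(x,y)/r$. So suppose $d(x,y) < r/8$. Now, for $w_j$ to cut $(x, y)$, it is not enough that $R \in [a_j, b_j]$ and that $w_j$ comes before all $w_i$ for $i < j$. It must also be the case that $w_j$ is at most $r/2$ from the closer of the pair (say $x$) to even reach one of the vertices, let alone separate them, and at least $r/4$ from the further one (say $y$) so that some setting of $R$ would have a chance to separate the two. So the distance of $w_j$ from the closer endpoint must be at most $r/2$, and at least $r/4 - d(x,y) \geq r/8$. In particular, $w_j$ lies outside $B(x, r/8)$ and inside $B(x, r/2 + d(x,y)) \subseteq B(x, r)$. If we restrict the harmonic sum in the final expression to just the vertices that satisfy these bounds, we get the bound
$$\frac{d(x,y)}{r/4} \left( \frac{1}{|B(x, r/8)| + 1} + \frac{1}{|B(x, r/8)| + 2} + \cdots + \frac{1}{|B(x, r)|} \right) \leq \frac{4\, d(x,y)}{r} \ln \frac{|B(x, r)|}{|B(x, r/8)|},$$
and hence the bound in Theorem 4.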
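The step from a restricted harmonic sum to a logarithm of a ratio of ball sizes is the usual integral comparison. As a short worked derivation, with $k$ and $m$ as shorthand introduced here for the sizes of the inner and outer balls around $x$ (so $1 \leq k \leq m \leq n$, since $x$ itself lies in the inner ball):

```latex
\sum_{j=k+1}^{m} \frac{1}{j}
  \;\le\; \sum_{j=k+1}^{m} \int_{j-1}^{j} \frac{dt}{t}
  \;=\; \int_{k}^{m} \frac{dt}{t}
  \;=\; \ln \frac{m}{k}
  \;\le\; \ln n .
```

The first inequality holds because $1/j \leq 1/t$ for every $t \in [j-1, j]$. The final $\ln n$ recovers the earlier $O(\log n)$ bound as a special case, while the intermediate $\ln(m/k)$ is the form that telescopes across scales.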
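Theorem 6 (FRT 2003) Using the decomposition procedure from Theorem 4 in the TreeEmbed algorithm, we get that for all $x, y \in V$:
$$E_{T \sim \mathcal{D}}[d_T(x,y)] \leq O(\log n) \cdot d(x,y).$$
The proof for the TreeEmbed algorithm remains essentially unchanged, except for the final calculations. Level $i$ uses $r = 2^{i-1}$. The levels with $d(x,y) \geq r/8$ contribute at most $\sum_{i \,:\, 2^{i-4} \leq d(x,y)} 4 \cdot 2^i = O(d(x,y))$ in total, and for the remaining levels
$$\sum_{i} \Pr[(x,y) \text{ separated at level } i] \cdot 4 \cdot 2^i \leq \sum_{i} \frac{4\, d(x,y)}{2^{i-1}} \ln \frac{|B(x, 2^{i-1})|}{|B(x, 2^{i-4})|} \cdot 4 \cdot 2^i = 32\, d(x,y) \sum_{i} \left( \ln |B(x, 2^{i-1})| - \ln |B(x, 2^{i-4})| \right) = O(\log n) \cdot d(x,y),$$
where the last equality follows from observing that we have a telescoping sum.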
Citations: The $O(\log n \log \Delta)$ construction was due to Yair Bartal (1996); this substantially improved on the first non-trivial guarantee of $2^{O(\sqrt{\log n \log\log n})}$ due to Alon, Karp, Peleg and West (1992). The low-diameter decomposition is also from Bartal. The $O(\log n)$ algorithm is by Fakcharoenphol, Rao, and Talwar (2003), based on the improved decomposition scheme due to Calinescu, Karloff and Rabani (2000).
3. Lower Bounds
Let us show two lower bounds: first, that no randomized low-diameter decomposition can achieve better than $\beta = \Omega(\log n)$ for general metrics, and second, that no random tree embedding can do better than expected stretch $\alpha = \Omega(\log n)$ either.
3.1. Lower Bounds for Decompositions
First, given a graph $G = (V, E)$ with $m$ unit length edges, if we apply a $\beta$-decomposition with parameter $r$ to the graph metric $d_G$, we will cut each edge with probability at most $\beta / r$. The expected number of cut edges will be at most $\beta m / r$. So, for each $r$, the probabilistic method says there exists a diameter-$r$ partition that cuts at most $\beta m / r$ edges.
Let $G$ be a graph with $n$ nodes and $cn$ edges (with $c > 1$), where the girth of the graph (the length of the shortest simple cycle) is at least $g = c' \log n$ (for some constant $c' > 0$). Such graphs are known to exist; this can be shown by the probabilistic method.
Now, if we set $r = g/3 = \frac{c'}{3} \log n$ and consider any diameter-$r$ partition: we claim no set $S$ in this partition can induce a cycle. Indeed, since every cycle has length at least $g$, it contains two points at distance about $g/2 > r$ from each other (a shorter path between them would close a cycle of length less than $g$). So all sets induce a forest, which means the number of internal edges is at most $n - 1 < n$. This means at least $(c - 1)n$ edges are cut.
Cool. For every diameter-$r$ partition, at least $(c - 1)n$ edges are cut because of the large girth property. But there exists one that cuts at most $\beta c n / r$ edges, because we have a good decomposition algorithm. So now we put the two facts together:
$$(c - 1)\, n \leq \frac{\beta c n}{r} \quad \Longrightarrow \quad \beta \geq \frac{c - 1}{c} \cdot r = \Omega(\log n).$$
3.2. Lower Bounds for Random Tree Embeddings
Suppose there is a distribution $\mathcal{D}$ over trees that achieves expected stretch $\alpha$ for the large-girth graphs above. Let’s use this to obtain a low-diameter decomposition with cutting parameter $\beta = O(\alpha)$; this will mean $\alpha = \Omega(\log n)$.
Sample a tree $T$ from the distribution, pick an arbitrary vertex $v$, and pick a random value $R \in [0, r/2)$ uniformly. Delete all edges that contain points at distance exactly $R, \; R + r/2, \; R + r, \; R + 3r/2, \ldots$ in $T$ from $v$. The remaining forest has components with radius at most $r/2$, and diameter at most $r$, in the tree. Since distances in the original graph are only smaller, the diameter of each part will only be less in the original graph.
Moreover, given the tree $T$, a pair $x, y$ will be separated with probability at most $\frac{d_T(x,y)}{r/2}$. Taking expectations, the total probability that $x, y$ are separated is at most
$$E_{T \sim \mathcal{D}}\left[ \frac{2\, d_T(x,y)}{r} \right] \leq 2\alpha \cdot \frac{d(x,y)}{r}.$$
So we have a decomposition scheme with parameter $\beta = 2\alpha$. And combining this with the previous lower bound on any decomposition scheme, we get $\alpha = \Omega(\log n)$.
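The tree-cutting step in this reduction can be sketched in Python as follows. This is a hedged illustration: the adjacency-list input format, the name `cut_tree`, and labelling each component by its topmost vertex are all assumptions made here, and the tree is taken as given rather than sampled from any particular distribution.

```python
import math
import random

def cut_tree(adj, root, r, rng):
    """Cut a weighted tree at depths offset + k*(r/2) from `root`.

    adj maps each vertex to a list of (neighbor, length) pairs.
    Returns (label, depth): label[v] is the topmost vertex of v's component.
    """
    offset = rng.uniform(0, r / 2)         # the random value R

    def band(h):
        # Index of the band between consecutive cut levels that depth h falls in.
        return math.floor((h - offset) / (r / 2))

    depth = {root: 0.0}
    label = {root: root}
    stack = [root]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v in depth:
                continue
            depth[v] = depth[u] + w
            # The edge (u, v) is deleted iff a cut level lies in (depth[u], depth[v]].
            label[v] = label[u] if band(depth[v]) == band(depth[u]) else v
            stack.append(v)
    return label, depth
```

Within a component, every vertex lies less than $r/2$ below the component's topmost vertex, which gives the tree-radius bound used above; whether two given vertices end up separated depends only on the random offset.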