Lecture #16: Distance-preserving trees (part II)
1. Embeddings into Distributions over Trees
In this section, we prove the following theorem using tree embeddings (and then, in the following section, we improve the bound further to $O(\log n)$).
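Theorem 1 Given any metric $(V, d)$ with $|V| = n$ and aspect ratio $\Delta$, there exists an efficiently sampleable distribution $\mathcal{D}$ over spanning trees of $V$ such that for all $u, v \in V$:
- For all $T \in \mathrm{Support}(\mathcal{D})$, $d_T(u,v) \geq d(u,v)$, and
- $\mathbb{E}_{T \sim \mathcal{D}}[d_T(u,v)] \leq O(\log n \log \Delta) \cdot d(u,v)$.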
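To prove this theorem, we will use the idea of a low diameter decomposition. Given a metric space $(V, d)$ on $|V| = n$ points and a parameter $r \in \mathbb{R}_+$, a (randomized) low-diameter decomposition is an efficiently sampleable probability distribution over partitions of $V$ into $S_1 \uplus S_2 \uplus \dots \uplus S_t$ such that
- (Low Radius/Diameter) For all $S_i$, there exists $c_i \in V$ such that for all $u \in S_i$, $d(c_i, u) \leq r/2$. Hence, for any $u, v \in S_i$, $d(u,v) \leq r$.
- (Low Cutting Probability) For each pair $u, v$, $\Pr[u, v \text{ lie in different parts } S_i] \leq \beta \cdot d(u,v)/r$ with $\beta = O(\log n)$.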
We’ll show how to construct such a decomposition in the next section (next lecture), and use such a decomposition to prove Theorem 1.
Consider the following recursive algorithm, which takes as input a pair $(X, \delta)$ where $X$ is a set of vertices of diameter at most $2^\delta$, and returns a rooted tree $(T, \rho)$.
TreeEmbed$(X, \delta)$:
- Apply the low-diameter decomposition to $(X, d)$ with the parameter $r = 2^{\delta - 1}$ to get the partition $X_1, X_2, \dots, X_t$.
- Recurse: Let $(T_j, \rho_j) \leftarrow$ TreeEmbed$(X_j, \delta - 1)$. As a base case, when $X_j$ is a single point, simply return that point.
- For every tree $T_j$ with $j > 1$, add the edge $(\rho_1, \rho_j)$ with length $2^\delta$. This is a new tree which we denote $T$.
- Return the tree/root pair $(T, \rho_1)$.
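Recall that since the low diameter decomposition is randomized, this algorithm defines a distribution over trees over $X$. To build the tree for $V$, we first rescale so that for all distinct $u, v \in V$, $d(u,v) \geq 1$ and $d(u,v) \leq \Delta \leq 2^\delta$, where $\delta = \lceil \log_2 \Delta \rceil$. We define the distribution $\mathcal{D}$ as the one obtained by calling TreeEmbed$(V, \delta)$.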
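The recursion above can be sketched in Python as follows. This is a minimal illustration, not the notes' own code: the function names (`ball_partition`, `tree_embed`, `tree_distances`) are hypothetical, and `ball_partition` is only a stand-in that guarantees the low-radius property (pieces of radius at most $r/2$) without any bound on the cutting probability. It assumes distinct points are at positive distance and that the input has diameter at most $2^\delta$.

```python
import random

def ball_partition(points, dist, r, rng):
    """Stand-in decomposition (assumption): carve balls of radius r/2 around
    centers taken in random order. Low radius only, no cutting guarantee."""
    remaining = list(points)
    rng.shuffle(remaining)
    parts = []
    while remaining:
        center = remaining[0]
        parts.append([v for v in remaining if dist(center, v) <= r / 2])
        remaining = [v for v in remaining if dist(center, v) > r / 2]
    return parts

def tree_embed(points, dist, delta, decompose, rng, edges):
    """Append tree edges (u, v, length) to `edges` and return the root."""
    if len(points) == 1:                      # base case: a single point
        return points[0]
    parts = decompose(points, dist, 2.0 ** (delta - 1), rng)
    roots = [tree_embed(p, dist, delta - 1, decompose, rng, edges)
             for p in parts]
    for root in roots[1:]:                    # connect every other root to the first
        edges.append((roots[0], root, 2.0 ** delta))
    return roots[0]

def tree_distances(points, edges):
    """All-pairs distances in the tree, by a traversal from every vertex."""
    adj = {v: [] for v in points}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    result = {}
    for s in points:
        d = {s: 0.0}
        stack = [s]
        while stack:
            u = stack.pop()
            for v, w in adj[u]:
                if v not in d:
                    d[v] = d[u] + w
                    stack.append(v)
        result[s] = d
    return result
```

Passing a different `decompose` function with the same signature swaps in another decomposition without touching the recursion.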
Lemma 2 For all $x, y \in V$, $d_T(x,y) \geq d(x,y)$ for all $T \in \mathrm{Support}(\mathcal{D})$.
Proof: Fix $x$ and $y$, and let $i$ be such that $d(x,y) \in (2^{i-1}, 2^i]$. Consider the invocation of TreeEmbed$(X, i)$ such that $x \in X$. First, we examine the case in which $y \in X$. By the definition of the low diameter decomposition, since $d(x,y) > 2^{i-1}$, $x$ and $y$ will fall into separate parts of the partition obtained in the first step, and so we will have $d_T(x,y) \geq 2^i \geq d(x,y)$, where $2^i$ is the length of the edge placed between different subtrees. In the case in which $y \notin X$, it must be that $x$ and $y$ have been separated at a higher level $i' > i$ of the recursion, are consequently separated by a higher subtree edge, and hence $d_T(x,y) \geq 2^{i'} > 2^i \geq d(x,y)$.
Lemma 3 For all $x, y \in V$, $\mathbb{E}_{T \sim \mathcal{D}}[d_T(x,y)] \leq O(\log n \log \Delta) \cdot d(x,y)$.
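Proof: We begin with two easy subclaims. Suppose $(T, \rho) \leftarrow$ TreeEmbed$(X, \delta)$:
- Claim 1: $d_T(\rho, x) \leq 2^{\delta+1}$ for all $x \in X$. By induction, $x$ lies in some piece $X_j$ of the partition having diameter at most $2^{\delta-1}$ and hence inductively is at distance at most $2^{\delta}$ from its root $\rho_j$. That root is connected to the root $\rho$ by an intertree edge of weight $2^\delta$, giving us $2^{\delta+1}$ in total.
- Claim 2: If $x, y \in X$, then $d_T(x,y) \leq 2 \cdot 2^{\delta+1}$. From the previous claim, each of $x$ and $y$ is at distance at most $2^{\delta+1}$ from $\rho$, distances are symmetric, and the triangle inequality applies.
We now have from the definition:
$$\mathbb{E}[d_T(x,y)] \leq \sum_{i=0}^{\delta} \Pr[(x,y) \text{ first separated at level } i] \cdot 4 \cdot 2^{i} \leq \sum_{i=0}^{\delta} \beta \, \frac{d(x,y)}{2^{i-1}} \cdot 4 \cdot 2^{i} = 8 (\delta + 1) \, \beta \, d(x,y),$$
where level $i$ refers to an invocation TreeEmbed$(\cdot, i)$, which uses decomposition parameter $2^{i-1}$; the first inequality follows from our subclaims, and the second follows from the property of the low diameter decomposition. Setting $\beta = O(\log n)$ and $\delta = O(\log \Delta)$ completes the proof.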
The two lemmas above prove Theorem 1. How do we implement these low diameter decompositions? And how can we get the promised $O(\log n)$? Keep reading…
2. Low Diameter Decompositions
Recall the definition of a (randomized) low-diameter decomposition from above: given a metric $(V, d)$ and a bound $r$, we want a partition with pieces of radius at most $r/2$, and want vertices $x, y$ to be separated with “small” probability $\beta \cdot d(x,y)/r$ (i.e., proportional to their distance, and inversely proportional to $r$).
Before we proceed, think about how you’d get such a decomposition for a line metric, or a tree metric, with $\beta = O(1)$; moreover, you cannot hope to get subconstant $\beta$ even for the line. So the theorem says that general graphs lose a factor $O(\log n)$ more, which is not bad at all! (And this factor is existentially optimal; we will show a tight example.)
2.1. Algorithm I: Geometric Region Growing
To make our life easier, we’ll assume that all distances in the metric are at least $r/n^2$. (We can enforce this via a pre-processing without much effort, I’ll come back to it.)
The algorithm just picks a “truncated” geometric distance $R$, carves out a piece of radius $R$ around some vertex, and repeats until the metric is eaten up.
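Geom-Regions$(X, d, r)$:
- Choose $R \sim \mathrm{Geom}(p)$ with $p = 4 \ln n / r$; if $R > r/2$, then set $R \leftarrow r/2$.
- Pick an arbitrary vertex $u \in X$, and set $S_1 \leftarrow \{ v \in X : d(u, v) \leq R \}$.
- Return $\{S_1\} \cup$ Geom-Regions$(X \setminus S_1, d, r)$.
Clearly, the radius bound is maintained by the fact that $R \leq r/2$ with probability $1$.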
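The steps above can be sketched in Python as follows. This is a hedged illustration: the names `truncated_geometric` and `geom_regions` are hypothetical, the recursion is unrolled into a loop, the coin bias is taken to be $p = 4 \ln n / r$ (capped at 1), and the truncation uses $\lfloor r/2 \rfloor$ so that the integer radius never exceeds $r/2$.

```python
import math
import random

def truncated_geometric(p, cap, rng):
    """Count tails before the first heads of a bias-p coin, stopping at `cap`."""
    radius = 0
    while radius < cap and rng.random() >= p:
        radius += 1
    return radius

def geom_regions(points, dist, r, rng, n=None):
    """Partition `points` into balls of truncated-geometric radius.

    `n` is the size of the whole metric (defaults to len(points)); it only
    enters through the coin bias p = 4 ln(n) / r, an assumed parameter choice.
    """
    if not points:
        return []
    n = len(points) if n is None else n
    p = min(1.0, 4 * math.log(n) / r)
    cap = math.floor(r / 2)
    remaining = list(points)
    parts = []
    while remaining:
        radius = truncated_geometric(p, cap, rng)
        center = remaining[0]                 # "an arbitrary vertex"
        parts.append([v for v in remaining if dist(center, v) <= radius])
        remaining = [v for v in remaining if dist(center, v) > radius]
    return parts
```

Each returned part lists its center first, so the radius property can be checked directly against `part[0]`.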
What’s the chance that $x, y$ lie in separate parts? Let’s view this process as picking a vertex $u$ and starting with a ball of radius zero around it; then we flip a coin with bias $p = 4 \ln n / r$, increasing the radius by one after each tails, until either we see a heads or we reach $r/2$ tails, when we cut out the piece. And then we pick another vertex, and repeat the process.
Consider the first time when one of $x, y$ lies in the current ball. Note that either this ball will eventually contain both of them, or will separate them. And to separate them, it must make a cut within the next $d(x,y)$ steps. The chance of this is at most the chance of seeing a heads from a bias-$p$ coin in $d(x,y)$ steps, plus the chance that a $\mathrm{Geom}(p)$ r.v. sees more than $r/2$ tails in a row. Using a naive union bound for the former, we get
$$\Pr[x, y \text{ separated}] \leq p \cdot d(x,y) + (1-p)^{r/2} \leq \frac{4 \ln n}{r} \, d(x,y) + \frac{1}{n^2}.$$
We now use the fact that all distances are at least $r/n^2$ to claim that $1/n^2 \leq d(x,y)/r$ and hence the probability of $x, y$ separated is at most $(4 \ln n + 1) \, d(x,y)/r$, which proves the second property of the decomposition.
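As a quick check of the truncation term, here is the arithmetic spelled out. It assumes the coin bias is $p = 4 \ln n / r$ (the choice under which the numbers close) and uses the standard inequality $1 - p \leq e^{-p}$.

```latex
\begin{align*}
(1-p)^{r/2} \;\le\; e^{-p r/2}
  \;=\; \exp\!\Big(-\frac{4\ln n}{r}\cdot\frac{r}{2}\Big)
  \;=\; e^{-2\ln n} \;=\; \frac{1}{n^{2}},
\qquad
\frac{1}{n^{2}} \;=\; \frac{r/n^{2}}{r} \;\le\; \frac{d(x,y)}{r}.
\end{align*}
```

Adding the union-bound term $p \cdot d(x,y)$ then gives a separation probability of at most $(4 \ln n + 1)\, d(x,y)/r$, i.e. $\beta = O(\log n)$.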
Finally, the loose ends: to enforce the minimum-distance condition that $d(x,y) \geq r/n^2$, just think of the metric as a complete graph with edge-lengths $\ell_{xy} = d(x,y)$, contract all edges $(x,y)$ with $d(x,y) < r/n^2$, and recompute edge lengths to get the new metric $d'$. Running the decomposition Geom-Regions$(V, d', r/2)$ on this shrunk metric, and then unshrinking the edges, will ensure that each pair is separated with probability either $0$ (if it has length $< r/n^2$), or probability at most $(8 \ln n + 2) \, d(x,y)/r$. And finally, since the output had radius at most $r/4$ according to $d'$, and any path has at most $n$ nodes so that its length can change by at most $n \cdot r/n^2 \leq r/4$ for $n \geq 4$, the new radius is at most $r/2$!
Another advantage of this shrinking preprocessing: a pair $x, y$ is separated only when $d(x,y) \geq r/n^2$, and it is separated for sure when $d(x,y) > r$. Using this observation in the calculation from the previous section can change the $O(\log n \log \Delta)$ to just $O(\log n \cdot \min(\log n, \log \Delta))$. But to get the ultimate $O(\log n)$ guarantee, we’ll need a different decomposition procedure.
2.2. Algorithm II: The CKR Decomposition
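Theorem 4 (The Better Decomposition) There exists an efficiently sampleable probability distribution $\mathcal{D}$ over partitions with parts having radius at most $r/2$ such that
$$\Pr[x, y \text{ separated by the partition}] \leq \frac{4 \, d(x,y)}{r} \, \log \frac{|B(x, r/2)|}{|B(x, r/8)|},$$
where $B(x, s) = \{ z : d(x, z) \leq s \}$.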
The procedure for the decomposition is a little less intuitive, but very easy to state:
CKR Decomposition$(V, d, r)$:
- Choose $R \in [r/4, r/2]$ uniformly at random.
- Choose a random permutation $\pi : V \to V$ uniformly at random.
- Consider the vertices one by one, in the order given by $\pi$. When we consider $w$, we assign all the yet-unassigned vertices $v$ with $d(w, v) \leq R$ to $w$’s partition.
For example, the figure below illustrates the coverage when the vertices are visited by this process in the order given by $\pi$.

This construction directly implies the low-radius property, restated in the following claim.
Lemma 5 (Low Radius) The output of the algorithm has the property that for all $S_i$, there exists $c_i \in V$ such that for all $u \in S_i$, $d(c_i, u) \leq r/2$.
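The procedure is short enough to sketch directly in Python. This is an illustrative version under the assumption that the random radius is drawn from $[r/4, r/2]$; the function name `ckr_decomposition` and the dictionary-of-parts return format are choices made here, not part of the notes.

```python
import random

def ckr_decomposition(points, dist, r, rng):
    """Return {center: [vertices assigned to that center]}.

    One radius R is shared by all centers, and centers are visited in a
    uniformly random order; each vertex joins the first center that reaches it.
    """
    radius = rng.uniform(r / 4, r / 2)        # assumed range for R
    order = list(points)
    rng.shuffle(order)                        # the random permutation pi
    center_of = {}
    for w in order:
        for v in points:
            if v not in center_of and dist(w, v) <= radius:
                center_of[v] = w
    parts = {}
    for v, w in center_of.items():
        parts.setdefault(w, []).append(v)
    return parts
```

Every vertex is assigned at the latest when it is itself visited, since it is at distance $0$ from itself, so the parts always form a partition, and each vertex lies within $R \leq r/2$ of its center, which is the low-radius property.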
The real work is in showing that each pair $(x, y)$ is separated with small probability. Before proving this, let us state two definitions useful for the proof. For the analysis only: suppose we re-number the vertices $w_1, w_2, \dots, w_n$ in order of the distance from the closer of $x, y$.
- (Settling) At some time instant in this procedure, one (or both) of $x$ or $y$ gets assigned to some $w_i$. We say that $w_i$ settles the pair $(x, y)$.
- (Cutting) At the moment the pair is settled, if only one vertex of this pair is assigned, then we say that $w_i$ cuts the pair $(x, y)$.
According to these definitions, each pair is settled at exactly one time instant in the procedure, and it may or may not be cut at that time. Of course, once the pair is settled (with or without being cut), it is never cut in the future.
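Now to bound the separation probability. Consider $w_j$, and let $a_j = d(w_j, x)$ and $b_j = d(w_j, y)$. Assume $a_j < b_j$ (the other case is identical). If $w_j$ cuts $(x, y)$ when the random values are $R$ and $\pi$, the following two properties must hold:
- The random variable $R$ must lie in the interval $[a_j, b_j)$ (else either none or both endpoints of $(x, y)$ would get marked).
- The node $w_j$ must come before $w_1, \dots, w_{j-1}$ in the permutation $\pi$.
Suppose not, and one of them came before $w_j$ in the permutation. Since all these vertices are closer to the pair than $w_j$ is, then for the current value of $R$, they would have settled the pair (capturing one or both of the endpoints) at some previous time point, and hence $w_j$ would not settle (and hence not cut) the pair $(x, y)$.
With these two properties, and using that $R$ and $\pi$ are independent and that $b_j - a_j \leq d(x,y)$ by the triangle inequality, we establish
$$\Pr[(x,y) \text{ separated}] = \sum_{j} \Pr[w_j \text{ cuts } (x,y)] \leq \sum_{j} \Pr\big[R \in [a_j, b_j) \text{ and } w_j \text{ precedes } w_1, \dots, w_{j-1} \text{ in } \pi\big] \leq \sum_{j=1}^{n} \frac{d(x,y)}{r/4} \cdot \frac{1}{j} = \frac{4 \, d(x,y)}{r} \, H_n = \frac{d(x,y)}{r} \cdot O(\log n).$$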
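But we wanted to do better than that! No worries, the fix is easy, but clever. First, note that if $d(x,y) \geq r/8$ then the probability of separating $x, y$ is at most $1 \leq 8 \, d(x,y)/r$. So suppose $d(x,y) < r/8$. Now, for $w_j$ to cut $(x, y)$, it is not enough for $R \in [a_j, b_j)$ and for $w_j$ to come before all $w_i$ for $i < j$. It must also be the case that $w_j$ is at most $r/2$ from the closer of the pair (say $x$) to even reach one of the vertices, let alone separate them, and at least $r/4$ from the further one (say $y$) so that some setting of $R$ would have a chance to separate the two. So the distance of $w_j$ from $x$ must be at most $r/2$, and at least $r/4 - d(x,y) \geq r/8$, and the same for its distance from $y$. If we restrict the harmonic sum in the final expression over just the vertices that satisfy these bounds, we get the bound
$$\frac{4 \, d(x,y)}{r} \left( \frac{1}{|B(x, r/8)| + 1} + \frac{1}{|B(x, r/8)| + 2} + \dots + \frac{1}{|B(x, r/2)|} \right) \leq \frac{4 \, d(x,y)}{r} \, \ln \frac{|B(x, r/2)|}{|B(x, r/8)|},$$
and hence the bound in Theorem 4.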
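Theorem 6 (FRT 2003) Using the decomposition procedure from Theorem 4 in the TreeEmbed algorithm, we get that for all $x, y \in V$:
$$\mathbb{E}[d_T(x,y)] \leq O(\log n) \cdot d(x,y).$$
The proof for the TreeEmbed algorithm remains essentially unchanged, except for the final calculations:
$$\mathbb{E}[d_T(x,y)] \leq \sum_{i=0}^{\delta} \Pr[(x,y) \text{ separated at level } i] \cdot 4 \cdot 2^{i} \leq \sum_{i=0}^{\delta} \frac{4 \, d(x,y)}{2^{i-1}} \, \log \frac{|B(x, 2^{i-2})|}{|B(x, 2^{i-4})|} \cdot 4 \cdot 2^{i} = 32 \, d(x,y) \sum_{i=0}^{\delta} \Big( \log |B(x, 2^{i-2})| - \log |B(x, 2^{i-4})| \Big) = O(\log n) \cdot d(x,y),$$
where the last equality follows from observing that we have a telescoping sum.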
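To see the telescoping explicitly, write $b_k = \log |B(x, 2^{k})|$ (a shorthand introduced here only for illustration). This sketch assumes the level-$i$ decomposition uses parameter $r = 2^{i-1}$ and that the bound of Theorem 4 compares balls of radius $r/2$ and $r/8$, so that level $i$ contributes $b_{i-2} - b_{i-4}$.

```latex
\begin{align*}
\sum_{i=0}^{\delta} \big(b_{i-2} - b_{i-4}\big)
  &= \sum_{i=0}^{\delta} \big(b_{i-2} - b_{i-3}\big)
   + \sum_{i=0}^{\delta} \big(b_{i-3} - b_{i-4}\big) \\
  &= \big(b_{\delta-2} - b_{-3}\big) + \big(b_{\delta-3} - b_{-4}\big)
  \;\le\; 2 \log n .
\end{align*}
```

The last step uses $0 \leq b_k \leq \log n$, since every ball contains $x$ and at most $n$ points. The number of levels $\delta$ therefore drops out of the bound, which is how the $\log \Delta$ factor disappears.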
Citations: The $O(\log n \log \Delta)$ construction was due to Yair Bartal (1996); this substantially improved on the first non-trivial guarantee of $2^{O(\sqrt{\log n \log \log n})}$ due to Alon, Karp, Peleg and West (1992). The low-diameter decomposition is also from Bartal. The $O(\log n)$ algorithm is by Fakcharoenphol, Rao, and Talwar (2003), based on the improved decomposition scheme due to Calinescu, Karloff and Rabani (2000).
3. Lower Bounds
Let us show two lower bounds: first, that no randomized low-diameter decomposition can achieve better than $\beta = \Omega(\log n)$ for general metrics; and second, that no random tree embedding can do better than expected stretch $\alpha = \Omega(\log n)$ either.
3.1. Lower Bounds for Decompositions
First, given a graph $G = (V, E)$ with unit length edges and $m = |E|$, if we apply a $\beta$ decomposition with parameter $r$ to the graph metric $d_G$, we will cut each edge with probability at most $\beta / r$. The expected number of cut edges will be at most $\beta m / r$. So, for each $r$ the probabilistic method says there exists a diameter-$r$ partition that cuts at most $\beta m / r$ edges.
Let $G$ be a graph with $n$ nodes and $cn$ edges (with $c > 1$), where the girth of the graph (the length of the shortest simple cycle) is at least $g = c' \log n$ (for some constant $c' > 0$). Such graphs are known to exist; this can be shown by the probabilistic method.
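Now, if we set $r = g/3$ and consider any diameter-$r$ partition: we claim no set $S$ in this partition can induce a cycle. Indeed, since every cycle is of length at least $g$, two furthest points in the cycle would be at distance at least $g/2 > r$ from each other. (More carefully: $S$ lies in a ball of radius $r$ around any of its points, and in a graph of girth $g > 2r + 1$ such a ball induces a tree.) So all sets induce a forest, which means the number of internal edges is at most $n - 1$. This means at least $cn - (n - 1) > (c - 1) n$ edges are cut.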
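Cool. For every diameter-$r$ partition, at least $(c-1)n$ edges are cut because of the large girth property. But there exists one that cuts at most $\beta c n / r$ edges, because we have a good decomposition algorithm. So now we put the two facts together:
$$(c - 1)\, n \leq \frac{\beta \, c \, n}{r} \quad \Longrightarrow \quad \beta \geq \frac{c-1}{c} \cdot r = \Omega(\log n).$$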
3.2. Lower Bounds for Random Tree Embeddings
Suppose there is a distribution $\mathcal{D}$ that achieves expected stretch $\alpha$ for the large-girth graphs above. Let’s use this to obtain a low-diameter decomposition with cutting parameter $\beta = O(\alpha)$; this will mean $\alpha = \Omega(\log n)$.
Sample a tree $T$ from the distribution, pick an arbitrary vertex $v$, and pick a random value $R \in [0, r/2)$ uniformly. Delete all edges that contain points at distance exactly in $\{R, \; r/2 + R, \; r + R, \; 3r/2 + R, \dots\}$ from $v$. The remaining forest has components with radius at most $r/2$, and diameter at most $r$ in the tree. Since distances on the original graph are only smaller, the diameter of each part will only be less in the original graph.
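Moreover, given the tree $T$, a pair $x, y$ will be separated with probability at most $d_T(x,y) / (r/2)$. Taking expectations, the total probability of $x, y$ separated is at most
$$\mathbb{E}_{T \sim \mathcal{D}} \left[ \frac{2 \, d_T(x,y)}{r} \right] \leq \frac{2 \alpha \, d(x,y)}{r}.$$
So we have a decomposition scheme with parameter $\beta = 2\alpha$. And combining this with the previous lower bound on any decomposition scheme, we get $\alpha = \Omega(\log n)$.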
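The tree-cutting step of this reduction can be sketched in Python as follows. This is an illustration under stated assumptions, not the notes' own code: the tree is given as a weighted edge list on vertices `0..n-1`, the thresholds are taken to be $R, R + r/2, R + r, \dots$ with $R$ uniform in $[0, r/2)$, and the name `cut_tree_into_shells` is hypothetical.

```python
import math
import random

def cut_tree_into_shells(n, edges, root, r, rng):
    """Delete tree edges that cross a distance threshold; return (components, depth).

    edges: list of (u, v, length) forming a tree on vertices 0..n-1.
    depth[v] is the tree distance from `root` to v.
    """
    adj = {v: [] for v in range(n)}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    depth = {root: 0.0}
    stack = [root]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + w
                stack.append(v)

    step = r / 2
    offset = rng.uniform(0, step)             # the random value R

    def shell(x):
        # Number of thresholds offset + k*step (k >= 0) that are <= x.
        # Ties (a vertex exactly on a threshold) have probability zero.
        return 0 if x < offset else math.floor((x - offset) / step) + 1

    parent = list(range(n))                   # union-find over kept edges

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for u, v, _ in edges:
        if shell(depth[u]) == shell(depth[v]):   # no threshold strictly inside the edge
            parent[find(u)] = find(v)

    comps = {}
    for v in range(n):
        comps.setdefault(find(v), []).append(v)
    return list(comps.values()), depth
```

Within a component, all depths fall between two consecutive thresholds, so every vertex is within tree distance $r/2$ of the component's vertex closest to the root.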
