Lower bounds for MTS

 

Consider a rooted tree $T=(V,E)$ with leaf set $L \subseteq V$ and positive vertex weights $w : V \setminus L \to \mathbb{R}_+$ that are non-increasing along root-leaf paths. We recall the ultrametric $d_w$ defined on $L$ by

$$d_w(\ell,\ell') := w(\mathsf{lca}(\ell,\ell')).$$

Say that $(L,d_w)$ is a $\tau$-HST metric for some $\tau \geq 1$ if $L$ is the leaf set of some weighted tree $T=(V,E)$, and $d_w$ is the ultrametric corresponding to a weight $w$ on $T$ with the property that $w(y) \leq w(x)/\tau$ whenever $y$ is a child of $x$. (So for a finite metric space, the notions of 1-HST metric and ultrametric are equivalent.) Say that $(L,d)$ is a strict $\tau$-HST metric if we require the stronger property that $w(y) = w(x)/\tau$ whenever $x,y \in V$ are internal nodes and $y$ is a child of $x$.
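To make the definition concrete, here is a small Python sketch (the tree encoding and function names are our own, purely for illustration) that evaluates $d_w$ on two leaves of a weighted rooted tree by looking up $w$ at their least common ancestor:

```python
# Illustrative sketch: evaluate the ultrametric d_w on two leaves of a rooted
# tree with weights w on internal vertices (non-increasing toward the leaves).
# Encoding (our own convention): children[v] lists the children of vertex v.

def leaf_distance(children, weight, root, a, b):
    """Return d_w(a, b) = weight[lca(a, b)] for distinct leaves a, b."""
    # Compute parent pointers and depths with a DFS from the root.
    parent, depth, stack = {root: None}, {root: 0}, [root]
    while stack:
        v = stack.pop()
        for c in children.get(v, []):
            parent[c], depth[c] = v, depth[v] + 1
            stack.append(c)
    # Walk the deeper leaf upward until the two paths meet at the lca.
    while a != b:
        if depth[a] < depth[b]:
            a, b = b, a
        a = parent[a]
    return weight[a]

# A strict 2-HST on four leaves: the weight halves along every root-leaf edge.
children = {"r": ["x", "y"], "x": ["a", "b"], "y": ["c", "d"]}
weight = {"r": 1.0, "x": 0.5, "y": 0.5}
print(leaf_distance(children, weight, "r", "a", "b"))  # 0.5  (lca is x)
print(leaf_distance(children, weight, "r", "a", "c"))  # 1.0  (lca is r)
```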

Two metrics $d$ and $d'$ on a set $X$ are $K$-equivalent if there is a constant $c > 0$ such that
$$d(x,y) \leq c\, d'(x,y) \leq K\, d(x,y) \qquad \forall x,y \in X.$$

We leave the following statement as an exercise: For every $\tau > 1$, every 1-HST is $\tau$-equivalent to some strict $\tau$-HST.

Theorem 1. There is a constant $C \geq 1$ such that if $(L,d_w)$ is a $C(\log n)^2$-HST metric, where $n=|L|$, then the competitive ratio for MTS on $(L,d_w)$ is $\Omega(\log n)$.

While the preceding theorem applies only to $\tau$-HSTs with $\tau$ sufficiently large, the next lemma shows how we can improve the separation parameter by passing to a subset of leaves.

If $(L,d)$ is a strict 2-HST, then for every $k \in \mathbb{N}$, there is a subset $L' \subseteq L$ with

$$|L'| \geq |L|^{1/k},$$

and such that $(L',d)$ is a $2^k$-HST metric.

Combining this with Theorem 1 yields:

If $(L,d)$ is a 1-HST metric with $n=|L|$, then the competitive ratio for MTS on $(L,d)$ is at least $\Omega\!\left(\frac{\log n}{\log\log n}\right)$.

Recall that the Metric Ramsey theorem from the preceding lecture implies that every $n$-point metric space contains a subset of size $\sqrt{n}$ that is $O(1)$-equivalent to a 1-HST. This yields:

For every $n$-point metric space, the competitive ratio for MTS is at least $\Omega(\log n/\log\log n)$.

We will not prove Theorem 1, but we will prove a lower bound for two special cases that together capture the essential elements of the full argument. The general case is discussed at the end.

The complete d-ary tree

The first case we'll consider is when $T$ is a $d$-ary tree of height $h$, so that $|L| = d^h$. Define $w(x) = \tau^{-\mathrm{dist}_T(x,r)}$, where $r$ is the root of $T$ and $\mathrm{dist}_T$ denotes the combinatorial distance on $T$. Then $(L,d_w)$ is a $\tau$-HST.
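Concretely, if we index the leaves of the complete $d$-ary tree by strings of length $h$ over $\{0,\ldots,d-1\}$, then the lca of two leaves sits at depth equal to the length of their longest common prefix, so $d_w$ has a simple closed form. A short sketch (the indexing convention here is ours):

```python
def dary_hst_distance(u, v, tau):
    """Distance between two leaves of the d-ary tau-HST of height h = len(u).

    Leaves are indexed by equal-length tuples over {0, ..., d-1}; the lca of
    u and v has depth equal to their longest common prefix, so for u != v we
    have d_w(u, v) = tau ** (-common_prefix_length), and d_w(u, u) = 0.
    """
    assert len(u) == len(v)
    if u == v:
        return 0.0
    common = 0
    for a, b in zip(u, v):
        if a != b:
            break
        common += 1
    return tau ** (-common)

print(dary_hst_distance((0, 1, 2), (0, 1, 0), tau=4.0))  # 4**-2 = 0.0625
print(dary_hst_distance((0, 1, 2), (1, 1, 2), tau=4.0))  # 4**0  = 1.0
```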

Our goal is to establish a lower bound of $\Omega(h \log d)$ on the competitive ratio for MTS when $\tau$ is sufficiently large. Note that when $\tau = \Theta(1)$ and $d = 2$, it is an open problem to exhibit an $\Omega(h)$ lower bound, and proving such a bound is likely the most difficult obstacle in obtaining an $\Omega(\log n)$ lower bound for every $n$-point ultrametric.

Our cost functions will be of the form $\{\epsilon c_{\ell} : \ell \in L\}$, where $\epsilon > 0$ is a number, and

$$c_{\ell}(\ell') = \begin{cases} 0 & \ell' \neq \ell \\ 1 & \ell' = \ell. \end{cases}$$

More succinctly: $c_{\ell} := \mathbf{1}_{\ell}$.

Let $\alpha_h$ be the competitive ratio for height $h$. We will prove inductively that $\alpha_h \geq \Omega(h \log d)$.

Consider first the case $h=1$. We define the height-1 cost sequence as follows. The root $r$ has $d$ leaves beneath it. We do the following $d \log d$ times: Choose a random leaf $\ell$ and play the cost function $c_{\ell}$. It is straightforward to check that if $\mathrm{alg}_1$ denotes the cost of an online algorithm, then
$$\mathbb{E}[\mathrm{alg}_1] \geq \log d.$$

(Note that if a cost arrives at the leaf $\ell$ currently occupied by the algorithm, then it must either move and pay movement cost 1, or stay in place and pay service cost 1. Since this happens with probability $1/d$ in each of the $d \log d$ rounds, the bound above follows.)

Consider the following offline algorithm: Move to the leaf that will be chosen the smallest number of times, and stay there. The movement cost is 1, and a standard coupon-collecting argument shows that the expected service cost is $O(1)$, hence
$$\mathbb{E}[\mathrm{opt}_1] \leq O(1).$$

We conclude that the competitive ratio satisfies $\alpha_1 \geq c \log d$ for some constant $c > 0$.
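The gap between the two expectations above is easy to see empirically. The sketch below (our own illustration, not part of the argument) plays the height-1 cost sequence against a naive online algorithm that jumps to a uniformly random other leaf whenever it is hit, and against the offline strategy just described; the online cost concentrates near $\log d$ while the offline cost stays bounded:

```python
import math
import random

def play_height1(d, trials=2000, seed=0):
    """Monte Carlo estimate of E[alg_1] and E[opt_1] for the height-1 game."""
    rng = random.Random(seed)
    rounds = int(d * math.log(d))
    online_total = offline_total = 0.0
    for _ in range(trials):
        hits = [rng.randrange(d) for _ in range(rounds)]  # leaf chosen per round
        # Naive online algorithm: start at leaf 0; whenever the cost lands on
        # its current leaf it pays 1 (move or stay) and jumps somewhere random.
        pos, online = 0, 0.0
        for ell in hits:
            if ell == pos:
                online += 1.0
                pos = rng.randrange(d)
        # Offline strategy from the text: pay 1 to move to the leaf chosen the
        # fewest times, then pay 1 for each round in which that leaf is chosen.
        counts = [0] * d
        for ell in hits:
            counts[ell] += 1
        offline = 1.0 + min(counts)
        online_total += online
        offline_total += offline
    return online_total / trials, offline_total / trials

print(play_height1(64))  # roughly (log 64 ~ 4.2, small constant)
```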

Consider now the case $h \geq 2$. We define the height-$h$ cost sequence as follows. The root has beneath it $d$ subtrees $T_1, T_2, \ldots, T_d$ of height $h-1$. For some number $\mu_h$ to be chosen shortly, we do the following $\mu_h d$ times: Choose $i \in \{1,2,\ldots,d\}$ uniformly at random and play in $T_i$ the height-$(h-1)$ cost sequence scaled by $1/\tau$.

Consider first what an online algorithm can do. If the algorithm's state lies in $T_i$ when a height-$(h-1)$ cost sequence is imposed on $T_i$, it can either leave $T_i$ at some point during the sequence, or it can remain in $T_i$. Hence the algorithm pays, in expectation, at least $\min\left(1, \tau^{-1}\mathbb{E}[\mathrm{alg}_{h-1}]\right)$. Since the subtree containing the algorithm's state is the one chosen with probability $1/d$ in each of the $\mu_h d$ rounds, it follows that

$$\mathbb{E}[\mathrm{alg}_h] \geq \mu_h \min\left(1, \tau^{-1}\mathbb{E}[\mathrm{alg}_{h-1}]\right) \geq \mu_h \min\left(1, \tau^{-1}\alpha_{h-1}\,\mathbb{E}[\mathrm{opt}_{h-1}]\right).$$

Thus if

$$\tau \geq \alpha_{h-1}\,\mathbb{E}[\mathrm{opt}_{h-1}], \tag{1}$$

it holds that
$$\mathbb{E}[\mathrm{alg}_h] \geq \mu_h\, \tau^{-1} \alpha_{h-1}\, \mathbb{E}[\mathrm{opt}_{h-1}]. \tag{2}$$
We will choose $\tau$ so that (1) holds.

Let us now analyze an offline algorithm. Let $\rho_i$ denote the number of times that $T_i$ is chosen. The offline algorithm first moves to some $T_i$ with $\rho_i$ minimal, and then plays optimally against the level-$(h-1)$ cost sequences arriving in $T_i$.

If $\mu_h \geq \Omega(\log d)$, then the normal approximation to the binomial distribution shows that there is a constant $0 < c < 1$ such that
$$\mathbb{E}\left[\min(\rho_1,\ldots,\rho_d)\right] \leq \mu_h - c\sqrt{\mu_h \log d}.$$

Hence incorporating the movement cost yields
$$\mathbb{E}[\mathrm{opt}_h] \leq 1 + \left(\mu_h - c\sqrt{\mu_h \log d}\right)\tau^{-1}\,\mathbb{E}[\mathrm{opt}_{h-1}].$$
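The deviation term above is the familiar fluctuation scale of $d$ (nearly independent) binomial counts, each with mean $\mu_h$ and standard deviation about $\sqrt{\mu_h}$: the minimum typically falls roughly $\sqrt{\mu_h \log d}$ below the mean. A quick Monte Carlo sketch (ours, for illustration only):

```python
import math
import random

def min_load_gap(d, mu, trials=400, seed=1):
    """Estimate mu - E[min(rho_1, ..., rho_d)] when mu*d balls are thrown
    uniformly into d bins, and report sqrt(mu * log d) for comparison."""
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(trials):
        counts = [0] * d
        for _ in range(mu * d):
            counts[rng.randrange(d)] += 1
        total_gap += mu - min(counts)
    return total_gap / trials, math.sqrt(mu * math.log(d))

print(min_load_gap(d=50, mu=100))  # empirical gap vs. sqrt(mu * log d)
```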

We now choose $\mu_h$ so that
$$c\sqrt{\mu_h \log d}\;\tau^{-1}\,\mathbb{E}[\mathrm{opt}_{h-1}] = 2,$$
i.e.,
$$\mu_h := \frac{4}{\log d}\left(\frac{\tau}{c\,\mathbb{E}[\mathrm{opt}_{h-1}]}\right)^2. \tag{3}$$
Note that from (1), we have $\mu_h \geq \frac{4}{c^2 \log d}\,\alpha_{h-1}^2$.
Since $\alpha_{h-1} \geq \alpha_1 \geq \Omega(\log d)$, it follows that $\mu_h \geq \log d$ for $c$ chosen small enough. (In particular, the normal approximation applies.)

Our choice of $\mu_h$ ensures that
$$\mathbb{E}[\mathrm{opt}_h] \leq \left(\mu_h - \frac{c}{2}\sqrt{\mu_h \log d}\right)\tau^{-1}\,\mathbb{E}[\mathrm{opt}_{h-1}]. \tag{4}$$

Combining this with (2) yields

$$\alpha_h \geq \frac{\mathbb{E}[\mathrm{alg}_h]}{\mathbb{E}[\mathrm{opt}_h]} \geq \frac{\alpha_{h-1}}{1 - \frac{c}{2}\sqrt{\frac{\log d}{\mu_h}}} \geq \alpha_{h-1}\left(1 + \frac{c}{2}\sqrt{\frac{\log d}{\mu_h}}\right),$$

where the latter inequality holds because $c < 1$ and $\mu_h \geq \log d$. Using (3), this gives $\alpha_h \geq \alpha_{h-1} + \frac{c}{2}\log d$,
completing the argument.

Note that in (1), we required $\tau \geq \alpha_{h-1}\,\mathbb{E}[\mathrm{opt}_{h-1}]$. So let us use (4) to compute
$$\mathbb{E}[\mathrm{opt}_h] \leq \mu_h\,\tau^{-1}\,\mathbb{E}[\mathrm{opt}_{h-1}] \leq \frac{O(1)}{\log d}\cdot\frac{\tau}{\mathbb{E}[\mathrm{opt}_{h-1}]}.$$

Since $\mathbb{E}[\mathrm{opt}_h]\,\mathbb{E}[\mathrm{opt}_{h-1}] \leq O(\tau/\log d)$,
and $\mathbb{E}[\mathrm{opt}_h] \geq \mathbb{E}[\mathrm{opt}_{h-1}] \geq 1$ for all $h$, it follows that $\mathbb{E}[\mathrm{opt}_h] \leq O(\sqrt{\tau/\log d})$, meaning that $\tau \geq \alpha_{h-1}\,\mathbb{E}[\mathrm{opt}_{h-1}]$ will be satisfied as long as $\tau \geq C\,(\alpha_{h-1}/\sqrt{\log d})^2$. Since $\alpha_{h-1} \leq O(h \log d)$ and $h \leq \frac{\log n}{\log d}$, taking $\tau \geq C(\log n)^2$ for a suitably large constant $C$ suffices, which completes the proof of Theorem 1 in the special case of a $d$-regular HST.

The “superincreasing” metric

We will now consider a highly unbalanced family of HSTs. These were analyzed in a paper of Karloff, Rabani, and Ravid.

Let us define the trees $\{T_n\}$ as follows: The root $r_n$ of $T_n$ has two children. The left child is a copy of $T_{n-1}$ rooted at $r_{n-1}$, and the right child is a single leaf. We set $w_n(r_n) = 1$ and $w_n(r_{n-1}) = 1/\tau_n$, where $\tau_n > \tau_{n-1} > \cdots > 0$ is a sequence of positive weights we will choose soon, and such that $\tau_n \leq O(\log n)$.

The basic structure of our argument will be similar to the $d$-regular case. We do the following $2\mu_n$ times: Choose $i \in \{L,R\}$ uniformly at random. If $i = L$, we play a height-$(n-1)$ cost sequence on the left subtree (scaled by $1/\tau_n$). Otherwise, we play $c_{\ell}$, where $\ell$ is the leaf constituting the right subtree.

Consider an online algorithm. If the algorithm sits in the left subtree when $i = L$, then either the algorithm moves out of the subtree (movement cost 1), or it incurs the cost of a height-$(n-1)$ inductive sequence scaled by $1/\tau_n$. If it sits in the right subtree when $i = R$, then it pays cost 1 (either by moving or by staying and incurring the service cost). If we choose $\tau_n := \alpha_{n-1}\,\mathbb{E}[\mathrm{opt}_{n-1}]$, then such an algorithm pays, in expectation, at least $\tau_n^{-1}\alpha_{n-1}\,\mathbb{E}[\mathrm{opt}_{n-1}]$ in any of these cases. Since, in expectation, the algorithm's side is targeted in $\mu_n$ of the $2\mu_n$ rounds, we conclude that

$$\mathbb{E}[\mathrm{alg}_n] \geq \mu_n\, \tau_n^{-1} \alpha_{n-1}\, \mathbb{E}[\mathrm{opt}_{n-1}]. \tag{5}$$

We now bound the cost of the optimal offline algorithm. Let $\rho_L$ be the number of times $i = L$ and let $\rho_R := 2\mu_n - \rho_L$. The algorithm will always sit by default in the left subtree. If $\rho_R = 0$, it will move to the right subtree, suffer zero service cost there, and then move back to the left subtree (paying total cost 2). If $\rho_R > 0$, the algorithm will remain in the left subtree and play an optimal strategy for $T_{n-1}$.

This yields
$$\mathbb{E}[\mathrm{opt}_n] \leq \left(1-\mathbb{P}[\rho_R=0]\right)\mathbb{E}\!\left[\rho_L\,\tau_n^{-1}\,\mathrm{opt}_{n-1} \mid \rho_R > 0\right] + 2\,\mathbb{P}[\rho_R=0] \leq \left(1-4^{-\mu_n}\right)\mu_n\,\tau_n^{-1}\,\mathbb{E}[\mathrm{opt}_{n-1}] + \frac{2}{4^{\mu_n}}.$$

Now choose
$$\mu_n := \frac{4\tau_n}{\mathbb{E}[\mathrm{opt}_{n-1}]} = 4\alpha_{n-1},$$
so that $\mu_n\,\tau_n^{-1}\,\mathbb{E}[\mathrm{opt}_{n-1}] = 4$. This yields
$$\mathbb{E}[\mathrm{opt}_n] \leq \left(1 - \frac{1}{2\cdot 4^{4\alpha_{n-1}}}\right)\mu_n\,\tau_n^{-1}\,\mathbb{E}[\mathrm{opt}_{n-1}].$$

Combined with (5), we have
$$\alpha_n \geq \frac{\mathbb{E}[\mathrm{alg}_n]}{\mathbb{E}[\mathrm{opt}_n]} \geq \alpha_{n-1} + \frac{1}{2\cdot 4^{4\alpha_{n-1}}}.$$

This implies that $\alpha_n \geq \beta \log n$ for some positive $\beta < 1$. (To see this, observe that if $f(x) = \beta \log x$, then $f'(x) = \beta/x < \frac{1}{2\cdot 4^{4\beta \log x}}$ for $\beta$ chosen small enough.)
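As a numerical sanity check on this growth rate, one can iterate the recursion directly; the sketch below (an illustration only, with arbitrary choices of the base value $\alpha_1$ and of $\beta$) compares $\alpha_n$ against $\beta \log n$:

```python
import math

def iterate_alpha(n_max, alpha1=1.0, beta=0.05):
    """Iterate alpha_n = alpha_{n-1} + 1/(2 * 4**(4*alpha_{n-1})) and print
    alpha_n next to beta*log(n) at powers of two. The base value alpha1 and
    the constant beta are arbitrary illustrative choices."""
    alpha = alpha1
    for n in range(2, n_max + 1):
        alpha += 1.0 / (2.0 * 4.0 ** (4.0 * alpha))
        if n & (n - 1) == 0:  # n is a power of two
            print(n, round(alpha, 4), round(beta * math.log(n), 4))
    return alpha

iterate_alpha(1 << 16)
```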

Note furthermore that
$$\mathbb{E}[\mathrm{opt}_n] \leq 4,$$
and therefore $\tau_n \leq O(\log n)$.

The general case

We have demonstrated lower bounds in two cases: When the underlying tree $T$ is regular, and when it is unbalanced and binary. The general case can be proved by combining these two strategies along any $\Omega((\log n)^2)$-HST. The next lemma contains the basic idea. We leave it to the reader as a basic exercise. (Hint: Use the concavity of $x \mapsto \sqrt{x}$.)

Suppose that $n = n_1 + n_2 + \cdots + n_m$, where $n_1 \geq n_2 \geq \cdots \geq n_m \geq 1$. Then either $n_1 + n_2 \geq \sqrt{n}$, or there is some $\ell \geq 3$ such that $\ell \sqrt{n_{\ell}} \geq \sqrt{n}$.

Suppose now that $T$ is a rooted tree with subtrees $T_1, T_2, \ldots, T_m$ beneath the root, and such that $T_i$ has $n_i$ leaves with $n_1 \geq n_2 \geq \cdots \geq n_m \geq 1$. The lemma says we can either consider only the first two subtrees (the binary, possibly “unbalanced” case), or we can take the first $\ell$ subtrees for some $\ell \geq 3$, prune them so they all have exactly $n_{\ell}$ leaves, and then prove a lower bound. Analyzing these two cases corresponds roughly to the two lower bound arguments above.
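The following sketch tests the dichotomy on random partitions (using the inequalities as reconstructed in the lemma above; it is a sanity check, not a proof):

```python
import math
import random

def dichotomy_holds(parts):
    """Check: either n_1 + n_2 >= sqrt(n), or some ell >= 3 has
    ell * sqrt(n_ell) >= sqrt(n), where the parts are sorted in
    non-increasing order and n is their sum."""
    parts = sorted(parts, reverse=True)
    n = sum(parts)
    if parts[0] + (parts[1] if len(parts) > 1 else 0) >= math.sqrt(n):
        return True
    return any(ell * math.sqrt(parts[ell - 1]) >= math.sqrt(n)
               for ell in range(3, len(parts) + 1))

rng = random.Random(0)
for _ in range(10000):
    parts = [rng.randint(1, 50) for _ in range(rng.randint(1, 40))]
    assert dichotomy_holds(parts)
print("dichotomy held on all random partitions tested")
```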