Consider a rooted tree \(T=(V,E)\) with leaf set \(\mathcal L\subseteq V\) and positive vertex weights \(w : V\setminus \mathcal L\to \mathbb R_+\) that are non-increasing along root-leaf paths. We recall the ultrametric \(d_w\) defined on \(\mathcal L\) by
$$ d_w(\ell,\ell') \mathrel{\vcenter{:}}= w(\mathrm{lca}(\ell,\ell')). $$
Say that \((\mathcal L,d_w)\) is a \(\tau\)-HST metric for some \(\tau \geq 1\) if \(\mathcal L\) is the leaf set of some weighted tree \(T=(V,E)\), and \(d_w\) is the ultrametric corresponding to a weight \(w\) on \(T\) with the property that \(w(y) \leq w(x)/\tau\) whenever \(y\) is a child of \(x\). (So for a finite metric space, the notions of \(1\)-HST metric and ultrametic are equivalent.) Say that \((\mathcal L,d)\) is a strict \(\tau\)-HST metric if we require the stronger property that \(w(y)=w(x)/\tau\) whenever \(x,y \in V\) are internal nodes and \(y\) is a child of \(x\).
Two metrics \(d\) and \(d'\) on a set \(X\) are \(K\)-equivalent if there is a constant \(c > 0\) such that \[d(x,y) \leq c\, d'(x,y) \leq K d(x,y)\qquad \forall x,y \in X\,.\] We leave the following statements as an exercise: For every \(\tau > 1\), every \(1\)-HST is \(\tau\)-equivalent to some strict \(\tau\)-HST.
There is a constant \(C \geq 1\) such that if \((\mathcal L,d_w)\) is a \(C (\log n)^2\)-HST metric, where \(n=|\mathcal L|\), then the competitive ratio for MTS on \((\mathcal L,d_w)\) is \(\Omega\left(\log n\right)\).
While the preceding theorem applies only to $\tau$-HSTs with $\tau$ sufficiently large, the next lemma shows how we can improve the separation parameter by passing to a subset of leaves.If \((\mathcal L,d)\) is a strict \(2\)-HST, then for every \(k \in \mathbb N\), there is a subset \(\mathcal L' \subseteq \mathcal L\) with
$$ |\mathcal{L}'| \geq |\mathcal{L}|^{1/k}, $$
and such that \((\mathcal L', d)\) is a \(2^k\)-HST metric.Combining this with Theorem 1 yields:
If \((\mathcal L,d)\) is a \(1\)-HST metric with \(n=|\mathcal L|\), then the competitive ratio for MTS on \((\mathcal L,d)\) is at least \(\Omega\left(\frac{\log n}{\log \log n}\right)\).
Recall that the Metric Ramsey theorem from the preceding lecture implies that every \(n\)-point metric space contains a subset of size \(\sqrt{n}\) that is \(O(1)\)-equivalent to a \(1\)-HST. This yields:
For every \(n\)-point metric space, the competitive ratio for MTS is at least \(\Omega(\log n/\log \log n)\).
We will not prove Theorem 1, but we will prove a lower bound for two special cases that together capture the essential elements of the full argument. The general case is discussed at the end.
The complete d-ary tree
The first case we’ll consider is when \(T\) is a \(d\)-ary tree of height \(h\), so that \(|\mathcal L|=d^h\). Define \(w(x)=\tau^{-\mathrm{dist}_T(x,r)}\), where \(r\) is the root of \(T\) and \(\mathrm{dist}_T\) denotes the combinatorial distance on \(T\). Then \((\mathcal L,d_w)\) is a \(\tau\)-HST.
Our goal is to establish a lower bound of \(\Omega(h \log d)\) on the competitive ratio for MTS when \(\tau\) is sufficiently large. Note that when \(\tau = \Theta(1)\) and \(d=2\), it is an open problem to exhibit an \(\Omega(h)\) lower bound, and proving such a bound is likely the most difficult obstacle in obtaining an \(\Omega(\log n)\) lower bound for every \(n\)-point ultrametric.
Our costs functions will be of the form \(\{\epsilon\cdot c_\ell : \ell \in \mathcal L\}\), where \(\epsilon> 0\) is a number, and \[c_\ell(\ell') = \begin{cases} 0 & \ell=\ell' \\ 1 & \ell \neq \ell'\,. \end{cases}\] More succinctly: \(c_\ell \mathrel{\vcenter{:}}= \mathbf{1}-\mathbf{1}_\ell\).
Let \(\alpha_h\) be the competitive ratio for height \(h\). We will prove inductively that \(\alpha_h \geq \Omega(h \log d)\).
Consider first the case \(h=1\). We define the height-\(1\) cost sequence. The root \(r\) has \(d\) leaves beneath it. We do the following \(d \log d\) times: Choose a random leaf \(\ell\) and play the cost function \(c_\ell\). It is straightforward that if \(\mathrm{alg}_1\) denotes the cost of an online algorithm, then: \[\mathbb{E}[\mathrm{alg}_1] \geq \log d\,.\] (Note that if a cost comes at \(\ell\), then the algorithm must either move and pay movement cost \(1\), or stay in place and pay service cost \(1\).)
Consider the following offline algorithm: Move to the leaf that will be chosen the smallest number of times, and stay there. The movement cost if \(1\), and a standard coupon collecting argument shows that the service cost is \(O(1)\), hence: \[\mathbb{E}[\mathrm{opt}_1] \leq O(1)\,.\] We conclude that the competitive ratio satisfies \[\alpha_1 \geq c \log d\] for some \(c > 0\).
Consider now the case of \(h \geq 2\). We define the height-\(h\) cost sequence. The root has beneath it \(d\) subtrees \(T_1, T_2, \ldots, T_d\) of height \(h-1\). For some number \(\mu_h\) to be chosen shortly, we do the following \(\mu_h d\) times: Choose \(i \in \{1,2,\ldots,d\}\) uniformly at random and play in \(T_i\) the height-\((h-1)\) cost sequence scaled by \(1/\tau\).
Consider first what an online algorithm can do. If the algorithm’s state lies in \(T_i\) when a height-\((h-1)\) cost sequence is imposed on \(T_i\), it can either leave \(T_i\) at some point during the sequence, or it can remain in \(T_i\). Hence the algorithm pays at least $\min\left(1,\tau^{-1} \mathbb{E}[\mathrm{alg}_{h-1}]\right)$. It follows that:
\[\mathbb{E}[\mathrm{alg}_h] \geq \mu_h \min\left(1,\tau^{-1} \mathbb{E}[\mathrm{alg}_{h-1}]\right) \geq \mu_h \min\left(1,\tau^{-1} \alpha_{h-1} \mathbb{E}[\mathrm{opt}_{h-1}]\right)\,.\]
Thus if\begin{equation}\label{eq:tauchoice} \tau \geq \alpha_{h-1} \mathbb{E}[\mathrm{opt}_{h-1}]\end{equation}
it holds that \begin{equation}\label{eq:alg} \mathbb{E}[\mathrm{alg}_h] \geq \mu_h \tau^{-1} \alpha_{h-1} \mathbb{E}[\mathrm{opt}_{h-1}]\,.\end{equation} We will choose \(\tau\) so that \eqref{eq:tauchoice} holds.Let us now analyze an offline algorithm. Let \(\rho_i\) denote the number of times that \(T_i\) is chosen. The offline algorithm first moves to some \(T_i\) with \(\rho_i\) minimal, and then plays optimally against level-\((h-1)\) cost sequences arriving in \(T_i\).
If \(\mu_h \geq \Omega(\log d)\), then the normal approximation to a binomial distribution shows that there is a constant \(0 < c' < 1\) such that \[\mathbb{E}\left[\min(\rho_1,\ldots,\rho_d)\right] \leq \mu_h - c'\sqrt{\mu_h \log d}\,.\] Hence incorporating the movement cost yields: \[\mathbb{E}[\mathrm{opt}_h] \leq 1 + \left(\mu_h - c'\sqrt{\mu_h \log d}\right) \tau^{-1} \mathbb{E}[\mathrm{opt}_{h-1}]\,.\]
We now choose \(\mu_h\) so that \[c' \sqrt{\mu_h \log d} \tau^{-1} \mathbb{E}[\mathrm{opt}_{h-1}] = 2\,,\] i.e., \[\mu_h \mathrel{\vcenter{:}}= \frac{4}{\log d} \left(\frac\tau{c' \mathbb{E}[\mathrm{opt}_{h-1}]}\right)^2\] Note from \eqref{eq:tauchoice}, we have \begin{equation}\label{eq:muh} \mu_h \geq \frac{4}{(c')^2\log d} \alpha_{h-1}^2.\end{equation} Since \(\alpha_{h-1} \geq \alpha_1 \geq \Omega(\log d)\) it follows that \(\mu_h \geq \log d\) for \(c'\) chosen small enough. (In particular, the normal approximation applies.)
Our choice of \(\mu_h\) ensures that \begin{align} \mathbb{E}[\mathrm{opt}_h] &\leq \left(\mu_h-\frac{c'}{2} \sqrt{\mu_h \log d}\right) \tau^{-1} \mathbb{E}[\mathrm{opt}_{h-1}] \label{eq:opt}\end{align} Combining this with \eqref{eq:alg} yields
$$ \alpha_h \geq \frac{\mathbb{E}[\mathrm{alg}_h]}{\mathbb{E}[\mathrm{opt}_h]} \geq \frac{\alpha_{h-1}}{1-\frac{c'}{2} \sqrt{\frac{\log d}{\mu_h}}} \geq \alpha_{h-1} \left(1+\frac{c'}{2} \sqrt{\frac{\log d}{\mu_h}}\right), $$
where the latter inequality holds because \(c' < 1\) and \(\mu_h \geq \log d\). Using \eqref{eq:muh}, this gives \[\alpha_h \geq \alpha_{h-1} + \frac{c'}{2} \log d\,,\] completing the argument.Note that in \eqref{eq:tauchoice}, we required \(\tau \geq \alpha_{h-1} \mathbb{E}[\mathrm{opt}_{h-1}]\). So let us use \eqref{eq:opt} to compute: \[\mathbb{E}[\mathrm{opt}_h] \leq \mu_h \tau^{-1} \mathbb{E}[\mathrm{opt}_{h-1}] \leq \frac{O(1)}{\log d} \frac\tau{\mathbb{E}[\mathrm{opt}_{h-1}]}\] Since \[\mathbb{E}[\mathrm{opt}_h] \cdot \mathbb{E}[\mathrm{opt}_{h-1}] \leq O(\tau/\log d)\,,\] and \(\mathbb{E}[\mathrm{opt}_h] \geq \mathbb{E}[\mathrm{opt}_{h-1}] \geq 1\) for all \(h\), it follows that \(\mathbb{E}[\mathrm{opt}_h] \leq O(\sqrt{\tau/\log d})\), meaning that \(\tau \geq \alpha_{h-1} \mathbb{E}[\mathrm{opt}_{h-1}]\) will be satisfied for \(\tau \geq C(\alpha_{h-1}/\log d)^2 \asymp h^2\). Since \(h \asymp \frac{\log n}{\log d}\), this yields completes the proof of in the special case of a \(d\)-regular HST.
The “superincreasing” metric
We will now consider a highly unbalanced family of HSTs. These were analyzed in a paper of Karloff, Rabani, and Ravid.
Let us define the trees \(\{T_n\}\) as follows: The root \(r_{n}\) of \(T_n\) has two children. The left child is a copy of \(T_{n-1}\) of rooted at \(r_{n-1}\), and the right child is a single leaf. We denote \(w_n(r_n)=1\), and \(w_n(r_{n-1}) = 1/\tau_n\), where \(\tau_n > \tau_{n-1} > \cdots > 0\) is a sequence of positive weights we will choose soon, and such that \(\tau_n \leq O(\log n)\).
The basic structure of our argument will be similar to the \(d\)-regular case. We do the following \(2 \mu_n\) times: Choose \(i \in \{L,R\}\) uniformly at random. If \(i=L\), we play a height-\((n-1)\) cost sequence on the left subtree (scaled by \(1/\tau_n\)). Otherwise, we play \(c_\ell\), where \(\ell\) is the leaf constituting the right subtree.
Consider an online algorithm. If the algorithm sits in the left subtree when \(i=L\), then either the algorithm moves out of the subtree (movement cost \(1\)), or it incurs the cost of a height-\((n-1)\) inductive sequence scaled by \(1/\tau_n\). If it sits in the right subtree, then it pays cost \(1\) (either by moving or staying and incurring the service cost). If we choose \(\tau_n \mathrel{\vcenter{:}}=\alpha_{n-1} \mathbb{E}[\mathrm{opt}_{n-1}]\), then such an algorithm always pays at least \(\tau_n^{-1} \alpha_{n-1} \mathbb{E}[\mathrm{opt}_{n-1}]\) in any of these cases. Therefore:
\begin{equation}\label{eq:algn} \mathbb{E}[\mathrm{alg}_n] \geq \mu_n \tau_n^{-1} \alpha_{n-1} \mathbb{E}[\mathrm{opt}_{n-1}]. \end{equation}
We now bound the cost of the optimal offline algorithm. Let \(\rho_L\) be the number of times \(i=L\) and let \(\rho_R \mathrel{\vcenter{:}}= 2\mu_n - \rho_L\). The algorithm will always sit by default in the left subtree. If \(\rho_R=0\), it will move to the right subtree, suffer zero service cost there, and then move back to the left subtree (paying total cost \(2\)). If \(\rho_R > 0\), the algorithm will remain in the left subtree and play an optimal strategy for \(T_{n-1}\).
This yields: \[\begin{aligned} \mathbb{E}[\mathrm{opt}_n] &\leq \left(1-{\mathbb{P}}[\rho_R=0]\right) \mathbb{E}\left[\rho_L \tau_n^{-1} \mathrm{opt}_{n-1} \mid \rho_R > 0\right] + 2 {\mathbb{P}}[\rho_R=0] \\ &\leq (1-4^{-\mu_n}) \mu_n \tau_n^{-1} \mathbb{E}[\mathrm{opt}_{n-1}] + 2 \cdot 4^{-\mu_n}.\end{aligned}\] Now choose: \[\mu_n \mathrel{\vcenter{:}}=\frac{4 \tau_n}{\mathbb{E}[\mathrm{opt}_{n-1}]} = 4 \alpha_{n-1}\,.\] This yields: \[\mathbb{E}[\mathrm{opt}_n] \leq \left(1-\tfrac12 4^{-4 \alpha_n}\right) \mu_n \tau_n^{-1} \mathbb{E}[\mathrm{opt}_{n-1}].\]
Combined with \eqref{eq:algn}, we have \[\alpha_n \geq \frac{\mathbb{E}[\mathrm{alg}_n]}{\mathbb{E}[\mathrm{opt}_n]} \geq \alpha_{n-1} + \tfrac12 4^{-4\alpha_{n-1}}.\] This implies that \(\alpha_n \geq \beta \log n\) for some positive \(\beta < 1\). (To see this observe, that if \(f(x)=\beta \log x\), then \(f'(x) = \beta/x < \frac12 4^{-4 \beta \log x}\) for \(\beta\) chosen small enough.)
Note furthermore that \[\mathbb{E}[\mathrm{opt}_n] \leq 4\,,\] and therefore \(\tau_n \leq O(\log n)\).
The general case
We have demonstrated lower bounds in two cases: When the underlying tree \(T\) is regular, and when it is unbalanced and binary. The general case can be proved by combining these two strategies along any \(\Omega((\log n)^2)\)-HST. The next lemma contains the basic idea. We leave it to the reader as a basic exercise. (Hint: Use the concavity of \(x \mapsto \sqrt{x}\).)
Suppose that \(n=n_1+n_2+\cdots+n_m\) where \(n_1 \geq n_2 \geq \cdots \geq n_m \geq 1\). Then either \(\sqrt{n_1}+\sqrt{n_2} \geq \sqrt{n}\), or there is some \(\ell \geq 3\) such that \(\ell \cdot \sqrt{n_\ell} \geq \sqrt{n}\).
Suppose now that \(T\) is a rooted tree with subtrees \(T_1, T_2, \ldots, T_m\) beneath the root, and such that \(T_i\) has \(n_i\) leaves with \(n_1 \geq n_2 \geq \cdots \geq n_m \geq 1\). The lemma says we can either consider only the first two subtrees (the binary, possibly “unbalanced” case), or we can take the first \(\ell\) subtrees for some \(\ell \geq 3\), prune them so they all have exactly \(n_\ell\) leaves, and then prove a lower bound. Analyzing these two cases correspond roughly to the two lower bound arguments above.