Reconstruction on Trees: Exponential Moment Bounds for Linear Estimators

Yuval Peres (Microsoft)
Sebastien Roch (UCLA)

Abstract


Consider a Markov chain $(\xi_v)_{v \in V} \in [k]^V$ on the infinite $b$-ary tree $T = (V,E)$ with irreducible edge transition matrix $M$, where $b \geq 2$, $k \geq 2$ and $[k] = \{1,\ldots,k\}$. We denote by $L_n$ the level-$n$ vertices of $T$. Assume $M$ has a real second-largest (in absolute value) eigenvalue $\lambda$ with corresponding real eigenvector $\nu \neq 0$. Letting $\sigma_v = \nu_{\xi_v}$, we consider the following root-state estimator, which was introduced by Mossel and Peres (2003) in the context of the ``recontruction problem'' on trees: \begin{equation*} S_n = (b\lambda)^{-n} \sum_{x\in L_n} \sigma_x. \end{equation*} As noted by Mossel and Peres, when $b\lambda^2 > 1$ (the so-called Kesten-Stigum reconstruction phase) the quantity $S_n$ has uniformly bounded variance. Here, we give bounds on the moment-generating functions of $S_n$ and $S_n^2$ when $b\lambda^2 > 1$. Our results have implications for the inference of evolutionary trees.

Full Text: Download PDF | View PDF online (requires PDF plugin)

Pages: 251-261

Publication Date: May 19, 2011

DOI: 10.1214/ECP.v16-1630

References

  1. Atteson, K. The performance of neighbor-joining methods of phylogenetic reconstruction. Algorithmica 25 (1999), no. 2-3, 251--278. MR1703580 (2000k:92013)
  2. Athreya, K. B.; Vidyashankar, A. N. Large deviation rates for branching processes. II. The multitype case. Ann. Appl. Probab. 5 (1995), no. 2, 566--576. MR1336883 (96k:60214)
  3. Chor, Benny; Tuller, Tamir. Finding a maximum likelihood tree is hard. J. ACM 53 (2006), no. 5, 722--744 (electronic). MR2263067 (2007h:68067)
  4. Day, William H. E. Computational complexity of inferring phylogenies from dissimilarity matrices. Bull. Math. Biol. 49 (1987), no. 4, 461--467. MR0908160 (88j:92041)
  5. Daskalakis, C.; Mossel, E.; Roch, S. Evolutionaty trees and the Ising model on the Bethe lattice: a proof of Steel's conjecture. Prob. Th. Rel. Fields 149 (2011), no. 1-2, 149--189. Math. Review number not available.
  6. Day, W.H.E.; Sankoff, D. Computational complexity of inferring phylogenies by compatibility. Syst. Zool. 35 (1986), no. 2, 224--229. Math. Review number not available.
  7. Evans, William; Kenyon, Claire; Peres, Yuval; Schulman, Leonard J. Broadcasting on trees and the Ising model. Ann. Appl. Probab. 10 (2000), no. 2, 410--433. MR1768240 (2001g:60243)
  8. Erdős, Péter L.; Steel, Michael A.; Székely, László A.; Warnow, Tandy J. A few logs suffice to build (almost) all trees. I. Random Structures Algorithms 14 (1999), no. 2, 153--184. MR1667319 (2000b:92003)
  9. Felsenstein, J. Inferring Phylogenies. Sinauer, New York, New York, 2004. Math. Review number not available.
  10. Graham, R. L.; Foulds, L. R. Unlikelihood that minimal phylogenies for a realistic biological study can be constructed in reasonable computational time. Math. Biosci. 60 (1982), no. 2, 133--142. MR0678718 (84f:92008)
  11. Horn, Roger A.; Johnson, Charles R. Matrix analysis. Cambridge University Press, Cambridge, 1985. xiii+561 pp. ISBN: 0-521-30586-1 MR0832183 (87e:15001)
  12. Kesten, H.; Stigum, B. P. Additional limit theorems for indecomposable multidimensional Galton-Watson processes. Ann. Math. Statist. 37 1966 1463--1481. MR0200979 (34 #864)
  13. Lacey, Michelle R.; Chang, Joseph T. A signal-to-noise analysis of phylogeny estimation by neighbor-joining: insufficiency of polynomial length sequences. Math. Biosci. 199 (2006), no. 2, 188--215. MR2211625 (2007a:92048)
  14. Mossel, Elchanan. Phase transitions in phylogeny. Trans. Amer. Math. Soc. 356 (2004), no. 6, 2379--2404 (electronic). MR2048522 (2005b:82019)
  15. Mossel, Elchanan; Peres, Yuval. Information flow on trees. Ann. Appl. Probab. 13 (2003), no. 3, 817--844. MR1994038 (2004e:60143)
  16. Roch, S. A short proof that phylogenetic tree reconstruction by maximum likelihood is hard. IEEE/ACM Trans. Comput. Biology Bioinform. 3 (2006), no. 1, 92--94. Math. Review number not available.
  17. Roch, S. Sequence length requirement of distance-based phylogeny reconstruction: Breaking the polynomial barrier. Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science 2008. Math. Review number not available.
  18. Roch, Sebastien. Toward extracting all phylogenetic information from matrices of evolutionary distances. Science 327 (2010), no. 5971, 1376--1379. MR2650508
  19. Steel, Michael A.; Székely, László A. Inverting random functions. Combinatorics and biology (Los Alamos, NM, 1998). Ann. Comb. 3 (1999), no. 1, 103--113. MR1769697 (2001j:92030)
  20. Steel, Michael A.; Székely, László A. Inverting random functions. II. Explicit bounds for discrete maximum likelihood estimation, with applications. SIAM J. Discrete Math. 15 (2002), no. 4, 562--575 (electronic). MR1935839 (2003h:62036)
  21. Semple, Charles; Steel, Mike. Phylogenetics. Oxford Lecture Series in Mathematics and its Applications, 24. Oxford University Press, Oxford, 2003. xiv+239 pp. ISBN: 0-19-850942-1 MR2060009 (2005g:92024)


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.