Novel characteristics of split trees by use of renewal theory

Cecilia Ingrid Holmgren (Cambridge University)

Abstract


We investigate characteristics of random split trees introduced by Devroye [SIAM J Comput 28, 409-432, 1998]; split trees include e.g., binary search trees, $m$-ary search trees, quadtrees, median of $(2k+1)$-trees, simplex trees, tries and digital search trees. More precisely: We use renewal theory in the studies of split trees, and use this theory to prove several results about split trees. A split tree of cardinality n is constructed by distributing n balls (which often represent data) to a subset of nodes of an infinite tree. One of our main results is a relation between the deterministic number of balls n and the random number of nodes N. In Devroye [SIAM J Comput 28, 409-432, 1998] there is a central limit law for the depth of the last inserted ball so that most nodes are close to depth $\ln n/\mu+O(\ln n)^{1/2})$, where $\mu$ is some constant depending on the type of split tree; we sharpen this result by finding an upper bound for the expected number of nodes with depths $\geq \mu^{-1}\ln n-(\ln n)^{1/2+\epsilon}$ or depths $\leq\mu^{-1}\ln n+(\ln n)^{1/2+\epsilon}$ for any choice of $\epsilon>0$. We also find the first asymptotic of the variances of the depths of the balls in the tree.

Full Text: Download PDF | View PDF online (requires PDF plugin)

Pages: 1-27

Publication Date: January 16, 2012

DOI: 10.1214/EJP.v17-1723

References

  • Asmussen, Søren. Applied probability and queues. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons, Ltd., Chichester, 1987. x+318 pp. ISBN: 0-471-91173-9 MR0889893
  • C.J. Bell, phAn Investigation into the Principles of the Classification and Analysis of Data on an Automatic Digital Computer. phPhD Thesis 1965.
  • Clément, J.; Flajolet, P.; Vallée, B. Dynamical sources in information theory: a general analysis of trie structures. Average-case analysis of algorithms (Princeton, NJ, 1998). Algorithmica 29 (2001), no. 1-2, 307--369. MR1887308
  • E. G. Coffman, and J. Eve, File structures using hashing functions. phCommunications of the ACM 13 (1970), 427--436.
  • Devroye, Luc. Universal limit laws for depths in random trees. SIAM J. Comput. 28 (1999), no. 2, 409--432. MR1634354
  • Devroye, Luc. Applications of Stein's method in the analysis of random binary search trees. Stein's method and applications, 247--297, Lect. Notes Ser. Inst. Math. Sci. Natl. Univ. Singap., 5, Singapore Univ. Press, Singapore, 2005. MR2205340
  • Feller, William. Fluctuation theory of recurrent events. Trans. Amer. Math. Soc. 67, (1949). 98--119. MR0032114
  • Feller, William. An introduction to probability theory and its applications. Vol. I. Third edition John Wiley & Sons, Inc., New York-London-Sydney 1968 xviii+509 pp. MR0228020
  • Feller, William. An introduction to probability theory and its applications. Vol. II. Second edition John Wiley & Sons, Inc., New York-London-Sydney 1971 xxiv+669 pp. MR0270403
  • R.A. Finkel and J.L. Bentley, Quad trees, a data structure for retrieval on composite keys. phActa Inform 4 (1974), 1--9.
  • E. Fredkin, Trie memory. phCommunications of the ACM 3 (1960), no. 9, 490--499.
  • Gut, Allan. Stopped random walks. Limit theorems and applications. Applied Probability. A Series of the Applied Probability Trust, 5. Springer-Verlag, New York, 1988. x+199 pp. ISBN: 0-387-96590-4 MR0916870
  • Gut, Allan. Probability: a graduate course. Springer Texts in Statistics. Springer, New York, 2005. xxiv+603 pp. ISBN: 0-387-22833-0 MR2125120
  • Hoare, C. A. R. Quicksort. Comput. J. 5 1962 10--15. MR0142216
  • Holmgren, Cecilia. A weakly 1-stable distribution for the number of random records and cuttings in split trees. Adv. in Appl. Probab. 43 (2011), no. 1, 151--177. MR2761152
  • Mahmoud, Hosam M.; Pittel, Boris. Analysis of the space of search trees under the random insertion algorithm. J. Algorithms 10 (1989), no. 1, 52--75. MR0987097
  • Pyke, R. Spacings. (With discussion.) J. Roy. Statist. Soc. Ser. B 27 1965 395--449. MR0216622
  • Mohamed, Hanène; Robert, Philippe. A probabilistic analysis of some tree algorithms. Ann. Appl. Probab. 15 (2005), no. 4, 2445--2471. MR2187300


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.