Tail inequalities for sums of random matrices that depend on the intrinsic dimension

Daniel Hsu (Microsoft Research New England)
Sham M. Kakade (Microsoft Research New England and University of Pennsylvania)
Tong Zhang (Rutgers University)

Abstract


This work provides exponential tail inequalities for sums of random matrices that depend only on intrinsic dimensions rather than explicit matrix dimensions.  These tail inequalities are similar to the matrix versions of the Chernoff bound and Bernstein inequality except with the explicit matrix dimensions replaced by a trace quantity that can be small even when the explicit dimensions are large or infinite.  Some applications to covariance estimation and approximate matrix multiplication are given to illustrate the utility of the new bounds.
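
For orientation, a minimal sketch of the quantity in question is given below; the notation intdim(V) and the symbols X_i, V, sigma^2, R, t are illustrative assumptions, and the displayed bound conveys only the general shape of such results, not the paper's exact statements or constants. For a nonzero positive semidefinite matrix V, the intrinsic dimension is the trace ratio tr(V)/||V||, which is at least 1, never exceeds the rank of V, and can be far smaller than the ambient (possibly infinite) dimension.

% Illustrative LaTeX sketch only: intdim(V) and the constants below are
% expository assumptions, not the exact results proved in the paper.
\[
  \operatorname{intdim}(V) \;=\; \frac{\operatorname{tr}(V)}{\lVert V \rVert},
  \qquad
  1 \;\le\; \operatorname{intdim}(V) \;\le\; \operatorname{rank}(V).
\]
% Shape of an intrinsic-dimension matrix Bernstein bound: for independent,
% zero-mean, symmetric random matrices X_i with \lVert X_i \rVert \le R almost
% surely and a matrix variance bound V (with \sigma^2 = \lVert V \rVert),
\[
  \Pr\!\left[\, \Bigl\lVert \sum_i X_i \Bigr\rVert \ge t \,\right]
  \;\lesssim\;
  \operatorname{intdim}(V)\,
  \exp\!\left( - \frac{t^2/2}{\sigma^2 + R t / 3} \right),
\]
% so the explicit matrix dimension that multiplies the exponential in the
% standard matrix Bernstein inequality is replaced by the trace quantity
% tr(V) / \lVert V \rVert.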


Pages: 1-13

Publication Date: March 12, 2012

DOI: 10.1214/ECP.v17-1869

References

  • Ahlswede, Rudolf; Winter, Andreas. Strong converse for identification via quantum channels. IEEE Trans. Inform. Theory 48 (2002), no. 3, 569--579. MR1889969
  • Bach, Francis R. Consistency of the group lasso and multiple kernel learning. J. Mach. Learn. Res. 9 (2008), 1179--1225. MR2417268
  • Christofides, Demetres; Markström, Klas. Expansion properties of random Cayley graphs and vertex transitive graphs via matrix martingales. Random Structures Algorithms 32 (2008), no. 1, 88--100. MR2371053
  • Drineas, Petros; Kannan, Ravi; Mahoney, Michael W. Fast Monte Carlo algorithms for matrices. I. Approximating matrix multiplication. SIAM J. Comput. 36 (2006), no. 1, 132--157. MR2231643
  • Freedman, David A. On tail probabilities for martingales. Ann. Probability 3 (1975), 100--118. MR0380971
  • Fukumizu, Kenji; Bach, Francis R.; Gretton, Arthur. Statistical consistency of kernel canonical correlation analysis. J. Mach. Learn. Res. 8 (2007), 361--383 (electronic). MR2320675
  • Golden, Sidney. Lower bounds for the Helmholtz function. Phys. Rev. (2) 137 (1965), B1127--B1128. MR0189691
  • Gross, David. Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inform. Theory 57 (2011), no. 3, 1548--1566. MR2815834
  • Gross, David; Liu, Yi-Kai; Flammia, Steven T.; Becker, Stephen; Eisert, Jens. Quantum state tomography via compressed sensing. Phys. Rev. Lett. 105 (2010), 150401.
  • Guionnet, Alice. Large deviations and stochastic calculus for large random matrices. Probab. Surv. 1 (2004), 72--172. MR2095566
  • Hsu, Daniel; Kakade, Sham M.; Zhang, Tong. An analysis of random design linear regression. 2011, arXiv:1106.2363v1.
  • Lieb, Elliott H. Convex trace functions and the Wigner-Yanase-Dyson conjecture. Advances in Math. 11 (1973), 267--288. MR0332080
  • Litvak, Alexander E.; Pajor, Alain; Rudelson, Mark; Tomczak-Jaegermann, Nicole. Smallest singular value of random matrices and geometry of random polytopes. Adv. Math. 195 (2005), no. 2, 491--523. MR2146352
  • Magen, Avner; Zouzias, Anastasios. Low rank matrix-valued Chernoff bounds and approximate matrix multiplication. Proceedings of the 22nd ACM-SIAM Symposium on Discrete Algorithms, 2011.
  • Oliveira, Roberto Imbuzeiro. Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. 2010, arXiv:0911.0600.
  • Oliveira, Roberto Imbuzeiro. Sums of random Hermitian matrices and an inequality by Rudelson. Electron. Commun. Probab. 15 (2010), 203--212. MR2653725
  • Pisier, Gilles. The volume of convex bodies and Banach space geometry. Cambridge Tracts in Mathematics, 94. Cambridge University Press, Cambridge, 1989. xvi+250 pp. ISBN: 0-521-36465-5; 0-521-66635-X MR1036275
  • Rasmussen, Carl Edward; Williams, Christopher K. I. Gaussian processes for machine learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge, MA, 2006. xviii+248 pp. ISBN: 978-0-262-18253-9 MR2514435
  • Recht, Benjamin. A simpler approach to matrix completion. J. Mach. Learn. Res. 12 (2011), 3413--3430.
  • Rudelson, M. Random vectors in the isotropic position. J. Funct. Anal. 164 (1999), no. 1, 60--72. MR1694526
  • Rudelson, Mark; Vershynin, Roman. Sampling from large matrices: an approach through geometric functional analysis. J. ACM 54 (2007), no. 4, Art. 21, 19 pp. (electronic). MR2351844
  • Schölkopf, Bernhard; Smola, Alex J.; Müller, Klaus-Robert. Kernel principal component analysis. Advances in Kernel Methods---Support Vector Learning (Bernhard Schölkopf, Christopher J. C. Burges, and Alex J. Smola, eds.), MIT Press, 1999, pp. 327--352.
  • Thompson, Colin J. Inequality with applications in statistical mechanics. J. Mathematical Phys. 6 (1965), 1812--1813. MR0189688
  • Tropp, Joel A. Freedman's inequality for matrix martingales. Electron. Commun. Probab. 16 (2011), 262--270. MR2802042
  • Tropp, Joel A. User-friendly tail bounds for sums of random matrices. Foundations of Computational Mathematics (2011), 1--46.
  • Vershynin, Roman. Introduction to the non-asymptotic analysis of random matrices. Compressed Sensing, Theory and Applications (Y. Eldar and G. Kutyniok, eds.), Cambridge University Press, 2012, pp. 210--268.
  • Voiculescu, Dan. Limit laws for random matrices and free products. Invent. Math. 104 (1991), no. 1, 201--220. MR1094052
  • Zhang, Tong. Data dependent concentration bounds for sequential prediction algorithms. Learning theory, 173--187, Lecture Notes in Comput. Sci., 3559, Springer, Berlin, 2005. MR2203261
  • Zwald, Laurent; Bousquet, Olivier; Blanchard, Gilles. Statistical properties of kernel principal component analysis. Learning theory, 594--608, Lecture Notes in Comput. Sci., 3120, Springer, Berlin, 2004. MR2177937


This work is licensed under a Creative Commons Attribution 3.0 License.