Potential-Function Proofs for Gradient Methods

by Nikhil Bansal and Anupam Gupta

Theory of Computing, Volume 15(4), pp. 1-32, 2019

Bibliography with links to cited articles

[1]   Zeyuan Allen-Zhu and Lorenzo Orecchia: Linear coupling: an ultimate unification of gradient and mirror descent. In Proc. 8th Innovations in Theoret. Comp. Sci. Conf. (ITCS’17), pp. 3:1–3:22. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2017. [doi:10.4230/LIPIcs.ITCS.2017.3, arXiv:1407.1537]

[2]   Amir Beck and Marc Teboulle: Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett., 31(3):167–175, 2003. [doi:10.1016/S0167-6377(02)00231-6]

[3]   Aharon Ben-Tal and Arkadi Nemirovski: Lectures on Modern Convex Optimization. Society for Industrial and Applied Mathematics, 2001. [doi:10.1137/1.9780898718829]

[4]   Dmitri P. Bertsekas, Angelia Nedić, and Asuman E. Ozdaglar: Convex Analysis and Optimization. Athena Scientific, 2003.

[5]   Stephen Boyd and Lieven Vandenberghe: Convex Optimization. Cambridge University Press, 2004. [doi:10.1017/CBO9780511804441]

[6]   Sébastien Bubeck: Convex optimization: algorithms and complexity. Foundations and Trends in Machine Learning, 8(3-4):231–357, 2015. [doi:10.1561/2200000050, arXiv:1405.4980]

[7]   Nicolò Cesa-Bianchi and Gábor Lugosi: Prediction, Learning, and Games. Cambridge University Press, Cambridge, 2006.

[8]   Jelena Diakonikolas and Lorenzo Orecchia: The approximate gap technique: a unified approach to optimal first-order methods. SIAM J. Optim., 29(1):660–689, 2019. [arXiv:1712.02485]

[9]   Yoel Drori and Marc Teboulle: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Program., 145(1-2):451–482, 2014. [doi:10.1007/s10107-013-0653-0, arXiv:1206.3209]

[10]   John C. Duchi: Introductory Lectures on Stochastic Optimization, 2016. Available at author’s webpage.

[11]   Marguerite Frank and Philip Wolfe: An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3(1-2):95–110, 1956. [doi:10.1002/nav.3800030109]

[12]   Elad Hazan: Introduction to online convex optimization. Foundations and Trends in Optimization, 2(3-4):157–325, 2016. [doi:10.1561/2400000013]

[13]   Jean-Baptiste Hiriart-Urruty and Claude Lemaréchal: Fundamentals of Convex Analysis. Springer, 2001. [doi:10.1007/978-3-642-56468-0]

[14]   Martin Jaggi: Revisiting Frank–Wolfe: projection-free sparse convex optimization. In Proc. 30th Internat. Conf. on Machine Learning (ICML’13), pp. 427–435, 2013. [PMLR].

[15]   Sahar Karimi and Stephen A. Vavasis: A single potential governing convergence of conjugate gradient, accelerated gradient and geometric descent, 2017. [arXiv:1712.09498]

[16]   Donghwan Kim and Jeffrey A. Fessler: Optimized first-order methods for smooth convex minimization. Math. Program., 159(1-2):81–107, 2016. [doi:10.1007/s10107-015-0949-3, arXiv:1406.5468]

[17]   Walid Krichene, Alexandre M. Bayen, and Peter L. Bartlett: Accelerated mirror descent in continuous and discrete time. In Adv. in Neural Inform. Processing Systems 28 (NIPS’15), pp. 2845–2853, 2015. NIPS.

[18]   Arkadi Semënovich Nemirovski and David Berkovich Yudin: Problem Complexity and Method Efficiency in Optimization. John Wiley & Sons, Inc., 1983.

[19]   Yurii Nesterov: A method of solving a convex programming problem with convergence rate O(1/k²). Soviet Mathematics Doklady, 27:372–376, 1983. LINK at Math-Net.ru.

[20]   Yurii Nesterov: Introductory Lectures on Convex Optimization – A Basic Course. Kluwer Academic Publishers, 2004. [doi:10.1007/978-1-4419-8853-9]

[21]   Javier Peña: Convergence of first-order methods via the convex conjugate. Oper. Res. Lett., 45(6):561–564, 2017. [doi:10.1016/j.orl.2017.08.013, arXiv:1707.09084]

[22]   Shai Shalev-Shwartz: Online learning and online convex optimization. Foundations and Trends in Machine Learning, 4(2):107–194, 2012. [doi:10.1561/2200000018]

[23]   Naum Zuselevich Shor: Minimization Methods for Non-differentiable Functions. Springer, 1985. [doi:10.1007/978-3-642-82118-9]

[24]   Weijie Su, Stephen Boyd, and Emmanuel J. Candès: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. J. Mach. Learn. Res. (JMLR), 17(153):1–43, 2016.

[25]   Eiji Takimoto and Manfred K. Warmuth: The minimax strategy for Gaussian density estimation. In Proc. 13th Ann. Conf. on Computational Learning Theory (COLT’00), pp. 100–106. Morgan Kaufmann Publ., 2000. ACM DL.

[26]   Adrien Taylor and Francis Bach: Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. In Proc. 32nd Ann. Conf. on Computational Learning Theory (COLT’19), pp. 2934–2992. MLR Press, 2019. [PMLR]. [arXiv:1902.00947]

[27]   Paul Tseng: On accelerated proximal gradient methods for convex-concave optimization, 2008. Available at author’s webpage.

[28]   Nisheeth K. Vishnoi: A mini-course on convex optimization, 2018. Available at author’s webpage.

[29]   Andre Wibisono, Ashia C. Wilson, and Michael I. Jordan: A variational perspective on accelerated methods in optimization. Proc. Natl. Acad. Sci. USA, 113(47):E7351–E7358, 2016. [doi:10.1073/pnas.1614734113, arXiv:1603.04245]

[30]   Ashia C. Wilson, Ben Recht, and Michael I. Jordan: A Lyapunov analysis of momentum methods in optimization, 2016. [arXiv:1611.02635]