Geometry of Feedback Control and Learning.
In this thesis, we study optimal control problems, e.g., the linear-quadratic regulator (LQR), least squares stationary optimal control, and linear quadratic (LQ) dynamic games, through the lens of first-order algorithms. The existing theory on these topics is largely derived from model-based dynamic programming. Recently there has been a surge of interest in constructing optimal control strategies directly, treating control synthesis as a problem to be solved by policy-gradient-based algorithms. This point of view has been partially inspired by the success of learning algorithms, such as Reinforcement Learning (RL), where, using principles of Dynamic Programming (DP), one can devise real-time model-free methods for both continuous-time and discrete-time LQR. The direct policy update approach offers advantages in terms of scalability, model-free implementations and richer parameterizations (e.g., structured controller design).
We first study the topological and metrical properties of the set of stabilizing feedback controls. The problem is of interest as this set is the natural domain of the cost functions for optimal control problems. We present a complete account of the set-theoretic properties for both single-input-single-output (SISO) and multiple-input-multiple-output (MIMO) systems. In particular, we prove an upper bound on the number of path-connected components for SISO systems. An algorithm for identifying the connected components is also proposed.
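To make the phrase "natural domain of the cost functions" concrete, one common way to write the objects involved is shown below for the discrete-time, state-feedback case. The notation (A, B, Q, R, K, Σ, P_K, and the set symbol) is an assumption for illustration and is not taken from the abstract, which also covers other settings.

```latex
\[
x_{k+1} = A x_k + B u_k, \qquad u_k = -K x_k, \qquad
\mathcal{S} = \{\, K \in \mathbb{R}^{m \times n} : \rho(A - BK) < 1 \,\},
\]
\[
f(K) = \operatorname{tr}(P_K \Sigma), \qquad
P_K = Q + K^\top R K + (A - BK)^\top P_K (A - BK), \qquad K \in \mathcal{S}.
\]
```

Here ρ denotes the spectral radius and Σ a positive definite weighting of initial states. The Lyapunov equation for P_K has a finite solution only when K lies in the set of stabilizing gains, which is why the topology of that set matters for any algorithm that updates K directly.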
We next move on to LQR optimal control. We characterize several analytical properties (smoothness, coerciveness, quadratic growth) that are crucial in the analysis of gradient-based algorithms. We then examine three types of well-posed flows for LQR: the gradient flow, the natural gradient flow and the quasi-Newton flow. The coercive property suggests that these flows admit unique solutions, while the gradient-dominated property indicates that the corresponding Lyapunov functionals decay at an exponential rate; quadratic growth, on the other hand, guarantees that the trajectories of these flows are exponentially stable in the sense of Lyapunov. We then discuss the forward Euler discretization of these flows, realized as gradient descent, natural gradient descent and the quasi-Newton iteration. We present stepsize criteria for gradient descent and natural gradient descent, guaranteeing that both algorithms converge linearly to the global optimum. An optimal stepsize for the quasi-Newton iteration is also proposed, guaranteeing a Q-quadratic convergence rate and, at the same time, recovering the Hewer algorithm.
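The three iterations can be sketched on a scalar discrete-time system, where every quantity has a closed form. This is a minimal illustration under stated assumptions, not the thesis's implementation: the system x⁺ = a x + b u with u = -k x, the cost weights q and r, unit initial-state variance, the function names, and the stepsizes 0.05 and 0.1 are all hypothetical choices. The gradient is scaled as f'(k) = 2((r + b²p)k - abp)y; under that scaling the quasi-Newton step with stepsize 1/2 reduces to the Hewer-type update k⁺ = abp / (r + b²p).

```python
def closed_loop_cost(a, b, q, r, k):
    """Return (p, y) for the gain k; p is the LQR cost, y the state variance sum.

    p solves the scalar Lyapunov equation p = q + r k^2 + (a - b k)^2 p.
    Raises ValueError when k is outside the stabilizing set |a - b k| < 1.
    """
    c = a - b * k
    if abs(c) >= 1.0:
        raise ValueError(f"gain k={k} is not stabilizing (|a - b k| = {abs(c)})")
    y = 1.0 / (1.0 - c * c)
    p = (q + r * k * k) * y
    return p, y


def gradient(a, b, q, r, k):
    """Derivative of the cost p(k) with respect to the gain k."""
    p, y = closed_loop_cost(a, b, q, r, k)
    return 2.0 * ((r + b * b * p) * k - a * b * p) * y


def gradient_descent_step(a, b, q, r, k, eta):
    return k - eta * gradient(a, b, q, r, k)


def natural_gradient_step(a, b, q, r, k, eta):
    # Natural gradient: rescale the gradient by 1 / y.
    _, y = closed_loop_cost(a, b, q, r, k)
    return k - eta * gradient(a, b, q, r, k) / y


def quasi_newton_step(a, b, q, r, k, eta=0.5):
    # Quasi-Newton: additionally rescale by 1 / (r + b^2 p).
    # With eta = 0.5 this equals k_next = a b p / (r + b^2 p).
    p, y = closed_loop_cost(a, b, q, r, k)
    return k - eta * gradient(a, b, q, r, k) / (y * (r + b * b * p))


def iterate(step, k0, n_steps):
    """Apply a one-argument update rule n_steps times starting from k0."""
    k = k0
    for _ in range(n_steps):
        k = step(k)
    return k


# Hypothetical open-loop unstable example; k0 = 1.0 is stabilizing (|1.2 - 1.0| < 1).
a, b, q, r = 1.2, 1.0, 1.0, 1.0
k0 = 1.0
k_gd = iterate(lambda k: gradient_descent_step(a, b, q, r, k, eta=0.05), k0, 200)
k_ng = iterate(lambda k: natural_gradient_step(a, b, q, r, k, eta=0.1), k0, 200)
k_qn = iterate(lambda k: quasi_newton_step(a, b, q, r, k), k0, 10)
print(k_gd, k_ng, k_qn)
```

In this sketch all three runs start from the same stabilizing gain and should approach the same optimal gain, with the quasi-Newton run needing far fewer iterations. The stabilizing check in `closed_loop_cost` stands in for the role coercivity plays in the analysis: a stepsize that is too large would push the iterate out of the stabilizing set, where the cost is undefined.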
We then consider least squares stationary optimal control, i.e., LQR with indefinite state and input cost matrices. Such a setup has important applications in control design with conflicting objectives, such as linear quadratic dynamic games. We show the global convergence of gradient, natural gradient and quasi-Newton policies for this class of indefinite least squares problems.
Lastly, we study LQ dynamic games, which are closely related to H∞ optimal control. We propose projection-free sequential algorithms for…
Advisors/Committee Members: Mesbahi, Mehran (advisor), Fazel, Maryam (advisor).
APA (6th Edition):
Bu, J. (2021). Geometry of Feedback Control and Learning. (Doctoral Dissertation). University of Washington. Retrieved from http://hdl.handle.net/1773/46786
Chicago Manual of Style (16th Edition):
Bu, Jingjing. “Geometry of Feedback Control and Learning.” 2021. Doctoral Dissertation, University of Washington. Accessed April 22, 2021.