Lewis ffirs.tex V1 - 10/19/2011 5:03pm Page i

OPTIMAL CONTROL
Third Edition

FRANK L. LEWIS
Department of Electrical Engineering, Automation & Robotics Research Institute, University of Texas at Arlington, Arlington, Texas

DRAGUNA L. VRABIE
United Technologies Research Center, East Hartford, Connecticut

VASSILIS L. SYRMOS
Department of Electrical Engineering, University of Hawaii at Manoa, Honolulu, Hawaii

JOHN WILEY & SONS, INC.

This book is printed on acid-free paper.

Copyright © 2012 by John Wiley & Sons, Inc. All rights reserved

Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: While the publisher and the author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor the author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information about our other products and services, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Lewis, Frank L.
Optimal control / Frank L. Lewis, Draguna L. Vrabie, Vassilis L. Syrmos.—3rd ed.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-63349-6 (cloth); ISBN 978-1-118-12264-8 (ebk); ISBN 978-1-118-12266-2 (ebk); ISBN 978-1-118-12270-9 (ebk); ISBN 978-1-118-12271-6 (ebk); ISBN 978-1-118-12272-3 (ebk)
1. Control theory. 2. Mathematical optimization. I. Vrabie, Draguna L. II. Syrmos, Vassilis L. III. Title.
QA402.3.L487 2012
629.8'312–dc23
2011028234

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

To Galina, Roma, and Chris, who make every day exciting
—Frank Lewis

To my mother and my grandmother, for teaching me my potential and supporting my every choice
—Draguna Vrabie

To my father, my first teacher
—Vassilis Syrmos

CONTENTS

PREFACE

1 STATIC OPTIMIZATION
1.1 Optimization without Constraints
1.2 Optimization with Equality Constraints
1.3 Numerical Solution Methods
Problems

2 OPTIMAL CONTROL OF DISCRETE-TIME SYSTEMS
2.1 Solution of the General Discrete-Time Optimization Problem
2.2 Discrete-Time Linear Quadratic Regulator
2.3 Digital Control of Continuous-Time Systems
2.4 Steady-State Closed-Loop Control and Suboptimal Feedback
2.5 Frequency-Domain Results
Problems

3 OPTIMAL CONTROL OF CONTINUOUS-TIME SYSTEMS
3.1 The Calculus of Variations
3.2 Solution of the General Continuous-Time Optimization Problem
3.3 Continuous-Time Linear Quadratic Regulator
3.4 Steady-State Closed-Loop Control and Suboptimal Feedback
3.5 Frequency-Domain Results
Problems

4 THE TRACKING PROBLEM AND OTHER LQR EXTENSIONS
4.1 The Tracking Problem
4.2 Regulator with Function of Final State Fixed
4.3 Second-Order Variations in the Performance Index
4.4 The Discrete-Time Tracking Problem
4.5 Discrete Regulator with Function of Final State Fixed
4.6 Discrete Second-Order Variations in the Performance Index
Problems

5 FINAL-TIME-FREE AND CONSTRAINED INPUT CONTROL
5.1 Final-Time-Free Problems
5.2 Constrained Input Problems
Problems

6 DYNAMIC PROGRAMMING
6.1 Bellman's Principle of Optimality
6.2 Discrete-Time Systems
6.3 Continuous-Time Systems
Problems

7 OPTIMAL CONTROL FOR POLYNOMIAL SYSTEMS
7.1 Discrete Linear Quadratic Regulator
7.2 Digital Control of Continuous-Time Systems
Problems

8 OUTPUT FEEDBACK AND STRUCTURED CONTROL
8.1 Linear Quadratic Regulator with Output Feedback
8.2 Tracking a Reference Input
8.3 Tracking by Regulator Redesign
8.4 Command-Generator Tracker
8.5 Explicit Model-Following Design
8.6 Output Feedback in Game Theory and Decentralized Control
Problems

9 ROBUSTNESS AND MULTIVARIABLE FREQUENCY-DOMAIN TECHNIQUES
9.1 Introduction
9.2 Multivariable Frequency-Domain Analysis
9.3 Robust Output-Feedback Design
9.4 Observers and the Kalman Filter
9.5 LQG/Loop-Transfer Recovery
9.6 H∞ Design
Problems

10 DIFFERENTIAL GAMES
10.1 Optimal Control Derived Using Pontryagin's Minimum Principle and the Bellman Equation
10.2 Two-Player Zero-Sum Games
10.3 Application of Zero-Sum Games to H∞ Control
10.4 Multiplayer Non-Zero-Sum Games

11 REINFORCEMENT LEARNING AND OPTIMAL ADAPTIVE CONTROL
11.1 Reinforcement Learning
11.2 Markov Decision Processes
11.3 Policy Evaluation and Policy Improvement
11.4 Temporal Difference Learning and Optimal Adaptive Control
11.5 Optimal Adaptive Control for Discrete-Time Systems
11.6 Integral Reinforcement Learning for Optimal Adaptive Control of Continuous-Time Systems
11.7 Synchronous Optimal Adaptive Control for Continuous-Time Systems

APPENDIX A REVIEW OF MATRIX ALGEBRA
A.1 Basic Definitions and Facts
A.2 Partitioned Matrices
A.3 Quadratic Forms and Definiteness
A.4 Matrix Calculus
A.5 The Generalized Eigenvalue Problem

REFERENCES

INDEX
PREFACE

This book is intended for use in a second graduate course in modern control theory. A background in the state-variable representation of systems is assumed. Matrix manipulations are the basic mathematical vehicle and, for those whose memory needs refreshing, Appendix A provides a short review. The book is also intended as a reference. Numerous tables make it easy to find the equations needed to implement optimal controllers for practical applications.

Our interactions with nature can be divided into two categories: observation and action. While observing, we process data from an essentially uncooperative universe to obtain knowledge. Based on this knowledge, we act to achieve our goals. This book emphasizes the control of systems assuming perfect and complete knowledge. The dual problem of estimating the state of our surroundings is briefly studied in Chapter 9. A rigorous course in optimal estimation is required to conscientiously complete the picture begun in this text.

Our intention is to present optimal control theory in a clear and direct fashion. This goal naturally obscures the more subtle points and unanswered questions scattered throughout the field of modern system theory. What appears here as a completed picture is in actuality a growing body of knowledge that can be interpreted from several points of view and that takes on different personalities as new research is completed.

We have tried to show with many examples that computer simulations of optimal controllers are easy to implement and are an essential part of gaining an intuitive feel for the equations. Students should be able to write simple programs as they progress through the book, to build confidence in the theory and an understanding of its practical implications.

Relationships to classical control theory have been pointed out, and a root-locus approach to steady-state controller design is included.
Chapter 9 presents some multivariable classical design techniques. A chapter on optimal control of polynomial systems is included to provide a background for further study in the field of adaptive control. A chapter on robust control is also included to expose the reader to this important area. A chapter on differential games shows how to extend the optimality concepts in the book to multiplayer optimization in interacting teams.

Optimal control relies on solving the matrix design equations developed in the book. These equations can be complicated, and exact solution of the Hamilton-Jacobi equations for nonlinear systems may not be possible. The last chapter, on optimal adaptive control, gives practical methods for solving these matrix design equations. Algorithms are given for finding approximate solutions online in real time using adaptive learning techniques based on data measured along the system trajectories.

The first author wants to thank his teachers: J. B. Pearson, who gave him the initial excitement and passion for the field; E. W. Kamen, who tried to teach him persistence and attention to detail; B. L. Stevens, who forced him to consider applications to real situations; R. W. Newcomb, who gave him self-confidence; and A. H. Haddad, who showed him the big picture and the humor behind it all. We owe our main thanks to our students, who force us daily to take the work seriously and become a part of it.

Acknowledgments

This work was supported by NSF grant ECCS-0801330, ARO grant W91NF-05-1-0314, and AFOSR grant FA9550-09-1-0278.

1 STATIC OPTIMIZATION

In this chapter we discuss optimization when time is not a parameter. The discussion is preparatory to dealing with time-varying systems in subsequent chapters. A reference that provides an excellent treatment of this material is Bryson and Ho (1975), and we shall sometimes follow their point of view.
Appendix A should be reviewed, particularly the section that discusses matrix calculus.

1.1 OPTIMIZATION WITHOUT CONSTRAINTS

A scalar performance index L(u) is given that is a function of a control or decision vector u ∈ R^m. It is desired to determine the value of u that results in a minimum value of L(u).
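As a concrete illustration (not taken from the text), consider a quadratic performance index L(u) = (1/2)u'Qu + S'u. Setting the gradient ∂L/∂u = Qu + S to zero gives the minimizing control u* = -Q⁻¹S when Q is positive definite. The matrices Q and S below are made-up example data; the sketch simply solves the stationarity condition numerically and checks that small perturbations of u* do not decrease L.

```python
import numpy as np

# Example data (hypothetical): Q must be positive definite for a unique minimum.
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])
S = np.array([1.0, -1.0])

def L(u):
    """Quadratic performance index L(u) = 1/2 u'Qu + S'u."""
    return 0.5 * u @ Q @ u + S @ u

# Stationarity condition dL/du = Qu + S = 0  =>  u* = -Q^{-1} S.
u_star = -np.linalg.solve(Q, S)

# Sanity check: perturbing u* in random directions should not decrease L,
# since Q > 0 makes u* a minimum, not merely a stationary point.
rng = np.random.default_rng(0)
for _ in range(5):
    d = 1e-3 * rng.standard_normal(2)
    assert L(u_star + d) >= L(u_star)

print(u_star)
```

For this choice of Q and S the solution works out to u* = (-6/7, 10/7). The same two steps (form the gradient, solve the resulting linear system) reappear throughout the chapter once constraints are added.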