This talk provides a comprehensive introduction to the methodologies of Successive Quadratic Programming (SQP) and Reduced Successive Quadratic Programming (rSQP) in the context of nonlinear optimization. Starting with the formulation of Quadratic Programming (QP) problems, we explore the properties and solution methods, including the derivation of SQP and its modifications. The discussion extends to rSQP, highlighting its enhancements over the SQP approach. Case studies, visualizations, and algorithmic strategies will be presented to illustrate these optimization techniques and their application in real-world scenarios.
An overview of the SQP and rSQP methodologies
Kedar Kulkarni
Advisor: Prof. Andreas A. Linninger
Laboratory for Product and Process Design, Department of Bioengineering, University of Illinois, Chicago, IL 60607, U.S.A.
Outline of the talk
• Introduction to the Quadratic Programming (QP) problem
• Properties of the QP problem
• Solution methods for the QP problem
• Successive Quadratic Programming (SQP) as a method to solve a general Nonlinear Programming (NLP) problem
  - Introduction to SQP (derivation)
  - Possible modification
  - Case study/visualization
• rSQP as an improvement over SQP
  - Introduction to rSQP (derivation)
  - Case study
• Recap
Introduction to the QP problem:
General constrained optimization problem:

$$\min_x \; f(x) \quad \text{s.t.} \quad h(x) = 0, \;\; g(x) \le 0$$

Sufficient condition for a local optimum to be global: f, g convex and h linear.
• Quadratic Programming (QP) problem:
  - Quadratic objective function (f)
  - Linear g and h
Standard form:

$$\min_x \; c^T x + \tfrac{1}{2} x^T Q x \quad \text{s.t.} \quad Ax = b, \;\; x \ge 0$$

where x is a vector of n variables containing the slack variables.
Properties of the QP problem:
(figure: contour plots in the (x1, x2) plane for each of the three cases below)
• Q is positive/negative semidefinite ($\lambda_i = 0$ for some i): ridge of stationary points (minima/maxima)
• Q is positive/negative definite ($\lambda_i > 0 \;\forall i$ or $\lambda_i < 0 \;\forall i$): stationary points are minima/maxima
• Q is indefinite ($\lambda_i > 0$ for some i and $\lambda_i < 0$ for others): the stationary point is a saddle point
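As an illustration (not part of the original slides), a minimal numpy sketch that classifies the stationary point of a QP from the eigenvalues of Q:

```python
import numpy as np

def classify_stationary_point(Q, tol=1e-10):
    """Classify the stationary point of f(x) = c^T x + 0.5 x^T Q x
    from the eigenvalues of the symmetric matrix Q."""
    lam = np.linalg.eigvalsh(Q)                 # sorted eigenvalues
    if np.all(lam > tol):
        return "positive definite: unique minimum"
    if np.all(lam < -tol):
        return "negative definite: unique maximum"
    if np.all(lam > -tol):                      # some eigenvalues ~ 0
        return "positive semidefinite: ridge of minima"
    if np.all(lam < tol):
        return "negative semidefinite: ridge of maxima"
    return "indefinite: saddle point"

print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, 3.0]])))   # minimum
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, -3.0]])))  # saddle
```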
Solution methods for the QP problem:
Construct the Lagrangian for the QP problem:

$$L(x, \lambda, \mu) = c^T x + \tfrac{1}{2} x^T Q x + \lambda^T (Ax - b) - \mu^T x$$

Write the Karush-Kuhn-Tucker (KKT) conditions for the QP problem:

$$c + Qx + A^T \lambda - \mu = 0, \quad Ax = b, \quad x \ge 0, \quad \mu \ge 0, \quad \mu_i x_i = 0 \;\forall i$$

• This is a linear set of equations except for the last (complementarity) condition
• It can be solved with "LP machinery" by a modified simplex method (Wolfe, 1959), which works only if Q is positive definite
• Other methods: (1) complementary pivoting (Lemke), faster, and works for positive semidefinite Q; (2) range and null space methods (Gill and Murray)
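To make the "LP machinery" remark concrete: if the bounds x ≥ 0 are dropped, the complementarity condition disappears and the KKT conditions become a purely linear system. A minimal numpy sketch of this equality-constrained special case (illustrative, not from the slides):

```python
import numpy as np

def solve_equality_qp(Q, c, A, b):
    """min c^T x + 0.5 x^T Q x  s.t.  A x = b.
    Without x >= 0, the KKT conditions reduce to the linear system
        [Q A^T] [x  ]   [-c]
        [A 0  ] [lam] = [ b].
    """
    n, m = Q.shape[0], A.shape[0]
    K = np.block([[Q, A.T],
                  [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([-c, b]))
    return sol[:n], sol[n:]                     # x*, multipliers lambda*

# min x1^2 + x2^2  s.t.  x1 + x2 = 1   ->   x* = (0.5, 0.5)
x, lam = solve_equality_qp(Q=2.0 * np.eye(2), c=np.zeros(2),
                           A=np.array([[1.0, 1.0]]), b=np.array([1.0]))
print(x, lam)
```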
Introduction to SQP:
• Solves a sequence of QP approximations to an NLP problem
• The objective is a quadratic approximation to the Lagrangian function
• The algorithm is simply Newton's method applied to solve the set of equations obtained on applying the KKT conditions!
Consider the general constrained optimization problem again:

$$\min_x \; f(x) \quad \text{s.t.} \quad h(x) = 0, \;\; g(x) \le 0$$

KKT conditions:

$$\nabla f(x^*) + \nabla h(x^*)\lambda^* + \nabla g(x^*)\mu^* = 0, \quad h(x^*) = 0, \quad \mu^{*T} g(x^*) = 0, \;\; \mu^* \ge 0, \;\; g(x^*) \le 0$$
Introduction to SQP:
• Considering this as a system of equations in x*, λ*, and μ*, we write the following Newton step (shown here for the equality constraints; active inequalities are handled analogously):

$$\begin{bmatrix} \nabla^2 L(x_i, \lambda_i) & \nabla h(x_i) \\ \nabla h(x_i)^T & 0 \end{bmatrix} \begin{bmatrix} d \\ \Delta\lambda \end{bmatrix} = - \begin{bmatrix} \nabla L(x_i, \lambda_i) \\ h(x_i) \end{bmatrix}$$

• These equations are the KKT conditions of the following optimization problem!

$$\min_d \; \nabla f(x_i)^T d + \tfrac{1}{2} d^T \nabla^2 L(x_i, \lambda_i) d \quad \text{s.t.} \quad h(x_i) + \nabla h(x_i)^T d = 0$$

• This is a QP problem. Its solution is determined by the properties of the Hessian of the Lagrangian.
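A minimal sketch of this Newton-on-KKT view for the equality-constrained case; the function names and the toy problem are illustrative assumptions, not from the slides:

```python
import numpy as np

def sqp_newton(f_grad, h, h_jac, lag_hess, x, lam, iters=25, tol=1e-10):
    """Minimal equality-constrained SQP: each iteration solves the
    Newton/KKT system, which is exactly the KKT system of the QP above."""
    for _ in range(iters):
        g, A, c = f_grad(x), h_jac(x), h(x)
        rhs = -np.concatenate([g + A.T @ lam, c])   # negative KKT residual
        if np.linalg.norm(rhs) < tol:
            break
        W = lag_hess(x, lam)                        # Hessian of the Lagrangian
        K = np.block([[W, A.T],
                      [A, np.zeros((len(c), len(c)))]])
        step = np.linalg.solve(K, rhs)
        x, lam = x + step[:len(x)], lam + step[len(x):]
    return x, lam

# Toy problem: min x1 + x2  s.t.  x1^2 + x2^2 = 2
x, lam = sqp_newton(
    f_grad=lambda x: np.array([1.0, 1.0]),
    h=lambda x: np.array([x @ x - 2.0]),
    h_jac=lambda x: np.array([2.0 * x]),
    lag_hess=lambda x, l: 2.0 * l[0] * np.eye(2),
    x=np.array([-1.5, -0.5]), lam=np.array([1.0]))
print(x, lam)
```

Starting near the minimizer, the iterates converge to x* = (-1, -1) with λ* = 0.5.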
Introduction to SQP:
• Equivalent and more general form:

$$\min_d \; \nabla f(x_i)^T d + \tfrac{1}{2} d^T B_i d \quad \text{s.t.} \quad h(x_i) + \nabla h(x_i)^T d = 0, \;\; g(x_i) + \nabla g(x_i)^T d \le 0$$

• The Hessian is not always positive definite ⇒ nonconvex QP, difficult to solve
• Remedy: at each iteration, approximate the Hessian with a matrix B_i that is symmetric and positive definite
• This is a quasi-Newton secant approximation; B_{i+1} as a function of B_i is given by the BFGS update
BFGS update:
• s = x_{i+1} - x_i and y = ∇L(x_{i+1}) - ∇L(x_i) are known; determine B_{i+1} from the secant condition B_{i+1} s = y
• Too many solutions are possible, so obtain B_{i+1} as the result of an optimization problem: the positive definite, symmetric B closest to B_i that satisfies the secant condition
• Thus, via the Broyden family, we finally arrive at the BFGS update:

$$B_{i+1} = B_i + \frac{y y^T}{s^T y} - \frac{B_i s s^T B_i}{s^T B_i s}$$

• If B_i is positive definite and s^T y > 0, then B_{i+1} is also positive definite
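A direct transcription of the update; the curvature safeguard (skipping the update when s^T y is not sufficiently positive) is a common practical addition, not something stated on the slides:

```python
import numpy as np

def bfgs_update(B, s, y):
    """BFGS secant update: B_new satisfies B_new @ s = y, stays symmetric,
    and stays positive definite whenever B is positive definite and s^T y > 0.
    Here s = x_{i+1} - x_i and y = grad L(x_{i+1}) - grad L(x_i)."""
    sy = s @ y
    if sy <= 1e-12 * np.linalg.norm(s) * np.linalg.norm(y):
        return B            # curvature condition fails: skip the update
    Bs = B @ s
    return B + np.outer(y, y) / sy - np.outer(Bs, Bs) / (s @ Bs)
```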
Possible modification:
• Choose the step length α, with x_{i+1} = x_i + α d, to ensure progress towards the optimum
• α is chosen by making sure that a merit function is decreased at each iteration: exact penalty function, augmented Lagrangian (a backtracking sketch follows below)
Exact penalty function:

$$P(x; \rho) = f(x) + \rho \Big[ \sum_j \max(0, g_j(x)) + \sum_k |h_k(x)| \Big]$$

• Newton-like convergence properties of SQP:
  - Fast local convergence
  - Trust region adaptations provide a stronger guarantee of global convergence
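A backtracking sketch of such a line search on the ℓ1 exact penalty merit function; the simple sufficient-decrease test on ||d||^2 is an illustrative simplification (the exact penalty is nonsmooth), not the slides' prescription:

```python
import numpy as np

def merit_line_search(f, h, x, d, rho=10.0, beta=0.5, eta=1e-4):
    """Backtracking line search on the l1 exact penalty merit function
    phi(x) = f(x) + rho * ||h(x)||_1: halve alpha until phi decreases
    enough (a crude sufficient-decrease test on ||d||^2)."""
    phi = lambda z: f(z) + rho * np.sum(np.abs(h(z)))
    phi0, alpha = phi(x), 1.0
    while phi(x + alpha * d) > phi0 - eta * alpha * (d @ d):
        alpha *= beta
        if alpha < 1e-8:    # step has become negligible: accept and stop
            break
    return x + alpha * d, alpha
```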
Case study:
Choose x_0 = [0, 0]^T as the initial guess.
k = 0: solve the QP subproblem, then perform the line search.
Visualization:
(figure: feasible region and the 1st iteration step)
k = 1:
Visualization:
k = 2: the line search arrives at the solution; no search direction remains, so the algorithm terminates.
SQP: A few comments
• State-of-the-art among NLP solvers; requires the fewest function evaluations
• Does not require feasible points at intermediate iterations
• Sensitive to the scaling of functions and variables; performs poorly on ill-conditioned QP subproblems
• Not efficient for problems with a large number of variables (n > 100): the computational time per iteration goes up due to the presence of dense matrices
• Reduced space methods (rSQP, MINOS) are large scale adaptations of SQP
Introduction to rSQP:
Consider the general constrained optimization problem again. At SQP iteration i, the KKT conditions of the QP subproblem, with z = [x^T s^T]^T collecting the variables and slacks, are:

$$\begin{bmatrix} B & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} d \\ \lambda^+ \end{bmatrix} = - \begin{bmatrix} \nabla f \\ c \end{bmatrix}$$

The second row is A d = -c, with n > m: (n - m) free variables and m dependent variables.
• To solve this system of equations we can exploit the properties of the null space of the matrix A. Partition A as:

$$A = [\, N \;|\; C \,], \quad \text{where } N \text{ is } m \times (n - m) \text{ and } C \text{ is } m \times m \text{ (nonsingular)}$$

• Z is a basis of the null space; Z can be written in terms of N and C as:

$$Z = \begin{bmatrix} I \\ -C^{-1} N \end{bmatrix} \qquad \text{(check: } AZ = N - C C^{-1} N = 0 \,!)$$
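A numpy sketch of the partition and the AZ = 0 check, assuming (as an arbitrary but common convention) that the basis block C is taken from the last m columns of A:

```python
import numpy as np

def null_space_basis(A, m):
    """Partition A = [N | C] with C the (assumed nonsingular) last m
    columns, and build Z = [I; -C^{-1} N] so that A @ Z = 0."""
    N, C = A[:, :-m], A[:, -m:]
    return np.vstack([np.eye(A.shape[1] - m),
                      -np.linalg.solve(C, N)])   # avoids forming C^{-1}

A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])             # m = 2 rows, n = 4 columns
Z = null_space_basis(A, m=2)
print(np.allclose(A @ Z, 0.0))                   # True: check AZ = 0 !
```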
Introduction to rSQP:
• Now choose Y such that [Y | Z] is a nonsingular and well-conditioned matrix; the coordinate basis $Y = \begin{bmatrix} 0 \\ I \end{bmatrix}$ is the simple choice
• It remains to find d_Y and d_Z. Substitute d = Y d_Y + Z d_Z into the optimality conditions of the QP:

$$\begin{bmatrix} B & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} Y d_Y + Z d_Z \\ \lambda^+ \end{bmatrix} = - \begin{bmatrix} \nabla f \\ c \end{bmatrix}$$

• The last row can be used to solve for d_Y (AZ = 0 eliminates d_Z):

$$(A Y) \, d_Y = -c$$

• This value can be used to solve for d_Z by premultiplying the first row by Z^T (since Z^T A^T = 0):

$$(Z^T B Z) \, d_Z = - Z^T (\nabla f + B Y d_Y)$$

• This is fine if there are no bounds on z; if there are bounds too, the d_Z step must instead be obtained from a bound-constrained QP in the reduced space (a numpy sketch of the unbounded case follows below)
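Putting the two solves together, a sketch of the reduced step computation for the unbounded case (variable names mirror the derivation above; this is illustrative, not the authors' code):

```python
import numpy as np

def reduced_step(B, g, A, c, Y, Z):
    """Compute d = Y dY + Z dZ for the QP optimality system
        [B A^T; A 0] [d; lam] = [-g; -c].
    Constraint row:  (A Y) dY = -c               (A Z = 0 removes dZ)
    Reduced row:     (Z^T B Z) dZ = -Z^T (g + B Y dY)."""
    dY = np.linalg.solve(A @ Y, -c)
    dZ = np.linalg.solve(Z.T @ B @ Z, -Z.T @ (g + B @ (Y @ dY)))
    return Y @ dY + Z @ dZ
```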
Case study:
• At iteration i, consider a QP subproblem with n = 3 and m = 2
• Comparing with the standard form, choose the partition A = [N | C]
Case study:
• Thus Z can be evaluated as $Z = \begin{bmatrix} I \\ -C^{-1} N \end{bmatrix}$ (check: AZ = 0 !)
• Now choose the coordinate basis $Y = \begin{bmatrix} 0 \\ I \end{bmatrix}$
• Rewrite the last row as (AY) d_Y = -c and solve for d_Y
• Now we have Y, Z, and d_Y; it remains to calculate d_Z
Case study:
• Solve the reduced system (Z^T B Z) d_Z = -Z^T(∇f + B Y d_Y) for d_Z
• Finally, assemble the components: d = Y d_Y + Z d_Z
rSQP: A few comments
• Basically, solve for d_Y and d_Z separately instead of directly solving for d
• More iterations, but less time per iteration
• The full Hessian does not need to be evaluated; we deal only with the reduced (projected) Hessian Z^T B Z, of size (n - m) × (n - m)
• Local convergence properties are similar for SQP and rSQP

Recap:
(diagram: Newton's method for f(x) = 0, applied to f'(x) = 0 for unconstrained minimization, generalizes to the KKT optimality conditions of an NLP; the quadratic approximation to the Lagrangian yields the QP subproblem, and the range and null space decomposition yields the rSQP subproblem)