ECE 530 – Analysis Techniques for Large-Scale Electrical Systems Lecture 24: Equivalents, Krylov Subspace Methods Prof. Tom Overbye Dept. of Electrical and Computer Engineering University of Illinois at Urbana-Champaign overbye@illinois.edu
Announcements • HW 8 is due Thursday Dec 5 • Final exam Wednesday Dec 18 from 1:30 to 4:30pm in this room (EL 260) • Closed book, closed notes; you may bring in one new note sheet and your exam 1 note sheet, along with simple calculators
Power System Equivalents • For many power system applications it is not necessary to study the entire interconnected network • Usually we are only concerned with a portion of the network • For real-time operations, real-time information is only available for a portion of the network • The system is partitioned into a study system, for which a detailed model is desired, and an external system, for which an equivalent model is used • Boundary buses (within the study system) connect the two
Power System Equivalents • For decades power system network models have been equivalenced using the approach originally presented by J.B. Ward in his 1949 AIEE paper “Equivalent Circuits for Power-Flow Studies” • The paper’s single reference is to a 1939 book by Gabriel Kron, so this is also known as Kron’s reduction • Additional classical techniques are discussed in S. Deckmann, A. Pizzolante, A. Monticelli, B. Stott, and O. Alsac, “Studies on power system load flow equivalencing,” IEEE Trans. Power App. Syst., vol. PAS-99, no. 6, pp. 2301–2310, Nov./Dec. 1980.
Ward Equivalents (Kron Reduction) • The equivalent is formed by reducing the bus admittance matrix • This is done with a partial factorization of the Ybus, which is computationally efficient • The Yee matrix is never explicitly inverted! • Similar to what is done when fills are added, with new equivalent lines eventually joining the boundary buses
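As a rough sketch of this idea (the function below and its use of SciPy are illustrative, not how PowerWorld implements it), the retained block of the Ybus is corrected by solving against Yee with a sparse LU factorization instead of forming its inverse:

```python
# Minimal sketch of Kron reduction: Yeq = Ykk - Yke * inv(Yee) * Yek,
# computed via a sparse LU solve so Yee is never explicitly inverted.
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

def kron_reduce(Ybus, keep, elim):
    """Return the reduced admittance matrix for the retained (keep) buses."""
    Ykk = Ybus[np.ix_(keep, keep)]
    Yke = Ybus[np.ix_(keep, elim)]
    Yek = Ybus[np.ix_(elim, keep)]
    Yee = csc_matrix(Ybus[np.ix_(elim, elim)])
    Z = splu(Yee).solve(Yek)   # solves Yee Z = Yek, one column at a time
    return Ykk - Yke @ Z

# For the B7Flat_Eqv case (0-based indices): keep = [1, 4, 5, 6], elim = [0, 2, 3]
```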
Ward Equivalents (Kron Reduction) • Prior to equivalencing, constant power injections are converted to equivalent current injections; the system is then equivalenced and the injections are converted back to constant power • Tends to place large shunts at the boundary buses • This equivalencing process has no impact on the non-boundary study buses • Various versions of the approach are used, primarily differing in the handling of reactive power injections • The equivalent embeds information about the operating state when the equivalent was created
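A small illustration of the injection conversion, with made-up per-unit values: a constant power injection S at a bus with complex voltage V corresponds to the current injection I = (S/V)*, and converting back after the reduction recovers S.

```python
import numpy as np

V = 1.02 * np.exp(1j * np.deg2rad(-3.0))   # hypothetical bus voltage (per unit)
S = 1.5 + 0.4j                              # hypothetical power injection (per unit)

I = np.conj(S / V)        # equivalent current injection, since S = V * conj(I)
S_back = V * np.conj(I)   # converting back recovers the power injection
print(np.allclose(S, S_back))   # True
```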
PowerWorld Example • Ward type equivalents can be created in PowerWorld by going into the Edit Mode and selecting Tools, Equivalencing • Use Select the Buses to determine buses in the equivalent • Use Create the Equivalent to actually create the equivalent • When making equivalents for large networks the boundary buses tend to be joined by many high impedance lines; these lines can be eliminated by setting the Max Per Unit for Equivalent Line field to a relatively low value (say 2.0 per unit) • Loads and gens are converted to shunts, equivalenced, then converted back
PowerWorld B7Flat_Eqv Example • In this example the B7Flat_Eqv case is reduced, eliminating buses 1, 3 and 4. The study system is then buses 2, 5, 6, 7, with buses 2 and 5 the boundary buses • For ease of comparison the system is modeled unloaded
PowerWorld B7Flat_Eqv Example • Original Ybus
PowerWorld B7Flat_Eqv Example • Note Yes = Yse' if there are no phase shifters
PowerWorld B7Flat_Eqv Example • Comparing original and equivalent • The only modification was a change in the impedance between buses 2 and 5, modeled by adding an equivalent line
Contingency Analysis Application of Equivalencing • One common application of equivalencing is contingency analysis • Most contingencies have a rather limited effect • Much smaller equivalents can be created for each contingent case, giving rapid contingency screening • Contingencies that appear to have violations in contingency screening can then be processed by more time consuming but also more accurate methods • W.F. Tinney, J.M. Bright, "Adaptive Reductions for Power System Equivalents," IEEE Trans. Power Systems, May 1987, pp. 351-359
New Applications in Equivalencing • Models in which the entire extent of the network is retained, but the model size is greatly reduced • Often used for economic studies • Mixed ac/dc solutions, possibly with an equivalent as well • Internal portion is modeled with full ac power flow, more distant parts of the network are retained but modeled with a dc power flow, rest might be equivalenced • Attribute preserving equivalents • Retain characteristics other than just impedances, such as PTDFs; also new research looking at preserving line limits
Iterative Methods for Solving Ax=b • In the 1960s and 1970s iterative methods to solve large sparse linear systems started to gain popularity • The interest arose due to the development of new, efficient Krylov subspace iteration schemes that were in many cases more appropriate than the general purpose direct solution software codes • Such schemes are gaining ground because they are easier to implement efficiently on high-performance computers than the direct methods • GPUs can also be used for parallel computation
References • The good and still free book mentioned earlier on sparse matrices is Iterative Methods for Sparse Linear Systems, by Yousef Saad, 2002, at www-users.cs.umn.edu/~saad/IterMethBook_2ndEd.pdf • Y. Saad, "Numerical Methods for Large Eigenvalue Problems," 2011, available for free at http://www-users.cs.umn.edu/~saad/eig_book_2ndEd.pdf • R.S. Varga, "Matrix Iterative Analysis," Prentice Hall, Englewood Cliffs, NJ, 1962. • D.M. Young, "Iterative Solution of Large Linear Systems," Academic Press, New York, NY, 1971.
Krylov Subspace Outline • Review of fields and vector spaces • Eigensystem basics • Definition of Krylov subspaces and annihilating polynomial • Generic Krylov subspace solver • Steepest descent • Conjugate gradient • Arnoldi process
Basic Definitions: Fields • A field F is a set of elements for which the operations of addition, subtraction, multiplication, and division are defined • The following field axioms hold for any field F and arbitrary α, β, γ ∈ F • Closure: α + β ∈ F and α β ∈ F • Commutativity: α + β = β + α, α β = β α • Associativity: (α + β) + γ = α + (β + γ), (α β) γ = α (β γ) • Distributivity of multiplication: α (β + γ) = (α β) + (α γ)
Basic Definitions: Fields • Existence and uniqueness of the null element 0: α + 0 = α and α 0 = 0 • Existence of the additive inverse: for every α ∈ F there exists a unique β ∈ F such that α + β = 0 • Existence of the multiplicative inverse: for all α ∈ F with α ≠ 0, there exists an element γ ∈ F such that α γ = 1
Vector Spaces • A vector space V over the field F is denoted by (V, F) • The space V is a set of vectors which satisfies the following axioms of addition and scalar multiplication: • Closure: For all x1, x2 ∈ V, x1 + x2 ∈ V • Commutativity of addition: x1 + x2 = x2 + x1 • Associativity of addition: (x1 + x2) + x3 = x1 + (x2 + x3) • Identity element of addition: There exists an element 0 ∈ V such that for every x ∈ V, x + 0 = x • Inverse element of addition: For every x ∈ V there exists an element -x ∈ V such that x + (-x) = 0
Vector Spaces • Scalar multiplication: For all x ∈ V and α ∈ F, α x ∈ V • Identity element of scalar multiplication: There exists an element 1 ∈ F such that 1 x = x • Associativity of scalar multiplication: α (β x) = (α β) x • Distributivity of scalar multiplication with respect to field addition: (α + β) x = α x + β x • Distributivity of scalar multiplication with respect to vector addition: α (x1 + x2) = α x1 + α x2
Linear Combination and Span • Consider the subset {xi ∈ V, i = 1, 2, …, n} with the elements xi, which are arbitrary vectors in the vector space V • Corresponding to the arbitrary scalars α1, α2, …, αn we can form the linear combination of vectors x = α1 x1 + α2 x2 + … + αn xn • The set of all linear combinations of x1, x2, …, xn is called the span of x1, x2, …, xn and is denoted by span{x1, x2, …, xn}
Linear Independence • A set of vectors x1, x2, …, xn in vector space V is linearly independent (l.i.) if and only if α1 x1 + α2 x2 + … + αn xn = 0 implies α1 = α2 = … = αn = 0 • A criterion of linear independence of the set of vectors is related to the matrix X = [x1, x2, …, xn] whose columns are the vectors • A necessary and sufficient condition for l.i. of this set is that X be nonsingular
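As a small numerical illustration (the vectors are made up), stacking the vectors as the columns of X and checking its rank exposes a dependent set:

```python
import numpy as np

x1 = np.array([1.0, 0.0, 2.0])
x2 = np.array([0.0, 1.0, 1.0])
x3 = x1 + x2                      # deliberately dependent on x1 and x2

X = np.column_stack([x1, x2, x3])
print(np.linalg.matrix_rank(X))   # 2 < 3, so the set is not linearly independent
```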
Linear Independence • The maximum number n, such that there exist n vectors in V that are l.i., is called the dimension of the vector space V • The vectors x1, x2, …, xn form a basis for the vector space V if and only if • x1, x2, …, xn are l.i. • x1, x2, …, xn span V (i.e., every vector in V can be expressed as a linear combination of x1, x2, …, xn) • A vector space V can have many bases • For example for the ℝ2 vector space one basis is (1,0) and (0,1), while another is (1,0) and (1,1)
Eigensystem Definitions • The scalar λ is an eigenvalue of the n by n matrix A if and only if Ax = λx for some x ≠ 0, where x is called the eigenvector corresponding to λ • The existence of the eigenvalue λ implies (A - λI)x = 0, so the matrix (A - λI) is singular
Eigensystem Definitions • The characteristic equation for determining λ is det(λI - A) = 0 • The function Δ(λ) = det(λI - A) is called the characteristic polynomial of A • Suppose that λ1, λ2, …, λn are the n distinct eigenvalues of the n by n matrix A, and let x1, x2, …, xn be the corresponding eigenvectors • The set formed by these eigenvectors is l.i. • When the eigenvalues of A are distinct, the modal matrix, defined by X = [x1, x2, …, xn], is nonsingular
Eigensystem Definitions • X satisfies the equation AX = XΛ, where Λ = diag(λ1, λ2, …, λn)
Diagonalizable Matrices • An n by n matrix A is said to be diagonalizable if there exists a nonsingular modal matrix X and a diagonal matrix Λ such that A = X Λ X^-1 • It follows from the definition that A² = (X Λ X^-1)(X Λ X^-1) = X Λ² X^-1
Diagonalizable Matrices • Hence in general for an arbitrary k, A^k = X Λ^k X^-1 • The matrix X is sometimes referred to as a similarity transformation matrix • It follows that if A is diagonalizable then any polynomial function of A is diagonalizable
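A quick numerical check of A^k = X Λ^k X^-1, using a small hypothetical matrix with distinct eigenvalues:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])        # eigenvalues 5 and 2, so A is diagonalizable
lam, X = np.linalg.eig(A)         # columns of X are the eigenvectors
k = 5
Ak_direct = np.linalg.matrix_power(A, k)
Ak_modal = X @ np.diag(lam**k) @ np.linalg.inv(X)
print(np.allclose(Ak_direct, Ak_modal))   # True
```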
Example • Given a 3 by 3 matrix A • Its eigenvalues are -1, 2 and 5 • Its characteristic polynomial is Δ(λ) = (λ + 1)(λ - 2)(λ - 5) = λ³ - 6λ² + 3λ + 10
Example • We can verify that A = X Λ X^-1 • Also
Cayley-Hamilton Theorem and Minimum Polynomial • The Cayley-Hamilton theorem states that every square matrix satisfies its own characteristic equation: Δ(A) = 0 • The minimal polynomial ψ(λ) is the polynomial of minimum degree m such that ψ(A) = 0 • The minimal polynomial and characteristic polynomial are the same if A has n distinct eigenvalues
Cayley-Hamilton Theorem and Minimum Polynomial • This allows us to express A^-1 in terms of powers of A: if Δ(λ) = λ^n + c_{n-1} λ^(n-1) + … + c_1 λ + c_0, then Δ(A) = 0 gives A^-1 = -(1/c_0)(A^(n-1) + c_{n-1} A^(n-2) + … + c_1 I) • For the previous example, Δ(λ) = λ³ - 6λ² + 3λ + 10, so A^-1 = -(1/10)(A² - 6A + 3I) • Verify by multiplying the right-hand side by A
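The sketch below checks this numerically for a small hypothetical matrix; np.poly returns the characteristic polynomial coefficients highest power first, i.e., [1, c_{n-1}, …, c_1, c_0] in the notation above.

```python
import numpy as np

A = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 2.0]])   # hypothetical nonsingular 3 by 3 matrix
c = np.poly(A)                    # [1, c_{n-1}, ..., c_1, c_0]
n = A.shape[0]

# A^-1 = -(1/c_0) * (A^(n-1) + c_{n-1} A^(n-2) + ... + c_1 I), from Delta(A) = 0
Ainv = -(np.linalg.matrix_power(A, n - 1)
         + sum(c[k] * np.linalg.matrix_power(A, n - 1 - k) for k in range(1, n))) / c[-1]
print(np.allclose(Ainv, np.linalg.inv(A)))   # True
```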
Solution of Linear Equations • As covered previously, the solution of a dense (i.e., non-sparse) system of n equations is O(n³) • Even for a sparse A the direct solution of linear equations can be computationally expensive, and the previous techniques are not easy to parallelize • We next present an alternative, iterative approach to obtain the solution using Krylov subspace based methods • Builds on the idea that we can express x = A^-1 b
Definition of a Krylov Subspace • Given a matrix A and a vector v, the ith order Krylov subspace is defined as Ki(v, A) = span{v, Av, A²v, …, A^(i-1)v} • Clearly, i cannot be made arbitrarily large; in fact, for a matrix A of rank n, i ≤ n • For a specified matrix A and a vector v, the largest value of i is given by the order of the annihilating polynomial
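A minimal sketch (names illustrative) of building the vectors that span Ki(v, A); note that only matrix-vector products with A are required.

```python
import numpy as np

def krylov_basis(A, v, i):
    """Return an n-by-i matrix whose columns are v, Av, A^2 v, ..., A^(i-1) v."""
    cols = [v]
    for _ in range(i - 1):
        cols.append(A @ cols[-1])   # one matrix-vector product per new column
    return np.column_stack(cols)
```

In practice the raw columns rapidly become nearly parallel, which is the numerical issue discussed a few slides below and the reason practical solvers orthogonalize as they build the subspace.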
Generic Krylov Subspace Solver • The following is a generic Krylov subspace solver method for solving Ax = b using only matrix vector multiplies • Step 1: Start with an initial guess x(0) and some predefined error tolerance ε > 0; compute the residual r(0) = b - A x(0); set i = 0 • Step 2: While ||r(i)|| > ε Do (a) i := i + 1 (b) get Ki(r(0), A) (c) get x(i) ∈ {x(0) + Ki(r(0), A)} to minimize ||r(i)|| • Stop
Krylov Subspace Solver • Note that no calculations are performed in Step 2 once i becomes greater than the order of the annihilating polynomial • The Krylov subspace methods differ from each other in • the construction scheme of the Krylov subspace in Step 2(b) of the scheme • the residual minimization criterion used in Step 2(c) • A common initial guess is x(0) = 0, giving r(0) = b - A x(0) = b
Krylov Subspace Solver • Every solver involves the A matrix only in matrix-vector products: A^i r(0), i = 1, 2, … • The methods strive to effectively exploit the spectral structure of A with the aim of making the overall procedure computationally efficient • To do this the spectral information of A is used; for this purpose we order the eigenvalues of A according to their absolute values, with |λ1| ≥ |λ2| ≥ … ≥ |λn|
Construction of Krylov Subspaces • The largest eigenvalues of A are called its dominant eigenvalues • Let A be diagonalizable (i.e., it has n distinct eigenvalues) so that there exists an X such that A = X Λ X^-1, where Λ is a diagonal matrix of the n distinct eigenvalues; then A^i v = X Λ^i X^-1 v • As i increases, the result is increasingly dependent on the dominant eigenvalues
Construction of Krylov Subspaces • With finite precision arithmetic, as i increases A^i v in effect contains information only about the dominant eigenvalues of A • In other words, as the order i of Ki(v, A) increases, the basis vector A^(i-1) v added to Ki-1(v, A) to construct the ith order Krylov subspace is numerically almost indistinguishable from vectors already in the previous Krylov subspace, even though the dimension is nominally one higher • So computationally we have not really gained anything • Different solution schemes address this numerical issue
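A small numerical illustration of this effect, using a hypothetical diagonal matrix whose dominant eigenvalue is 10: the normalized vectors A^i v quickly line up with the dominant eigenvector, so each new Krylov vector adds almost no new direction.

```python
import numpy as np

A = np.diag([10.0, 2.0, 1.0])    # dominant eigenvalue 10
v = np.array([1.0, 1.0, 1.0])

w = v.copy()
for i in range(1, 9):
    w = A @ w
    print(i, np.round(w / np.linalg.norm(w), 6))   # approaches (1, 0, 0)
```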
Steepest Descent Approach • Presented first since it is easiest to explain • Assume in Ax = b that A is symmetric positive definite (all eigenvalues real and positive) • Let f(x) = (1/2) x^T A x - b^T x • The solution x* that minimizes f(x) is given by ∇f(x*) = A x* - b = 0 • This is obviously the solution to Ax = b
Steepest Descent Approach • Steepest descent is the classic method for minimizing f(x) • At any given point x(i), ∇f(x(i)) is the direction of maximum increase in f(x), so moving in the opposite direction gives the maximum decrease • The step-size is selected to find the minimum along this direction • If x* is the solution, define the ith error and residual as e(i) = x(i) - x* and r(i) = b - A x(i) = -∇f(x(i))
Steepest Descent Approach • The update is x(i+1) = x(i) + α(i) r(i); the trick is quickly determining the step-size α(i) • Hence we want α(i) such that d/dα f(x(i) + α r(i)) = 0, which occurs when r(i) and ∇f(x(i+1)) are orthogonal
Steepest Descent Approach • Recall that ∇f(x(i+1)) = A x(i+1) - b = -r(i+1) and r(i+1) = r(i) - α(i) A r(i) • Substitute into the orthogonality condition: r(i)^T (r(i) - α(i) A r(i)) = 0 • Therefore α(i) = (r(i)^T r(i)) / (r(i)^T A r(i))
Steepest Descent Algorithm • Set i = 0, ε > 0, x(0) = 0, so r(0) = b - A x(0) = b • While ||r(i)|| > ε Do (a) calculate α(i) = (r(i)^T r(i)) / (r(i)^T A r(i)) (b) x(i+1) = x(i) + α(i) r(i) (c) r(i+1) = r(i) - α(i) A r(i) (d) i := i + 1 End While • Note there is only one matrix-vector multiply per iteration
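A direct Python sketch of this loop for symmetric positive definite A (variable names mirror the algorithm; the cap on iterations is just a safeguard):

```python
import numpy as np

def steepest_descent(A, b, eps=1e-8, max_iter=10000):
    x = np.zeros_like(b)               # x(0) = 0
    r = b - A @ x                      # so r(0) = b
    i = 0
    while np.linalg.norm(r) > eps and i < max_iter:
        Ar = A @ r                     # the single matrix-vector multiply
        alpha = (r @ r) / (r @ Ar)     # step size a(i)
        x = x + alpha * r              # x(i+1) = x(i) + a(i) r(i)
        r = r - alpha * Ar             # r(i+1) = r(i) - a(i) A r(i)
        i += 1
    return x, i
```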
Steepest Descent Example • Keep in mind these algorithms are designed for large systems, whereas the example is necessarily small! • Let • At the solution f(x*) = -29.03 • Select x(0) = 0, ε = 0.1, then r(0) = b
Steepest Descent Example • Then
Steepest Descent Example • Repeating for the next iteration, α(1) = 1.031 • And again, α(2) = 0.060,
Steepest Descent Example • And again, α(3) = 0.2452, • We are converging, but certainly not quickly! • The eigenvalues of A are 0.65, 13.91 and 17.44 • The larger eigenvalues tend to dominate
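The example's matrix is not reproduced here, but a hypothetical diagonal matrix with the same eigenvalues and an arbitrary right-hand side shows the same sluggish behavior when fed to the steepest_descent sketch above:

```python
import numpy as np

A = np.diag([0.65, 13.91, 17.44])   # same eigenvalues as the example
b = np.array([1.0, 2.0, 3.0])       # hypothetical right-hand side
x, iters = steepest_descent(A, b, eps=0.1)
print(iters)                         # many iterations even for this tiny system
```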
Steepest Descent Convergence • We define the A-norm of x as ||x||A = sqrt(x^T A x) • We can show that ||e(i)||A ≤ ((κ - 1)/(κ + 1))^i ||e(0)||A, where κ is the condition number of A, i.e., κ = λmax/λmin • For our example κ = 26.8, and (κ - 1)/(κ + 1) = 0.928
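A short check of these numbers from the example's extreme eigenvalues, including roughly how many iterations the bound implies per factor-of-10 error reduction:

```python
import numpy as np

kappa = 17.44 / 0.65                        # condition number from the example
rho = (kappa - 1) / (kappa + 1)             # per-iteration error bound factor
print(round(kappa, 1), round(rho, 3))       # 26.8, 0.928
print(np.ceil(np.log(0.1) / np.log(rho)))   # about 31 iterations per decade of error
```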
Steepest Descent Convergence • Because (κ - 1)/(κ + 1) < 1 the error will decrease with each steepest descent iteration, albeit potentially quite slowly (as in our example) • The function value decreases more quickly, as per f(x(i)) - f(x*) ≤ ((κ - 1)/(κ + 1))^(2i) (f(x(0)) - f(x*)), but this can still be quite slow if κ is large • If you are thinking, "there must be a better way," you are correct. The problem is we are continually searching in the same set of directions