Low Cost Design of Advanced Encryption Standard (AES) Processor

##### Presentation Transcript

1. Low Cost Design of Advanced Encryption Standard (AES) Processor • Ming-Chih Chen • Department of Electronic Engineering, National Kaohsiung First University of Science and Technology

2. Outline • Introduction • Previous AES Design Methods • Two Proposed Substructure Sharing Methods for XOR-based Operations • Two Proposed CSE Algorithms for Sum-of-Product Operations • Comparisons and Implementations • Conclusions

3. Introduction

4. Introduction • In Oct. 2000, the Rijndael algorithm was selected by NIST (National Institute of Standards and Technology) as the new Advanced Encryption Standard. • The Rijndael AES algorithm is a symmetric block cipher that processes data blocks of 128 bits using cipher keys with lengths of 128, 192, or 256 bits. • Applications of AES include wireless network security (IEEE 802.11), smart cards, etc.

5. Advanced Encryption Standard Finite Field Operations AES Transformations & Algorithm

6. Finite Field Operations

7. Finite Field Addition • Bitwise XOR operation (or modulo-2 addition) (Polynomial notation) (Binary notation) (Hexadecimal notation)

8. Multiplication in $GF(2^8)$ • Multiplication of two polynomials modulo an irreducible polynomial $m(x) = x^8 + x^4 + x^3 + x + 1$ • Ex: {57}·{83} = {c1} • Multiplicative identity: {01} • Multiplicative inverse of $b(x)$ is denoted by $b^{-1}(x)$ • Extended Euclidean algorithm • $b(x)a(x) + m(x)c(x) = 1 \Rightarrow b^{-1}(x) = a(x) \bmod m(x)$

9. Multiplication by x • If $b_7 = 0$: left shift • If $b_7 = 1$: left shift followed by bitwise XOR with {1b} • This operation is denoted by xtime( )
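
The finite-field operations on the last three slides can be sketched in a few lines of Python. This is a minimal software illustration, not the hardware design from the talk; the function names `gf_add`, `xtime`, and `gf_mul` are chosen here for illustration.

```python
def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^8) is a bitwise XOR of the two bytes."""
    return a ^ b


def xtime(b: int) -> int:
    """Multiply a byte by x (i.e. by {02}) modulo m(x) = x^8 + x^4 + x^3 + x + 1."""
    shifted = (b << 1) & 0xFF
    # If b7 was set, the shift overflowed x^8, so reduce by XORing with {1b}.
    return shifted ^ 0x1B if b & 0x80 else shifted


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) using repeated xtime() and conditional XOR."""
    result = 0
    while b:
        if b & 1:          # this power of x is present in b
            result ^= a
        a = xtime(a)       # advance to the next power of x
        b >>= 1
    return result
```

With these definitions, `gf_mul(0x57, 0x83)` reproduces the slide's example {57}·{83} = {c1}.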

10. Polynomial with Coeffs. in $GF(2^8)$ • Each coeff. of a polynomial is a byte (8-bit) • Polynomial addition: $a(x) + b(x)$ • Byte-wise XOR for corresponding coeffs. • Polynomial multiplication modulo $x^4 + 1$ • $d(x) = a(x) \otimes b(x)$ (similar to cyclic convolution)
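
The cyclic-convolution view of multiplication modulo $x^4 + 1$ can be sketched as follows. The coefficient ordering (constant term first) and the function name `poly_mul_mod_x4_plus_1` are assumptions made for this example.

```python
def xtime(b: int) -> int:
    """Multiply a byte by {02} in GF(2^8)."""
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def poly_mul_mod_x4_plus_1(a: list[int], b: list[int]) -> list[int]:
    """Multiply two 4-term polynomials with byte coefficients modulo x^4 + 1.

    a and b are lists [c0, c1, c2, c3], where c0 is the constant term.
    Because x^4 = 1 modulo x^4 + 1, the exponent i + j wraps around modulo 4,
    which is exactly a cyclic convolution of the coefficient vectors.
    """
    d = [0, 0, 0, 0]
    for i in range(4):
        for j in range(4):
            d[(i + j) % 4] ^= gf_mul(a[i], b[j])
    return d
```

As a check, multiplying the MixColumns polynomial by the InvMixColumns polynomial, `poly_mul_mod_x4_plus_1([0x02, 0x01, 0x01, 0x03], [0x0E, 0x09, 0x0D, 0x0B])`, gives `[1, 0, 0, 0]`, i.e. the identity {01}.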

11. AES Transformations & Algorithm

12. Inputs and Outputs • Input and output • Sequences of blocks with a block length of 128 bits (Nb = 4 words for each block) • Cipher key • Sequence of cipher keys with a key length of 128, 192, or 256 bits (Nk = 4, 6, or 8 words for each key)

13. Byte Representation • Block length = 128 bits = 16 bytes • Key length = 128, 192, or 256 bits = 16, 24, or 32 bytes • Finite field element representation • Polynomial: {01100011} = $x^6 + x^5 + x + 1$ • Hexadecimal representation • {01100011} = {63} • One extra bit to the left of a byte • {01}{1b}

14. State • 2-D 4 × 4 array of bytes • A state has four rows and Nb columns • 1-D array of 32-bit words $w_0, w_1, w_2, w_3$, with each word $w_i$ composed of a column in the 2-D state

15. Key-Block-Round

16. Rijndael AES Algorithm (a) Encryption (b) Direct Decryption (c) Modified Decryption

17. Four Transformations in Cipher • SubBytes( ):SB • Nonlinear byte substitution • ShiftRows( ):SR • Cyclically left-shift the last three rows of the state • MixColumns( ):MC • Transformation on each column of the state • AddRoundKey( ):ARK • Each column is XORed with a 32-bit key schedule word generated from the key expansion

18. SubBytes( ) • Take multiplicative inverse (MI) in $GF(2^8)$: $S \to S^{-1}$ • Apply affine transformation (AF) over GF(2) as follows: $S' = M \cdot S^{-1} + C$ ($C = \{63\}_{16}$) • where $S$ and $S'$ are the input and output bytes in 8-D vector format
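
The two SubBytes steps can be sketched in Python as below. Two choices here are assumptions for illustration and not taken from the slides: the inverse is computed by brute-force exponentiation ($a^{254}$), and the matrix $M$ is applied in its standard rotate-and-XOR form rather than as an explicit 8 × 8 matrix.

```python
def xtime(b: int) -> int:
    """Multiply a byte by {02} in GF(2^8)."""
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse as a^254; 0 maps to 0, as AES requires."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def affine(s: int) -> int:
    """Affine transformation S' = M*S + C over GF(2), with C = {63}."""
    def rotl(v: int, n: int) -> int:
        return ((v << n) | (v >> (8 - n))) & 0xFF
    # Each output bit i is b_i ^ b_(i+4) ^ b_(i+5) ^ b_(i+6) ^ b_(i+7) ^ c_i,
    # which is the XOR of the byte with four of its left rotations.
    return s ^ rotl(s, 1) ^ rotl(s, 2) ^ rotl(s, 3) ^ rotl(s, 4) ^ 0x63


def sub_byte(s: int) -> int:
    """SubBytes for one byte: multiplicative inverse followed by the affine step."""
    return affine(gf_inv(s))
```

Evaluating `sub_byte` over all 256 inputs yields the S-box table referred to on the next slide.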

19. Overall Effect of SubBytes( ) • Substitution table (S-box)

20. ShiftRows( )

21. MixColumns( ) • Polynomial multiplication by a fixed term $a(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\}$ modulo $x^4 + 1$
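
Written out per column, the multiplication by $a(x)$ becomes four byte equations. The sketch below is a software illustration only; `mix_column` is a name chosen for this example, and {03}·s is computed as `xtime(s) ^ s`.

```python
def xtime(b: int) -> int:
    """Multiply a byte by {02} in GF(2^8)."""
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def mix_column(col: list[int]) -> list[int]:
    """Apply MixColumns to one 4-byte column [s0, s1, s2, s3]."""
    s0, s1, s2, s3 = col
    return [
        xtime(s0) ^ (xtime(s1) ^ s1) ^ s2 ^ s3,   # {02}s0 + {03}s1 + s2 + s3
        s0 ^ xtime(s1) ^ (xtime(s2) ^ s2) ^ s3,   # s0 + {02}s1 + {03}s2 + s3
        s0 ^ s1 ^ xtime(s2) ^ (xtime(s3) ^ s3),   # s0 + s1 + {02}s2 + {03}s3
        (xtime(s0) ^ s0) ^ s1 ^ s2 ^ xtime(s3),   # {03}s0 + s1 + s2 + {02}s3
    ]
```

For example, the column `[0xdb, 0x13, 0x53, 0x45]` maps to `[0x8e, 0x4d, 0xa1, 0xbc]`.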

23. Key Expansion • If i is not a multiple of Nk (and, for Nk = 8, i-4 is not a multiple of Nk) − w[i] = w[i-1] ⊕ w[i-Nk] • If i is a multiple of Nk − w[i] = transformation1(w[i-1]) ⊕ w[i-Nk] − Transformation 1 consists of RotWord(), followed by SubWord(), followed by XOR with Rcon[i/Nk] • If Nk = 8 and i-4 is a multiple of Nk − w[i] = transformation2(w[i-1]) ⊕ w[i-Nk] − Transformation 2 consists of SubWord() only
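
The key expansion rules can be sketched in Python as follows. This is a straightforward software rendering for illustration, not the on-the-fly hardware structure of the next slide; the helper names and the computed (rather than stored) S-box are assumptions of this example.

```python
def xtime(b: int) -> int:
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def gf_mul(a: int, b: int) -> int:
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def _sbox_entry(s: int) -> int:
    """One S-box entry: inverse in GF(2^8) (as s^254), then the affine step."""
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, s)
    rotl = lambda v, n: ((v << n) | (v >> (8 - n))) & 0xFF
    return inv ^ rotl(inv, 1) ^ rotl(inv, 2) ^ rotl(inv, 3) ^ rotl(inv, 4) ^ 0x63


SBOX = [_sbox_entry(s) for s in range(256)]


def rot_word(w: int) -> int:
    """Cyclically rotate a 32-bit word left by one byte."""
    return ((w << 8) | (w >> 24)) & 0xFFFFFFFF


def sub_word(w: int) -> int:
    """Apply the S-box to each of the four bytes of a word."""
    return int.from_bytes(bytes(SBOX[b] for b in w.to_bytes(4, "big")), "big")


def key_expansion(key: bytes) -> list[int]:
    """Expand a 16-, 24-, or 32-byte key into 4*(Nr+1) round-key words."""
    nk = len(key) // 4
    nr = nk + 6
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(nk)]
    rcon = 0x01                                   # Rcon[1]; updated by xtime
    for i in range(nk, 4 * (nr + 1)):
        temp = w[i - 1]
        if i % nk == 0:                           # transformation 1
            temp = sub_word(rot_word(temp)) ^ (rcon << 24)
            rcon = xtime(rcon)
        elif nk == 8 and i % nk == 4:             # transformation 2
            temp = sub_word(temp)
        w.append(w[i - nk] ^ temp)
    return w
```

For the 128-bit key `2b7e151628aed2a6abf7158809cf4f3c`, the first generated word `w[4]` is `a0fafe17`.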

24. Key Expansion Structure: On-the-Fly • [Figure: on-the-fly key expansion datapath, with registers labeled by the key-schedule words w(i) through w(i+7)]

25. Four Transformations in Inverse Cipher • InvSubBytes( ):ISB • Nonlinear byte substitution • InvShiftRows( ):ISR • Cyclically right-shift the last three rows of the state • InvMixColumns( ):IMC • Transformation on each column of the state • AddRoundKey( ):ARK • Each column is XORed with a 32-bit key schedule word generated from the key expansion

26. InvSubBytes( ) • Apply inverse affine (IAF) transformation over GF(2) as follows: $S^{-1} = M^{-1}(S' + C)$ • Take multiplicative inverse (MI) in $GF(2^8)$: $S^{-1} \to S$ • Overall effect: inverse S-box ($S^{-1}$-box)

27. InvShiftRows • Cyclically right-shift the last three rows of the state.
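
ShiftRows and InvShiftRows are pure byte permutations, which a short sketch makes concrete. The state layout used here (a list of four rows, each a list of Nb = 4 bytes) is an assumption of this example.

```python
def shift_rows(state: list[list[int]]) -> list[list[int]]:
    """Cyclically left-shift row r of the state by r positions (r = 0..3)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state: list[list[int]]) -> list[list[int]]:
    """Cyclically right-shift row r of the state by r positions (r = 0..3)."""
    return [row[len(row) - r:] + row[:len(row) - r] for r, row in enumerate(state)]
```

Row 0 is left untouched by both functions, and applying one after the other restores the original state.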

28. InvMixColumns( ) • Polynomial multiplication by a fixed term $a^{-1}(x) = \{0b\}x^3 + \{0d\}x^2 + \{09\}x + \{0e\}$ modulo $x^4 + 1$
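
In matrix form, $a^{-1}(x)$ corresponds to a circulant matrix whose first row is {0e}, {0b}, {0d}, {09}. The sketch below uses a general $GF(2^8)$ multiplier for clarity; `inv_mix_column` and `INV_FIRST_ROW` are names chosen for this example.

```python
def xtime(b: int) -> int:
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def gf_mul(a: int, b: int) -> int:
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


# First row of the circulant InvMixColumns matrix; row i is this row
# rotated right by i positions.
INV_FIRST_ROW = [0x0E, 0x0B, 0x0D, 0x09]


def inv_mix_column(col: list[int]) -> list[int]:
    """Apply InvMixColumns to one 4-byte column [s0, s1, s2, s3]."""
    out = []
    for i in range(4):
        acc = 0
        for j in range(4):
            acc ^= gf_mul(INV_FIRST_ROW[(j - i) % 4], col[j])
        out.append(acc)
    return out
```

Applied to `[0x8e, 0x4d, 0xa1, 0xbc]`, this returns `[0xdb, 0x13, 0x53, 0x45]`, undoing the corresponding MixColumns step.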

29. Previous AES Design Methods

30. Optimization Approaches for AES Transformations

31. Three Categories of Transformation Optimization • The optimization of separate transformations. • The optimization of combined round transformations. • The optimization of integrated encryption/decryption transformations.

32. The Optimization of Separate Transformations (1) • Two major transformations: • SB (ISB), MC (IMC) • SB (ISB): • Perform MI (multiplicative inverse) in $GF(2^8)$ followed by AF. • 1. Use a 256 × 8-bit table look-up ROM (S-box) to store all pre-calculated results. • 2. Change the calculation of MI in $GF(2^8)$ to that in the composite field $GF((2^4)^2)$. • 3. Change the calculation of MI in $GF(2^8)$ to that in the composite field $GF(((2^2)^2)^2)$. • 4. Calculate MI in $GF(2^8)$ based on a decomposition of $A^{-1}$.

33. Calculation of Multiplicative Inverse (MI) in $GF((2^4)^2)$ (1.2a) • There are three stages in the calculation of MI in $GF((2^4)^2)$.

34. Calculation of Multiplicative Inverse (MI) in $GF((2^4)^2)$ (1.2b) • Stage 1: • Translate from $GF(2^8)$ to the composite field $GF((2^4)^2)$ using the transformation $T$. • The implementation of the $T$ transformation has area = 17 $A_{XOR}$ and delay = 3 $T_{XOR}$.

35. Calculation of Multiplicative Inverse (MI) in $GF((2^4)^2)$ (1.2c) • Stage 2: • Find the MI for the two numbers in $GF(2^4)$, where $A = (0001)_2$ and $B = (1001)_2$

36. Calculation of Multiplicative Inverse (MI) in $GF((2^4)^2)$ (1.2d) • Stage 3: • Convert the number in $GF((2^4)^2)$ back to the number in $GF(2^8)$ using $T^{-1}$.

37. Calculation of Multiplicative Inverse (MI) Using $A^{-1}$ (1.4) • $A^{-1}$: • The $A^{-1}$ (MI) can be calculated by • It requires four $GF(2^8)$ multipliers, plus one $A^2$ and three $A^4$ components.
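
Since every nonzero element of $GF(2^8)$ satisfies $A^{255} = 1$, the inverse is $A^{-1} = A^{254}$. The slide's own formula did not survive in this transcript, so the chain below is an assumption: it is one decomposition of $A^{254}$ that happens to use exactly four multiplications, one squaring, and three fourth-power steps, matching the component count quoted above. In hardware the squaring and fourth-power blocks are linear (XOR-only); here they are computed with the general multiplier for brevity.

```python
def xtime(b: int) -> int:
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def gf_mul(a: int, b: int) -> int:
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def gf_pow2(a: int) -> int:
    """A^2 component (a linear map over GF(2); computed via gf_mul here)."""
    return gf_mul(a, a)


def gf_pow4(a: int) -> int:
    """A^4 component (also linear over GF(2))."""
    return gf_pow2(gf_pow2(a))


def gf_inv_chain(a: int) -> int:
    """Hypothetical chain for A^254: 4 multipliers, one A^2, three A^4."""
    a2 = gf_pow2(a)                  # A^2   (the single A^2 component)
    a3 = gf_mul(a2, a)               # A^3   (multiplier 1)
    a12 = gf_pow4(a3)                # A^12  (A^4 component 1)
    a14 = gf_mul(a12, a2)            # A^14  (multiplier 2)
    a15 = gf_mul(a12, a3)            # A^15  (multiplier 3)
    a240 = gf_pow4(gf_pow4(a15))     # A^240 (A^4 components 2 and 3)
    return gf_mul(a240, a14)         # A^254 (multiplier 4)
```

The result can be checked by confirming that `gf_mul(a, gf_inv_chain(a))` equals 1 for every nonzero byte `a`.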

38. The Optimization of Separate Transformations (2) • MC (IMC): • 1. Byte-level optimization: the multiplication block (XTime) multiplies a byte by the constant $\{02\}_{16}$; different byte-level sharing methods then reduce the number of XTime blocks. • Ex1: MC: D″ = {01}A + {01}B + {02}D + {03}E = A + B + XTime(D) + XTime(E) + E • Ex2: MC: D″ = {02}(D+E) + (A+B+D+E) + D, using {02}D = {02}D + D + D and D + D = 0
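
The equivalence of the two examples rests on XTime being linear over XOR, so XTime(D) + XTime(E) = XTime(D + E). A small sketch, with function names chosen for illustration, shows that Ex2 needs one XTime evaluation where Ex1 needs two:

```python
def xtime(b: int) -> int:
    """Multiply a byte by {02} in GF(2^8)."""
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def mc_byte_ex1(a: int, b: int, d: int, e: int) -> int:
    """Ex1: A + B + XTime(D) + XTime(E) + E, using two XTime blocks."""
    return a ^ b ^ xtime(d) ^ xtime(e) ^ e


def mc_byte_ex2(a: int, b: int, d: int, e: int) -> int:
    """Ex2: {02}(D+E) + (A+B+D+E) + D, using one shared XTime block."""
    return xtime(d ^ e) ^ (a ^ b ^ d ^ e) ^ d
```

Both functions return the same output byte for every input combination; in Ex2 the sum A+B+D+E is additionally the same for all four output bytes of a column, so it can be shared across them.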

39. The Optimization of Separate Transformations (3) • 2. Bit-level optimization: the common sub-expression elimination (CSE) algorithm extracts as many common factors as possible to further reduce the hardware cost. • Ex (bits listed from bit 7 down to bit 0): $\{02\}A = \{a_6,\ a_5,\ a_4,\ a_3+a_7,\ a_2+a_7,\ a_1,\ a_0+a_7,\ a_7\}$ • $\{03\}A = \{a_6+a_7,\ a_5+a_6,\ a_4+a_5,\ a_3+a_4+a_7,\ a_2+a_3+a_7,\ a_1+a_2,\ a_0+a_1+a_7,\ a_0+a_7\}$ • The factor $a_0+a_7$ appears in bit 1 of {02}A and in bits 0 and 1 of {03}A, so it can be extracted and replaced with $a_8 = (a_0+a_7)$. • The factor $a_3+a_7$ appears in bit 4 of {02}A and in bits 3 and 4 of {03}A, so it can also be extracted and replaced with $a_9 = (a_3+a_7)$.
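
The effect of extracting $a_8$ and $a_9$ can be illustrated with a bit-level sketch. The function below is an illustration of the example on this slide only, not the full CSE algorithm; its name and the bit-list representation (index 0 is bit 0) are assumptions.

```python
def mul02_mul03_shared(a: int) -> tuple[int, int]:
    """Compute {02}A and {03}A bit by bit, sharing a8 = a0^a7 and a9 = a3^a7."""
    a0, a1, a2, a3, a4, a5, a6, a7 = [(a >> i) & 1 for i in range(8)]

    a8 = a0 ^ a7   # shared by bit 1 of {02}A and bits 0, 1 of {03}A
    a9 = a3 ^ a7   # shared by bit 4 of {02}A and bits 3, 4 of {03}A

    # Bits listed from bit 0 up to bit 7.
    m02 = [a7, a8, a1, a2 ^ a7, a9, a4, a5, a6]
    m03 = [a8, a8 ^ a1, a1 ^ a2, a2 ^ a9, a9 ^ a4, a4 ^ a5, a5 ^ a6, a6 ^ a7]

    def pack(bit_list: list[int]) -> int:
        return sum(bit << i for i, bit in enumerate(bit_list))

    return pack(m02), pack(m03)
```

Counting the `^` operators above gives 11 two-input XORs for both products together, compared with 15 when each bit expression on the slide is implemented independently.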

40. The Optimization of Combined Round Transformations (1) • Combine SB, SR, and MC in encryption or ISB, ISR, and IMC in decryption. • 1. Table-lookup ROM (T-box or $T^{-1}$-box):

41. The Optimization of Combined Round Transformations (2) – 2. Combined IMC/ISR/IAF and AF/SR/MC with shared MI in $GF((2^4)^2)$: (a) Combined AF/SR/MC (b) Combined IMC/ISR/IAF • Integration of AES Enc. and Dec. with shared MI in $GF((2^4)^2)$

42. The Optimization of Integrated Encryption/Decryption Transformations (1) • Two major integrations: • Integration of SB and ISB, and integration of MC and IMC. • SB/ISB: • Share the same MI logic in $GF(2^8)$ but multiplex the AF and IAF.

43. The Optimization of Integrated Encryption/Decryption Transformations (2) • MC/IMC: • 1. Share the common factor, the XTime block, when constructing one output byte of MC and IMC, as shown in the following figure. • 2. Decompose the constant matrix of IMC as IMC = MC × C, where C is a constant matrix as shown in the following equation.
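
The matrix C on the slide was not preserved in this transcript. One well-known factorization of this form uses the circulant matrix of the polynomial $\{04\}x^2 + \{05\}$; the sketch below assumes that choice of C and verifies numerically that MC × C equals the IMC matrix.

```python
def xtime(b: int) -> int:
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted


def gf_mul(a: int, b: int) -> int:
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def circulant(first_row: list[int]) -> list[list[int]]:
    """4x4 circulant matrix: row i is first_row rotated right by i."""
    return [[first_row[(j - i) % 4] for j in range(4)] for i in range(4)]


def mat_mul(x: list[list[int]], y: list[list[int]]) -> list[list[int]]:
    """Matrix product with entries in GF(2^8) (XOR as addition)."""
    out = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            for k in range(4):
                out[i][j] ^= gf_mul(x[i][k], y[k][j])
    return out


MC = circulant([0x02, 0x03, 0x01, 0x01])
IMC = circulant([0x0E, 0x0B, 0x0D, 0x09])
C = circulant([0x05, 0x00, 0x04, 0x00])   # assumed C: polynomial {04}x^2 + {05}
```

With this C, `mat_mul(MC, C) == IMC` holds, so a decryption datapath can reuse the MC logic and add only the comparatively small C stage.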

44. The Optimization of Integrated Encryption/Decryption Transformations (3) – 3. Decompose IMC = MC + F + G, where F and G are two constant matrix multiplications.

45. Our Proposed Substructure Sharing Methods for XOR-based Operations Bit-level Expressions of AES Transformations Proposed Method: Bit-level Substructure Sharing

46. Bit-level Expressions of AES Transformations

47. Bit-level Expressions of AES Transformations • The two major kinds of transformations, SB (ISB) and MC (IMC), occupy about 65% of the total area cost of an AES implementation. • They can be expressed as bit-level XOR-based sum-of-product (SoP) operations. • SB: $Out_{SB}$ = MI + AF • ISB: $Out_{ISB}$ = IAF + MI • MI: $GF((2^4)^2)$, $GF(((2^2)^2)^2)$ • MC: $Out_{MC}$ = {01}A + {01}B + {02}D + {03}E (1-byte output) • IMC: $Out_{IMC}$ = {0d}A + {09}B + {0e}D + {0b}E (1-byte output)

48. Two Proposed CSE Algorithms for Sum-of-Product Operations Bit-level SoP Expressions Proposed Method III: Vertical CSE Algorithm Proposed Method IV: Horizontal CSE Algorithm

49. Bit-level Expressions (1) • A group of $P$ bit-level equations $(z_0, z_1, \ldots, z_{P-1})$ with $M_0$ primary input variables $(a_0, a_1, \ldots, a_{M_0-1})$ and $N_0$ product terms $(w_0, w_1, \ldots, w_{N_0-1})$ can be expressed in the following matrix product form:

50. Bit-level Expressions (2) • The $N_0$ intermediate bit variables $w_i$ can be expressed as products of the primary input variables, where · denotes the bitwise AND operation.