Learn about the fundamentals of cryptography and the implementation of RSA encryption. Explore key concepts in number theory, key choice, and mathematical foundations of RSA. Discover the cryptographic processors and recommended readings in the field.
Lecture 5 – Introduction to Cryptography 3 / Implementation Rice ELEC 528 / COMP 538 Farinaz Koushanfar Spring 2009
Rivest, Shamir, Adleman (RSA) • Based on number theory and the difficulty of determining the prime factors of a large number • Two keys, e and d, are used for encryption and decryption • Plaintext message P is encrypted to ciphertext C: C = P^e mod n • The plaintext is recovered by P = C^d mod n • Encryption and decryption are mutual inverses and commutative: P = C^d mod n = (P^e)^d mod n = (P^d)^e mod n
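The encrypt/decrypt round trip on this slide can be tried with a few lines of Python. This is only a sketch: the primes, the exponents, and the helper names `encrypt` and `decrypt` are illustrative assumptions, and the numbers are far too small to be secure.

```python
# Toy RSA round trip. All values are assumed for illustration only.
p, q = 61, 53        # small primes (a real modulus uses ~100-digit primes)
n = p * q            # modulus n = 3233
e, d = 17, 2753      # chosen so that e*d = 1 mod (p-1)*(q-1)

def encrypt(P: int) -> int:
    """C = P^e mod n."""
    return pow(P, e, n)

def decrypt(C: int) -> int:
    """P = C^d mod n."""
    return pow(C, d, n)

P = 65
C = encrypt(P)
assert decrypt(C) == P                 # (P^e)^d mod n recovers P
assert pow(pow(P, d, n), e, n) == P    # (P^d)^e mod n also recovers P
```

The three-argument `pow` performs modular exponentiation without ever forming the full power, which is what makes the operation practical for large exponents.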
RSA – Key Choice • Starting point: select a value for n • n is the product of two large primes p and q; each is ~100 digits, so n is ~200 digits • A relatively large e is selected that is relatively prime to (p-1)*(q-1); one easy way is to select e to be a prime larger than both (p-1) and (q-1) • Finally, d is selected such that e*d = 1 mod (p-1)*(q-1)
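The key-choice steps can be sketched as follows. The function name `make_keys` and the toy inputs are assumptions made for illustration; the sketch does not test primality or generate large primes. A prime e larger than both (p-1) and (q-1) cannot divide either factor, which is why it is automatically relatively prime to their product.

```python
from math import gcd

def make_keys(p: int, q: int, e: int):
    """Return (n, e, d) for primes p and q and a candidate public exponent e.

    Hypothetical helper: the caller is assumed to supply genuine primes.
    """
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to (p-1)*(q-1)")
    d = pow(e, -1, phi)   # modular inverse of e (Python 3.8+)
    return n, e, d

# e = 67 is a prime larger than both p-1 = 60 and q-1 = 52.
n, e, d = make_keys(61, 53, 67)
assert (e * d) % (60 * 52) == 1
```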
Mathematical Foundation • The Euler totient function φ(n) is the number of positive integers less than n that are relatively prime to n; if p is prime, then φ(p) = p-1 • If n = p*q, where p and q are distinct primes, φ(n) = φ(p)*φ(q) = (p-1)*(q-1) • Euler and Fermat proved that x^φ(n) = 1 mod n for any integer x, if n and x are relatively prime
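A brute-force check of these three facts on small numbers may help make them concrete. The function name `totient` and the choice p = 5, q = 11 are assumptions for illustration; counting by trial is only feasible for tiny n.

```python
from math import gcd

def totient(n: int) -> int:
    """Count the positive integers less than n that are relatively prime to n."""
    return sum(1 for x in range(1, n) if gcd(x, n) == 1)

p, q = 5, 11
n = p * q

assert totient(p) == p - 1                 # totient of a prime
assert totient(n) == (p - 1) * (q - 1)     # totient of a product of two primes

# Euler/Fermat: x^totient(n) = 1 mod n whenever x and n are relatively prime.
for x in range(1, n):
    if gcd(x, n) == 1:
        assert pow(x, totient(n), n) == 1
```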
Mathematical Foundation -- RSA • Encrypt by RSA: E(P) = P^e mod n • The value of e is selected such that its inverse d can be easily formed (inverses mod φ(n)): e*d = 1 mod φ(n) • Or, e*d = k*φ(n) + 1 for some integer k • Because of the Euler/Fermat results, assuming P and p are relatively prime, P^(p-1) = 1 mod p
RSA Math (Cont’d) • Since (p-1) is a factor of φ(n), P^(k*φ(n)) = 1 mod p • Multiplying by P produces P^(k*φ(n)+1) = P mod p • The same is true for q: P^(k*φ(n)+1) = P mod q • So (P^e)^d = P^(e*d) = P^(k*φ(n)+1), which equals P mod p and P mod q • Thus, since p and q are distinct primes, (P^e)^d = P mod n • And e and d are inverse operations
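Each step of this derivation can be checked numerically. The sketch below uses assumed toy values (p = 61, q = 53, e = 17) and a hypothetical helper `steps_hold`; it confirms the chain of congruences for every message smaller than n, but it is a sanity check, not a proof.

```python
# Assumed toy parameters, for illustration only.
p, q, e = 61, 53, 17
n = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)          # e*d = 1 mod phi(n)  (Python 3.8+)
k = (e * d - 1) // phi       # so that e*d = k*phi(n) + 1
assert e * d == k * phi + 1

def steps_hold(P: int) -> bool:
    """Check the derivation's congruences for one message P."""
    ok = True
    for prime in (p, q):
        if P % prime != 0:
            # P relatively prime to the prime: P^(k*phi(n)) = 1 mod prime
            ok = ok and pow(P, k * phi, prime) == 1
        # Multiplying by P: P^(k*phi(n)+1) = P mod prime
        ok = ok and pow(P, k * phi + 1, prime) == P % prime
    # Combined result: (P^e)^d = P mod n
    ok = ok and pow(pow(P, e, n), d, n) == P
    return ok

assert all(steps_hold(P) for P in range(n))
```

The congruence P^(k*φ(n)+1) = P also holds when a prime divides P, because both sides are then 0 modulo that prime, so the final result covers every P below n.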
Crypto Processors • There are many HW implementations of the standard security protocols, e.g., AES, DES, PKP • Please check: http://www.hardware-ciphers.com/en/index.html • Our goal is not to design a new one, or to teach you to design a new one, but to show you what implementations look like • What are the basic building blocks, and what are the potential weaknesses/vulnerabilities of each block?
Recommended reading • A. Hodjat and I. Verbauwhede. Minimum area cost for a 30 to 70 Gbits/s AES processor. IEEE Computer Society Annual Symposium on VLSI, pp. 83-88, 2004. • T. Good and M. Benaissa. AES on FPGA from the fastest to the smallest, 2005. • L. Batina, S. Berna Ors, B. Preneel and J. Vandewalle. Hardware architectures for public key cryptography, 2003.
Minimum Area Cost for a 30 to 70 Gbits/s AES Processor Alireza Hodjat Ingrid Verbauwhede Department of Electrical Engineering University of California, Los Angeles {ahodjat, ingrid} @ ee.ucla.edu IEEE Computer Society Symposium on VLSI (ISVLSI 04) February 2004 This material is based upon work supported by the Space and Naval Warfare Systems Center - San Diego under contract No. N66001-02-1-8938.
Outline • Motivation • Ultra high throughput AES implementation • Area efficient byte substitution • High speed AES with online key scheduling • High speed AES with offline key scheduling • Conclusion
Motivation • Cryptographically secure random number generation for optical link switches • Advanced Encryption Standard algorithm in the Counter mode of operation • Non-feedback mode of operation (pipelining is allowed)
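Why counter mode allows pipelining can be seen in a small behavioral sketch. Note the assumptions: `toy_block_function` is a SHA-256-based stand-in, not AES, and both function names are hypothetical. The point is only the structure: output block i depends on the key and the counter value alone, never on an earlier output.

```python
import hashlib

BLOCK_BYTES = 16  # AES block size: 128 bits

def toy_block_function(key: bytes, block: bytes) -> bytes:
    """Stand-in for encrypting one block (NOT AES, illustration only)."""
    return hashlib.sha256(key + block).digest()[:BLOCK_BYTES]

def ctr_random_blocks(key: bytes, start: int, count: int):
    """Produce `count` output blocks from counters start, start+1, ..."""
    blocks = []
    for i in range(count):
        counter = (start + i) % (1 << (8 * BLOCK_BYTES))
        blocks.append(toy_block_function(key, counter.to_bytes(BLOCK_BYTES, "big")))
    return blocks

key = bytes(16)
stream = ctr_random_blocks(key, 0, 4)
# Block 2 can be computed on its own, without blocks 0 and 1:
assert ctr_random_blocks(key, 2, 1)[0] == stream[2]
```

Because there is no feedback from one block to the next, a hardware pipeline can accept a new counter value while earlier blocks are still in flight.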
Ultra High Throughput AES • The key length: the critical path is in the key scheduling path; fixed key size, 128-bit only • Loop unrolling • Pipelining: inner-round pipelining and outer-round pipelining • Choice of byte-substitution phase: LUT implementation, or implementation using GF operations (further pipelining)
Byte substitution optimization • Byte substitution on GF(2^8) • First: multiplicative inverse in GF(2^8) • Second: affine transformation (over GF(2)) • Multiplicative inverse in GF(2^8) is expensive • Area-efficient implementation using GF(2^4) operations
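The two steps of byte substitution can be written out in software as a reference model. This sketch computes the inverse directly in GF(2^8) by exponentiation, using the standard AES polynomial and affine constant; it does not model the GF(2^4) composite-field decomposition from the slide, which is a hardware area optimization of the same function. The helper names are assumptions.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B      # reduce back to 8 bits
        b >>= 1
    return result

def gf_inv(a: int) -> int:
    """Multiplicative inverse computed as a^254; 0 maps to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def affine(b: int) -> int:
    """AES affine transformation over GF(2)."""
    def rotl(x: int, s: int) -> int:
        return ((x << s) | (x >> (8 - s))) & 0xFF
    return b ^ rotl(b, 1) ^ rotl(b, 2) ^ rotl(b, 3) ^ rotl(b, 4) ^ 0x63

def sbox(a: int) -> int:
    """First the inverse in GF(2^8), then the affine transformation."""
    return affine(gf_inv(a))

assert sbox(0x00) == 0x63
assert sbox(0x53) == 0xED
```

A LUT implementation would simply store all 256 values of `sbox`; the GF-based designs compute them with logic instead, which is what opens the door to pipelining inside the Sbox.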
Area Efficient Byte Substitution • [Figure: four Sbox implementations] • (a) Byte substitution using LUT implementation • (b) Non-pipelined Sbox using GF operations • (c) Two-stage pipelined Sbox using GF operations • (d) Three-stage pipelined Sbox using GF operations
Area-Delay Trade-off for Sbox • The area costs of the Sbox using the two-stage and three-stage composite field implementations are 23% and 32% less, respectively, than that of the LUT design with the same speed
High Speed AES with Online Key Scheduling • [Figure: round architectures with 2, 3, and 4 pipeline stages per round]
Throughput-Area Trade-off for AES • Area cost for the design with three pipeline stages is 35% less than the design with LUT Sbox implementation • Area cost for the design with four pipeline stages is 30% less than the design with LUT Sbox implementation
High Speed Design with Offline Key Scheduling • The key does not vary as frequently as the data • Pre-calculate the key schedule and store the round keys in the round key registers • The key schedule is computed in 20 cycles
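In software terms, offline key scheduling means expanding the key once and reusing the stored round keys for every block. The sketch below is a behavioral model of the standard AES-128 key expansion; the function names are assumptions, and it says nothing about the 20-cycle hardware unit itself.

```python
def _gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def _sbox_entry(a: int) -> int:
    """AES S-box value: inverse in GF(2^8), then the affine transformation."""
    inv = 1
    for _ in range(254):
        inv = _gf_mul(inv, a)
    rotl = lambda x, s: ((x << s) | (x >> (8 - s))) & 0xFF
    return inv ^ rotl(inv, 1) ^ rotl(inv, 2) ^ rotl(inv, 3) ^ rotl(inv, 4) ^ 0x63

SBOX = [_sbox_entry(a) for a in range(256)]

def expand_key_128(key: bytes):
    """Return the 11 round keys (16 bytes each) for a 128-bit key."""
    if len(key) != 16:
        raise ValueError("AES-128 key must be 16 bytes")
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]        # RotWord
            temp = [SBOX[b] for b in temp]    # SubWord
            temp[0] ^= rcon                   # round constant
            rcon = _gf_mul(rcon, 2)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return [bytes(sum(words[4 * r:4 * r + 4], [])) for r in range(11)]

# Computed once per key, then reused for every data block.
round_keys = expand_key_128(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
assert round_keys[1].hex() == "a0fafe1788542cb123a339392a6c7605"
```

The key in the usage line is the AES-128 example key from FIPS 197, so the first derived round key can be compared against the published expansion.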
Throughput-Area Trade-Off • The offline key scheduling unit can reduce the area by up to 28% • Area cost for the design with three pipeline stages is 37% less than the design with LUT Sbox implementation • Area cost for the design with four pipeline stages is 33% less than the design with LUT Sbox implementation
Conclusion • Area-efficient architectures for a 30 to 70 Gbits/s AES processor • Loop unrolling and inner- and outer-round pipelining were used • Pipelined composite field implementation of the byte substitution phase reduces the area cost by up to 35% • The offline key scheduling unit reduces the area cost by up to 28% • The total area cost of the final architecture was reduced by up to 48%