IVEC: Off-Chip Memory Integrity Protection for Both Security and Reliability. Ruirui Huang, G. Edward Suh, Cornell University.
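Motivation • ECC: protects off-chip memory against random transient errors using ECC parity • IV: protects off-chip memory against malicious attacks using an IV hash • Using both means twice the overhead for random error detection • ECC alone does not stop attacks: it is easy to compute the ECC parity bits for the injected attack data • IV alone does not recover from errors: execution is aborted when IV fails • [Figure: processor and off-chip memory, with ECC parity and IV hash stored alongside the data]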
IVEC – Integrity Verification with Error Correction • Can we extend the capability of IV to handle both security and reliability errors with minimal overheads? • Goal: • Extend IV to correct errors while ensuring a proper level of security • Cover both single-bit and multi-bit errors • Challenge: • Error correction is essentially finding the erroneous bits • The cryptographic hash in IV does not reveal error locations
Outline • Background: ECC, Integrity Verification (IV) • IVEC error correction: single-bit errors, multi-bit errors • HW Implementation • Evaluation
ECC (SEC-DED) • In general, a modern system uses (72, 64) SEC-DED ECC • For every 64 bits of data, 8 additional parity bits are needed • Memory space and bandwidth overheads of 12.5% • Corrects 1-bit errors • [Figure: ECC DIMM (18 x4 DRAM chips); DRAM 1 to DRAM 16 hold data and two extra DRAM chips (DRAM 17, DRAM 18) hold the 8-bit ECC parity, forming a 72-bit SEC-DED ECC word] • ECC can be extended to correct common multi-bit errors • Chip-kill correct: corrects up to one DRAM chip failure
Cryptographic Hash • IV relies on a cryptographic hash to detect any changes to data saved in untrusted memory • Fixed-length "fingerprint" of the data • Collision resistance is a key property • A Message Authentication Code (MAC) is a keyed cryptographic hash that can also be used for IV • On data access, check whether the stored hash h equals H(d) for the data d
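The check h == H(d) can be sketched in a few lines. This is only an illustration: HMAC-SHA-256 truncated to 64 bits stands in for the keyed hash, and `KEY`, `mac64`, and `verify` are hypothetical names that do not come from the slides.

```python
import hashlib
import hmac

# Hypothetical secret that would live inside the trusted processor.
KEY = b"demo-key-not-from-the-slides"

def mac64(data: bytes, key: bytes = KEY) -> bytes:
    """Keyed hash truncated to 64 bits (HMAC-SHA-256 used as a stand-in MAC)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:8]

def verify(data: bytes, stored_mac: bytes) -> bool:
    """On data access, check that the stored MAC matches the MAC of the data."""
    return hmac.compare_digest(mac64(data), stored_mac)

block = bytes(64)                      # one 64B cache block of zeros
h = mac64(block)                       # MAC stored when the block is written
tampered = bytes([1]) + block[1:]      # same block with one bit changed

print(verify(block, h), verify(tampered, h))
```

Any change to the data, whether random or malicious, makes the comparison fail unless a MAC collision occurs.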
IV - Hash/MAC Trees • Integrity verification techniques often rely on hash/MAC trees • Any changes in data memory would be detected • Previous works suggest that IV's performance overhead is only 2-5% when using Cached MAC Trees • [Figure: hash tree; the root hash H(h1 || h2 || h3 || h4) is kept in the processor, while the hashes h1 to h4 and the protected data are kept in off-chip memory in units the size of a cache block]
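A one-level tree with the root H(h1 || h2 || h3 || h4) can be sketched as follows. SHA-256 is an assumed stand-in for the hash, and `H`, `root_hash`, and the sample blocks are hypothetical; a real tree has more levels and, per the slide, caches intermediate MACs.

```python
import hashlib

def H(data: bytes) -> bytes:
    """Stand-in cryptographic hash (SHA-256); the slides do not name one here."""
    return hashlib.sha256(data).digest()

def root_hash(blocks: list) -> bytes:
    """Hash each protected block, then hash the concatenation h1 || h2 || ..."""
    leaf_hashes = [H(b) for b in blocks]
    return H(b"".join(leaf_hashes))

# Four 64B data blocks held in untrusted off-chip memory.
blocks = [bytes([i]) * 64 for i in range(4)]
trusted_root = root_hash(blocks)       # only this value is kept in the processor

# An attacker (or a fault) changes one byte of the third block.
modified = list(blocks)
modified[2] = b"\xff" + blocks[2][1:]

print(root_hash(blocks) == trusted_root, root_hash(modified) == trusted_root)
```

Because only the root needs trusted storage, the leaf hashes can sit in untrusted memory next to the data.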
Single-bit Error Model • A single-bit error in a cache block (64B) • The error is detected by checking the computed hash value against the stored hash value on-chip • [Figure: DIMM1 to DIMM4, each with DRAM 1 to DRAM 16; 1st read-block (256 bits) and 2nd read-block (256 bits)] • 64B cache block, 256 bits per read-block (2 read-blocks required to fill 1 cache block)
Single-bit Error Correction • Correction as a search problem • Flip one bit at a time, over all possible bit positions, and check whether the new value passes the integrity verification • [Figure: bit values of the 1st and 2nd read-blocks (256 bits each) across DIMM1 to DIMM4, with the flipped bit marked "Corrected!"] • 64B cache block, 256 bits per read-block (2 reads required to fill 1 cache block)
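The search described on this slide can be sketched as a loop over all 512 bit positions of a 64B block. As before, HMAC-SHA-256 truncated to 64 bits is an assumed stand-in for the MAC, and `KEY`, `mac64`, and `correct_single_bit` are hypothetical names.

```python
import hashlib
import hmac
from typing import Optional

KEY = b"demo-key-not-from-the-slides"   # hypothetical on-chip secret

def mac64(data: bytes) -> bytes:
    """64-bit keyed hash (truncated HMAC-SHA-256 as a stand-in MAC)."""
    return hmac.new(KEY, data, hashlib.sha256).digest()[:8]

def correct_single_bit(data: bytes, stored_mac: bytes) -> Optional[bytes]:
    """Flip one bit at a time; return the first candidate that passes IV."""
    candidate = bytearray(data)
    for bit in range(len(data) * 8):
        mask = 1 << (bit % 8)
        candidate[bit // 8] ^= mask                  # try flipping this bit
        if hmac.compare_digest(mac64(bytes(candidate)), stored_mac):
            return bytes(candidate)                  # verification passes
        candidate[bit // 8] ^= mask                  # undo and move on
    return None                                      # not a single-bit error

original = bytes(range(64))                          # 64B cache block
stored = mac64(original)

corrupted = bytearray(original)
corrupted[17] ^= 0b00000010                          # inject a single-bit error
recovered = correct_single_bit(bytes(corrupted), stored)

double = bytearray(corrupted)
double[40] ^= 0b10000000                             # a second flipped bit
unrecovered = correct_single_bit(bytes(double), stored)
```

Each loop iteration is one correction attempt, so the worst case for a 512-bit block is 512 MAC computations.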
Multi-bit Error Model • Any bits in one DRAM chip can fail in each read-block • Similar to chip-kill correct • [Figure: DIMM1 to DIMM4, each with DRAM 1 to DRAM 16; 1st read-block (256 bits) and 2nd read-block (256 bits)] • 64B cache block, 256 bits per read-block (2 reads required to fill 1 cache block)
IVEC Error Correction with Parity • Each parity bit covers one bit from every DRAM chip in a read-block • x4 DRAM: 4 parity bits per read-block • [Figure: parity bits P1 to P8 mapped onto the bits that each DRAM chip contributes to the 1st and 2nd read-blocks (256 bits each) across DIMM1 to DIMM4] • 64B cache block, 256 bits per read-block (2 reads required to fill 1 cache block), 8 parity bits
IVEC Correction with Parity • Use parity bits to guide the correction search • The correction scheme can be extended to use more or fewer parity bits • For hard faults, start searching from recent error locations • [Figure: bit values of the 1st and 2nd read-blocks (256 bits each) with their parity bits P1 to P8; the bits of the failing chip are marked "Corrected!"] • 64B cache block, 256 bits per read-block (2 reads required to fill 1 cache block), 8 parity bits
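The slides do not spell out the search, so the following is one plausible reading rather than the authors' exact algorithm. It assumes a read-block is 64 four-bit chip contributions (4 DIMMs x 16 x4 chips), that at most one chip fails per read-block, and that the stored parity bits are themselves correct. Under those assumptions the XOR of stored and recomputed parity gives the error pattern, and the search only has to find which chip it belongs to. All names are hypothetical.

```python
import hashlib
import hmac
from functools import reduce
from itertools import product
from operator import xor

KEY = b"demo-key-not-from-the-slides"   # hypothetical on-chip secret
CHIPS = 64                              # x4 chips feeding one 256-bit read-block

def mac64(nibbles):
    """64-bit keyed hash over the cache block (truncated HMAC-SHA-256 stand-in)."""
    return hmac.new(KEY, bytes(nibbles), hashlib.sha256).digest()[:8]

def parity(read_block):
    """4 parity bits: bit j is the XOR of bit j of every chip's 4-bit value."""
    return reduce(xor, read_block, 0)

def correct(blocks, stored_parity, stored_mac):
    """Try each chip as the failing one in each read-block; return the fix or None."""
    # With one failing chip, the parity mismatch equals that chip's error pattern.
    syndromes = [parity(b) ^ p for b, p in zip(blocks, stored_parity)]
    # A zero syndrome means nothing to repair in that read-block.
    choices = [range(CHIPS) if s else [None] for s in syndromes]
    for chips in product(*choices):                  # at most 64 * 64 attempts
        cand = [list(b) for b in blocks]
        for rb, chip in enumerate(chips):
            if chip is not None:
                cand[rb][chip] ^= syndromes[rb]      # apply pattern to this chip
        if hmac.compare_digest(mac64(cand[0] + cand[1]), stored_mac):
            return cand
    return None

good = [[(7 * i + 3) % 16 for i in range(CHIPS)],
        [(5 * i + 1) % 16 for i in range(CHIPS)]]
stored_parity = [parity(b) for b in good]
stored_mac = mac64(good[0] + good[1])

bad = [list(b) for b in good]
bad[0][10] ^= 0b1011                    # chip 10 fails in the 1st read-block
bad[1][42] ^= 0b0110                    # chip 42 fails in the 2nd read-block
fixed = correct(bad, stored_parity, stored_mac)
```

In this model the parity bits cut the search from every possible multi-bit pattern down to at most 64 x 64 = 4096 chip combinations per cache block.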
Parity Handling • Parity bits are stored in regular memory space • Parity bits are not needed for reads unless there is an error • They are only updated on write-back operations • Decoupled error detection and correction • A parity cache can be used to load and store parity bits when necessary
IVEC Hardware Implementation • Blue – new blocks for IVEC • Yellow – already exist in a system with IV • [Figure: block diagram with L2 Cache, Counter Cache, AES, GF Multiply, Check, LDQ, MACQ, IV Queue, Data Queue, Parity Cache, Correction Buffer, and IVEC Control; signal labels: Parent MAC from cache, To memory, From memory, To L2, Result to control]
Error Detection • IV detects any error pattern unless there is a hash/MAC collision • Error detection probability depends on the length of the hash/MAC • ↑ hash/MAC length, ↓ collision rate • For example, a 64-bit MAC has a 1/2^64 collision rate
Error Correction • Mis-correction happens if there is a hash/MAC collision on a correction attempt • Every time a hash is recomputed for a possible correction (correction attempt), there is a chance of a collision • ↑ number of correction attempts, ↑ mis-correction rate • Security is weakened by correction attempts • An integrity violation is not detected on a mis-correction • ↑ number of correction attempts, ↓ security • Correction latency • GMAC: 4-8 cycles per correction attempt
Worst-Case Numbers • Maximum number of correction attempts • Security is reduced by ~8 bits (64 bits -> 56 bits); max correction latency: 4096 cycles • Security is reduced by ~12 bits (64 bits -> 52 bits); max correction latency: 32768 cycles • Parameters: 512-bit cache block, 256-bit read-block
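The relation between correction attempts, security loss, and latency can be sketched numerically. The model is an assumption: each attempt is treated as an independent chance of a MAC collision, so k attempts cost about log2(k) bits of security. The attempt count of 4096 is inferred, not read from the slide, because the slide's table of attempt counts did not survive. It is chosen because it matches the ~12-bit reduction and, at 8 cycles per attempt, the 32768-cycle latency.

```python
import math

def effective_security_bits(mac_bits: int, attempts: int) -> float:
    """Union-bound estimate: k attempts raise the collision chance to k / 2^n."""
    return mac_bits - math.log2(attempts)

def worst_case_latency(attempts: int, cycles_per_attempt: int) -> int:
    """Every attempt recomputes the MAC once, one after another."""
    return attempts * cycles_per_attempt

attempts = 4096            # assumed worst case, e.g. 64 x 64 chip combinations
bits_left = effective_security_bits(64, attempts)
latency = worst_case_latency(attempts, 8)   # 8 cycles: upper end of 4-8 for GMAC

print(bits_left, latency)
```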
Memory Space Overhead • ECC: 64 parity bits per cache block (512 bits) • IV: 64-bit MAC per cache block (512 bits) in a MAC tree structure plus meta-data
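A rough calculation of these two overheads is sketched below. It assumes that every MAC tree node is one 512-bit block holding eight 64-bit MACs and it ignores the meta-data (such as counters) that the slide mentions, so the IV figure is a lower bound for this simplified model only. The function names are hypothetical.

```python
from fractions import Fraction

def ecc_overhead(parity_bits: int = 64, block_bits: int = 512) -> Fraction:
    """Extra storage for ECC parity, as a fraction of the data size."""
    return Fraction(parity_bits, block_bits)

def mac_tree_overhead(data_blocks: int, mac_bits: int = 64,
                      block_bits: int = 512) -> Fraction:
    """Blocks of MACs needed to cover the data, summed over all tree levels."""
    arity = block_bits // mac_bits          # MACs that fit in one tree node
    total = 0
    level = data_blocks
    while level > 1:
        level = -(-level // arity)          # ceiling division: nodes one level up
        total += level
    return Fraction(total, data_blocks)

ecc = ecc_overhead()                        # 64 / 512
iv = mac_tree_overhead(8 ** 6)              # 8^6 data blocks = 16 MiB of data

print(float(ecc), float(iv))
```

With an arity of 8 the tree levels form a geometric series, so the MAC storage approaches 1/7 of the data size for large memories.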
Performance Evaluation • Run-time overheads • Error correction latency: negligible with a typical soft error rate (SER) • Performance overhead due to off-chip bandwidth usage from updating parity bits • Tools • Pin instrumentation tool and TAXI performance simulator • Parameters • Core2-like single processor: 4-issue OoO core • The baseline is chosen to have IV implemented • 64-bit GMAC-tree with split counter mode (< 5% overhead)
Memory Bandwidth Overhead • Traditional ECC bandwidth overhead is 12.5% • IVEC memory bandwidth overhead is <= 9% in the worst case • Performance overhead is negligible (0.5% in the worst case) • [Figure: memory bandwidth overhead chart; labeled values 9% and 3.2%]
Related Work • Memory integrity verification • Off-chip DRAM ECC • SEC-DED ECC • Chip-kill Correct • Tiered ECC • Reliability and Security Engine (RSE)
Conclusion • IVEC enables efficient protection of off-chip memory from both security attacks and random errors • Can handle both single-bit errors and multi-bit errors • Minimal impact on security • IVEC is able to eliminate the use of traditional ECC for off-chip memory when a system requires IV for security