400 likes | 522 Vues
I15. Using Column-Level Encryption in IDS 10.00. Jonathan Leffler IBM Information Management. Thursday 5 th October 2006 • 09:00 – 10:00. Agenda. What is Column-Level Encryption? Manual Pages Modes of Use Key Management Behind the Scenes Questions and Answers.
E N D
I15 Using Column-Level Encryption in IDS 10.00 Jonathan Leffler IBM Information Management Thursday 5th October 2006 • 09:00 – 10:00
Agenda • What is Column-Level Encryption? • Manual Pages • Modes of Use • Key Management • Behind the Scenes • Questions and Answers
Encryption in Earlier Versions? • IDS 7.3x, IDS 9.2x, 9.30 provide • Password encryption • Password is transmitted encrypted • Rest of session is unencrypted • IDS 9.40 provides • Fully Encrypted Communications • Using encryption technology from OpenSSL • Encrypts all client to server communications • ER server to server communications • Not HDR server to server communications
What is Column-Level Encryption? • Data can be stored in encrypted format. • Using SQL functions to encrypt and decrypt the data. • Data encrypted using either Triple-DES or AES. • Data encrypted under application control. • DBMS is not aware that data is encrypted. • Can set password at session level with new SQL statement. • Design based on DB2 functions. • DB2 supports DES: • But, since July 2004, the US Government does not. • DB2 does not support AES: • AES (aka Rijndael) is the replacement for DES. • Analogous to Oracle’s DBMS Obfuscation Tool Kit.
What is Column-Level Encryption? • Assists in legislative compliance. • HIPAA (Health Insurance Portability and Accountability Act), 1996 • Sarbanes-Oxley (aka Sarbox or Sox), 2002 • Basel II, 2001 • Gramm-Leach-Bliley Act (GLBA), 1999 • California SB 1386 ‘Personal Information: Privacy’, 2002 • Latest cryptographic standards (OpenSSL 0.9.7c) • 128-bit AES and 112-bit Triple-DES
Manual Pages – Overview • Passwords and hints • ENCRYPT_TDES function • ENCRYPT_AES function • DECRYPT_CHAR function • DECRYPT_BINARY function • GETHINT function • SET ENCRYPTION PASSWORD statement • Data space requirements • Type support
Passwords and Hints • Passwords are necessary. • Pass-phrase can be from 6 bytes up to 128 bytes. • Cryptographic hash is used to make random-seeming key. • Random initialization vector also used (and recorded). • Hints are optional. • Hints can be up to 32 bytes of text. • Passwords (and hints) can be set for a session. • SET ENCRYPTION PASSWORD statement. • Hence, the password and hint parameters are optional. • Explicit values override the session default values.
ENCRYPT_TDES • ENCRYPT_TDES(data [ , password [ , hint ] ]) • Input data is encrypted. • For character input data: • Output is base-64 encoded. • For binary input data (BLOB): • Output is unencoded. • Triple-DES encryption • Triple DES uses two 56-bit keys for 112-bits overall. • UPDATE SomeTableSET EncryptedColumn = ENCRYPT_TDES(?, ?) WHERE PK_Column = ?; • Extremely variant function.
ENCRYPT_AES • ENCRYPT_AES(data [ , password [ , hint ] ]) • Input data is encrypted. • For character input data: • Output is base-64 encoded. • For binary input data (BLOB): • Output is unencoded. • AES encryption • Advanced Encryption System (aka Rijndael). • 128-bit key size. • INSERT INTO SomeTableVALUES (?, ENCRYPT_AES(?, ?)) • Extremely variant function.
DECRYPT_CHAR • DECRYPT_CHAR(encrypted_data [ , password ]) • Encrypted data in base-64 encoding contains: • Information about encryption method. • All other data needed to decrypt it. • Except the password! • Error if the data is not encrypted. • SELECTDECRYPT_CHAR(EncryptedColumn, ‘password’) FROM SomeTableWHERE PK_Column = ? • Invariant function.
DECRYPT_BINARY • DECRYPT_BINARY(encrypted_data [, password ]) • Encrypted data contains: • Information about encryption method. • All other data needed to decrypt it. • Except the password! • Error if the data is not encrypted. • SELECTDECRYPT_BINARY(EncryptedByteColumn, ?) FROM SomeTableWHERE PK_Column = ? • Invariant function.
GETHINT • GETHINT(encrypted_data) • Returns the hint (if any) from the encrypted data. • Or an empty string (NULL). • SELECT GETHINT(enc_cc_number) FROM cc_infoWHERE user_id = ?; • Invariant function. • Anybody can get any hint at any time.
SET ENCRYPTION PASSWORD • New SQL statement • SET ENCRYPTION PASSWORD ‘password’ [ WITH HINT ‘hint string’ ]; • Specifies the password that will be used by default. • When no password explicitly provided: • To encryption functions. • To decryption functions. • Optionally specifies the hint that will be used by default: • When no hint provided to encryption functions. • Session wide password management: • For easy programming. • Support for views, triggers, SPL.
Storage Space Requirements • Encrypted data will be stored in character columns. • Needs more space than the unencrypted data. • If input data string is N bytes long: • AES = B64(NGM(N, 16) + H + 8) + 11 • Triple-DES = B64(NGM(N, 8) + H + 8) + 11 • H = 0 with no hint; H = 40 with hint. • NGM(x,y) – Next multiple of y that is greater than x. • NGM(x, y) = ((x + y) ÷ y) × y • B64(x) – Base-64 encoding size. • B64(x) = ((x + 2) ÷ 3) × 4 • AES can be bigger than Triple-DES, but not by much.
Importance of Sizing • You must allocate enough space for encrypted values. • ENCRYPT_AES and ENCRYPT_TDES return strings. • IDS does not know that a column stores encrypted values • Storing an encrypted INTEGER in a CHAR(4) column • IDS will truncate the 43-byte string, and • IDS will store just the first four bytes of the string. • This is futile: • IDS doesn't even have all its control information, • Much less the complete string. • The insert operation and update operations will ‘succeed’, BUT the decrypt operations will fail horribly. • The size calculation is very important. • So is testing.
Data Types Supported • You can encrypt almost all the built-in data types: • CHAR, VARCHAR, NCHAR, NVARCHAR, LVARCHAR • SMALLINT, INTEGER, INT8, DECIMAL, SMALLFLOAT, FLOAT • DATE, DATETIME, INTERVAL, BOOLEAN • BLOB, CLOB • And the various synonyms for these type names • All except BYTE, TEXT • You must store the data in a character column • Usually CHAR(n), possibly a VARCHAR(n) or LVARCHAR(n). • Store encrypted BLOB, CLOB values in BLOB columns.
User-Defined Data Types • Collection types require an explicit cast: • … ENCRYPT_AES(LIST{1,3,2}::LVARCHAR, ‘PassWord’) … • … DECRYPT_CHAR(enc_list, ‘PassWord’)::LIST(INT NOT NULL) … • And occupy more space than you might expect! • Row types need similar treatment. • … ENCRYPT_AES(ROW(1, “abc”, 2.3)::ROW(i INT, c CHAR(10), d DECIMAL(4,2))::LVARCHAR, ‘PassWord’) … • … DECRYPT_CHAR(enc_row, ‘PassWord’)::ROW(i INT, c CHAR(10), d DECIMAL(4,2)) … • Treat DISTINCT types the same as the base type. • User-defined (opaque) types: • Should provide explicit casts to/from LVARCHAR • And you need to use them.
Data Types Supported • IDS is very good at converting data into strings. • CREATE FUNCTION make_string(x VARCHAR(255)) RETURNING VARCHAR(255); • RETURN x; • END FUNCTION; • EXECUTE FUNCTION make_string(CURRENT YEAR TO FRACTION(5)); • EXECUTE FUNCTION make_string(2.718281828); • These calls both return strings • The ENCRYPT_AES and ENCRYPT_TDES functions are similar.
Data Types Supported • The size of the converted string dictates the size of encrypted data. • INTEGER -2,000,000,000 requires 12 characters Encrypted size is 43 bytes without hint. • Approximately 11:1 expansion in storage requirement. • DATETIME YEAR TO FRACTION(5) requires 25 characters • Encrypted size is 55 bytes for Triple-DES, 67 for AES. • BOOLEAN, SMALLINT, DECIMAL(5,2) require less than 8 characters • Encrypted size is 37 bytes for Triple-DES, 43 for AES. • DECIMAL(6,2) requires 8 characters (-1234.56) • Encrypted size is 43 bytes.
Performance Impact of Encryption • Comparing ‘apples to apples’ is hard. • An accurate comparison consists of: • IDS encrypting and decrypting data sent unencrypted by client • versus • Client encrypting and decrypting data sent encrypted to IDS. • Unfortunately, that requires benchmarking an application with cryptography built in. • Can be done, but is fiddly. • So, everybody compares ‘apples to oranges’: • IDS encrypting and decrypting data • versus • IDS not encrypting and decrypting data.
Performance Impact of Encryption • The performance impact of encryption is significant. • It depends on direction: • Encrypting is slower than decrypting. • It does not depend measurably on algorithm: • AES performs at the same speed as Triple-DES. • It does depend on data size: • The relative overhead is less when there is more data to encrypt. • Do not use encryption just because it is sexy. • Use it where there is a demonstrable business or legal need.
Example – Credit Card Numbers • Credit card number plus expiry date • 20 digits + 5 punctuators • “4567-1234-2345-3456□01/99” • Needs 67 characters without hint; 119 with hint. • Repeat these comparisons on your machine! • Mine is an old, small, slow Sun UltraSparc 10: • Solaris 8 • Single CPU at 333 MHz • 256 MB • Single user • No /dev/random or /dev/urandom • Running several IDS servers, Apache, etc.
Example – Credit Card Numbers • Without hints – batch mode processing – 5000 rows. • INSERT INTO NewTableSELECT …, ENCRYPT_TDES(OtherColumn), …FROM OtherTable; • Encryption performance: • 424 µs per row (without encryption) • 3601 µs per row (with Triple-DES encryption) • Ratio: 8.5:1 • Cost: 3200 µs per call. • Even a trivial SPL procedure called in place of encryption levels the playing field a lot.
Example – Credit Card Numbers • Without hints – batch mode processing – 5000 rows. • SELECT …, DECRYPT_CHAR(EncryptedData), …FROM NewTable; • Decryption performance: • 155 µs per row (without decryption) • 1285 µs per row (with Triple-DES decryption) • Ratio: 8.3:1 • Cost: 1100 µs per call. • Decryption costs about 1/3 what encryption costs. • Major component of encryption cost: • Generating cryptographically random number. • Just as well you’ll normally do more decryption than encryption.
Modes of Use • Column-Level Encryption is an enabling technology • You decide how you are going to use it. • Kevin Kenan, ‘Cryptography in the Database’, Symantec Press (Addison-Wesley), 2005. • ISBN 0-321-32073-5 (http://www.kevinkenan.com/). • Peter Wayner, ‘Translucent Databases’, Flyzone Press, 2002. • ISBN 0-12-738671-8 (http://www.wayner.org/books/td/). • Two main modes of operation: • Web mode: • Different keys for each row of data. • MIS mode: • Same key for each row of data.
Web Application • Think of credit card numbers on a web site. • Different keys for each row of data. • Hints are important. • Key management is not a major problem. • You do not store the key (password) for the user. • If the user forgets their password, the data is re-enterable. • Or you get into more fancy schemes: • Encrypt user’s passwords with known key. • But these systems are generally less secure. • SET ENCRYPTION PASSWORD is irrelevant.
MIS Applications • Same key for each row of data • Hints are irrelevant • Key management is critical. • SET ENCRYPTION PASSWORD is critical. • You might be better off coding with a temporary table: • Contains one row of data – the password. • Join with that table when you need encrypted data. • Avoids revealing the password in SQL.
Key Management • IDS does not do any key management. • Keys are handled outside the DBMS. • You can store keys in the DBMS if you want to. • Securing them is your problem. • Probably encrypted with a single high-security password.
Without Changing Applications? • You can do it, • But it is not necessarily a good idea. • Applications do not work as fast as without encryption, • But it will run about as fast as if you rework it with encryption. • So, many people will do it. • The application will use more data space. • But you won’t be using hints. • The easiest approach uses more space than necessary. • And more encryption and decryption operations. • It is a bad idea to use encrypted data as keys. • Does the SSN get used as a key? • Encrypted keys are very much more difficult to handle. • Can you use a hash checksum instead?
Techniques – Schema Changes • Changes to database schema: • Rename all tables containing encrypted data. • Change data types for columns that must be encrypted. • CHAR data type. • Expanded size. • Create views with the old table names: • CREATE VIEW oldname(key, col2, col3) AS SELECT key, DECRYPT_CHAR(col2)::type1, DECRYPT_CHAR(col3)::type2 FROM newname; • The hard part is setting the encryption password!
Techniques – Triggers • Create INSTEAD OF triggers on views. • CREATE TRIGGER ti_on INSTEAD OF INSERT ON oldname REFERENCING NEW AS new FOR EACH ROW (INSERT INTO newname VALUES (new.key, ENCRYPT_AES(new.col2), ENCRYPT_AES(new.col3))); • CREATE TRIGGER tu_on INSTEAD OF UPDATE ON oldname REFERENCING OLD AS old NEW AS new FOR EACH ROW (UPDATE newname SET (key, col2, col3) = (new.key, ENCRYPT_AES(new.col2), ENCRYPT_AES(new.col3)) WHERE key = old.key); • CREATE TRIGGER td_on INSTEAD OF DELETE ON oldname REFERENCING OLD AS old FOR EACH ROW (DELETE FROM newname WHERE key = old.key);
Behind the Scenes • The encrypted data contains: • Which encryption algorithm was used. • A random initialization vector (IV). • The encrypted data. • Optionally, the hint. • The IV ensures randomization: • If the same data is encrypted with the same key, • The encrypted data is different. • Assuming the IV is different (and it ‘always’ is). • Text data is converted to Base-64 encoding. • Binary data is not Base-64 encoded. • More compact.
Behind the Scenes • Session password in shared memory is encrypted. • Constant password in shared memory is encrypted. • “onstat –g sql” will display not constant password. • Encryption VPs • Generating (cryptographically) good random numbers can block. • Define multiple ENCRYPT VPs in ONCONFIG • VPCLASS encrypt,num=3 • Add or drop encryption VPs online • onmode –p +1 encrypt • onmode –p –1 encrypt • If you don’t define encryption VPs and use encryption • One encryption VP is added automatically.
Gotchas • Encrypting BLOB data requires the correct configuration • SBSPACE set in ONCONFIG file • (SYSSBSPACE set in ONCONFIG file, too) • If omitted, the errors returned in the initial release are unhelpful. • B171117 addresses the poor error messages. • Data export and import is tough • Do you export encrypted data? • Import will work fine – same passwords as before. • Platform neutral. • Do you export decrypted data? • Do you have the passwords (MIS mode, yes; Web mode, no?) • You risk exposing data that should be encrypted.
Gotchas • You can mix encrypted and unencrypted data • In a single column: • INSERT INTO SomeTable VALUES(…, ‘abc’, …); • INSERT INTO SomeTable VALUES(…, ENCRYPT_AES(‘abc’), …); • It is usually a very bad idea to do so. • Perhaps one program is unaware of the encrypted data. • Or someone is manually inserting data. • Multi-row select statements with decryption will fail • SELECT …, DECRYPT_CHAR(xyz), … FROM SomeTable; • Rows with unencrypted data generate an error • Web mode applications cannot do this either • Unless you can provide the different password for each row
Things to Avoid • Do not index encrypted columns. • You should not be searching for encrypted values. • Indexed encrypted columns only usefully test for equality. • Use hash function for columns that must be indexed. • There is no built-in function to do it (GUID blade?) • Do not create a functional index on the decrypted data. • In a web-style application, you won’t have the key. • In a MIS-style application, you will be storing the data unencrypted after all. • Avoid using encrypted columns as keys for tables. • Do not ask Tech Support to retrieve your passwords. • It can’t be done!
Session I15 Using Column-Level Encryption in IDS 10.00 Jonathan Leffler IBM Information Management jleffler@us.ibm.com