This paper presents a novel method for establishing a secure communication channel between devices that lack prior association, defending against man-in-the-middle attacks. Using human-assisted authentication, our solution generates vocalizable representations from a hash of a public key, allowing users to verify and pair devices securely. We explore various use cases, highlight the importance of auditorially robust sentences for user verification, and detail our system's implementation and performance. The approach supports multiple communication media and is designed for devices with limited computational resources.
Loud and Clear: Human Verifiable Authentication Based on Audio
Michael Goodrich, Michael Sirivianos, John Solis, Gene Tsudik, Ersin Uzun
Computer Science Department, University of California, Irvine
July 5, 2006 @ ICDCS, Lisbon
Problem Wallet ↔ phone. Goal: Establish a secure channel between devices that lack prior association
Problem Eve can launch a man-in-the-middle attack. Goal: Establish a secure channel between devices that lack prior association
Challenges • Human-assisted authentication • Involve the human user in the process • No pre-established shared secrets • No on-line or off-line authority • No common PKI, TTP, etc. • Support for multiple communication media • e.g., Bluetooth, Infrared, 802.11, etc. • Limited computational resources on portable devices
Outline • Related work and our motivation • Our Solution • System overview • Sample use scenarios • Use types • Vocalizable representations • Unidirectional authentication • Implementation and performance • Conclusions
Related work – Secondary channels • Stajano et al. [Security Protocols '99] • Use a physical link as a secondary authentication channel • Not all devices have suitable interfaces • Balfanz et al. [NDSS '02] • Uses an infrared link as the secondary channel • Still susceptible to man-in-the-middle attacks
Related work – Human verifiable channels • Maher [US Patent, '95] • Users compare a 4-hex-digit truncated hash of the shared key; not enough bits for security • Cagalj et al. and Laur et al. • Commitment-based short authenticated string (SAS) schemes • A 20-bit verification code is sufficient for security • Do not address how the verification code is represented
Related work – Textual representations • Haller [S/KEY, RFC 1760] • Textual representation of cryptographic strings • Pass-phrases neither auditorially robust nor syntactically correct • Juola and Zimmermann [PGPfone, ICSLP '96] • Uses an auditorially robust word list; not syntactically correct, hard for human users to parse
Related work – SiB • McCune et al. [Oakland '05] • Seeing is Believing (SiB) • Uses camera phones and bar codes to create a visual secondary channel • Human-readable visual hashes • Cumbersome task • High error rate • The visual channel is not always plausible
Motivation • Many personal devices not equipped with cameras • Cameras unsuitable for visually-impaired users • Bar-code scanning requires ample light and sufficient proximity between devices • Camera-equipped devices typically prohibited in high-security areas
Outline • Related work and our motivation • Our Solution • System overview • Sample use scenarios • Use types • Vocalizable representations • Unidirectional authentication • Implementation and performance • Conclusions
Loud and Clear • Audio channel for human-assisted authentication of un-associated devices • Derive a robust-sounding, syntactically-correct sentence from a hash of a public key • Vocalize the sentence • L&C couples vocalization and/or display of the public authentication object on two devices • Suitable for secure device pairing
Sample use scenarios Personal Device ↔ Target Device • Cell phone: speaker & small display • Handheld/PDA: speaker & display • Smart Watch: tiny speaker & tiny display • MP3 player: audio out & no display • Printer or FAX: speaker & small display • Base Station: no speaker & no display • Mutual authentication possibly required
L&C use types TYPE 1: Hear and compare two audible sequences, one from each device TYPE 2: Hear audible sequence from target device, compare to text displayed by personal device TYPE 3: Hear audible sequence from personal device, compare it to text displayed by target device TYPE 4: Compare text displayed on each device
Device requirements Device requirements for various use types
Vocalizable representations Represent the authentication object as a syntactically-correct, auditorially robust sentence • Generate a nonsensical, English-like sentence (MadLib) from the output of a one-way hash • S/KEY-based word generation • Divide the truncated hash into 10-bit sections • Use each 10-bit section as an index into a catalogue • One catalogue for each part of speech, e.g., verb, noun, etc. • Number of 10-bit sections = number of words contributing entropy to the MadLib sentence
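The hash-to-words step above can be sketched as follows. The tiny catalogues, the choice of SHA-1, and the key encoding are assumptions for this demo; the real system uses one catalogue of 2^10 = 1024 auditorially robust words per part of speech.

```python
import hashlib

# Hypothetical mini-catalogues, one per part of speech (demo-sized;
# the real system uses 1024-entry catalogues so a full 10-bit index fits).
CATALOGUES = [
    ["CALLIE", "ALICE", "BOB", "EVE"],               # proper nouns
    ["FLEXIBLY", "LOUDLY", "CLEARLY", "OFTEN"],      # adverbs
    ["owns", "sees", "hears", "makes"],              # verbs
    ["FLUFFY", "TINY", "LOUD", "CLEAR"],             # adjectives
    ["BINTURONGs", "PARROTs", "OTTERs", "FERRETs"],  # nouns
]
BITS_PER_WORD = 10

def madlib_words(public_key: bytes, num_words: int) -> list:
    """Derive one catalogue index per 10-bit section of the hashed key."""
    digest = hashlib.sha1(public_key).digest()
    bits = int.from_bytes(digest, "big")
    total_bits = len(digest) * 8
    words = []
    for i in range(num_words):
        # Take the i-th 10-bit section, most significant first.
        shift = total_bits - (i + 1) * BITS_PER_WORD
        index = (bits >> shift) & ((1 << BITS_PER_WORD) - 1)
        catalogue = CATALOGUES[i % len(CATALOGUES)]
        # Modulo only because the demo catalogues are smaller than 2^10.
        words.append(catalogue[index % len(catalogue)])
    return words
```

Both devices run this deterministically on the same public key, so matching keys always yield the same spoken sentence.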
Vocalizable representations Within a catalogue, no two words sound the same • Create auditorially robust word lists for each catalogue, based on PGPfone's phonetic distance Second pre-image resistance • For ephemeral Diffie-Hellman key agreement: 5 S/KEY-generated words needed • For one-year-term Diffie-Hellman public keys: 8 S/KEY-generated words needed Example: CALLIE FLEXIBLY owns FLUFFY BINTURONGs that ABUSE.
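A quick arithmetic check of those word counts, assuming (per the 10-bit sections above) that each S/KEY-generated word encodes 10 hash bits, so 5 words cover roughly 50 bits and 8 words roughly 80 bits of second pre-image resistance:

```python
BITS_PER_WORD = 10  # each S/KEY-generated word indexes a 2^10-entry catalogue

def words_needed(security_bits: int) -> int:
    """Catalogue words needed to encode the given security level (ceiling)."""
    return -(-security_bits // BITS_PER_WORD)

print(words_needed(50))  # ephemeral DH keys -> 5 words
print(words_needed(80))  # one-year-term DH keys -> 8 words
```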
Auditorially robust word lists Using PGPfone's phonetic distance, create auditorially robust word lists, unique for each catalogue. Construct a large set C of candidate words. Select a random subset W of 2^k words from C, where k is the number of hash bits we wish this type of word to represent. Repeatedly find the phonetically closest pair (p, q) of words in W and replace q with a word from C − W whose distance to any word in W is more than distance(p, q), if such a word exists.
Auditorially robust word lists (2) Order W so that each pair of consecutive words in W is as distant as possible. Assign integer values to the words in W so that consecutive integers differ in exactly one bit but their respective code words are phonetically distant.
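The selection loop on the previous slide can be sketched like this, with plain Levenshtein edit distance standing in for PGPfone's phonetic distance (an assumption; the real metric is phonetic, not orthographic):

```python
import random

def levenshtein(a: str, b: str) -> int:
    """Stand-in distance for this sketch: standard edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def robust_wordlist(candidates, k, distance=levenshtein, rounds=50):
    """Greedily select 2^k mutually distant words from `candidates`."""
    pool = list(candidates)
    random.shuffle(pool)
    W, rest = pool[:2 ** k], pool[2 ** k:]
    for _ in range(rounds):
        # Find the closest pair (p, q) currently in W.
        p, q = min(((a, b) for i, a in enumerate(W) for b in W[i + 1:]),
                   key=lambda pair: distance(*pair))
        d_pq = distance(p, q)
        # Replace q with a candidate farther than d_pq from every word in W.
        better = next((c for c in rest
                       if all(distance(c, w) > d_pq for w in W if w != q)),
                      None)
        if better is None:
            break  # no improving candidate left
        W[W.index(q)] = better
        rest.remove(better)
    return W
```

The follow-up ordering and Gray-code-style integer assignment are omitted here; this shows only the closest-pair replacement loop.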
Unidirectional authentication • Step 1: • PDA and fax send to each other their Diffie-Hellman public keys
Unidirectional authentication • Step 2: • PDA and fax compute the MadLib for fax’s public key
Unidirectional authentication • Step 3: • Alice instructs PDA and fax to speak the MadLib out loud
Unidirectional authentication • Step 4: • Alice compares the MadLibs
Unidirectional authentication • Step 5: • Alice instructs the devices to compute the secret key
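The five steps above can be sketched end-to-end. The toy DH parameters and the hex digest standing in for the spoken MadLib are assumptions for illustration only; real use needs proper group sizes and the catalogue-based vocalization.

```python
import hashlib
import secrets

P = 0xFFFFFFFB  # toy 32-bit prime, NOT secure; illustration only
G = 5

def dh_keypair():
    """Generate a Diffie-Hellman private/public pair in the toy group."""
    x = secrets.randbelow(P - 2) + 1
    return x, pow(G, x, P)

def madlib_for(pub: int) -> str:
    """Stand-in vocalization: truncated hash hex instead of real word lookup."""
    return hashlib.sha1(pub.to_bytes(8, "big")).hexdigest()[:10]

# Step 1: PDA and fax exchange Diffie-Hellman public keys.
pda_priv, pda_pub = dh_keypair()
fax_priv, fax_pub = dh_keypair()

# Step 2: both devices compute the MadLib for the fax's public key.
pda_view = madlib_for(fax_pub)
fax_view = madlib_for(fax_pub)

# Steps 3-4: both devices speak their MadLibs; Alice compares them.
assert pda_view == fax_view  # a mismatch would signal a man-in-the-middle

# Step 5: on a match, Alice instructs both sides to derive the shared secret.
assert pow(fax_pub, pda_priv, P) == pow(pda_pub, fax_priv, P)
```

On a real device the digest would instead index the word catalogues and be spoken by the TTS engine; a mismatch at the comparison step indicates an active attack and aborts the pairing.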
Implementation Programming system • Built on the highly portable Ewe Java VM Text-to-Speech engine • Can utilize a variety of portable TTS engines • Prototype uses Digit for PC and Pocket PC, which uses the Elan Speech Engine • Porting Sun's pure-Java FreeTTS and JSAPI to Ewe
Implementation Crypto API • Ported the Bouncy Castle lightweight crypto package to implement DH- and RSA-based key agreement Memory utilization • Digit and the Ewe program occupy ~10,800 KB
Performance Processing times (in milliseconds) of L&C operations • PC: 1.7 GHz / 2 MB, 512 MB RAM • iPAQ: 206 MHz, 32 MB RAM • 10-word MadLib, 7 of which S/KEY-generated
Performance Excluding initialization and shared secret computation: • ~12 secs for a TYPE 1 unidirectional session • ~7 secs for a TYPE 2 unidirectional session With a commitment-based SAS protocol: • Number of S/KEY-generated words can be reduced to only 2 • ~6 secs for a TYPE 1 unidirectional session
Conclusions • Loud-and-Clear (L&C) for human-assisted device authentication • Light burden for the human user • Based on the audio channel • Uses a TTS engine to vocalize a robust-sounding, syntactically-correct word sequence derived from an authentication object • Discussed some anticipated use cases • Provided experimental results for a prototype • Formal and comprehensive usability studies in progress
In case you wonder … Binturong • a.k.a. Bearcat • Lives in the forest canopies of Southeast Asia: Borneo, Vietnam, Malaysia, Indonesia • Belongs to the Viverridae family • Endangered
Thank you Latest paper version available at: www.ics.uci.edu/~msirivia/publications/icdcs.pdf Questions?