
Merkle Tree Traversal in Log Space & Time

Merkle Tree Traversal in Log Space & Time. Michael Szydlo, RSA Eurocrypt 2004 May 6, 2004. Presentation overview. Review of Merkle Authentication Trees Define the Traversal Problem Describe classic traversal technique


Presentation Transcript


  1. Merkle Tree Traversal in Log Space & Time • Michael Szydlo, RSA • Eurocrypt 2004, May 6, 2004

  2. Presentation overview • Review of Merkle Authentication Trees • Define the Traversal Problem • Describe classic traversal technique • Present new, space-efficient algorithm • Concluding comments

  3. Merkle trees • Introduced by Ralph Merkle, 1979 • “Classic” cryptographic construction • Combines hash functions on a binary tree structure • A public-key authentication scheme • Uses only one-way hash functions as building blocks • No number theory or trapdoor permutations • Also yields public-key signatures (via Lamport’s one-time signatures) • Theoretical and practical contexts • Receives less practical attention today due to number-theoretic alternatives (e.g., RSA, DSA) • Not terribly inefficient. No number theory: an advantage? • Our contribution • Re-examine efficiency aspects of the construction • New algorithm that answers an “old question” about Merkle trees

  4. Merkle tree data structure • Binary tree; each node is assigned a value (e.g., 160 bits) • An extra secret value s_i is associated with each leaf • Interior nodes: v = Hash(v_left || v_right) • Leaves: v_i = Hash(s_i) • [Figure: binary tree of node values, with the secrets s_i beneath the leaves]

  5. A public/private key pair • How to generate a key pair • Select a random (e.g., 160-bit) secret S • Derive leaf secrets s_i = PRF(S || i) • Use the hash function to get leaf and interior node values • Publish the root value as P • Key generation has a cost • A tree of height H has N = 2^H leaves • A node at height h depends on 2^h leaf values • Obtaining P requires calculating all N leaf values plus 2^H - 1 more hash function evaluations
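The key-generation steps on this slide can be sketched as follows. This is a minimal illustration, not the presenter's implementation: SHA-256 stands in for the generic Hash, HMAC-SHA256 keyed with S stands in for the PRF, and the function names are hypothetical.

```python
import hashlib
import hmac

def hash_fn(data: bytes) -> bytes:
    # SHA-256 is an assumed stand-in for the slide's generic "Hash".
    return hashlib.sha256(data).digest()

def leaf_secret(master: bytes, i: int) -> bytes:
    # s_i = PRF(S || i); HMAC-SHA256 keyed with S is an assumed choice of PRF.
    return hmac.new(master, i.to_bytes(8, "big"), hashlib.sha256).digest()

def generate_public_key(master: bytes, height: int):
    """Return (root P, number of hash evaluations) for a tree with 2**height leaves."""
    level = [hash_fn(leaf_secret(master, i)) for i in range(1 << height)]
    evaluations = len(level)              # the N leaf values
    while len(level) > 1:
        # Hash adjacent pairs to obtain the next level up.
        level = [hash_fn(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
        evaluations += len(level)         # interior nodes on this level
    return level[0], evaluations

root, cost = generate_public_key(b"\x01" * 20, height=4)
print(root.hex(), cost)
```

For a tree with N leaves the counter ends at 2N - 1, so the cost of producing P grows linearly in the number of leaves.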

  6. Authenticating a secret • Prover wishes to reveal s_i to identify herself • Prover sends i, s_i (each secret is used just once) • Additional data required: “sibling node” values • Verifier checks s_i against the public key P • First hash s_i • Hash the result together with its sibling in the tree • Repeat, moving up the tree • Compare the result with the root

  7. Sibling node values required • [Figure: path from the secret s_0 up to the root, with each Hash step (H) and the required sibling nodes marked] • Root value is public • Sibling nodes are required to authenticate the secret • Verify the secret value by hashing, then hashing together with the sibling, etc. • Accept if the result matches the root value
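The verification walk on the last two slides can be sketched as follows. It is an illustration under assumptions: SHA-256 stands in for Hash, the helper names are hypothetical, and the prover here simply stores every node, which is exactly the costly approach the traversal problem sets out to avoid.

```python
import hashlib

def hash_fn(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()   # assumed stand-in for "Hash"

def build_levels(secrets):
    """levels[0] holds the leaf values Hash(s_i); levels[-1] holds the root."""
    levels = [[hash_fn(s) for s in secrets]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([hash_fn(prev[j] + prev[j + 1]) for j in range(0, len(prev), 2)])
    return levels

def auth_path(levels, i):
    """Sibling value at every height below the root for leaf i."""
    return [levels[h][(i >> h) ^ 1] for h in range(len(levels) - 1)]

def verify(i, secret, siblings, root):
    value = hash_fn(secret)                # hash the revealed secret first
    for h, sibling in enumerate(siblings):
        if (i >> h) & 1:                   # current node is a right child
            value = hash_fn(sibling + value)
        else:                              # current node is a left child
            value = hash_fn(value + sibling)
    return value == root                   # accept only on a match with P

toy_secrets = [bytes([i]) * 20 for i in range(8)]   # toy values, not random
levels = build_levels(toy_secrets)
root = levels[-1][0]
print(verify(5, toy_secrets[5], auth_path(levels, 5), root))
```

The verifier needs only i, s_i, one sibling per height, and the public root, so the data sent per authentication is logarithmic in N.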

  8. Digital signatures, too • Authentication uses up 1 leaf each time • Digital signature: uses multiple leaves • Extends Lamport’s one-time signature scheme • Want to sign m = (m_0, m_1, ..., m_159) • Requires 160 pairs of secrets {s_i, t_i} • s_i is included in the signature if m_i = 0; otherwise t_i is • Verification requires sibling nodes, as above • Merkle construction provides signatures • Security is intuitive; how about efficiency?
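The one-time signature rule on this slide (reveal s_i when m_i = 0, otherwise t_i) can be sketched on its own. This is a simplified illustration with assumed choices: SHA-256 as the hash, a truncated SHA-256 digest as the 160-bit message, and a public key that lists the hashed secrets directly. In the Merkle construction those hashed values would instead be authenticated with sibling nodes; that part is omitted here.

```python
import hashlib
import secrets

BITS = 160  # message length used on the slide

def hash_fn(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()   # assumed stand-in for "Hash"

def keygen():
    # One pair of secrets (s_i, t_i) per message bit.
    sk = [(secrets.token_bytes(20), secrets.token_bytes(20)) for _ in range(BITS)]
    pk = [(hash_fn(s), hash_fn(t)) for s, t in sk]
    return sk, pk

def message_bits(message: bytes):
    # Assumption: the 160 bits m_0..m_159 are a truncated SHA-256 digest.
    digest = hash_fn(message)[:BITS // 8]
    return [(digest[k // 8] >> (7 - k % 8)) & 1 for k in range(BITS)]

def sign(sk, message: bytes):
    # Reveal s_i if m_i = 0, otherwise t_i.
    return [pair[bit] for pair, bit in zip(sk, message_bits(message))]

def verify(pk, message: bytes, signature) -> bool:
    bits = message_bits(message)
    return all(hash_fn(revealed) == pair[bit]
               for revealed, pair, bit in zip(signature, pk, bits))

sk, pk = keygen()
sig = sign(sk, b"hello")
print(verify(pk, b"hello", sig), verify(pk, b"other", sig))
```

Each key pair must sign only one message, because a second signature would reveal additional secrets from the same pairs.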

  9. Efficiency questions • Tacit assumption: all node values are saved • A useful Merkle tree has many leaves! • E.g., N = 2^30 allows many authentications / signatures • Not practical for a weak prover! • Store all node values? Too much space! • N = 2^H leaves, N - 1 interior nodes • Recalculate from scratch? Too much time! • An interior node near the top requires 2^H - 1 Hash operations
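A rough worked example of both extremes for the N = 2^30 case, assuming 160-bit (20-byte) node values as on the data-structure slide and reading the cost of the highest sibling node (height H - 1) as 2^H - 1 hash operations:

```latex
\[
\text{Store everything: } \underbrace{2^{30}}_{\text{leaves}} + \underbrace{2^{30}-1}_{\text{interior nodes}}
  = 2^{31}-1 \text{ values}, \qquad
  (2^{31}-1)\cdot 20 \text{ bytes} \approx 4.3\times 10^{10} \text{ bytes} \approx 40\ \text{GiB}.
\]
\[
\text{Recompute on demand: } 2^{H}-1 = 2^{30}-1 \approx 1.07\times 10^{9}
  \text{ hash operations for a single node near the top.}
\]
```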

  10. The traversal problem • Formulate efficient Prover algorithm. • Must output authentication data for each leaf, in sequence: (on round i, si with associated sibling nodes) • Prover has limited memory • Prover should compute few Hash values per round • Metrics • Space: 1 Unit = 1 stored node value • Time: 1 Unit = 1 leaf calc. or 1 interior node calc. • Note - this analysis fixes the security parameter.

  11. Traversal challenge • Higher node: used for 2^20 rounds, costs ~2^21 • Lower node: used for 2^5 rounds, costs ~2^6 • (Note: the ‘per round’ cost is < 2)
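The per-round figure follows from counting the work for one node at height h, using the time units defined on the previous slide (one unit per leaf calculation or interior node calculation):

```latex
\[
\mathrm{cost}(h) = \underbrace{2^{h}}_{\text{leaf calculations}} + \underbrace{2^{h}-1}_{\text{interior hashes}} = 2^{h+1}-1,
\qquad
\frac{\mathrm{cost}(h)}{2^{h}\ \text{rounds}} = 2 - 2^{-h} < 2 .
\]
```

For h = 20 this gives 2^21 - 1 units spread over 2^20 rounds, and for h = 5 it gives 63 units over 32 rounds.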

  12. Merkle’s amortization technique • Uses space-efficient node computation • Costly nodes are computed over many rounds • Form of the algorithm, on each round: • Output s_i with sibling values • Discard “expired” sibling values • For each height, work on preparing the “upcoming” sibling • Upcoming values should be ready on time • Merkle’s result for a tree with N = 2^H leaves • O(log(N)) = O(H) time per round • Space bounded by O(log^2(N)) = O(H^2)

  13. TREEHASH • Calculates a height-h node using space h+1 • Simply erase values that are no longer required • Adding a leaf or an internal node is 1 “unit” of work • The evolving set of stored nodes is called the tail nodes • Example with h = 3
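A minimal sketch of a stack-based TREEHASH, matching the description above: push leaves one at a time and merge whenever the top two stored values have the same height. SHA-256 and the `leaf_value` helper are assumptions for illustration; the slides derive leaves from the secrets s_i.

```python
import hashlib

def hash_fn(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()   # assumed stand-in for "Hash"

def leaf_value(i: int) -> bytes:
    # Hypothetical leaf calculation used only for this illustration.
    return hash_fn(b"leaf" + i.to_bytes(4, "big"))

def treehash(start: int, h: int):
    """Compute the height-h node over leaves start .. start + 2**h - 1.

    Returns (node value, peak number of stored values, units of work)."""
    stack = []                 # the "tail": (height, value) pairs
    peak = units = 0
    for i in range(start, start + (1 << h)):
        stack.append((0, leaf_value(i)))          # one unit: leaf calculation
        units += 1
        peak = max(peak, len(stack))
        # Merge while the two topmost values have equal height.
        while len(stack) >= 2 and stack[-1][0] == stack[-2][0]:
            height, right = stack.pop()
            _, left = stack.pop()
            stack.append((height + 1, hash_fn(left + right)))  # one unit: hash
            units += 1
    return stack[0][1], peak, units

node, peak, units = treehash(0, 3)
print(peak, units)   # 4 15
```

For h = 3 the stack never holds more than h + 1 = 4 values, and the total work is 2^(h+1) - 1 = 15 units.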

  14. Merkle’s amortization (2) • Prover’s initial internal state • Contains Current and Next sibling value for each height h<H • Prover’s internal state (later points) • Contains Current sibling value for each height h<H • For each height, contains Next sibling, OR a partial TREEHASH computation for Next. • Per-round update procedure • Output leaf secret and Current sibling nodes • Discard “expired” sibling nodes, promote Next to Current • Spend maximum 2 units of work towards the TREEHASH procedure for each height

  15. Merkle’s amortization (3) • Nodes are ready on time • 2 units per round is enough • The cost of 2^{h+1} is spread over 2^h rounds • Time per round is linear in the tree height • O(log(N)) = O(H) time per round • Total space is quadratic in the tree height • At each height a TREEHASH may be in progress • Space for TREEHASH < 1 + 2 + 3 + ... + H • Space bound: O(log^2(N)) = O(H^2)
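Written out, the two bounds come from summing over the H heights, taking at most 2 units of work per height per round and a TREEHASH tail of at most h + 1 values at height h (the constants are a sketch of the counting, not a tight analysis):

```latex
\[
\text{Time per round} \le \sum_{h=0}^{H-1} 2 = 2H = O(\log N),
\]
\[
\text{Space} \le \underbrace{H}_{\text{current siblings}} + \sum_{h=0}^{H-1} (h+1)
  = H + \frac{H(H+1)}{2} = O(H^{2}) = O(\log^{2} N).
\]
```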

  16. Recap of classic traversal • Merkle’s Solution indeed satisfactory • Medium / Large Merkle trees practical • Less efficient than number theory approaches • Security properties transparent • No random oracles, etc • Conjecture classic traversal is “optimal”?

  17. Related work • Time-space trade-off, RSA’03 • Jakobsson, Micali, Leighton, Szydlo • Idea: use “sub trees” of height T • Speeds up the Prover by a factor of T • Increases space by a factor of 2T

  18. This work • New traversal algorithm • Still O(log(N)) time • Space required reduced to O(log(N)) • This is optimal in the following sense • Space of at least O(log(N)) is needed (easy to see) • No traversal algorithm has both time < O(log(N)) and space = O(log(N)) • Proof in paper

  19. Motivation for improvement • Tails of concurrent TREEHASH computations • Graphic reminder of why space is O(log^2(N)) • [Figure: stacked tails; the tail at height h holds up to h+1 values, the next up to h tail pebbles, the next up to h-1 tail pebbles] • Many tails contain pebbles at the same height. Can this be avoided?

  20. Wasteful concurrent computation • Example: two TREEHASH instances • Each must compute a node value at height 3 as a sub-goal • Assume they start at the same time • Classic traversal: 2 units of work to each • Maximum space 4 + 4 = 8 • Re-allocate the 4 units per round • Complete the first, then do the second • Maximum space 1 + 4 = 5 • Rescheduling saves space, and nodes are still completed on time • Look for a scheduling algorithm that avoids such concurrent node computations
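The 8-versus-5 comparison can be checked with a small simulation. The sketch below tracks only the heights of the stored values in each TREEHASH instance (no real hashing), and the class and schedule names are hypothetical.

```python
class TreeHashPebbles:
    """Tracks only the heights of the values stored by one TREEHASH instance."""

    def __init__(self, target: int):
        self.target = target
        self.heights = []

    def done(self) -> bool:
        return self.heights == [self.target]

    def step(self) -> None:
        """Spend one unit of work: merge two equal-height values or add a leaf."""
        hs = self.heights
        if len(hs) >= 2 and hs[-1] == hs[-2]:
            top = hs.pop()
            hs[-1] = top + 1
        else:
            hs.append(0)

def classic(a, b):
    # 2 units of work to each instance per round.
    return [a, a, b, b]

def sequential(a, b):
    # All 4 units go to the first instance until it is done, then to the second.
    first = b if a.done() else a
    return [first] * 4

def peak_space(schedule) -> int:
    a, b = TreeHashPebbles(3), TreeHashPebbles(3)
    peak = 0
    while not (a.done() and b.done()):
        for inst in schedule(a, b):          # 4 units of work per round
            if not inst.done():
                inst.step()
                peak = max(peak, len(a.heights) + len(b.heights))
    return peak

print(peak_space(classic), peak_space(sequential))   # 8 5
```

Both schedules spend the same 4 units per round; only the order of the work changes, and that alone lowers the peak from 8 stored values to 5.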

  21. New algorithm: “Zipping” up the tails • Apply the budget to meet two kinds of requirements • Avoid working on height-h nodes from different tails • Ensure completion of nodes with a short deadline • Solution: this compromise algorithm satisfies both • Focus computational attention on nodes with the shortest deadline • Delay beginning a new height-h node until the other TREEHASH instances are partially completed, with no tail nodes below height h • So we zip up the tails before diverting attention • Essentially rigging it to have fewer tail nodes • What is the effect of this rescheduling? • Question 1: Are the nodes completed on time? • Question 2: How much space do you need now?

  22. Nodes completed on time • Informal justification • For a node at height h, the delay is < 2^{h+1} • This is only 2 per round over a period of 2^h rounds • Long time to recover from the delay • Formal proof involves computation • Fix any period of 2^h rounds • Identify all “deadlines” and the maximum delay • Tabulate the total required computation units • This is less than the total budget over the period • Experimental verification (via implementation) • Algorithm works in time 2 log(N) per round

  23. Less space is used • Easy to see why space is O(log(N)) • At each height at most 4 values are stored • Exactly one current sibling value • At most 1 completed next sibling value • At most 2 tail values • Total space required: 3 log(N) • Tail pebbles occur only when a sibling is incomplete

  24. Result of new algorithm • Traversal of a Merkle tree with N leaves • Space bounded by 3 log(N) • [ node storage units ] • Time is 2 log(N) • [ leaf calc units, hash evaluation units ] • Answers classic Merkle traversal problem. • Asymptotically optimal
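The scheduling rule from the preceding slides can be sketched as follows: one TREEHASH instance per height, a budget of 2H - 1 units per round, and each unit given to the instance whose lowest tail node is lowest, with ties going to the lower target height. This is an illustrative reconstruction, not the presenter's code: SHA-256, `leaf_value`, the start-index formula, and the use of a full recursive computation to set up the initial state are all assumptions made for the sketch.

```python
import hashlib
from math import inf

def hash_fn(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()        # assumed stand-in for "Hash"

def leaf_value(i: int) -> bytes:
    # Hypothetical leaf calculation (the slides use Hash(s_i)).
    return hash_fn(b"leaf" + i.to_bytes(4, "big"))

def node_value(h: int, idx: int) -> bytes:
    """Reference value of node idx at height h (key generation and checks only)."""
    if h == 0:
        return leaf_value(idx)
    return hash_fn(node_value(h - 1, 2 * idx) + node_value(h - 1, 2 * idx + 1))

class TreeHash:
    """One TREEHASH instance that builds a single node of height h."""

    def __init__(self, h, start=None, ready=None):
        self.h = h
        self.next_leaf = start                  # None: nothing left to compute
        self.stack = [] if ready is None else [(h, ready)]

    def done(self):
        return len(self.stack) == 1 and self.stack[0][0] == self.h

    def low(self):
        """Height of the lowest tail node; inf when idle or complete."""
        if self.done() or self.next_leaf is None:
            return inf
        return self.stack[-1][0] if self.stack else self.h

    def update(self):
        """One unit of work: hash two equal-height values or compute a leaf."""
        s = self.stack
        if len(s) >= 2 and s[-1][0] == s[-2][0]:
            height, right = s.pop()
            _, left = s.pop()
            s.append((height + 1, hash_fn(left + right)))
        else:
            s.append((0, leaf_value(self.next_leaf)))
            self.next_leaf += 1

def traverse(height: int):
    """Yield (leaf index, sibling values) for every leaf, in order."""
    n = 1 << height
    auth = [node_value(h, 1) for h in range(height)]           # current siblings
    stacks = [TreeHash(h, ready=node_value(h, 0)) for h in range(height)]
    for i in range(n):
        yield i, list(auth)
        if i + 1 == n:
            break
        for h in range(height):
            if (i + 1) % (1 << h) == 0:         # sibling at height h expires
                assert stacks[h].done(), "next sibling was not ready on time"
                auth[h] = stacks[h].stack[0][1]
                start = (i + 1 + (1 << h)) ^ (1 << h)
                stacks[h] = TreeHash(h, start if start < n else None)
        for _ in range(2 * height - 1):         # per-round budget
            focus = min(stacks, key=TreeHash.low)   # lowest tail, ties to lower h
            if focus.low() == inf:
                break
            focus.update()

def root_from_path(i, leaf, siblings):
    value = leaf
    for h, sibling in enumerate(siblings):
        value = hash_fn(sibling + value) if (i >> h) & 1 else hash_fn(value + sibling)
    return value

H = 5
root = node_value(H, 0)
print(all(root_from_path(i, leaf_value(i), path) == root for i, path in traverse(H)))
```

The check at the end confirms that every round's output authenticates its leaf against the root; the internal assertion would fire if a next sibling were ever not finished by its deadline.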

  25. Improved constants? • The constants are not optimal • Example: retain left nodes to halve the time • Manuscript on webpage rsasecurity.com / szydlo.com • Can the technique be combined with JMLS’03? • The main focus there was to increase speed, at a space cost • Zipping technique still always saves some space

  26. Practical ramifications • Merkle authentication & signatures more feasible on space constrained devices • Easy relationship between tree size and speed • Speed up if smaller tree size acceptable • Possible bonus for longer term assurance • hedge against number theory breakthrough

  27. Conclusions • Merkle Trees - interesting after 25 years. • Viable for practical applications? • Need not be only a theoretical construction • More efficient than widely believed. • Further directions • Use as a tool in larger crypto protocols • Improve constants • good implementations, compare speed to RSA • What else can we do without number theory based cryptography?
