One-Shot Snapshots in Space

Helmi, Higham, Pacheco and Woelfel (Best paper, PODC 2011)

Some Space Bounds (# Registers)

Getting Below the Obvious
Shared array R[1,..,n/2]
getTSp:
  sum = 0
  for i = 1 to n/2
    if i == … then R[i] = R[i]+1
    sum = sum + R[i]
  return sum
• The value of each R[i] changes 0 → 1 → 2 (it could change 0 → 1 only, but never 2 → 1), i.e., the value of each R[i] is monotone
• sum is also monotone
• Timestamps increase with time
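A minimal sequential sketch of the counter scheme on this slide, in Python. The condition that selects which register a process increments is not legible in the source, so the mapping used here (processes 2k+1 and 2k+2 share register k) is an assumption, as are the names PairedCounterTimestamps and get_ts. The sketch runs one call at a time and does not model concurrent interleavings.

```python
class PairedCounterTimestamps:
    """Sequential sketch: n processes share n/2 counters, two per register."""

    def __init__(self, n):
        if n % 2 != 0:
            raise ValueError("sketch assumes an even number of processes")
        self.R = [0] * (n // 2)   # all registers start at 0
        self.used = set()         # processes that already took a timestamp

    def get_ts(self, p):
        """One-shot getTS for process p in 1..n."""
        if p in self.used:
            raise RuntimeError("one-shot: each process calls get_ts once")
        self.used.add(p)
        own = (p - 1) // 2        # ASSUMED mapping: two processes per register
        total = 0
        for i in range(len(self.R)):
            if i == own:
                self.R[i] += 1    # each register only ever goes 0 -> 1 -> 2
            total += self.R[i]
        return total


ts = PairedCounterTimestamps(6)
stamps = [ts.get_ts(p) for p in (3, 1, 6, 2, 5, 4)]
print(stamps)   # strictly increasing, since every call raises the sum by one
```

Because every register is monotone, the sum is monotone too, which is the property the slide relies on.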

Getting to Space
• Pack processes in each entry
• Entry k used only after k “rounds”
• A timestamp is a pair:
  • round in which it was taken
  • turn, which orders timestamps within a round
• Timestamps compared by lexicographic order
Shared array R[1,..,m]
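The lexicographic comparison of (round, turn) pairs can be illustrated with a short Python sketch. Representing the turn as an integer is an assumption made only for illustration (the slide does not say how a turn is encoded), and the names Timestamp and earlier are hypothetical.

```python
from typing import NamedTuple


class Timestamp(NamedTuple):
    rnd: int    # round in which the timestamp was taken
    turn: int   # ASSUMED integer encoding of the order within a round


def earlier(a: Timestamp, b: Timestamp) -> bool:
    """True if a precedes b: compare rounds first, then turns."""
    return a < b   # NamedTuple inherits tuple's lexicographic comparison


print(earlier(Timestamp(1, 5), Timestamp(2, 0)))   # round decides
print(earlier(Timestamp(2, 0), Timestamp(2, 3)))   # same round, turn decides
```

A later round always wins regardless of the turn, so the turn only has to order timestamps taken in the same round.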

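Data Structures
• A shared array R[1,..,m] whose entries are either ⊥ or a record with fields seq (a sequence of process ids) and rnd
• Each process also has:
  • a local copy r[1,..,m]
  • two indexes, myrnd and j
[Figure: the arrays R and r with the indexes myrnd and j, each shown with value 1]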

The Algorithm (for Process p)
  while R[j] ≠ ⊥
    r[j] = R[j]; j++
  myrnd = j-1
  for j = 1 to myrnd-1
    if R[myrnd+1] ≠ ⊥ return …
    if r[myrnd].seq[j] == last(R[j].seq)
      R[j] = …
    if R[j].rnd < myrnd
      R[j] = …
  r[1..m] = DoubleCollect(R[1..m])
  if R[myrnd+1] == ⊥
    R[myrnd+1] = …
  return …
• DoubleCollect: repeatedly read R[], until two equal collects
[Slide annotations on written and returned values: (z), (_,p), (r,p,q)]
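The DoubleCollect step ("repeatedly read R[], until two equal collects") can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: the function name double_collect and the read callback are hypothetical, and registers are modelled as a plain list accessed through that callback.

```python
def double_collect(read, m):
    """Collect registers 0..m-1 repeatedly until two consecutive collects agree.

    `read(i)` is an ASSUMED callback returning the current value of register i.
    """
    prev = [read(i) for i in range(m)]
    while True:
        cur = [read(i) for i in range(m)]
        if cur == prev:
            return cur          # no change observed between the two collects
        prev = cur              # something changed, so collect again


registers = [None, "a", None]   # None stands in for the empty value
view = double_collect(lambda i: registers[i], len(registers))
print(view)
```

Two equal collects identify a consistent view only when a register never returns to a value it held earlier; the slides note that the values written to last(R[j].seq) are distinct.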

Step Complexity
• Loops over R: O(m) iterations, O(1) steps / iteration, hence O(m) steps
• Writes: ≤ m writes / getTS, hence O(mn) writes total
• DoubleCollect: O(m) steps / iteration, O(mn) iterations, hence O(m²n) steps
• Altogether O(m²n) = O(n²) steps, since m = O(√n)
[Algorithm listing repeated on this slide, annotated with the bounds above]

Basic Properties
• Once R[j] ≠ ⊥ it stays non-⊥
• If R[j] ≠ ⊥ then ∀ i < j, R[i] ≠ ⊥
• The values written to last(R[j].seq) are distinct
• If getTS returns (j,_) then R[j] ≠ ⊥
[Algorithm listing repeated on this slide]

Timestamp Property
• Once R[j] ≠ ⊥ it stays non-⊥
• If R[j] ≠ ⊥ then ∀ i < j, R[i] ≠ ⊥
• The values written to last(R[j].seq) are distinct
• If getTS returns (j,_) then R[j] ≠ ⊥
Assume processes don’t fall off the array
• getTSp precedes getTSq
• getTSp returns … and getTSq returns …
• By the basic facts (and many subcases), …
• So assume …

Timestamp Property
[Algorithm listing repeated on this slide, annotated with labels A, B, C]

Space Complexity Proof: Phases
Partition the execution into phases
• Phase 0 starts at the beginning of the execution
• Phase … starts when the double collect of the first process with myrnd = … is linearized
• Phase … completes when phase … starts
[Figure: timeline of phase 0, phase 1, and later phases]

Additional Basic Properties
• R[j] is changed to non-⊥ only in Z
• getTSp executes Z in phase myrndp+1
• getTSp executes W in phase myrndp
[Algorithm listing repeated on this slide, annotated with labels W, X, Y, Z]

Space Complexity Proof: Overview
• R[j] is changed to non-⊥ only in Z
• getTSp executes Z in phase myrndp+1
• getTSp executes W in phase myrndp
• The first write to R[j] in a phase is an invalidation
• Exactly … invalidations if phase … completes
• We prove that #invalidations …
• Exactly … registers R[1],…,R[…] are written in phase …
[Figure: timeline of phase 0, phase 1, and later phases]

Space Complexity Proof: Charging
Map invalidations to writes, such that
• Mapping is one-to-one, and
• Onto at most two writes of the same getTS
⇒ #invalidations ≤ 2 × #getTS calls
[Figure: timeline of phase 0, phase 1, and later phases]

Mapping Invalidations
An invalidation is mapped to:
• Itself, if it is the first invalidation or the last write of getTS (self map), or
• The write that wrote the value it read before the invalidation
[Algorithm listing repeated on this slide, annotated with labels W, X, Y, Z]

Space Complexity Proof: Charging
Map invalidations to writes, such that
• Mapping is one-to-one:
  • Non-self maps are not onto invalidations
  • Two invalidations are not mapped to the same write (they cannot read the same value and both be invalidations)

Space Complexity Proof: Charging
Map invalidations to writes, such that
• Mapping is one-to-one, and
• Onto at most two writes of the same process:
  • If a non-self map is onto a write, then this write is the final one of the process and it is not an invalidation

Opportunities
• Space requirements grow as …, but the lower bound needs exponential #invocations
• More graceful degradation?
• Adaptive space requirements for other problems? (E.g., max registers.)
• Step complexity? Polylog(#invocations)?
• Avoid scan (double collect)?
• Use a better snapshot?
