This paper explores the challenges and solutions in writing application workflows that ensure fault tolerance and idempotence, particularly in distributed systems with partitioned data. It discusses how workflows can maintain strong consistency through ACID transactions while being resilient to process failures. Key methodologies include automated idempotent fault tolerance and a monadic approach for implementing workflows in programming languages like C# and F#. The work emphasizes the importance of unique identifiers for workflow steps and the ability to handle duplicate requests seamlessly.
Idempotent Transactional Workflow (POPL 2013)
G. Ramalingam, Kapil Vaswani
Microsoft Research India
The Problem
[Figure: an application over partitioned data (scale-out)]
Can we simplify writing such applications?
Transfer (amt, acct1, acct2) {
    Debit amt from acct1;
    Credit amt to acct2;
}
ACID Transaction
• Strong consistency
• Distributed transaction
Transfer (amt, acct1, acct2)
    atomic {
        Debit amt from acct1;
        Credit amt to acct2;
    }
Workflow
• Weaker consistency
• No isolation
• No distributed transaction
Transfer (amt, acct1, acct2)
    atomic {Debit …};
    atomic {Credit …};
Claim: Workflows are common in applications over partitioned data.
What about process failure?
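The question above can be made concrete with a small sketch. It is not taken from the slides: the `accounts` dictionary stands in for partitioned storage, and `ProcessCrash` and the `crash_between_steps` flag are hypothetical names used only to simulate a stopping failure between the two atomic steps.

```python
class ProcessCrash(Exception):
    """Stands in for a stopping failure of the application process."""


def transfer(accounts, amt, acct1, acct2, crash_between_steps=False):
    # Step 1: its own atomic action on acct1's partition.
    accounts[acct1] -= amt
    if crash_between_steps:
        # Simulated failure: the process dies after the debit has committed.
        raise ProcessCrash("process died after the debit")
    # Step 2: a separate atomic action on acct2's partition.
    accounts[acct2] += amt


accounts = {"A": 100, "B": 0}
try:
    transfer(accounts, 30, "A", "B", crash_between_steps=True)
except ProcessCrash:
    pass

# The debit is durable but the credit never ran, so 30 units are unaccounted for.
total_after_crash = accounts["A"] + accounts["B"]
```

Each step is atomic on its own, yet nothing ties the two together, so a failure between them leaves the workflow half done until something restarts it.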
The Problem
Modern Cloud Platforms
• Application Logic: stopping (non-Byzantine) failure
• Storage Layer: failures handled by the storage layer
Goal
• Fault-tolerance in the application
• A transactional workflow engine
• Decentralized!
Making Workflows Fault-Tolerant
[Figure: request and response]
Taking a step back …
• Request or response may be lost!
• Resending messages is a critical element of fault-tolerance
Transfer (amt, acct1, acct2) {
    Debit amt from acct1;
    Credit amt to acct2;
}
Must be idempotent! (tolerate duplicate messages)
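To see why resending requires idempotence, here is a minimal sketch of a client that resends a request when the response is lost. The helper `send_with_retry` and its `responses_lost` parameter are assumptions made for illustration; the slides do not prescribe any particular retry protocol.

```python
def transfer(accounts, amt, acct1, acct2):
    # Naive, non-idempotent transfer: every execution moves money again.
    accounts[acct1] -= amt
    accounts[acct2] += amt


def send_with_retry(accounts, request, responses_lost=1):
    """Resend the request until a response gets through; return the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        transfer(accounts, *request)   # the service executes the request
        if attempts > responses_lost:  # this response finally reaches the client
            return attempts


accounts = {"A": 100, "B": 0}
attempts = send_with_retry(accounts, (30, "A", "B"))
# One lost response means the service ran the transfer twice: 60 moved, not 30.
```

The client did the right thing by resending, so the service must be the one to recognise and absorb the duplicate.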
Goal: Idempotent Fault-Tolerance (Idempotent Workflow)
A program is said to be idempotent & fault-tolerant iff
• its behavior is unaffected by process failures
• its behavior is unaffected by duplicate input requests
Behavioral equivalence:
• duplicate output responses allowed
• progress (liveness) conditions slightly weakened
Making Workflows Idempotent & Fault-Tolerant
[Figure: request and response]
Making Computations Idempotent
[Figure: request and response]
Make every effectful step idempotent:
• Associate a unique id with every step
• Modify the step to log its execution
• Modify the step to check whether it has already executed
All must be done atomically!
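The recipe above can be sketched with Python's standard `sqlite3` module. This is an illustration under stated assumptions, not the authors' implementation: the table names `accounts` and `step_log`, the helper names, and the step id format `"req-42/debit"` are all hypothetical, and the sketch assumes the log lives in the same store as the data so that a single local transaction covers the check, the update, and the log write.

```python
import sqlite3


def open_store():
    # In-memory database standing in for one storage partition.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    db.execute("CREATE TABLE step_log (step_id TEXT PRIMARY KEY)")
    db.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
    db.commit()
    return db


def idempotent_update(db, step_id, account, delta):
    """Apply delta to account at most once per step_id; return True if it ran."""
    with db:  # one transaction: commits on success, rolls back on an exception
        already = db.execute(
            "SELECT 1 FROM step_log WHERE step_id = ?", (step_id,)
        ).fetchone()
        if already:
            return False  # step already executed: do nothing
        db.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (delta, account),
        )
        # Log the execution in the same transaction as the effect itself.
        db.execute("INSERT INTO step_log VALUES (?)", (step_id,))
        return True


def balance(db, account):
    row = db.execute(
        "SELECT balance FROM accounts WHERE name = ?", (account,)
    ).fetchone()
    return row[0]


db = open_store()
first = idempotent_update(db, "req-42/debit", "A", -30)
second = idempotent_update(db, "req-42/debit", "A", -30)  # duplicate request
```

Because the effect and its log entry commit together, a crash leaves either both or neither in place, which is what makes a blind retry safe.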
Automated Idempotent Fault-Tolerance
• As a library
    • In C# & F#
    • Technically, a monad
• As a compiler
• As a programming-language construct
Formal Results
Theorem. A well-typed monadic program is idempotent and fault-tolerant.
Any (well-typed) program e can be automatically translated (compiled) into a program compile[e].
Theorem. compile[e] is an idempotent and fault-tolerant realization of e.
Idempotence: A Language Construct
• “idworkflow uid e”
transfer (uid, amt, acct1, acct2) {
    idworkflow uid {
        atomic T1 Debit amt from acct1
        atomic T2 Credit amt to acct2
    }
}
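A rough Python analogue of the construct may help show what it automates. The class `IdWorkflow`, its `atomic` method, and the dictionary used as a log are hypothetical stand-ins, not the C#/F# library or the language construct from the talk. The sketch assumes that the workflow issues its steps in the same order on every run, so that a step id derived from the workflow uid and a counter stays stable, and that a real store would commit each action together with its log entry.

```python
class IdWorkflow:
    """Derives a unique id for each step from the workflow uid and a step counter."""

    def __init__(self, uid, log):
        self.uid = uid
        self.log = log      # stand-in for a durable log: step id -> recorded result
        self.counter = 0

    def atomic(self, action):
        step_id = (self.uid, self.counter)
        self.counter += 1
        if step_id in self.log:
            return self.log[step_id]  # already executed: reuse the recorded result
        result = action()
        self.log[step_id] = result    # assumed atomic with the action in a real store
        return result


def adjust(accounts, acct, delta):
    accounts[acct] += delta
    return accounts[acct]


def transfer(uid, amt, acct1, acct2, accounts, log, crash_after_debit=False):
    wf = IdWorkflow(uid, log)
    wf.atomic(lambda: adjust(accounts, acct1, -amt))
    if crash_after_debit:
        raise RuntimeError("simulated process failure")
    wf.atomic(lambda: adjust(accounts, acct2, +amt))
    return "done"


accounts = {"A": 100, "B": 0}
log = {}
try:
    transfer("req-42", 30, "A", "B", accounts, log, crash_after_debit=True)
except RuntimeError:
    pass
retry_result = transfer("req-42", 30, "A", "B", accounts, log)      # restart after failure
duplicate_result = transfer("req-42", 30, "A", "B", accounts, log)  # duplicate request
```

The restart skips the debit that already ran and completes the credit, and the later duplicate changes nothing, so the programmer writes the transfer once and supplies only the uid.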
Extensions
• Compensating actions: undo earlier actions when later actions encounter logical failure
• Automatic retry: detect process failures & restart
• Checkpointing: restart at most recent checkpoint
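The first extension, compensating actions, can be sketched as follows. The slides give no API for it, so `run_with_compensation`, `LogicalFailure`, and the closed-account scenario are assumptions made for illustration; in an idempotent workflow the compensations would themselves need to be idempotent steps.

```python
class LogicalFailure(Exception):
    """An application-level failure (for example, a closed account), not a crash."""


def run_with_compensation(steps):
    """Run (action, compensation) pairs; on logical failure, undo in reverse order."""
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except LogicalFailure:
        for compensate in reversed(completed):
            compensate()
        return False
    return True


accounts = {"A": 100}


def debit():
    accounts["A"] -= 30


def undo_debit():
    accounts["A"] += 30


def credit_closed_account():
    raise LogicalFailure("acct2 is closed")


ok = run_with_compensation([
    (debit, undo_debit),
    (credit_closed_account, lambda: None),
])
```

Only the steps that completed are compensated, and in reverse order, so the failed credit leaves account A back at its original balance.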
Questions? Fault-Tolerance & Idempotence: Simpler Together
Problem Setting
[Figure: a client and a service; the service comprises application logic over a storage layer holding partitioned data]