Outline

Part A: WF Specification and Verification
• What Is It All About?
• WF Specification Techniques
• Statecharts
• CTL and Model Checking
• Summary and Open Research Issues
Part B: WF System Architecture and Configuration
• WF Execution Infrastructure
• Failure Handling
• Stochastic Modeling
• WF System Configuration
• Summary and Open Research Issues

WFMS Architecture for E-Services
[Figure: architecture diagram with the components Clients, Comm server, WF server type 1, WF server type 2, ..., App server type 1, ..., App server type n]

Interoperability between WF Systems
• wrap WFMS using XML-based interface (e.g., WSDL/WSFL or ???)
• route activity and sub-WF invocations through WF mediator
• same protocol for activities and sub-WFs
• additional functions for sub-WF monitoring
[Figure: WFMS 1, WFMS 2, and a WF Mediator, with an example message:]
<?xml version="1.0"?>
<Activity WFtype="5" Name="RiskAssessmentAct">
  <BusinessData>
    <CreditRequest>
      <CustId>101</CustId>
    </CreditRequest>
  </BusinessData>
  <WorkflowCtrl>
    <Variable Name="Currency" Value="USD"/>
  </WorkflowCtrl>
</Activity>

Some "AI-complete" Problems
Grand challenge: service discovery and matchmaking
State of the art: standardized syntax & protocols (à la UDDI) with queries on yellow pages ("business registry")
Needed: semantics & interoperability with automatic reasoning about process/activity interface, behavior, and outcome
Standardized ontologies are a step forward, but still far from the final goal; we need: Semantic Web + Intelligent Search

Important System Issues
• Scalability, Reliability, Availability, Manageability, ...
• Differentiated quality of service and performance guarantees (e.g., class-specific response time and workflow turnaround time); see, e.g., Mentor-lite, http://www-dbs.cs.uni-sb.de/~mlite/
• World-wide failure masking for exactly-once behavior with easy app development; see, e.g., Phoenix project, http://research.microsoft.com/db/phoenix/

The Need for Failure Masking
[Screenshot: an e-shop order page ("Please review and place your order", "Place your order") and the error message below]
Your server command (process id #20) has been terminated. Re-run your command (severity 13) in /export/home/WWW/your-reliable-eshop.biz/mb_1300_db.mb1

Long-Lived & Distributed Execution
Atomic (transactional) write of persistent state & context guarantees forward recovery
[Figure: workflow statechart. States: Select Conf; Check ConfFee; Check TrExpenses; Check Cost; Go; No. Transition labels: / Budget:=1000; Trials:=1 · [Found] / Cost:=0 · [!Found] · [Fok & Eok] / Cost := ConfFee + TrExpenses]

Digression: Two-Phase Commit Protocol (2PC) for Distributed Atomic Transactions
Participants: Coordinator, Agent 1, Agent 2
1. Coordinator: write "begin"; send "prepare" to Agent 1 and Agent 2
2. Agent 1, Agent 2: force log entries & write "prepared"; send "yes"
3. Coordinator: write "commit"; send "commit" to Agent 1 and Agent 2
4. Agent 1, Agent 2: write "commit"; send "ack"
5. Coordinator: write "end"
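The message flow above can be sketched in a few lines of Python. This is a minimal, single-process illustration under stated assumptions, not a real implementation: the names `Agent`, `can_commit`, and `two_phase_commit` are hypothetical, Python lists stand in for stable logs, and timeouts, crashes, and recovery are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A 2PC participant. `can_commit` is a hypothetical stand-in for local validation."""
    name: str
    can_commit: bool = True
    log: list = field(default_factory=list)  # stands in for the agent's stable log
    state: str = "initial"

    def on_prepare(self) -> str:
        if not self.can_commit:
            self.state = "aborted"
            return "sorry"
        # The "prepared" entry must be on the log before the agent votes "yes".
        self.log.append("prepared")
        self.state = "prepared"
        return "yes"

    def on_decision(self, decision: str) -> str:
        self.log.append(decision)
        self.state = "committed" if decision == "commit" else "aborted"
        return "ack"


def two_phase_commit(agents, coordinator_log):
    """Run one round of 2PC and return the decision ("commit" or "abort")."""
    coordinator_log.append("begin")
    votes = [agent.on_prepare() for agent in agents]            # phase 1: prepare / vote
    decision = "commit" if all(v == "yes" for v in votes) else "abort"
    coordinator_log.append(decision)                            # decision is logged before it is sent
    acks = [agent.on_decision(decision) for agent in agents]    # phase 2: decision / ack
    if all(a == "ack" for a in acks):
        coordinator_log.append("end")                           # coordinator may now forget the transaction
    return decision
```

A single "sorry" vote is enough to turn the decision into "abort" for every agent, which is the atomicity property the protocol is meant to provide.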

Statechart for 2PC Protocol
[Figure: statechart with three orthogonal components]
Coordinator: states initial; collecting; C-pending; A-pending; committed; aborted; forgotten. Transition labels: / prepare1; prepare2 · yes1 & yes2 / commit1; commit2 · sorry1 | sorry2 / abort1; abort2 · ack1 & ack2
Agent 1: states initial1; prepared1; committed1; aborted1. Transition labels: prepare1 / yes1 · prepare1 / sorry1 · commit1 / ack1 · abort1 / ack1
Agent 2: states initial2; prepared2; committed2; aborted2. Transition labels: prepare2 / yes2 · prepare2 / sorry2 · commit2 / ack2 · abort2 / ack2
Further transition labels in the figure: T|F · T1|F1 · T1|F1 / abort1; abort2 · T2|F2 · T2|F2 / commit1; commit2

Long-Lived & Distributed Execution
Queued transactions & 2PC guarantee consistency of distributed WFMS & exactly-once execution
[Figure: workflow statechart. States: Select Conf; Check ConfFee; Check TrExpenses; Check Cost; Go; No. Transition labels: / Budget:=1000; Trials:=1 · [Found] / Cost:=0 · [!Found] · [Fok & Eok] / Cost := ConfFee + TrExpenses]

From ACID To Recovery Guarantees
Problem: A client that does not receive a return code from a transactional server cannot easily find out the transaction outcome and may be tempted to re-initiate the (non-idempotent) transaction, thus producing unacceptable effects.
Approach: In addition to atomicity, the transactional server needs to guarantee the exactly-once execution of the transaction, where execution includes the server's reply message.
→ (almost) perfect failure masking

Stateless Applications Based on Queues
Stateless application (running on client, app server, or data server):
• user sends input message
• app program sends request message to data server
• data server executes transaction and sends reply message to app
• app program sends output message to user
• there are no conversations with the user within a transaction, and subsequent transactions are independent
Solution: Queued Transactions
• message recovery by queue manager with persistent, recoverable message queues
• exactly-once execution by enclosing message dequeue and enqueue into transaction

Illustration of 2-Tier Queued Transaction
[Figure: User (input, output), Application Process (Client), Database Server. The client enqueues the request and dequeues the reply; the server transaction dequeues the request and enqueues the reply.]

Illustration of 3-Tier Queued Transaction
[Figure: User (input, output), Client, Application Server, Database Server. The client enqueues the request and dequeues the reply; a distributed server transaction dequeues the request and enqueues the reply.]

Correctness of Queued Transaction Protocol
Theorem: With the queued transaction protocol for stateless applications, the following guarantees hold:
1. Once the user-input transaction is committed, a request is executed by the server exactly once.
2. Once the user-input transaction is committed, the user output is delivered at least once.
3. If user output is testable, the user output is delivered exactly once, provided the user-input transaction has been committed.
Inherent (small window of) uncertainty:
• (last) user input may get lost
• (last) user output may be sent more than once
→ can be eliminated with testable output (using special hardware)

Client During Normal Operation
user-input processing by client:
  begin transaction;
  enqueue (request);
  commit transaction;
user-output processing by client:
  wait until reply queue is not empty;
  begin transaction;
  dequeue (reply);
  while user has not acknowledged the reply or sent the next request do
    present reply to user;
  end /*while*/;
  commit transaction;

Server During Normal Operation
request-reply processing by data server:
  begin transaction;
  dequeue (request);
  perform data operations and generate reply;
  enqueue (reply);
  commit transaction;

Client and Server Restart
Client restart:
  check reply queue;
  if not empty then
    process reply like during normal operation;
  end /*if*/;
Server restart:
  check request queue;
  if not empty then
    initiate processing of requests like during normal operation;
  end /*if*/;
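The client, server, and restart steps can be tried out with Python's built-in `sqlite3` module. The sketch below is only an illustration under stated assumptions: the table names `request_q`, `reply_q`, and `account`, the function names, and the `crash_before_commit` flag are all hypothetical, and an in-memory database stands in for persistent, recoverable queues. A crash is simulated by rolling back the server transaction.

```python
import sqlite3


def open_queues() -> sqlite3.Connection:
    # isolation_level=None selects autocommit mode, so the explicit
    # BEGIN / COMMIT / ROLLBACK statements below delimit the transactions.
    db = sqlite3.connect(":memory:", isolation_level=None)
    db.execute("CREATE TABLE request_q (id INTEGER PRIMARY KEY, body TEXT)")
    db.execute("CREATE TABLE reply_q (id INTEGER PRIMARY KEY, body TEXT)")
    db.execute("CREATE TABLE account (balance INTEGER)")
    db.execute("INSERT INTO account VALUES (0)")
    return db


def client_enqueue(db: sqlite3.Connection, amount: int) -> None:
    """User-input processing: enqueue the request inside its own transaction."""
    db.execute("BEGIN")
    db.execute("INSERT INTO request_q (body) VALUES (?)", (str(amount),))
    db.execute("COMMIT")


def server_step(db: sqlite3.Connection, crash_before_commit: bool = False) -> bool:
    """Dequeue one request, apply it, and enqueue the reply in ONE transaction.

    Returns True if a request was processed and committed.
    """
    db.execute("BEGIN")
    try:
        row = db.execute(
            "SELECT id, body FROM request_q ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            db.execute("COMMIT")
            return False
        request_id, body = row
        db.execute("DELETE FROM request_q WHERE id = ?", (request_id,))          # dequeue (request)
        db.execute("UPDATE account SET balance = balance + ?", (int(body),))     # data operations
        db.execute("INSERT INTO reply_q (body) VALUES (?)", (f"deposited {body}",))  # enqueue (reply)
        if crash_before_commit:
            raise RuntimeError("simulated crash before commit")
        db.execute("COMMIT")
        return True
    except RuntimeError:
        # After a crash the uncommitted work is undone: the request is back
        # in the queue, and neither the update nor the reply is visible.
        db.execute("ROLLBACK")
        return False


def server_restart(db: sqlite3.Connection) -> None:
    """Server restart: process whatever is still in the request queue."""
    while server_step(db):
        pass
```

Because the dequeue, the data operation, and the enqueue of the reply commit or roll back together, rerunning `server_restart` after a failed `server_step` applies the request once, and a second restart finds the queue empty and changes nothing.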

Pseudo-Conversational Transactions for Stateful Applications
• Queue-based message recovery for entire conversations
• Conversational "logical unit of work" broken down into chain of stateless transactions with (small) application state maintained in the queue (analogously to cookies, but more general and much more reliable)
• Dequeue of reply and enqueue of next request combined into one transaction for exactly-once execution guarantee
• good for apps such as travel reservation, electronic shopping, etc.

Illustration of Pseudo-Conversational Transactions
[Figure: User, Application Process (Client), Database Server]

Correctness of Pseudo-Conversational Transaction Protocol
Theorem: With the queue-based message recovery for conversational multi-step transaction chains, the following guarantees hold:
• Once the initial user-input transaction that starts the entire conversation is committed, the entire transaction chain is executed by the server exactly once.
• Once the initial user-input transaction is committed, each user-output message throughout the conversation is delivered at least once.
• If user output is testable, each user-output message is delivered exactly once, provided the initial user-input transaction has been committed.

Queue-based Message Recovery for Exactly-Once Workflow Execution
At end of activity, execute transaction that combines:
• writing the activity's modifications of workflow state and context to persistent store
• writing the state modifications that result from the firing of outgoing transitions to persistent store
• writing the context modifications that result from the actions of firing transitions to persistent store
• notifying the follow-up activities by enqueueing messages
Newly invoked activity executes transaction that combines:
• dequeueing of notification message
• writing the workflow state and context to persistent store

Use of Queued Transactions in Travel Planning Workflow
Queued transactions & 2PC guarantee consistency of distributed WFMS & exactly-once execution
[Figure: workflow statechart. States: Select Conference; Select Tutorials; Compute Fee; CheckConfFee; Check Flight; Check Airfare; Check Hotel; CheckTravelCost; Check Cost; Go; No. Transition labels: / Budget:=1000; Trials:=1 · [ConfFound] / Cost:=0 · [!ConfFound] · / Cost = ConfFee + TravelCost · [Cost ≤ Budget] · [Cost > Budget & Trials ≥ 3] · [Cost > Budget & Trials < 3] / Trials++]

Compensation of Invoked Applications
Provide compensating steps & invoke steps (mostly) automatically
[Figure: workflow statechart with compensating steps Cancel Travel and Cancel Conf. States: Select Conf; Check ConfFee; Check TrExpenses; Check Cost; Go; No. Transition labels: / Budget:=1000; Trials:=1 · [Found] / Cost:=0 · [!Found] · [Fok & Eok] / Cost := ConfFee + TrExpenses]

Meaningful Compensation Spheres (1)
Arbitrary compensation spheres may leave workflow in non-resumable configuration!
[Figure: workflow statechart. States: Select Conf; Check ConfFee; Check TrExpenses; Check Cost; Go; No. Transition labels: / Budget:=1000; Trials:=1 · [Found] / Cost:=0 · [!Found] · [Fok & Eok] / Cost := ConfFee + TrExpenses]

Meaningful Compensation Spheres (2)
Restrict atomicity spheres to a single state and its enclosed activities & apps
[Figure: workflow statechart. States: Select Conf; Check ConfFee; Check TrExpenses; Check Cost; Go; No. Transition labels: / Budget:=1000; Trials:=1 · [Found] / Cost:=0 · [!Found] · [Fok & Eok] / Cost := ConfFee + TrExpenses]

The Need for Multi-Tier Application Recovery
Realistic example: Expedia or Travelocity style multi-tier service
[Figure: Client; Expedia App Web Server; Expedia App Server, Sabre App Server, Amadeus App Server; four Data Servers]

Need for Integrated & Application-transparent Data, Message, and Process Recovery for largely autonomous components
[Figure: Users, Web app server, Business portal server, Data server, Other clients]

Efficient Solution: Recovery Contracts
• For each process: log all non-deterministic events (non-forced)
• Upon interaction between sender and receiver:
• sender promises recoverable state and message (e.g., via replay) and resends message if necessary
• receiver promises duplicate elimination and recoverable state when releasing sender promise
• low run-time overhead: one forced log write per multi-tier request/reply
• fast restart (→ high availability): rebuild process state & message table and replay
• independent recovery of autonomous components
Prototype implementation for IE6 / Apache / PHP / MySQL, plus COM+-based implementation work in Phoenix project at MSR

Committed Interaction Contract (CIC)
• Sender Obligation S1: persistent state as of message time or later
• Sender Obligation S2: persistent message
• S2a: resend message periodically until released by receiver
• S2b: resend message upon explicit request until released
• Sender Obligation S3: unique messages
• Receiver Obligation R1: duplicate message elimination
• Receiver Obligation R2: persistent state
• R2a: persistent state as of message time or later before releasing sender from S2a (stable interaction)
• R2b: persistent state & message before releasing sender from S2b (installed interaction)
Immediately Committed Interaction (ICIC): Receiver releases sender from S2a, S2b immediately (similar to optimized 2PC); crucial for autonomous recovery
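A toy model can make the division of obligations concrete. The sketch below is an assumption-laden illustration, not the Phoenix implementation: the classes `Sender` and `Receiver`, their methods, and the `lose_first_attempt` flag are hypothetical, in-memory containers stand in for persistent storage, and the stable and installed steps are collapsed into a single release notification in the spirit of the ICIC variant.

```python
import itertools


class Receiver:
    def __init__(self):
        self.seen_ids = set()   # R1: ids of messages already processed
        self.state = []         # volatile application state
        self.stable_state = []  # stands in for persistent storage (R2)

    def receive(self, msg_id, payload) -> bool:
        """Apply a message at most once; return True if it was new."""
        if msg_id in self.seen_ids:
            return False        # R1: drop the duplicate
        self.seen_ids.add(msg_id)
        self.state.append(payload)
        return True

    def make_stable(self) -> set:
        """R2: persist the state, then report which message ids it covers."""
        self.stable_state = list(self.state)
        return set(self.seen_ids)


class Sender:
    def __init__(self):
        self._ids = itertools.count(1)  # S3: unique message ids
        self.outbox = {}                # S2: messages kept until released

    def send(self, receiver, payload, lose_first_attempt=False) -> int:
        msg_id = next(self._ids)
        self.outbox[msg_id] = payload   # S1/S2: record the message before transfer
        if not lose_first_attempt:      # hypothetical switch to simulate a lost message
            receiver.receive(msg_id, payload)
        return msg_id

    def resend_pending(self, receiver) -> None:
        """S2a: resend every message that has not been released yet."""
        for msg_id, payload in sorted(self.outbox.items()):
            receiver.receive(msg_id, payload)

    def release(self, covered_ids) -> None:
        """Drop messages the receiver has promised to recover on its own."""
        for msg_id in covered_ids:
            self.outbox.pop(msg_id, None)
```

Resending is safe only because the receiver eliminates duplicates, and the sender can discard a message only after the receiver has taken over responsibility for it. In a real system the set of seen message ids would have to be recoverable as well.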

Statechart for CIC
[Figure: statechart with two orthogonal components]
Sender: states running; message sent; interaction recoverable: S1, S2 promised; interaction (known to be) stable: (S2a released); interaction (known to be) installed: (S2b released). Transition labels: / make state and message persistent · / message transfer · stability notification · commit notification · [true]
Receiver: states running; message received; interaction stable: R2a promised; interaction installed: R2b promised. Transition labels: message transfer · / log message arrival · [true] / stability notification · [interaction stable] / make state persistent · / install notification · [true]

Statechart for ICIC
[Figure: statechart with two orthogonal components]
Sender: states running; message sent; interaction recoverable: S1, S2 promised; interaction (known to be) installed: S2 released. Transition labels: / make state and message persistent · / message transfer · stability and install notification
Receiver: states running; message received; interaction installed: R2 promised. Transition labels: message transfer · [true] / make state persistent · / stability and install notification

External Interaction Contract (XIC) and Transactional Interaction Contract (TIC)
XIC:
• input from user: receiver promises ICIC, sender doesn't
• output to user: sender promises ICIC, receiver doesn't
Consequence: crash may lead to lost input or duplicated output (for small but inherently unavoidable window of vulnerability)
TIC:
• receiver of transactional request promises:
• atomic state transition
• faithful reply message
• persistent reply message
• sender of transactional request promises:
• persistent state and commit request message
• unique messages

Special Case: Client-Server Application Recovery
[Figure, two panels. During normal operation: User (input, output), Application Process (Client), 2nd App Process, Database Server, with request and reply messages, followed by a crash. During client restart: replay of user input by the Application Process (Client), with request and reply to the Database Server marked "?".]

General Considerations for Client-Server Stateful Application Recovery
• Message logging for message recovery and deterministic program replay (of piecewise deterministic program)
• Installation points for process recovery and reduced program replay
• Server processes concurrent threads on behalf of many clients
• Server "commits its state" upon sending a reply to a client
• Forced logging should be minimized
• Server should be able to perform independent recovery

Server Reply Logging Method
• Client and server each maintain a message lookup table (MLT) and write message log entries to a stable log
• Client performs lazy, non-forced logging, periodically creates an installation point, and force-logs user-input messages
• Server forces its log buffer before sending a reply message
• Server recovery rebuilds the message lookup table and replays incomplete requests to produce the reply (may need logging of read/write interleaving among threads)
• Client recovery rebuilds the MLT, reloads the app from the last installation point, and replays the application, intercepting message events and obtaining the contents of messages from the local MLT or the server
• Client sends stability notifications to facilitate server log truncation
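The server side of this method can be illustrated with a small in-memory model. Everything here is a hypothetical sketch: the class `ReplyLoggingServer`, its method names, and the `executions` counter are made up for illustration, Python lists stand in for the log buffer and the stable log, and replay of incomplete requests and thread interleaving are left out.

```python
class ReplyLoggingServer:
    def __init__(self):
        self.stable_log = []  # stands in for the stable log; survives crash()
        self.log_buffer = []  # volatile log buffer; lost in crash()
        self.mlt = {}         # message lookup table: (client, MSN) -> reply
        self.executions = 0   # counts real executions, for illustration only

    def handle(self, client, msn, body):
        key = (client, msn)
        if key in self.mlt:
            # A resent request (e.g. during client replay) is answered
            # from the MLT instead of being executed a second time.
            return self.mlt[key]
        self.log_buffer.append(("request", key, body))
        self.executions += 1
        reply = body.upper()  # placeholder for the real request processing
        self.log_buffer.append(("reply", key, reply))
        self.force_log()      # force the log buffer BEFORE sending the reply
        self.mlt[key] = reply
        return reply

    def force_log(self):
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def crash(self):
        """Simulate a crash: all volatile data is lost."""
        self.log_buffer.clear()
        self.mlt.clear()

    def recover(self):
        """Rebuild the message lookup table from the stable log."""
        for kind, key, payload in self.stable_log:
            if kind == "reply":
                self.mlt[key] = payload
```

Because the reply is on the stable log before it leaves the server, a client that replays request 20 after a server crash and recovery gets the original reply back, and the request is not executed again.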

Data Structures for Server Reply Logging
[Figure: client and server, each with a message lookup table and a stable log file of (MSN, Type) entries]
Client (lazy logging): 15 input, 20 request, 40 reply, installation point, 45 output, ..., 65 input, 70 request
Server (force log upon reply): 10 request, ..., 20 request, 30 reply, 40 reply, ..., 70 request, 80 reply

Replaying Incomplete Requests with Server Reply Logging
[Figure: client with (MSN, Type) entries 15 input, 20 request, ...; server with entries 10 request, ..., 20 request, 30 reply, ...; logged operation sequence R(x) W(x) R(x) R(y) W(y) R(y) ...]

Log Truncation with Server Reply Logging
[Figure: client message lookup table and client log with (MSN, Type) entries 15 input, 20 request, 40 reply, 45 output, ..., 70 request; request 70 is sent together with a stability notification; the server log holds entries for client c (20, 40, 70, 80) and for other clients, with a RedoMSN for client c]

Efficient Multi-tier Application Recovery and Failure Masking
[Figure: Client; Expedia App Web Server; Expedia App Server, Sabre App Server, Amadeus App Server; four Data Servers]
Altogether 16 messages (8 requests + 8 replies) per user request
• 10 forced log writes:
• 1 user request at client
• 4 replies at data servers (transactional ICs)
• 3 replies at external app servers (ICICs)
• 2 app server replies at Web server (ICICs)
• no forced logging between Web server and app server in same "recovery ensemble" (CIC)
as opposed to 32 forced log writes with 2PC for every sender-receiver pair

Additional System Guarantees for Workflows
• Exactly-once execution guarantees to preserve the guaranteed semantic properties in a failure-prone, distributed system environment
• High availability through server and data replication
• Scalable performance
• Guaranteed performance, e.g.: response time < 5 seconds with probability 0.95 for 1000 concurrently active workflows
• auto-tuning and zero-admin

The Need for Performance and QoS Guarantees
[Screenshot: a "Check Availability" page ("Look-Up Will Take 8-25 Seconds") and the error message below]
Internal Server Error. Our system administrator has been notified. Please try later again.

From Best Effort To Performance & QoS Guarantees
"Our ability to analyze and predict the performance of the enormously complex software systems ... are painfully inadequate" (Report of the US President's Technology Advisory Committee)
• Very slow servers are like unavailable servers
• Tuning for peak load requires predictability of workload × config → performance function
• Self-tuning requires mathematical models
• Stochastic guarantees for huge #clients: P [response time ≤ 5 s] > 0.95

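Digression: Markov Chains
A discrete-time finite-state Markov chain is a pair $(\Sigma, p)$ with a state set $\Sigma = \{s_1, \ldots, s_n\}$ and a transition probability function $p: \Sigma \times \Sigma \to [0,1]$ with the property $\sum_{j} p_{ij} = 1$ for all $i$, where $p_{ij} := p(s_i, s_j)$.
A Markov chain is called ergodic (stationary) if for each state $s_j$ the limit $p_j := \lim_{t \to \infty} p_{ij}^{(t)}$ exists and is independent of $s_i$, with $p_{ij}^{(t)} := \sum_{k} p_{ik}^{(t-1)} \, p_{kj}$ for $t > 1$ and $p_{ij}^{(t)} := p_{ij}$ for $t = 1$.
For an ergodic finite-state Markov chain, the stationary state probabilities $p_j$ can be computed by solving the linear equation system: $p_j = \sum_{i} p_i \, p_{ij}$ for all $j$, and $\sum_{j} p_j = 1$.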

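Markov Chain Example
States: 0: sunny, 1: cloudy, 2: rainy
Transition probabilities: 0→0: 0.8, 0→1: 0.2, 1→0: 0.5, 1→2: 0.5, 2→0: 0.4, 2→1: 0.3, 2→2: 0.3
p0 = 0.8 p0 + 0.5 p1 + 0.4 p2
p1 = 0.2 p0 + 0.3 p2
p2 = 0.5 p1 + 0.3 p2
p0 + p1 + p2 = 1
⇒ p0 = 55/79 ≈ 0.696, p1 = 14/79 ≈ 0.177, p2 = 10/79 ≈ 0.127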
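The example can also be checked numerically. The sketch below uses plain power iteration (repeatedly applying the transition matrix to a probability vector) rather than solving the linear system directly; the function name `stationary` and the iteration count are arbitrary choices made for this illustration.

```python
def stationary(P, iterations=1000):
    """Approximate the stationary distribution of an ergodic Markov chain.

    P[i][j] is the probability of moving from state i to state j.
    """
    n = len(P)
    p = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        # One step of the chain: new p[j] = sum over i of p[i] * P[i][j]
        p = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
    return p


# Weather chain: state 0 = sunny, 1 = cloudy, 2 = rainy
P = [
    [0.8, 0.2, 0.0],
    [0.5, 0.0, 0.5],
    [0.4, 0.3, 0.3],
]
p = stationary(P)
```

After the loop, `p` satisfies the three balance equations and the normalization condition up to floating-point error, so it can be compared directly against a solution obtained by hand.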

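Digression: Continuous Time Markov Chains
A finite-state continuous-time Markov chain (CTMC) is a pair $(\Sigma, q)$ with a state set $\Sigma = \{s_1, \ldots, s_n\}$ and transition rates $q: \Sigma \times \Sigma \to \mathbb{R}$ with $q_{ij} := q(s_i, s_j) \ge 0$ for $i \ne j$.
A CTMC can be "factorized" into a discrete-time Markov chain with transition probabilities $p_{ij} = q_{ij} / \sum_{k \ne i} q_{ik}$ and exponentially distributed state residence times with mean $1 / \sum_{k \ne i} q_{ik}$.
For an ergodic CTMC the stationary state probabilities $p_j$ can be computed by solving the system of linear flow balance equations: $p_j \sum_{k \ne j} q_{jk} = \sum_{i \ne j} p_i \, q_{ij}$ for all $j$, and $\sum_{j} p_j = 1$.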

CTMC Example 1: Stationary Availability
Only transient, repairable failures; availability = P[system is operational at random time point]
Single server: states 1: up, 0: down; rates 1→0: 1/MTTF, 0→1: 1/MTTR
p1 / MTTF = p0 / MTTR
p0 + p1 = 1
⇒ availability of server: p1 = MTTF / (MTTF + MTTR)
Mirrored server pair: states 2: both up, 1: 1 up, 1 down, 0: both down; rates 2→1: 2/MTTF, 1→0: 1/MTTF, 1→2: 1/MTTR, 0→1: 1/MTTR
p1 / MTTR = 2 p2 / MTTF
2 p2 / MTTF + p0 / MTTR = p1 / MTTR + p1 / MTTF
p1 / MTTF = p0 / MTTR
p0 + p1 + p2 = 1
⇒ availability of server pair: p1 + p2 = (MTTF² + 2 MTTF·MTTR) / (MTTF² + 2 MTTF·MTTR + 2 MTTR²)
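The flow balance equations of both models can be solved by substitution and evaluated for concrete numbers. The sketch below does that; the function names are hypothetical, and MTTF and MTTR may be given in any time unit as long as both use the same one.

```python
def single_server_availability(mttf: float, mttr: float) -> float:
    """Availability p1 from p1 / MTTF = p0 / MTTR and p0 + p1 = 1."""
    return mttf / (mttf + mttr)


def mirrored_pair_probabilities(mttf: float, mttr: float):
    """Return (p0, p1, p2) for the mirrored server pair.

    Substituting r = MTTR / MTTF into the flow balance equations gives
    p1 = 2 r p2 and p0 = r p1 = 2 r^2 p2; normalization then fixes p2.
    """
    r = mttr / mttf
    p2 = 1.0 / (1.0 + 2.0 * r + 2.0 * r * r)
    p1 = 2.0 * r * p2
    p0 = r * p1
    return p0, p1, p2


def mirrored_pair_availability(mttf: float, mttr: float) -> float:
    """The pair is operational whenever at least one server is up."""
    p0, p1, p2 = mirrored_pair_probabilities(mttf, mttr)
    return p1 + p2
```

For a given MTTF and MTTR, calling both availability functions side by side shows how much mirroring shrinks the probability of the "both down" state compared with the downtime of a single server.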
