Ch 5. Transaction Processing Monitors An Overview

Ch 5. Transaction Processing Monitors: An Overview, Dr. Kien A. Hua

An Overview • Transaction processing (TP) monitors differ widely in functionality and scope. • There is no commonly accepted definition of precisely what a TP monitor is or how it interfaces to other system components. • The intent of Chapters 5 and 6 is to present a reference architecture of a transaction-oriented system and to define the role of a TP monitor within that framework. • The current chapter, in particular, explains the services provided by a TP monitor and introduces the structure of this system component.

The Role of TP Monitors • Operating systems, communication systems, etc. are usually not designed for the needs of a transaction-oriented environment. • A TP monitor provides either essential services absent from the host system, or services the host performs so poorly that a new implementation is required. • The main function of a TP monitor is to integrate other system components so that they work together to support transaction-oriented processing.

COMPUTING STYLES • Computing systems are used in a variety of ways, which are largely determined by the type of applications these systems were developed for • Batch Processing • Time-Sharing • Real-Time Processing • Client-Server Processing • Transaction-Oriented Processing • It is helpful to analyze these styles to understand what system facilities are fundamental to support each style.

Batch Processing • Large unit of work: Work comes in large portions at prescheduled times and with well-defined resource requirements. • Coarse-grained resource allocation: The programs typically operate on their own private data. • Sequential access patterns: Batch jobs typically go sequentially through a large number of processing steps, access files in a sequential scan, and so on. • Application does recovery: Batch applications have to make their own provisions against system crashes. • Few (tens of) concurrent jobs: There are not many batch jobs running concurrently on any given system. Throughput is the key performance criterion. • Isolated execution: Each batch job executes in its own process; this process has exclusive control of the files, data streams, and other resources it uses.

Time Sharing (1) Time-sharing is the terminal-oriented version of batch. It is a way of giving interactive access to computing resources via low-bandwidth (dumb) terminals. • Process per terminal: While batch processing is driven through a predefined job control program, a time-sharing session is controlled by the terminal user. A terminal session gives the user a complete abstract machine with memory, devices, and the like. • Coarse-grained resource allocation: Terminal sessions are typically long, and resources are assigned in large granules; as in batch processing, the application works on private data.

Time Sharing (2) • Unpredictable demands: The actual resource demands are not as predictable as they are with batch processing. • Sequential access pattern. • Application does recovery: It is up to the user to reestablish the session and to figure out how far he/she had gotten before the crash. • Hundreds of concurrent users: Response time is the key performance criterion.

Real-Time Processing • Event-driven operation: Activities in the system are driven largely by interrupts coming from the sensor devices. The workload pattern is not preplanned. • Repetitive workload: The set of programs that can be activated by outside events is statically defined. • High availability: The system must be highly available because it controls a real-world process. • High performance: Sometimes, a system is characterized as real-time if it is supposed to react very fast. The distinctive requirement, however, is that it is able to do deadline scheduling.

Client-Server Computing • Client-server processing is the modern version of time-sharing. • Rather than running everything a user requests in one process, services are invoked by passing requests to dedicated servers, which can reside in other processes on the same machine or in different machines of a distributed system. • All persistent data are now encapsulated in database servers, so data are shared among many users through that server. • Such servers have to be highly available.

TRANSACTION-ORIENTED PROCESSING (1) • Data sharing: Computations read and update databases shared among all users. • Variable requests: User requests are random. • Repetitive workload: Users do not run arbitrary programs, but rather request the system to execute certain functions out of a predefined set. • Mostly simple functions: They consume 10^5 – 10^7 instructions and do some 10 disk I/Os.

TRANSACTION-ORIENTED PROCESSING (2) • Some batch transactions: These have the size and duration of a typical batch job. • Many terminals: 10^3 – 10^5 terminals. • High availability: Because of the large number of users, the system must be highly reliable and available. • System does recovery. • Automatic load balancing: The system should deliver high throughput with guaranteed low response times (soft real-time system).

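A TAXONOMY OF TRANSACTION EXECUTION (1) [Figure: taxonomy tree. Transaction splits into Direct and Queued. Direct splits into Single Message and Conversational; Queued splits into Short and Long. Each of these four branches splits into Local and Distributed. The four branches are labeled, in that order, Direct OLTP Transaction, Complex Online Transaction, Queued OLTP Transaction, and Long Batch Transaction.]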

A TAXONOMY OF TRANSACTION EXECUTION (2) Direct: The terminal and the process running the server program (handling the request) are associated with each other. Queued: Transactions are put in a queue and scheduled for processing according to the queuing discipline.

A TAXONOMY OF TRANSACTION EXECUTION (3) Simple • Single message: There is a single input message from the terminal, and upon commit a single output message is delivered. • Short: The number of objects it touches is in the tens. Complex • Conversational: It allows for repeated exchange of messages between the user and the application. • Long: The number of objects it touches is in the tens of thousands (batch-like transaction).

Transaction Processing Services • Transaction services must provide the application programmer with a programming environment that integrates transaction control in a seamless manner. • The program need not worry about concurrency, failures, clean-up, and so forth. • As far as data sharing is concerned, applications can use the services provided by a database manager.

The Transaction Processing Services (1) Apart from the technical issue of access to shared data, more system services are required to support transaction-oriented processing. Manage heterogeneity: The local transaction mechanisms in each subsystem will not be sufficient to ensure the ACID properties for the whole function. Control communication: The status of communication sessions must also be subject to transaction control by the transaction services.

The Transaction Processing Services (2) Terminal management: Since the ACID properties must be perceived by the user and not just by the program, sending and receiving the message must be part of the transaction. Presentation services: If the terminal uses sophisticated presentation services, then reestablishing the window environment after a crash of the workstation is also a part of the transaction guarantee.

The Transaction Processing Services (3) Context management. Start/restart: The TP monitor must also handle the restart after any failure. By doing so, all the subsystems are brought up in a state that is consistent with respect to the ACID rules.

Integrated Control Note: Many textbooks create the impression that database transaction control is all there is to transaction processing. The need to support other resources with ACID properties forces a more generalized transaction management.

Key Terms • Typically, a number of services are bunched together in one application. • At run time, a server class is maintained for each application program. • A server class is a group of processes that are able to run the code of the corresponding application program. • An actual execution of a service request requires the request to be sent to a process (server) of the right server class. • The activation of a server on behalf of a service request is called service invocation.
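The terms above can be illustrated with a minimal sketch. It assumes a single-threaded model in which plain objects stand in for processes; the names Server, ServerClass, invoke, and the "banking" application with its "debit" and "credit" services are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Server:
    """Stands in for one process of a server class (hypothetical model)."""
    server_id: int
    busy: bool = False


@dataclass
class ServerClass:
    """A group of servers able to run the code of one application program."""
    application: str
    services: Dict[str, Callable[..., object]]  # services bundled in the application
    servers: List[Server] = field(default_factory=list)

    def invoke(self, service: str, *args):
        """Service invocation: activate a free server on behalf of a request."""
        server = next((s for s in self.servers if not s.busy), None)
        if server is None:
            raise RuntimeError("no free server in class " + self.application)
        server.busy = True
        try:
            return server.server_id, self.services[service](*args)
        finally:
            server.busy = False  # completing the request frees the server


banking = ServerClass(
    application="banking",
    services={
        "debit": lambda balance, amount: balance - amount,
        "credit": lambda balance, amount: balance + amount,
    },
    servers=[Server(1), Server(2)],
)
server_id, new_balance = banking.invoke("debit", 100, 30)
```

Here the request for the "debit" service is sent to the first free server of the "banking" server class, which is the service invocation described above.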

One Process Per Terminal (1) • At logon, each terminal is given its own process, which it holds on to for the rest of the session. • Example: Time-sharing systems. • Problems: • Too many capabilities per process: Each process can run all applications. It comes with more capabilities than a terminal needs. • Too many process switches: Process switches are very expensive operations in most operating systems (2000 – 5000 instructions).

One Process Per Terminal (2) Conclusions: • This approach does not work well for transaction-oriented systems. • It is acceptable only for small systems of fewer than 100 clients.

Only One Terminal Process (1) • All terminals talk to one process, which can be the TP monitor process itself. • The TP monitor process receives the function requests and routes them to the programs that can service them. • Example: CICS, ComPlete.

Only One Terminal Process (2) • Advantages: Simplicity! The TP monitor can check the function requests, schedule them according to its own policies, and so on. • Disadvantages: • Each page fault or other exception in the TP monitor’s process will stop the whole TP environment. • Since a single process can employ only one CPU at a time, the TP system can use only one CPU. • The process is confined within one address space, which can be a serious limitation for large applications.

Many Servers, One Scheduler (1) • There is only one (data communications) process that handles all the request and response messages. • There is a group of processes (i.e., a server class) for each application program. • Different applications are fenced off against each other. • The data communication process routes the service request to the appropriate server.

Many Servers, One Scheduler (2) • Example: IMS/DC • Advantages: Simplicity! There is one place for scheduling and load control. • Disadvantages: The data communication resource can become a bottleneck.
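The routing role of the single data communication process can be sketched as follows. This is an illustration under assumptions, not how IMS/DC is implemented: the routing table, the class name DataCommunicationProcess, and the per-class queues are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Tuple

# Hypothetical routing table: function name -> server class name.
ROUTES: Dict[str, str] = {"debit": "banking", "credit": "banking", "order": "sales"}


class DataCommunicationProcess:
    """The single scheduler: every request message passes through here."""

    def __init__(self, routes: Dict[str, str]) -> None:
        self.routes = routes
        # One request queue per server class keeps applications fenced off.
        self.queues: Dict[str, Deque[Tuple[str, str]]] = {
            name: deque() for name in set(routes.values())
        }

    def route(self, terminal: str, function: str) -> str:
        """Place the request on the queue of the server class that can serve it."""
        server_class = self.routes[function]  # KeyError for an unknown function
        self.queues[server_class].append((terminal, function))
        return server_class


dc = DataCommunicationProcess(ROUTES)
dc.route("t1", "debit")
dc.route("t2", "order")
dc.route("t3", "credit")
```

Because every call to route goes through one object, scheduling and load control live in one place, and the same object is the potential bottleneck named above.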

Many Servers, Many Schedulers (1) • A number of (functionally identical) data communication processes do the terminal handling. • There is a server class for data communication services. • The data communication process must multiplex itself among the terminals it is attached to, and therefore must be multi-threaded.

Many Servers, Many Schedulers (2) • The application server classes are set up as in the last scenario. Example: Tandem’s Pathway, DEC’s ACMS. Advantage: The data communication process is no longer a bottleneck. Disadvantage: Load balancing becomes more difficult.

The Tasks of TP Monitors (1) • Scheduling: Service requests must be mapped to the proper servers. • Server class management: The TP monitor is responsible for setting up the server class. • Recovery: After a crash, the TP monitor is responsible for bringing up the TP environment. • It starts all the system processes, brings up the server classes, and then passes control to the transaction manager.

The Tasks of TP Monitors (2) • Resource administration: Information about the terminals, databases, application programs, users, etc. is kept in a system repository managed by the TP monitor. • Authentication and authorization: Service requests must be cleared by the TP monitor before they are executed. • System operation: The TP monitor must provide the operators with sufficient information to tune the system, and inform them about any problems that occur during normal operations.

Resource Managers A resource manager is a software subsystem that ties into the TP monitor to provide protected actions on its state. It must be able to participate in transaction-oriented recovery. Example: BEGIN WORK; receive (input message); send (statistics menu) to (window w1); COMMIT WORK;
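A minimal sketch of a resource manager whose updates are protected by a transaction bracket follows. It is a toy model under stated assumptions: the class names KeyValueResourceManager and Transaction are hypothetical, the undo log lives in memory only, and there is no two-phase commit, so it shows the idea of protected actions rather than a real recovery protocol.

```python
from typing import Dict, List, Optional, Tuple


class KeyValueResourceManager:
    """Toy resource manager: its state changes only through protected actions."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = {}
        # Undo log: (key, previous value or None if the key did not exist).
        self._undo: List[Tuple[str, Optional[int]]] = []

    def put(self, key: str, value: int) -> None:
        self._undo.append((key, self.state.get(key)))
        self.state[key] = value

    def commit(self) -> None:
        self._undo.clear()  # updates become permanent

    def rollback(self) -> None:
        while self._undo:  # undo in reverse order
            key, old = self._undo.pop()
            if old is None:
                self.state.pop(key, None)
            else:
                self.state[key] = old


class Transaction:
    """BEGIN WORK on entry, COMMIT WORK on normal exit, rollback on an exception."""

    def __init__(self, *resource_managers: KeyValueResourceManager) -> None:
        self.resource_managers = resource_managers

    def __enter__(self) -> "Transaction":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        for rm in self.resource_managers:
            if exc_type is None:
                rm.commit()
            else:
                rm.rollback()
        return False  # never swallow the exception


rm = KeyValueResourceManager()
with Transaction(rm):
    rm.put("balance", 100)  # committed

try:
    with Transaction(rm):
        rm.put("balance", 0)
        raise RuntimeError("simulated failure")  # forces a rollback
except RuntimeError:
    pass
```

After the second bracket fails, the resource manager's state is back to what the first bracket committed.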

Context-Sensitive Scheduling • The completion of a request typically frees the server so that it can be reassigned to another request. • However, there are cases in which a server is reserved for a special user. Example: For chained transactions, the server must be reserved for the “next” transaction, because it may refer to local context variables available only in that server process.
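The reservation rule can be sketched as below. The class ContextSensitiveScheduler and its acquire and release methods are hypothetical names chosen for illustration; a real TP monitor would track far more state.

```python
from typing import Dict, List, Optional


class ContextSensitiveScheduler:
    def __init__(self, server_ids: List[int]) -> None:
        self.free: List[int] = list(server_ids)
        self.reserved: Dict[str, int] = {}  # user -> server holding that user's context

    def acquire(self, user: str) -> Optional[int]:
        """Return the user's reserved server if any, else any free server."""
        if user in self.reserved:
            return self.reserved[user]
        return self.free.pop(0) if self.free else None

    def release(self, user: str, server_id: int, chained: bool) -> None:
        """Free the server, unless a chained transaction needs its local context."""
        if chained:
            self.reserved[user] = server_id
        else:
            self.reserved.pop(user, None)
            self.free.append(server_id)


sched = ContextSensitiveScheduler([1, 2])
first = sched.acquire("alice")                # alice gets a free server
sched.release("alice", first, chained=True)   # it stays reserved for her next transaction
other = sched.acquire("bob")                  # bob cannot be given alice's server
again = sched.acquire("alice")                # alice returns to the same server
sched.release("alice", again, chained=False)  # the chain ends; the server is free again
```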

Transaction Manager (TM) • Once the transaction program has started, the TP monitor has little to do with transaction management. The coordination of the resource managers is done by the transaction manager.

Transaction Manager (TM) (2) • We want to separate the components exercising transaction control (the transaction manager) from those that do transaction-oriented resource scheduling (TP monitor). • Reason: There are transactions that do not come in through the TP monitor. • Examples: • Ad hoc query interface of SQL systems. • CAD applications run their own terminal environment.

Responsibilities of TP Monitors (1) • The TP monitor brings up the resource managers upon startup. • For restart, the TP monitor only has to bring up the resource managers. The actual recovery protocol is completely handled among the resource managers and the transaction manager.

Responsibilities of TP Monitors (2) • To dispatch a server for a request, the TP monitor creates a process (or reuses an existing one) and loads the code into it. • All the calls among resource managers are so-called transactional remote procedure calls (TRPCs). The mechanisms to handle them are provided by the TP monitor. • Example: BEGIN_WORK is a TRPC to the transaction manager.

Transaction Processing Components • TP monitor’s main tasks: • To handle the incoming requests • To provide the resources for their processing • To hand back the results • Orchestrating the cooperation among the various resource managers is the task of the transaction manager.

Transaction Processing Components

Transactional Remote Procedure Call (TRPC)

Remote Procedure Call (RPC) • An RPC system enables a client program to communicate with server programs on different computers by calling procedures in a way similar to the conventional use of procedure calls in a high-level language. • At the RPC level, a service may be viewed as a module with an interface that exports a set of procedures appropriate for operating on some data abstraction or resource. • From the perspective of client programs, a service provides the same facilities as a software module, enabling clients to import its procedures.

Marshalling • Marshalling is the process of taking a collection of data items and assembling them into a form suitable for transmission in a message. • Flatten structured data items into a sequence of basic data items. • Translate those data items into an external data representation. • Unmarshalling is the process of disassembling them on arrival to produce an equivalent collection of data items at the destination. • Translate the external data representation to the local one. • Unflatten the data item.
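The two marshalling steps and their inverses can be sketched with Python's standard struct module. The record layout (account number, balance, owner name) and the function names marshal and unmarshal are assumptions made for the example; network byte order is used here as the external data representation.

```python
import struct
from typing import Tuple

Account = Tuple[int, float, str]  # (account number, balance, owner name)

HEADER = "!idH"  # "!" = network byte order; int32, float64, uint16 name length


def marshal(account: Account) -> bytes:
    number, balance, owner = account  # flatten into basic data items
    name = owner.encode("utf-8")
    # Translate the basic items into the external representation.
    return struct.pack(HEADER, number, balance, len(name)) + name


def unmarshal(message: bytes) -> Account:
    size = struct.calcsize(HEADER)
    # Translate the external representation back to local values.
    number, balance, length = struct.unpack(HEADER, message[:size])
    owner = message[size:size + length].decode("utf-8")
    return (number, balance, owner)  # unflatten into the structured item


original = (42, 99.5, "Hua")
wire = marshal(original)
restored = unmarshal(wire)
```

The receiver obtains an equivalent collection of data items regardless of the byte order either machine uses internally.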

Message Destinations [Figure: one process calls Send (p, message) and another calls Receive (p, message); the message travels between port q and port p.] • Potential clients need to know an identifier for communicating with a server. • In the Internet protocols, the destination addresses for messages are specified as a port number used by a process and the Internet address of the computer on which it runs.

RPC: Main Tasks The software that supports remote procedure calling has three main tasks: • Interface processing: Integrating the RPC mechanism with client and server programs in conventional programming languages. This includes: • dispatching of request messages to the appropriate procedure in the server; • marshalling and unmarshalling of arguments in the client and the server. • Communication handling: Transmitting and receiving request and reply messages. • Binding: Locating an appropriate server for a particular service.

Building the Client Programs

Building the Client Programs (2) • An RPC system provides a stub procedure to stand in for each remote procedure that is called by the client program. • The purpose of a client stub procedure is to convert a local procedure call to a remote procedure call to the server. • The task of a client stub procedure is to • marshal the arguments and to pack them up with the procedure identifier into a message, • send the message to the server and then await the reply message, • unmarshal it and return the results.
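The three tasks of a client stub can be sketched as follows. This is an illustration only: JSON stands in for the external data representation, the remote procedure add is hypothetical, and loopback_server replaces both the network and the real server so that the example runs in one process.

```python
import json
from typing import Callable

# A transport sends a request message and blocks until the reply arrives.
Transport = Callable[[bytes], bytes]


def make_add_stub(send: Transport) -> Callable[[int, int], int]:
    """Build a client stub that stands in for a remote procedure add(a, b)."""

    def add(a: int, b: int) -> int:
        # 1. Marshal the arguments together with the procedure identifier.
        request = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
        # 2. Send the message to the server and await the reply message.
        reply = send(request)
        # 3. Unmarshal the reply and return the result to the caller.
        return json.loads(reply.decode("utf-8"))["result"]

    return add


def loopback_server(request: bytes) -> bytes:
    """Stand-in for the network and the remote server (assumption for the demo)."""
    call = json.loads(request.decode("utf-8"))
    return json.dumps({"result": sum(call["args"])}).encode("utf-8")


add = make_add_stub(loopback_server)
total = add(2, 3)  # to the client program this looks like a local call
```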

Building the Server Programs

Building the Server Programs (2) • An RPC system provides a dispatcher and a set of server stub procedures. • The dispatcher uses the procedure identifier in the request message to select one of the server stub procedures and pass on the arguments. • The task of a server stub procedure is to • unmarshal the arguments, • call the appropriate service procedure, and • when it returns, marshal the output arguments into a reply message.
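The server side can be sketched in the same spirit. The message format (JSON with "proc" and "args" fields), the service procedures add and concat, and the names make_server_stub and dispatch are assumptions for this example, not part of any particular RPC system.

```python
import json
from typing import Callable, Dict, List


# Service procedures: the actual application code (hypothetical examples).
def add(a: int, b: int) -> int:
    return a + b


def concat(a: str, b: str) -> str:
    return a + b


def make_server_stub(procedure: Callable) -> Callable[[List], bytes]:
    def stub(raw_args: List) -> bytes:
        args = list(raw_args)      # unmarshal the arguments
        result = procedure(*args)  # call the service procedure
        # Marshal the output into a reply message.
        return json.dumps({"result": result}).encode("utf-8")

    return stub


SERVER_STUBS: Dict[str, Callable[[List], bytes]] = {
    "add": make_server_stub(add),
    "concat": make_server_stub(concat),
}


def dispatch(request: bytes) -> bytes:
    """Select a server stub by the procedure identifier and pass on the arguments."""
    message = json.loads(request.decode("utf-8"))
    stub = SERVER_STUBS.get(message["proc"])
    if stub is None:
        return json.dumps({"error": "unknown procedure"}).encode("utf-8")
    return stub(message["args"])


reply = dispatch(b'{"proc": "add", "args": [2, 3]}')
```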

Interface • The types of the arguments and results in the client stub must conform to those expected by the server stub. This is achieved by the use of a common interface definition. • An RPC interface definition specifies those characteristics of the procedures provided by a server that are visible to the server’s clients. • The characteristics that must be defined include the names of the procedures and the types of their parameters.

Interface Compilers Interface compilers can be designed to process interfaces for use with different languages enabling clients and servers written in different languages to communicate by using RPCs.

Binding • An interface definition specifies a textual service name for a server. However, client request messages must be addressed to a server port. • In a distributed system, a Name Service, called a Binder, is used to maintain a table containing mappings from service names to server ports. • When a server process starts executing, it sends a message to the binder requesting it to Register its service name and server port. • When a client process starts, it sends a message to the binder requesting it to LookUp the identifier of the server port of a named service.
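The binder's table and its two operations can be sketched as below. The method names follow the Register and LookUp operations named above; the service name "AccountService" and the address are made-up values for the example, and a real binder would itself be reached through messages rather than direct method calls.

```python
from typing import Dict, Tuple

Port = Tuple[str, int]  # (Internet address, port number)


class Binder:
    """Maintains the table of mappings from service names to server ports."""

    def __init__(self) -> None:
        self._table: Dict[str, Port] = {}

    def register(self, service_name: str, port: Port) -> None:
        """Called by a server process when it starts executing."""
        self._table[service_name] = port

    def lookup(self, service_name: str) -> Port:
        """Called by a client process to find the server port of a named service."""
        if service_name not in self._table:
            raise LookupError("service not registered: " + service_name)
        return self._table[service_name]


binder = Binder()
binder.register("AccountService", ("192.0.2.10", 5000))  # server side
host, port = binder.lookup("AccountService")             # client side
```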
