University of Palestine Faculty of Engineering and Urban Planning Software Engineering Department Distributed Systems ESGD4221 Chapter 1 Eng. Mohammed Timraz Electronics & Communication Engineer Sunday, 13th February 2011
Text Books • G. Coulouris, J. Dollimore, T. Kindberg: Distributed Systems: Concepts and Design. • Tanenbaum & Van Steen: Distributed Systems – • Mei-Ling Liu: Distributed Computing Principles and Applications. • Ken Birman: Reliable Distributed Systems: Technologies, Web Services, and Applications. • S. Mullender: Distributed Systems, Second Edition. • http://java.sun.com/docs/books/tutorial/ • http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/cpref_start.asp
Syllabus • Information Client-Server Programming • Middleware • Java RMI • CORBA • Application Server • Enterprise Java Beans • Web Services • .NET • Scalability • Emerging technologies • Replication • Group Communication and Group Membership • Cloud Computing • Security
Marks • Mid-exam: 20 marks. • Attendance: 5 marks. • Assignments: 5 marks. • Quizzes: 10 marks. • Project: 10 or 20 marks. • Final exam: 50 or 40 marks.
Assignments • Java Application • RMI • CORBA • EJB • Web Services Lecture 1
What is a Distributed System? • A collection of independent computers that appears to its users as a single coherent system. • A collection of autonomous computers linked by a computer network that appear to the users of the system as a single computer. • A number of interconnected autonomous computers that provide services to meet the information processing needs of modern enterprises. • Enabling technology: communications and networking.
Distributed Systems? • System architecture: the machines are autonomous; this means they are computers which, in principle, could work independently; • The user’s perception: the distributed system is perceived as a single system solving a certain problem (even though, in reality, we have several computers placed in different locations).
Distributed Systems? • By running distributed system software, the computers are able to: • coordinate their activities • share resources: hardware, software, data. • According to this definition, the Internet as such is not a distributed system, but an infrastructure on which to implement distributed applications/services (such as the World Wide Web).
Reasons for distribution • Distributed (and mobile) users • Distributed data/information • Distributed organizations • Distributed resources • Distributed information systems. • Why use it? • Sharing of resources, managed by servers and accessed by clients. • Examples: • The World Wide Web (WWW) • Cloud computing infrastructures • Federated and distributed databases • General purpose (university, office automation) • Communication: email, IM, VoIP, social networks
Examples of Distributed Systems: Network of workstations • Personal workstations + processors not assigned to specific users. • Single file system, with all files accessible from all machines in the same way and using the same path name. • For a certain command the system can look for the best place (workstation) to execute it.
Examples of Distributed Systems: Automatic banking (teller machine) system • Primary requirements: security and reliability. • Consistency of replicated data. • Concurrent transactions (operations which involve accounts in different banks; simultaneous access from several users, etc.).
Classifying Distributed Systems • Based on degree of synchrony • Synchronous • Asynchronous • Based on communication medium • Message Passing • Shared Memory • Fault model • Crash failures • Byzantine failures
Next Generation Information Infrastructure [Figure: clients (battlefield visualization, battle planning, electronic commerce, distance learning, collaborative multimedia (telemedicine), collaborative tasks), data servers, and DeviceNets & SensorNets connected by a QoS-enabled wide area network] • Requirements: availability, reliability, quality of service, cost-effectiveness, security
Distributed Computing - Strategic Factors [Figure: factors arranged around four stakeholders: End-User, Application Developer, System Administrator, Organization] • Personalized Environment • Predictable Response • Location Independence • Platform Independence • Flexibility • Increased Complexity • Code Reusability • Real-Time Access to Information • Lack of Management Tools • Interoperability • Scalability • Portability • Faster Development and Deployment of Business Solutions • Reduced Complexity • Changing Technology
The Internet • The Internet is a vast interconnected collection of computer networks of many different types. • It is the dominant distributed system at the current time, although intranets, the public switched telephone network (PSTN) and other networks are still important.
Architectural Models • In complex systems, it is necessary to organize the complexity by partitioning. • Architectural models are ways to organize the parts and structure the relationships between them. • Two fundamental architectural models are the Client/Server model and the Peer-to-Peer model.
The Client/Server Model • The client/server model is organized around clients that request services and servers that provide services. • Services might include information (such as the current weather) or computational services (such as complex calculations). • There may be intermediate layers between the client and server that perform a portion of the task of locating and providing services.
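The request/response pattern on this slide can be sketched with plain TCP sockets. This is a minimal illustration, not a production server: the request names ("weather", "square"), the placeholder reply "sunny", and the helper names handle_request, serve and call are all assumptions made for the example. It also assumes each message fits in a single recv call, which is reasonable only for tiny messages on the loopback interface.

```python
import socket
import threading


def handle_request(request):
    """Server-side service logic (hypothetical request names)."""
    parts = request.split()
    if parts == ["weather"]:
        return "sunny"  # information service; placeholder data, not a real reading
    if len(parts) == 2 and parts[0] == "square":
        try:
            return str(int(parts[1]) ** 2)  # computational service
        except ValueError:
            return "error: not an integer"
    return "error: unknown request"


def serve(listener, request_count):
    """Server: accept a fixed number of clients and answer one request each."""
    for _ in range(request_count):
        conn, _addr = listener.accept()
        with conn:
            request = conn.recv(1024).decode("utf-8")
            conn.sendall(handle_request(request).encode("utf-8"))


def call(port, request):
    """Client: connect, send one request, and return the server's reply."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(request.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


# Port 0 asks the operating system for any free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
port = listener.getsockname()[1]

server_thread = threading.Thread(target=serve, args=(listener, 2))
server_thread.start()

weather = call(port, "weather")
squared = call(port, "square 12")

server_thread.join()
listener.close()
print(weather, squared)  # prints "sunny 144"
```

The client knows only the server's address and the request format; everything about how the answer is produced stays on the server side.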
Client/Server Performance Several factors influence the performance of a distributed system: • The performance of individual workstations. • The speed of the communication infrastructure. • The extent to which reliability (fault tolerance) is provided (replication and preservation of coherence imply large overheads). • Flexibility in workload allocation. • Scalability.
Client/Server Performance Scalability: Distributed systems can operate efficiently at different scales, ranging from a small intranet to the Internet. – A system is scalable if it remains effective when the number of users and resources is increased, i.e. it still provides acceptable performance at high load. – Need to control the cost of physical resources: the cost of adding new hardware to scale the distributed system should be reasonable, e.g. the correspondence between DNS names and IP addresses. – Need to prevent software resources running out: e.g. the 32-bit IP address space did not scale, so the new version of the IP protocol (IPv6) uses 128-bit addresses.
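To put the address-size example in numbers, the short calculation below compares the two address spaces. The 32-bit and 128-bit widths come from the slide; the variable names are only for illustration.

```python
IPV4_BITS = 32    # address width of the original IP protocol (IPv4)
IPV6_BITS = 128   # address width of the new version (IPv6)

ipv4_addresses = 2 ** IPV4_BITS
ipv6_addresses = 2 ** IPV6_BITS
growth_factor = ipv6_addresses // ipv4_addresses

print(f"IPv4 addresses: {ipv4_addresses:,}")    # 4,294,967,296
print(f"IPv6 addresses: {ipv6_addresses:.3e}")  # about 3.403e+38
print(f"IPv6 is 2**{IPV6_BITS - IPV4_BITS} times larger")
```

About 4.3 billion IPv4 addresses is fewer than one per person on Earth, which is why the 32-bit field is the classic example of a software resource running out.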
Client/Server Performance • Performance, scalability and mobility of the client/server model can be improved by: • Partitioning or replicating data on servers • Caching data at proxy servers or clients • Using mobile code and mobile agents • Adding and removing mobile devices
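As an illustration of the caching bullet, here is a minimal sketch of a proxy that keeps copies of responses so repeated requests never reach the origin server. The class names OriginServer and CachingProxy and the sample page are hypothetical, and the sketch deliberately omits cache invalidation, so a cached copy can become stale if the origin data changes.

```python
class OriginServer:
    """Stands in for the real server; counts how many requests reach it."""

    def __init__(self, pages):
        self._pages = dict(pages)
        self.requests_served = 0

    def get(self, path):
        self.requests_served += 1
        return self._pages[path]


class CachingProxy:
    """Answers from its local cache when it can, otherwise asks the origin."""

    def __init__(self, origin):
        self._origin = origin
        self._cache = {}

    def get(self, path):
        if path not in self._cache:
            # Cache miss: fetch once from the origin and remember the result.
            self._cache[path] = self._origin.get(path)
        return self._cache[path]


origin = OriginServer({"/index.html": "<h1>home</h1>"})
proxy = CachingProxy(origin)

first = proxy.get("/index.html")   # miss: forwarded to the origin
second = proxy.get("/index.html")  # hit: served from the proxy's cache

print(origin.requests_served)  # prints 1
```

The trade-off is the one noted under reliability: keeping cached or replicated copies coherent with the original costs extra work, which this sketch avoids by never updating the data.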
The Peer-to-Peer Model • In peer-to-peer models, each node of a distributed system is capable of requesting and providing services. • Most of the services available are available from several or even many nodes. • Nodes tend to come and go from the network frequently, but redundancy tends to keep most services available.
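The following sketch illustrates the three bullets above in a single process: every peer can act as both provider and requester, a file is held by more than one peer, and the service survives one peer leaving. The Peer class, the peer names and the file are invented for the example; real peer-to-peer systems add discovery, routing and networking, which are left out here.

```python
class Peer:
    """A node that can both provide files and request them from other peers."""

    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)
        self.online = True

    def provide(self, filename):
        """Server role: return the content, or None if this peer cannot help."""
        if not self.online:
            return None
        return self.files.get(filename)

    def request(self, filename, peers):
        """Client role: ask the other peers in turn until one has the file."""
        for peer in peers:
            if peer is self:
                continue
            content = peer.provide(filename)
            if content is not None:
                return peer.name, content
        return None


alice = Peer("alice", {"notes.txt": "lecture 1"})
bob = Peer("bob", {"notes.txt": "lecture 1"})   # redundant copy
carol = Peer("carol", {})
network = [alice, bob, carol]

before = carol.request("notes.txt", network)  # served by alice
alice.online = False                          # alice leaves the network
after = carol.request("notes.txt", network)   # still available, from bob
bob.online = False
gone = carol.request("notes.txt", network)    # no copy left online

print(before, after, gone)
```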
Communications • Probably the key concern of a distributed system is the communication between nodes of the system. • It can be argued that computers are improperly named. While they can and do provide computations, their dominant use is to provide information, so communicator might be a more accurate description than computer.
Fundamental models • Fundamental models describe properties common to all architectural models, often focusing on communications between nodes. • The interaction model deals with performance and time limits. • The failure model specifies faults and defines reliable communication. • The security model describes threats to processes and communication channels.
Problems for Distributed systems • Distributed systems can be used in many different ways by persons or systems with different objectives. • Distributed systems can include a wide variety of hardware, operating systems, networks and applications. • Many types of internal system problems are possible. • There are many external threats that may affect a system.
Architectural Modeling • Architecture details a system by the organization of its components. • The components might be nodes on a network, layers of software, collections of services, or other ways of partitioning the system. • For example, one way of partitioning a system is to classify processes as client processes, server processes or peer processes.
Architectural Views [Figure: System Architecture, Application Architecture, Business Architecture, Physical Architecture and Operations Architecture; labels: Presentation Layer, Business Application Layer, Product Information Layer, Service, Resource, Support]
Some Client/Server Architectural Patterns • Thin-Client model: All application processing and data management is done by the server. • Fat-Client model: The server is responsible only for data management. The client machine implements the application logic and the interactions with the user. • Three-tier client-server: There is a layer between client and server that may provide data and/or application processing.
Client/Server Configurations • Client-Server configurations are generally categorized in two types: • Two-tier Configurations: consisting of a client and a server. [Figure: Client (e.g. User Interface) connected to Server (e.g. Data)]
Client/Server Configurations (cont’d) • Three-tier Configurations: include another server that offloads certain functions from the client, the server, or both. Since there can be many intermediate servers, this may also be called n-tier. [Figure: Client (e.g. User Interface), Mid-tier for processing, Server (e.g. Data)]
Client/Server Configurations (cont’d) • N-Tier Configurations: a type of three-tier setup that includes multiple intermediate servers. • Processing and data storage can occur on any node, as shown on the next page (using a three-tier configuration). [Figure: Client, several Mid-tier servers, Server (e.g. Data)]
[Figure: alternative placements of User Interface, Processing and Data across the nodes of a three-tier configuration]
Fat vs. Thin Clients • Thin Client: In a client/server model, a client is called a “thin client” when it does only a small amount of processing. Most of the processing is done by the middle tier (in a three-tier setup) or the server. [Figure: placement of User Interface, Processing and Data for a thin client]
Fat vs. Thin Clients (cont’d) • Fat Client: A “fat client” is a client that does a great deal of processing (which may include business logic), usually more than the server. [Figure: placement of User Interface, Processing and Data for a fat client]
More Architectural Patterns for Distributed Software Systems • Multiprocessor: Common for large real-time systems and critical systems, to improve the performance and resilience of the system. • Distributed Object: Removes the distinction between client and server. The fundamental system components are objects that provide an interface to a set of services. Other objects call these services with no logical distinction between the provider and receiver. • Peer-to-Peer: Similar to distributed objects, except that components are systems, not objects.
Typical Layers in a Distributed System (top to bottom) • Applications and Services • Middleware • Operating System • Computer and Network Hardware
Typical Corporate Application (multi-tier client-server) [Figure: clients (browsers) on the intranet and, through a firewall (security), on the Internet connect over HTTP to the web server(s); the web servers run CGI, ASP, CF or WS and connect to the database server through ODBC. Tiers: presentation, application processing, data management.] • ODBC: Open Database Connectivity • CGI: Common Gateway Interface • ASP: Active Server Pages (Microsoft) • CF: Cold Fusion (Allaire Corp.) • WS: WebSphere (IBM Corp.)
What is Middleware? • Middleware is the software between your application and the operating system and networking on a computer. It is the layer above the operating system but below the application program that provides a common programming abstraction across a distributed system. • It can be called the “/” in Client/Server. • The classical definition of an operating system is “software that makes hardware usable”. Similarly, middleware can be considered to be the software that makes a distributed system programmable.
What is Middleware? • Layer between Application and OS/Network • Provides: distribution transparency; communication infrastructure; registration and lookup of remote services • Resolves heterogeneity of: hardware/OS; networks; programming languages
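Python's standard xmlrpc package is not one of the middleware products covered in this course, but it is a convenient way to see the three services from this slide in a few lines: a communication infrastructure (HTTP), registration of a remote service under a name, and a client-side proxy that makes the remote call look like a local one. The function add and the variable names are assumptions for the example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The service implementation; it contains no networking code at all."""
    return a + b


# Server side: the middleware owns the socket and the request handling.
# Port 0 asks the operating system for any free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add, "add")  # registration under a public name
port = server.server_address[1]

server_thread = threading.Thread(target=server.serve_forever)
server_thread.start()

# Client side: the proxy hides marshalling, HTTP and the server's location.
with ServerProxy(f"http://127.0.0.1:{port}/") as calculator:
    result = calculator.add(2, 3)  # reads like a local method call

server.shutdown()
server.server_close()
server_thread.join()
print(result)  # prints 5
```

Neither add nor the calling code touches sockets or data encoding, which is the sense in which middleware makes a distributed system programmable.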
ISO: forms of transparency • Access – hide differences in data representation and how a resource is accessed • Location – hide where a resource is located • Migration – hide that a resource may move to another location • Relocation – hide that a resource may move while in use • Replication – hide that a resource is replicated
ISO: transparency continued • Concurrency – hide that a resource may be shared by several competing users • Failure – hide the failure and recovery of a resource • Persistence – hide whether a software resource is in memory or on disk
Pros and Cons of Middleware • Pros: • Reduces the number of interfaces. • Clients see only one system, i.e. the middleware. • Centralizes control. • Functionality is widely available to all clients. • Makes it possible to implement functionality that would otherwise be very difficult to provide. • Cons: • Complex software. • A development platform (API), not a complete system. • Functionality is hard to understand.
Standard Interfaces • Middleware provides a comprehensive set of higher-level distributed computing capabilities and a set of standards-based interfaces. • These interfaces allow applications to be distributed more easily and to take advantage of other services provided over the network.
Building Blocks • Application developers and system integrators can use middleware services as building blocks to construct enterprise-wide information systems that use distributed computing resources effectively.
Position of Middleware in OSI Model [Figure: OSI layers, top to bottom: Application, Presentation, Session, Transport, Network, Data Link, Physical; annotated with the middleware functions: complex data conversion, operations and objects, communication]
Examples of Middleware • Distributed Computing Environment (DCE) from OSF: based on RPC and IDL • CORBA from OMG: based on objects and IDL interfaces; objects locate each other through the ORB; defines the Internet Inter-ORB Protocol (IIOP) • DCOM from Microsoft: builds ORPC on top of DCE RPC; supports integration of binary components from different languages (e.g. VB, Java, C++)
Examples of Middleware (cont.) • Remote Method Invocation (RMI): based on a single language (Java); remote interfaces are defined as Java interfaces, and CORBA IDL comes into play only when RMI is run over IIOP • Jini from Sun: based on RMI; proposed for network-aware appliances
High-Level Services • Many middleware services are high-level services • Example: a single request of the data access service can retrieve many rows of information from one or more remote SQL relational databases (less coding effort)
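The "one request, many rows" idea can be sketched with Python's built-in sqlite3 module. This is only a stand-in: the database here is local and in memory, whereas the slide is about remote SQL databases reached through a data access service, and the customers table and its contents are invented for the example.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a remote SQL server.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT)")
connection.executemany(
    "INSERT INTO customers (city) VALUES (?)",
    [("Gaza",), ("Gaza",), ("Rafah",)],
)

# A single high-level request returns every matching row; the application
# writes no code for connections, message formats or row-by-row transfer.
rows = connection.execute(
    "SELECT id, city FROM customers WHERE city = ? ORDER BY id",
    ("Gaza",),
).fetchall()

connection.close()
print(rows)  # prints [(1, 'Gaza'), (2, 'Gaza')]
```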
Evolving Middleware for Application Distributability • Traditionally, the burden of distributing the application’s functions across nodes in a network fell largely on the application programmer. • The transport interface code is divided into transport-independent and transport-dependent parts. • Middleware makes the transport interface code transparent to the application.
Structure of Traditional Distributed Applications • The application includes the application logic and two types of networking code: • Application protocol support code. • Example: The X protocol for transmitting graphical images. • Transport Interface Code makes the appropriate network calls to send and receive the messages that make up the application protocol over a specific network transport.
Traditional Distributed Applications [Figure: Node 1 and Node 2 each stack an application program (Appl. Program 1, Appl. Program 2) on application protocol support code on transport interface code; the application protocol connects the two support-code layers and the transport protocol connects the two transport-interface layers]
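The two kinds of networking code in a traditional distributed application can be separated as in the sketch below. Everything named here is an assumption for illustration: the "DRAW x y" message is an invented, much simplified stand-in for a graphics protocol such as X, the 4-byte length prefix is one common framing choice, and socket.socketpair() plays the role of the connection between Node 1 and Node 2.

```python
import socket
import struct


# --- Transport interface code: moves opaque byte messages over a socket. ---

def send_message(sock, payload):
    """Send one message, prefixed with its length as a 4-byte big-endian integer."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exactly(sock, count):
    """Read exactly count bytes, looping because recv may return fewer."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)


# --- Application protocol support code: gives the bytes a meaning. ---

def encode_draw(x, y):
    """Build a hypothetical DRAW message for the point (x, y)."""
    return f"DRAW {x} {y}".encode("ascii")


def decode_draw(payload):
    """Parse a DRAW message back into its coordinates."""
    verb, x, y = payload.decode("ascii").split()
    if verb != "DRAW":
        raise ValueError(f"unexpected message type: {verb}")
    return int(x), int(y)


# --- Application logic on the two nodes. ---
node1, node2 = socket.socketpair()
send_message(node1, encode_draw(10, 20))
point = decode_draw(recv_message(node2))
node1.close()
node2.close()
print(point)  # prints (10, 20)
```

Only the three transport functions would change if the application moved to a different network transport; this is the part that middleware later takes off the programmer's hands.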