Subject Revision Distributed Systems: Principles and Paradigms

Subject Revision Distributed Systems: Principles and Paradigms Dr. Christian Vecchiola Postdoctoral Research Fellow csve@unimelb.edu.au Cloud Computing and Distributed Systems (CLOUDS) Lab Dept. of Computer Science and Software Engineering The University of Melbourne

Outline • Part I • Distributed Systems Fundamentals • Socket Programming • Multithreaded Programming • Part II • Operating Systems • Distributed System Models • Programming with Distributed Objects • Part III • Security • Distributed File System • Naming Services • Conclusions

Part I

Part I • Distributed Systems • Definitions • Coulouris, Dollimore, Kindberg: “A system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing.” • Tanenbaum and Van Steen: “A distributed system is a collection of independent computers that appear to the users of the system as a single computer.” • Leslie Lamport: “A distributed system is one on which I cannot get any work done because some machine I have never heard of has crashed.”

Part I • Distributed Systems • Definitions • Main use: • Connecting Users and Resources • Characteristics (a distributed system is the result of the cooperative action of distinct computing elements): • Heterogeneity: distributed systems are heterogeneous in terms of hardware architectures, operating systems, programming languages, software interfaces, communication models, information representation, and security measures. • Lack of Global Clock: there is no single time reference (one cannot be applied efficiently). • Concurrent and Autonomous: each system performs its activities together with the others. • Independent Failures: for these reasons, failures are unrelated and can occur at any time. • Desired Properties: • Transparency: hide the separation of components from users and developers, so that the system is perceived as a whole (location, access, migration, concurrency, failure, replication, application). • Openness: the ability to interoperate with other systems and integrate new features with minimal changes, by means of open interfaces and standards. • Scalability: the ability to exhibit acceptable performance as one or more dimensions of the system increase (number of nodes, users, operations). • Enhanced Availability: the ability to serve users despite the presence of failures. A failure in the system should never bring the system down. “Fault tolerance: no failure despite faults.”

Part I • Distributed Systems • Examples: • Clusters • Grids • Clouds • The Internet! • Practical Applications • Communities (Virtual teams, organizations, social networks) • Science (e-Science) • Business (e-Business)

Part I • Socket Programming • Client Server Computing • Request – Response model • Two roles: • Client: makes requests and waits for responses • Server: serves requests and produces responses • Figure: the client sends a request to the server over the network, and the server returns the result.

Part I • Socket Programming • Technologies for communication • Java RMI • RPC • .NET Remoting • AJAX-based • CORBA • FTP, HTTP, SMTP • All of them are based on the Client Server model and use Socket Programming (at some level) as the communication abstraction and implementation technology.

Part I • Socket Programming • Definition “A socket defines an endpoint of a communication between two processes and it is identified by an address (mostly IP) and a port.” • Where do sockets fit? Sockets are a programming interface for communication between two endpoints. They provide an interface for programming networks at the transport layer. • TCP/IP stack (top to bottom): • Application (HTTP, FTP, Telnet, …) • Transport (TCP, UDP, …) • Network (IP, …) • Link (device driver)

Part I • Socket Programming • Properties • A socket is identified by an address and a port • Sockets are implemented on top of TCP/UDP • Network communication is similar to file I/O • A socket handle is treated as a file handle • Socket programming is language independent • The socket abstraction is identical in all languages • Allows for interoperation between different languages and platforms • There are two types of socket: • Server socket: the server component of the Client-Server model. • Client socket: the client component of the Client-Server model.

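Part I • Socket Programming • Figure: the client creates a user socket on a random local port (39332) and calls Connect(128.250.25.158, 1254) to reach the server socket listening on port 1254 at 128.250.25.158. The server address can also be a host name such as mandroo.cs.mu.oz.au. Once connected, the two sides exchange data through an input stream and an output stream.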
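
The connection sequence in the figure can be sketched in Java with java.net.ServerSocket and java.net.Socket. This is a minimal illustration, not the lecture's own code: the class name EchoDemo and its methods are hypothetical, the server echoes a single line, and both ends run in one process over the loopback interface on an OS-assigned port rather than on 128.250.25.158:1254.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class EchoDemo {
    // Server side: accept one connection and echo a single line back.
    static void serveOnce(ServerSocket server) throws IOException {
        try (Socket conn = server.accept();
             BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
             PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
            out.println(in.readLine());
        }
    }

    // Client side: connect, send one line, and wait for the response.
    static String request(InetAddress host, int port, String line) throws IOException {
        try (Socket socket = new Socket(host, port);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
            out.println(line);
            return in.readLine();
        }
    }

    // Runs the server and the client in one process over the loopback interface.
    static String roundTrip(String line) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port (assumption: the slide's fixed port is not needed here).
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread serverThread = new Thread(() -> {
                try {
                    serveOnce(server);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            serverThread.start();
            String reply = request(loopback, server.getLocalPort(), line);
            serverThread.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello"));
    }
}
```

The streams obtained from the socket are read and written like file streams, which is the sense in which network communication resembles file I/O.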

Part I • Socket Programming • APIs for Socket Programming • Java • java.net.ServerSocket • java.net.Socket • java.net.DatagramSocket • java.net.DatagramPacket • .NET • System.Net.Sockets.Socket • System.Net.Sockets.TcpClient • System.Net.Sockets.TcpListener • System.Net.Sockets.UdpClient • Sockets are mostly used over TCP, which provides a connection-oriented communication model. UDP provides a connectionless communication model based on the concept of a packet.
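
For contrast with the connection-oriented case, the following sketch sends a single UDP packet with java.net.DatagramSocket and java.net.DatagramPacket. The class name DatagramDemo, the loopback address, the OS-assigned port, and the two-second timeout are illustrative assumptions.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class DatagramDemo {
    // Sends one datagram to a receiver on the loopback interface and returns what arrived.
    static String sendAndReceive(String message) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback); // port 0: any free port
             DatagramSocket sender = new DatagramSocket()) {
            // UDP gives no delivery guarantee, so do not block forever on receive.
            receiver.setSoTimeout(2000);

            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            // No connection is set up: each packet carries its own destination address and port.
            sender.send(new DatagramPacket(payload, payload.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            receiver.receive(packet);
            return new String(packet.getData(), 0, packet.getLength(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendAndReceive("ping"));
    }
}
```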

Part I • Multi-threaded Programming • Modern systems… • …perform multiple operations at the same time • …host multiple processes running concurrently • Multitasking examples: games, web & email, office automation, multimedia, pictures

Part I • Multi-threaded Programming • Modern applications… • …perform multiple operations at the same time • …host multiple threads running concurrently • Multithreading examples (threads within one application): background printing, GUI rendering, application core logic

Part I • Multi-threaded Programming • Characterizing a Thread “A thread is a set of instructions that are executed sequentially within a program.” public class SingleThread { public static void main(String[] args) { …… … for (int i = 0; i < args.length; i++) { if (args[i].equals("-r")) { SingleThread.executeOptionR(args); } } } private static void executeOptionR(String[] args) { …… … } } (one thread)

Part I • Multi-threaded Programming • Characterizing a Thread “A multi-threaded application executes multiple threads within the same process space.” public class MultiThread { public static void main(String[] args) { …… … for (int i = 0; i < args.length; i++) { if (args[i].equals("-op1")) { Thread t1 = new Thread(new Op1()); t1.start(); } if (args[i].equals("-op2")) { Thread t2 = new Thread(new Op2()); t2.start(); } } } } class Op1 implements Runnable { public void run() { …… … } } class Op2 implements Runnable { public void run() { …… … } } (up to three threads: main, t1, t2)

Part I • Multi-threaded Programming • Threads and Processes • Multithreaded process: • When a process starts, a default (main) thread is created. • From this thread other threads can be created. • Each of these threads can run for a different period. • Each of these threads can create new threads. • All the threads have access to a shared memory space. • Figure: execution timeline of a multithreaded process, in which the main thread and the threads it creates form multiple execution streams over a common address space.
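
A small sketch of the points above: the main thread creates worker threads, and all of them write into one array that lives in the shared address space. The class name SharedMemoryDemo and the squaring task are hypothetical and chosen only for illustration.

```java
class SharedMemoryDemo {
    // The main thread creates n workers; worker i writes i*i into its own slot of a shared array.
    static int[] squares(int n) throws InterruptedException {
        int[] results = new int[n];            // visible to every thread of the process
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            final int slot = i;                // each worker gets its own index
            workers[i] = new Thread(() -> results[slot] = slot * slot);
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();                     // wait for the worker; its writes are visible after join
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(java.util.Arrays.toString(squares(4))); // [0, 1, 4, 9]
    }
}
```

Because every worker writes to a distinct slot, no locking is needed here; the synchronization slides cover the case where threads touch the same data.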

Part I • Multi-threaded Programming • What is the use of threads? • Parallelism and concurrent execution of independent tasks / operations. • Implementation of reactive user interfaces. • Non blocking I/O operations. • Asynchronous behavior. • Timer and alarms implementation. • Major uses: • Throughput computing • GUI rendering

Part I • Multi-threaded Programming • Multi-threaded Servers • Improve the throughput of server applications • Three Architectures • Thread per request • Thread per connection • Thread per component/object • Use of pooling to optimize OS resources.
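
One possible shape for a pooled multi-threaded server, sketched under assumptions: each accepted connection carries exactly one request and is handed to a worker from a fixed-size java.util.concurrent pool, so threads are reused instead of created per connection. The class PooledServerDemo, the upper-casing "service", the pool size of 4, and the loopback setup are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledServerDemo {
    // Worker task: read one line from the connection and reply in upper case.
    static void handle(Socket conn) {
        try (Socket c = conn;
             BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            String line = in.readLine();
            out.println(line == null ? "" : line.toUpperCase());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Client side: one connection per request.
    static String request(int port, String line) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
            out.println(line);
            return in.readLine();
        }
    }

    // Starts a server, sends every request to it, and returns the replies in order.
    static List<String> serve(List<String> requests) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4); // worker threads are reused
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> {
                try {
                    for (int i = 0; i < requests.size(); i++) {
                        Socket conn = server.accept();
                        pool.execute(() -> handle(conn)); // hand off, then keep accepting
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            acceptor.start();
            List<String> replies = new ArrayList<>();
            for (String r : requests) {
                replies.add(request(server.getLocalPort(), r));
            }
            acceptor.join();
            return replies;
        } finally {
            pool.shutdown(); // let the worker threads terminate
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serve(List.of("alpha", "beta", "gamma")));
    }
}
```

The accept loop never performs request work itself, which is what lets the server keep accepting while earlier requests are still being served.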

Part I • Multi-threaded Programming • Synchronization • The use of multiple execution streams executed concurrently leads to… • … contention … • … inconsistency of state … • … unexpected / unpredictable behavior … … while accessing shared resources. • Thread synchronization helps solve these issues by using… • … appropriate programming patterns • … appropriate programming language constructs
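
As an illustration of a language construct for exclusive access, the sketch below guards a shared counter with Java's synchronized keyword. The class SafeCounter and the thread and iteration counts are hypothetical. Without synchronized, value++ (a read, an add, and a write) could interleave across threads and lose updates.

```java
class SafeCounter {
    private int value = 0;

    // synchronized: only one thread at a time may run these methods on the same object.
    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }

    // Starts the given number of threads, each incrementing one shared counter.
    static int countWithThreads(int threads, int incrementsPerThread) throws InterruptedException {
        SafeCounter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWithThreads(4, 10_000)); // 40000
    }
}
```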

Part I • Multi-threaded Programming • Synchronization • There are several issues introduced by the use of shared resources • exclusive access • concurrent and compatible accesses • proper sequencing of operations • deadlock avoidance techniques • Basic techniques • use of lock/synchronized for exclusive access • use of wait-notify / wait-set patterns • resource acquisition ordering
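
The wait-notify pattern listed above can be sketched with a one-element hand-off buffer: the producer waits while the slot is full, the consumer waits while it is empty, and each notifies the other after changing the state. The classes Slot and HandOffDemo are hypothetical names for this illustration.

```java
class Slot<T> {
    private T item; // null means the slot is empty

    // Blocks until the slot is empty, then stores the value and wakes up waiting threads.
    synchronized void put(T value) throws InterruptedException {
        while (item != null) {
            wait();          // releases the lock while waiting
        }
        item = value;
        notifyAll();
    }

    // Blocks until the slot is full, then removes the value and wakes up waiting threads.
    synchronized T take() throws InterruptedException {
        while (item == null) {
            wait();
        }
        T value = item;
        item = null;
        notifyAll();
        return value;
    }
}

class HandOffDemo {
    // A producer thread puts 1..n into the slot; the calling thread takes and sums them.
    static int sumThroughSlot(int n) throws InterruptedException {
        Slot<Integer> slot = new Slot<>();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) {
                    slot.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += slot.take();
        }
        producer.join();
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumThroughSlot(100)); // 5050
    }
}
```

The condition is re-checked in a while loop rather than an if, because a thread can be woken up when the condition it waits for does not hold.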

Part II

Part II • Operating Systems • Constitute… • … a common substrate for all applications • … an essential component for DS • Operations: • Resource management (memory, CPU, disks) • Application scheduling (programs and services) • Device access (printers, ad-hoc devices, etc.) • User management

Part II • Operating Systems • Main Design Principles • The two key examples of kernel design approaches are: • Monolithic • Microkernel • Key difference: what belongs in the kernel? • Three main models: • Monolithic OS • Layered OS • Microkernel-based OS • The first two can be considered monolithic. • The Chambers 20th Century Dictionary defines a monolith as a pillar or column of a single stone, or anything resembling a monolith in massiveness.

Part II • Operating Systems • Monolithic vs Microkernel • Figure: servers S1 to S4 in a monolithic kernel and in a micro-kernel. Legend: server; kernel code and data; dynamically loaded server program.

Part II • Operating Systems • Monolithic Kernel OS • Better application performance • Hard to extend • Example: MS-DOS • Figure (top to bottom): application programs (user mode); system services (kernel mode); hardware.

Part II • Operating Systems • Layered Kernel OS • Figure (top to bottom): application programs (user mode); system services, memory & I/O device management, and process scheduler (kernel mode); hardware.

Part II • Operating Systems • Micro Kernel OS • Tiny OS kernel providing basic primitives (process, memory, IPC) • Traditional services become subsystems • OS = Microkernel + User Subsystems • Examples: Mach, PARAS, and Chorus • Figure: the client application, OS emulators, file server, network server, and display server run in user mode and exchange Send/Reply messages through the microkernel, which runs in kernel mode on top of the hardware.

Part II • Operating Systems • Pros and Cons • Micro-Kernel main advantages: • Extensibility and the ability to enforce modularity behind memory protection boundaries • A relatively small kernel is more likely to be free of bugs than one that is larger and more complex. • Monolithic OS main advantage: • Operations can be invoked relatively efficiently, because an invocation into a separate user-level address space, even on the same node, is more costly than an invocation within the kernel.

Part II • Distributed Systems Models • Distributed system models help in… • …classifying and understanding different implementations • …identifying their weaknesses and their strengths • …crafting new systems out of pre-validated building blocks • We will study distributed system models from different perspectives • Structure, organization, and placement of components • Interactions • Fundamental properties of systems

Part II • Distributed Systems Models • Three different perspectives to study from: • Architectural models • Capture structure and organization of systems • Define placement and interaction among components • Design requirements • Express goals in terms of performance and reliability • Fundamental models • Based on the fundamental properties • They give insights on.. • …characteristics of the systems • …associated potential failures and security risks

Part II • Distributed Systems Models • Architectural Models • First rough classification (by process types): • Server processes and client processes (Client-Server Systems) • Peer processes (Peer-to-Peer Systems) • (Possible variations and compositions) • Other Models: • Mobile Code based systems • Dynamic code, so security is a concern • Ad-hoc Systems (based on proximity networks) • High dynamism and volatility, more heterogeneity • Limited capabilities

Part II • Distributed Systems Models • Architectural Models • Layered model (reference architecture) • Advantages • Breaking up complexity • Decomposition of functions and responsibilities • Different levels of abstraction • Layers (top to bottom): Application & Services; Middleware; Operating System; Computer and Network Hardware • Distributed systems cover part of this layered architecture and might be internally organized into layers as well.

Part II • Distributed Systems Models • Architectural Models • Client Server Model • Mostly cited in the case of distributed systems. • Most widely employed. • Based on: • Two roles: server and client • Communication pattern: • asymmetric • request (client) – response (server) • Examples • HTTP, SMTP, DNS, NNTP

Part II • Distributed Systems Models • Architectural Models • Client-Server • Two-tier model (classic): client, server • Three-tier (when the server becomes a client): client, server/client, server • Multi-tier (cascade model): client, server/client, server/client, server

Part II • Distributed Systems Models • Architectural Models • Peer-to-Peer Model • All the processes play a similar role. • No distinction between server and client: each component plays both roles. • Cooperative interaction. • Avoids centralization and a potential single point of failure (SPOF). • More difficult to manage. • Provides a more scalable infrastructure (1000s of hosts). • Examples • P2P file sharing (OpenNAP, eMule, etc.). • Distributed hash tables.

Part II • Distributed Systems Models • Architectural Models • Peer-to-Peer Model • Figure: a network of interconnected peers.

Part II • Distributed Systems Models • Architectural Models • Variations of the previous two models • Multiple Server (kind of multi-tiers) • Cache and Proxy architectures • Mobile Code • Mobile Agents • Network Computers • Thin Clients • Mobile devices and ad-hoc networking

Part II • Distributed Systems Models • Design Requirements • Aspects to be considered: • Performance • Responsiveness • Throughput • Load-balancing • Quality of Service • Analysis of non-functional properties of systems • Applications: • Reliability & Security • Performance, and Adaptability

Part II • Distributed Systems Models • Design Requirements • Aspects to be considered: • Data and Replica management (caching) • Increase of throughput and availability • Dependability • How much can we trust and rely on a system? • How can it be measured? • Attributes: Availability, Reliability, Safety, Integrity, Confidentiality, Maintainability. • What threatens it? • Faults, Failures, Errors • What means support dependability? • Prevention, Fault Tolerance, Forecasting

Part II • Distributed Systems Models • Fundamental Models • Address the following questions: • What are the main entities in the system? • How do they interact? • What are the characteristics that affect their individual and collective behavior? • Aspects to consider: • Interaction: communication and coordination • Failure: classification of Faults/Failures • Security: classify attacks and devise potential countermeasures

Part II • Distributed Systems Models • Fundamental Models • Interaction Model • Facts: • Communication takes place with delays (often of considerable duration) • Delays and the absence of global time limit the accuracy with which we can coordinate processes. • What are the elements of interest? • Performance of Communication Channels • Computer Clocks and Timing Events • Synchronous vs Asynchronous models • Event Ordering (Lamport: logical event ordering)
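
Lamport's logical event ordering replaces the missing global clock with a per-process counter: the counter advances on every local event or send, and on receive it jumps past the timestamp carried by the message. The sketch below uses hypothetical names (LamportClock, LamportDemo) and simulates two processes, p and q, inside one program.

```java
class LamportClock {
    private long time = 0;

    // Local event or message send: advance the clock and return the new timestamp.
    synchronized long tick() {
        return ++time;
    }

    // Message receive: move past both the local clock and the sender's timestamp.
    synchronized long receive(long senderTime) {
        time = Math.max(time, senderTime) + 1;
        return time;
    }
}

class LamportDemo {
    // Simulates p sending one message to q; returns {send timestamp, receive timestamp}.
    static long[] exchange() {
        LamportClock p = new LamportClock();
        LamportClock q = new LamportClock();
        p.tick();                          // p: local event, clock = 1
        long sent = p.tick();              // p: send event, timestamp = 2
        q.tick();                          // q: local event, clock = 1
        long received = q.receive(sent);   // q: max(1, 2) + 1 = 3
        return new long[] {sent, received};
    }

    public static void main(String[] args) {
        long[] stamps = exchange();
        System.out.println("sent=" + stamps[0] + " received=" + stamps[1]); // sent=2 received=3
    }
}
```

The receive timestamp is always greater than the send timestamp, so the ordering of causally related events is preserved even though the two clocks never agree on wall-clock time.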

Part II • Distributed Systems Models • Fundamental Models • Failure Model • What is a failure? • Processes and communication channels may depart from their expected behavior. • What is a failure model? • Defines the ways in which failures may occur, in order to provide an understanding of the effects they can cause. • Observations • Different kinds of failures can be addressed differently • Different kinds of failures denote different (major or minor) problems • Classification: omission, arbitrary, and timing failures.

Part II • Distributed Systems Models • Fundamental Models • Security Model • Goals: • Securing the processes composing the system. • Protecting the objects they encapsulate against unauthorized access. • Avoiding unauthorized access • Identifying requests (Principal) • Securing the channels they use to communicate. • Encryption and Cryptography • Measures: • Creation of a threat model.

Part II • Distributed Systems Models • Fundamental Models • Security Model • Threat Model • A careful analysis of all the aspects of a DS (hardware, software, network, and human) makes it possible to build a threat model. • The threat model lists all the potential attacks that the system might be exposed to. • Security costs have to be balanced against these attacks (“how much is your enemy willing to pay to break your security?”)

Part II • Distributed Objects Programming • Reference Model • Object Oriented Programming • Stub-skeleton model (aka. Client-Server) • Concept of Proxy • Technologies Studied • Java RMI • .NET Remoting • CORBA (no samples) • Web Services

Part II • Distributed Objects Programming • Java RMI (Remote Method Invocation) • Implements an RPC model for objects in Java • Components • Java Remote Object: java.rmi.server.UnicastRemoteObject • Java Remote Interface: java.rmi.Remote • Java Client (Stub) object • Java RMI Registry: java.rmi.Naming (rmiregistry) • Properties • Transparent method invocation pattern over the network • Automatic parameter and return value marshalling

Part II • Distributed Objects Programming • Java RMI (Remote Method Invocation) • System View • Figure: the RMI client, the RMI server, and the RMI registry communicate with one another over RMI; the client and the server reach web servers over URL protocols.

Part II • Distributed Objects Programming • Java RMI (Remote Method Invocation) • Architecture • Client JVM: Object A, Proxy for B, Remote Reference Module, Communication Module • Server JVM: Object B, Skeleton & Dispatcher for B, Remote Reference Module, Communication Module • The Request travels from the client to the server and the Reply travels back.

Part II • Distributed Objects Programming • Java RMI (Remote Method Invocation) • RMI System Design • Remote Interface • Exposes the set of methods and properties available • Defines the contract between the client and the server • Constitutes the root for both stub and skeleton • Servant component • Represents the remote object (skeleton) • Implements the remote interface • Server component • Main driver that makes the servant available • It usually registers with the naming service • Client component • Example: RemoteTime
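
The slide names a RemoteTime example without showing it, so the following is only a guess at its shape, not the original code. It lays out the four components in one file: a remote interface, a servant, a server step that registers the servant, and a client step that looks it up. The method name currentTimeMillis, the binding name "RemoteTime", the port 20990, and running the registry, server, and client inside a single JVM on 127.0.0.1 are all assumptions made to keep the sketch self-contained.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Remote interface: the contract shared by client and server.
interface RemoteTime extends Remote {
    long currentTimeMillis() throws RemoteException;
}

// Servant: implements the remote interface; the superclass constructor exports the object.
class RemoteTimeServant extends UnicastRemoteObject implements RemoteTime {
    RemoteTimeServant() throws RemoteException {
        super();
    }

    @Override
    public long currentTimeMillis() throws RemoteException {
        return System.currentTimeMillis();
    }
}

class RemoteTimeDemo {
    // Starts a registry, binds the servant, then acts as a client and invokes it remotely.
    static long fetchRemoteTime(int port) throws Exception {
        // Assumption: everything runs on one machine, so stubs should point at loopback.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        // Server component: make the servant available through the naming service.
        Registry registry = LocateRegistry.createRegistry(port);
        RemoteTimeServant servant = new RemoteTimeServant();
        registry.rebind("RemoteTime", servant);
        try {
            // Client component: look up the stub and call it like a local object.
            Registry clientView = LocateRegistry.getRegistry("127.0.0.1", port);
            RemoteTime stub = (RemoteTime) clientView.lookup("RemoteTime");
            return stub.currentTimeMillis();
        } finally {
            // Unexport so that the RMI threads stop and the JVM can exit.
            registry.unbind("RemoteTime");
            UnicastRemoteObject.unexportObject(servant, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Remote time: " + fetchRemoteTime(20990));
    }
}
```

In a real deployment the server and the client would be separate programs on different hosts, with the remote interface available to both.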
