
Computer Networks and Distributed Systems








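  1. Computer Networks and Distributed Systems. Department of Computer Science and Technology, Peking University. Wang Jianyong (王建勇). Email: jwang@net.cs.pku.edu.cn URL: HTTP://csnetlib.pku.edu.cn/~jwang/course/cnds.html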

  2. Why Do We Study Distributed Systems? A dozen remaining IT problems proposed by James Gray:
     • World MEMEX
     • TelePresence (i.e., virtual reality)
     • Trouble-free systems
     • Secure systems
     • AlwaysUp (highly available systems)
     • Automatic programming
     • Scalability
     • Passing the Turing test
     • Speech to text
     • Text to speech
     • Machine vision
     • Personal MEMEX

  3. Textbook: G. Coulouris, J. Dollimore, T. Kindberg. Distributed Systems: Concepts and Design. Addison-Wesley, 1994.
     References:
     • Larry L. Peterson and Bruce S. Davie. Computer Networks: A Systems Approach. Morgan Kaufmann, 1996.
     • Andrew S. Tanenbaum. Distributed Operating Systems. Prentice Hall International, Inc., 1996.
     • Andrew S. Tanenbaum. Computer Networks. Prentice Hall International, Inc., 1996.
     • Xueliang Yang. Distributed Computer Systems. Graduate School of USTC, the Chinese Academy of Sciences, P.R. China.
     Grading: one programming assignment (to be submitted by November 12), one literature-reading assignment (paper submission deadline: 2000/12/26), and one final exam (60% of the total grade).

  4. Literature reading for "Networks and Distributed Systems" URL: http://csnetlib.pku.edu.cn/~jwang/course/Assignment2.html

  5. Security and Authentication
     1. Jennifer G. Steiner, Clifford Neuman, and Jeffrey I. Schiller. "Kerberos: An Authentication Service for Open Network Systems." Proceedings of the 1988 USENIX Winter Conference, February 1988, Dallas, Texas, Pages 191-202. (MIT)
     2. Butler Lampson, Martin Abadi, Michael Burrows, and Edward Wobber. "Authentication in Distributed Systems: Theory and Practice." Proceedings of the 13th Symposium on Operating Systems Principles, October 1991, Pacific Grove, CA, Pages 165-182. (DEC)
     3. Victor L. Voydock and Stephen T. Kent. "Security Mechanisms in High-Level Network Protocols." ACM Computing Surveys, 15(2), June 1983, Pages 135-171.

  6. Tentative course outline
     • Introduction, basic networking, ISO model, networks & internetworking
     • Inter-process communication: BSD sockets, client-server model
     • RPC: Sun's RPC etc.
     • DOS principles
     • Name service: terminology, DFS & DNS
     • Distributed file systems: concepts, design and implementation; DFS case studies: NFS, AFS, Coda, COSMOS (or S2FS)

  7. Tentative course outline (continued)
     • Distributed shared memory: IVY & Munin
     • Coordination in distributed systems: potential causality, clock synchronization, logical time
     • Replication: Gossip & Isis
     • Transactions: ACID, locks, deadlocks, nested transactions, optimistic concurrency control, timestamp ordering, distributed transactions
     • Recovery & fault tolerance
     • Security in DS: DES, RSA, digital signatures, Needham-Schroeder model, Kerberos

  8. Components of distributed system (figure, layers from top to bottom)
     • Services provided by distributed systems:
       - name services
       - distributed file systems
       - distributed shared memory
       - time & coordination
       - shared data services (distributed transactions & concurrency control, recovery)
       - highly available services (replication & security fault tolerance)
     • The micro-kernel of DOS: processes & threads, naming & protection, communication & invocation, virtual memory
     • Foundations of distributed system: Remote Procedure Calling, Interprocess Communication, Networking and Internetworking

  9. Chapter 1 Introduction to Distributed Systems
     • Review of computing history
     • Why should we develop distributed systems
     • Key characteristics of distributed systems

  10. 1.1 Review of computing history
     • Physically distributed hardware
     • Logically centralized software
     1.1.1 The trend of hardware
     - 1960s & 1970s: timesharing systems
     - 1980s: personal computers & personal workstations
     - 1990s: distributed computer systems
     - 2000s: mass distributed systems
     1.1.2 The need for logically centralized software
     • Users' requirement:
       - a system built out of large numbers of powerful PCs or workstations
       - but which act together in a coherent way
         >> that is as easy to use & understand as an old-fashioned timesharing system
     • Role of a new-generation operating system (DOS):
       - e.g., Web OS, Cluster OS

  11. Review of computing history

  12. Figure 1.2 Milestones in Distributed Systems

  13. 1.2 Why should we develop distributed systems
     1.2.1 The most important reason: applications are both the starting point and the end result of the development of distributed systems.
     1.2.2 Many computer applications occur in a distributed or decentralized environment.
     • Sharing expensive resources
     • Exchanging data between systems
     1.2.3 Proliferation of low-cost, high-performance PCs and workstations
     1.2.4 The interface between users and computers is friendlier
     1.2.5 LAN & Internet applications stimulate the development of DOS
     • It is the software, not the hardware, that determines whether a system is distributed or not
     1.2.6 Examples of distributed systems and applications
     1. Distributed UNIX:
     • Berkeley BSD UNIX + NFS + NIS
     • Amoeba, Mach, Chorus

  14. 2. Commercial applications
     • airline seat reservation and ticketing
     • automatic teller machines
     • key concerns: reliability, security
     3. Wide area network applications
     • Internet, ARPAnet: 100 => 1 million
     • Internet information services, such as email, Web, WWW search engines, BBS, e-commerce, digital libraries
     4. Cluster systems
     • IBM SP2
     • Berkeley's NOW
     • NCIC's Dawning superserver
     5. Meta computing
     • idle computers are ubiquitous
     6. Multimedia information access and conferencing applications
     • continuous media services, such as VOD servers, video phone and video conferencing; their main requirement is quality of service
     • ATM, real-time OS, continuous media servers

  15. 1.3 Key characteristics of distributed systems
     What is a distributed system?
     Definition 1: A distributed system is one in which there exists a multiplicity of interconnected processing resources able to cooperate under system-wide control on a single problem with minimal reliance on centralized procedures, data or hardware. (Formulated by the organizing committee for the 1st conf. on DCS)
     Definition 2: A distributed system consists of a collection of autonomous computers linked by a computer network and equipped with distributed system software. (From our textbook)
     1.3.1 Resource decentralization and sharing
     - Some or all of the computing resources should be decentralized in function as well as distance
     - This is a prerequisite for distinguishing distributed systems from other types of systems, such as time-sharing systems.

  16. - Some resources are very expensive, and data sharing is an essential requirement in many computer applications
     1.3.2 Cooperative autonomy
     - Cooperative autonomy, especially control autonomy, increases the overall reliability and availability of the system
     1.3.3 Concurrency (i.e., work parallelism)
     - Concurrent vs parallel
       >> MIMD (multiple instruction, multiple data streams) parallelism vs the concurrency of a time-sharing system (TSS)
     - Two reasons:
       >> Many users simultaneously invoke commands or interact with application programs;
       >> Many server processes run concurrently, each responding to different requests from client processes.
     1.3.4 System transparency
     - The system looks to its users like a single centralized computer system
     - but runs on multiple independent machines, i.e., it presents a Single System Image.

  17. ISO definition (forms of transparency):
     • Access transparency, Location transparency, Concurrency transparency,
     • Replication transparency, Failure transparency, Migration transparency,
     • Performance transparency, Scaling transparency
     1.3.5 Fault tolerance
     - Two approaches to the design of fault-tolerant computer systems:
       >> hardware redundancy: the use of redundant components;
       >> software recovery: the design of programs to recover from faults.
     - The availability of a system is a measure of the proportion of time that it is available for use.
     1.3.6 Scalability
     - Scalable techniques:
       >> reconfigurability; removing performance bottlenecks (serverless design, replicated data and services, caching)
       >> e.g., NFS lacks scalability.
     1.3.7 Openness
     - the characteristic that determines whether the system can be extended in various ways
     - e.g., UNIX
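The availability definition in 1.3.5 can be read as a simple ratio. The sketch below is only an illustration of that reading: the function name `availability` and the uptime/downtime figures are assumptions, not part of the original slides.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Proportion of the observed time during which the system was usable."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("observation period must be positive")
    return uptime_hours / total

# Made-up example: 99 hours up and 1 hour down over a 100-hour window.
example = availability(99.0, 1.0)
print(f"availability = {example:.2%}")
```

Hardware redundancy and software recovery both aim to raise this ratio, either by avoiding downtime or by shortening it.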

  18. - To summarize:
       >> Open systems are characterized by the fact that their key interfaces are published;
       >> Open distributed systems are based on the provision of a uniform inter-process communication mechanism and published interfaces for access to shared resources;
       >> Open distributed systems can be constructed from heterogeneous hardware and software, possibly from different vendors.

  19. Chapter 2 Design Goals & Issues
     • Introduction
     • Basic technical issues
     • Users' requirements
     • Summary

  20. Key characteristics of distributed systems
     • Concurrency
     • Transparency
     • Fault tolerance
     • Scalability
     • Resource sharing
     • Openness
     Key design goals
     • Performance
     • Reliability
     • Security
     • Scalability
     • Consistency

  21. 2.1 Basic design issues
     • Naming:
       - global meaning & scalability
     • Communication:
       - how to optimize the implementation of communication in a distributed system
       - while retaining a high-level programming model for its use
     • Software structure:
       - how to structure a system so that new services can be introduced
         >> that will interwork fully with existing services
         >> without duplicating existing service elements
     • Workload allocation:
       - how to deploy the processing and communication resources in a network to optimum effect in the processing of a changing workload
     • Consistency maintenance:
       - maintenance of consistency at reasonable cost

  22. 2.1.1 Naming
     • name vs identifier
     • a resolved name is an identifier together with other attributes
       - internet communication: IP address + port number
       - UNIX file system: index node number
       - Mach communication system: port number
     • naming design considerations
       - choose an appropriate name space
       - use a name service to resolve names to communication identifiers
       - scalability considerations
     • name contexts are represented by tables or databases
       - file system: /etc/a.out vs /usr/a.out
       - internet: www.cs.pku.edu.cn vs www.cs.tsinghua.edu.cn
     • names may be structured or flat, readable or unreadable, location-independent or containing location clues
     • naming schemes can incorporate security mechanisms
       - e.g., file systems' directories
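As a rough illustration of resolving a name through contexts represented as tables, the sketch below walks a structured name one component at a time. The table layout, the `resolve` helper, and the index node numbers are all hypothetical, chosen only to mirror the /etc/a.out vs /usr/a.out example.

```python
# Hypothetical name contexts: each context is a table that maps a name
# component either to another context (a dict) or to an identifier.
root = {
    "etc": {"a.out": 1041},   # made-up index node numbers
    "usr": {"a.out": 2317},
}

def resolve(context: dict, path: str) -> int:
    """Resolve a '/'-separated name to an identifier, one context at a time."""
    node = context
    for component in path.strip("/").split("/"):
        if not isinstance(node, dict) or component not in node:
            raise KeyError(f"cannot resolve {component!r} in {path!r}")
        node = node[component]
    return node

print(resolve(root, "/etc/a.out"))
print(resolve(root, "/usr/a.out"))
```

The same final component, a.out, resolves to different identifiers because each lookup happens in a different context.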

  23. 2.1.2 Communication
     • Communication between a pair of processes involves:
       - transfer of data & synchronization activity
     • The communication primitives send & receive may be:
       - synchronous (i.e., blocking) or asynchronous (i.e., non-blocking)
     • Two communication patterns:
       - client-server model between pairs of processes
       - group multicast model between groups of cooperating processes
     2.1.2.1 Client-server communication
     • it is oriented towards service provision, and an exchange consists of:
       - transmission of a request from a client process to a server process;
       - execution of the request by the server;
       - transmission of a reply to the client.
     • it can be implemented in terms of message-passing operations (send & receive)
       - but is commonly presented at the language level as RPC
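The request, execute, reply exchange can be sketched with blocking send and receive over a connected socket pair. This is a minimal sketch under stated assumptions: `server`, `call`, and the upper-casing "service" are illustrative names, and a real system would use network addresses and a message framing protocol rather than a local socket pair.

```python
import socket
import threading

def server(conn: socket.socket) -> None:
    """Receive one request, execute it, and send back one reply."""
    request = conn.recv(1024)      # blocking receive of the request
    reply = request.upper()        # "execution" of the request (illustrative)
    conn.sendall(reply)            # transmission of the reply
    conn.close()

def call(conn: socket.socket, request: bytes) -> bytes:
    """Client side: send a request, then block until the reply arrives."""
    conn.sendall(request)
    return conn.recv(1024)

# A connected pair of sockets stands in for the network link.
client_end, server_end = socket.socketpair()
worker = threading.Thread(target=server, args=(server_end,))
worker.start()
result = call(client_end, b"hello")
worker.join()
client_end.close()
print(result)
```

Wrapping the send and receive inside `call` is roughly what an RPC layer does: the caller sees an ordinary function call rather than the message-passing operations.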

  24. Dynamic binding in the client-server model
     - example: DNS name server
     Function shipping in the client-server model
     - example: PostScript with laser printers
     2.1.2.2 Group multicast
     - sending a message to the members of a specified group of processes is known as multicasting

  25. Motivation for group multicasting
     - Locating an object
     - Fault tolerance
     - Multiple update
       >> e.g., maintaining cache coherence under a write-update mechanism
       >> e.g., time synchronization, RAID
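To make the multiple-update motivation concrete, the sketch below multicasts one write to every member of a group so that all cached copies are updated. It is an in-process simulation only: the `Group` and `CacheReplica` classes are hypothetical, and delivery is a plain method call rather than a network multicast.

```python
class Group:
    """A hypothetical process group; members are in-process objects here."""

    def __init__(self):
        self.members = []

    def join(self, member):
        self.members.append(member)

    def multicast(self, message):
        # Deliver the same message to every member of the group.
        for member in self.members:
            member.deliver(message)

class CacheReplica:
    """Holds a cached copy of shared data and applies updates it receives."""

    def __init__(self):
        self.data = {}

    def deliver(self, message):
        key, value = message
        self.data[key] = value   # write-update: overwrite the local copy

group = Group()
replicas = [CacheReplica() for _ in range(3)]
for replica in replicas:
    group.join(replica)

group.multicast(("x", 42))
print([replica.data for replica in replicas])
```

The sender issues a single multicast and does not need to know how many replicas exist, which is the convenience the group abstraction provides.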

  26. 2.1.3 Software Structure

  27. Components of DOS
     - operating system kernel services
       >> extending a conventional Unix kernel, like BSD Unix
       >> microkernels, like Mach, Amoeba and Chorus
     - open services
       >> DFS
       >> DSM
       >> other services, like an electronic mail delivery service
     - support for distributed programming
       >> RPC
       >> MPI or PVM

  28. 2.1.4 Workload allocation
     • two main workload allocation models:
       - the processor pool model
       - the use of idle workstations
     2.1.4.1 The processor pool model
     Figure 2.5 The processor pool model

  29. examples: Amoeba, Plan 9, Cambridge Distributed Computing System, Dawning 2000 super server
     2.1.4.2 Use of idle workstations
     • use of idle or under-utilized workstations as a fluctuating pool of extra computers
     • examples: Sprite, LSF
     2.1.4.3 Shared-memory multiprocessors
     - also called symmetric shared-memory multiprocessors (SMP)

  30. 2.1.5 Consistency maintenance
     • Update consistency
       - there are likely to be many users accessing shared data;
       - the operation of the system itself depends on the consistency of certain databases
     • Replication consistency
     • Cache coherency
       - hypothesis of locality
     • Failure consistency
     • Clock consistency
     • User interface consistency

  31. 2.2 User requirements
     • Functionality
       - what the system should do for users
     • Reconfigurability
       - the need for a system to accommodate changes without causing disruption to existing service provision
     • Quality of service
       - embracing issues of performance, reliability and security
     2.2.1 Functionality
     • Key benefits of a distributed computer system:
       - economy & convenience from resource sharing;
       - potential improvement in performance & reliability from distributed resources.

  32. Enhancements to the services provided by centralized computers:
       - sharing across a network can bring access to a richer variety of resources than could be provided by any single computer;
       - the advantages of distribution can be exploited, so that explicitly sharing, fault-tolerant or parallel applications can be programmed.
     • Three options when considering a migration from centralized computing to distributed computing:
       - adapt existing operating systems for networking
         >> example: BSD Unix + NFS
       - move to an entirely new operating system designed specifically for distributed systems
       - emulation: move to a new DOS that can emulate one or more existing operating systems
         >> examples: Mach & Chorus

  33. 2.2.2 Reconfigurability
     • Requirements of a reconfigurable distributed system:
       - it accommodates the changes that follow from the scalability of the design and its ability to accommodate heterogeneity;
       - a failed process, computer or network component is replaced by another working counterpart;
       - computational load is shifted from over-loaded to less-loaded machines, so as to increase the total throughput of the distributed system.
     2.2.3 Quality of service
     • Performance: measured in terms of the response times experienced by users
       - optimizing the performance of all of the software components that are involved:
         >> the OS's communication services
         >> distributed programming support (e.g., RPC)
         >> and the software that implements the service.

  34. • Reliability and availability:
       - a fault-tolerant system is one
         >> which can detect a fault
         >> and either fail gracefully (that is, predictably)
         >> or mask the fault so that no failure is perceived by users of the system.
     • Security: distributed systems face two main threats
       - threats against the privacy and integrity of users' data as it travels over the network
       - their openness to interference with system software:
         >> not all machines on a network can in general be made physically secure
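For the fault-tolerance bullets on this slide, one way to picture "mask the fault, or fail predictably" is a client that tries replicas in turn. The sketch is illustrative only: `ReplicaDown`, `call_with_masking`, and the two stand-in replicas are assumptions, not something the slides define.

```python
class ReplicaDown(Exception):
    """Raised when a replica cannot serve a request (a detected fault)."""

def call_with_masking(replicas, request):
    """Try each replica in turn; the caller sees a failure only if all fail."""
    failures = 0
    for replica in replicas:
        try:
            return replica(request)      # the first healthy replica answers
        except ReplicaDown:
            failures += 1                # fault detected, try the next replica
    # Every replica failed: fail predictably with a well-defined error.
    raise ReplicaDown(f"all {failures} replicas failed")

def broken(request):
    raise ReplicaDown("replica unavailable")

def healthy(request):
    return f"ok: {request}"

result = call_with_masking([broken, healthy], "read x")
print(result)
```

Here the failure of the first replica is never visible to the caller, while the all-replicas-down case surfaces as a single, predictable exception.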

  35. In the next class we'll discuss: Chapter 3 Networking & Internetworking
     • Network technologies
     • Protocols
     • Technology case studies: Ethernet, Token Ring and ATM
     • Protocol case studies: Internet protocols and FLIP
     Thanks for your attention!
