
Distributed Web-Based Systems II

Distributed Web-Based Systems II. CSE5306 Lecture Quiz Due 2 July 2014 at 5 PM. Naming. Fig. 12-15, -16, pp.568, 9. Web document names are URIs (Uniform Resource Identifiers), either:

selima

Presentation Transcript


  1. Distributed Web-Based Systems II

    CSE5306 Lecture Quiz Due 2 July 2014 at 5 PM
  2. Naming (Figs. 12-15 and 12-16, pp. 568-569) Web document names are URIs (Uniform Resource Identifiers), either: URLs (Uniform Resource Locators), which tell how and where to access the document (e.g., http, ftp, telnet; see left above), or URNs (Uniform Resource Names), which are “true identifiers” (p.181); i.e., globally unique, location-independent, persistent references to the document. A URI’s syntax is determined by its scheme (see upper right); e.g., the data scheme includes the data in its name.
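As a rough illustration of how the scheme determines what a URI means, the sketch below classifies a few identifiers with Python's standard `urllib.parse`. The hosts, paths, and ISBN are illustrative placeholders, and `classify` and `data_payload` are hypothetical helper names, not part of the lecture.

```python
from urllib.parse import urlsplit

# Example identifiers; hosts, paths, and the ISBN are illustrative only.
EXAMPLES = [
    "http://www.example.org/index.html",     # URL: scheme says how, host says where
    "ftp://ftp.example.org/pub/readme.txt",  # URL with a different access scheme
    "urn:isbn:0132392275",                   # URN: names a resource, gives no location
    "data:text/plain,Hello",                 # data scheme: the content is in the name
]


def classify(uri: str) -> str:
    """Return 'URN', 'data', or 'URL' based only on the URI scheme."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme == "urn":
        return "URN"
    if scheme == "data":
        return "data"
    return "URL"


def data_payload(uri: str) -> str:
    """Extract the inline payload of a simple (non-base64) data URI."""
    parts = urlsplit(uri)
    if parts.scheme != "data":
        raise ValueError("not a data URI")
    # Everything after the first comma is the document itself.
    return parts.path.split(",", 1)[1]


if __name__ == "__main__":
    for uri in EXAMPLES:
        print(classify(uri), uri)
```

The classification here looks only at the scheme; it does not check that a URN is actually registered or that a URL is reachable.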
  3. R U O K ? 1. What are URIs? Uniform Resource Identifiers. Uniform Resource Locators. Uniform Resource Names. All of the above. None of the above.
  4. Synchronization Web synchronization is just beginning to become a design issue. Collaborative authoring (whiteboard designs): WebDAV (Web Distributed Authoring and Versioning) enables an engineer to lock (check out), create, delete, copy, or move a document on remote Web servers. Clients themselves must prevent write-write conflicts when multiple engineers check out a single document. Clients check out a document by sending the Web server an HTTP POST command and trading a lock token (which proves access rights) for exclusive file access. The client may disconnect from the server, modify the file, and return it later. (There is no provision for orphan locks, when clients never return.) Web services also need coordination (pp.553-4).
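The lock-token idea can be sketched with a small in-memory model. This is not WebDAV's wire protocol; `LockManager` and its methods are hypothetical names, and the sketch only shows that a write succeeds when accompanied by the token the server handed out. Like the scheme described on the slide, it has no timeout, so a client that never returns leaves an orphan lock.

```python
import secrets
from typing import Dict, Optional


class LockManager:
    """Toy server-side lock table: one exclusive write lock per document."""

    def __init__(self) -> None:
        self._locks: Dict[str, str] = {}  # document path -> token currently held

    def lock(self, path: str) -> Optional[str]:
        """Grant an exclusive lock and return its token, or None if already locked."""
        if path in self._locks:
            return None
        token = secrets.token_hex(8)
        self._locks[path] = token
        return token

    def write(self, path: str, token: str, store: Dict[str, str], content: str) -> bool:
        """Apply a write only if the caller presents the current lock token."""
        held = self._locks.get(path)
        if held is None or held != token:
            return False
        store[path] = content
        return True

    def unlock(self, path: str, token: str) -> bool:
        """Release the lock if the token matches; orphan locks are never reclaimed."""
        if self._locks.get(path) != token:
            return False
        del self._locks[path]
        return True
```

A client would call `lock`, possibly disconnect, and later call `write` and `unlock` with the same token; a second client's `lock` call returns `None` in the meantime.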
  5. R U O K ? 2. How is synchronization relevant to Web-based system design? Collaborative authors need to lock (check out), create, delete, copy, and move their documents on remote Web servers. Clients must prevent write-write conflicts when multiple engineers check out a single document. Clients check out a document by sending the Web server an HTTP POST command and trading a lock token (which proves access rights) for exclusive file access. All of the above. None of the above.
  6. Consistency & Replication Distributed systems that serve Web documents have performance and availability requirements. They cache and replicate Web content, including dynamic content generated by requests and those containing secure scripts.
  7. Web Proxy Caching (Fig. 12-17, p.572) In addition to the browser’s own cache, its Web proxy mediates access to many shared caches. The Web proxy intercepts the browser’s requests and fulfills them locally or passes them on to the responsible Web server. In addition to its own cache, the proxy checks neighboring (cooperative, distributed) proxy caches or hierarchical caches (county through country, slower links, limited storage). The proxy may validate its cached copy with an If-Modified-Since request header. To avoid frequent server requests, Squid Web proxies apply an expiration time to every document: T_expire = 0.2 * (T_cached - T_last_modified) + T_cached. Or the server may push invalidations to all proxies when documents change, using leases to limit the server’s workload. Proxies are useless on dynamic documents (e.g., pages with advertising banners on the margins). When caches fill, the least recently requested cache entries are deleted.
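The expiration rule and the least-recently-requested eviction policy from this slide can be combined in a short sketch. `ProxyCache` is a hypothetical class, times are plain numbers in seconds, and a real proxy such as Squid has many more policies than shown here.

```python
from collections import OrderedDict

ALPHA = 0.2  # weighting factor quoted on the slide


def expiration_time(t_cached: float, t_last_modified: float, alpha: float = ALPHA) -> float:
    """T_expire = alpha * (T_cached - T_last_modified) + T_cached."""
    return alpha * (t_cached - t_last_modified) + t_cached


class ProxyCache:
    """Toy proxy cache with expiration times and LRU eviction."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._entries = OrderedDict()  # url -> (body, t_cached, t_last_modified)

    def put(self, url: str, body: str, t_cached: float, t_last_modified: float) -> None:
        self._entries[url] = (body, t_cached, t_last_modified)
        self._entries.move_to_end(url)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently requested entry

    def get(self, url: str, now: float):
        entry = self._entries.get(url)
        if entry is None:
            return None  # miss: the proxy would forward the request
        body, t_cached, t_last_modified = entry
        if now >= expiration_time(t_cached, t_last_modified):
            del self._entries[url]  # expired: revalidate or refetch from the server
            return None
        self._entries.move_to_end(url)  # mark as most recently requested
        return body
```

A document that had been unchanged for a long time when it was cached gets a long lifetime in the cache, while a recently modified one expires quickly.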
  8. R U O K ? 3. How does a Web proxy serve its clients? It gives clients access to many shared caches. It intercepts a browser’s requests and fulfills them locally or passes them on to the source Web server. It may validate its cached copy with an If-Modified-Since request to the server, or it may apply expiration timestamps to cached documents. All of the above. None of the above.
  9. Replication for Web Hosting Systems (Fig. 12-18, p.574) Aggressive marketing has led to automated Content Delivery Networks (CDNs), such as Akamai’s with 18,000 servers in 70 countries, that replicate and distribute original Web server content to consumers. This self-organizing system has a feedback control system’s architecture (see above), with metric estimation, adaptive triggers and measured responses.
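One way to picture the feedback loop is a toy controller that measures latency, compares it with a target, and adds or removes a replica in response. Everything here is an assumption made for illustration: the function names, the tolerance, and especially `toy_latency_ms`, which simply pretends that latency is inversely proportional to the number of replicas.

```python
def control_step(replicas: int, measured_ms: float, target_ms: float,
                 tolerance_ms: float = 10.0) -> int:
    """One pass of the loop: estimate the error, trigger, and adjust."""
    error = measured_ms - target_ms
    if error > tolerance_ms:
        return replicas + 1          # too slow: add a replica
    if error < -tolerance_ms and replicas > 1:
        return replicas - 1          # faster than needed: save cost
    return replicas                  # within tolerance: no adjustment


def toy_latency_ms(replicas: int, total_load_ms: float = 400.0) -> float:
    """Hypothetical system model: latency falls as replicas are added."""
    return total_load_ms / replicas


def run_loop(target_ms: float, replicas: int = 1, max_rounds: int = 20) -> int:
    """Repeat measure-and-adjust until the replica count stops changing."""
    for _ in range(max_rounds):
        adjusted = control_step(replicas, toy_latency_ms(replicas), target_ms)
        if adjusted == replicas:
            break
        replicas = adjusted
    return replicas
```

The tolerance band keeps the controller from oscillating between two replica counts when the measured latency sits close to the target.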
  10. R U O K ? 4. How do feedback control systems work? The input signals the user’s desires (e.g., desired temperature, humidity). Performance metrics are gathered at the system’s output (e.g., actual temperature, humidity). The system seeks to null out the difference between the desires and its current performance (e.g., temperature and humidity errors). All of the above. None of the above.
  11. Metric Estimation Massively replicating a document costs money and bandwidth in the disseminating network. (CDNs must meet service-level agreements with their paying customers.) Latency metrics: How long must a customer wait to fetch a document? What are the network bandwidths between pairs of nodes? Spatial metrics: How many network router hops are there between nodes? (Speed-enhancing multi-protocol label switching can bypass network-level routing, so hop counts read from routing tables may be inaccurate.) Network usage metrics: How many bytes are transferred per document? How often is it read, updated and replicated? Consistency metrics: How much can a replica deviate from its master copy? Financial metrics: How much network bandwidth can the master copy Web server afford to buy from its Internet service provider?
  12. R U O K ? Match the following CDN metrics with the questions they seek to answer below. 5. Latency __ 6. Spatial __ 7. Network usage __ 8. Consistency __ 9. Financial __ How much network bandwidth can the master copy Web server afford to buy from its Internet service provider? How many bytes are transferred per document? How often is it read, updated and replicated? How long must a customer wait to fetch a document? What are the network bandwidths between pairs of nodes? How much can a replica deviate from its master copy? How many network router hops are there between nodes?
  13. Adaptation Triggering (Fig. 12-19, p.577) A “flash crowd” can shut down a Web service and cause a cascade of failures in its vicinity. Replicas must be distributed before these sudden bursts of requests for one document. International marketers can predict world-changing, “tipping point” documents. Machines can extrapolate a regression line, and raise an alarm when it crosses a preset threshold. (Presetting that threshold and sizing the request-time window require manual tuning in consideration of Web site traffic.)
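The regression-based trigger can be sketched as a least-squares fit over a window of recent request-rate samples, extrapolated a short time ahead. The function names, the sample layout of `(time, request_rate)` pairs, and the example numbers are assumptions; as the slide notes, the window size and threshold would need manual tuning.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (time, requests per second)


def fit_line(samples: List[Sample]) -> Tuple[float, float]:
    """Least-squares slope and intercept for the samples in the window."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("need samples taken at two or more distinct times")
    slope = sum((t - mean_t) * (r - mean_r) for t, r in samples) / var_t
    return slope, mean_r - slope * mean_t


def flash_crowd_alarm(samples: List[Sample], horizon: float, threshold: float) -> bool:
    """Raise an alarm if the extrapolated request rate crosses the threshold."""
    slope, intercept = fit_line(samples)
    t_future = samples[-1][0] + horizon
    return slope * t_future + intercept > threshold


if __name__ == "__main__":
    window = [(0, 100), (1, 200), (2, 300), (3, 400)]
    print(flash_crowd_alarm(window, horizon=2, threshold=500))
```

Raising the alarm before the threshold is actually reached is what gives the system time to place replicas ahead of the burst.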
  14. R U O K ? 10. How can a “flash crowd” be prevented from shutting down a Web service and causing a cascade of failures in its vicinity? Replicas might be distributed before these sudden bursts of requests for one document. International marketers could predict which are the world-changing, “tipping point” documents. Machines can automatically raise alarms when extrapolated regression lines cross preset thresholds. Any or all of the above. None of the above.
  15. Adjustment Measures Client-request redirection can dramatically improve a Web hosting service’s performance. Every document consists of a dynamic main page and many static embedded documents, which can be cached and replicated. Akamai’s CDN (see diagram above) replaces the embedded documents’ URLs with a virtual ghost server’s name, plus the URL of the origin server: (1) Client requests the document’s main page. (2) Origin server returns it with references to embedded documents. (3) Client looks up those documents’ names, which start with ghosting.com. (4) The CDN provides the IP address of the client’s best replica server. (5) Client requests the documents from the replica server’s cache. (6) If they are not cached, the replica server gets them from the origin server. (7) Replica server delivers the documents to the client. Consistency innovations: Client gets the changeable main page from the origin server. When an embedded document changes, the CDN changes its name to one that does not appear in any cache (forcing a miss). CDN’s adaptive redirection policy: The CDN controls client-perceived performance. It prevents flash-crowd shutdowns by redirecting requests to lightly loaded replica servers. Redirection is transparent to the client, which prevents her from bookmarking a favorite replica server. Fig. 12-20, p.578
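The two ideas on this slide, renaming embedded documents when they change and steering clients to a lightly loaded replica, can be sketched as follows. The URL layout is a made-up illustration and not Akamai's actual naming scheme; `rewrite_embedded_url` and `pick_replica` are hypothetical helpers, and only the `ghosting.com` name comes from the slide.

```python
import hashlib
from typing import Dict

GHOST_HOST = "ghosting.com"  # virtual ghost server name used on the slide


def rewrite_embedded_url(origin_host: str, path: str, content: bytes) -> str:
    """Build a ghost-server URL that embeds the origin URL and a content digest.

    Because the digest changes whenever the content changes, an updated
    document gets a name that no cache has seen, which forces a miss.
    """
    digest = hashlib.sha256(content).hexdigest()[:12]
    return f"http://{digest}.{GHOST_HOST}/{origin_host}{path}"


def pick_replica(load_by_replica: Dict[str, float]) -> str:
    """Redirection policy sketch: send the client to the least loaded replica."""
    return min(load_by_replica, key=load_by_replica.get)


if __name__ == "__main__":
    print(rewrite_embedded_url("www.example.org", "/logo.gif", b"version 1"))
    print(pick_replica({"replica-a": 0.9, "replica-b": 0.2}))
```

In a real CDN the replica choice happens inside DNS resolution and also weighs proximity to the client; load is used alone here only to keep the sketch short.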
  16. R U O K ? 11. How can client-request redirection improve a Web hosting service’s performance? It controls client-perceived performance. It prevents flash-crowd shutdowns by redirecting requests to lightly loaded replica servers. It is transparent to the client, and that prevents her from bookmarking a favorite replica server. All of the above. None of the above.
  17. Replication for Web Applications (Fig. 12-21, p.580) Caching and replicating dynamic Web content and applications is harder than caching and replicating static Web content. (On the left above is a CDN, drawing content from the database server on the right.) When the update/read ratio is low and database searches are extensive, all Web content can be replicated. Partial replication is more cost effective (i.e., fewer updates and fewer origin server queries) when queries are simple and repetitious, but how can we decide which content to replicate? Globule (p.64) decides based on the cost of many importance-weighted metrics. Content-aware caches hold the most requested subset of the original database. (If the query containment check fails, the origin server takes the query.) Content-blind caching hashes each query and instantly recites that query’s (previously hashed and cached) result.
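Content-blind caching is easy to sketch: the cache never inspects the query, it only hashes it and looks up a stored result. `ContentBlindCache` and the `run_at_origin` callback are hypothetical names, and the sketch leaves out what happens when the underlying database is updated.

```python
import hashlib
from typing import Any, Callable, Dict, Tuple


class ContentBlindCache:
    """Edge-server cache keyed by a hash of the query text and its parameters."""

    def __init__(self, run_at_origin: Callable[[str, Tuple], Any]) -> None:
        self._run_at_origin = run_at_origin  # forwards a query to the origin database
        self._results: Dict[str, Any] = {}
        self.origin_calls = 0                # counts how often the origin was contacted

    @staticmethod
    def _key(query: str, params: Tuple) -> str:
        raw = query + "\x00" + repr(params)
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def execute(self, query: str, params: Tuple = ()) -> Any:
        key = self._key(query, params)
        if key not in self._results:
            # Miss: the origin server answers, and the result is stored.
            self.origin_calls += 1
            self._results[key] = self._run_at_origin(query, params)
        return self._results[key]
```

Only an exact repeat of the same query and parameters produces a hit, which is why this approach suits simple, repetitious queries.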
  18. R U O K ? 12. When database queries are simple and repetitious, how can a CDN decide which parts of the source Web server’s database tables to replicate? Globule decides based on the accumulated cost of many importance-weighted metrics. Content-aware caches hold the most requested subsets of the original database. Content-blind caching hashes each query and instantly recites that query’s (previously hashed and cached) result. Any of the above. None of the above.
  19. Fault Tolerance Client-side caching and server replication tolerate faults in Web-based distributed systems. DNS also can map each name to several addresses. In multi-tiered web services, servers can be clients to other servers, which complicates the Byzantine fault tolerance (BFT) problem: A client might collect k+1 identical answers from 2k+1 BFT servers. A BFT client might request even more responses, when its server is failure prone. Services should require k+1 identical requests from a replicated BFT client, before fulfilling the requests.
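The k+1-out-of-2k+1 rule amounts to a vote over the replies. In the sketch below, `accept_reply` is a hypothetical helper, and replies are assumed to be directly comparable values such as strings; authenticating the servers that sent them is out of scope.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def accept_reply(replies: Sequence[Hashable], k: int) -> Optional[Hashable]:
    """Return the answer given by at least k+1 replies, or None if there is none.

    With 2k+1 servers of which at most k are faulty, any answer that
    appears k+1 times must include at least one correct server.
    """
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    return value if count >= k + 1 else None


if __name__ == "__main__":
    print(accept_reply(["42", "42", "13"], k=1))
```

The same check applies in the other direction: a service can run it over the requests arriving from a replicated client before acting on them.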
  20. R U O K ? 13. How do multi-tiered web services deal with the Byzantine fault tolerance (BFT) problem? Clients should collect k+1 identical answers from 2k+1 BFT servers. A BFT client should request even more responses, when its server is failure prone. Web services should require k+1 identical requests from a replicated BFT client, before fulfilling a request. Any and all of the above. None of the above.
  21. Security Formerly called the Secure Sockets Layer (SSL), Transport Layer Security (TLS) is positioned between TCP and any HTTP, FTP, etc. application. The TLS record protocol layer implements the client and server’s secure channel (see above): (1) Client sends a list of cryptographic algorithms it can handle. (2) Server chooses one. (3) Server authenticates itself with its public key, certified by a certification authority. (4) If required, client sends its certified public key. (5) Client generates a random number, which will be used to create their session key, and sends it encrypted with the server’s public key and (if required) signed with the client’s private key.
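In practice an application rarely performs these steps by hand; a TLS library does. The sketch below uses Python's standard `ssl` module to show where the choices from the slide appear: the client's list of algorithms, server authentication, and optional client authentication. No connection is opened, the file names in the comment are hypothetical, and current TLS versions negotiate keys differently from the simplified exchange described above.

```python
import ssl

# Client side: the default context verifies the server's certificate
# against trusted certification authorities and checks the host name.
client_ctx = ssl.create_default_context()

# The cipher suites the client is prepared to offer (step 1 on the slide).
offered = [cipher["name"] for cipher in client_ctx.get_ciphers()]

# Server side: Purpose.CLIENT_AUTH builds a context meant for a server socket.
server_ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)

# Requiring a client certificate corresponds to the "if required" steps.
server_ctx.verify_mode = ssl.CERT_REQUIRED

# A real server would also load its own certificate and private key, e.g.:
# server_ctx.load_cert_chain("server.crt", "server.key")  # hypothetical file names

if __name__ == "__main__":
    print(len(offered), "cipher suites offered by the client")
```

Leaving `server_ctx.verify_mode` at its default of `ssl.CERT_NONE` gives the more common arrangement, in which only the server is authenticated.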
  22. R U O K ? 14. How would the TLS record protocol layer’s implementation of the client and server’s secure channel change, if client authentication were required? Step 4 would be [K_C^+]_CA. Step 5 would be K_C^-(K_S^+([R]_C)). Step 2 would include “Authenticate.” All of the above. None of the above.
  23. Summary Web-based applications and hypertext documents touch all of our lives. Huge standardization efforts today will lead to sophisticated Web services tomorrow. Apache is the best web server available, because it is flexible and extensible. Client-side caching and replication enable content delivery networks (e.g., Akamai’s 18,000 servers in 70 countries) to extend the reach of telemarketers to billions of people worldwide.