This chapter provides a comprehensive overview of web applications and the complexities of maintaining state across page interactions. It discusses core concepts such as the significance of protocols, user requests, and browser behaviors, alongside critical security concerns like phishing and cross-site scripting. The chapter emphasizes the importance of user awareness regarding URLs and the handling of sensitive data during web interactions. It also explores sessions, cookies, and how state management is essential for a coherent web experience, while highlighting the evolving standards and practices in web security.
Overview • The web is where it’s at • Simple web pages are simple • But things get complicated when a web application requires state to be maintained across pages and time • Privacy and security remain concerns
The Web • Now central to commerce and daily life • Still evolving • Standards • Code • Practices • Browser features
Web Pages • Alice requests a web page using her browser (URL) • Bob’s server returns a copy of the web page • Alice’s browser displays the web page
Browser’s Request • The URL (URI) protocol://hostname/path • An example http://cs.metrostate.edu/~fitzgesu/courses/
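The protocol://hostname/path structure can be checked by splitting the example URL with Python's standard library. This is only a sketch; the variable names are illustrative.

```python
from urllib.parse import urlsplit

# The example URL from the slide.
url = "http://cs.metrostate.edu/~fitzgesu/courses/"
parts = urlsplit(url)

print(parts.scheme)    # the protocol
print(parts.hostname)  # the hostname
print(parts.path)      # the path
```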
The Protocol • Most common – http • Hypertext Transfer Protocol • Stateless, request-reply protocol • Other common protocols • ftp • file • telnet • https
The Hostname • A domain name and host computer • Host + parent’s domain cs.metrostate.edu • Translates to an IP address
Path • Translates into a path in the web server’s directory structure ~fitzgesu/courses • Translates into /home/fitzgesu/public_html/courses/ • If no file name is given, the default file name is used – usually index.htm, index.html or index.php
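A rough sketch of the path translation described above. The function name map_url_path and its parameters are hypothetical, the /home and public_html layout follows the slide's example, and a real web server applies its own configuration and also guards against path tricks such as "..".

```python
import posixpath

def map_url_path(url_path, home_root="/home", public_dir="public_html",
                 default_file="index.html"):
    """Map a URL path like /~user/dir/ to a file path (illustration only)."""
    parts = url_path.lstrip("/").split("/", 1)
    if not parts[0].startswith("~"):
        raise ValueError("this sketch only handles /~user/... paths")
    user = parts[0][1:]                       # drop the leading "~"
    rest = parts[1] if len(parts) > 1 else ""
    fs_path = posixpath.join(home_root, user, public_dir, rest)
    # No file name given: fall back to the default file name.
    if rest == "" or rest.endswith("/"):
        fs_path = posixpath.join(fs_path, default_file)
    return fs_path

print(map_url_path("/~fitzgesu/courses/"))
```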
Paths - Exceptions • URLs can be redirected to another web page • A load balancer could forward the request to one of several servers
The Request • User types in URL • User cuts and pastes URL • User clicks link on page • User clicks link from email • User clicks link from pdf or other ‘non-web’ content
User Expectations • Phishing – a malicious entity invites a user to click on a bogus link in order to collect sensitive information • User pays insufficient attention to the URL • User’s expectations are incorrect • Some browsers expand incomplete URLs – users are sent somewhere unexpected
User Expectations (continued) • Typejacking (also called typosquatting) – registering a hostname that is deceptively similar to a real hostname • paypa1.com • whitehouse.gov vs. whitehouse.com
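One way a tool might flag a lookalike such as paypa1.com is to normalize easily confused characters and compare against a list of trusted hostnames. This is a minimal sketch: the function looks_like, the CONFUSABLES table, and the assumption that paypal.com is the real hostname being imitated are all illustrative. It does not catch cases like whitehouse.com, where only the top-level domain differs.

```python
# Characters commonly mistaken for one another (a tiny, illustrative table).
CONFUSABLES = str.maketrans({"1": "l", "0": "o"})

def looks_like(candidate, trusted_hosts):
    """Return the trusted host that candidate imitates, or None."""
    lowered = candidate.lower()
    normalized = lowered.translate(CONFUSABLES)
    for host in trusted_hosts:
        # Same after normalization, but not literally the same hostname.
        if lowered != host and normalized == host:
            return host
    return None

trusted = ["paypal.com", "whitehouse.gov"]  # assumed list of real hostnames
print(looks_like("paypa1.com", trusted))
```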
Interaction • Forms • input fields – have names • input fields can be ‘hidden’ • input fields can have default values • action – the script/page designated to handle the form input on the server side • method • get – inputs are appended to URL • post – inputs are sent to server separately, in the request body
Web Form Processing • Data sent to the server is not encrypted by default • If data is appended to the URL, it can be stored in web logs and history lists or it can be passed along to third parties
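The sketch below shows why form data sent with get ends up in logs and history: it becomes part of the URL itself, while data sent separately travels in the request body (the post method in HTML forms). The hostname bob.example and the field names are hypothetical. Neither form is encrypted unless https is used.

```python
from urllib.parse import urlencode

# Hypothetical form inputs and form action.
form_inputs = {"user": "alice", "account": "12345"}
action = "http://bob.example/transfer"

# get: inputs are appended to the URL, so they show up wherever the URL does.
get_url = action + "?" + urlencode(form_inputs)

# Sent separately: the same encoded inputs go in the request body instead.
post_url = action
post_body = urlencode(form_inputs).encode("ascii")

print(get_url)
print(post_url, post_body)
```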
Cross-Site Scripting • Bob’s site allows users to upload content (pictures, blogs), but does not check for malicious payloads • Bob’s site inadvertently displays the malicious content, perhaps in the form of a link • Alice visits Bob’s site and clicks the link
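A common partial defense, sketched here with Python's standard html module, is for Bob's site to escape user-supplied text before placing it in a page. The uploaded string is a made-up payload, and escaping alone is not a complete defense; the right encoding depends on where in the page the content is inserted.

```python
import html

# A made-up malicious upload.
uploaded = '<script>alert("xss")</script>'

# Escape HTML metacharacters so the browser shows text instead of running it.
safe = html.escape(uploaded)

print(safe)
```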
Page Content • Browser incompatibilities • Platform/browser incompatibilities • Rendering is not always done by the browser • PDFs • other applications • Opens these applications up to web-based attacks
Active (Executable) Content • ActiveX – small pieces of distributed programs downloaded and executed from browsers • Java applets (not so much) • JavaScript
One Page • User thinks they are accessing a single page • Pages commonly include images • Images may be located in a local cache • Or a separate http request may be issued for each image • Image may reside on a site other than the web page • Image may be too small to be visible
One Page (continued) • Page could include scripts (as well as images) • Framesets – page could be broken into several frames, each of which could be located on a different server
Sessions • http is stateless – information is not saved between request/responses • But applications need to save state • They do this by considering a series of related requests as a session
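A minimal sketch of how a server might tie related requests together: it hands out a random session identifier and keeps per-session state keyed by it. The names sessions and handle_request are hypothetical, and a real application would also expire sessions and protect the identifier in transit.

```python
import secrets

# Server-side state, keyed by session identifier.
sessions = {}

def handle_request(session_id=None):
    """Return (session_id, request_count) for this request."""
    if session_id not in sessions:
        # Unknown or missing identifier: start a new session.
        session_id = secrets.token_urlsafe(16)
        sessions[session_id] = {"requests": 0}
    sessions[session_id]["requests"] += 1
    return session_id, sessions[session_id]["requests"]

sid, count = handle_request()        # first request: new session
sid, count = handle_request(sid)     # second request: same session
print(sid, count)
```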
Cookies • Data is stored on the client machine • Cookie has an expiration date • Cookie has a domain and path • When the browser visits a server, it looks for client-side cookies whose domain and path match the server. • If such an unexpired cookie exists, it is passed to the server automatically
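The matching rule above can be sketched as a simplified check. The function cookie_matches and the cookie dictionary layout are hypothetical, and real browsers apply more detailed rules than this.

```python
from datetime import datetime, timezone

def cookie_matches(cookie, host, path, now):
    """Simplified test: should this cookie be sent to host/path right now?"""
    domain = cookie["domain"].lstrip(".")
    domain_ok = host == domain or host.endswith("." + domain)
    cookie_path = cookie["path"]
    path_ok = path == cookie_path or path.startswith(cookie_path.rstrip("/") + "/")
    unexpired = now < cookie["expires"]
    return domain_ok and path_ok and unexpired

# A hypothetical cookie scoped to the slide's example site.
cookie = {
    "name": "session",
    "value": "abc123",
    "domain": "metrostate.edu",
    "path": "/~fitzgesu",
    "expires": datetime(2030, 1, 1, tzinfo=timezone.utc),
}
now = datetime(2025, 1, 1, tzinfo=timezone.utc)

print(cookie_matches(cookie, "cs.metrostate.edu", "/~fitzgesu/courses/", now))
```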
Saving State • Hidden fields in forms can be used to maintain state and pass information from page to page • Browsers use the REFERER field to tell the server which page the user was on when requesting a new one
Sessions – The Holes • Should a session accommodate a family of servers at the same site? • Alliances between domains? • Allow interruptions (go to other sites, return)? • Multiple windows? • Different browsers?
The Holes (continued) • What state is preserved? • Which servers see it? • What can adversaries with access to Alice’s machine learn? • How does the server know the user is still the same user? • How does the server know the client has not replaced the front end?
The Network • Plaintext traffic can be eavesdropped • DNS and BGP attacks can send browser to wrong website
Security Techniques for Web Apps • Basic access control • .htaccess • restricts access to specific directories by IP address or domain name • can be used to password protect specific directories • Not a significant barrier • IP addresses can be forged • username and password are sent in plaintext • username and password do not change • contents are sent in plaintext
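To see why password protection of this kind is not a significant barrier, consider HTTP Basic authentication, a scheme commonly used for password-protected directories. The credentials are only base64-encoded, which anyone who sees the traffic can reverse. The username and password below are made up.

```python
import base64

# Made-up credentials.
username, password = "alice", "s3cret"

# Basic authentication encodes "username:password" with base64; no key is involved.
token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
header = f"Authorization: Basic {token}"

# An eavesdropper simply decodes the header value.
recovered = base64.b64decode(token).decode("utf-8")

print(header)
print(recovered)
```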
Server-Side SSL • Secure Sockets Layer (SSL) – originally developed by Netscape • Ubiquitous • See ladder diagram, p. 319 • Specify https as link protocol • Browser and server choose a crypto algorithm • symmetric cipher • integrity checking method • Server has public/private key pair and X.509 certificate
Server-Side SSL (continued) • Server sends certificate to browser • Browser decides whether or not to trust certificate • Browser generates random bits – the pre_master_secret, encrypts with server’s public key and sends to server • Both browser and server now privately know the pre_master_secret
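The trust decision is normally made by the client library rather than the user. As a small sketch, Python's ssl module creates a client context that, by default, requires a certificate chaining to a trusted root and matching the hostname. No connection is made here.

```python
import ssl

# Default client-side settings for authenticating a server.
context = ssl.create_default_context()

# The server must present a certificate that validates against trusted roots...
print(context.verify_mode == ssl.CERT_REQUIRED)
# ...and the certificate must match the hostname being contacted.
print(context.check_hostname)
```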
Server-Side SSL (continued) • Both browser and server use pre_master_secret to generate • encryption key for browser-to-server communications • different encryption key for server-to-browser communications • two keys for integrity checking function (one for sending, one for receiving) • All subsequent communication is encrypted
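The idea of expanding one shared secret into four direction-specific keys can be sketched as follows. This is a toy derivation using HMAC-SHA256 with made-up labels, not the actual SSL/TLS key derivation function. The client_random and server_random values stand in for the random values the two sides exchange during the handshake and are an assumption beyond what the slide states.

```python
import hashlib
import hmac
import secrets

def derive_keys(pre_master_secret, client_random, server_random):
    """Toy expansion of one secret into four keys (NOT the real TLS PRF)."""
    def expand(label):
        data = label + client_random + server_random
        return hmac.new(pre_master_secret, data, hashlib.sha256).digest()
    return {
        "browser_to_server_enc": expand(b"c2s enc"),
        "server_to_browser_enc": expand(b"s2c enc"),
        "browser_to_server_mac": expand(b"c2s mac"),
        "server_to_browser_mac": expand(b"s2c mac"),
    }

pre_master_secret = secrets.token_bytes(48)  # random bits chosen by the browser
client_random = secrets.token_bytes(32)      # assumed handshake value
server_random = secrets.token_bytes(32)      # assumed handshake value

# Both sides run the same computation and arrive at the same four keys.
browser_keys = derive_keys(pre_master_secret, client_random, server_random)
server_keys = derive_keys(pre_master_secret, client_random, server_random)
print(browser_keys == server_keys)
```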
Server-Side SSL (continued) • SSL is now standardized and known as Transport Layer Security (TLS) • Also called server-side SSL since the server presents the PKI certificate
SSL Protections • Confidentiality – encrypted communications • Integrity – traffic is hashed • Authentication – server has certificate • No reflection attacks – keys are different for each direction of traffic • No reordering – messages have sequence numbers • No replay attacks – keys are fresh for each session
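Two of these protections can be illustrated with a keyed hash that covers a sequence number. The functions protect and verify, the keys, and the messages are hypothetical, and this is a simplified picture rather than the actual SSL record format.

```python
import hashlib
import hmac

def protect(mac_key, seq, message):
    """Attach an integrity tag covering the sequence number and the message."""
    tag = hmac.new(mac_key, seq.to_bytes(8, "big") + message, hashlib.sha256).digest()
    return message, tag

def verify(mac_key, expected_seq, message, tag):
    """Check the tag against the sequence number the receiver expects next."""
    expected = hmac.new(mac_key, expected_seq.to_bytes(8, "big") + message,
                        hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

browser_to_server_key = b"a" * 32  # made-up key for one direction
server_to_browser_key = b"b" * 32  # made-up key for the other direction

first = protect(browser_to_server_key, 0, b"pay Bob")
second = protect(browser_to_server_key, 1, b"pay Edna")

print(verify(browser_to_server_key, 0, *first))   # in order: accepted
print(verify(browser_to_server_key, 0, *second))  # reordered: rejected
print(verify(server_to_browser_key, 0, *first))   # reflected back: rejected
```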
SSL Problems • User interface problems • user may send via https • but server may respond via http • PKI • Can user trust the server’s certificate? • Pre-installed trusted root CAs may be outdated • Users may click OK on warnings
SSL Problems (continued) • Bugs • Bad PRNG led to bad shared keys • Post-SSL privacy spills • Servers are hacked, information is stolen from servers, not eavesdropped
Client-Side SSL • Server also asks browser to present its certificate • See Fig 12.3, p. 325
Privacy • Browsing leaves traces on client side • browsing history • back/forward • cookies • cache images • visited links • autofill • saved login information
Privacy (continued) • Server-side • URL information • responses to forms input • cookies • REFERER field • user’s IP address and hostname • connection speed • browser, version • operating system
Third-Party Servers • Bob’s site may include images or frames loaded from Edna’s site • When Alice visits Bob’s site, Edna will see • the URL of the page at Bob’s site (REFERER) • GET data • Edna’s cookies
DoubleClick • An advertising agency, DoubleClick, convinced many commercial sites to include a DoubleClick image • DoubleClick was then able to plant and retrieve cookies from many users across many web pages and track user browsing behavior • DoubleClick then displayed demographically tailored ads
Private Browsing • Onion routing – each node on a packet’s route knows only about its direct predecessor and successor • CROWDS • web requests are randomly passed around a crowd • the request is sent to a server by a random member of the crowd • The potential for abuse where there is no accountability is high
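The layering behind onion routing can be sketched with a toy example: the sender wraps the request once per node, and each node can peel only its own layer, learning just the next hop. The cipher below is a deliberately simple XOR keystream for illustration only and is not secure. The node names, keys, and function names are all hypothetical, and real systems use proper public-key and symmetric cryptography.

```python
import hashlib
import json

def keystream_xor(key, data):
    """Toy cipher (NOT secure): XOR data with a SHA-256 based keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def build_onion(message, route, destination, keys):
    """Wrap message in one layer per node, innermost layer for the last node."""
    next_hop, blob = destination, message
    for node in reversed(route):
        layer = json.dumps({"next": next_hop, "data": blob.hex()}).encode("utf-8")
        blob = keystream_xor(keys[node], layer)
        next_hop = node
    return blob

def peel(node, blob, keys):
    """A node removes its own layer and learns only where to forward the rest."""
    layer = json.loads(keystream_xor(keys[node], blob).decode("utf-8"))
    return layer["next"], bytes.fromhex(layer["data"])

route = ["n1", "n2", "n3"]                           # hypothetical nodes
keys = {name: (name * 8).encode() for name in route}  # made-up per-node keys
onion = build_onion(b"GET /", route, "server", keys)

hop, blob = peel("n1", onion, keys)   # n1 learns only that n2 is next
print(hop)
```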
P3P • Platform for Privacy Preferences • Web sites encode their privacy policies • A privacy tool reads the privacy policy and compares it against the user’s preferences • A good idea which has not caught on
Web Services • Calling a procedure over the web • Data is exchanged in Extensible Markup Language (XML) format • XML is wrapped up in SOAP envelopes and transported using HTTP • New security mechanisms are needed (XML-signatures, XML-encryption, XML Key Management Specification, Security Assertion Markup Language, Extensible Access Control Markup Language)
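A sketch of what "XML wrapped up in a SOAP envelope" looks like, built with Python's standard library. The namespace is the SOAP 1.1 envelope namespace; the procedure name GetCourseList and its instructor parameter are hypothetical. In practice the resulting bytes would be sent as the body of an HTTP request.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("soap", SOAP_NS)

# Envelope > Body > the procedure call and its arguments.
envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
call = ET.SubElement(body, "GetCourseList")            # hypothetical procedure
ET.SubElement(call, "instructor").text = "fitzgesu"    # hypothetical argument

xml_bytes = ET.tostring(envelope, encoding="utf-8")
print(xml_bytes.decode("utf-8"))
```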
Summary • Web applications are at the core of commerce today • Standards, code and practices are still evolving • Users’ mental models are too simple • Data inputs open the door to security problems • Sessions are used to maintain state • SSL is the most common web app security mechanism • There is no privacy
References • Smith and Marchesini, The Craft of System Security, Addison-Wesley, 2008 • http://browserspy.dk/