DIS Revision Week 13
What are Distributed Information Systems? • “Systems where the processing and/or data storage are distributed across two or more autonomous networked computers” • Almost all information systems in current use are, by this definition, distributed • The most common experience for most people of a distributed system is from the use of the web.
DIS are complex • 1000s of components • 100s of suppliers • Sheer size in database and users • Geographic spread • Frequent change
We are approaching DIS as an architect would • Carry out the broad design • Architects use structural and mechanical engineers and the various trades • System architects use network specialists, programmers, analysts, DBAs and the like • But are responsible overall • So we need to know enough to specify and supervise
What are standards & protocols? • These terms are used fairly interchangeably in the computer world. It can be argued that a protocol is a type of standard peculiar to computer systems, usually with a time element. • A protocol defines the format and order of messages exchanged between two communicating entities, and the actions taken on receipt or transmission of a message.
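The definition above (message format, message order, and actions on receipt) can be sketched as a small state table. This is a toy illustration only: the message names HELLO, DATA and BYE and the states are hypothetical and do not come from any real protocol.

```python
# Hypothetical toy protocol: (current state, message received) -> (next state, reply sent).
# The table encodes both the allowed ORDER of messages and the ACTION taken on receipt.
TRANSITIONS = {
    ("idle", "HELLO"): ("greeted", "HELLO-ACK"),
    ("greeted", "DATA"): ("greeted", "DATA-ACK"),
    ("greeted", "BYE"): ("idle", "BYE-ACK"),
}


def receive(state: str, message: str) -> tuple[str, str]:
    """Return (next_state, reply); out-of-order messages leave the state unchanged."""
    return TRANSITIONS.get((state, message), (state, "ERROR"))
```

Sending DATA before HELLO is rejected, which is the "order of messages" part of the definition.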
Some examples of standards & protocols • De facto (by fact – by general acceptance) • TCP/IP – managed by the Internet Engineering Task Force (IETF) • HTTP, HTML & XML – managed by the IETF & W3 Consortium • IBM PC platform – established by IBM, Intel & Microsoft • De jure (by law – set by an officially recognised body) • LAN standards – 802.x set by IEEE • V series (V.32, V.33), X series (X.25, X.500) and ISDN – set by the ITU-T (formerly called the CCITT), part of a United Nations agency • But the boundaries are blurred
Business rules • They are the rules, definitions and policies that are necessary for any organisation to function • Examples are: • Course pre-requisites – INFO2000 or INFO2006 for this course • Parking fines must be paid within 30 days • Employees who work less than 30 hours per week are judged as part-time etc • Many are very complex • The DIS automates many of those rules • But often not precisely defined until then • And very difficult to do – but necessary!
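A minimal sketch of how the three example rules above might look once automated. The thresholds come from the slide; the function names are hypothetical. Writing the rules as code forces decisions the prose leaves open (for example, whether a payment on day 30 is still on time), which is the "not precisely defined until then" point.

```python
from datetime import date, timedelta

PART_TIME_THRESHOLD_HOURS = 30  # from the slide: less than 30 hours per week is part-time
FINE_PAYMENT_DAYS = 30          # from the slide: fines must be paid within 30 days


def meets_prerequisite(completed: set[str]) -> bool:
    """Course pre-requisite: INFO2000 or INFO2006."""
    return "INFO2000" in completed or "INFO2006" in completed


def fine_is_overdue(issued: date, today: date) -> bool:
    """Assumption: payment on day 30 itself still counts as 'within 30 days'."""
    return today > issued + timedelta(days=FINE_PAYMENT_DAYS)


def is_part_time(hours_per_week: float) -> bool:
    """Exactly 30 hours is NOT part-time under a strict 'less than' reading."""
    return hours_per_week < PART_TIME_THRESHOLD_HOURS
```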
There are many different types of applications in a DIS • Communications • Information • Commercial • Education, Health etc • Government • Multi-media • E-Commerce
Structural change has been underway in business for some years • Integration of the world’s capital markets • Reduction in trade and capital barriers • Privatisation of government services • Business Process Re-engineering (BPR) • Enterprise Resource Planning systems (ERP) • Technology fitting Moore’s Law • Focus on core business & outsourcing
Characteristics of the traditional model • High fixed capital • Owned production capacity • Sell what you make • Reduce cost of production by • Large scale plant • Increased throughput
Characteristics of the new model • Very few capital assets • Often no production capacity • Concentrates on customers (CRM) and brand • Speed of response is the driver • Manages a network of suppliers • Suppliers bid via an electronic market • Design is collaborative – via internet
Characteristics of the new model (cont.) • Customer orders placed via Internet • Orders are routed automatically to the appropriate suppliers and component manufacturers • Goods are routed directly from supplier to customer • Customers and suppliers have full access to computer systems showing status of orders • Administration systems are also outsourced
Corporate Business Strategies • Increasingly, businesses have 3-5 year business strategies. These seek to define the business they are in and their plans for the next 3-5 years • IT is an enabler and a critical success factor in achieving those plans • Thus a corporate IT strategy is an underlying requirement
We start with a Business Strategy • In most cases an organisation will start with a business strategy. This is increasingly necessary because: • Business conditions change rapidly • Competition is actively encouraged • Management teams change more frequently • Business is more complex • Organisations have to be focused • Organisations seek to re-invent themselves rapidly
Many objectives will affect IT • Some of these will directly require IT services • IT can also feed into the process and facilitate new strategies and objectives • IT must brief senior management on emerging technologies • Differentiate between technologies that are established and those which may offer more potential but are not yet certain • IT may also prevent strategies from being followed • It is an iterative process
Where do we start in the design process? • Like a building architect, by assembling a brief • The Corporate IT strategy defines many of the components • The problem definition sets the functional boundaries • Existing systems pose some constraints • Volumes of data, transactions and users establish the size • The location of users sets parameters on security, internationalisation and controls • User community agrees performance criteria
Design is an iterative process • It starts in the feasibility study. • Often a number of preliminary designs are examined at this stage, costed and discussed • As the stages of development proceed, so the design is reworked and refined • Often the final design bears little similarity to the one opted for in the feasibility study
The feasibility study will • Define the key processes • Define the initial data model • Specify interface requirements to other systems • Identify and review the relevant corporate IT strategies and standards • Collect the volumes • Review solutions to the same problem in other organisations • Identify and review possible application packages
As the process continues • Make or buy decisions will be made • Development tools and methodologies will be put in place • DBMS will be selected • Development and implementation plans will be developed • Capital and operating costs will be estimated • Configuration and location of servers and data storage will be determined • Networks will be designed, upgraded and sized
And continues • Risks will be identified and minimisation strategies developed • Performance criteria agreed • Security requirements established • Implementation steps identified • The client server model selected • Infrastructure components identified in detail • The data model is developed • Processes are analysed and designed
The main client server models • [Figure: the Centralised, PC LAN, 2 Tier, 3 Tier and 4 Tier client server models, showing where the Presentation, Application, Database and File system layers sit on either side of the Network (LAN and WAN)]
Database tier • This is the most easily defined • It parses and executes SQL to: • Update the database, or • Make the query and pass back the requested data set • Maintains transaction integrity (ACID) for a single database – moves back to application tier for multiple databases
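A minimal sketch of the single-database transaction integrity described above, using Python's built-in sqlite3 module as a stand-in DBMS. The account table and the transfer function are hypothetical. The point is atomicity: either both updates are applied or neither is.

```python
import sqlite3

# In-memory database standing in for the database tier (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    """Move funds in one transaction; roll everything back if a constraint fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> dict[int, int]:
    """Query the current balances, keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

An overdraft attempt fails on the second UPDATE, and the credit already applied by the first UPDATE is rolled back with it.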
Application tier • Executes the code that processes the application • Sometimes the interface between Presentation and Application is blurred • Varies between implementations • An example might help. In an enrolment system: • Presentation tier would • gather the details of the course and • establish that they were valid. • Application tier would • Process the rules to ensure you were eligible to take those courses, • update your records via SQL to the Database tier, and • draft a course schedule for the Presentation layer to show you.
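The enrolment example can be sketched as two functions, one per tier, with plain dictionaries standing in for the Database tier. All names and data here are hypothetical (including the course code INFO3005), and a real system would issue SQL rather than update a dictionary.

```python
# Database tier stand-in (hypothetical data): course -> prerequisites, any one of which suffices.
COURSES = {"INFO3005": {"INFO2000", "INFO2006"}}
STUDENT_RECORDS = {"s1": {"completed": {"INFO2000"}, "enrolled": []}}


def presentation_gather(raw_course: str) -> str:
    """Presentation tier: gather the course details and establish that they are valid."""
    code = raw_course.strip().upper()
    if code not in COURSES:
        raise ValueError(f"unknown course: {code}")
    return code


def application_enrol(student_id: str, course: str) -> list[str]:
    """Application tier: check eligibility, update the record, draft a schedule."""
    record = STUDENT_RECORDS[student_id]
    if not (COURSES[course] & record["completed"]):
        raise PermissionError(f"{student_id} lacks a prerequisite for {course}")
    if course not in record["enrolled"]:
        record["enrolled"].append(course)  # would be an SQL UPDATE in a real system
    return sorted(record["enrolled"])      # schedule handed back to the Presentation layer
```

Where the validity check lives is a judgment call, which is the "blurred interface" the slide mentions.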
3 & 4 Tier Presentation • In a three tier system, the Presentation layer code is held remotely on the client or a local server. It presents forms etc for viewing or for data entry. It still has application specific material that must be updated if an application changes • Four tier usually means a Web-based system • The presentation layer is then split – the application specific stuff stays in the web server so that the only part that is required to be resident in the client is the Browser
As DIS architects, we want a network service that: • Provides a reliable message transport • Gives acceptable & predictable transmission times • Allows a host at any location to be part of the system • Does not require our application to adapt to any specific network characteristics.
Voice Networks • Voice networks were: • Circuit switched • Analogue • Circuit switching requires all resources to be dedicated for the length of the connection • Voice is a reasonably consistent user of bandwidth for the length of the connection • Data on analogue circuits requires a modem
Data Networks • Data does not use switched circuits efficiently as data is bursty – large quantities of data in bursts followed by quiet periods • Packet switching gives better utilisation as many users can then share the channels • Digital signals allow greater bandwidth • High capacity lines can be multiplexed into multiple digital channels • Voice can be digitised and packetised for transmission on data networks – eventually all networks will be packet switched
Packet switched networks • Messages are broken into packets, usually variable in length but with a maximum length • Each packet of data is wrapped in an envelope with an electronic address • Packets are sent down the line like cars on a highway • Routers act like road junctions, directing the packet along the right road to get to the eventual destination • Packet switched networks can be virtual circuit or datagram
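A minimal sketch of breaking a message into addressed packets and putting it back together at the destination. The Packet fields and the 8-byte payload limit are hypothetical choices for illustration, not a real packet format.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 8  # bytes per packet; a hypothetical limit chosen to keep the example small


@dataclass
class Packet:
    destination: str  # the "electronic address" on the envelope
    sequence: int     # position of this packet within the message
    total: int        # number of packets in the whole message
    payload: bytes


def packetise(message: bytes, destination: str, max_payload: int = MAX_PAYLOAD) -> list[Packet]:
    """Split a message into packets no longer than max_payload."""
    chunks = [message[i:i + max_payload] for i in range(0, len(message), max_payload)] or [b""]
    return [Packet(destination, n, len(chunks), chunk) for n, chunk in enumerate(chunks)]


def reassemble(packets: list[Packet]) -> bytes:
    """Rebuild the message, even if the packets arrived out of order."""
    ordered = sorted(packets, key=lambda p: p.sequence)
    if not ordered or len(ordered) != ordered[0].total:
        raise ValueError("missing packets")
    return b"".join(p.payload for p in ordered)
```

The sequence number matters in a datagram network, where packets may take different routes and arrive out of order.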
Effective end-to-end transfer rates determined by: • The bandwidth of each link • The Latency at each switch • The Store & Forward process • The congestion or queuing at switches • Lost packets due to buffer overflow • Error detection and correction mechanism
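A rough worked calculation for the first three factors above (link bandwidth, switch latency, store and forward). It is a simplification under stated assumptions: one packet, no queuing, no loss, no propagation delay, and a fixed processing latency at each switch. The function names are hypothetical.

```python
def store_and_forward_delay(packet_bits: float,
                            link_rates_bps: list[float],
                            per_switch_latency_s: float = 0.0) -> float:
    """Delay for one packet that is fully received at each switch before being forwarded."""
    if not link_rates_bps:
        raise ValueError("at least one link is required")
    transmission = sum(packet_bits / rate for rate in link_rates_bps)
    switches = len(link_rates_bps) - 1  # one switch between each pair of consecutive links
    return transmission + switches * per_switch_latency_s


def bottleneck_bps(link_rates_bps: list[float]) -> float:
    """Sustained throughput cannot exceed the slowest link on the path."""
    return min(link_rates_bps)
```

For a 12,000 bit packet over links of 10 Mbps, 1 Mbps and 10 Mbps with 1 ms at each of the two switches: 1.2 ms + 12 ms + 1.2 ms + 2 ms = 16.4 ms, dominated by the slow middle link.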
The Layers of the Internet architecture • Application – HTTP, FTP etc • Transport – TCP and UDP • Network – IP – connectionless & unreliable • Data Link – FR, ATM • Physical
Domain Name Service • Converts host names e.g. cs.usyd.edu.au to 32 bit IP addresses e.g. 126.96.36.199 • IP addresses are made up of two parts • Network address • Host or device address • IPv6 will introduce 128 bit addresses (maybe)
An Organisation’s network can be: • Leased channels • VPN Virtual Private Network • VPN on Public network • Public Network • Combination of some or all of these
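A short sketch using Python's standard ipaddress and socket modules to show the 32 bit value and the network/host split. The address 192.0.2.10 and the /24 prefix are assumptions chosen for illustration (192.0.2.0/24 is reserved for documentation). The resolve function needs a working resolver and network access, so it is defined here but not called.

```python
import ipaddress
import socket


def resolve(hostname: str) -> str:
    """Ask the system resolver (normally DNS) for a host's IPv4 address."""
    return socket.gethostbyname(hostname)


# Assumed example address and prefix length; the /24 decides where network ends and host begins.
iface = ipaddress.ip_interface("192.0.2.10/24")

as_int = int(iface.ip)                                  # the address as one 32 bit number
network_part = iface.network.network_address            # 192.0.2.0
host_part = int(iface.ip) & int(iface.network.hostmask) # the low 8 bits: 10

ipv4_bits = iface.ip.max_prefixlen                                # 32
ipv6_bits = ipaddress.ip_address("2001:db8::1").max_prefixlen     # 128
```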
Leased circuits • High initial fixed cost – may be cheaper if bandwidth well utilised • Fixed bandwidth – not easy to add bandwidth • Longer time frame to set-up • Circuits may not be readily available • Not flexible for mobile users
Frame Relay VPNs • Easier to set up • Buy as much bandwidth (CIR) as needed and increase with a phone call • FR allows bursting above CIR if capacity is available. • FR may not be available in some remote locations • Thus a POP may not be available for local call access from mobile users • Network can be managed by supplier
VPNs on Internet • Cheap to set up • Variable bandwidth • Wide availability is good for remote offices and mobile users • No guaranteed bandwidth although QoS is coming • Some concern about data security
Hubs, (Bridges) Switches & Routers • [Figure: protocol stacks of two Hosts (Application, Transport, Network, Link, Physical), a Hub (Physical only), a Bridge or Switch (Link, Physical) and a Router (Network, Link, Physical)]
Hubs • Physical level devices • They work at the bit level • When a bit is received from one line, it propagates down all the other lines • Can carry out limited network management functions – if an adaptor is faulty and floods the line with bits, the hub can internally disconnect that line • Extends the length of the LAN, because segment UTP lengths have discrete limits.
Bridges • Are Data link layer devices • Work on frames and use adaptor addresses • Store & forward devices • They act as a switch and only send frames down the line where the destination device is, thus if the frame address is not “over” the bridge the frame is not passed on. • Create limited area “collision zones” • Usually support 2-4 links • Can connect links of different bandwidths e.g. 10 & 100 Mbps Ethernet • They are plug & play devices – they learn where adaptors are • Will disable duplicate paths in their internal tables.
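The "plug & play" learning and the filtering behaviour can be sketched as follows. This is a simplified model with hypothetical names: it ignores table ageing and the duplicate-path handling mentioned above.

```python
class LearningBridge:
    """Simplified learning bridge: maps adaptor addresses to the port they were seen on."""

    def __init__(self, ports: list[int]) -> None:
        self.ports = list(ports)
        self.table: dict[str, int] = {}  # adaptor address -> port

    def receive(self, in_port: int, src: str, dst: str) -> list[int]:
        """Return the ports the frame is forwarded to."""
        self.table[src] = in_port  # learn where the sender lives
        out_port = self.table.get(dst)
        if out_port is None:
            # Destination not yet learned: flood to every port except the arrival port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is not "over" the bridge: the frame is not passed on.
            return []
        return [out_port]
```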
Switches • Are newer Link layer Ethernet devices (but there are WAN switches as well e.g. ATM switches) • Tend to replace bridges but do similar things • Larger number of links 12+ • Higher performance design – required because of larger number of links • Facilitates connection of servers
Routers • Network layer devices • Transfer IP packets and use IP addresses • Transfer packets down the best link to get to the destination host • Support redundant links • While they are inherently slower than hubs and switches, the more sophisticated technologies used compensate for that. • They are the “end device” of separate networks within the Internet • Can be used as simple firewalls by filtering out unwanted packets.
Routing algorithms • The network layer has to determine the route the message is to take • In a virtual circuit all packets for the connection will follow the same path • In a datagram service like IP, packets may take different routes • In both situations the routing algorithm within the Network layer will determine the routes
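The slide does not name a particular algorithm. As one well-known illustration, Dijkstra's shortest-path algorithm finds the least-cost route through a network whose link costs are known. The graph representation below is a hypothetical choice for the sketch.

```python
import heapq


def shortest_path(graph: dict[str, dict[str, int]], source: str, target: str):
    """Dijkstra's algorithm: return (total_cost, path) or None if target is unreachable.

    graph maps each node to {neighbour: link_cost}; costs must be non-negative.
    """
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)  # cheapest unexplored route so far
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    return None
```

In a virtual circuit network the route would be computed once at connection set-up; in a datagram network each router makes a forwarding decision per packet.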
Quality of Service • One drawback with the Internet is that it is democratic: every packet is treated as being as important as any other. • It provides “best effort” service • IPv4 has no mechanism to provide priority • This is needed for time critical applications such as telephony, real time conferencing and high performance transaction processing • QoS aims for a predictable and specifiable bandwidth and latency
QoS the key to one network • When packet switched networks can offer the QoS of switched circuits, that will be the day when all major users stop having two networks • Service providers are aware of this • The network must be able to differentiate between delay sensitive and delay insensitive applications
QoS requires: • The ability to request and receive resource reservation • Bandwidth • Router buffers • Prioritisation where network traffic is classified and priority given according to bandwidth management policy • These services could be for: • An individual data stream • Aggregate flows of a particular type
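The prioritisation requirement can be sketched as a priority queue at a router's output link. The traffic classes and their ranking are a hypothetical bandwidth management policy, and strict priority is only one of several possible scheduling schemes.

```python
import heapq
from itertools import count

# Hypothetical policy: a lower number is served first.
PRIORITY = {"voice": 0, "transaction": 1, "bulk": 2}


class PriorityScheduler:
    """Strict-priority output queue; packets of equal priority leave in arrival order."""

    def __init__(self) -> None:
        self._queue: list[tuple[int, int, str]] = []
        self._arrival = count()  # tie-breaker that preserves arrival order within a class

    def enqueue(self, traffic_class: str, packet: str) -> None:
        heapq.heappush(self._queue, (PRIORITY[traffic_class], next(self._arrival), packet))

    def dequeue(self) -> str:
        return heapq.heappop(self._queue)[2]

    def __len__(self) -> int:
        return len(self._queue)
```

Delay sensitive traffic overtakes delay insensitive traffic already waiting, which is the differentiation a best effort network lacks.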
The Web is an application! • To many people The Internet and The Web are synonymous • But we know that The Web is an application that sits at the application level of the Internet • But it is the biggest, and therefore the most important to most people • But theoretically it could use different protocols on a different network
Some definitions • HTML (HyperText Mark-up Language) describes how the document is to be presented, with tags or meta-data embedded in the document. The Browser then uses that meta-data to format the document • HTTP is the application level protocol or service for establishing connections and transmitting messages between the Browser client and the Web server
Statelessness in HTTP • HTTP is a stateless protocol • When a resource has been sent, the server keeps no record of the exchange, so that if a second request is made by the same client, it is as if this was the first contact with that client • This is not satisfactory for many complex transactions, say completing a multi-page form
Techniques for improving Web performance • Caching • Load balancing • Content Distribution Networks
Caching • Initially implemented near the client in a proxy server operated by the organisation – all requests are first directed at the proxy server. If it cannot supply the page, the request is passed on to the target server. • Works on the basis that similar users frequently access the same pages – between 20% and 70% of requests can be satisfied this way, reducing bandwidth on the WAN
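A minimal sketch of the proxy behaviour described above: answer from the local store when possible, otherwise pass the request on and keep a copy. The class and the fetch_from_origin callable are hypothetical, and the sketch ignores expiry and validation, which a real proxy must handle.

```python
from typing import Callable


class ProxyCache:
    """Toy caching proxy that counts hits and misses."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        if url in self._store:
            self.hits += 1            # served locally, no WAN traffic
            return self._store[url]
        self.misses += 1
        body = self._fetch(url)       # request passed on to the target server
        self._store[url] = body
        return body

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The hit rate is the fraction of requests satisfied without going to the target server, which is the figure the 20% to 70% range refers to.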