Databases in Internet Applications: Case Studies Anil Nori CTO AserA Inc. Palo Alto USA

Presentation Transcript

  1. Databases in Internet Applications: Case Studies Anil Nori CTO AserA Inc. Palo Alto USA

  2. Acknowledgements Sources for some of the material • Oracle Corporation • CNN Custom News • Excite • Cisco

  3. Database Technology Timeline (diagram) • Range: Simple Data Management to Global Enterprise Management • Eras: Early 80s; Late 80s; Early - Mid 90s; Late 90s - 21st C • Generations: Pre-relational; Early Relational; Client-server Relational; Enterprise-capable Relational; Internet Computing • Application labels: Simple OLTP; Packaged & Vertical Applications; Data Warehouse & Hi-end OLTP • Other labels: Simple transactions, on-line backup & recovery; Stored procedures, triggers; Active Database; Scaleable OLTP, parallel query, partitioning, cluster support, row-level locking, high availability; Support for all types of data, extensibility, objects; Middleware (messaging, queues, events); Java, CORBA, Web interfaces

  4. Current State of DBMSs • OLTP applications • Large amounts of data • Simple data, simple queries and updates • Update statement from debit/credit transaction: UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid; • Typically update intensive • Large number of concurrent users (transactions) • Data warehousing applications • Large amounts of data • Simple data but complex querying • Typically read intensive • Large number of users
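The debit/credit update on slide 4 can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the Oracle server the slides assume; the accounts schema, the apply_delta helper and the starting balance are illustrative assumptions, not part of the talk.

```python
import sqlite3


def apply_delta(conn, aid, delta):
    """Apply one debit/credit update in a transaction and return the new balance."""
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid",
            {"delta": delta, "aid": aid},
        )
        if cur.rowcount != 1:
            raise KeyError(f"no such account: {aid}")
        row = conn.execute(
            "SELECT abalance FROM accounts WHERE aid = :aid", {"aid": aid}
        ).fetchone()
    return row[0]


# Hypothetical schema and data, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (aid INTEGER PRIMARY KEY, abalance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts (aid, abalance) VALUES (1, 100)")
conn.commit()
```

The named placeholders (:delta, :aid) mirror the bind variables in the slide's statement, so the same SQL text is reused for every transaction.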

  5. Current State of DBMSs • These applications require: • Large numbers of users/transactions • High performance • High availability (7x24 operations) • Scalability • High levels of security • Administrative support • Good utilities

  6. Internet Applications: Challenges (diagram) • Panels: Transaction Processing; Data Warehousing • Dimension labels: Users; Size; Usage; Importance; Operations; Hours; Network; Systems; Systems Management; Larger User Populations • Contrasts shown: Trained vs. Self-Service; Analysts vs. Every Employee; Gigabytes vs. Terabytes; Independent vs. Integrated; Batch vs. Immediate; Simple vs. Intelligent; Local vs. Global; Useful vs. Business-Critical

  7. Internet Applications: Challenges (diagram) • Panels: E-commerce/Apps; Information Management; Site Operation • APIs: Proprietary vs. Open • Type: Tabular vs. Heterogeneous • Applications: Standalone vs. Integrated • Delivery: Generic vs. Personalized • Access: Read/write vs. Lots of read-only • Content: Direct vs. Search • Management: Low TCO, Mission Critical • Availability: Occasional vs. 24X7

  8. Internet Challenges • Availability • Need near 100% availability • Must be easy to manage • Replication, hot standby, foolproof system? • Scalability • Number of users is orders of magnitude higher • Security • Global users • Managing millions of users • Encryption • Performance • Internet user expectations • Speed vs correctness • (e.g. Search engines vs blade/cartridge/extender) • Availability vs correctness

  9. Internet Application Architecture: Today (diagram) • Client Tier: Browsers, authoring tools, etc. (HTTP) • Physical Middle Tier: WEB/APP Server; Middle Tier Application (application messages, remote messages); Data Integration, Storage, Query, Management; Gateways • Data Sources: ORDBMS; OLE/DB; other data sources

  10. Case Studies • CNN Custom News • Excite • Cisco Internet Applications

  11. CNN Custom News • On-line news service • Allows users to customize news in a personalized manner • Offers variety of news items (e.g. national, international, business etc.)

  12. Custom News Application Architecture (diagram) • Client Tier: Browsers (HTTP) • Hardware Load Balancing • Physical Middle Tier: WEB Servers; Application Servers • Database Tier: Oracle DBMS instances (OPS)

  13. CNN Custom News • Backend: • SUN SOLARIS enterprise servers • Oracle Parallel Server 7.3.4 • Middle-Tier (9 Machines) • Web Servers • Oracle Application Servers • PL/SQL Cartridges • Load Balancing • Hardware based • DNS router • Round-robin
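Slide 13 names round-robin as the load-balancing policy across the nine middle-tier machines. The rotation itself can be sketched in a few lines; the class name and the server names web1 to web9 are hypothetical, and the real deployment did this in hardware and DNS rather than in application code.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation, one per request."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def pick(self):
        # Each call advances the rotation by one server.
        return next(self._rotation)


# Nine hypothetical middle-tier machines, matching the count on the slide.
balancer = RoundRobinBalancer(f"web{i}" for i in range(1, 10))
```

Round-robin ignores server load and health, which is why it is usually paired with some form of failure detection in practice.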

  14. Oracle Application Server (diagram) • Cartridges • Adapter • CORBA • Backend

  15. CNN Custom News • Data feeds into the database • Keeps text in the database • Images in files • Images accessed in the middle-tier • PL/SQL Cartridge

  16. PL/SQL Cartridge (diagram) • Connection pooling • Session Caching • Parameter Marshalling • Validation • Result Processing • Components: OAS; PL/SQL Cartridge; Oracle DBMS (PL/SQL)

  17. PL/SQL • Server-side • Used to generate HTML • Suited for database logic

  18. Searching • Uses Oracle ConText cartridge • Content-based searching • Uses bitmap indexes

  19. CNN Custom News: Observations • Database-centric • Uses PL/SQL based scripting • Application Server for scalability

  20. Excite • Personalized online service that gives Web users everything they want, all in one place • Builds tools that manage vast amounts of information available on the internet • Provides variety of user services (apps): • News • Money and Investing -- stock quotes • Message boards and Chat • Mail • Communities • Classifieds • Jobs

  21. Excite • Supports suite of applications • Each application uses three-tier architecture • Federated approach • Many databases • Databases specific to applications • Application logic in the middle-tier as multi-threaded embedded C programs (Pro*C programs)

  22. Excite: An Application Architecture (diagram) • Client Tier: Browsers (HTTP) • Physical Middle Tier: WEB Servers; Middle Tier Applications • Database Tier: Oracle DBMS

  23. Excite - PFP Application • Personalized front page application • Application is deployed as 50 middle-tier daemon processes • The middle-tier application daemons perform: • Application logic in C • Connection pooling • Each daemon keeps about 40 connections to the database (about 2000 total connections to the database) • Load balancing
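The connection pooling described on slide 23 (about 40 connections per daemon, 50 daemons, about 2000 connections in total) can be sketched as a fixed-size pool. This is a rough illustration in Python with sqlite3 standing in for the database; the ConnectionPool class and its method names are assumptions, and Excite's daemons were written in C with Pro*C.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and reused by requests."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back

    def idle_count(self):
        return self._idle.qsize()


DAEMONS = 50                  # figure from the slide
CONNECTIONS_PER_DAEMON = 40   # figure from the slide
TOTAL_CONNECTIONS = DAEMONS * CONNECTIONS_PER_DAEMON  # 2000

# One daemon's pool; the in-memory database is a placeholder.
pool = ConnectionPool(
    CONNECTIONS_PER_DAEMON,
    lambda: sqlite3.connect(":memory:", check_same_thread=False),
)
```

Capping the pool size per daemon is what keeps the total connection count on the database server predictable as request volume grows.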

  24. Excite - PFP Database Configuration • Oracle8 on SUN Solaris server • 2 SUN 6500s -- 28-way SMP • PFP database is split into multiple databases for load balancing and scalability • Scalar data stored in the database in relational tables • About 20 tables for storing user profiles; 100 tables for content

  25. Excite - PFP Database Configuration • Multi-media content (e.g. stock quotes or news items) stored in memory-mapped files for fast access. File references stored in the database • A lot of the content is read-only; need not be backed up; can be reconstructed from the original sources
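The split described on slide 25, with content in memory-mapped files and only file references in the database, might look roughly like the following. The content_refs table, the helper names and the file layout are assumptions made for illustration; the slides do not describe Excite's actual schema.

```python
import mmap
import os
import sqlite3
import tempfile


def store_content(conn, directory, item_id, payload):
    """Write the (non-empty) payload to a file and record its path in the database."""
    path = os.path.join(directory, f"{item_id}.bin")
    with open(path, "wb") as f:
        f.write(payload)
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO content_refs (item_id, path) VALUES (?, ?)",
            (item_id, path),
        )


def read_content(conn, item_id):
    """Look up the file reference, then read the content through a memory map."""
    row = conn.execute(
        "SELECT path FROM content_refs WHERE item_id = ?", (item_id,)
    ).fetchone()
    if row is None:
        raise KeyError(item_id)
    with open(row[0], "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:]  # copy the mapped bytes out


# Hypothetical setup so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content_refs (item_id TEXT PRIMARY KEY, path TEXT)")
content_dir = tempfile.mkdtemp()
```

Because the files can be rebuilt from the original feeds, only the small reference table needs the database's backup and recovery machinery.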

  26. Excite - Scalability • By partitioning the application across multiple databases • Each application partition supported by multiple middle-tier daemon processes • Multiple web servers to reduce traffic congestion
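Partitioning an application across multiple databases needs some rule for deciding which database serves a given user. The slides do not say how Excite made that choice, so the hash-based routing below is purely an assumption, and partition_for is a hypothetical helper.

```python
import zlib


def partition_for(user_id, partitions):
    """Map a user id to a partition number in the range [0, partitions)."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(user_id.encode("utf-8")) % partitions
```

Every middle-tier daemon computing the same function reaches the same database for the same user, with no shared lookup table; the trade-off is that changing the partition count moves most users to a different partition.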

  27. Excite - Availability • Using replication and hot standby • Uses Oracle8 hot standby feature • Uses asynchronous replication. Data replicated at 10 sec latency • Almost every database is replicated for failover • Replication preferred over hot standby. Hot standby cannot be used for normal usage

  28. Excite - Other Applications • Most of the Excite applications have a similar three-tier architecture

  29. Excite - Observations • Some content (especially for communities applications) could be stored in the database. Management benefits attractive. If content stored in the database, access performance is very critical • Need fast replication • Currently not using middle-tier caching. Caching could be quite useful but coherency is an issue

  30. Cisco • Successfully implemented applications for the internet • Internet commerce • Order placement • Checking order status • On-line, guided product configuration • Price quotes • Employee self-service • Provides all employee services electronically • Employee directories • Employee benefits • Expense reports

  31. Cisco • Supply chain management • Networked suppliers, resellers and customers • Enables business partners to manage and operate major portions of its supply chain • Entire supply chain works off one central demand forecast • Customer care • Exchange of technical information • Software upgrades (90% of software upgrades via internet) • On-line support (70% of support on-line) • On-line, assisted trouble-shooting

  32. Cisco • Communications and collaboration • Sales and technical training • Virtual classrooms • Company-wide meetings and broadcasts

  33. Cisco Commerce Server Architecture (diagram) • Client Tier: Browsers (HTTP) • Physical Middle Tier: WEB Server; Commerce Server • Database Tier: Oracle DBMS (two shown); Oracle Applications

  34. Cisco Commerce Server • Typical three-tier architecture • Proprietary web server • Performs content aggregation • Encryption • Accesses Oracle DBMS • Runs on a dedicated SUN server • Proprietary commerce server • Proprietary application server • Performs variety of commerce functions

  35. Cisco Commerce Server • Scalability and availability • Big servers for scalability • Multiple commerce server processes for load balancing • Databases replicated • Hot standby for availability

  36. Case Studies: Observations • Database is being used mostly for storage • Application in the middle-tier • Middle-tier also provides: • scalability • load balancing • large number of users

  37. Analyzing Internet Applications • Web integration • Web publishing • Application integration • E-commerce

  38. WEB Integration • Heterogeneous data sources • Heterogeneous data types • 1000s of data sources • Dynamic data • Warehousing

  39. Web Publishing • Problem: internet placing new requirements on content management • Heterogeneity: access different types of content from browsers e.g. Email, data warehouses, reports, HTML files • Personalized: structured, dynamic, customized content • Transactive: content blending with application • Aggregation: portalization via major “gateways”

  40. Application Integration • Integrating Multiple Applications (e.g. ERP/Front Office) • Application workflow specification • Asynchronous communication • Queuing and propagation • Message tracking • Message warehouse (persistence) • Message broker/server • Data transformation • Transforming messages to different application formats (e.g. SAP, CLARIFY, …)
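The pieces listed on slide 40 (queuing and propagation, a message warehouse, a broker, and per-application transformation) fit together roughly as in this sketch. Everything named here is hypothetical: the MessageBroker class, the "erp" and "front_office" application names and the two toy transforms stand in for real formats such as SAP or CLARIFY, which are not modeled.

```python
import json
from collections import deque


class MessageBroker:
    """Queue each published message for every registered application,
    transformed into that application's format."""

    def __init__(self):
        self._queues = {}
        self._transforms = {}
        self.warehouse = []  # every original message, kept for tracking

    def register(self, app, transform):
        self._queues[app] = deque()
        self._transforms[app] = transform

    def publish(self, message):
        # The publisher returns immediately; consumers read later (asynchronous).
        self.warehouse.append(message)
        for app, transform in self._transforms.items():
            self._queues[app].append(transform(message))

    def receive(self, app):
        q = self._queues[app]
        return q.popleft() if q else None


def to_erp(message):
    # Toy format: upper-case field names.
    return {key.upper(): value for key, value in message.items()}


def to_front_office(message):
    # Toy format: JSON text with sorted keys.
    return json.dumps(message, sort_keys=True)
```

A persistent version would keep the queues and the warehouse in database tables, which is the role the slides suggest for the DBMS.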

  41. Electronic Commerce • Automating business-to-business, business-to-consumer interactions • Selling and buying • Order management • Product catalogs • Product configuration • Sales and marketing • Education and training • Service • Communities

  42. Database Technology Uses • Business/workflow transactions • Support across multiple database/ERP systems • Transactional • Tools to generate compensating actions • Transformations • Queuing • Support for heterogeneous messages • Transactional • Querying, e.g. On attribute, value pairs • Indexing, e.g. On attribute, value pairs • Publish/subscribe
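Slide 42 pairs publish/subscribe with querying on attribute, value pairs. One way to read that is content-based subscription: a subscriber names the pairs it cares about, and a message is delivered when it carries all of them. The sketch below follows that reading, which is an interpretation rather than something the slides spell out; the Subscriptions class and the sample attributes are hypothetical.

```python
class Subscriptions:
    """Match messages to subscribers by required attribute, value pairs."""

    def __init__(self):
        self._subs = []  # list of (subscriber, required pairs)

    def subscribe(self, subscriber, **pairs):
        self._subs.append((subscriber, pairs))

    def match(self, message):
        # A subscriber matches when every required pair appears in the message.
        return [
            subscriber
            for subscriber, pairs in self._subs
            if all(k in message and message[k] == v for k, v in pairs.items())
        ]
```

With many subscribers, the linear scan in match is where an index on attribute, value pairs would pay off.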

  43. Database Technology Uses • Rule engines • Complex business processing rules • Customization/profiling rules • Business domain rules • Presentation rules • Repositories for Application Development • Managing Java objects, interfaces, etc. • Must for application integration • Standardized object models and protocols • Directories vs repositories

  44. Database Technology Uses • XML support • XML schema/storage • XML caching • XML querying • Coexistence with SQL -- current efforts seem disjoint • Multiple caches • Consistency of middle-tier and database caches • Data mining • Algorithms need to become more pragmatic

  45. Database Technology Uses • Internet user expectations • Speed vs correctness • (e.g. Search engines vs blade/cartridge/extender) • Availability vs correctness • Component Architecture • Caching • XML support • Querying • Transactions • Rule engines • Metadata management • Queueing

  46. Database Technology Uses • Availability • Need near 100% availability • Must be easy to manage • Replication, hot standby, foolproof system? • Scalability • Number of users is orders of magnitude higher • Security • Global users • Managing millions of users • Encryption • Performance

  47. Internet Applications Architecture: Future (diagram) • Client Tier: XML enabled tools (browsers, authoring tools, etc.) (XML) • Logical Middle Tier: WEB/APP Server; XML enabled Application; Messages; XML Integration & Query Server; XML Database Warehouse Server; XML Transformer & Gateway • Data Sources: XML enabled ORDBMS; OLE/DB; other data sources; documents on the Web (e.g. HTML, WORD); XML documents on the Web

  48. XML in the Database • XML has the potential to impact four important markets • Web integration • Web publishing • Application integration • Electronic commerce • XML-enable the DBMS

  49. XML-enabled DBMS • “XML-enable” the database system • Store XML data/documents in the database server • Querying and searching of structured and unstructured XML • Generate XML data from the database server • Add XML capabilities in supporting database facilities • (Diagram: DBMS with Store XML, Generate XML, Integrate with other facilities)
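Generating XML from the database, as slide 49 proposes, amounts to turning a query result into elements. A minimal sketch using Python's sqlite3 and xml.etree.ElementTree follows; the rows_to_xml helper and the one-element-per-column mapping are assumptions, and the slides do not prescribe any particular mapping.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, query, root_tag, row_tag):
    """Run a query and serialize the result set as XML text.

    Column names become element names, so they must be valid XML names.
    """
    cur = conn.execute(query)
    columns = [description[0] for description in cur.description]
    root = ET.Element(root_tag)
    for row in cur:
        row_el = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            ET.SubElement(row_el, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")
```

ElementTree escapes characters such as "<" and "&" in the text values, so the output stays well-formed whatever the column data contains.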

  50. Store XML Data • Enhance XML storage facilities in the database with support in utilities • Facilities to load XML data into the database • Provide more efficient database storage (componentized storage, compression, indexing,…) • XML export facilities from the server
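The "facilities to load XML data into the database" on slide 50 can be pictured as parsing a document and storing its components as rows. This sketch assumes a fixed, hypothetical document shape (item elements with id and title children) and a matching items table; a real loader would be driven by a schema or mapping.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_xml(conn, xml_text):
    """Break an XML document into rows and insert them; return the row count."""
    root = ET.fromstring(xml_text)
    rows = [
        (item.findtext("id"), item.findtext("title"))
        for item in root.findall("item")
    ]
    with conn:  # one transaction for the whole document
        conn.executemany("INSERT INTO items (id, title) VALUES (?, ?)", rows)
    return len(rows)
```

Storing the components in columns is what makes them indexable and queryable with ordinary SQL, at the cost of losing the original document text unless it is kept separately.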
