Next Generation Internet Services Dean Jacobs Technische Universität München
Talk Outline • Web 2.0 and Consumer Services • Enterprise Software as a Service • Case Study: salesforce.com • Database Virtualization
What Is Web 2.0 • The Web As Platform • Harnessing Collective Intelligence • Data is the Next Intel Inside • End of the Software Release Cycle • Lightweight Programming Models • Software Above the Level of a Single Device • Rich User Experience (Tim O'Reilly, 2005)
1. The Web As Platform • Seamless cooperation (mashups) between services • Google Maps + Craigslist = housingmaps.com • An ecosystem of interdependent services • Reach the “long tail” of users • Exploit self-service and algorithmic data management • Google AdSense rather than DoubleClick ($3.1B) • Easy to experiment • Low cost for failure • Fostering a huge wave of innovation
2. Harnessing Collective Intelligence • People perform tasks for themselves, the community, or both (reputation building) • Computers amplify the data into significant value • Google: PageRank • eBay: Supply chain • Wikipedia, MySpace, Flickr: User-generated content • Flickr, del.icio.us: Tagging for classification • ESP Game: Two-person game that tags photos • Digg: Vote to determine importance of news stories • Last.fm: Song recommendations from listening patterns • HSX: Hollywood predictive market • Amazon: Customer ratings and reviews • Amazon Mechanical Turk: Hire people for simple tasks
3. Data is the Next Intel Inside * Streaming Data: “The Live Web”
4. End of the Software Release Cycle • Software is offered as a service not a product • Operations become a core competency • As important to Google as PageRank • Perpetual beta • Weekly or even daily updates • Real-time monitoring of user behavior • Dynamic scripting languages (Perl, Python, PHP, Ruby) are more nimble and therefore more suitable
5. Lightweight Programming Models • Started with HTTP/HTML and led to Web Services • Also applies to the design of Web Services APIs • Loose coupling is essential • Make few assumptions about the other end • Be self-describing and resilient to changes • Simple, flexible protocols are better than complex, specialized protocols • Amazon reports 95% of their Web Services traffic uses REST (XML over HTTP) rather than SOAP • SOAP and WS-* are designed for Enterprise Application Integration
6. Software Above the Level of a Single Device • Applications should span servers, desktops, laptops, cell phones, PDAs, and special devices • Example: iPod/iTunes reaches from the handheld device to a massive web back-end with the PC acting as a local cache and control station
7. Rich User Experience • Rich Internet Applications started with browser Applets • Realized today as AJAX • “AJAX isn't a technology. It's really several technologies, each flourishing in its own right, coming together in powerful new ways. AJAX incorporates: standards-based presentation using XHTML and CSS; dynamic display and interaction using the Document Object Model; data interchange and manipulation using XML and XSLT; asynchronous data retrieval using XMLHttpRequest; and JavaScript binding everything together.” (Jesse James Garrett, Adaptive Path)
Talk Outline • Web 2.0 and Consumer Services • Enterprise Software as a Service • Case Study: salesforce.com • Database Virtualization
Enterprise Software Models [Diagram with two models. On-Premises Software: System Integrators, Application Vendors, Platform Vendors, Customer Data Center. Software as a Service: Customers, System Integrators, Application Service Provider, Open Source, Data Center]
Web 2.0 Checklist • 1. The Web As Platform • 2. Harnessing Collective Intelligence • 3. Data is the Next Intel Inside • 4. End of the Software Release Cycle • 5. Lightweight Programming Models • 6. Software Above the Level of a Single Device • 7. Rich User Experiences [Diagram: Customers, System Integrators, Hosted Services, On-premises Applications, Application Service Provider, and Data Center, each annotated with the checklist numbers that apply to it]
Lower Total Cost of Ownership • Leverage economy of scale • Capital expenditures – hardware, software • Operational expenditures – bandwidth, personnel • Leverage open-source software • Natural fit for a service-oriented model • One version of the software • Runs on only one platform • Upgrade all users at the same time (rolling upgrade) • Tight coupling of operations and support • Resolve customer issues on the production system • Perform professional services on the production system
Hosting Sweet Spots • An application is more attractive to host if it … • Leverages Internet connectivity • Is expensive to run on-premises • Has less data and simpler operations • Has simpler configuration • Has simpler backend integration • Has weaker transactional requirements • Has weaker security requirements • Is less mission critical • These criteria apply more to People Apps (Sales, Marketing, HR, Help Desk, Portal) than to Process Apps (Planning, Purchasing, Inventory, Financials, Manufacturing) • They apply more to small to mid-sized businesses • Opening up new markets • Limits on functionality
[Figure: The sweet spot. Axes: Functionality, Scalability, Total Cost of Ownership. Region labels: Not worth building; Not possible to build; Using on-premises software; Sacrificing functionality]
Talk Outline • Web 2.0 and Consumer Services • Enterprise Software as a Service • Case Study: salesforce.com • Database Virtualization
The Primary Application • Customer Relationship Management (CRM) • Marketing and Campaign Automation • Salesforce Automation • Customer Service and Support • Analytics and Reporting [Pie chart: Hosted CRM Market Share. salesforce 50%; the slices for vendors #2 to #5 and all others are labeled 18%, 14%, 7%, 7%, and 4%] Source: IDC Worldwide On-Demand CRM Vendor Analysis, 2005
The Web As Platform • Web Services API to access salesforce objects • End users: strongly-typed for organization data model • Developers: weakly-typed to span organization models • Simple and scalable • REST model to reduce server-side state • Bulk operations to reduce communication overhead • Used to integrate with • the ecosystem of partner services • on-premises applications in enterprise data centers • clients: browser (AJAX), off-line edition, mobile devices, spreadsheets, calendar
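The point about bulk operations can be illustrated with a small sketch. It does not use the salesforce API: FakeService, its create method, and the batch size of 200 are hypothetical stand-ins, and a round trip is simply counted rather than sent over a network.

```python
from typing import Dict, Iterator, List

Record = Dict[str, str]


class FakeService:
    """Hypothetical stand-in for a remote API; counts round trips, no network."""

    def __init__(self) -> None:
        self.round_trips = 0
        self.stored: List[Record] = []

    def create(self, records: List[Record]) -> None:
        # One call is one round trip, however many records it carries.
        self.round_trips += 1
        self.stored.extend(records)


def chunked(records: List[Record], size: int) -> Iterator[List[Record]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def upload(service: FakeService, records: List[Record], batch_size: int) -> int:
    """Send all records in batches and return the number of round trips used."""
    for batch in chunked(records, batch_size):
        service.create(batch)
    return service.round_trips


accounts = [{"Name": f"Account {i}"} for i in range(450)]
one_by_one = upload(FakeService(), accounts, batch_size=1)
bulk = upload(FakeService(), accounts, batch_size=200)
print(one_by_one, bulk)  # 450 3
```

With a fixed per-call overhead, the bulk variant pays that overhead 3 times instead of 450 times for the same data.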
Harnessing Collective Intelligence • A set of extensions can be packaged into an installable application by customers and SIs • Project management, expense tracking, budgeting, purchasing, HR, education, manufacturing, … • Targets the “long tail” of custom applications
Service Growth • October 2005: 20,000 organizations, 350,000 subscribers • March 2007: 30,000 organizations, 650,000 subscribers [Bar chart: growth by fiscal quarter, 1999 to 2005] - Based on publicly available data. Bars represent fiscal quarters.
Scalability and Performance [Chart: average response time (milliseconds) and transactions per quarter (millions)] - Based on publicly available data. Points represent fiscal quarters.
Talk Outline • Web 2.0 and Consumer Services • Enterprise Software as a Service • Case Study: salesforce.com • Database Virtualization
High-Level Goals • Consolidate multiple businesses (tenants) onto the same operational system • Reduce total cost of ownership • Pooling resources • Improving management efficiency • Support Web 2.0 style collaboration • To the extent possible, hide multi-tenancy from developers and system administrators
Specific Requirements • Pool database resources to improve their utilization • Avoid provisioning each tenant for their maximum load • Pooling breaks down isolation: it weakens security, increases resource contention, and interferes with optimizations • Provide a tenant-aware administrative framework • Manage farms of individual multi-tenant databases • Support DML and DDL operations across tenants • Support tenant migration within and across farms • Support Web 2.0 style collaboration • Allow shared extensions to the base schema • Allow shared public data with private updates
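A small worked example may help show why pooling avoids provisioning each tenant for their maximum load. The tenant names and load figures below are invented for illustration; the only assumption is that tenants peak at different times.

```python
# Hypothetical load per tenant in four consecutive time slots (arbitrary units).
loads = {
    "tenant_a": [10, 80, 20, 10],
    "tenant_b": [70, 10, 15, 5],
    "tenant_c": [5, 10, 90, 20],
}

# Dedicated provisioning: every tenant gets capacity for its own peak.
dedicated_capacity = sum(max(series) for series in loads.values())

# Pooled provisioning: capacity only has to cover the peak of the combined load.
combined = [sum(slot) for slot in zip(*loads.values())]
pooled_capacity = max(combined)

print(dedicated_capacity)  # 240
print(combined)            # [85, 100, 125, 35]
print(pooled_capacity)     # 125
```

In this invented case the pooled system needs roughly half the capacity, at the price of tenants contending for the same resources.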
Multi-Mode Delivery • Three ways of delivering software to the enterprise • On-premises: The business controls the application • Hosted: The service provider controls the application • Hybrid: The application runs in the data center of the business in an appliance and is remotely administered by the service provider • Customers should be able to migrate between them • Or keep a warm back-up that is ready to use • Sweet spot is open source software, since that is commonly used in the hosted and hybrid models
Consolidation Options • Shared Machine • Shared Process • Shared Table [Figure: the three options placed between Isolation and Resource Pooling]
Shared Machine • Cannot scale beyond tens of tenants per server • Appropriate for applications with a smaller number of larger tenants, e.g., for banking [Chart: Memory requirements for a database with one empty CRM schema instance]
Shared Process • Scalability is limited by the amount of metadata • But this metadata is redundant for the base schema • Should scale up to thousands of tenants • If each tenant gets their own table space, then migration entails simply moving files [Chart: Memory requirements for a database with 10,000 empty CRM schema instances (* extrapolated)]
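The metadata concern can be sketched with SQLite as a stand-in. This is only an illustration under stated assumptions: the two-table BASE_SCHEMA and the table naming scheme are hypothetical, and SQLite's catalog is not the server database the memory figures refer to. The sketch shows only that one schema instance per tenant makes the catalog grow linearly with the number of tenants, even though every instance repeats the same base schema.

```python
import sqlite3

# Hypothetical two-table base schema, repeated once per tenant.
BASE_SCHEMA = {
    "account": "id INTEGER PRIMARY KEY, name TEXT",
    "contact": "id INTEGER PRIMARY KEY, account_id INTEGER, name TEXT",
}


def provision(conn: sqlite3.Connection, tenant_id: int) -> None:
    """Create a private copy of every base-schema table for one tenant."""
    for table, columns in BASE_SCHEMA.items():
        conn.execute(f"CREATE TABLE t{tenant_id}_{table} ({columns})")


def catalog_tables(conn: sqlite3.Connection) -> int:
    """Count the table definitions the database has to keep in its catalog."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
    ).fetchone()
    return count


conn = sqlite3.connect(":memory:")
for tenant_id in range(1, 101):
    provision(conn, tenant_id)

print(catalog_tables(conn))  # 200 definitions for 100 tenants, only 2 distinct
```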
Shared Table • Data from many tenants in the same tables • Add a tenant id column • Tenant queries must fix the value for this column • Extend the base schema using generic columns • May be varchar or a mix of types • The database must compactly represent sparse tables
TenId  Account  Name  ...  Val0    Val1  ...  Val100
1041   0021     Acme  ...  1/3/95  ----  ...  ----
1041   0029     Ball  ...  3/7/72  ----  ...  ----
1053   0016     Gump  ...  red     35    ...  ----
1053   0049     Wonk  ...  blue    18    ...  ----
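The shared-table layout can be sketched with SQLite as a stand-in database. The rows are taken from the slide's example table, reduced to two generic columns. The logical extension names (founded, color, size) and the EXTENSIONS mapping are assumptions made for illustration; the slide does not say what the generic columns mean for each tenant.

```python
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE account (
        ten_id  INTEGER NOT NULL,   -- tenant id column shared by all tenants
        account TEXT    NOT NULL,
        name    TEXT,
        val0    TEXT,               -- generic extension columns (varchar style)
        val1    TEXT,
        PRIMARY KEY (ten_id, account)
    )
    """
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?, ?, ?)",
    [
        (1041, "0021", "Acme", "1/3/95", None),
        (1041, "0029", "Ball", "3/7/72", None),
        (1053, "0016", "Gump", "red", "35"),
        (1053, "0049", "Wonk", "blue", "18"),
    ],
)

# Hypothetical per-tenant metadata: logical extension name -> generic column.
EXTENSIONS = {
    1041: {"founded": "val0"},
    1053: {"color": "val0", "size": "val1"},
}


def tenant_accounts(conn: sqlite3.Connection, ten_id: int) -> List[Tuple]:
    """Return one tenant's accounts with its own extension columns."""
    select_list = ", ".join(
        f"{column} AS {alias}" for alias, column in EXTENSIONS[ten_id].items()
    )
    sql = (
        f"SELECT account, name, {select_list} FROM account "
        "WHERE ten_id = ? ORDER BY account"  # the query fixes the tenant id
    )
    return conn.execute(sql, (ten_id,)).fetchall()


print(tenant_accounts(conn, 1053))
# [('0016', 'Gump', 'red', '35'), ('0049', 'Wonk', 'blue', '18')]
```

In this sketch, adding a tenant or extending its schema only inserts rows and updates the mapping, so no DDL is needed.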
Shared Table • The Good News - Everything is pooled • Processes, memory, connections, prepared statements • Easy DML and DDL operations across tenants • Add, remove, and extend tenants with DML (not DDL) • The Bad News - Isolation is very weak • Irrelevant data infects query processing • Optimization Statistics • Table Scans • Data locality • No indexes or integrity constraints on generic columns • Migration requires querying the operational system
Possible Improvements • Shared Process – increase resource pooling • Keep one copy of the meta-data about the base schema • Allow prepared statements over table names • Allow queries over table names • Dynamically set the principal for a database connection • Shared Table – increase isolation • Take the tenant ID into account in query optimization, indexing, and data placement • Could be done from inside the database or from outside in a SQL transformation layer
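The idea of a SQL transformation layer outside the database can be sketched as follows. This is a deliberately naive illustration, not the approach of any particular system: it rewrites table references with a regular expression, does not handle table aliases or nested queries, and a real layer would parse the SQL. The names SHARED_TABLES and scope_to_tenant are hypothetical, and SQLite is used only as a stand-in.

```python
import re
import sqlite3

# Hypothetical list of tables that carry a ten_id column.
SHARED_TABLES = ("account", "contact")

_TABLE_REF = re.compile(
    r"\b(FROM|JOIN)\s+(%s)\b" % "|".join(SHARED_TABLES), re.IGNORECASE
)


def scope_to_tenant(sql: str, ten_id: int) -> str:
    """Wrap each shared-table reference in a subquery filtered by tenant id."""

    def wrap(match: "re.Match[str]") -> str:
        keyword, table = match.group(1), match.group(2)
        # int() guards the only value that is spliced into the SQL text.
        return (
            f"{keyword} (SELECT * FROM {table} "
            f"WHERE ten_id = {int(ten_id)}) AS {table}"
        )

    return _TABLE_REF.sub(wrap, sql)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (ten_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [(1041, "Acme"), (1041, "Ball"), (1053, "Gump"), (1053, "Wonk")],
)

# The tenant writes its query as if it owned the whole table.
tenant_sql = "SELECT name FROM account ORDER BY name"
scoped_sql = scope_to_tenant(tenant_sql, 1053)

print(scoped_sql)
print(conn.execute(scoped_sql).fetchall())  # [('Gump',), ('Wonk',)]
```

Because the tenant id appears as an explicit predicate in every rewritten query, the database at least has the chance to use it for indexing and optimization.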
Our Current Research • Adapt an open source database for multi-tenancy • Measure its scalability and performance using a benchmark for multi-tenant databases [Figure: Simple CRM Schema. Entities: Lead, Campaign, Account, Opportunity, Product, Contact, Line Item, Asset, Case, Contract. Relationship labels: conversion, source, parent, reports to, signed by]