Website Survival: Concealing Back-End Outages with Oracle Coherence and HotCache


Presentation Transcript


  1. Website Survival: Concealing Back-End Outages with Oracle Coherence and HotCache
     Jim Xu, Senior Technology Architect, TELUS
     Randy Stafford, Architect At-Large, Oracle Coherence Product Development

  2. Session Agenda: (1) TELUS introduction and business challenge; (2) Oracle technology addressing the challenge; (3) Technical highlights of the implementation; (4) HotCache whole product: transformation and more; (5) Solution validation through business metrics

  3. Introduction to TELUS
     TELUS (TSX: T, NYSE: TU) is Canada’s fastest-growing national telecommunications company.
     • Headquarters: Burnaby, British Columbia, Canada
     • Revenue: $11.7 billion
     • EBITDA: $4.0 billion
     • Customers: 13.4 million connections, including 7.9 million wireless subscribers, 3.2 million wireline network access lines, 1.4 million Internet subscribers, and 865,000 TV customers
     • Website: www.telus.com

  4. Business Challenges
     Context:
     • Digital experience serving several million customers
     Challenges:
     • 80% of clients researched online prior to purchase
     • 85% of clients preferred to solve problems online
     • Slow-responding web pages and frequent unplanned outages seriously degraded the client experience
     • Voice of Client indicated that 39% of complaints were related to speed and stability
     • Unreliable self-serve hurt web adoption and drove calls to call centers
     • Subscriber numbers grew considerably, along with traffic and load
     Goals:
     • Under 3 seconds to render the customer experience
     • 99.99% uptime

  5. Journey on Performance Improvement
     • The High Availability and Resiliency Program was started in 2011
     • A number of enhancements reduced response time from 21 sec to 8 sec in 2012, then to 6 sec in 2013
     [Chart: East, West, National; Q1 to Q4 (2012)]

  6. Tipping Point
     • Impossible to reach the 3 sec and 99.99% uptime targets without an architecture redesign and new technologies
     • Extended outages (10-20 hours) during quarterly releases and maintenance windows
     • Customer data is collected from multiple data sources across multiple data centers
     • Legacy infrastructure requires frequent maintenance
     • Caching data is critical
     • Coherence 3.7 was introduced, but keeping cached data fresh proved challenging
     • A custom cache updater was considered but later discarded due to complexity
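
To put the 99.99% uptime target next to the 10-20 hour release outages, the arithmetic below converts an availability fraction into a yearly downtime budget. It is a minimal sketch: the function name is illustrative, and a 365-day year is assumed.

```python
# Downtime allowed per year by an availability target (illustrative arithmetic).
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day year, leap years ignored


def downtime_budget_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted by an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


budget = downtime_budget_minutes(0.9999)   # the 99.99% target from the slides
shortest_outage = 10 * 60                  # lower end of the 10-20 hour range, in minutes

print(f"Yearly budget at 99.99%: {budget:.1f} minutes")
print(f"One 10-hour outage is {shortest_outage / budget:.1f}x the yearly budget")
```

Under these assumptions the yearly budget is under an hour, so a single 10-hour release outage exceeds it more than tenfold, which is consistent with the slide's claim that the target could not be met without a redesign.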

  7. Session Agenda: (1) TELUS introduction and business challenge; (2) Oracle technology addressing the challenge; (3) Technical highlights of the implementation; (4) HotCache whole product: transformation and more; (5) Solution validation through business metrics

  8. The Solution
     • Build an In-Memory Data Grid with Coherence 12c
     • Resolve the cache data update issue with HotCache
     • Conceal back-end outages to provide 24/7 service reliability
     • Improve system performance and maintain a consistent client experience
     Technologies: Exalogic X3-2 and X4-2; Coherence Data Grid edition 12.1.2; Oracle Traffic Director; GoldenGate; WebLogic 12c
     Stats: cached raw data: 212 GB; number of objects: 821 million

  9. Java EE Application Physical Tiering and Scalability (Site 1)
     • Web Tier (web servers) and App Tier (app servers): these tiers can scale out…
     • Grid Tier (cache servers): the grid tier scales out!
     • EIS Tier (database, legacy system, external service): the EIS tier is hard to scale!

  10. Coherence GoldenGate HotCache
     [Diagram labels: External Application, Coherence Application, Coherence, Read / Write, GoldenGate HotCache, GoldenGate, Database]
     • Pushes DB changes to Coherence via GoldenGate and TopLink JPA
     • Tables map to entities and caches
     • Event-driven and efficient
     • Solves the stale cache problem when external apps write to a shared DB
     • Allows caching to be leveraged in that class of application
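
The idea of pushing database changes into caches can be sketched in plain Python. This is not the HotCache or GoldenGate API: `ChangeEvent`, `TABLE_TO_CACHE`, and `apply_change` are hypothetical names, and each cache is modeled as an ordinary dict. The point is only that each captured row change maps to a put or a remove on the cache that corresponds to its table.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ChangeEvent:
    """Hypothetical stand-in for one row change captured from the database."""
    op: str                                 # "INSERT", "UPDATE", or "DELETE"
    table: str
    key: Any
    row: Optional[Dict[str, Any]] = None    # None for DELETE


# Hypothetical table-to-cache mapping (tables map to entities and caches).
TABLE_TO_CACHE = {"CUSTOMER": "customers", "ACCOUNT": "accounts"}


def apply_change(caches: Dict[str, Dict[Any, dict]], event: ChangeEvent) -> None:
    """Apply one database change to the matching in-memory cache."""
    cache_name = TABLE_TO_CACHE.get(event.table)
    if cache_name is None:
        return  # table is not mapped to any cache, so the change is ignored
    cache = caches.setdefault(cache_name, {})
    if event.op in ("INSERT", "UPDATE"):
        cache[event.key] = dict(event.row or {})
    elif event.op == "DELETE":
        cache.pop(event.key, None)
    else:
        raise ValueError(f"unknown operation: {event.op}")


caches: Dict[str, Dict[Any, dict]] = {}
events = [
    ChangeEvent("INSERT", "CUSTOMER", 1, {"name": "Ann"}),
    ChangeEvent("INSERT", "ACCOUNT", 10, {"customer_id": 1}),
    ChangeEvent("UPDATE", "CUSTOMER", 1, {"name": "Ann B."}),  # written by an external app
    ChangeEvent("DELETE", "ACCOUNT", 10),
    ChangeEvent("INSERT", "AUDIT_LOG", 99, {"msg": "ignored"}),
]
for ev in events:
    apply_change(caches, ev)

print(caches)
```

Because the cache is updated from the change stream rather than by the writing application, an update made by an external application still reaches the cache, which is the stale-cache case the slide describes.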

  11. Exalogic System Hardware Overview Fast. Easy. Open

  12. Coherence on Exalogic: MessageBus and Exabus
     • MessageBus: an asynchronous, binary, message-based, event-driven transport layer in Coherence, with pluggable implementations
     • Exabus: a native RDMA implementation of MessageBus that bypasses the OS kernel and avoids buffer copies
     • Exabus preprocesses messages on I/O threads, avoiding the context switches between Coherence threads that occurred prior to Exabus
     • A separate MessageBus per Coherence service, instead of all services sharing the same transport layer as they did prior to MessageBus, allows the full InfiniBand (IB) bandwidth to be used

  13. Data Grid Server: Exalogic vs Commodity
     [Charts: Failover, Latency, Throughput, CPU Utilization]

  14. Session Agenda: (1) TELUS introduction and business challenge; (2) Oracle technology addressing the challenge; (3) Technical highlights of the implementation; (4) HotCache whole product: transformation and more; (5) Solution validation through business metrics

  15. System Architecture

  16. Data Consolidation
     • Benefits:
       • Reduce data roundtrips
       • Improve performance
       • Less dependency on legacy data centers
       • Canonical model across multiple source databases
     [Diagram labels: Data Grid, GoldenGate, Billing Account, Customer, User Profile]

  17. Data Grid Geo-Redundancy
     • Benefits:
       • Replicated infrastructure and data
       • Active-Active to support production
       • Data and services closer to consumers
     [Diagram labels: Global Traffic Manager; Data Services and Data Grid at each of East and West]

  18. Data Grid Synchronization (Current State)

  19. Data Grid Synchronization (Next Stage)

  20. Project Timeline on Data Grid
     • Aggressive timeline for launching the Data Grid
     • Collaborated closely with Oracle to resolve technical issues

  21. Manage Object Relationships
     • Cached Data:
       • Objects are independent in the grid
       • But they are logically related
       • Object traversal through keys
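
Traversal through keys can be illustrated with two dict-backed caches. This is a sketch under assumptions: the cache names, field names, and helper functions are hypothetical and are not taken from the TELUS model. Each object is stored on its own and holds only the keys of related objects, so following a relationship is a second lookup rather than an object reference.

```python
# Each cache is modeled as a plain dict; names and fields are hypothetical.
customers = {
    "C1": {"name": "Ann", "account_keys": ["A1", "A2"]},
}
accounts = {
    "A1": {"customer_key": "C1", "balance": 40.0},
    "A2": {"customer_key": "C1", "balance": 12.5},
}


def accounts_of(customer_key: str) -> list:
    """Follow the stored account keys from a customer to its account objects."""
    customer = customers[customer_key]
    # Skip keys whose objects are not (yet) in the accounts cache.
    return [accounts[k] for k in customer["account_keys"] if k in accounts]


def owner_of(account_key: str) -> dict:
    """Follow the stored customer key from an account back to its customer."""
    return customers[accounts[account_key]["customer_key"]]


total = sum(a["balance"] for a in accounts_of("C1"))
print(total, owner_of("A2")["name"])
```

Storing keys rather than nested objects keeps each entry independently updatable, at the cost of one extra lookup per hop.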

  22. Session Agenda: (1) TELUS introduction and business challenge; (2) Oracle technology addressing the challenge; (3) Technical highlights of the implementation; (4) HotCache whole product: transformation and more; (5) Solution validation through business metrics

  23. Data Transformation with Coherence Live Events
     [Diagram labels: Legacy Schema; HotCache; Object/Cache Mapping; Live Events; Coherence Data Grid (key/value entries); Canonical Domain Model; Addr, AA, BA, BC, Cust, PH, Svc]

  24. Live Events Use Cases in HotCache
     • Project HotCache model into desired model
     • Duplicate data for denormalization
     • Ensure referential integrity in relationship implementations
     • Merge data from multiple databases
     • Pending Mutations pattern
     • Refresh Aggregates when child tables are not replicated

  25. Data Aggregation with Layered Caches
     Stage Cache:
     • Holds JPA entities
     • Data structure identical to the source database, to simplify the HotCache implementation
     • Initial load is not required, and an object can be removed after the target object is updated
     • Reduced data grid memory footprint
     Target Cache:
     • Similar to a database view, populated with UI-optimised domain objects
     • Denormalized/flattened objects to improve data retrieval performance
     • Object dependencies are processed through Event Interceptors and Entry Processors
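
The stage/target layering can be sketched with two dicts and a callback. This is not Coherence's Event Interceptor or Entry Processor API: `put_stage` and `on_stage_insert` are hypothetical names, and the table and field names are invented for illustration. A row arrives in the stage cache in source-table shape, a callback folds it into a flattened target object, and the stage entry is then removed.

```python
stage_cache = {}    # (table, primary key) -> row, same shape as the source table
target_cache = {}   # customer id -> flattened, UI-oriented object


def on_stage_insert(key, row):
    """Hypothetical interceptor: fold a staged row into the target object."""
    table, _pk = key
    customer_id = row["customer_id"]
    view = target_cache.setdefault(
        customer_id, {"customer_id": customer_id, "name": None, "phones": []}
    )
    if table == "CUSTOMER":
        view["name"] = row["name"]
    elif table == "PHONE" and row["number"] not in view["phones"]:
        view["phones"].append(row["number"])
    # The target object is up to date, so the staged row can be dropped.
    stage_cache.pop(key, None)


def put_stage(key, row):
    """Write a row to the stage cache and fire the interceptor."""
    stage_cache[key] = row
    on_stage_insert(key, row)


put_stage(("PHONE", 501), {"customer_id": 7, "number": "604-555-0101"})
put_stage(("CUSTOMER", 7), {"customer_id": 7, "name": "Ann"})
put_stage(("PHONE", 502), {"customer_id": 7, "number": "604-555-0102"})

print(target_cache[7], len(stage_cache))
```

In this sketch the child row arrives before its parent and the target object is still assembled correctly, because the callback creates the flattened object on first touch. Removing staged rows once they are folded in is what keeps the stage layer's memory footprint small.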

  26. Update Dependent Objects with Event Interceptors

  27. Scaling HotCache via Parallel Data Flows
     • DB schema must be amenable (related tables in same trail)
     • Throughput of one HotCache: 700-3000 TPS, depending on hardware and configuration
     • This approach has been tested to 18,000 TPS
     [Diagram labels: Coherence Data Grid (key/value entries); Trail 1 to N; Extract 1 to N; HotCache 1 to N]
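
As a rough sizing exercise, the figures above can be combined to estimate how many parallel flows a given throughput would need. The slides do not say how many flows were used to reach 18,000 TPS, so this is only a sketch: it assumes throughput scales linearly with the number of flows, and `flows_needed` is an illustrative helper.

```python
def flows_needed(target_tps: int, per_flow_tps: int) -> int:
    """Smallest number of flows whose combined throughput meets the target.

    Assumes linear scaling across flows, which the slides do not confirm.
    """
    return -(-target_tps // per_flow_tps)  # integer ceiling division


TARGET_TPS = 18_000                       # tested throughput quoted on the slide
best_case = flows_needed(TARGET_TPS, 3_000)   # each flow at the top of its range
worst_case = flows_needed(TARGET_TPS, 700)    # each flow at the bottom of its range

print(f"Flows needed: between {best_case} and {worst_case}")
```

The wide spread between the two cases reflects the slide's note that per-flow throughput depends heavily on hardware and configuration.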

  28. HotCache High Availability
     • Coherence is already HA
     • Oracle Clusterware manages redundant GoldenGate HotCache processes
     • http://www.oracle.com/technetwork/middleware/goldengate/overview/ha-goldengate-whitepaper-128197.pdf
     [Diagram labels: Active and Passive GoldenGate, each with a Manager and HotCache; Oracle Clusterware; check(), stop(), start()]

  29. Monitoring HotCache

  30. Session Agenda: (1) TELUS introduction and business challenge; (2) Oracle technology addressing the challenge; (3) Technical highlights of the implementation; (4) HotCache whole product: transformation and more; (5) Solution validation through business metrics

  31. Business Benefits
     • No more outages! Supported all major releases and infrastructure maintenance since the initial launch last November
     • Enhanced performance at the service level: 2-30x faster
     • Reduced dependency on legacy data centers and hardware footprint
     • Offered a single view of the customer with data from various legacy systems

  32. Performance Metrics
     • Data grid enabled Client Account WS response time: 20 ms vs 99-10294 ms
     • Outage Mode: Portal overview page response time of 3.2 s
     • Operational Mode: 48% performance gain from staging performance test
     [Chart axis labels: MS, HRS]

  33. Questions?

  34. Focus on Oracle Coherence

  35. Focus on Oracle Coherence (continued)
