
Datacenters And Resilient Services

ES30. Datacenters And Resilient Services. Benjamin Ravani, General Manager, Global Foundation Services, Microsoft Corporation.


Presentation Transcript


  1. ES30 Datacenters And Resilient Services  Benjamin Ravani General Manager Global Foundation Services Microsoft Corporation

  2. Agenda And Objectives • Web services operations • Size and scale • Data center challenges • Case studies • Best practices in building resiliency • Opportunities during design phase • Summary

  3. Global Foundation Services’ Mission Enable and Deliver Winning Services • To Everyone, Everywhere

  4. Global Foundation Services • Across the company, all over the world, around the clock

  5. Data Center Operations Challenges Growth expected to continue to increase over the next 5 years!

  6. Why Power Matters… • In 2006, U.S. data centers consumed an estimated 61 billion kilowatt-hours (kWh) of energy, about 1.5% of the total electricity consumed in the U.S. that year • The total cost of that energy was $4.5 billion • That consumption is more than the electricity used by all color televisions in the country and is equivalent to the electricity consumption of about 5.8 million average U.S. households • Data centers' power and cooling infrastructure accounts for about half of that electricity consumption; IT equipment accounts for the other half • Source: Koomey, Jonathan. 2007. Estimating Total Power Consumption by Servers in the U.S. and the World. Oakland, CA: Analytics Press. February 15. http://enterprise.amd.com/downloads/svrpwrusecompletefinal.pdf

  7. Why Power Matters … • If the status quo continues, by 2011 data centers will consume 100 billion kWh of energy, at a total annual cost of $7.4 billion. • Those levels of power consumption would also necessitate the construction of 10 additional power plants
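
The figures on slides 6 and 7 can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted on those slides. The function and variable names are illustrative, and the constant compound growth rate is an assumption made here for the calculation, not a claim from the presentation.

```python
# Figures quoted on slides 6 and 7 (U.S. data centers).
KWH_2006 = 61e9      # kWh consumed in 2006
COST_2006 = 4.5e9    # USD spent on that energy in 2006
KWH_2011 = 100e9     # projected kWh for 2011 if the status quo continues
COST_2011 = 7.4e9    # projected annual cost in USD for 2011


def implied_price_per_kwh(cost_usd: float, kwh: float) -> float:
    """Average electricity price implied by a total cost and total consumption."""
    return cost_usd / kwh


def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that turns `start` into `end` over `years`."""
    return (end / start) ** (1 / years) - 1


price_2006 = implied_price_per_kwh(COST_2006, KWH_2006)
price_2011 = implied_price_per_kwh(COST_2011, KWH_2011)
growth = compound_annual_growth(KWH_2006, KWH_2011, 2011 - 2006)

# Slide 6: roughly half of the consumption goes to power and cooling infrastructure.
infrastructure_kwh_2006 = KWH_2006 / 2

print(f"Implied price 2006: ${price_2006:.4f}/kWh")
print(f"Implied price 2011: ${price_2011:.4f}/kWh")
print(f"Implied annual growth in consumption: {growth:.1%}")
```

Both years imply a price of about 7.4 cents per kWh, so the projected cost increase comes from consumption growing at roughly 10% per year rather than from a change in the assumed electricity price.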

  8. Environmental Sustainability • Protecting our environment • Smart growth in data centers • Make every kW count! • Invest in innovation for energy efficiency • Examples • Hydro • Power equipment supply • Compute resource utilization • Virtualization • Green Grid: http://www.microsoft.com/environment/our_commitment/articles/green_grid.aspx • [Photo caption: Last year beans, this year a data center]

  9. Data Center Costs • Land: 2% • Core and shell: 9% • Architectural: 7% • Mechanical/electrical: 82%

  10. Case Study I: Capacity planning and internal security - November 2006 • Problem and impact • Poor planning • About 500K users experienced delays of several hours when creating or updating accounts • Root cause • An interdependent service's batch job degraded overall performance • The batch job had bugs • Solution • Capacity planning across services and groups • Test all batch jobs in a test environment first • Increase internal security

  11. Case Study II: Protection against a partner's accidental error - March 2007 • Problem and impact • ~5 hours of login outage for 75% of users • We couldn't isolate the source of load • Root cause • A bug in an internal partner service caused latency in a dependent service, resulting in re-authentication requests that overloaded the system with a high login rate • Solution • Application architecture: reduce dependency • Improved monitoring, specific to partner dependency • Develop throttling: throttling by partner

  12. Throttling at all layers of the system: control incoming requests to prevent total shutdown • Network • Protect against DDoS attacks • Front-end machines • Kernel throttling – for high connection queue • IIS connections – for high connections • Interface queue throttling – for high request queue • CPU throttling – CPU threshold based • TPS throttling – for high TPS per interface • Partner-level throttling – for unexpected load increase from a partner • Back-end SQL connections • Throttling on number of database connections
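
The partner-level and TPS throttling described on slide 12 can be illustrated with a token bucket kept per partner. This is a minimal sketch of one common technique, not the mechanism the presenter's team used (the slides do not say how it was implemented). `TokenBucket`, `PartnerThrottle`, and their parameters are hypothetical names chosen for this example.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity        # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                  # caller should reject or queue the request


class PartnerThrottle:
    """Keeps one bucket per partner so one partner's spike cannot starve the others."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, partner: str) -> bool:
        bucket = self._buckets.get(partner)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._clock)
            self._buckets[partner] = bucket
        return bucket.allow()
```

A front end would call `throttle.allow(partner_id)` before doing any work and return an error immediately when it yields `False`. Keeping the buckets separate is what addresses the situation in Case Study II, where load from a single partner could not be isolated.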

  13. URL Reputation Service (URS) Internet Explorer 7, 8 Phishing Filter

  14. URS Phishing reporting site

  15. URL Reputation Service (URS) Overview • Service profile • Grown to billions of transactions daily • Capacity model: capable of sustaining a response time of < 0.5 sec • Managed by 3 people • Architecture • Designed with a pod-oriented architecture (POA) • A pod consists of a couple of dozen servers and a couple of load balancers across multiple VIPs • Pods are distributed across multiple data centers globally • Pods are globally load balanced by intelligent traffic control for reliability and performance

  16. URL Reputation Service Topology [Diagram labels: ITM; NA (x4), EU, Asia]

  17. Input Model: Known Phish Business Rules • Customer feedback loop • Grading filters • Partner input • URS DB on SQL cluster • URF distribution to all pods [Diagram labels: ITM, Feedback, Partners, Grading, URF, URS; NA, EU, Asia]

  18. Performance And Global Load Balancing • Optimizing client traffic by geography reduces latency and error rates • Send customers to closest data center based on source IP • Response time < 0.5 sec [Diagram labels: ITM, Pulse; NA, EU, Asia]

  19. Fault Tolerance: Data Center Failover • Intelligent traffic management • Policy-based: re-route traffic from an unavailable data center (DC) to other DCs • No service downtime during a DC failure • Disaster recovery/business continuity is built in [Diagram labels: ITM, Pulse; NA, EU, Asia]
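
Slides 18 and 19 together describe a simple routing policy: send each client to the closest data center, and fall back to another one when the closest is unavailable. A minimal sketch of that policy follows. It assumes the client's region has already been derived from its source IP, and the `PREFERENCE` table and `route` function are hypothetical, made up for illustration rather than taken from the ITM system.

```python
from typing import Dict, List, Optional

# Hypothetical preference order per client region: closest data center first,
# then fallbacks. A real table would come from latency measurements or policy.
PREFERENCE: Dict[str, List[str]] = {
    "NA": ["NA", "EU", "Asia"],
    "EU": ["EU", "NA", "Asia"],
    "Asia": ["Asia", "NA", "EU"],
}


def route(client_region: str, healthy: Dict[str, bool]) -> Optional[str]:
    """Return the first healthy data center in the client's preference order.

    Unknown regions fall back to the key order of PREFERENCE.
    Returns None only when every data center is down.
    """
    for dc in PREFERENCE.get(client_region, list(PREFERENCE)):
        if healthy.get(dc, False):
            return dc
    return None


if __name__ == "__main__":
    status = {"NA": True, "EU": False, "Asia": True}
    # EU is marked down, so EU clients are re-routed to the next choice.
    print(route("EU", status))
```

Because the fallback is part of normal request routing, a data center failure only changes which entry of the list gets picked, which is one way to read the slide's claim that disaster recovery is built in.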

  20. Rolling Upgrade (Roll Forward/Backward) • Multiple VIPs per DC • Reassign 1 pod to test VIPs for deploying new bits • Rolling upgrade: change validation process • Low risk of outage during deployment • Rollback: easy recovery • Lower cost of test labs [Diagram labels: ITM, Pulse; NA, EU, Asia]
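
The rolling upgrade on slide 20 can be sketched as a loop that moves one pod at a time to a test VIP, deploys the new bits, validates them, and returns the pod to production. The `Pod` type, the `"test"`/`"production"` VIP labels, and the `validate` callback are assumptions made for this example. The slide does not describe the actual tooling, and the rollback here is simplified to restoring the previous version string.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Pod:
    name: str
    version: str
    vip: str = "production"   # which VIP group currently sends traffic to the pod


def rolling_upgrade(pods: List[Pod], new_version: str,
                    validate: Callable[[Pod], bool]) -> bool:
    """Upgrade pods one at a time; roll everything back on the first failure."""
    upgraded: List[Tuple[Pod, str]] = []   # (pod, version it ran before)
    for pod in pods:
        previous = pod.version
        pod.vip = "test"            # take the pod out of production traffic
        pod.version = new_version   # deploy the new bits
        if validate(pod):
            pod.vip = "production"  # roll forward: pod serves customers again
            upgraded.append((pod, previous))
            continue
        # Validation failed: restore this pod, then roll back earlier pods.
        pod.version = previous
        pod.vip = "production"
        for done, old_version in upgraded:
            done.version = old_version
        return False
    return True
```

Only one pod is out of production at any moment, so the remaining pods keep serving traffic during the deployment, and a failed validation never reaches customers because it happens behind the test VIP.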

  21. Environment Control • Security – security trumps features • Monitoring and instrumentation • Availability, performance • Transaction monitoring • Capacity and load • System Center Operations Manager 2007 • Capacity management • Software control, not people control • Change management • Release pipeline • System administration automation

  22. Datacenter Agnostic Deployment And Standards • Deploy servers where there is capacity • Global scale • Eliminate moves • Standards • Hardware SKUs • Optimize costs at data center level

  23. Fault Tolerance • Throttle incoming traffic/limit retries • Back-end servers failover • Datacenter failover – services failover cross DC
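
The "limit retries" point on slide 23 matters because unbounded retries are what turned latency into an overload in Case Study II. One common way to bound them is a fixed attempt cap with exponential backoff and jitter. The sketch below shows that technique under stated assumptions: `call_with_retries` and all of its parameters are hypothetical, and the presentation does not specify which retry policy was used.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(operation: Callable[[], T],
                      max_attempts: int = 3,
                      base_delay: float = 0.1,
                      max_delay: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> T:
    """Call `operation`, retrying on failure at most `max_attempts` times in total.

    The wait before retry n is a random fraction of
    min(max_delay, base_delay * 2**n), so retries from many clients spread out
    instead of arriving together.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise                      # give up: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * rng())
    raise AssertionError("unreachable")    # the loop always returns or raises
```

The `sleep` and `rng` parameters exist only so the function can be exercised without real waiting. In production code the defaults would be left in place.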

  24. Summary • 24x7 global data center operations – managing tens of thousands of servers • We have learned from the industry and from our growing experience what it takes to make it better! • GFS partnership with Windows Azure from the start • Resiliency is a competitive advantage

  25. Evals & Recordings Please fill out your evaluation for this session at: This session will be available as a recording at: www.microsoftpdc.com

  26. Q&A Please use the microphones provided

  27. © 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
