
MySpace.com MegaSite v2


Presentation Transcript


  1. MySpace.com MegaSite v2
  Aber Whitcomb – Chief Technology Officer
  Jim Benedetto – Vice President of Technology
  Allen Hurff – Vice President of Engineering

  2. Previous MySpace Scaling Landmarks
  • First Megasite
  • 64+ MM Registered Users
  • 38 MM Unique Users
  • 260,000 New Registered Users Per Day
  • 23 Billion Page Views/Month
  • 50.2% Female / 49.8% Male
  • Primary Age Demo: 14-34
  [Chart: registered-user growth from 100K to 1 M, 6 M, 70 M, and 185 M]

  3. MySpace Company Overview Today
  • As of April 2007
  • 185+ MM Registered Users
  • 90 MM Unique Users
  • Demographics
  • 50.2% Female / 49.8% Male
  • Primary Age Demo: 14-34
  Source: comScore Media Metrix, March 2007

  4. Total Pages Viewed - Last 5 Months (Source: comScore Media Metrix, April 2007)

  5. Site Trends
  • 350,000 new user registrations/day
  • 1 Billion+ total images
  • Millions of new images/day
  • Millions of songs streamed/day
  • 4.5 Million concurrent users
  • Localized and launched in 14 countries
  • Launched China and Latin America last week

  6. Technical Stats
  • 7 Datacenters
  • 6,000 Web Servers
  • 250 Cache Servers with 16 GB RAM each
  • 650 Ad Servers
  • 250 DB Servers
  • 400 Media Processing Servers
  • 7,000 disks in SAN architecture
  • 70,000 Mbit/s (70 Gbit/s) total bandwidth
  • 35,000 Mbit/s (35 Gbit/s) on CDN

  7. MySpace Cache

  8. Relay System Deployment
  • Typically used for caching MySpace user data: online status, hit counters, profiles, mail.
  • Provides a transparent client API for caching C# objects (see the sketch below).
  • Clustering: servers are divided into "Groups" of one or more "Clusters".
  • Clusters keep themselves up to date.
  • Multiple load-balancing schemes based on expected load.
  • Heavy write environment: must scale past 20k redundant writes per second on a 15-server redundant cluster.
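The transcript only says that the relay client exposes "a transparent client API for caching C# objects." As a rough, hypothetical sketch of what such an API can look like (RelayCacheClient, UserStatus, and the hash-based cluster routing are invented for illustration, not MySpace's actual code):

  using System;
  using System.Collections.Concurrent;

  // Hypothetical sketch: callers put/get plain C# objects by key and never see
  // which cluster holds the entry. The real relay client is not public.
  [Serializable]
  public class UserStatus
  {
      public int UserId;
      public bool Online;
      public DateTime LastSeen;
  }

  public class RelayCacheClient
  {
      // One bucket per cluster; the real system routes via group/cluster config.
      private readonly ConcurrentDictionary<string, object>[] clusters;

      public RelayCacheClient(int clusterCount)
      {
          clusters = new ConcurrentDictionary<string, object>[clusterCount];
          for (int i = 0; i < clusterCount; i++)
              clusters[i] = new ConcurrentDictionary<string, object>();
      }

      // Pick a cluster by hashing the key (stand-in for the load-balancing schemes).
      private ConcurrentDictionary<string, object> Route(string key) =>
          clusters[(key.GetHashCode() & 0x7fffffff) % clusters.Length];

      public void Put<T>(string key, T value) => Route(key)[key] = value;

      public bool TryGet<T>(string key, out T value)
      {
          if (Route(key).TryGetValue(key, out object o) && o is T typed)
          {
              value = typed;
              return true;
          }
          value = default(T);
          return false;
      }
  }

Application code would then just call client.Put("status:42", new UserStatus { UserId = 42, Online = true }) and later TryGet<UserStatus>("status:42", ...) without knowing where the entry lives.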

  9. Relay System
  • Platform for middle-tier messaging.
  • Up to 100k request messages per second per server in production.
  • Purely asynchronous, no thread blocking; built on the Concurrency and Coordination Runtime (CCR).
  • Bulk message processing.
  • Custom unidirectional connection pooling.
  • Custom wire format, with gzip compression for larger messages (see the framing sketch below).
  • Data center aware.
  • Configurable components.
  [Architecture diagram: Relay Client, Socket Server, IRelayComponents (Berkeley DB, non-locking memory buckets, fixed-alloc / shared interlocked-int storage for hit counters, message forwarding, message orchestration), with CCR on both the client and service sides]
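The slide names a custom wire format with gzip compression for larger messages but gives no details. A minimal sketch of that idea, assuming an invented frame layout (4-byte length prefix plus a 1-byte compression flag, with a hypothetical 4 KB compression cutoff):

  using System;
  using System.IO;
  using System.IO.Compression;

  // Illustrative only: not MySpace's actual relay wire format.
  public static class RelayFrame
  {
      const int GzipThreshold = 4096; // hypothetical cutoff for "larger messages"

      public static byte[] Encode(byte[] payload)
      {
          bool compress = payload.Length > GzipThreshold;
          if (compress)
          {
              using (var ms = new MemoryStream())
              {
                  using (var gz = new GZipStream(ms, CompressionMode.Compress, true))
                      gz.Write(payload, 0, payload.Length);
                  payload = ms.ToArray();
              }
          }
          using (var frame = new MemoryStream())
          using (var w = new BinaryWriter(frame))
          {
              w.Write(payload.Length);   // 4-byte length prefix
              w.Write(compress);         // 1-byte compression flag
              w.Write(payload);
              return frame.ToArray();
          }
      }

      public static byte[] Decode(BinaryReader r)
      {
          int length = r.ReadInt32();
          bool compressed = r.ReadBoolean();
          byte[] body = r.ReadBytes(length);
          if (!compressed) return body;
          using (var gz = new GZipStream(new MemoryStream(body), CompressionMode.Decompress))
          using (var outMs = new MemoryStream())
          {
              gz.CopyTo(outMs);
              return outMs.ToArray();
          }
      }
  }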

  10. Code Management: Team Foundation Server, Team System, Team Plain, and Team Test Edition

  11. Code Management
  • MySpace embraced Team Foundation Server and Team System during Beta 3.
  • MySpace was also one of the early beta testers of BizDev's Team Plain (now owned by Microsoft).
  • Team Foundation initially supported 32 MySpace developers, now supports 110, and is on its way to over 230.
  • MySpace is able to branch and shelve more effectively with TFS and Team System.

  12. Code Management (continued)
  • MySpace uses Team Foundation Server as the source repository for its .NET, C++, Flash, and ColdFusion codebases.
  • MySpace uses Team Plain for Product Managers and other non-development roles.

  13. Code Management: Team Test Edition
  • MySpace is a member of the Strategic Design Review committee for the Team System suite.
  • MySpace chose Team Test Edition, which reduced cost and kept its Quality Assurance staff on the same suite as the development teams.
  • Using MSSCCI providers and customization of Team Foundation Server (including the upcoming K2 Blackperl), MySpace was able to extend TFS with better workflow and defect tracking tailored to its specific needs.

  14. Server Farm Management: CodeSpew

  15. CodeSpew
  • Maintaining a consistent, constantly changing code base and configs across thousands of servers proved very difficult.
  • Code rolls began to take a very long time.
  • CodeSpew: code deployment and maintenance utility.
  • Two-tier application: a central management server (C#) and a light agent on every production server (C#).
  • Tightly integrated with Windows PowerShell (see the hosting sketch below).
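The transcript does not show how the agent drives PowerShell. One concrete way a C# agent can be "tightly integrated with Windows PowerShell" is to host the engine in-process via System.Management.Automation; the script path and parameter below are placeholders:

  using System;
  using System.Collections.ObjectModel;
  using System.Management.Automation;
  using System.Management.Automation.Runspaces;

  // Requires a reference to the System.Management.Automation assembly.
  // Hosts PowerShell in-process and runs a (hypothetical) deployment script.
  class PowerShellHostSketch
  {
      static void Main()
      {
          using (Runspace runspace = RunspaceFactory.CreateRunspace())
          {
              runspace.Open();
              using (Pipeline pipeline = runspace.CreatePipeline())
              {
                  Command cmd = new Command(@"C:\deploy\roll-code.ps1");   // hypothetical script
                  cmd.Parameters.Add("TargetPath", @"D:\wwwroot\myspace"); // hypothetical parameter
                  pipeline.Commands.Add(cmd);

                  Collection<PSObject> results = pipeline.Invoke();
                  foreach (PSObject result in results)
                      Console.WriteLine(result);
              }
          }
      }
  }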

  16. CodeSpew (continued)
  • UDP out, TCP/IP in.
  • Massively parallel: able to update hundreds of servers at a time.
  • File modifications are determined on a per-server basis using CRCs (a minimal CRC comparison is sketched below).
  • Security model for code deployment authorization.
  • Able to execute remote PowerShell scripts across the server farm.
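The per-server, CRC-based change detection could be approximated as follows; the manifest format is assumed, and CRC-32 is implemented inline because the 2.0-era framework has no built-in CRC type:

  using System;
  using System.Collections.Generic;
  using System.IO;

  // Standard reflected CRC-32 (polynomial 0xEDB88320).
  static class Crc32
  {
      static readonly uint[] Table = BuildTable();

      static uint[] BuildTable()
      {
          var table = new uint[256];
          for (uint i = 0; i < 256; i++)
          {
              uint c = i;
              for (int k = 0; k < 8; k++)
                  c = (c & 1) != 0 ? 0xEDB88320u ^ (c >> 1) : c >> 1;
              table[i] = c;
          }
          return table;
      }

      public static uint Compute(Stream input)
      {
          uint crc = 0xFFFFFFFFu;
          int b;
          while ((b = input.ReadByte()) != -1)
              crc = Table[(crc ^ (uint)b) & 0xFF] ^ (crc >> 8);
          return crc ^ 0xFFFFFFFFu;
      }
  }

  static class ChangeDetector
  {
      // Returns relative paths whose CRC differs from (or is missing in) the
      // manifest sent by the central management server, so only those files
      // need to be pushed to this server.
      public static List<string> FindChangedFiles(string root, IDictionary<string, uint> manifest)
      {
          var changed = new List<string>();
          foreach (string path in Directory.GetFiles(root, "*", SearchOption.AllDirectories))
          {
              string relative = path.Substring(root.Length).TrimStart('\\', '/');
              uint crc;
              using (FileStream fs = File.OpenRead(path))
                  crc = Crc32.Compute(fs);
              uint expected;
              if (!manifest.TryGetValue(relative, out expected) || expected != crc)
                  changed.Add(relative);
          }
          return changed;
      }
  }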

  17. Media Encoding/Delivery

  18. Media Statistics
  • Images: 1 Billion+ images, 80 TB of space, 150,000 req/s, 8 Gbit/s
  • Videos: 60 TB of storage, 15,000 concurrent streams, 60,000 new videos/day
  • Music: 25 Million songs, 142 TB of space, 250,000 concurrent streams

  19. 4th Generation Media Encoding
  • Millions of MP3, video, and image uploads every day.
  • Ability to design custom encoding profiles (bitrate, width, height, letterbox, etc.) for a variety of deployment scenarios.
  • Job broker engine to maximize encoding resources and provide a level of QoS.
  • Abandonment of database connectivity in favor of a web service layer.
  • XML-based workflow definition to provide extensibility to the encoding engine (an illustrative profile appears below).
  • Coded entirely in C#.
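The deck does not show the XML workflow/profile schema; the snippet below is an invented example of what a profile definition and its C# loader might look like (element and attribute names are illustrative, not MySpace's actual schema):

  using System;
  using System.Xml.Linq;

  class EncodingProfileSketch
  {
      static void Main()
      {
          // Invented workflow definition with two encoding profiles.
          const string workflowXml = @"
  <workflow name='video-upload'>
    <profile id='flv-480' container='flv' videoBitrateKbps='512'
             width='480' height='360' letterbox='true' />
    <profile id='thumb' container='jpeg' width='160' height='120' letterbox='false' />
  </workflow>";

          XDocument doc = XDocument.Parse(workflowXml);
          foreach (XElement profile in doc.Root.Elements("profile"))
          {
              Console.WriteLine("profile {0}: {1} {2}x{3}",
                  (string)profile.Attribute("id"),
                  (string)profile.Attribute("container"),
                  (string)profile.Attribute("width"),
                  (string)profile.Attribute("height"));
          }
      }
  }

Keeping the profiles in XML rather than code is what gives the encoding engine its extensibility: a new deployment scenario is a new profile element, not a recompile.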

  20. 4th Generation Encoding Workflow

  21. MySpace Distributed File System

  22. MySpace Distributed File System
  • Provides an object-oriented file store.
  • Scales linearly to near-infinite capacity on commodity hardware.
  • High-throughput distribution architecture.
  • Simple cross-platform storage API.
  • Designed exclusively for long-tail content.
  [Chart: long-tail distribution of accesses vs. demand]

  23. Sledgehammer
  • Custom high-performance, event-driven web server core.
  • Written in C++ as a shared library.
  • Integrated content cache engine.
  • Integrates with the storage layer over HTTP.
  • Capable of more than 1 Gbit/s throughput on a dual-processor host.
  • Capable of tens of thousands of concurrent streams.

  24. DFS Interesting Facts
  • DFS uses a generic "file pointer" data type for identifying files, allowing us to change URL formats and distribution mechanisms without altering data (a rough sketch follows).
  • Compatible with traditional CDNs like Akamai.
  • Can be scaled at any granularity, from single nodes to complete clusters.
  • Provides a uniform method for developers to access any media content on MySpace.
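As a hypothetical illustration of the "file pointer" idea (all type names and URL shapes below are invented): stored data keeps only an opaque pointer, and a pluggable resolver turns it into a URL, so the URL format or distribution mechanism can change without touching the data.

  using System;

  // Opaque pointer persisted with the content metadata; it never encodes a URL.
  public struct FilePointer
  {
      public readonly short StoreId;   // which storage cluster/volume
      public readonly long ObjectId;   // object within that store

      public FilePointer(short storeId, long objectId)
      {
          StoreId = storeId;
          ObjectId = objectId;
      }
  }

  public interface IUrlResolver
  {
      Uri Resolve(FilePointer pointer);
  }

  // Direct-to-DFS resolution (hostname and path layout are invented).
  public class DfsUrlResolver : IUrlResolver
  {
      public Uri Resolve(FilePointer p) =>
          new Uri(string.Format("http://dfs{0}.media.example/obj/{1}", p.StoreId, p.ObjectId));
  }

  // Same pointer, different distribution mechanism (e.g. a traditional CDN).
  public class CdnUrlResolver : IUrlResolver
  {
      public Uri Resolve(FilePointer p) =>
          new Uri(string.Format("http://cdn.example/{0}/{1}", p.StoreId, p.ObjectId));
  }

Swapping DfsUrlResolver for CdnUrlResolver changes every generated URL without rewriting any stored FilePointer values, which is the property the slide describes.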

  25. Appendix

  26. Operational Wins

  27. MySpace Disaster Recovery Overview
  • Distribute MySpace servers over 3 geographically dispersed co-location sites.
  • Maintain presence in Los Angeles.
  • Add a Phoenix site for active/active configuration.
  • Add a Seattle site for active/active/active with site failover capability.

  28. Distributed File System Architecture
  [Architecture diagram: Users, Business Logic Server, Accelerator Engine, Sledgehammer Cache Engine, DFS Cache Daemon, Storage Cluster]
