Scalable Web Site Antipatterns

Justin Leitgeb, Stack Builders Inc.

Overview • Based on architectures that have caused significant downtime and pain • Like the examples in Nygard's book, but with more emphasis on essential rather than accidental properties of the system

Anti-pattern 1: Monotonically increasing data set with rapid growth • The system relies on querying all historical data • Queries require joins against mega-tables (hundreds of millions of rows) • The data is often automatically aggregated

Detection • Slow query log • SHOW FULL PROCESSLIST • SHOW ENGINE INNODB STATUS • vmstat

vmstat
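As a rough illustration of reading vmstat for this anti-pattern, the sketch below parses the standard Linux vmstat column layout and flags samples that look I/O-bound or that are swapping. The sample text, the function names (`parse_vmstat`, `io_bound`) and the 20% I/O-wait threshold are all assumptions made for the example, not figures from the talk.

```python
# Illustrative only: SAMPLE is invented output in the usual Linux vmstat layout.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812340  10240 402112    0    0    12    30  210  380  5  2 92  1  0
 2  6      0  20480   2048 120320    0    0  9800  4100  950 2100  8  4 30 58  0
"""

def parse_vmstat(text):
    """Turn vmstat output into a list of {column: int} dicts, one per sample."""
    rows = [line.split() for line in text.strip().splitlines()]
    header = rows[1]  # the second line holds the column names
    return [dict(zip(header, map(int, row))) for row in rows[2:]]

def io_bound(samples, wa_threshold=20):
    """Keep samples with high I/O wait ('wa') or any swap traffic ('si'/'so')."""
    return [
        s for s in samples
        if s["wa"] >= wa_threshold or s["si"] > 0 or s["so"] > 0
    ]

flagged = io_bound(parse_vmstat(SAMPLE))
for s in flagged:
    print(f"blocked procs={s['b']} io wait={s['wa']}% blocks in={s['bi']}")
```

A database server that spends most of its time in `wa` with many blocked processes (`b`) is typically waiting on disk, which is what large joins over tables that no longer fit in memory tend to produce.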

Anti-solutions • Partitioning • Pre-caching (cron jobs) • Switching to MyISAM • NoSQL?

NoSQL • Out-of-the-box NoSQL solutions (e.g., MongoDB) help with data modeling • Trade strict ACID guarantees for the availability and partition-tolerance trade-offs described by CAP • May make it easier to distribute algorithms • But: • Engines have not yet had as much engineering effort invested as MySQL (InnoDB) • Often use the same algorithms (e.g., B-tree indexes) • Can require more dev time (e.g., Cassandra and a good implementation of distributed algorithms)

Stop the bleeding • Cut off long queries • Turn off site sections • Fail whale
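One way to cut off long queries can be sketched as follows, assuming the rows of SHOW FULL PROCESSLIST have already been fetched as dictionaries keyed by the standard column names (Id, Command, Time, Info). The function name `kill_statements`, the 30-second limit, the read-only policy and the sample rows are hypothetical; the sketch only builds the `KILL QUERY` statements and leaves executing them to the operator.

```python
from typing import Dict, List

def kill_statements(processlist: List[Dict], max_seconds: int = 30) -> List[str]:
    """Build KILL QUERY statements for SELECTs running longer than max_seconds."""
    statements = []
    for row in processlist:
        # Idle connections report Command == "Sleep"; only running statements matter.
        if row["Command"] != "Query":
            continue
        if row["Time"] is None or row["Time"] <= max_seconds:
            continue
        # Assumed policy: only interrupt reads, never a write in flight.
        info = (row["Info"] or "").lstrip().upper()
        if info.startswith("SELECT"):
            statements.append(f"KILL QUERY {row['Id']}")
    return statements

# Hypothetical processlist snapshot.
sample = [
    {"Id": 11, "Command": "Query", "Time": 412,
     "Info": "SELECT * FROM events JOIN users ON events.user_id = users.id"},
    {"Id": 12, "Command": "Sleep", "Time": 900, "Info": None},
    {"Id": 13, "Command": "Query", "Time": 2, "Info": "SELECT 1"},
    {"Id": 14, "Command": "Query", "Time": 95,
     "Info": "UPDATE sessions SET seen_at = NOW()"},
]

for statement in kill_statements(sample):
    print(statement)
```

`KILL QUERY` ends the running statement but keeps the connection open, which is usually gentler on the application than killing the connection itself.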

Band-aids • The obvious: adding app servers, memcached, a bigger DB server • But adding app servers puts more pressure on the DB server • HTTP caching (Varnish) • MySQL tuning (look for things like "Using filesort" in EXPLAIN output) • Read slaves
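The memcached band-aid usually takes the form of a cache-aside read, sketched below. An in-process dictionary with a time-to-live stands in for memcached here; the class name `TTLCache`, the key name and the `slow_count` loader are assumptions for illustration only.

```python
import time
from typing import Any, Callable, Dict, Tuple

class TTLCache:
    """Cache-aside helper; a dict stands in for memcached in this sketch."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._items.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]              # fresh entry: the database is not touched
        value = loader()               # miss or expired: run the expensive query
        self._items[key] = (now + self._ttl, value)
        return value

calls = []

def slow_count():
    """Hypothetical stand-in for an expensive aggregate query."""
    calls.append(1)
    return 12345

cache = TTLCache(ttl_seconds=60)
cache.get_or_load("event_count", slow_count)
cache.get_or_load("event_count", slow_count)
print(len(calls))  # 1: the second read was served from the cache
```

This only postpones the problem: every expiry still runs the full query against the ever-growing table, which is why it sits under band-aids rather than solutions.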

Solutions • Hard-limit data volume - look for cases where data decreases in value with time • Add features related to scale • Distributed algorithms and data stores • Data warehousing
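Hard-limiting data volume often comes down to a retention job that deletes rows past a cutoff in small batches, so that no single delete holds locks for long. The sketch below uses an in-memory SQLite database as a stand-in for the production store; the `events` table, its columns and the batch size are assumptions. On MySQL the same idea would more likely be written as a single-table `DELETE ... WHERE ... LIMIT n` repeated until no rows are affected.

```python
import sqlite3

def purge_older_than(conn, cutoff_ts, batch_size=1000):
    """Delete events older than cutoff_ts in batches; return rows removed."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM events WHERE id IN ("
            "SELECT id FROM events WHERE created_at < ? LIMIT ?)",
            (cutoff_ts, batch_size),
        )
        conn.commit()                  # keep each batch in its own short transaction
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at INTEGER)")
conn.executemany("INSERT INTO events (created_at) VALUES (?)",
                 [(ts,) for ts in range(10)])
conn.commit()

deleted = purge_older_than(conn, cutoff_ts=7, batch_size=3)
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(deleted, remaining)
```

If the old rows still have analytical value, the same job can copy them to a data warehouse before deleting them from the transactional store.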

Anti-pattern 2: Allowing "risky" writes to block HTTP responses • Symptoms: • Slow requests • Servers hitting MaxClients and returning 500 errors

Possible Causes • Database-backed analytics tracking • Session management • Any SQL DML (e.g., UPDATE, DELETE)

Risk increases with: • The number of requests invoking the write operation • Traffic • Concurrent background operations • The algorithmic complexity of the write • Slow AWS I/O on EBS

Solutions • Asynchronize! • Write to a queue • Write to memcached or another non-ACID store • Later, load the data into a data warehouse for advanced analytics
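The "write to a queue" idea can be sketched with Python's standard library: the request handler only enqueues the tracking event and returns, while a background worker performs the slow write. The in-process `queue.Queue`, the `stored` list (a stand-in for the analytics table) and the handler name are assumptions for illustration; a production system would more likely use a durable external queue so that events survive a process restart.

```python
import queue
import threading

write_queue = queue.Queue()
stored = []  # stand-in for the analytics table

def handle_request(path):
    """Enqueue the tracking write and respond without waiting for it."""
    write_queue.put({"path": path})
    return "200 OK"

def writer():
    """Background worker: drain the queue and perform the slow writes."""
    while True:
        event = write_queue.get()
        if event is None:              # sentinel: shut the worker down
            write_queue.task_done()
            break
        stored.append(event)           # the risky write happens off the request path
        write_queue.task_done()

worker = threading.Thread(target=writer, daemon=True)
worker.start()

responses = [handle_request(p) for p in ("/", "/about", "/pricing")]
write_queue.put(None)
write_queue.join()                     # demo only: wait until everything is written
print(responses, len(stored))
```

The response time of `handle_request` no longer depends on how long the write takes, so a slow or locked table delays analytics rather than page loads.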

More info • Nygard, Michael T. Release It!: Design and Deploy Production-Ready Software. Raleigh, NC: Pragmatic, 2007. • Fowler, Martin. Patterns of Enterprise Application Architecture. Boston: Addison-Wesley, 2003. • Kimball, Ralph. The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses. John Wiley & Sons, 2010. • Schwartz, Baron. High Performance MySQL. O'Reilly, 2008.
