In the realm of web application development, performance and scalability are paramount, and Drupal is no exception. While Drupal 6 offers satisfactory performance with caching, its successor, Drupal 7, boasts enhanced flexibility for scalability despite being slightly slower initially. This guide will lead you through setting up a Drupal site, from a single server to a complex, multi-tiered infrastructure, emphasizing optimization strategies and future performance enhancements. Learn about effective caching methods, database solutions, and PHP optimization techniques to ensure your website thrives in high-traffic environments.
Prepare to Scale Revolutionizing enterprise web development
Intro • Performance is critical for any web application, and Drupal is no different. • D6 performance out of the box is okay, but it needs caching to shine. • D7 is actually slightly slower out of the box (maybe 20%), but it offers greater and easier flexibility for scaling down the road. • We’ll walk through building a site, from one server to a multi-tiered infrastructure, with an eye to the future and to common steps for improving performance over time. • Performance: how fast pages are returned to a user. • Scalability: how well a site can handle many users.
Basic Infrastructure Single-Server • Database & Application on the same server • Start optimizing what you have • Web Server • Drupal • PHP • Database • Optimizations you make for the first server will be applicable for future servers • Strategy: Optimize what you have, then divert traffic through caching and specialization
Web Server • Apache • Standard, but bloated • Lots of history; you know things will work. • Nginx • Lighter • Faster • Edge cases sometimes make it unusable as the web server.
Drupal 1-word: Pressflow • Support for database replication • Support for Squid/Varnish • MySQL optimizations • PHP5 optimizations • http://fourkitchens.com/pressflow-makes-drupal-scale/downloads • Currently ONLY relevant for D6. Most of the above has been incorporated into D7.
DB MyISAM • Relational database • Default storage engine for Drupal 6 and earlier • Good for selects • Suits mostly read-only websites • Poor read-write performance, particularly for large websites, where its table-level locking can cause contention
DB, Cont. InnoDB is your friend in most scenarios • Relational database • Row-level locking (vs. MyISAM's table-level locking) • Improves read/write performance • Does slow pure-read performance to some degree • Default storage engine of Drupal 7+ • Best bet at the moment for allowing your site to scale
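To make the locking difference concrete, here is a minimal sketch in Python (chosen only for illustration; it is not Drupal or MySQL code). It assumes a simplified model in which a batch of writes arrives at the same instant, and it counts how many of them have to wait. The function name `blocked_writes` and the sample row ids are hypothetical.

```python
from collections import Counter


def blocked_writes(row_ids, granularity):
    """Count writes that must wait when every write in row_ids arrives at once.

    granularity is "table" (one lock for the whole table, MyISAM-style)
    or "row" (one lock per row, InnoDB-style). This is a toy model, not
    how either storage engine is actually implemented.
    """
    if granularity == "table":
        # One writer holds the table lock; every other writer queues.
        return max(len(row_ids) - 1, 0)
    if granularity == "row":
        # A writer queues only behind another writer touching the same row.
        per_row = Counter(row_ids)
        return sum(count - 1 for count in per_row.values())
    raise ValueError(f"unknown granularity: {granularity}")


# Five simultaneous updates, two of which hit the same row (id 7).
writes = [3, 7, 7, 12, 40]
table_waits = blocked_writes(writes, "table")
row_waits = blocked_writes(writes, "row")
print(table_waits, row_waits)
```

In this model the table lock makes four of the five writes wait, while row locks make only the second write to row 7 wait. Real engines are more nuanced, but this is the intuition behind preferring InnoDB for read/write workloads.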
DB, Cont. The horizon or ‘other RDBMS of note’ • Drizzle • Rewrite from the ground up of MySQL • Slightly poorer performance than MySQL InnoDB at low volumes but far better scalability • No production release • MariaDB • ‘Drop-in’ replacement for MySQL • Uses XtraDB instead of InnoDB • Superior performance to MySQL
DB, Cont. Other DBs of Note (NoSQL) • MongoDB • Document-oriented DB • Used by the Examiner • D7 module for it • Cassandra • Column-oriented DB • Facebook Inbox • Eventual consistency
PHP Opcode Caching • Sort of like having a compiled version of your application • Optimizes PHP components • Stores the compiled PHP bytecode in shared memory for execution • Result: Smaller PHP memory footprint (read: more users with less hardware) and faster execution of code • Virtually a necessity for any large-scale/high-volume Drupal deployment
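The idea behind an opcode cache can be sketched by analogy in Python: compile the source once, keep the compiled code object in memory, and reuse it on every later request. This is only an illustration of the concept, not how any PHP opcode cache is implemented; the class name `BytecodeCache` and the sample "page" are hypothetical.

```python
import hashlib


class BytecodeCache:
    """Keep compiled code objects in memory so source is compiled only once."""

    def __init__(self):
        self._cache = {}
        self.compile_count = 0

    def run(self, source, env=None):
        # Key the cache on the source text, so edited source is recompiled.
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        code = self._cache.get(key)
        if code is None:
            code = compile(source, "<cached>", "exec")
            self._cache[key] = code
            self.compile_count += 1
        # Each request gets its own namespace; only the bytecode is shared.
        namespace = {} if env is None else dict(env)
        exec(code, namespace)
        return namespace


cache = BytecodeCache()
page = "result = sum(range(n))"
first = cache.run(page, {"n": 10})["result"]
second = cache.run(page, {"n": 100})["result"]
print(first, second, cache.compile_count)
```

Both "requests" execute the same compiled object, so the compile step runs once no matter how many requests arrive.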
PHP, Cont. Opcode caching • eAccelerator • Off & on maintenance • Only works with threadsafe PHP • Has – in my experience – led to some strange crashing, WSOD, etc. • Xcache • Reasonable performance improvement, though it tends to test slowest of the 3 • Actively maintained • Stable, but still prone to cache corruption, WSOD, etc.
PHP, Cont. Opcode caching, cont. • APC • Current opcode cache of choice • Most actively updated • Most stable of the 3 • Usually the winner in performance benchmarks • Maintained by core PHP developers (Rasmus)
Static Caching Static Caching Modules • Creating and storing rendered versions of the html • Rather than building the page on request • Avoids having to load any aspect of your application depending on the implementation • Acts as a layer between the user and actual execution of your program • Alleviates DB issues since the DB is no longer involved • Simplifies any PHP execution
Static Caching, Cont. Static Caching Modules, Cont. • Boost Module • Static file caching • Good for Anonymous traffic only • Great Performance for small sites • Ideal for shared hosts • AuthCache Module • Static file caching • Attempts to handle logged-in traffic • Plays nice with and/or can utilize multiple caching engines • Can be a bit of a pain for user-specific content as you have to write particular cases for each user-specific area
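The anonymous-only approach described for Boost can be sketched as follows. This is a minimal Python illustration, not the module's actual code: it keeps rendered HTML in a dictionary rather than in files on disk, and the names `StaticPageCache` and `render_page` are hypothetical.

```python
class StaticPageCache:
    """Serve stored HTML to anonymous visitors; always render for logged-in users."""

    def __init__(self, render):
        self._render = render
        self._pages = {}

    def handle(self, path, user=None):
        if user is not None:
            # Logged-in pages may contain user-specific content: never cache them.
            return self._render(path, user)
        if path not in self._pages:
            # First anonymous hit pays the full rendering cost.
            self._pages[path] = self._render(path, None)
        return self._pages[path]


render_calls = []


def render_page(path, user):
    """Stand-in for the expensive work of building a page."""
    render_calls.append((path, user))
    greeting = f"Hello {user}" if user else "Hello guest"
    return f"<html><body>{greeting} at {path}</body></html>"


site = StaticPageCache(render_page)
site.handle("/about")                # rendered and stored
site.handle("/about")                # served from the cache
site.handle("/about", user="alice")  # rendered again, not cached
print(len(render_calls))
```

Two anonymous requests cost one render, while the logged-in request always reaches the application. That bypass is the part AuthCache-style approaches try to improve on.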
Static Caching, Cont. Static Caching Modules, Cont. • Shameless plug: Ajaxify Regions • Aptly named... or not • Actually pulls blocks, not regions, via Ajax • Early release with plenty of work to do; needs more real-world testing, etc. • Automatically handles all user-specific block content based on block-caching settings • BLOCK_NO_CACHE • BLOCK_CACHE_PER_USER • BLOCK_CACHE_PER_ROLE • Concept: Ajax-load anything that can’t be cached for everyone
Object-level Caching Object-level caching • Provides a way to store fully-generated objects • Can be the amalgam of many queries • Think of all the queries run on a node_load vs. retrieving all that information in 1 query. • Stores the information in memory for fast access • Performance characteristics not significantly different than MySQL when MySQL can handle the load • BUT can handle a much higher load • Protects the DB – the area most likely to inhibit performance for Drupal – from becoming overwhelmed
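The node_load comparison above can be sketched as a simple cache-aside pattern. This Python example is an illustration only: `FakeDatabase`, `build_node`, and `cached_node_load` are hypothetical stand-ins, and a plain dictionary plays the role of the in-memory object cache.

```python
class FakeDatabase:
    """Hypothetical stand-in for the site database; counts every query."""

    def __init__(self):
        self.query_count = 0
        self._tables = {
            "node": {1: {"title": "Hello"}},
            "body": {1: {"value": "First post"}},
            "tags": {1: ["drupal", "scaling"]},
        }

    def query(self, table, nid):
        self.query_count += 1
        return self._tables[table][nid]


def build_node(db, nid):
    """Assemble one object from several queries, as a full node load would."""
    node = dict(db.query("node", nid))
    node["body"] = db.query("body", nid)["value"]
    node["tags"] = list(db.query("tags", nid))
    return node


def cached_node_load(db, cache, nid):
    """Return the assembled object from the cache, building it only on a miss."""
    key = f"node:{nid}"
    if key not in cache:
        cache[key] = build_node(db, nid)
    return cache[key]


db = FakeDatabase()
cache = {}
first = cached_node_load(db, cache, 1)
second = cached_node_load(db, cache, 1)
print(db.query_count)
```

The first load costs three queries; the second costs none, because the fully built object comes back from memory in a single lookup.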
Object-level Caching, Cont. Object-level caching, Cont. • APC • Not a typo • APC can handle object caching as well as op-code caching • It’s fast: everything is stored in local memory • It caches only for one server. • This means that you could have synchronization issues between servers if you have more than one • If that’s not an issue, it’s a quick and easy solution • Ideal for single-server implementations or when synchronicity isn’t an issue
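The synchronization issue with a per-server cache can be shown with a small Python sketch. It assumes two web servers that share one database but each keep their own local cache; the `WebServer` class and the sample key are hypothetical.

```python
class WebServer:
    """A web server with its own local (per-machine) object cache."""

    def __init__(self, name, database):
        self.name = name
        self.database = database
        self.local_cache = {}

    def read(self, key):
        if key not in self.local_cache:
            self.local_cache[key] = self.database[key]
        return self.local_cache[key]

    def write(self, key, value):
        self.database[key] = value
        # Only THIS server's cache learns about the change.
        self.local_cache[key] = value


database = {"site_name": "Old name"}
web1 = WebServer("web1", database)
web2 = WebServer("web2", database)

# Both servers warm their local caches.
web1.read("site_name")
web2.read("site_name")

# An update is handled by web1 only.
web1.write("site_name", "New name")

fresh = web1.read("site_name")
stale = web2.read("site_name")
print(fresh, "|", stale)
```

The second server keeps serving the old value until its cache entry expires or is cleared. With a single server this problem cannot occur, which is why a local cache is a good fit there.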
Object-level Caching, Cont. • Object-level caching, Cont. • Memcache • Utilized by most high-profile sites • Facebook, for instance, makes tremendous use of lots and lots of memcache servers • Drupal.org uses it • Provides an object cache that can be used by multiple servers • Slower in the single-server instance than APC, but provides synchronicity • Multiple silos/buckets can be created for information so you can distribute information across multiple servers
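One way to picture the silos/buckets idea is a mapping from cache bins to groups of servers, with a hash deciding which server in the group holds a given key. The Python sketch below is an assumption-laden illustration, not the configuration format or hashing scheme of any real memcache client: the bin names, server addresses, and `pick_server` function are all hypothetical.

```python
import zlib

# Hypothetical layout: page cache spread over two servers, everything else on a third.
BIN_SERVERS = {
    "cache_page": ["cache1:11211", "cache2:11211"],
    "default": ["cache3:11211"],
}


def pick_server(bin_name, key):
    """Pick the server for a key: choose the bin's server group, then hash the key."""
    servers = BIN_SERVERS.get(bin_name, BIN_SERVERS["default"])
    # crc32 is deterministic, so every web server computes the same answer.
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]


placement = {
    key: pick_server("cache_page", key)
    for key in ["/about", "/node/1", "/node/2", "/contact"]
}
print(placement)
print(pick_server("cache_menu", "main-menu"))
```

Because every web server computes the same placement, they all read and write the same entry, which is what gives a shared cache its synchronicity. Note that simple modulo hashing moves most keys when the number of servers changes; production clients often use consistent hashing to soften that.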
Advanced Infrastructure (example) [Diagram: Load Balancer, Static-Caching, Application (Drupal), Solr, Memcache, Deployment, GlusterFS, Database, Slave DB]
Specialization Specialized Servers/Services • DB Server • SOLR • Memcache • Static-caching • CDN • GlusterFS
Specialization MySQL Server • One of the fastest ways to improve performance is to separate your MySQL DB from your application • This allows both your application and your DB to make full use of independent hardware • The change is basically transparent at the application layer: just a single change to settings.php
Specialization Search • Problem: Search is incredibly hard on the system • Particularly w/ multiple search terms • Drupal search works, but despite great efforts is still not as quick or useful as an outside solution • Search is particularly hard on the DB, Drupal’s traditional bottleneck • In other words, search makes a bad problem worse
Specialization Search, Cont. • Solution: Solr • Communication layer between the website and the Lucene search index • Offloads all of the complex processing to a search server • More power for searches (search faster!) • Doesn’t lock up your website DB • Website can focus on what it does, search can focus on what it does • Additional benefit: faceting (filtering), sorting • Ability to search content based on specific criteria (content type, author, taxonomy terms) and sort based on criteria (title, date, author, content type) • Hosted model (Acquia Search) or can be installed on a server in your infrastructure
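Faceting and sorting are easier to picture with a toy example. The Python sketch below runs over a small in-memory list; it does not use Solr or its API, and the documents, field names, and `search` function are hypothetical.

```python
from collections import Counter

# Hypothetical indexed content.
DOCS = [
    {"title": "Scaling Drupal", "type": "article", "author": "bill"},
    {"title": "Caching basics", "type": "article", "author": "ann"},
    {"title": "About us", "type": "page", "author": "bill"},
]


def search(docs, term, filters=None, sort_by="title"):
    """Match a term in the title, apply exact-match filters, sort, and count facets."""
    filters = filters or {}
    hits = [
        doc for doc in docs
        if term.lower() in doc["title"].lower()
        and all(doc[field] == value for field, value in filters.items())
    ]
    hits.sort(key=lambda doc: doc[sort_by])
    # Facets: how many hits fall under each value of each field.
    facets = {
        field: dict(Counter(doc[field] for doc in hits))
        for field in ("type", "author")
    }
    return hits, facets


hits, facets = search(DOCS, "ing")
bill_hits, _ = search(DOCS, "ing", filters={"author": "bill"})
print([doc["title"] for doc in hits], facets)
```

The facet counts are what a search page shows next to each filter link (for example, how many results each author has), and choosing one narrows the result set without starting a new search from scratch.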
Specialization Static Caching • Static-caching on the same server as the website provides a performance improvement • Downside: there’s still a lot of wasted overhead. Apache loads everything it needs for a full website, not just for serving HTML, and PHP also has to load • Static-caching elsewhere provides the opportunity to optimize the server for static-caching • Side effect: your web server now has more memory free to handle requests that require PHP processing • D6 does not, but Pressflow and D7 provide capabilities for leveraging external caching services.
Specialization Static Caching, Cont. • Squid • Free • Not Specifically designed just for http acceleration • Difficult to setup/configure • Performance improvement, but less than competition
Specialization Static Caching, Cont. • Varnish • Free (to download) • Pressflow/D7 built to work w/ Varnish • Varnish servers set up for Drupal and usable off Amazon EC2 (developed by Chapter 3) ($.34/hr + $.17/GB) • Designed from the ground up for http acceleration • Can take time/expertise to get the performance you want • Can create a significant performance improvement once configured correctly • Most popular + off-the-shelf/aws implementations
Specialization Static Caching, Cont. • aiCache • Best performance of the bunch • Simple configuration • Provides additional features for caching • Header recognition • Session caching • Drop-in solution • Not free • Amazon EC2 instance is available ($.68/hr + $20/GB)
Specialization CDN • Cache content that is static (outside of full pages) • Images • Video • CSS • JS • Popular examples • Akamai • LimeLight • Amazon CloudFront • Separate domains, more bandwidth, geographic servers all equal faster loading • Can be an expensive option
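The "separate domains" point can be illustrated with a small URL-rewriting sketch in Python. It is not the code of any Drupal CDN module: the host `cdn.example.com`, the extension list, and the `asset_url` function are all hypothetical.

```python
from pathlib import PurePosixPath

CDN_HOST = "https://cdn.example.com"  # hypothetical CDN domain
STATIC_EXTENSIONS = {".png", ".jpg", ".gif", ".css", ".js", ".mp4"}


def asset_url(path):
    """Point static assets at the CDN; leave dynamic pages on the origin server."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        return CDN_HOST + path
    return path


print(asset_url("/sites/default/files/logo.PNG"))
print(asset_url("/node/1"))
```

Images, video, CSS, and JS are then fetched from the CDN's servers, while full pages still come from the site itself.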
Summary • Start small and make the easy optimizations: • Pressflow/D7 • InnoDB (D7 by default) • APC • Add servers and services as necessary and based on individual traffic: • MySQL • SOLR • Memcache • Static Cache • CDN
The End • Questions?
Thank You Bill O’Connor, CTO d.o: csevb10 t: csevb10 e: bill@achieveinternet.com