In "The Scaling IQ Test," Richard Campbell emphasizes the critical IT/Dev meetings that every web application must have. With over thirty years in the tech industry, Campbell shares crucial insights on what both development and IT teams need to communicate for successful application performance. He covers essential topics like reliability, performance, action plans during application failures, and the importance of disaster recovery strategies. By fostering cooperative firefighting and collaboration, teams can significantly enhance application scalability and reliability.
The Scaling IQ Test: When Dev and Admin Collide Richard Campbell Strangeloop Networks
Richard Campbell • Background • After thirty years, done every job in the computer industry you’ve ever heard of • Currently • Co-Founder and Product Evangelist for Strangeloop Networks • Co-Host of .NET Rocks! • Host of RunAs Radio
The IT/Dev Meeting • Every web application has this meeting eventually • Sooner is always better • The goal is to trade information • What IT needs to know about the app • What Dev needs to know about the operating environment
The IT/Dev Meeting • Who needs to be in the room? • The architect/senior dev • Senior devs who know the features in detail • IT personnel who will operate the application • Senior personnel who know the entire network
The IT/Dev Meeting • When does the meeting need to happen? • When the application is being designed (collected as requirements) • While the application is being developed • After the application is deployed • After the application has crashed horribly • When the application is too slow
The IT/Dev Meeting • Starting the meeting • What are the priorities • Reliability • Performance • Scalability • Accuracy • Put them in order, every site has different priorities
The IT/Dev Meeting • What IT Needs to Know • What’s in the web.config file (a great starting point) • What load balancing strategies will work for the application • Any known performance bottlenecks
Web.Config • <authentication> • None (Anonymous) • Windows (Active Directory, Basic, etc.) • Forms-Based
Web.Config • <appSettings> • Global connection strings, paths etc • Make sure they’re being used! • Remove dead strings • These can be critical in failover/disaster recovery scenarios
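One way to act on "make sure they're being used" is to cross-check the keys in <appSettings> against the application source. The sketch below is only an illustration: the config fragment, the key names, and the two source snippets are all hypothetical, and a plain text search like this will miss keys that are built dynamically.

```python
import xml.etree.ElementTree as ET

# Hypothetical web.config fragment; keys and values are made up for illustration.
WEB_CONFIG = """
<configuration>
  <appSettings>
    <add key="ReportPath" value="E:/Reports" />
    <add key="LegacyFtpHost" value="ftp.example.com" />
    <add key="OrdersDb" value="Server=sql01;Database=Orders;Integrated Security=SSPI" />
  </appSettings>
</configuration>
"""

# Hypothetical application source, keyed by file name.
SOURCE_FILES = {
    "Reports.cs": 'var path = ConfigurationManager.AppSettings["ReportPath"];',
    "Orders.cs": 'var cs = ConfigurationManager.AppSettings["OrdersDb"];',
}


def app_setting_keys(config_xml):
    """Return every key declared under <appSettings>."""
    root = ET.fromstring(config_xml.strip())
    return [add.get("key") for add in root.findall("./appSettings/add")]


def dead_keys(config_xml, sources):
    """Return keys that never appear as a quoted string in the source."""
    all_code = "\n".join(sources.values())
    return [k for k in app_setting_keys(config_xml) if f'"{k}"' not in all_code]


print(dead_keys(WEB_CONFIG, SOURCE_FILES))
```

A list like this is a starting point for the meeting, not a deletion script: each candidate still needs a developer to confirm it is really unused before it is removed.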
Web.Config • <customErrors> • Decide on how errors should be displayed to the customer (internal or external) • Defaults are really not enough • You can create separate pages for each error (handle 404 page not found differently from 500 internal server error)
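To make the per-error pages concrete, here is a small sketch that reads a <customErrors> fragment and reports which page a given status code would be sent to. The `mode`, `defaultRedirect`, `statusCode`, and `redirect` attributes are standard ASP.NET settings; the page names are hypothetical, and the lookup function only mimics the mapping for discussion purposes, it is not how ASP.NET itself resolves errors.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: page names are placeholders.
CUSTOM_ERRORS = """
<customErrors mode="RemoteOnly" defaultRedirect="~/Error.aspx">
  <error statusCode="404" redirect="~/NotFound.aspx" />
  <error statusCode="500" redirect="~/ServerError.aspx" />
</customErrors>
"""


def error_page_for(config_xml, status_code):
    """Return the page configured for status_code, or the default redirect."""
    node = ET.fromstring(config_xml.strip())
    for err in node.findall("error"):
        if err.get("statusCode") == str(status_code):
            return err.get("redirect")
    return node.get("defaultRedirect")


for code in (404, 500, 503):
    print(code, "->", error_page_for(CUSTOM_ERRORS, code))
```

Walking through a table like this in the meeting makes it obvious which errors still fall through to the generic page.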
Web.Config • <sessionState> • In-process vs. out-of-process • More dependencies • Affects options around load balancing
Load Balancing • Find out what load balancing will work with the application • In-process session requires “sticky” load balancing • You only get to load balance the first request • Talk through server failure effects
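The link between <sessionState> and load balancing can be checked straight from the config file. The sketch below assumes the rule of thumb from the slides (in-process session means sticky load balancing) and relies on the fact that ASP.NET defaults to InProc when no mode is set. The server name in the sample is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical configs; "state01" is a made-up state server host.
IN_PROC = """
<configuration>
  <system.web>
    <sessionState mode="InProc" timeout="20" />
  </system.web>
</configuration>
"""

OUT_OF_PROC = """
<configuration>
  <system.web>
    <sessionState mode="StateServer" stateConnectionString="tcpip=state01:42424" />
  </system.web>
</configuration>
"""


def session_mode(config_xml):
    """Return the configured session mode, falling back to the InProc default."""
    root = ET.fromstring(config_xml.strip())
    node = root.find("./system.web/sessionState")
    if node is None:
        return "InProc"
    return node.get("mode", "InProc")


def needs_sticky_sessions(config_xml):
    """In-process session lives on one server, so requests must return to it."""
    return session_mode(config_xml) == "InProc"


print(needs_sticky_sessions(IN_PROC))
print(needs_sticky_sessions(OUT_OF_PROC))
```

Moving session out of process removes the sticky requirement but adds the state server or database as a dependency that can itself fail, which is worth talking through in the same conversation.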
Performance Bottlenecks • Discuss known performance issues • Night-time processing that conflicts with existing work • Administrator work that significantly impacts the performance of regular users • What parts of the application are more scalable than others?
Things Dev Needs to Know • The Network Diagram (in detail!) • How to get at production log data • What redundancy/failover/disaster recovery options there are
The Network Diagram How developers see it
The Network Diagram Closer to reality
Production Logs • Production logs are the truth of what happened with the application • Providing developers with production logs gives them a chance to help out • Provide access to the backups of the logs • Saying “I’ll give them to you when you ask” is not enough • You’re looking for proactive analysis
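As an illustration of what proactive analysis might look like once developers have the logs, the sketch below counts server errors per URL in a W3C extended log, the format IIS writes by default. The sample lines are invented, and the field list is read from the `#Fields:` header, so the script assumes that header is present.

```python
from collections import Counter

# Invented sample in W3C extended format; real logs usually carry more fields.
SAMPLE_LOG = """\
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2011-03-01 02:00:01 GET /orders/list.aspx 200 312
2011-03-01 02:00:02 GET /reports/monthly.aspx 500 9120
2011-03-01 02:00:03 POST /orders/new.aspx 200 450
2011-03-01 02:00:05 GET /reports/monthly.aspx 500 8734
2011-03-01 02:00:09 GET /orders/list.aspx 503 15
"""


def parse_w3c(text):
    """Yield one dict per request, using the most recent #Fields header."""
    fields = []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            fields = line.split()[1:]
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split()))


def server_errors_by_url(text):
    """Count 5xx responses per requested URL."""
    counts = Counter()
    for row in parse_w3c(text):
        if row["sc-status"].startswith("5"):
            counts[row["cs-uri-stem"]] += 1
    return counts


print(server_errors_by_url(SAMPLE_LOG).most_common())
```

Run on a schedule against the log backups, a report like this lets developers spot a failing page before anyone files a ticket.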
Disaster Recovery • All DR strategies require at least some coding support • SQL Server failover still needs to have queries retried to be seamless • What happens between the time a server fails and the load balancing strategy detects it? • Is losing a request acceptable in your scenario?
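The "queries retried" point is the coding support in question. Below is a minimal sketch of a retry wrapper with exponential backoff; `TransientError`, `run_with_retry`, and the flaky query are all hypothetical stand-ins, not part of any database driver, and real code would catch the specific connection errors its driver raises.

```python
import time


class TransientError(Exception):
    """Stand-in for a connection error raised while a failover is in progress."""


def run_with_retry(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on TransientError with doubling delays."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the failure
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical query that fails twice (failover in progress), then succeeds.
calls = {"n": 0}


def flaky_query():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("connection dropped")
    return ["row1", "row2"]


delays = []  # record the waits instead of actually sleeping
print(run_with_retry(flaky_query, sleep=delays.append))
print(delays)
```

Retrying is only safe when the operation can be repeated without side effects, so which queries may be retried is a decision for developers and IT to make together.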
Disaster Recovery • Switching to a backup site • Are DNS changes needed? • What references within the application need to be changed? • What does a switch-back look like? • Practice practice practice! • Don’t let your first failover test with an application be a real failure!
After the Meeting • What follow ups are there for management? • You’ve probably made some business-related decisions, make sure you have buy-in • When do we need to meet again? • Preferably before the next disaster
The Cooperative Firefight • IT is invariably on the front lines of an application failure • But when should development be brought in? • Post-mortem is often not enough
The Cooperative Firefight • Make a strategy to involve development during the firefight • They often have deep insight into how the application works and so can understand why it might fail • Just make sure they’re educated to not make the problem worse • This is NOT a time for fixing code
Summary • Have the meeting early • Repeat as necessary • Each group must learn from the other • Assist and seek assistance during a firefight