Writing Highly Available .NET Framework Applications
Future of CLR in .NET 2.0: Writing Highly Available .NET Framework Applications
Sriram Ramamurthy
Introduction
• Customize the CLR for application scenarios
• High degree of availability
• The process must live for a very long time
• Provide features that let the host handle exceptional conditions
• Application domains: isolation & unloading
• The host can remove erroneous code from the process & continue execution
Advantages
• Host runtime code – reliable
• Handle resource exhaustion & exceptional conditions
• How to handle add-ins that might not be written properly?
Goals
• Unload an application domain without leaking any resources
• Customize the handling of various exceptional conditions
  • E.g. System.OutOfMemoryException
• Customize the escalation policy
Application Domain Isolation & Process Lifetimes
• The process should not crash under exceptional conditions
• Why build such a complex infrastructure?
• Why not simply write managed code that handles all exceptions properly?
• Writing reliable managed code that handles every exception is impractical
Application Domain Isolation & Process Lifetimes
• CLR model of executing managed code
• Almost any line of code may throw an exception
• Memory allocation and other runtime operations happen at unexpected points:
  • Memory allocation
  • MSIL that still has to be JIT-compiled
  • Boxing of value types
  • E.g. Hashtable.Add("Entry1", 5); (the int 5 is boxed)
  • Using PInvoke – IntPtr semHandle = CreateSemaphore(…);
Application Domain Isolation & Process Lifetimes
• .NET Framework assemblies & extensible applications – if practical
• What about add-ins?
• CLR 1.0, 1.1 – no guarantee of high availability
• No such need at the time, due to the lack of CLR hosts
• Microsoft ASP.NET – process recycling model
ASP.NET & IIS
• Multiple processes – load balancing of incoming requests
• High demand – more processes are created
• Low demand – processes sit idle or are killed
• High scalability is achieved through the process recycling model
• Web applications – each request or connection is stateless
ASP.NET & IIS
• A process hangs or fails
• The process can be killed safely without affecting application state
• The user sees a "try again later" error message
• The user refreshes the browser and the request is resent to a different process
CLR Design Decisions
• Process recycling works well for web servers
• It does not work well for database servers
• High per-process state – starting a new process becomes expensive
• .NET 1.0, 1.1 – CLR host: ASP.NET
• .NET 2.0 – CLR host: SQL Server 2005
• The CLR must therefore support long-lived processes
Failure Escalation
• .NET 1.0, 1.1 – certain unhandled exceptions are swallowed
  • The process is not terminated
  • Silent failures & process corruption
• .NET 2.0 – all unhandled exceptions bubble up and affect the entire process
  • Makes failures more apparent & easier to debug
Escalation Policy - Failures
• Failure to allocate a resource:
  • Memory or other resources managed by the OS
• Failure to allocate a resource in a critical region of code:
  • A block of code that depends on state shared between multiple threads
  • Code that relies on state from another thread cannot be cleaned up by terminating only the running thread – integrity is not guaranteed
• E.g. SQL Server:
  • Abort the thread – on failure to allocate a resource
  • Unload the application domain – if the thread is in a critical region
Escalation Policy - Failures
• How does the CLR know whether code is in a critical region?
• The CLR detects that the executing code is waiting on a synchronization primitive (mutex, event, semaphore or lock)
• If a resource failure occurs in a region that depends on a synchronization primitive, the code is treated as being in a critical region
CLR Catch
• The CLR's ability to detect code waiting on a synchronization primitive is limited
• System.Threading – mutexes & events
• The CLR tracks locks created in the managed world
• Add-ins that are given full trust in CAS can use PInvoke to create synchronization primitives by calling Win32 APIs
• Such primitives are unknown to the CLR – outside the realm of managed code
• A failure there will not be reported as occurring in a critical region
Escalation Policy - Failures
• Fatal runtime error:
  • Internal error – the CLR cannot continue to execute managed code
  • Exit the process or disable the CLR
• Orphaned lock:
  • A synchronization primitive is created but never freed
  • E.g. a Mutex or Monitor held by a thread that is aborted before the lock is freed
  • The lock is orphaned and can never be freed
  • Results in resource exhaustion
Escalation Policy - Actions
• Throw an exception:
  • Default action for resource failures
  • E.g. StackOverflowException, OutOfMemoryException
• Gracefully abort thread:
  • Throws ThreadAbortException on the thread being terminated
  • The CLR gives the add-in a chance to free resources by running code in finally blocks
Escalation Policy - Actions
• Rudely abort thread:
  • No guarantee that the add-in's cleanup code runs
  • Used to remove threads that do not abort gracefully
• Gracefully unload application domain:
  • Gracefully abort all threads
  • Free CLR data structures associated with the domain
  • Finalizers are run for all objects in the domain
Escalation Policy - Actions
• Rudely unload application domain:
  • Rude abort of all threads
  • CLR data structures are freed
  • No guarantee that finalizers run
• Gracefully exit process:
  • Gracefully unload application domains
• Rudely exit process:
  • Rudely unload application domains
  • TerminateProcess – Win32 API
• Disable the CLR:
  • Prevents further execution of managed code
  • The process is still alive and can continue other work
  • E.g. the SQL Server process
Escalation Policy - Operations
• Specify timeouts for operations
• Indicate the actions that should occur
• Diagram – escalation policy of the SQL Server 2005 host
Critical Finalization, Safe Handles & Constrained Execution Region
• Ensure application domains unload without leaking resources
• Guarantee that native handles held by the domain are closed properly
• Framework classes – wrappers around native handles
  • E.g. System.IO, System.Net
• Dispose pattern & object finalizers – no guarantee that they run
Critical Finalization, Safe Handles & Constrained Execution Region
• Critical finalizer:
  • The CLR will always run it
  • Guaranteed to complete
  • System.Runtime.ConstrainedExecution.CriticalFinalizerObject
• Safe handle:
  • Wrapper around a native handle
  • BCL rewritten in .NET 2.0 to use safe handles
  • System.Runtime.InteropServices.SafeHandle
Critical Finalization, Safe Handles & Constrained Execution Region
• CER (constrained execution region):
  • How can a critical finalizer always run and always complete?
  • A CER is a block of code in which exceptions are never thrown due to lack of resources
• CLR steps:
  • Prepare the CER
  • Restrict the operations allowed inside the CER
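The "prepare, then restrict" steps can be sketched with the usual .NET Framework 2.0 idiom: a call to RuntimeHelpers.PrepareConstrainedRegions immediately before a try block marks the finally block as a CER. The class and member names below (CerSketch, IncrementInCer, Counter) are hypothetical, and the body is deliberately trivial. On .NET 5 and later the call is marked obsolete and has no effect, so treat this as an illustration for the .NET Framework only.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical illustration type; not part of the original talk.
public static class CerSketch
{
    public static int Counter;

    public static void IncrementInCer()
    {
        // Step 1 (prepare): ask the CLR to do the up-front work for the
        // region (for example, JIT-compiling it) before any of it executes.
        RuntimeHelpers.PrepareConstrainedRegions();
        try
        {
            // Intentionally empty: the protected work lives in the finally block.
        }
        finally
        {
            // Step 2 (restrict): code here should avoid operations that can
            // fail for lack of resources, such as allocating or boxing.
            Counter++;
        }
    }
}
```

Calling CerSketch.IncrementInCer() twice leaves Counter at 2. The point of the pattern is that, once the region is prepared, the update is not interrupted by a resource failure partway through.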
Guidelines for Writing Highly Available Managed Code
• Use safe handles to encapsulate native handles:
  • Use the classes in System.Runtime.InteropServices
  • Or write a custom class:
    • Create a class derived from System.Runtime.InteropServices.SafeHandle
    • Provide a constructor that lets callers associate a native handle
    • Implement the ReleaseHandle method
    • Implement the IsInvalid property
Safe Handles
• SafeHandle derives from CriticalFinalizerObject
• Classes derived from SafeHandle require permission to call unmanaged code
• The constructor has an ownsHandle parameter
• Annotate with SuppressUnmanagedCodeSecurityAttribute
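A minimal sketch of a custom safe handle, following the steps listed above: derive from SafeHandle, provide a constructor that takes the native handle and an ownsHandle flag, and implement IsInvalid and ReleaseHandle. The class name UnmanagedBufferHandle and its Allocate helper are hypothetical. To stay self-contained, the "native handle" here is a block of unmanaged memory from Marshal.AllocHGlobal rather than a Win32 handle, and the security attribute mentioned in the slides is omitted.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example type: wraps a block of unmanaged memory.
public sealed class UnmanagedBufferHandle : SafeHandle
{
    // ownsHandle = true means ReleaseHandle runs when the handle is
    // disposed or critically finalized.
    public UnmanagedBufferHandle(IntPtr existingHandle, bool ownsHandle)
        : base(IntPtr.Zero, ownsHandle)
    {
        SetHandle(existingHandle);
    }

    // Convenience factory (an assumption of this sketch). Real interop code
    // would normally let the PInvoke marshaler create the SafeHandle so that
    // no raw IntPtr is ever exposed between allocation and wrapping.
    public static UnmanagedBufferHandle Allocate(int byteCount)
    {
        return new UnmanagedBufferHandle(Marshal.AllocHGlobal(byteCount), true);
    }

    // The handle is invalid when it still holds the "no resource" value.
    public override bool IsInvalid
    {
        get { return handle == IntPtr.Zero; }
    }

    // Called at most once by the runtime, and only for a valid handle.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}
```

Typical use is a using block, for example `using (UnmanagedBufferHandle h = UnmanagedBufferHandle.Allocate(32)) { ... }`. Because SafeHandle derives from CriticalFinalizerObject, the memory is also released if the application domain is unloaded before Dispose is called.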
Guidelines for Writing Highly Available Managed Code
• Use only the synchronization primitives provided by .NET
• Code that is shared or in a critical region should use these primitives
• System.Threading – Monitor, Mutex, ReaderWriterLock
• Custom primitives – the CLR cannot detect the shared state, so the escalation policy cannot be applied
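As a small sketch of this guideline, the class below guards shared state with System.Threading.Monitor, a primitive the CLR can track, and takes the lock with a timeout so that a caller never waits forever. The names SharedCounter, TryIncrement and Value are hypothetical.

```csharp
using System;
using System.Threading;

// Hypothetical example type; not part of the original talk.
public sealed class SharedCounter
{
    private readonly object _gate = new object();
    private int _value;

    // Returns false instead of blocking indefinitely when the lock
    // cannot be acquired within the given timeout.
    public bool TryIncrement(TimeSpan timeout)
    {
        if (!Monitor.TryEnter(_gate, timeout))
        {
            return false;
        }
        try
        {
            _value++;
            return true;
        }
        finally
        {
            // Release in a finally block so a graceful thread abort
            // does not leave the lock orphaned.
            Monitor.Exit(_gate);
        }
    }

    public int Value
    {
        get { lock (_gate) { return _value; } }
    }
}
```

A caller would write something like `counter.TryIncrement(TimeSpan.FromSeconds(1))` and decide what to do when it returns false. A lock implemented through a hand-written spin loop or a Win32 call would hide the shared state from the CLR.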
Guidelines for Writing Highly Available Managed Code
• Ensure calls to unmanaged code return to the CLR:
  • A thread can enter a state that prevents the CLR from aborting it
  • E.g. code uses PInvoke to call an unmanaged API that waits infinitely on a synchronization primitive or otherwise blocks
  • The CLR has no control over unmanaged code
  • Provide timeout values
  • Regain control and let the CLR abort the thread
Guidelines for Writing Highly Available Managed Code
• Annotate your libraries with the host protection attribute:
  • Host protection lets a host block APIs that violate its programming model
  • E.g. prevent add-ins from using any API that allows them to share state across threads
  • Reduces resource failures and application domain unloads
  • Use the custom attribute HostProtectionAttribute