
The Failure of the London Ambulance Service

Michael McDougall, CIS 573, November 16th, 1999.

Presentation Transcript


  1. The Failure of the London Ambulance Service • Michael McDougall • CIS 573 • November 16th, 1999

  2. The Accident • On October 26th, 1992, the London Ambulance Service's despatch system failed. • Phones rang for up to 10 minutes. • Ambulance response times were delayed. • Some calls were lost. • On November 4th the system crashed completely. • Software was a major cause of the failures.

  3. Outline • London Ambulance Service Computer Aided Despatch (CAD) system • Background • Planning the system • Developing the system • How it failed • ISO 12207 – Software Development Standard • LAS failure w.r.t. ISO standard

  4. Background • The London Ambulance Service (LAS) was the largest ambulance service in the world. • Covers 6.8 million residents; the population is much higher during the daytime. • Serves 5000 patients a day. • Handles between 2000 and 2500 calls a day (more than 1 per minute). • Employs 2700 full-time staff.
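The "more than 1 per minute" figure on this slide can be checked with a quick calculation. The sketch below assumes calls are spread evenly over 24 hours, which the slides do not claim; real peaks would be higher. The function name is illustrative only.

```python
MINUTES_PER_DAY = 24 * 60  # 1440


def calls_per_minute(calls_per_day: int) -> float:
    """Average call rate, assuming calls arrive evenly through the day."""
    return calls_per_day / MINUTES_PER_DAY


low = calls_per_minute(2000)
high = calls_per_minute(2500)
print(f"{low:.2f} to {high:.2f} calls per minute")
```

Both ends of the quoted range come out above one call per minute on average.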

  5. Background • In 1990 the LAS was not meeting the U.K. standards for ambulance response times. • Other parts of the U.K. National Health Service had undergone reforms throughout the 1980s, but the LAS had not changed much since 1980. • Relations between staff and management were poor.

  6. Despatch system • The despatch system was responsible for: • Taking emergency calls • Deciding which ambulance to send • Sending information to ambulances • Managing allocation of ambulances

  7. Despatch system [Diagram: Take Call → (paper) → Collection Point → (paper) → Regional Allocator (RA) → (paper) → Despatcher → (voice) → Ambulance]

  8. Despatch system • The UK national standard required that this process, from taking the call to despatching an ambulance, take less than 3 minutes. • The LAS system in 1990 had a number of inefficiencies which made it impossible to meet the standard.

  9. Inefficiencies • Take Call: finding the location of an accident was often difficult and time consuming.

  10. Inefficiencies • Paper links between stages: moving pieces of paper took unnecessary time.

  11. Inefficiencies • Collection Point: identifying duplicate calls relied on human memory and was therefore slow and error prone.

  12. Inefficiencies • Regional Allocator: allocating ambulances was done by hand and relied on the allocator's memory. • Despatcher to ambulance: voice communication was slow.

  13. Improving the system • The LAS was under pressure from their superiors, MPs, the public and the media to improve performance. • LAS management decided that a Computer Aided Despatch system was the fastest way to improve service.

  14. The Plan • LAS wanted to radically change the despatch system. • In Autumn 1990 they began to write the system requirements for the new system.

  15. CAD system goals • Take Call: finding the location of an accident was often difficult and time consuming. • Goal: software connected to public telephones will locate incidents automatically.

  16. CAD system goals • Paper links between stages: moving pieces of paper took unnecessary time. • Goal: information will move through a network between workstations.

  17. CAD system goals • Collection Point: identifying duplicate calls relied on human memory and was therefore slow and error prone. • Goal: AI will try to identify duplicate calls.
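The slides do not say how the CAD system tried to spot duplicates. As a purely illustrative sketch, one simple rule is to flag two calls as likely duplicates when they arrive close together in both time and place. The `Call` type, the grid coordinates, and both thresholds below are hypothetical and are not taken from the LAS design.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Call:
    call_id: int
    minute: float  # minutes since the start of the shift (hypothetical)
    x_km: float    # hypothetical grid coordinates of the reported incident
    y_km: float


def likely_duplicates(calls, max_gap_min=10.0, max_dist_km=0.3):
    """Return pairs of call ids that are close in both time and place."""
    pairs = []
    ordered = sorted(calls, key=lambda c: c.minute)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.minute - a.minute > max_gap_min:
                break  # every later call is even further away in time
            if hypot(a.x_km - b.x_km, a.y_km - b.y_km) <= max_dist_km:
                pairs.append((a.call_id, b.call_id))
    return pairs


calls = [
    Call(1, 0.0, 2.0, 3.0),
    Call(2, 4.0, 2.1, 3.1),   # same incident reported again
    Call(3, 5.0, 9.0, 9.0),   # different place
    Call(4, 30.0, 2.0, 3.0),  # same place, much later
]
print(likely_duplicates(calls))
```

A rule like this only works if incident locations are recorded accurately, which ties it to the automatic location goal on slide 15.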

  18. CAD system goals • Regional Allocator: allocating ambulances was done by hand and relied on the allocator's memory. • Goal: allocation of the nearest ambulance will be done by computer in most cases.
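A minimal sketch of "allocate the nearest ambulance", assuming each vehicle reports a position and an availability flag. The `Ambulance` type, the coordinates, and the use of straight-line distance are assumptions made for illustration; the slides do not describe the actual allocation algorithm.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional


@dataclass
class Ambulance:
    unit_id: str
    x_km: float      # last reported position (hypothetical grid)
    y_km: float
    available: bool  # last reported status


def nearest_available(ambulances, x_km, y_km) -> Optional[Ambulance]:
    """Pick the available vehicle with the shortest straight-line distance."""
    candidates = [a for a in ambulances if a.available]
    if not candidates:
        return None
    return min(candidates, key=lambda a: hypot(a.x_km - x_km, a.y_km - y_km))


fleet = [
    Ambulance("A1", 1.0, 1.0, available=False),  # closest, but busy
    Ambulance("A2", 4.0, 5.0, available=True),
    Ambulance("A3", 2.0, 2.0, available=True),
]
chosen = nearest_available(fleet, 1.0, 1.0)
print(chosen.unit_id if chosen else "no vehicle available")
```

The result is only as good as the reported positions and statuses: if either is stale, the same function confidently picks the wrong vehicle.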

  19. CAD system goals • Despatcher to ambulance: voice communication was slow. • Goal: digital communication to and from ambulances.

  20. LAS ambitions • The new system was intended to mobilize an ambulance in less than 1 minute. • The system would be the most ambitious of its time. • A much more modest system had been planned for the LAS, but this was abandoned when it failed load-testing. • No independent audit of the system requirements was carried out.

  21. CAD requirements • LAS wanted a one-phase delivery • LAS decided that the system should cost £1,500,000 • LAS decided that the system would take 6 months to implement (though a project of this scale would usually take 18 months) • These requirements were not based on any analysis of the design. They appear to be arbitrary.

  22. Asking for tenders • In early 1991, LAS publicized the requirements and asked for bids • Many potential suppliers expressed doubts that the project could be finished on time with the required budget • LAS replied that the timetable was not negotiable

  23. Bids • Many potential suppliers submitted bids for the project • Most of the bids required more time and/or money • The bids were evaluated by LAS staff who had no experience with information technology

  24. Selecting a contractor • Only one bid was under £1,500,000 and promised implementation in 6 months. This bid was selected. • The winning bid was from Systems Options Ltd (SO), a small software house with no experience in safety-critical software. • SO had never managed a large project.

  25. The Contract • LAS signed a contract with SO in September 1991. • The system was supposed to go on-line on January 8th, 1992. • The contract did not specify who would act as project manager or who would be responsible for quality assurance. • No acceptance criteria were defined.

  26. Developing the system • Suppliers failed to meet deadlines • SO initially handled the project management, but this shifted to LAS as the project proceeded • No independent QA or audit was performed; LAS intended to save money by leaving QA to the suppliers

  27. Problem tracking • There was a formal procedure for reporting, analyzing and fixing bugs but… • this was often skipped so that the software could be changed quickly to satisfy users

  28. Training problems • Users were trained long before the system was on-line. The training was often out of date or forgotten by the time the system was available • Users were only trained for their part of the system

  29. Partial deployment • The complete system was not ready by January 8th; the system was deployed in pieces. • Bugs encountered: • The system needed perfect vehicle information • Every 53rd vehicle was unavailable • Workstations froze often (Windows 3.0) • Vehicle allocation could not be overridden • The wrong vehicle was sent

  30. Expected to fail • Interacting with the system was often awkward and frustrating • The LAS Staff had little confidence in the system

  31. No testing • No testing of the full system was ever done. • Nobody ever tested whether the radio system could handle the traffic. • Management did not know what resources were required to maintain service; the CAD system was supposed to give this information.

  32. Failure 1 • On October 26th the LAS management decided to switch to the full CAD system. This decision was made even though • the system was never tested • there were outstanding bugs which were considered ‘severe’

  33. Failure 1 • Initially the system worked; there were some errors but the staff were able to correct them. • As the load increased the system responded more and more slowly, and the ambulance location data became less and less reliable.

  34. Feedback problems [Diagram: feedback loop linking bad data, bad allocation, crew frustration, fewer available vehicles, longer waits for an ambulance, and more calls]

  35. Design errors • Some of the design decisions made it harder to recover from errors: • Allocators could only get information on an ambulance by reserving it • The control room layout made it hard for operators to communicate • The system could not handle operators overriding computer decisions

  36. Consequences • At the height of the accident, emergency calls were ringing for 10 minutes before being answered. • Some calls were lost because the list of calls was too big for the terminals. • 80% of ambulances took more than 15 minutes to respond (the usual figure was 67%).

  37. Consequences cont. • The media reported that patients died because of the failure. A coroner later concluded that this was false.

  38. Failure 2 • After the first failure LAS went back to the semi-automated system in use before October 26th. • On November 4th the system froze • The cause was a server that had run out of memory

  39. Memory leak • The server software had been changed 3 weeks before. This change introduced a small memory leak. • The server had been gradually running out of memory ever since.
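The slides do not describe the faulty change itself. As a generic illustration of how a small leak exhausts a long-running server, the sketch below contrasts a handler that keeps every message forever with one that keeps a bounded history. Both classes and the message format are hypothetical.

```python
from collections import deque


class LeakyServer:
    """Keeps every processed message forever, so memory grows with uptime."""

    def __init__(self):
        self.history = []

    def handle(self, message: str) -> None:
        self.history.append(message)  # never released


class BoundedServer:
    """Keeps only the most recent messages, so memory use stays flat."""

    def __init__(self, keep_last: int = 100):
        self.history = deque(maxlen=keep_last)

    def handle(self, message: str) -> None:
        self.history.append(message)  # oldest entry is dropped automatically


leaky, bounded = LeakyServer(), BoundedServer(keep_last=100)
for i in range(10_000):
    msg = f"vehicle update {i}"
    leaky.handle(msg)
    bounded.handle(msg)

print(len(leaky.history), len(bounded.history))
```

Each retained item is tiny, which is why a leak of this kind can pass short tests and only show up after weeks of continuous running.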

  40. Backup system • There was a backup server, but it was only designed to work in the full CAD system

  41. Consequences • At the time of the 2nd failure the load was light enough that the staff recovered all the information lost in the crash. • No calls were missed. • LAS went back to the original paper system

  42. Next class • ISO 12207 - Software life cycle processes • Would standards have prevented the LAS failure? • Are standards worth it?
