
Unreliable Computers

Unreliable Computers. 18 Oct, 1999. Park, Hyun-Joong. Contents: Prologue; What Is Reliability?; How Much Fails?; Case by Case; Why Are Complex Systems So Unreliable?; What Are Computer Scientists Doing about It?; Conclusions.



  1. Unreliable Computers 18 Oct, 1999 Park, Hyun-Joong

  2. Contents • Prologue • What Is Reliability? • How Much Fails? • Case by Case • Why Are Complex Systems So Unreliable? • What Are Computer Scientists Doing about It? • Conclusions

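  3. Prologue • It was midterm exam season. 09 went to the library. On the way, he stopped at a shop and bought a bag of Saewookkang (a shrimp-cracker snack). But at the library entrance, 09 realized that he had left his library pass at home. To enter the library, you have to scan a library pass, which carries a bar code. After some thought, 09 scanned the bar code on the Saewookkang bag. A moment later, the library attendant appeared with a startled look on his face. The attendant led 09 by the hand to the administration office. There, 09 looked at the office monitor and could not believe his eyes. • Name: Saewookkang Student ID: 500 won • Let us think about it together. Is this library's access control system reliable, or is it unreliable?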

  4. What Is Reliability? • “Reliability in a computer system means the probability that it will not fail during a given period of operation under given conditions” [ by Peter Mellor ] • Measures of reliability can include the number of failures per unit of operating time and the expected length of time that a system will operate without failing. • Systems fail not only in operation, but also at various stages in their design and development.
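
The two measures named on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the sample figures (4 failures in 2,000 hours) are made up, and the `reliability` function assumes a constant failure rate (the exponential model), which the slide does not state.

```python
import math


def failure_rate(failures: int, operating_hours: float) -> float:
    """Number of failures per hour of operation."""
    if operating_hours <= 0:
        raise ValueError("operating_hours must be positive")
    return failures / operating_hours


def mean_time_between_failures(failures: int, operating_hours: float) -> float:
    """Expected operating time between failures, in hours."""
    if failures <= 0:
        raise ValueError("need at least one observed failure")
    return operating_hours / failures


def reliability(rate: float, period_hours: float) -> float:
    """Probability of no failure during the period.

    Assumption: failures arrive at a constant rate (exponential model).
    """
    return math.exp(-rate * period_hours)


# Illustrative figures only: 4 failures observed in 2,000 operating hours.
rate = failure_rate(4, 2000.0)
print(f"failure rate: {rate} per hour")
print(f"MTBF: {mean_time_between_failures(4, 2000.0)} hours")
print(f"P(no failure in 100 hours): {reliability(rate, 100.0):.3f}")
```

Under this model, the chance of surviving one full mean time between failures is only about 37 percent, which is why a long MTBF alone does not guarantee a reliable mission.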

  5. How Much Fails? • A study by the U.S. government’s General Accounting Office • 9 S/W projects costing a total of $6.8 million • $3.2 million (47%) was delivered but not used • $2.0 million was paid for but not delivered • $1.3 million was abandoned or reworked • $200,000 was used only after substantial modification • $100,000 was used as delivered
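
The shares behind these figures can be checked with a short calculation. The sketch below uses only the dollar amounts listed on the slide; the dictionary keys and the function name are illustrative.

```python
# Amounts in units of $100,000, taken from the slide, so the sums stay exact.
OUTCOMES = {
    "delivered but not used": 32,
    "paid for but not delivered": 20,
    "abandoned or reworked": 13,
    "used after substantial modification": 2,
    "used": 1,
}


def shares(amounts: dict) -> dict:
    """Return each amount as a percentage of the total, rounded to 0.1."""
    total = sum(amounts.values())
    return {name: round(100 * value / total, 1) for name, value in amounts.items()}


total = sum(OUTCOMES.values())
print(f"total: ${total / 10:.1f} million")
for name, percent in shares(OUTCOMES).items():
    print(f"{name}: {percent}%")
```

The five categories add up to the $6.8 million total, and the software that was used without substantial modification accounts for less than 2 percent of the spending.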

  6. Case by Case • Medical Therapy(1) • A British hospital in 1992 • Almost 1,000 cancer patients were treated over ten years • They were given doses below the proper level • The cause was an undetected programming error • Only 447 of the 989 patients were still alive

  7. Medical Therapy(2) • Therac-25 X-ray machines • Killed one person, badly burned others, and left others with paralysis • Some body areas received between 17,000 and 25,000 rads

  8. Missile Systems(1) • Ballistic Missile Early Warning System • In 1960, the rise of the moon above the horizon was interpreted as a nuclear attack • A flock of geese was thought to be a group of inbound nuclear missiles • USS Vincennes • Shot down an Iranian Airbus • Caused the loss of 290 civilian lives • Aegis fleet defense system • Tracks hundreds of objects within 300 kilometers • Can engage about twenty targets

  9. Missile Systems(2) • U.S. Patriot missile systems • Used against Iraqi Scud ballistic missiles in the Persian Gulf War • An 80 percent success rate was claimed by the U.S. Army • Videotapes suggested that a mere 10 percent of the eight Scud warheads were hit • The timing hardware was developed in the 1960s • The clock drifted by a third of a second every hundred hours • It used a 24-bit register for the clock
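
The "third of a second every hundred hours" figure can be reproduced with a short calculation. The details below are not on the slide: they follow the widely cited account in which the clock counted tenths of a second and the value 0.1 was truncated to 23 bits after the binary point in the 24-bit register. Treat those details, and all names in the code, as assumptions.

```python
import math
from fractions import Fraction

# Assumption: the clock ticks every 0.1 s, and 0.1 is stored truncated
# to 23 fractional bits of a 24-bit fixed-point register.
FRACTION_BITS = 23
TICK = Fraction(1, 10)


def truncate(value: Fraction, bits: int) -> Fraction:
    """Chop a non-negative value to the given number of fractional bits."""
    scale = 2 ** bits
    return Fraction(math.floor(value * scale), scale)


def drift_seconds(hours: int) -> float:
    """Accumulated clock error after the given number of hours of uptime."""
    ticks = hours * 3600 * 10  # number of 0.1 s ticks
    error_per_tick = TICK - truncate(TICK, FRACTION_BITS)
    return float(ticks * error_per_tick)


print(f"error per tick: {float(TICK - truncate(TICK, FRACTION_BITS)):.3e} s")
print(f"drift after 100 hours: {drift_seconds(100):.4f} s")
```

A per-tick error of under a ten-millionth of a second looks harmless, but it accumulates linearly with uptime, and a Scud travels hundreds of meters in a third of a second.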

  10. Military Airplanes • UH-60 Blackhawk helicopter • Since 1982, 22 U.S. servicemen have died in 5 separate crashes • The machines either spun out of control or nose-dived into the ground • The UH-60 was inherently susceptible to radio interference in its computer-based fly-by-wire control system

  11. Why Are Complex Systems So Unreliable? • Questions • Why can’t we build computer systems with the same inherent reliability as bridges and buildings? • Why isn’t S/W guaranteed in the same way that other goods are? • Should we entrust computer systems with responsibility for the conduct of nuclear war, etc.?

  12. Nature of Computers & S/W • Complexity • Our ability to know all the states of a system is severely restricted

  13. Example of the Complexity • Suppose that the system is designed to monitor 100 binary signals • 100 different signal sources give 2^100, or about 1.27 × 10^30, possible input combinations • The execution path depends on the combination of signals • There may be at least 10,000 possible paths in a large-scale system • Testing every case at a rate of 100 per second would take about 4 × 10^24 years
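
The arithmetic on this slide can be checked directly. The sketch below assumes that every input combination is tested on every one of the 10,000 paths, which is the reading that reproduces the slide's figure; the function name and the 365.25-day year are choices made here.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_test_years(signals: int, paths: int, tests_per_second: float) -> float:
    """Years needed to test every signal combination on every path."""
    combinations = 2 ** signals  # each binary signal doubles the state count
    total_tests = combinations * paths
    return total_tests / tests_per_second / SECONDS_PER_YEAR


combinations = 2 ** 100
print(f"input combinations: {combinations:.2e}")
print(f"years to test: {exhaustive_test_years(100, 10_000, 100):.1e}")
```

For comparison, the age of the universe is on the order of 10^10 years, so exhaustive testing is out of reach by many orders of magnitude even if the test rate were vastly higher.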

  14. Ethical Issues • Are programmers free of obligation in the event of a substantial system failure? • If such a system kills someone, is the programmer a murderer? • Is the programmer guilty of simply providing a system that could not be guaranteed for application in a life-critical situation?

  15. S/W Engineering • Computer scientists attempt to answer the problems of reliability • By developing new tools for the design and development of S/W

  16. What Are Computer Scientists Doing about It? • A number of rubrics (1) • First, object-oriented programming (OOP) • Information hiding • High modularity • Predefining properties
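
As a small illustration of information hiding, the sketch below keeps a value behind an interface that refuses out-of-range settings. The class `DoseSetting`, its limit, and its method names are hypothetical and are not taken from any system mentioned in this presentation. Python marks the hidden fields by naming convention only (the leading underscore).

```python
class DoseSetting:
    """Hypothetical example: state is hidden, and every change is validated."""

    def __init__(self, max_rads: float) -> None:
        if max_rads <= 0:
            raise ValueError("max_rads must be positive")
        self._max_rads = max_rads  # hidden: callers should not touch this
        self._rads = 0.0

    @property
    def rads(self) -> float:
        """Read-only view of the current setting."""
        return self._rads

    def set_rads(self, value: float) -> None:
        """Change the setting, rejecting values outside the allowed range."""
        if not 0 <= value <= self._max_rads:
            raise ValueError(f"{value} rads is outside 0..{self._max_rads}")
        self._rads = value


setting = DoseSetting(max_rads=200.0)
setting.set_rads(150.0)
try:
    setting.set_rads(25_000.0)  # out of range, so the old value is kept
except ValueError as error:
    print("rejected:", error)
print("current setting:", setting.rads)
```

Because the rest of the program can change the value only through `set_rads`, the range check lives in one module instead of being repeated, or forgotten, at every call site.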

  17. A number of rubrics (2) • Second, the area of program verification and derivation • Mathematical techniques for proving programs correct once they have been written (verification) • Showing programs to be correct in the process of building them (derivation) • These techniques are not yet able to handle programs of even modest size
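
The kind of statement that verification tries to prove can be written down as a precondition, a loop invariant, and a postcondition. The sketch below checks them at run time with `assert`, which is not a proof: it only tests the inputs that are actually run. The integer square root function is an example chosen here, not one from the presentation.

```python
def integer_sqrt(n: int) -> int:
    """Largest integer r such that r * r <= n."""
    assert n >= 0, "precondition: n must be non-negative"
    r = 0
    while (r + 1) * (r + 1) <= n:
        assert r * r <= n  # loop invariant: r never overshoots
        r += 1
    # Postcondition: r is the floor of the square root of n.
    assert r * r <= n < (r + 1) * (r + 1)
    return r


print(integer_sqrt(15), integer_sqrt(16))
```

A verification tool would have to show that the invariant and postcondition hold for every non-negative `n`, which is what makes the approach hard to scale beyond small programs.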

  18. A number of rubrics (3) • Third, the development of programming environments • Operating systems & collections of S/W tools • Aid programmers by providing flexible and powerful ways of managing S/W development • make on UNIX systems • UML-related tools

  19. A number of rubrics (4) • Fourth, extensive & expensive debugging, quality control testing, and product proving • NASA space shuttle S/W • NASA spent $500 million on testing its space shuttle software • About $1,000 per line of code tested • Tokyo’s bullet train • 9 years of error-free operation • The developer relies on tried and tested S/W modules and heavy monitoring of programmers

  20. A number of rubrics (5) • Last, human management and project management • “personnel attributes and human resource activities provide by far the largest source of opportunity for improving S/W development productivity” [Boehm]

  21. Conclusions • The construction of S/W is a complex and difficult process • The existing techniques do not provide S/W of assured quality and reliability • There are clear signs that the S/W community is becoming sensitive to the possibility of a public backlash
