COMS W4156: Advanced Software Engineering Prof. Gail Kaiser Kaiser+4156@cs.columbia.edu http://bank.cs.columbia.edu/classes/cs4156/ COMS W4156
Topics covered in this lecture • Mythical Man Month • No Silver Bullet COMS W4156
Mythical Man Month COMS W4156
The Mythical Man-Month Fred Brooks, 1975 COMS W4156
Background • Fred Brooks began managing IBM’s OS/360 software development effort in 1964 • Brooks’ previous experience was in hardware design • OS/360 was (probably) the largest software system attempted up to that time • OS/360 “was late, took more memory than was planned, costs were several times the estimate, and it did not perform very well until several releases after the first” • Planned for release in 1965 (low-end) and 1966 (high-end); both finished a year late, so another, temporary OS was cobbled together so the hardware could be used • OS/360 retired in 1972 (see http://ldworen.net/fun/os360obit.html)
Background • The MMM book results from analyzing the OS/360 experience, which was quite different from the 360 hardware effort • MMM was first published in 1975, 20th anniversary edition published in 1995 • Turing Award in 1999 • Brooks is now a Professor at University of North Carolina, Chapel Hill, with research interests in graphics, user interfaces and virtual worlds • http://www.cs.unc.edu/~brooks/ COMS W4156
The Tar Pit • Developing large software systems is “sticky” - the more you fight it, the deeper you sink • Projects may emerge from the tar pit with running systems, but miss goals, schedules and budgets • “No one thing seems to cause the difficulty – any particular paw can be pulled away. But the accumulation of simultaneous and interacting factors brings slower and slower motion” • Analogy meant to convey that it is hard to discern the nature of the problem(s) facing software development COMS W4156
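Program to Product • [Figure: a program costs about 3× more as a programming product, about 3× more as a component of a programming system, and about 9× more as a programming systems product]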
Program to Product • A product (more useful than a program): • can be run, tested, repaired by anyone • usable in many environments on many sets of data. • must be tested and documented • To be a component in a programming system (collection of interacting programs like an OS): • input and output must conform in syntax and semantics to defined interfaces • must operate within resource budget • must be tested with other components to check integration (interactions grow exponentially) COMS W4156
What makes programming fun? • Sheer joy of creation • Pleasure of creating something useful for self and other people • Creating (and solving) puzzles • Life-long learning • Working in a tractable medium - software is extremely malleable COMS W4156
What’s not so fun about programming? • You have to be perfect! • You are rarely in complete control of the project • You have to depend on other people • Design is fun; debugging is just work • Testing takes too long! • The program may be obsolete by the time it’s finished
Why are software projects late? • Estimating techniques are poorly developed • Estimating techniques confuse effort with progress • Since we are uncertain of our estimates, we don’t stick to them • Progress is poorly monitored • When slippage is recognized, we add people (“like adding gasoline to a fire”) COMS W4156
Optimism • “All programmers are optimists” • “All will go well” with the project – thus we don’t plan for slippage • Each task has a nonzero probability of failure or slippage - probability that all will go well is near zero • One reason for optimism is the nature of creativity • Idea, implementation and interaction • The medium of creation constrains our ideas – in software the medium is extremely malleable, thus we expect few problems in implementation COMS W4156
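The point that the probability of everything going well is near zero can be illustrated with a little arithmetic. This is a minimal sketch that assumes tasks succeed independently and with the same probability; the function name and the sample figure of 0.95 are hypothetical and do not come from Brooks.

```python
def prob_all_on_time(p_task: float, n_tasks: int) -> float:
    """Probability that every one of n independent tasks finishes on time."""
    if not 0.0 <= p_task <= 1.0:
        raise ValueError("p_task must be between 0 and 1")
    if n_tasks < 0:
        raise ValueError("n_tasks must be non-negative")
    # Independence assumption: the joint probability is the product.
    return p_task ** n_tasks


# Even with a 95% chance per task, the chance that all succeed shrinks fast.
for n in (1, 10, 50, 100):
    print(n, round(prob_all_on_time(0.95, n), 4))
```

Under these assumptions, a project of 100 tasks has well under a 1% chance of seeing no slippage at all, which is why a plan with no allowance for slippage is unrealistic.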
The Mythical Man-Month • Cost does indeed vary as the product of the number of people and the number of months • Progress does not! • The unit of man-month [person-month] implies that people and months are interchangeable • This is only true when a task can be partitioned among many workers with no communication among them • When a task is sequential, more effort has no effect on the schedule – many tasks in software engineering have sequential constraints COMS W4156
MMM • Most tasks require communication among workers • Communication consists of • Training • Sharing information (intercommunication) • Training affects effort at worst linearly • Intercommunication adds n(n-1)/2 to effort if each worker must communicate with every other worker COMS W4156
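The n(n-1)/2 term is simply the number of distinct pairs among n workers. A small sketch (the function name is illustrative) shows how quickly it outgrows the head count:

```python
import math


def intercommunication_pairs(n: int) -> int:
    """Number of worker pairs when each of n workers talks to every other."""
    # math.comb(n, 2) equals n * (n - 1) / 2 and is 0 for n < 2.
    return math.comb(n, 2)


for n in (2, 5, 10, 50):
    print(f"{n:>3} workers -> {intercommunication_pairs(n):>5} pairs")
```

Doubling the team roughly quadruples the number of pairs, whereas training cost grows at worst linearly.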
Intercommunication effort COMS W4156
Comparison graphs • Adding more people then lengthens, not shortens, the schedule • [Figure: two plots of months against people, labeled “No communication” and “With communication”]
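The shape of the two curves can be reproduced with a toy model. This is a sketch under stated assumptions, not Brooks’ own formula: the work is taken to be perfectly partitionable, each pair of workers is assumed to cost a fixed communication overhead, and the numbers (120 person-months of work, 0.5 person-months per pair) are made up for illustration.

```python
def months_to_finish(work: float, people: int, overhead_per_pair: float) -> float:
    """Calendar months for a toy project: divisible work plus pairwise overhead.

    work              -- person-months of useful work (assumed divisible)
    overhead_per_pair -- extra person-months per communicating pair (assumed)
    """
    if people < 1:
        raise ValueError("need at least one person")
    pairs = people * (people - 1) / 2
    total_effort = work + overhead_per_pair * pairs  # person-months
    return total_effort / people                     # spread over the team


for n in (1, 5, 10, 20, 40, 80):
    no_comm = months_to_finish(120, n, 0.0)
    with_comm = months_to_finish(120, n, 0.5)
    print(f"{n:>2} people: {no_comm:6.2f} months without, {with_comm:6.2f} with communication")
```

Without communication the schedule keeps shrinking as people are added; with the pairwise overhead it reaches a minimum and then grows again.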
Scheduling • Brooks’ rule of thumb • 1/3 planning • 1/6 coding • 1/4 component test • 1/4 system test • In looking at other projects, Brooks found that few planned for 50% testing • But most actually spent 50% of their time testing (writing test harnesses can be almost as much work as or sometimes more than writing the actual code) • Many projects were on schedule until testing began COMS W4156
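The rule of thumb can be checked and applied with exact fractions. A minimal sketch follows; the dictionary and function names are illustrative, and the 12-month schedule is an arbitrary example.

```python
from fractions import Fraction

# Shares of the schedule from the rule of thumb on this slide.
BROOKS_SPLIT = {
    "planning": Fraction(1, 3),
    "coding": Fraction(1, 6),
    "component test": Fraction(1, 4),
    "system test": Fraction(1, 4),
}


def allocate(total_months: int) -> dict:
    """Split a schedule of total_months across the four phases."""
    return {phase: share * total_months for phase, share in BROOKS_SPLIT.items()}


testing_share = BROOKS_SPLIT["component test"] + BROOKS_SPLIT["system test"]
print("shares sum to", sum(BROOKS_SPLIT.values()))
print("testing share:", testing_share)
for phase, months in allocate(12).items():
    print(f"{phase}: {months} months")
```

The shares add up to exactly one, and the two test phases together take half the schedule, three times as much as coding.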
Development Team • How should the development team be arranged? • The problem: good programmers are much better than poor programmers • Typically 10 times better in productivity • Typically 5 times better in terms of program elegance and resource consumption COMS W4156
The dilemma of team size • Consider a 200-person project with 25 experienced managers (experienced at programming, not necessarily at managing) • Productivity differences argue for firing the 175 workers and using the 25 managers as the team • OS/360 had over 1000 people working on it
Two needs to be reconciled • For efficiency and conceptual integrity a small team is preferred • To tackle large systems considerable personnel resources are needed • One solution: Harlan Mills’ Surgical Team approach – one person performs the work (with a co-pilot), all others perform support tasks
Harlan Mills’ Surgical Team • Surgeon – chief programmer • Co-pilot – like surgeon but less experienced (training, insurance) • Administrator – relieves surgeon of administrative tasks • Editor – proof-reads and copy-edits documentation • 2 Secretaries – support administrator and editor • Program clerk – tracks versions • Toolsmith – develops tools, utilities for surgeon • Tester • Language lawyer (shared among multiple projects)
How is this different? • Normally, work is divided equally among team members – now only surgeon and copilot divide the work • Normally each person has equal say – now surgeon is absolute authority • Note communication paths are reduced • Normally 10 people → 45 paths • Surgical Team → 15 paths • Many roles on team now automated (no one hires 9 people to support each chief programmer) COMS W4156
How does this scale? • Reconsider the 200 person team – communication paths →19,900! • Create 20 ten-person surgical teams • Now only 20 surgeons must work together – 20 people → 190 paths • The key problem is ensuring conceptual integrity of the design COMS W4156
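The path counts on this slide can be reproduced directly. The sketch below also estimates a total for the surgical organization by adding the 15 paths per team quoted on the previous slide to the paths among surgeons; that sum is an assumption made for illustration, not a figure from the lecture, and the function names are hypothetical.

```python
def full_mesh_paths(n: int) -> int:
    """Communication paths when each of n people talks to every other one."""
    return n * (n - 1) // 2


def surgical_org_paths(teams: int, paths_per_team: int) -> int:
    """Paths among the team surgeons plus the paths inside each team."""
    return full_mesh_paths(teams) + teams * paths_per_team


flat = full_mesh_paths(200)              # everyone talks to everyone
surgeons_only = full_mesh_paths(20)      # only the 20 surgeons coordinate
with_teams = surgical_org_paths(20, 15)  # assumed total, 15 paths per team

print("flat 200-person project:", flat)
print("20 surgeons:", surgeons_only)
print("20 surgical teams, internal paths included:", with_teams)
```

Even when the paths inside each team are counted, the total stays far below the flat organization, which is the argument for coordinating only through the surgeons.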
Conceptual Integrity • Brooks’ analogy: Cathedrals • Many cathedrals (e.g., Worms Cathedral) consist of contrasting design ideas • But Reims Cathedral was the result of 8 generations of builders repressing their own ideas and desires to build a cathedral that embodies the key design elements of the original architect COMS W4156
Worms vs. Reims COMS W4156
Conceptual Integrity • With respect to software, design by too many people results in conceptual disunity of a system, making the program hard to understand and use • Better to leave functionality out of a system if it causes the conceptual integrity of the design to break • Ease of use is enhanced only if the functionality provides more power than it takes to learn (and remember) how to use it in the first place COMS W4156
Software Architects as Aristocrats • Conceptual integrity requires that the design be the product of one mind • The architect (or surgeon) has ultimate authority - and ultimate responsibility • Does this imply too much power for architects? • Architect sets the structure of the system • Developers can then be creative in how system is implemented COMS W4156
The Second-System Effect • An engineer is careful in designing his/her first system – he/she realizes that they are working in uncharted territory • But in the second system, the engineer has some experience and wants to throw everything into the design • Symptoms: • Functional embellishment – to an unnecessary degree • Optimizations to obsolete functionality • OS/360 linker had a sophisticated program overlay functionality, but the architecture no longer depended on overlays – resulting in unnecessarily slow linkage COMS W4156
How to avoid it? • Employ extra self-discipline • Avoid functional ornamentation • Be aware of changes in assumptions • Strive for conceptual integrity • How do managers avoid it? • Insist on a senior architect with more than two systems under his/her belt COMS W4156
Communicating design decisions • Written specifications - “The Manual” • Answers questions • Conceptual integrity • Demands high precision • Telephone log – make sure to capture all design decisions [today: IM, email, wiki, blog, etc.] • Product test – external test group keeps implementation honest COMS W4156
Formal Definitions • Natural language is not precise • Notations help express precise semantics – however, natural language is often needed to “explain” the meaning to the uninitiated • What about using an implementation as the formal definition? • Advantages: precise specification • Disadvantages: Over-prescription, potential for inelegance, may be modified • Inconsistencies between multiple implementations can identify problems in the specs – with only one implementation it’s easier to change the manual
Why did the Tower of Babel Fail? • Communication (lack of it) made it impossible to coordinate • How do you communicate in large project teams? – informal, meetings, workbook • Workbook • Structure placed on project’s documentation • Technical prose lives a long time, best to get it structured formally from the beginning, also helps with distribution of information COMS W4156
[Genesis 11:1-9] 1 And the whole earth was of one language, and of one speech. 2 And it came to pass, as they journeyed from the east, that they found a plain in the land of Shinar; and they dwelt there. 3 And they said one to another, Come, let us make brick, and burn them thoroughly. And they had brick for stone, and slime had they for mortar. 4 And they said, Come, let us build us a city and a tower, whose top may reach unto heaven; and let us make us a name, lest we be scattered abroad upon the face of the whole earth. 5 And the Lord came down to see the city and the tower, which the children builded. 6 And the Lord said, "If as one people speaking the same language they have begun to do this, then nothing they plan to do will be impossible for them. 7 Come, let us go down, and there confound their language, that they may not understand one another's speech. 8 So the Lord scattered them abroad from thence upon the face of all the earth: and they left off to build the city. 9 Therefore is the name of it called Babel (confusion); because the Lord did there confound the language of all the earth: and from thence did the Lord scatter them abroad upon the face of all the earth Tower of Babel COMS W4156
Reducing communication paths • Communication needs are reduced by • Division of labor • Specialization of function • A tree structure often results from applying this principle • However this serves power structures better than communication (since communication between siblings often needed) • So communication structure is often a network COMS W4156
Organizational structure • Brooks outlines • Mission, producer, director, schedule, division of labor, interfaces between the parts • The (then) novel suggestions are the producer and the director • Producer manages project and obtains resources (product manager) • Director manages technical detail (program manager) COMS W4156
Calling the shot • Estimates for programming in the small don’t scale • Need to add planning, documentation, testing, system integration and training in large projects • Effort grows faster than linearly with program size – the studies Brooks cites suggest roughly a power law with an exponent of about 1.5, rather than a true exponential
Plan to throw one away • You will anyway • Consider chemical engineers: scaling a laboratory result up to actual (and practical) use requires a pilot step – e.g., desalting water 10,000 gallons/day first, then 2,000,000 • Software projects typically plan to deliver the first thing they build to customers • Typically hard to use, buggy, inefficient, etc. • Experience shows you will discard a lot of the first implementation anyway (wrt version 2) COMS W4156
Rapid prototypes • Help gain early feedback • Intended from start to be thrown away • Management question: • Plan to build a system to throw away • Experience gained, feedback can be applied • Plan to build a throwaway - that is delivered to the customer • User is aggravated and demands support • Brooks focused on planning for change
Change • Causes: • Both the actual need and the user’s perception of that need will change as programs are built, tested, used • Other factors – hardware, assumptions, environment • Handling: • Modularization • Precise and complete interfaces • Standard calling sequences • Complete documentation • High-level languages • Configuration management COMS W4156
Organizational issues • Culture must be conducive to documenting decisions, otherwise nothing gets documented • Other points to consider • Job titles • Keeping senior people trained COMS W4156
Maintenance • Two steps forward and one step back • Lifecycle of bugs • Fixing a bug has a chance of adding another, lots of regression testing needed • One step forward and one step back • Maintenance is an entropy-increasing process • As maintenance proceeds, system is less structured than before, conceptual integrity degrades (foreshadows refactoring) COMS W4156
Hatching a catastrophe • A project gets to be a year late one day at a time • Major calamities are “easy” to handle – whole team pulls together and solves it • Day to day slippage is harder to recognize COMS W4156
How to keep on track? • Have a schedule • Overestimates come steadily down as the activity proceeds • Underestimates do not change until scheduled time draws near • Have checkable milestones • Not “coding complete” • But “specifications signed by architects” • Or “debugged component passes all tests” • Track the critical path – who is waiting on whom to finish what • Address the “status disclosure problem” • Managers must distinguish between action meetings and status meetings: if inappropriate action is taken in response to a status report, it discourages honest status reports
Summary of MMM • Adding more people to a late project makes it later • Reduce communication paths • Design by small number of top-notch people, but avoid second system effect • Plan throwaway • Track schedule COMS W4156
No Silver Bullet COMS W4156
No Silver Bullet – Essence and Accident in Software Engineering Fred Brooks, 1986 COMS W4156
No Silver Bullet “There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity” COMS W4156
Why a Silver Bullet? • http://www.fvza.org/wmyths.html COMS W4156
Past Gains in Software Productivity • Came from removing artificial barriers such as severe hardware constraints, awkward programming languages, lack of machine time • Unless at least 9/10ths of what software engineers do today is still devoted to the accidental (incidental), shrinking all the accidental activities to zero time will not give an order of magnitude improvement
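The 9/10ths threshold follows from simple arithmetic: if a fraction f of the work is accidental and it shrinks to zero time, the remaining work is 1 - f, so the best possible improvement is 1 / (1 - f). A minimal sketch, with an illustrative function name and made-up sample fractions:

```python
def max_improvement(accidental_fraction: float) -> float:
    """Best-case productivity gain if all accidental work took zero time."""
    if not 0.0 <= accidental_fraction < 1.0:
        raise ValueError("accidental_fraction must be in [0, 1)")
    # Only the essential share (1 - f) of the original effort remains.
    return 1.0 / (1.0 - accidental_fraction)


for f in (0.5, 0.8, 0.9, 0.95):
    print(f"accidental share {f:.2f}: at most {max_improvement(f):.1f}x")
```

Only when the accidental share reaches nine tenths does the bound reach a factor of ten; at one half it is merely a factor of two.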