This work assesses the modeling and solution techniques used in computational experiments, focusing on theoretical and empirical approaches. It discusses common shortcomings, the differentiation between algorithms and their implementations, and emphasizes the need for reproducibility in results. Performance measures including efficiency, robustness, and accuracy are examined, alongside the advantages and disadvantages of benchmark sets and random problem generation. Special attention is given to distinguishing competitive testing from scientific testing, advocating for experiments that enhance understanding within the field.
Evaluation of modeling and solution techniques
• Theoretical
  • worst case, average case, partial orders
  • shortcomings:
    • worst case seldom occurs
    • unrealistic assumptions
• Empirical
  • computational experiments
Principles
• Results presented must be sufficient to justify claims
  • e.g., do not confuse an algorithm with an implementation
• Sufficient detail to allow reproducibility of results
  • give actual code
  • keep an experimental notebook
Test problems
• Benchmark sets
  • from practice
  • specially constructed
• Randomly generated (see the generator sketch below)
  • simple random
  • modeled on a real problem
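As an illustration of "simple random" generation, the sketch below draws a random binary CSP in the classic ⟨n, d, p1, p2⟩ style, where p1 controls constraint density and p2 controls constraint tightness. The function name and interface are hypothetical, not taken from the slides.

```python
import random

def random_binary_csp(n_vars, domain_size, p1, p2, seed=None):
    """Hypothetical generator for random binary CSPs (<n, d, p1, p2> style).

    p1 = probability that a constraint exists between a pair of variables,
    p2 = probability that a given value pair is forbidden by that constraint.
    """
    rng = random.Random(seed)
    constraints = {}
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            if rng.random() < p1:                       # constraint density
                forbidden = {(a, b)
                             for a in range(domain_size)
                             for b in range(domain_size)
                             if rng.random() < p2}      # constraint tightness
                constraints[(i, j)] = forbidden
    return list(range(n_vars)), domain_size, constraints
```

Sweeping p2 while holding n, d, and p1 fixed explores a whole slice of the problem space, which hand-collected benchmark sets cannot do.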
Advantages & disadvantages • Benchmark sets • sometimes representative of real world • expensive to collect, thus sets often small • biased • Randomly generated • can explore entire space of problems • allows statistically valid conclusions • lack of realism
Performance measures
• Efficiency (see the instrumentation sketch below)
  • CPU time
  • nodes visited
  • constraint checks
• Robustness, scope
  • class of problems that can be effectively solved
• Scalability
  • size of problems
• Accuracy, solution quality
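One way to keep the efficiency measures comparable across runs is to collect them in a single structure. The sketch below wraps a hypothetical solver; `solve` and the convention that it increments the counters are assumptions, not part of the slides. Note that `time.process_time()` measures CPU time rather than wall-clock time, matching the "CPU time" bullet.

```python
import time
from dataclasses import dataclass

@dataclass
class Stats:
    nodes: int = 0               # machine-independent measure
    constraint_checks: int = 0   # machine-independent measure
    cpu_seconds: float = 0.0     # machine-dependent measure

def run_instrumented(solve, instance):
    """Run a hypothetical solver and report the efficiency measures uniformly."""
    stats = Stats()
    start = time.process_time()        # CPU time, not wall-clock time
    result = solve(instance, stats)    # solver assumed to increment the counters
    stats.cpu_seconds = time.process_time() - start
    return result, stats
```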
Performance claims
• A claim that a new algorithm is feasible and promising
  • requires preliminary testing on several hand-picked problems
• A claim that an algorithm/implementation is better
  • requires a detailed comparison with prominent methods already available, on a broad range of problems
Pitfalls
• Straw algorithms
  • only compare against the “best”
• Easy problems
• Unfair comparisons
  • different languages, programmers, optimization efforts, machines, ...
• Test set tuning
  • e.g., parameter tuning
  • solution: divide into “training” and test sets (see the sketch below)
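The "training vs. test set" remedy for test-set tuning amounts to fixing a single split before any tuning starts. The sketch below is a minimal version; `tune`, `evaluate`, and `all_instances` are hypothetical placeholders.

```python
import random

def train_test_split(instances, test_fraction=0.5, seed=0):
    """Split instances once; tune parameters only on the training half,
    and report results only on the untouched test half."""
    rng = random.Random(seed)
    shuffled = list(instances)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical usage:
# train, test = train_test_split(all_instances)
# best_params = tune(solver, train)      # tuning never sees the test set
# report(evaluate(solver, test, best_params))
```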
Competitive testing vs. scientific testing
• Drawbacks of competitive testing
  • enormous amount of work
  • dictates implementation language
  • tells us which algorithm is better but not why
  • negative results are considered uninteresting
• Scientific testing
  • experiments designed to contribute to understanding (see the sketch below)
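A scientific experiment, as opposed to a competitive one, varies the factors whose effect we want to understand and holds everything else fixed. The full-factorial sketch below is one way to set that up; the factor names, levels, and the `solve`/`generate_instance` callables are illustrative assumptions.

```python
import itertools

# Factors we want to understand (hypothetical names and levels).
factors = {
    "variable_ordering":  ["static", "dom/deg"],
    "propagation":        ["forward_check", "arc_consistency"],
    "instance_tightness": [0.3, 0.5, 0.7],
}

def run_experiment(solve, generate_instance, repetitions=20):
    """Run every factor combination on matched instances and record each run,
    so the effect of each factor can be isolated afterwards."""
    records = []
    for combo in itertools.product(*factors.values()):
        config = dict(zip(factors.keys(), combo))
        for rep in range(repetitions):
            instance = generate_instance(config["instance_tightness"], seed=rep)
            # solve is assumed to return a dict of measures (time, nodes, ...)
            records.append({**config, "rep": rep, **solve(instance, config)})
    return records  # analyse factor effects, not just "which solver won"
```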
References
• Crowder, H.P., Dembo, R.S., and Mulvey, J.M., “On reporting computational experiments with mathematical software,” ACM Transactions on Mathematical Software, 5:193-203, 1979.
• Jackson, R.H.F., Boggs, P.T., Nash, S.G., and Powell, S., “Guidelines for reporting results of computational experiments,” Mathematical Programming, 49:413-426, 1990.
• Hooker, J.N., “Needed: An empirical science of algorithms,” 1993.
• Hooker, J.N., “Testing heuristics: We have it all wrong,” 1995.