
SteerBench: a benchmark suite for evaluating steering behaviors


Presentation Transcript


  1. SteerBench: a benchmark suite for evaluating steering behaviors Authors: Singh, Kapadia, Faloutsos, Reinman Presented by: Jessica Siewert

  2. Content of presentation • Introduction • Previous work • The Method • Assessment

  3. Introduction – Context and motivation • Steering of agents • Objective comparison • Standard? • Test cases and scoring, user evaluation • Metric scoring • Demonstration

  4. Introduction – Previous work There was not really anything comparable yet (Nov ‘08)

  5. Introduction - Promises • Evaluate objectively • Help researchers • Working towards a standard for evaluation • Take into account: • Cognitive decisions • Situation-specific aspects

  6. The test cases • Simple validation scenarios • Basic one-on-one interactions • Agent interactions including obstacles • Group interactions • Large-scale scenarios
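To make the layered test cases concrete, here is a minimal sketch of how a single scenario could be written down. The Python field names below are hypothetical; SteerBench specifies its scenarios in its own format, so this only illustrates the kind of information (initial states, goals, obstacles) a test case carries.

```python
# Hypothetical sketch of a SteerBench-style test case: initial agent states,
# goal targets, and static obstacles. Field names are illustrative only and
# do not reflect the suite's actual file format.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class AgentSpec:
    position: Tuple[float, float]   # initial (x, z) location on the ground plane
    direction: Tuple[float, float]  # initial facing direction
    goal: Tuple[float, float]       # target the agent must reach

@dataclass
class TestCase:
    name: str
    agents: List[AgentSpec]
    # axis-aligned box obstacles as (x_min, x_max, z_min, z_max)
    obstacles: List[Tuple[float, float, float, float]] = field(default_factory=list)

# Example: a basic one-on-one interaction where two agents walk toward each other.
head_on = TestCase(
    name="oncoming-agents",
    agents=[
        AgentSpec(position=(-10.0, 0.0), direction=(1.0, 0.0), goal=(10.0, 0.0)),
        AgentSpec(position=(10.0, 0.0), direction=(-1.0, 0.0), goal=(-10.0, 0.0)),
    ],
)
```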

  7. The user’s opinion • Rank on overall score across test cases (for comparing algorithms) • Rank algorithms based on • a single case, or • one agent’s behavior • Pass/fail • Visually inspect results • Examine detailed metrics of the performance

  8. The metric • Number of collisions • Time efficiency • Effort efficiency • Penalties?
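A minimal sketch of how these per-agent measurements could be folded into one number. The weights and the linear combination below are assumptions for illustration; they are not the suite's actual scoring formula.

```python
# Illustrative composite score (lower is better): penalize collisions heavily,
# then add time and effort. Weights are invented, not SteerBench's own values.
def benchmark_score(num_collisions: int,
                    total_time: float,    # seconds until the agent reaches its goal
                    total_effort: float,  # accumulated energy spent while steering
                    w_collision: float = 50.0,
                    w_time: float = 1.0,
                    w_effort: float = 1.0) -> float:
    return (w_collision * num_collisions
            + w_time * total_time
            + w_effort * total_effort)

# Example: an agent with 2 collisions, 30 s to goal, 120 units of effort.
print(benchmark_score(2, 30.0, 120.0))  # 250.0
```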

  9. Movies…

  10. Developments since then • Ioannis Karamouzas, Peter Heil, Pascal van Beek, Mark H. Overmars, A Predictive Collision Avoidance Model for Pedestrian Simulation, Proceedings of the 2nd International Workshop on Motion in Games, November 21-24, 2009, Zeist, The Netherlands • Shawn Singh, Mubbasir Kapadia, Billy Hewlett, Glenn Reinman, Petros Faloutsos, A modular framework for adaptive agent-based steering, Symposium on Interactive 3D Graphics and Games, February 18-20, 2011, San Francisco, California • Suiping Zhou, Dan Chen, Wentong Cai, Linbo Luo, Malcolm Yoke Hean Low, Feng Tian, Victor Su-Han Tay, Darren Wee Sze Ong, Benjamin D. Hamilton, Crowd modeling and simulation technologies, ACM Transactions on Modeling and Computer Simulation (TOMACS), v.20 n.4, p.1-35, October 2010

  11. Experiments – Claim recall • Evaluate objectively • Help researchers • Working towards a standard for evaluation

  12. Assessment – good things • All the measured variables seem logical (too logical, perhaps?) • Extensive variable set, with option to expand • Customized evaluation • Cheating not allowed • collision penalties • fail constraint • goal constraint • Layered set of test cases

  13. Assessment • The measurements all seem to be approximately the same • User test makes the difference? • Who are these users? • Examine, inspect: all vague terms • What about the objective of objectivity?

  14. Assessment • How good is it to be general? • How general/specific is this method? • Time efficiency vs. effort efficiency • Should it be blind to the algorithm itself? • Penalties, fail and goal constraints are not specified!

  15. Assessment – scoring (1/2) • The test cases are clearly specified. But it is not specified HOW a GOOD agent SHOULD react, though they say there is such a specification • How can you get cognitive decisions out of only position, direction and a goal?

  16. Assessment – scoring (2/2) • “Scoring not intended to be a proof of an algorithm’s effectiveness.” • How do you interpret scores, and who wins? • “B is slightly better on average, but A has the highest scores.”
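The quoted situation is easy to reproduce with invented numbers (toy example only, higher score = better here): averaging over test cases and looking at the single best case can crown different winners.

```python
# Toy illustration of the interpretation problem quoted on the slide:
# B is slightly better on average, yet A holds the single highest score.
scores_A = [9.0, 3.0, 4.0]  # per-test-case scores for algorithm A
scores_B = [6.0, 6.0, 5.0]  # per-test-case scores for algorithm B

mean_A = sum(scores_A) / len(scores_A)
mean_B = sum(scores_B) / len(scores_B)

print(f"A: mean={mean_A:.2f}, best={max(scores_A)}")  # A: mean=5.33, best=9.0
print(f"B: mean={mean_B:.2f}, best={max(scores_B)}")  # B: mean=5.67, best=6.0
```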

  17. Assessment – final questions • Can this method become a standard? • What if someone claims to be so innovative this standard does not apply to them? • Nice first try, though!
