Advancing Simulation Modeling through Case-Based Reasoning: Addressing Challenges and Implementing Solutions

Case-Based Reasoning for Simulation Modeling: Issues and Challenges
Ming Zhou, Ph.D., Professor, Center for Systems Modeling & Simulation, Indiana State University, Terre Haute, IN 47809, USA
In collaboration with Zhimin Chen, Ph.D., Professor, School of Management, Shenzhen University, P.R. China

Reasons simulation is underutilized
• Simulation modeling is a time-consuming and knowledge/information-intensive process
• Most models developed are customized, “rigid” models that cannot be reused or easily adapted to other problems, even similar ones (Arons 1999, 2000; Zhou 2004)
• Conceptual modeling is a critical step that directly affects the quality and efficiency of simulation projects, yet it is hardly supported by current technology and remains a difficult, ad hoc process that depends on the skill and experience of individual modelers (Mclean, 2001; Robinson, 2004)

Efforts made to address the difficulties
• Develop standard templates for specific classes of simulation problems (pattern-based approach)
• Develop modularized models or a component-based modeling approach
• Develop standard interfaces that integrate simulation with other application systems
• Develop neutral data formats to facilitate model data transfer between different systems (to address interoperability)

Studies on knowledge-based simulation
• Develop “extended programming languages”, i.e. general-purpose programming languages augmented with simulation-oriented language constructs
• Develop specialized simulation languages based on flow-chart type logic
• Develop better interfaces to create a more interactive modeling environment (as opposed to “batch” mode)

Continued development
• Object-oriented representation of simulation concepts has been emphasized since the 1980s
• Case-based approach: adapt existing (implemented) models for new applications
• In the software vendor industry, emphasis has been placed on developing high-level simulators for special application systems

Knowledge involved in simulation modeling
[Diagram; labels shown: Application, Simulation, Implementation; bounded by a context …]

Two phases of modeling
• Involving different types of knowledge and different styles of reasoning…
[Diagram; labels shown: Application Problem Definition, Simulation Problem Definition, Conceptualization, Conceptual Simulation Model (CSM), Implementation Simulation Model (ISM), Implementation]

Deeper examination of the SM process
• SM is an interactive decision-making process
• Problems in SM are usually unstructured or semi-structured, i.e. the logical relations between decision factors are not well defined or clear
• SM is a knowledge/information-intensive process
• Information and knowledge are used in a contextual manner, i.e. they are tied to the unique structural and behavioral characteristics of a specific application. This context is very important in deriving solutions to similar problems but very difficult to store in conventional databases/information systems
• A great amount of knowledge/information is embedded in solved simulation cases. A popular approach in practice is to develop a new simulation model by retrieving “old” models developed for similar, previously solved problems and modifying them to solve the new case

Problems with traditional KBS: rule-based expert systems (Watson, Leake, Bachant, …)
• Knowledge acquisition: it is difficult to obtain generalized knowledge from SM processes because of the lack of basic understanding and the unstructured nature of the problem domain. When the problem domain is not well defined, the rules formulated are imperfect and produce unreliable solutions
• Knowledge elicitation: it is difficult and laborious to extract empirical knowledge from human experts and formalize it into decision rules that characterize expert performance. Nevertheless, many rule-based systems assume that expert knowledge is available and can be elicited and organized efficiently

Problems with traditional KBS: rule-based expert systems (continued)
• Knowledge maintenance: in many applications, rules are interrelated (e.g. “chained” with each other) and the number of rules required is unmanageably large
• Results interpretation: in many domains, the inference process can become complex, and it is difficult for users to understand or verify the solutions suggested by rule-based reasoning

Advantages of CBR (Kolodner, Watson, …)
• Does not require an explicit domain model, so elicitation becomes a task of gathering case histories
• Implementation is reduced to identifying the significant features that describe a case, a task easier than creating an explicit model
• By applying database techniques, large volumes of data/information can be managed
• Can learn by acquiring new knowledge as cases, which makes maintenance easier

A work flow model of simulation modeling processes
• We propose a general process flow model for the simulation process, based on a modification of the objective-directed approach (K. Musselman, 1998). The process determines the model elements, i.e. the I/O requirements and the activities (both physical and logical) required by the simulation.
• This model helps determine an overall work flow logic that integrates the modeling functions needed to develop complete conceptual-level specifications of model content (i.e. I/O requirements and model activities). It also helps identify and define the “features” used to form a case representation for the purpose of simulation modeling.

A work flow model of simulation modeling processes
[Flow diagram; steps shown:]
• Describe/classify problem, including process mapping
• Set up goals and objectives
• Determine output requirement
• Determine experimental requirement
• Determine data collection requirement
• Determine input requirement and model activities

CBR approach in general
[Flowchart; labels shown: Found?, Retrieve, Start a new case, Adapt, Case-base, Accept?, Evaluate, Retain]
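One way to read the flowchart labels is as a loop: retrieve a similar case, adapt it (or start a new case if none is found), evaluate the result, and retain it if accepted. The sketch below follows that reading only. The `similarity`, `adapt` and `accept` callables, the dict-based case structure and the 0.6 threshold are all hypothetical and are not taken from the presentation.

```python
from typing import Callable, List, Optional, Tuple

Case = dict  # hypothetical: a case is a plain dict of feature values


def retrieve(case_base: List[Case], target: Case,
             similarity: Callable[[Case, Case], float],
             threshold: float) -> Optional[Case]:
    """Return the most similar stored case, or None if nothing is close enough."""
    best: Optional[Tuple[float, Case]] = None
    for stored in case_base:
        score = similarity(target, stored)
        if best is None or score > best[0]:
            best = (score, stored)
    if best is None or best[0] < threshold:
        return None  # "Found?" answered with no
    return best[1]


def cbr_cycle(case_base, target, similarity, adapt, accept, threshold=0.6):
    """One pass: retrieve, adapt (or start a new case), evaluate, retain."""
    old = retrieve(case_base, target, similarity, threshold)
    # Start a new case when nothing similar is found; otherwise adapt the old one.
    solution = dict(target) if old is None else adapt(old, target)
    if accept(solution):            # "Accept?" after evaluation
        case_base.append(solution)  # retain for future retrieval
    return solution
```

A caller would supply its own similarity measure and adaptation rule; the loop itself only fixes the order of the steps.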

Key issues in applying CBR to simulation
• Representing simulation cases: adequate and robust representation of the relevant knowledge contained in a simulation project, for the purpose of reuse
• Indexing and matching cases: efficient retrieval of the most similar cases (with respect to a target problem case)
• Adapting cases: modifying an “old” case to derive solutions to the target problem
• Retaining cases: determining whether a finished simulation case should be retained in the case library for future retrieval

Issues on case representation (CR)
• Function of CR: for problem-solving, for training (teaching/learning), or a hybrid? This affects which features should be represented
• Level of abstraction: how much detail should be kept for what level of communication or implementation? Conceptual level vs. implementation level
• Heterogeneity of case content: e.g. conceptual model specs vs. legacy models developed with specific tools (these require different formalisms)

Features to be represented
At the “top level” of a simulation case, five features:
• Problem description (specific application)
• Conceptual model specification
• Legacy (executable) model development
• Experimental results (validation, verification and proposed solutions)
• Implementation results (outcome and user feedback during and after the implementation of proposed solutions)

Feature representation at second level
• Problem description:
  • System/processes (type, product/service, operations flow, activities, facility/equipment, controls, procedures)
  • Problem symptoms (productivity, resource, service level related…)
  • Problem-solving goal/objectives (including client’s view)
  • Problem constraints

Feature representation at second level (continued)
• Conceptual model specification:
  • Model elements (entities, resources, activities, locations, queues, state variables, etc.) represented through object/class diagrams
  • Entity flow diagrams (EFD)
  • Data collection and input modeling requirements
  • Output analysis and experimental design
  • Validation and verification requirements
• At the lowest level, a sub-feature contains a set of attribute values (of different data types)

Organization of features
• Given the hierarchical nature of simulation case features, we can use a graph $G = \{N, A\}$ to represent the set of features, where $N$ is the set of nodes (features) and $A$ is the set of arcs that connect the nodes (i.e. relate features).
• Let $|N| = n$ and $|A| = n - 1$, with $\text{in-degree}(i) = 1$ for every node $i \in N$ other than the root. Such a graph $G$ is called a directed out-tree and contains no cycles (Ahuja 1993).
• Let $g = \{g_1, g_2, \ldots, g_k\}$ be a set of search goals. For each feature $i$ we can define a set of weights $w = \{w_{i g_1}, \ldots, w_{i g_k}\}$, where $w_{i g_k}$ is the importance the modeler assigns to feature $i$ with respect to goal $g_k$.
• We represent a case with the superset $\text{SimCase} = \{N, A, w, g\}$.
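The tree of features with goal-dependent weights can be illustrated with a small data structure. This is a minimal sketch, not the authors' implementation: the class name `FeatureNode`, the goal label `reuse_model`, the weights and the example value `job shop` are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FeatureNode:
    name: str
    value: Optional[object] = None            # attribute value, set on leaves
    weights: Dict[str, float] = field(default_factory=dict)  # goal -> importance
    children: List["FeatureNode"] = field(default_factory=list)

    def add(self, child: "FeatureNode") -> "FeatureNode":
        self.children.append(child)  # one arc from this node to the child
        return child


def count_nodes_and_arcs(root: FeatureNode):
    """Return (number of nodes, number of arcs) reachable from root."""
    n, arcs, stack = 0, 0, [root]
    while stack:
        node = stack.pop()
        n += 1
        arcs += len(node.children)
        stack.extend(node.children)
    return n, arcs


# Hypothetical case with two top-level features and one sub-feature.
root = FeatureNode("SimCase")
problem = root.add(FeatureNode("problem_description", weights={"reuse_model": 0.8}))
problem.add(FeatureNode("system_type", value="job shop"))
root.add(FeatureNode("conceptual_model_spec", weights={"reuse_model": 1.0}))
```

Because every node except the root is attached by exactly one arc, `count_nodes_and_arcs(root)` returns an arc count one less than the node count, which is the out-tree property stated above.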

A case feature tree
[Tree diagram; annotations shown: Node class:: name; type; value … Arc class:: relation …]

Case representation formalism
• For implementation, the knowledge in cases needs to be represented using formalisms that can be encoded for computer execution
• Common formalisms include frames, objects, predicates, semantic nets and decision rules
• Most current CBR software uses frame/object representation (Watson and Marir)

Case representation language
• At the implementation level, case content can be divided into sub-categories to obtain better homogeneity within each category
• We can use a “universal” language such as XML to encode the case content of each sub-category, which facilitates transfer of model information over the web/Internet
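As an illustration of encoding categorized case content in XML, the sketch below uses Python's standard `xml.etree.ElementTree` module. The element names (`SimCase`, `ProblemDescription`, `ConceptualModelSpec` and their children) and the sample values are hypothetical; the presentation does not define a schema.

```python
import xml.etree.ElementTree as ET


def case_to_xml(case_id: str, categories: dict) -> str:
    """Serialize {category: {attribute: value}} into an XML string."""
    root = ET.Element("SimCase", attrib={"id": case_id})
    for category, attributes in categories.items():
        cat_el = ET.SubElement(root, category)   # one element per sub-category
        for name, value in attributes.items():
            ET.SubElement(cat_el, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical case content, split into two sub-categories.
xml_text = case_to_xml("case-001", {
    "ProblemDescription": {"SystemType": "job shop", "Goal": "reduce waiting time"},
    "ConceptualModelSpec": {"Entities": "orders", "Resources": "machines"},
})

# Round trip: the receiving side can parse the same text back into a tree.
parsed = ET.fromstring(xml_text)
```

Keeping one element per sub-category mirrors the idea of dividing case content into more homogeneous groups before transfer.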

Matching cases
• Function of matching: determine what to match (identify the correspondence between features), how good the match is (compute the degree of match between corresponding features), and how important the match is (assign an importance weight to each feature used in the comparison)
• Assessment of similarity between cases (and between features): a numerical approach. The difficulty lies in assigning weights to features

“Nearest neighbor similarity”
• Similarity between stored cases and the new/target case is based on matching a weighted sum of features. The weights of the features are assumed “fixed”, i.e. independent of the search purpose (Ian Watson and Farhi Marir)
• Partial matches must be assessed

Modified similarity
• Let $S_{xy}$ be the total (overall) similarity between cases $x$ and $y$; $F = \{f_1, \ldots, f_k\}$ a set of goal-dependent top-level features; $S_{f_i}(x, y)$ the similarity between cases $x$ and $y$ in terms of feature $f_i$; and $w_{f_i}(g)$ the importance weight of $f_i$ with respect to a given goal $g$.
• We can use an exhaustive search algorithm to traverse the feature trees and calculate/accumulate the similarities between cases and features.
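The text above defines the ingredients of the overall similarity but does not spell out how they are combined. The sketch below assumes a weighted sum of the per-feature similarities, normalized by the total weight so the result stays between 0 and 1. That normalization, the function name, the goal labels and all numbers are assumptions made for illustration.

```python
from typing import Dict


def overall_similarity(feature_sims: Dict[str, float],
                       goal_weights: Dict[str, float]) -> float:
    """Weighted sum of per-feature similarities, normalized by total weight."""
    total_weight = sum(goal_weights.get(f, 0.0) for f in feature_sims)
    if total_weight == 0:
        return 0.0  # no weighted feature to compare on
    weighted = sum(goal_weights.get(f, 0.0) * s for f, s in feature_sims.items())
    return weighted / total_weight


# Hypothetical per-feature similarities between a target case and a stored case.
sims = {"problem_description": 0.9, "conceptual_model_spec": 0.5}

# Hypothetical importance weights for two different search goals.
weights = {
    "reuse_model": {"problem_description": 1.0, "conceptual_model_spec": 3.0},
    "training":    {"problem_description": 3.0, "conceptual_model_spec": 1.0},
}

s_reuse = overall_similarity(sims, weights["reuse_model"])
s_training = overall_similarity(sims, weights["training"])
```

With these made-up numbers the same pair of cases scores differently under the two goals, which is the point of making the weights goal-dependent rather than fixed.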

A matching procedure
Procedure Match_Case(given a target case x and a retrieved case y for comparison) {
    determine search goal g;
    set Sxy = 0;
    for each top-feature fi ∈ F: {
        set s = fi;        // s is a root node
        pred(s) = 0;
        Plist = {s};       // Plist = a list of admissible nodes (activities) that can be reached from some identified/marked source nodes
        while (Plist ≠ ∅) {
            select a node i = front(Plist);
            return_info(i);
            get_targetCase_info(i);
            compare_case(i, x, y);
            if (type(i) ≠ leaf) then add children(i) to Plist;
            update: Plist = Plist \ {i};    // delete i from Plist
        } // end of while-loop
    } // end of for-loop
} // end of the procedure
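The traversal in this procedure can be sketched in Python as a breadth-first walk over each top-level feature tree. Several things here are assumptions rather than part of the procedure: cases are represented as nested dicts, both cases share the same feature names, comparison happens only at leaves, and each top-level feature contributes its weight times the mean leaf score. The `compare` callable is supplied by the caller.

```python
from collections import deque
from typing import Any, Callable, Dict


def match_case(x: Dict[str, Any], y: Dict[str, Any],
               goal_weights: Dict[str, float],
               compare: Callable[[Any, Any], float]) -> float:
    """Traverse each top-level feature tree breadth-first and accumulate a score."""
    s_xy = 0.0
    for top_feature, weight in goal_weights.items():
        if top_feature not in x or top_feature not in y:
            continue  # nothing to compare for this feature
        # plist holds (node in x, node in y) pairs still to be visited.
        plist = deque([(x[top_feature], y[top_feature])])
        leaf_scores = []
        while plist:
            node_x, node_y = plist.popleft()   # take the front node and delete it
            if isinstance(node_x, dict) and isinstance(node_y, dict):
                # Not a leaf: add the children present in both cases to plist.
                for name in node_x.keys() & node_y.keys():
                    plist.append((node_x[name], node_y[name]))
            else:
                leaf_scores.append(compare(node_x, node_y))
        if leaf_scores:
            s_xy += weight * sum(leaf_scores) / len(leaf_scores)
    return s_xy
```

Sub-features that appear in only one of the two cases are ignored in this sketch; a fuller version would need a rule for scoring them.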

Comparing features
• Let $a_i = \{a_{i1}, \ldots, a_{im}\}$ be the set of values that a feature or sub-feature $f_i$ can assume. When we compare two cases $x$ and $y$ in terms of $f_i$, we define a characteristic function $\delta(a_i(x), a_i(y))$ that characterizes how “close” case $x$ is to case $y$ in terms of $f_i$, given the values $a_i(x) \in a_i$ and $a_i(y) \in a_i$
• If $a_i(x) = a_i(y)$, set caseMatch = perfectMatch and $\delta(a_i(x), a_i(y)) = 1$
• Otherwise ($a_i(x) \neq a_i(y)$), set caseMatch = partialMatch
• partialMatch is defined on a discrete scale

A scale for the characteristic function $\delta$
• $\delta(a_i(x), a_i(y)) = 0$: $a_i(x)$ is totally different from $a_i(y)$
• $\delta(a_i(x), a_i(y)) = 0.2$: $a_i(x)$ is different from $a_i(y)$
• $\delta(a_i(x), a_i(y)) = 0.5$: $a_i(x)$ is somewhat similar to $a_i(y)$
• $\delta(a_i(x), a_i(y)) = 0.75$: $a_i(x)$ is similar to $a_i(y)$
• $\delta(a_i(x), a_i(y)) = 0.9$: $a_i(x)$ is very similar to $a_i(y)$
• The value of $\delta(a_i(x), a_i(y))$ is used to calculate a partial similarity $= \delta(a_i(x), a_i(y)) \cdot w_{f_i}(g)$
• Fuzzy sets and fuzzy inference can be used to map and calculate the partial similarity

Search values for the characteristic function $\delta(a_i(x), a_i(y))$
• Given the values of $a_i(x)$ and $a_i(y)$, finding the value of $\delta(a_i(x), a_i(y))$ becomes a simple search in a look-up table of $m^2$ rows
• For each pair of $a_i(x)$ and $a_i(y)$, we can define a linguistic descriptor (a fuzzy set), e.g. “similar”, “very similar”
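The look-up table and the discrete scale can be combined in a few lines. The sketch below is a crisp simplification that maps each descriptor straight to its scale value and does not implement fuzzy inference. The feature values (`job shop`, `flow shop`, `call center`), the expert judgments and the assumption that judgments are symmetric are all hypothetical.

```python
from itertools import product

# Descriptor -> value of the characteristic function, following the scale above.
SCALE = {"totally different": 0.0, "different": 0.2, "somewhat similar": 0.5,
         "similar": 0.75, "very similar": 0.9, "perfect match": 1.0}

# Hypothetical values that a feature "system type" can assume (m = 3).
VALUES = ["job shop", "flow shop", "call center"]

# Hypothetical expert judgments for unordered pairs of distinct values.
JUDGMENTS = {
    frozenset(["job shop", "flow shop"]): "similar",
    frozenset(["job shop", "call center"]): "different",
    frozenset(["flow shop", "call center"]): "different",
}


def build_table(values, judgments):
    """Build the m^2-row table mapping (a_x, a_y) to a linguistic descriptor."""
    table = {}
    for a_x, a_y in product(values, repeat=2):
        if a_x == a_y:
            table[(a_x, a_y)] = "perfect match"
        else:
            table[(a_x, a_y)] = judgments[frozenset([a_x, a_y])]
    return table


def partial_similarity(table, a_x, a_y, weight):
    """Look up the descriptor, map it to the scale, and apply the feature weight."""
    return SCALE[table[(a_x, a_y)]] * weight


TABLE = build_table(VALUES, JUDGMENTS)
```

Storing judgments per unordered pair halves the elicitation effort, at the cost of assuming that "x is similar to y" and "y is similar to x" always receive the same descriptor.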

Case base: case memory models
• Dynamic memory model: memory organization packets (MOPs)
• Enhanced MOPs (E-MOPs) or generalized episodes (GE)
• Category-exemplar model
• The key is to balance the need to preserve semantic richness and indices against the need for efficient access and case retrieval

Modular design of case-base
[Diagram; content shown:]
• Feature models of cases (parametric models): problem description, CSM specs, legacy models, experiment results, implementation results
• $\{\text{SimCases}\} = \{fm_i\} = \{(N, A, w, g)_i\}$
• Indexing pointers
• Five functional parts: modularized storage that divides the case content into more homogeneous, function-oriented sub-categories

System architecture (prototype)
[Diagram; components shown:]
• Developer interface (GUI)
• Case base (stored cases)
• Case-based inference engine (matching, adapting and retaining algorithms)
• User interface (GUI)
• Working memory (facts of target case and added information)
• Explanation Utility

Conclusions
• CBR can be an effective approach for preserving and reusing simulation modeling knowledge and information
• Issues in applying CBR to the SM process include case representation, matching and adaptation
• A modular design of the case-base can help structure the case content for more efficient, function-oriented search
• The research has just started …
