
Performance Evaluation of Multiobjective Evolutionary Algorithms METRICS


Presentation Transcript


  1. Performance Evaluation of Multiobjective Evolutionary Algorithms: METRICS

  2. MOEA Testing and Analysis • Specifically, they suggest that a well-designed experiment consists of the following steps: 1. Define experimental goals; 2. Choose measures of performance (metrics); 3. Design and execute the experiment; 4. Analyze data and draw conclusions; 5. Report experimental results.

  3. Choose measures of performance - metrics • Every algorithm can maintain a group of nondominated individuals at the end of the run. • Sometimes the result from one algorithm fully dominates the other, which is the simplest case. More generally, some results from one algorithm dominate some from another algorithm, and vice versa. • Another reason performance evaluation needs special consideration is that we are interested not only in convergence to PF* but also in the distribution of the individuals along PF*. • Adequately evaluating convergence and distribution is still an open problem in the field of MOEAs. • Benchmark problem design is also an interesting field, because we want to conveniently generate problems with different shapes of PF*, different convergence difficulties, different dimensions, etc.

  4. Performance Indices • After determining which benchmark MOPs to optimize, we need to make a careful decision on how to evaluate the performance of different MOEAs. • Those criteria are performance indices (PI).

  5. There are normally three issues to take into consideration when designing a good metric in a given domain (Zitzler, 2000): 1. Minimize the distance of the Pareto front produced by our algorithm with respect to the true Pareto front (assuming we know its location). 2. Maximize the spread of solutions found, so that we can have a distribution of vectors as smooth and uniform as possible. 3. Maximize the number of elements of the Pareto optimal set found.

  6. The Need for Quality Measures • Is A better than B? • Independent of user preferences: Yes (strictly) / No. • Dependent on user preferences: How much? In what aspects? • Ideally, quality measures allow us to make both types of statements. [Figure: two Pareto set approximations A and B in objective space]

  7. Independent of User Preferences • Pareto set approximation (algorithm outcome) = set of incomparable solutions • weakly dominates = not worse in all objectives • is better than = weakly dominates, and the sets are not equal • dominates = better in at least one objective • strictly dominates = better in all objectives • is incomparable to = neither set weakly better [Figure: example approximation sets A, B, C, D illustrating these set relations]
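
To make these set relations concrete, here is a small illustrative Python sketch (not part of the original slides); the function names, the minimization convention, and the example sets A and B are our own assumptions.

# Illustrative sketch of the dominance relations above (minimization assumed).
# All function names and example data are ours, not from the slides.
def weakly_dominates(a, b):
    """Objective vector a is not worse than b in any objective."""
    return all(ai <= bi for ai, bi in zip(a, b))

def dominates(a, b):
    """a is not worse in all objectives and better in at least one."""
    return weakly_dominates(a, b) and any(ai < bi for ai, bi in zip(a, b))

def strictly_dominates(a, b):
    """a is better than b in all objectives."""
    return all(ai < bi for ai, bi in zip(a, b))

# Set versions: A relates to B if every member of B is covered by some member of A.
def set_weakly_dominates(A, B):
    return all(any(weakly_dominates(a, b) for a in A) for b in B)

def set_dominates(A, B):
    return all(any(dominates(a, b) for a in A) for b in B)

def set_strictly_dominates(A, B):
    return all(any(strictly_dominates(a, b) for a in A) for b in B)

def set_is_better(A, B):
    """A weakly dominates B and the two sets are not equal."""
    return set_weakly_dominates(A, B) and set(map(tuple, A)) != set(map(tuple, B))

def incomparable(A, B):
    """Neither set weakly dominates the other."""
    return not set_weakly_dominates(A, B) and not set_weakly_dominates(B, A)

A = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
B = [(2.0, 5.0), (3.0, 3.0), (5.0, 2.0)]
print(set_dominates(A, B))   # True: every point of B is dominated by some point of A
print(incomparable(A, B))    # False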

  8. Dependent on User Preferences • Goal: quality measures compare two Pareto set approximations A and B. • Workflow: application of quality measures → comparison and interpretation of quality values → e.g. “A better”. • Example quality values for A vs. B: hypervolume 432.34 vs. 420.13; distance 0.3308 vs. 0.4532; diversity 0.3637 vs. 0.3463; spread 0.3622 vs. 0.3601; cardinality 6 vs. 5.

  9. The research produced in the last few years has included a wide variety of metrics that assess the performance of an MOEA in one of the three aspects. Some examples are the following: 1- Cardinality-based Performance Indices • Number of obtained solutions • Error Ratio • Coverage

  10. The number of obtained solutions |Sj| (Cardinality) • K. Deb: Multi-Objective Optimization Using Evolutionary Algorithms, Wiley, Chichester, U.K., 2001. • Let S be the union of the J solution sets. • Let Sj be a solution set (j = 1, 2, 3, …, J). • For comparing J solution sets (S1, S2, …, SJ), we use the number of obtained solutions |Sj|. [Figure: objective space (f1, f2) showing reference solutions r and current solutions x, with the count |Sj|]
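
As a quick illustration of the cardinality measure (a sketch only; the dominance test, the helper names, and the sample set are our own assumptions, and minimization is assumed), |Sj| can be computed by filtering a solution set down to its nondominated members and counting them:

# Count the nondominated solutions |Sj| in a solution set of objective vectors.
# Helper names and sample data are illustrative only (minimization assumed).
def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated(S):
    """Keep only the solutions of S that no other solution in S dominates."""
    return [s for s in S if not any(dominates(t, s) for t in S if t is not s)]

S_j = [(1.0, 5.0), (2.0, 3.0), (3.0, 3.5), (4.0, 1.0)]
front = nondominated(S_j)
print(len(front), front)   # |Sj| = 3 here: (3.0, 3.5) is dominated by (2.0, 3.0)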

  11. Error Ratio (ER): • This metric was proposed by Van Veldhuizen to indicate the percentage of solutions (from the nondominated vectors found so far) that are not members of the true Pareto-optimal set: ER = ( Σ_{i=1}^{n} e_i ) / n, where n is the number of solutions in the current nondominated set, e_i = 0 if solution i is a member of the Pareto-optimal set, and e_i = 1 otherwise. It should then be clear that ER = 0 indicates ideal behavior, since it means that all the solutions generated by our MOEA belong to the Pareto-optimal set of the problem. This metric addresses the third issue from the list previously provided.

  12. Example: ER = 2/3 (two of the three nondominated solutions found are not members of the true Pareto-optimal set).
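
The points behind the ER = 2/3 example above are not shown here, so the following Python sketch uses made-up data that happens to give the same value; the tolerance-based membership test and all names are our own assumptions.

# Illustrative sketch of the Error Ratio (ER). The reference front and the
# candidate solutions below are made-up data, not the example from the slide.
def error_ratio(found, true_front, tol=1e-9):
    """ER = (sum of e_i) / n, with e_i = 0 if solution i lies on the true
    Pareto front (within a tolerance) and e_i = 1 otherwise."""
    def on_true_front(x):
        return any(all(abs(a - b) <= tol for a, b in zip(x, r)) for r in true_front)
    errors = [0 if on_true_front(x) else 1 for x in found]
    return sum(errors) / len(found)

true_front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
found      = [(0.0, 1.0), (0.6, 0.7), (1.1, 0.2)]   # only the first lies on the true front
print(error_ratio(found, true_front))               # 2/3 ≈ 0.667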

  13. Coverage (C): • In 1999, Zitzler suggested a binary PI called coverage (C). • C(S1, S2) is the fraction of the individuals in S2 that are weakly dominated by at least one individual in S1. • The larger C(S1, S2) is, the better S1 performs relative to S2 under this metric.
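
A minimal sketch of the coverage metric follows, assuming minimization and using made-up example sets; the function names are ours.

# Coverage C(S1, S2): the fraction of S2 that is weakly dominated by at least
# one member of S1 (minimization assumed; data are illustrative only).
def weakly_dominates(a, b):
    return all(x <= y for x, y in zip(a, b))

def coverage(S1, S2):
    covered = sum(1 for b in S2 if any(weakly_dominates(a, b) for a in S1))
    return covered / len(S2)

S1 = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
S2 = [(1.5, 4.5), (2.5, 2.5), (3.5, 0.5)]
print(coverage(S1, S2))   # 2/3: the last point of S2 is not covered by S1
print(coverage(S2, S1))   # 1/3: C is not symmetric, so both directions are reported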

  14. 2-Distance-based Performance Indices • Distance-based PIs evaluate the performance of the solutions according to their distance to PF*. Generational Distance (GD): • In 1999, Van Veldhuizen suggested a unary PI called generational distance (GD). • First we need to define, for each solution x in S, its minimum distance to PF*: d(x, PF*) = min_{r ∈ PF*} d_{xr}, i.e., the distance from x to its nearest point of PF*.

  15. The GD measure can be written as follows: GD(Sj) = ( Σ_{x ∈ Sj} d_x^2 )^{1/2} / |Sj|, with d_x = min_{r ∈ S*} d_{xr}, where S* is a reference solution set for evaluating the solution set Sj and d_{xr} is the distance between a current solution x and a reference solution r. • GD is thus essentially the average distance from each solution in Sj to its nearest reference solution in S*, so it evaluates the convergence (proximity) of Sj to S*. • A related measure that instead averages, over the reference set S*, the distance from each reference solution to its nearest member of Sj (the inverted generational distance) reflects the distribution of Sj along S* as well as its proximity. [Figure: objective space (f1, f2) showing the distance d_{xr} from a current solution x to its nearest reference solution r]

  16. It should be clear that a value of GD = 0 indicates that all the elements generated are in the Pareto-optimal set. Therefore, any other value will indicate how “far” we are from the global Pareto front of our problem. • This metric addresses the first issue from the list previously provided.
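
A minimal sketch of GD, following the ( Σ d_x^2 )^{1/2} / |Sj| form given above, appears below; the Euclidean distance choice, the reference front, and the solution set are our own illustrative assumptions.

# Generational distance: root of the summed squared nearest-neighbour distances
# to a reference front, divided by the number of solutions (illustrative data).
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def generational_distance(S, reference_front):
    d = [min(euclid(x, r) for r in reference_front) for x in S]
    return math.sqrt(sum(di ** 2 for di in d)) / len(S)

reference_front = [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]
S = [(0.1, 1.0), (0.55, 0.55), (1.0, 0.1)]
print(generational_distance(S, reference_front))   # 0 only if every point of S lies on the reference front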

  17. Metrics for Diversity • Spacing (SP): Here, one desires to measure the spread (distribution) of vectors throughout the nondominated vectors found so far. Since the “beginning” and “end” of the current Pareto front found are known, a suitably defined metric judges how well the solutions in this front are distributed. Schott proposed such a metric measuring the range (distance) variance of neighboring vectors in the nondominated vectors found so far. For two objectives it is defined as SP = sqrt( (1/(n−1)) Σ_{i=1}^{n} (d_mean − d_i)^2 ), where d_i = min_{j≠i} ( |f1(x_i) − f1(x_j)| + |f2(x_i) − f2(x_j)| ), d_mean is the mean of all d_i, and n is the number of nondominated vectors found so far; SP = 0 means the members of the front are equidistantly spaced.
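
The following Python sketch implements Schott's two-objective spacing as reconstructed above; the sample front is made-up data and the helper names are ours.

# Schott's spacing (SP) for a two-objective nondominated front (illustrative data).
import math

def spacing(front):
    n = len(front)
    # d_i: sum of per-objective absolute differences to the nearest other member
    d = [min(sum(abs(ai - bi) for ai, bi in zip(a, b))
             for j, b in enumerate(front) if j != i)
         for i, a in enumerate(front)]
    d_mean = sum(d) / n
    return math.sqrt(sum((d_mean - di) ** 2 for di in d) / (n - 1))

front = [(0.0, 1.0), (0.3, 0.6), (0.6, 0.3), (1.0, 0.0)]
print(spacing(front))   # 0 would mean all neighbouring distances are identical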

  18. Hyperarea and Ratio (HA, HR): • The hyperarea (hypervolume) and hyperarea ratio metrics, which are Pareto compliant, relate to the area of coverage of PFknown with respect to the objective space for a two-objective MOP. • This equates to the area of the union of the rectangular regions bounded by some reference point and each (f1(x), f2(x)).

  19. Mathematically, the hyperarea is the area of the union, over all x ∈ PFknown, of the rectangular region bounded by the reference point and (f1(x), f2(x)). • Also proposed is a hyperarea ratio metric defined as HR = HA1 / HA2, where HA1 is the PFknown hyperarea and HA2 is the hyperarea of PFtrue.

  20. Hypervolume in 2D • Hypervolume{A, B, C, D, E} = 11. [Figure: five nondominated points A–E in the (Objective 1, Objective 2) plane, both axes running from 0 to 5, with a reference point; the union of the rectangles they bound has area 11]

  21. PFtrue’s H = 16 + 6 + 4 + 3 = 29 units², and PFknown’s H = 20 + 6 + 7.5 = 33.5 units², so HR = 33.5 / 29 ≈ 1.155.
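
To tie the hyperarea discussion together, here is a minimal Python sketch of the two-objective hyperarea and the ratio HR; minimization, a reference point worse than every solution, nondominated inputs, and all the sample data are our own assumptions (the numbers are not those of the example above).

# 2-objective hyperarea (hypervolume) by sweeping the points in order of f1,
# and the hyperarea ratio HR = HA(PFknown) / HA(PFtrue). Illustrative data only.
def hyperarea_2d(front, ref):
    """Area of the union of rectangles spanned by each point and the reference
    point; assumes minimization, a nondominated front, and ref worse than all points."""
    area, prev_f2 = 0.0, ref[1]
    for f1, f2 in sorted(front):     # f1 ascending => f2 descending on a nondominated front
        area += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area

def hyperarea_ratio(pf_known, pf_true, ref):
    return hyperarea_2d(pf_known, ref) / hyperarea_2d(pf_true, ref)

pf_true  = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
pf_known = [(1.5, 3.0), (2.5, 2.0), (3.0, 1.5)]
ref = (4.0, 4.0)
print(hyperarea_2d(pf_true, ref))                   # 6.0
print(hyperarea_ratio(pf_known, pf_true, ref))      # 0.75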
