Bayesian Decision Making for Proposal Ranking: A Fair and Efficient Approach

Miroslav Kárný
Department of Adaptive Systems
Institute of Information Theory and Automation
Academy of Sciences of the Czech Republic
school@utia.cas.cz, http://as.utia.cas.cz

… speaker’s home institute …
… nickname for Institute of Information Theory and Automation
Cybernetics = Communication & Control in Machines & Animals
Cybernetics is the speaker’s research domain and has led to applications in:
• Adaptive control of paper machines, rolling mills, drum boilers, …
• Nuclear medicine modeling & DM, dynamic image studies, …
• Support of operators of complex systems (FET)
• Traffic control in cities, optimization of financial strategies
• Multiple participants’ DM and E-democracy …
…? …! Bayesian DM: a single horse on a decades-long trip with a good team

FET organizes a review process …
… to select the best proposals p among all submitted proposals
• An expert e assigns marks m_p^e ∈ {0, …, M} to several proposals within a small group p^e of proposals
• A small group of experts e^p, reviewing the proposal p, harmonizes the final mark m_p via discussion
• Assembly of all experts completely ranks all proposals
EC supports top proposals up to a budget-implied border-line

Addressed problem
Procedure is good & fair … up to the extremely disturbing last step
• An expert e assigns marks m_p^e ∈ {0, …, M} to several proposals within a small group p^e of proposals
• A small group of experts e^p, reviewing the proposal p, harmonizes the final mark m_p via discussion
• Assembly of all experts completely ranks all proposals
Why the assembly ranking is disturbing:
• Each expert e has studied a tiny portion of all proposals
• Experts’ marks m_p^e are subjectively scaled
• Discrete-valued marks cause many coincidences
• Time slot of the assembly is strongly limited
Consequences: errors, manipulation, expenses

Aims
… of the research
• to test the belief that Bayesian DM is an (almost) universal tool relying on proper modeling only
• to test a promising negotiation methodology needed in other contexts, too
… of the talk
• to help FET to be fair and cost-efficient
• to help proposing researchers to be treated fairly
• to share fun (?) from the conclusions

Basic idea
Experts serve as rank-measuring devices
Project proposal p has its objective rank r_p!
Ranking = estimation of the rank r_p from the marks m_p^e (mark of proposal p by expert e), which are noise-corrupted observations of the objective rank

Guide • Experts as measuring devices • Prior knowledge • MAP estimate • Experimental results • Discussion

Experts as measuring devices
m_p^e = r_p + ε^e
  m_p^e … mark of proposal p by the expert e
  r_p … objective rank of proposal p
  ε^e … personal error
experts try to be fair ⇒ mark m_p^e proportional to r_p, ε^e independent of p
ε^e = b^e + ω^e
  ε^e … personal error
  b^e … bias
  ω^e … personal fluctuations with variance v^e
interpretation of marks: top M … Nobel Prize, top M … flawless
Simplicity & maximum entropy ⇒ personal error assumed to be Gaussian
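The measurement model on this slide can be illustrated by simulating marks. This is only a sketch: the function name `simulate_marks`, the data layout (a dictionary keyed by expert and proposal index) and all numeric values are hypothetical choices made for the illustration, not data from the talk.

```python
import math
import random

def simulate_marks(ranks, biases, variances, assignments, rng):
    """Draw mark = rank[p] + bias[e] + Gaussian noise with variance[e].

    assignments lists (expert, proposal) index pairs, because each
    expert marks only a small group of proposals.
    """
    marks = {}
    for e, p in assignments:
        noise = rng.gauss(0.0, math.sqrt(variances[e]))
        marks[(e, p)] = ranks[p] + biases[e] + noise
    return marks

# Hypothetical values, chosen only for illustration.
ranks = [24.0, 18.0, 27.0]      # objective ranks of three proposals
biases = [2.0, -3.0]            # a generous and a strict expert
variances = [1.0, 4.0]          # variances of personal fluctuations
assignments = [(0, 0), (0, 1), (1, 1), (1, 2)]

marks = simulate_marks(ranks, biases, variances, assignments, random.Random(0))
```

Proposal 1 is marked by both experts, so the two marks differ by the difference of biases plus noise; this overlap is what lets biases be separated from ranks.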

number of data 1 – 2 number of unknowns Prior knowledge Needed emp = rp +eb + e = (rp – C) + ( eb + C) + e, for anyC Available rank[0, largest mark] rp [0, M] biaseb[-M, M ] , noise variance ev [0, M2]

MAP estimate
Posterior log-likelihood function
• smoothly dependent on the estimated r, b, v
• concave in the estimated r, b, v
• defined on a convex domain
• unique maximum
• harmonised domain and data range
• maximum in the interior
Evaluation
Conditions for the extremum are solved by successive approximations
… fast, simple and reliable
… can be used “on-line”
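One way to picture "successive approximations" is coordinate-wise maximisation of the log-posterior, sketched below. This is not the algorithm of the talk: the slides do not specify the priors, so the Gaussian priors on ranks and biases and the inverse-gamma prior on variances (with their constants) are assumptions made for this illustration, as are the function name `map_estimate` and the example marks.

```python
def map_estimate(marks, n_proposals, n_experts, M, iters=200):
    """Cycle through closed-form conditional maximisers of r, b, v.

    marks maps (expert, proposal) -> mark.
    """
    r0, r_var = M / 2, (M / 2) ** 2   # assumed Gaussian prior on ranks
    b_var = (M / 2) ** 2              # assumed zero-mean Gaussian prior on biases
    a, c = 2.0, (M / 10) ** 2         # assumed inverse-gamma prior on variances
    r = [r0] * n_proposals
    b = [0.0] * n_experts
    v = [c / (a + 1)] * n_experts     # start at the prior mode
    for _ in range(iters):
        for p in range(n_proposals):
            # Precision-weighted average of the prior mean and bias-corrected marks.
            num, den = r0 / r_var, 1 / r_var
            for (e, q), m in marks.items():
                if q == p:
                    num += (m - b[e]) / v[e]
                    den += 1 / v[e]
            r[p] = num / den
        for e in range(n_experts):
            own = [(p, m) for (f, p), m in marks.items() if f == e]
            s = sum(m - r[p] for p, m in own)
            b[e] = (s / v[e]) / (1 / b_var + len(own) / v[e])
            sq = sum((m - r[p] - b[e]) ** 2 for p, m in own)
            v[e] = (c + 0.5 * sq) / (a + 1 + 0.5 * len(own))
    return r, b, v

# Hypothetical marks of three experts on three proposals, mark range 0..30.
marks = {(0, 0): 26.0, (0, 1): 20.0, (1, 1): 16.0,
         (1, 2): 26.0, (2, 0): 24.0, (2, 2): 28.0}
r, b, v = map_estimate(marks, n_proposals=3, n_experts=3, M=30)
```

Each update is the exact maximiser of the assumed log-posterior in one block of variables with the others held fixed, so the posterior never decreases from one sweep to the next. In this example every expert marks two proposals and every proposal has two experts, so the data are only about as numerous as the unknowns and the priors do real work.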

Experiments - proposals’ viewpoint
Processed marks m ∈ {0, 0.5, …, 30}; Assembly (A) ranking available
Extreme cases:
                                    small file   extensive file
#proposals                              32           1341
#experts                                33            588
acceptance threshold T                  22             25
#proposals above T by A                 11            157
#proposals above T by us                16             72
#proposals chosen by A and us           11             57
#common acceptance / A-one [%]         100             36
• typical numbers
• prior does not spoil results with a few data
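The last row of the comparison is the share of assembly-accepted proposals that the estimated ranking also accepts. The arithmetic can be reproduced from the counts on the slide; the function name `common_acceptance_percent` is a hypothetical helper introduced for this sketch.

```python
def common_acceptance_percent(n_common, n_assembly):
    """Share of assembly-accepted proposals also accepted by the estimate, in %."""
    return 100.0 * n_common / n_assembly

# Counts taken from the two extreme cases reported on the slide.
first_case = common_acceptance_percent(11, 11)
second_case = common_acceptance_percent(57, 157)
```

In the second case 157 - 57 = 100 proposals accepted by the assembly fall below the threshold under the estimated ranks, while 72 - 57 = 15 proposals pass by the estimate only.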

Histogram of rank estimates
… box width about 2% of the mark range!
#(r > T = 25) = 57
#(r > T = 22) = 11

Experiments - experts’ viewpoint
                                    small file   extensive file
mean (bias) / T [%]                      6              4
minimum (bias) / T [%]                 -13            -45
maximum (bias) / T [%]                  15             13
mean (std. dev.) / T [%]                13             12
minimum (std. dev.) / T [%]             10              7
maximum (std. dev.) / T [%]             21             38
Box width containing a significant number of proposals ≈ 3% of T!

Individual results – small file

Individual top results – extensive file

Discussion
Evaluation aspects
• it works
• it exhibits fast and reliable convergence
• it is reasonably robust to variations of prior statistics
Operational aspects
• it can substitute or at least support assembly ranking
• it allows continuous-valued marking
• it avoids the need to harmonize marks within the expert group e^p of a proposal
• it makes ranking less sensitive to experts’ biases & variations
• it suppresses lottery-type results for gray-zone-ranked proposals (those with the rank around the threshold)
• it makes evaluation more objective

Discussion
Quality assurance aspects
• it checks reliability of experts, using their biases & variances: 70-80% of experts are o.k., but the unreliable or cheating rest still forms a significant portion
• it allows tracking of “bad” experts
• it opens a way to relate prior & posterior ranking, i.e., the achieved results of supported projects
Methodological aspects
• it can be tailored to other problems
• it can serve as a tool supporting negotiation
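The reliability check via biases and variances could look like the sketch below. The slides give no decision rule, so the function name `flag_experts`, the cutoffs `max_rel_bias` and `max_rel_std`, and the example values are hypothetical; only the idea of relating an expert's estimated bias and standard deviation to the threshold T comes from the talk.

```python
def flag_experts(biases, variances, T, max_rel_bias=0.15, max_rel_std=0.20):
    """Return indices of experts whose |bias| / T or std. dev. / T is too large.

    The cutoffs are hypothetical defaults, not values from the talk.
    """
    flagged = []
    for e, (b, v) in enumerate(zip(biases, variances)):
        rel_bias = abs(b) / T
        rel_std = v ** 0.5 / T
        if rel_bias > max_rel_bias or rel_std > max_rel_std:
            flagged.append(e)
    return flagged

# Hypothetical estimates for three experts, threshold T = 25.
suspect = flag_experts(biases=[1.0, -11.25, 0.5],
                       variances=[4.0, 4.0, 90.25], T=25)
```

Here the second expert is flagged for a large negative bias and the third for a large spread; a flag is a prompt for a closer look, not proof of cheating.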

Future
• alternative models of experts, e.g., non-normal, Markov-chain type
• comparison of prior and posterior ranking
• application to other negotiation-type processes
• application to individual marks & thresholds
• quality assurance of the evaluation, including experts’ competence!
