SView0.2- 排序 ( 续 )

SView0.2-排序 (续) • Rank aggregation • Two rank aggregation method • Kemeny method • Spearman's Footrule method • Improve Folksorting • How to do experiments and evalution

1. Ranking aggregation This is a classical problem from social choice and voting theory, in which each voter gives a preference on a set of alternatives, and the system outputs a single preference order on the set of alternatives based on the voters' preferences

Election Winner • Two such schemes. One was invented by the mathematician Charles Dodgson (better known as Lewis Carroll), and the other was suggested by J. Kemeny. • We show that under a either a Dodgson election or a Kemeny election, it is NP-hard (that is, at least as hard as an NP-complete problem) to determine whether any particular candidate has won!

a. Dodgson Election • Dodgson score is NP-complete. • Now we contrive an election for which determining the winner entails solving Exact Cover by 3-Sets (X3C), which is known to be NP-complete.

b. Kemeny Election • Kemeny score is NP-complete, and Kemeny ranking and Kemeny winner are NP-hard. • Proof. Kemeny score is in NP since the score of any candidate can be computed in polynomial time. • Kemeny ranking and Kemeny winner is finding a minimum feedback arc set

Feedback Arc Set • 给定一个有向图G=( V, A) , V={v1 , v2 , ⋯, vn }, A={a1 , a2 , ⋯, am}, 其中V 为顶点集合, A 为弧集合, n 为顶点个数,m为弧个数, 要求找到一个最小的子集A*A, 使得G*=( V, A-A*) 无环。

2. Rank aggregation method • Kemeny method • Spearman's Footrule method

Definition • U: a finite set of elements. {x1,x2,…,xn}. By assignig a unique identifier to each element in U, assume U={1,2,…,|U|} • Τ: ordered list of U. • Τ(i): denote position of i in T.( i U) • πk: the kth user preference orders on U. • σ: is closest to a given set of input lists {πk }, called consensus order.

Kemeny method • 距离度量：2个序列π₁, π₂的Kendall-tau距离为: K(π₁, π₂) = |{(i, j)|i<j , π₁(i) < π₁( j) ,butπ₂(i) > π₂(j)}| • counts the number of pairwise disagreements between two lists • Kemeny 排序方法对由k 个排序π₁, π₂,…, πk计算最优的合成排序σ ，使得Kendall-tau距离总和 K(σ; π₁, π₂,…, πk)= 最小。 K(σ, πl)

Kemeny optimal aggregation is equivalent to find a minimum feedback arc sets(FAS) in digraph. • A weighted complete digraph G=(V ,E, W) • V: the set of elements. Here, V=U. • E: the set of arcs. <xi, xj>: prefer xito xj • W: the total K-distance (from the πi‘s) of a ranking that among elements. weight W(xi, xj)= |{<xi, xj>|xi< xj , π(xi) < π(xj)| x1 x2 x3 x4

x1 x2 • This problem can be cast as a special case of weighted minimum feedback arc sets in tournaments. x3 • The minimum feedback arc set problem can be approximated to within O(logn loglog n) in general graphs and has (at least) the same approximation hardness as the vertex cover problem, which is 1.36. • Even 等将带权有向图FAS 问题规约为有向循环网络( circular networks) 中最小容量多路分割问题( minimum capacity multicut) , 提出了一个近似度为O( log2|X| ) 的算法, 其中X 是给定的顶点集合, 从图中删除FAS 中边将使得图中没有圈与X 中点相交 x4

computational complexity • Total = Distance computed+ kemeny optimal aggregation • Distance computed: • Two lists :K(π₁ , π₂ ) is O(nlogn) • W(xi, xj): O(knlogn); W: O((n*(n-1)/2) *knlogn)) • kemeny optimal aggregation: • minimum feedback arc sets

Spearman's Footrule method • 距离度量：2个序列π₁, π₂ 的Spearman's Footrule distance F(π₁ , π₂ ) = • the sum of the absolute difference between the rank of i according to the two lists. • Spearman排序方法对由k 个排序π₁, π₂,…, πk计算最优的合成排序σ，使得Spearman's Footrule距离总和最小。 | π₁(i)-π₂(i) | F(σ, πl)

footrule optimal aggregation is equivalent to find a minimum cost perfect matching in a bipartite graph. • A weighted complete bipartite graph G=(C ,P, W) • C: the set of elements. Here, C=U. • P: the set of available positions. PN+ • W: the total F-distance (from the πi's) of a ranking that places element c at position p. W(c,p)= |πi(c)-p | x1 x2 xn … C: W: … P: 1 2 n

computational complexity

summary • Above two method according to full list • Two method for partial listsis more problematic. • footrule optimal aggregation to be equivalent to theNP-hard problem of computing the minimum number of edges to delete to convert a directed graph into a directed cyclic graph(DAG). • Distance measure: scaled footrule distance; induced footrule distance.

3. Improve Folksorting Definition • U: a finite set of elements. {x1,x2,…,xn}. By assigning a unique identifier to each element in U, assume U={1,2,…,|U|} • Τ: ordered list of U. • Τ(i): denote position of i in T.( i U) • Τ∣S : supposed SU, a partial list of T that contains only elements from S. • Wi: Given i=1,2,…,k users, WiU, represent the scope of the ith user’s focus on U, all of the elements involved in the user’s movement

πi: the ith user rearrangement the list Τ∣Wi, finally obtain a preference order. πi denote a permutation of Τ∣Wi • σ: calledconsensus permutation,is closest to a given set U of input permutations{πi}, finally obtain a consensus orderof U. • d(π₁ , π₂ )= • D(σ; π₁, π₂,…, πk)= • d(σ∣πi , πi)= • costF(σ∣πi(x), πl (x)) = d(σ∣πi , πi) costF(σ∣πi(x), πi (x)) | π₁(i)π₂(i) |

① Use for reference from Kemeny optimal aggregation • equivalent to find a maximum weight directed cycle ina weighted digraph. • A weighted directed graph G=(V ,E, W) • V: the set of elements. Here, V= Wi. • E: the set of edges, with respect to edges of permutation graph of {πi}, and delete duplicate edge. • W(xi, xj): the number of duplicate edge in edges of permutation graph of {πi}

原始序列 (x1 x2 x3 x4) ；用户移动后序列(x4 x1 x2 x3) ；置换为πi =(x1 x4 x3 x2) x1 • permutation graph x4 x2 • Majoritygraph x3 x1 x2 • 原始序列 (x1 x2 x3 x4 x5) ； • 3个用户移动后序列分别(x4 x1 x2 x3 x5) , (x1 x3 x4 x5x2), (x1 x2 x4 x5 x3 ), • 对应π1=(x1 x4 x3 x2) π2=(x2 x3 x4 x5) π3=(x3 x4 x5) 2 2 x3 x5 x1 x2 Target: maximum weight directed cycle 2 2 x4 x3 2 x5 2 x4

② Use for reference from Footrule optimal aggregation • equivalent to find a maximum weight perfect matching in a bipartite graph. x3 x3 x1 x2 x4 x5 x1 x2 x4 x5 2 2 最终得到：每个元素对应全序列中的位置；可能存在不同的结果集 T(x1) T(x3) T(x4) T(x5) T(x2) x1 x2 x3 x4 x5 2 2 x1 x2 x3 x4 x5

summary • 借鉴前面的两种方法，利用图论中问题来解决。 • 不同点 • 输入:kemeny和footrule是lists，folksorting是置换； • kemeny和footrule的图中边的权重是距离，folksorting是边出现重复次数 • 计算复杂度，folksorting边权重计算：W(xi, xj): is O(k)，W: O(nk)； kemeny和footrule是O((n*(n-1)/2) *knlogn)) 和O(n2k)。 • 优化算法的复杂度一样O(n2.5). [但nfolksorting  nfootrule ] • 现提出改进方法，边的权重计算没有完全体现出hamming距离，总目标可以体现出 • 现没有对涉及用户的非关注点的元素的边权重进行类似induced,scaled的计算 • 现选择边时，除权重外，没加入启发式规则

谢谢 • 请提出宝贵意见

SView0.2- 排序 ( 续 )

SView0.2- 排序 ( 续 )

Presentation Transcript