Towards a Quadratic Time Approximation of Graph Edit Distance

Towards a Quadratic Time Approximation of Graph Edit Distance. Fischer, A., Suen, C., Frinken, V., Riesen, K., Bunke, H. Contents: Introduction; Graph edit distance; Hausdorff distance; Approximating the GED with Hausdorff distance; Application, experimental evaluation and results


Presentation Transcript


  1. Towards a Quadratic Time Approximation of Graph Edit Distance • Fischer, A., Suen, C., Frinken, V., Riesen, K., Bunke, H. • Contents: Introduction; Graph edit distance; Hausdorff distance; Approximating the GED with Hausdorff distance; Application, experimental evaluation and results; Conclusions • Fischer, A., Suen, C., Frinken, V., Riesen, K., Bunke, H.: A fast matching algorithm for graph-based handwriting recognition, submitted

  2. Introduction • Graph edit distance (GED) is a well-established concept to measure the dissimilarity of graphs • It can be used for nearest-neighbor classification, clustering, and various kernel methods based on dissimilarity • However, in its original form, its complexity is exponential • Therefore, various approximate procedures have been proposed for its computation; for a recent review see X. Gao, B. Xiao, D. Tao, X. Li: A survey of graph edit distance, Pattern Analysis & Applications 13, 113-119, 2010 • In this presentation we describe work towards a new approximate procedure, based on Hausdorff distance, that runs in quadratic time

  3. Graph Edit Distance • Measures the distance (dissimilarity) of given graphs $g_1$ and $g_2$ • Is based on the idea of editing $g_1$ into $g_2$ • Common edit operations are deletion, insertion and substitution of nodes and edges • Can be used with a cost function

  4. Computational Procedure • Bunke, H., Allermann, G.: Inexact graph matching for structural pattern recognition, PRL 1, 245–253, 1983

  5. Approximating the GED by an Assignment Procedure • Given are two sets, $X=\{x_1,\dots,x_n\}$ and $Y=\{y_1,\dots,y_n\}$, together with a cost function $c_{ij}$ • We want to find a one-to-one mapping $f$ that minimizes the cost $\sum_i c_{i f(i)}$ • The problem was originally studied in the context of Operations Research (assignment of workers to jobs) • Many algorithms exist, typically with $O(n^3)$ complexity (Hungarian, Munkres, Volgenant/Jonker, …)
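As a minimal sketch of the assignment problem on this slide, the following Python snippet finds the one-to-one mapping $f$ that minimizes $\sum_i c_{i f(i)}$ by exhaustive search. It is for illustration only: it takes $O(n!)$ time rather than the $O(n^3)$ of the Hungarian-style algorithms named above. The function name `solve_assignment` and the sample cost matrix are assumptions, not taken from the slides.

```python
from itertools import permutations

def solve_assignment(cost):
    """Brute-force minimum-cost one-to-one mapping for a square cost matrix.

    cost[i][j] is the cost of assigning x_i to y_j (hypothetical input layout).
    Returns (minimal total cost, mapping) where mapping[i] = f(i).
    """
    n = len(cost)
    best_total, best_mapping = float("inf"), None
    for mapping in permutations(range(n)):
        total = sum(cost[i][mapping[i]] for i in range(n))
        if total < best_total:
            best_total, best_mapping = total, mapping
    return best_total, best_mapping

# Illustrative 3 x 3 cost matrix (made-up numbers).
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, mapping = solve_assignment(cost)
print(total, mapping)
```

For realistic sizes a cubic-time solver would replace the exhaustive loop; the objective stays the same.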

  6. The assignment problem has nothing to do with the GED problem • However, GED can be reformulated (simplified) such that it can be approximately solved with an assignment procedure • Different reformulations are possible (only nodes, nodes plus edges) • The procedures that solve the assignment problem are optimal • They are only suboptimal w.r.t. the GED problem, but they run in cubic time and give good approximations of the true distance • K. Riesen and H. Bunke. Approximate graph edit distance computation by means of bipartite graph matching. Image and Vision Computing, 27(7):950–959, 2009
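One way to picture the "only nodes" reformulation is a square cost matrix of size $(n+m)\times(n+m)$ with blocks for substitutions, deletions and insertions, in the spirit of the bipartite matching approach cited above. The sketch below is written under stated assumptions: nodes carry 2-D coordinates, substitution cost is the Euclidean distance, deletion and insertion share one constant cost `node_cost`, and the assignment is solved by brute force instead of a cubic-time algorithm. All names are hypothetical.

```python
from itertools import permutations
from math import dist, inf

def build_cost_matrix(g1, g2, node_cost):
    """Square (n+m) x (n+m) matrix: substitutions, deletions, insertions."""
    n, m = len(g1), len(g2)
    size = n + m
    C = [[0.0] * size for _ in range(size)]  # lower-right block stays 0
    for i in range(n):
        for j in range(m):
            C[i][j] = dist(g1[i], g2[j])                # substitution u_i -> v_j
        for j in range(n):
            C[i][m + j] = node_cost if i == j else inf  # deletion of u_i
    for i in range(m):
        for j in range(m):
            C[n + i][j] = node_cost if i == j else inf  # insertion of v_j
    return C

def approx_ged(g1, g2, node_cost):
    """Node-only GED approximation; brute force, so only for tiny graphs."""
    C = build_cost_matrix(g1, g2, node_cost)
    size = len(C)
    return min(
        sum(C[i][p[i]] for i in range(size))
        for p in permutations(range(size))
    )

g1 = [(0, 0), (3, 4)]   # node coordinates of a made-up graph
g2 = [(0, 1)]
print(approx_ged(g1, g2, node_cost=2.0))
```

In this toy case the cheapest solution substitutes (0, 0) by (0, 1) and deletes (3, 4).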

  7. Hausdorff Distance (1) • A well-known distance measure between sets of points in a metric space • Often used in image processing as a distance between sets of points in the 2-D plane, or in 3-D space; see, for example, Huttenlocher, D.P., Klanderman, G.A., Rucklidge, W.J.: Comparing images using the Hausdorff distance, PAMI 15, 850–863, 1993

  8. Given sets $A$ and $B$, and a distance metric $d(a,b)$: $H(A,B)=\max\left(\max_{a\in A}\min_{b\in B} d(a,b),\ \max_{b\in B}\min_{a\in A} d(a,b)\right)$ • Computational complexity is $O(nm)$, where $|A|=n$ and $|B|=m$
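The definition above translates almost literally into code. The following is a minimal sketch, assuming points are coordinate tuples and $d$ is the Euclidean distance; the function name `hausdorff` and the sample point sets are illustrative only. Both nested loops visit every pair once, which gives the $O(nm)$ cost stated on the slide.

```python
from math import dist

def hausdorff(A, B, d=dist):
    """H(A,B) = max( max_a min_b d(a,b), max_b min_a d(a,b) )."""
    forward = max(min(d(a, b) for b in B) for a in A)
    backward = max(min(d(a, b) for a in A) for b in B)
    return max(forward, backward)

A = [(0, 0), (1, 0)]
B = [(0, 0), (5, 0)]
print(hausdorff(A, B))  # the far point (5, 0) alone determines the result
```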

  9. Hausdorff Distance (2) • Because of the max-operator, the H-distance is sensitive to outliers in the data • There are various possibilities to overcome this problem: delete top-n, average, median, … • In the following: replace the max-operator by summation (equivalent to averaging) • $H'(A,B)=\sum_{a\in A}\min_{b\in B} d(a,b)+\sum_{b\in B}\min_{a\in A} d(a,b)$
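A minimal sketch of the summation variant $H'$, again assuming coordinate tuples and Euclidean distance (the name `hausdorff_sum` is hypothetical). Every point contributes its nearest-neighbor distance, so a single far point no longer determines the whole value.

```python
from math import dist

def hausdorff_sum(A, B, d=dist):
    """H'(A,B) = sum_a min_b d(a,b) + sum_b min_a d(a,b)."""
    forward = sum(min(d(a, b) for b in B) for a in A)
    backward = sum(min(d(a, b) for a in A) for b in B)
    return forward + backward

A = [(0, 0), (1, 0)]
B = [(0, 0), (5, 0)]
print(hausdorff_sum(A, B))  # 0 + 1 from A's side, 0 + 4 from B's side
```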

  11. Approximating Graph Edit Distance with H-Distance • Sets $A$ and $B$ correspond to the sets of nodes of graphs $g_1$ and $g_2$ • Distance $d(a,b)$ between $a\in A$ and $b\in B$ is given by the node substitution cost • In the present case, it is the Euclidean distance of the node attribute vectors $(x,y)_u$ and $(x,y)_v$ of nodes $u\in g_1$ and $v\in g_2$: $c(u,v)=\lVert (x,y)_u-(x,y)_v\rVert$ • Result: • $h(g_1,g_2)$, original Hausdorff distance, applied to graphs • $h'(g_1,g_2)$, max-operation replaced by summation • Possible enhancement: include cost of edit operations on the edges adjacent to the considered pair of nodes (similar to assignment approximation)

  12. Additional Enhancement • $h(g_1,g_2)$ and $h'(g_1,g_2)$ force all nodes in both graphs to be matched with each other, i.e. there are only substitutions (possibly with multiple assignments), but no deletions or insertions are allowed • Measure $h''(g_1,g_2)$ also allows deletion and insertion of nodes • It is identical to $h'(g_1,g_2)$, but uses the following cost function: $c''(u,v)=\begin{cases} c(u,v)/2, & \text{if } c(u,v)<c(u,\varepsilon)\\ c(u,\varepsilon), & \text{otherwise}\end{cases}$
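The measure $h''$ can be sketched by plugging the modified cost $c''$ into the summation form. The code below rests on assumptions not stated on the slide: nodes are 2-D coordinate tuples, $c(u,v)$ is the Euclidean distance, and the deletion cost $c(u,\varepsilon)$ and insertion cost are the same constant `eps_cost`. If one graph is empty, each node of the other is charged `eps_cost`, which is also an assumption. The names `c2` and `h2` are hypothetical.

```python
from math import dist

def c2(u, v, eps_cost):
    """c''(u,v): half the substitution cost if cheaper than deletion, else eps_cost."""
    c = dist(u, v)
    return c / 2 if c < eps_cost else eps_cost

def h2(g1, g2, eps_cost):
    """h''(g1,g2): summation-based Hausdorff distance using the cost c''."""
    forward = sum(
        min((c2(u, v, eps_cost) for v in g2), default=eps_cost) for u in g1
    )
    backward = sum(
        min((c2(u, v, eps_cost) for u in g1), default=eps_cost) for v in g2
    )
    return forward + backward

g1 = [(0, 0), (3, 4)]   # made-up node coordinates
g2 = [(0, 1)]
print(h2(g1, g2, eps_cost=2.0))
```

Here (0, 0) and (0, 1) each pay half of their substitution cost of 1, so the matched pair pays it once in total, while (3, 4) is too far from any node of the other graph and is charged the deletion cost of 2.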

  13. Application, Experimental Evaluation and Results: Recognition of Handwritten Historical Text

  14. Conventional Approach

  15. Conventional Features • Based on a sliding window, e.g. features by • Marti et al.: 9 features extracted from a window of 1 pixel width • Vinciarelli et al.: 16 windows of size 4 x 4 pixel; fraction of black pixels in each window; result: 16 features
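To make the second feature set concrete, here is a minimal sketch of "fraction of black pixels per 4 x 4 window". It assumes that the 16 windows tile a 16 x 16 binary frame in a 4 x 4 grid and that black pixels are encoded as 1; the slide does not specify the arrangement, and the name `window_features` is hypothetical.

```python
def window_features(frame, cell=4):
    """Fraction of black pixels (value 1) in each cell x cell window, row by row."""
    rows, cols = len(frame), len(frame[0])
    features = []
    for r in range(0, rows, cell):
        for c in range(0, cols, cell):
            black = sum(
                frame[i][j]
                for i in range(r, r + cell)
                for j in range(c, c + cell)
            )
            features.append(black / (cell * cell))
    return features

# Made-up 16 x 16 frame: black block covering rows 0-3 and columns 0-5.
frame = [[1 if (r < 4 and c < 6) else 0 for c in range(16)] for r in range(16)]
features = window_features(frame)
print(len(features), features[:3])
```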

  16. Potential problem with conventional approach: • Two-dimensional shape of characters is not adequately modeled; no structural relations • Possible solution: • Use skeletons to represent the handwriting by a graph • Transform the graph of a handwritten text into a sequence of feature vectors • Apply HMMs or RNN to sequence of feature vectors

  17. Graph Extraction • Apply a thinning operator to generate the skeleton of the image • Nodes: • Key points: crossings, junctions, end points, left-most points of circular arcs • Secondary points: equidistant points on the skeleton between key points; distance d is a parameter • Edges: • Nodes that are neighbors on the skeleton are connected by edges • However, in the experiments it turned out that the performance without edges is comparable to that with edges if parameter d is chosen appropriately; therefore, no edges were used
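The placement of secondary points can be sketched as resampling a skeleton path at arc-length steps of `d`. This is only an illustration under assumptions: the path between two key points is given as an ordered list of coordinates, the key points themselves are not returned, and a point that would fall exactly on the end key point is skipped (subject to floating-point rounding). The function name `secondary_points` is hypothetical.

```python
from math import dist

def secondary_points(path, d):
    """Points at arc-length multiples of d along an ordered skeleton path."""
    points = []
    travelled = 0.0  # arc length since the last placed point (or the start)
    for p, q in zip(path, path[1:]):
        seg = dist(p, q)
        # Place points while the remaining part of this segment reaches past d.
        while seg > 0 and travelled + seg > d:
            t = (d - travelled) / seg
            p = (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
            points.append(p)
            seg = dist(p, q)
            travelled = 0.0
        travelled += seg
    return points

straight = secondary_points([(0, 0), (10, 0)], d=3)
corner = secondary_points([(0, 0), (2, 0), (2, 2)], d=3)
print(straight, corner)
```

A larger `d` gives fewer nodes per word graph, which is the parameter the slide refers to.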

  18. General Idea of Graph Based Approach

  19. Experiments: Motivation and Aim • Typical graph size is about 30 nodes • The approximate GED using an assignment algorithm is still slow • Questions to be answered in the experiments: • How much speed-up do we gain with the H-distance based approach? • How much recognition accuracy do we lose?

  20. Experimental Setup • Data: Parzival dataset http://www.iam.unibe.ch/fki/databases/iam-historical-document-database • 13th century manuscript written in Old German • Segmented into individual words • 11,743 word instances (images) • 3,177 word classes • 79 character prototypes • Distance measures $h$, $h'$, and $h''$ were normalized • Division of the database into training, validation, and test sets

  21. Experimental Results • Word recognition rate on test set • $h$, $h'$, $h''$ as introduced before; $s$ based on assignment procedure • Computational speed (Java implementation) • Median graph size: 30 nodes • Median number of graph matchings per word: 6162 • Run time in seconds

  22. Conclusions • GED is a powerful concept but is, in its original form, too slow for most applications • Various faster approximations of GED have been proposed • In this talk, a new approximate version with quadratic complexity is proposed, based on Hausdorff distance • It was practically evaluated in the context of a handwriting recognition task and has shown good results • Nevertheless, more experiments are needed, especially with other graph datasets (other attributes) and larger graphs; it would be interesting to compare the new distances "more directly" with distances obtained from other approximate methods

  23. Acknowledgments • HISDOC consortium: R. Ingold, J. Savoy, M. Bächler, N. Naji (collaborators in historical handwriting recognition project) • SNF (financial support for HISDOC) • SNF (postdoc stipend for AF)
