
Graphics Recognition – from Re-engineering to Retrieval

Graphics Recognition – from Re-engineering to Retrieval. Karl Tombre, Bart Lamiroy LORIA, France. Document Analysis in the IR era. Information is at the core of industrial strategies A lot of digital or digitized information, but often in very “poor” formats




Presentation Transcript


  1. Graphics Recognition – from Re-engineering to Retrieval Karl Tombre, Bart Lamiroy LORIA, France

  2. Document Analysis in the IR era • Information is at the core of industrial strategies • A lot of digital or digitized information, but often in very “poor” formats • The challenge: not necessarily re-engineering documents, but enriching poorly structured information, adding a (limited) amount of semantics, and building indexes • Purposes: browsing, navigation, indexing • DAR methods and tools are useful, but must be adapted

  3. Specific challenges of large-scale IR applications • Genericity: we cannot necessarily build a complete and exhaustive a priori model of contextual knowledge (ontology) • Adaptability: various input data – scanned paper, PDF, DXF, HTML, GIF… – various resolutions • Robustness: “back-office” applications • Efficiency: online searching in heterogeneous data • Scaling: methods have to scale to increasing number of symbols/features

  4. DAR and IR • Media without (or with very little) contextual knowledge • Image-based indexing and retrieval, indexing of video sequences • Documents do explicitly convey information from one person to another person • Much more structure, syntax and semantics

  5. DAR and IR – some examples • Indexing and/or searching scanned text without OCR • Similarities, signatures • Query or index on layout structure • Table spotting • Keyword spotting • …

  6. What about Graphics Recognition? • Subfield of DAR, for graphics-rich documents • Numerous methods for various analysis and recognition problems • Raster-to-vector conversion • Text/graphics separation • Symbol recognition • Many specific technical areas: maps, architectural drawings, engineering drawings, diagrams and schematics, …

  7. Graphics recognition methods • Text/graphics separation

  8. Graphics recognition methods • Vectorization

  9. Graphics recognition and IR applications • Usual text-based indexing and retrieval still useful • But need for access to other kinds of information: • Symbols • Text-drawing connections • Description-illustration connections

  10. Some contributions • Syeda-Mahmood – maintenance drawings IEEE Trans. on PAMI 21(8):737-751, Aug. 1999

  11. Some contributions • Arias et al., Najman et al. – use of information contained in legend / title block Proc. GREC’01, Kingston (Ontario, Canada), pp. 19-26, Sept. 2001

  12. Some contributions • Samet & Soffer – symbols from legend IEEE Trans. on PAMI 18(8):783-798, Aug. 1996

  13. Some contributions • Müller & Rigoll – graphical retrieval in database of engineering drawings Proc. ICDAR’99, Bangalore (India), pp. 697-700, Sept. 1999

  14. Some contributions • Boose et al. (Boeing) – Generation of Layered Illustrated Parts Drawings (GREC’03) Proc. GREC’03, Barcelona, pp. 139-144

  15. Symbol DB Or even better… Wishful thinking?

  16. Symbol recognition • Before we move on: 1st contest on symbol recognition held last week (see IAPR TC10 homepage for further details) • Natural features for indexing and retrieval • Most methods work with known databases of reference symbols – what about interactive querying of arbitrary symbols? • From segmentation followed by recognition, to segmentation-free recognition, or segmenting while recognizing • Scalability • Efficiency / complexity • Discrimination power • Signatures

  17. Image-based signatures • Compute invariant signatures on binary document image • F-signatures (ICDAR’01) • Radon transform: R-signatures [Tabbone & Wendling] • Ridgelets [Ramos Terrades & Valveny – GREC’03] – aka wavelet transform of Radon transform
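
A minimal sketch of an R-signature on a binary image, assuming the commonly cited definition in which the squared Radon transform is summed over the offset ρ for each angle θ. The discrete projection used here (rounding ρ to integer bins) and the names `r_signature` and `n_angles` are illustrative assumptions, not the implementation behind the slides.

```python
import numpy as np

def r_signature(image, n_angles=180):
    """Return one value per angle: the sum over rho of the squared projection.

    Assumption: the Radon transform is approximated by counting foreground
    pixels whose offset rho = x*cos(theta) + y*sin(theta) rounds to the
    same integer bin.
    """
    ys, xs = np.nonzero(image)              # coordinates of foreground pixels
    signature = np.zeros(n_angles)
    if xs.size == 0:                        # empty image: all-zero signature
        return signature
    thetas = np.arange(n_angles) * np.pi / n_angles
    for i, theta in enumerate(thetas):
        rho = xs * np.cos(theta) + ys * np.sin(theta)
        bins = np.rint(rho).astype(int)
        bins -= bins.min()                  # shift so that bin indices start at 0
        projection = np.bincount(bins).astype(float)
        signature[i] = np.sum(projection ** 2)
    return signature
```

Because only the distribution of offsets matters, translating the shape should leave each value unchanged up to discretization effects, which is what makes such a signature attractive for indexing.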

  18. R-signatures Detection of arrowheads [Girardeau & Tabbone] DEA degree thesis, INPL, Nancy, Jul. 2002

  19. R-signatures Another example [Girardeau & Tabbone]

  20. Ridgelets [Ramos Terrades & Valveny – GREC’03] Proc. GREC’03, Barcelona, pp. 202-211

  21. Vector-based signatures [Dosch & Lladós – GREC’03] • Based on set of basic graphical features: • Parallelism • Overlap • Collinearity • T- and V-junctions • Quality factor associated with the various relations • Match signatures of reference symbols with signatures of buckets
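
As an illustration only, the idea of describing a group of vectors by counts of basic relations can be sketched as follows. This sketch covers just parallelism and collinearity, omits the quality factors and the bucket decomposition mentioned on the slide, and assumes non-degenerate segments. The function names and tolerances (`vector_signature`, `angle_tol`, `dist_tol`) are hypothetical, not taken from Dosch & Lladós.

```python
import math
from itertools import combinations

def direction(seg):
    """Undirected orientation of a segment, in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi

def angle_between(a, b):
    """Smallest angle between the supporting lines of two segments."""
    d = abs(direction(a) - direction(b))
    return min(d, math.pi - d)

def point_line_distance(p, seg):
    """Distance from point p to the infinite line through seg."""
    (x1, y1), (x2, y2) = seg
    length = math.hypot(x2 - x1, y2 - y1)
    return abs((x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)) / length

def vector_signature(segments, angle_tol=math.radians(5), dist_tol=1.0):
    """Count parallel and collinear pairs among the given segments."""
    sig = {"parallel": 0, "collinear": 0}
    for a, b in combinations(segments, 2):
        if angle_between(a, b) > angle_tol:
            continue                        # neither parallel nor collinear
        if max(point_line_distance(p, a) for p in b) <= dist_tol:
            sig["collinear"] += 1           # same supporting line
        else:
            sig["parallel"] += 1            # same direction, different line
    return sig
```

Two such dictionaries, one for a reference symbol and one for a region of the drawing, could then be compared with any simple distance to shortlist candidate regions.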

  22. Vector-based signatures Proc. GREC’03, Barcelona, pp. 159-169

  23. Towards symbol spotting • Pre-compute – or compute on the spot – a set of basic signatures • Can be sufficient for symbol spotting and retrieval • Followed by classical symbol recognition if more discrimination is needed

  24. Symbol spotting • [Jabari & Tabbone]: graph matching through probabilistic relaxation, with nodes=segments and edges=relations DEA degree thesis, INPL, Nancy, Jul. 2003

  25. Symbol spotting • [Jabari & Tabbone]: another example

  26. Combining Text and Graphics • Extracting Text/Graphics relationships within document • Using Text matching for inter-document relationships • Transitive inter-document Graphics matching • No need for complex graphics matching • Restricted to well known document types

  27. Example: continuation of Wiring Diagrams (Boeing) • [Baum et al. – GREC’03] Proc. GREC’03, Barcelona, pp. 132-138

  28. Scan2XML Example Proc. GREC’01, Kingston (Ontario, Canada), pp. 312-325

  29. Indexing and Semantics • Signature + metric • Semantics = measured distance to signature • Applies only to homogeneous contexts • Pre-segmented images • Pre-determined image classes • Implicit application of domain knowledge • ... • Semantics = Syntax

  30. Example • Signature type A • Metric M • Signature value l • Semantics1 = (s1, m1) • Semantics2 = (s2, m2) • M(l,s1) < m1 ? M(l,s2) < m2 ? • Semantics = measurement to reference value
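
The test M(l,s1) < m1 on this slide can be sketched as a lookup against reference values with thresholds. The function name `assign_semantics`, the scalar signatures, and the absolute-difference metric in the usage note are assumptions made for illustration.

```python
def assign_semantics(value, references, metric):
    """Return every label whose reference lies within its threshold.

    references maps a label to a (reference_value, threshold) pair,
    mirroring the (s, m) pairs of the slide; metric plays the role of M.
    """
    return [
        label
        for label, (reference, threshold) in references.items()
        if metric(value, reference) < threshold
    ]
```

For example, with `references = {"Semantics1": (1.0, 0.5), "Semantics2": (4.0, 0.5)}` and `metric = lambda a, b: abs(a - b)`, a signature value of 1.2 is assigned to `Semantics1` only, and a value of 2.5 is assigned to nothing.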

  31. Heterogeneous Document Bases • Semantics do not have a unique syntax anymore • Syntax metrics may be context sensitive • Semantics = Syntax + Context • Context needs to be considered

  32. Two different contexts from the automobile industry

  33. Example • Context 1: Signature type A, Metric M • Context 2: Signature type B, Metric N • Signature value l • (s1, m1) = Semantics1 = (t1, n1) • (s2, m2) = Semantics2 = (t2, n2) • What if M(l,s1) < m1 and N(l,t2) < n2 ?
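
The conflict raised on this slide can be made concrete with a small sketch in which each context carries its own metric and its own reference values. The structure below (`interpret`, a dictionary of contexts, scalar signature values) is a hypothetical illustration, not part of the presentation.

```python
def interpret(value, contexts):
    """Return (context, label) pairs for every threshold test that succeeds.

    contexts maps a context name to (metric, references), where references
    maps a label to a (reference_value, threshold) pair.
    """
    hits = []
    for context_name, (metric, references) in contexts.items():
        for label, (reference, threshold) in references.items():
            if metric(value, reference) < threshold:
                hits.append((context_name, label))
    return hits

# Hypothetical contexts: absolute difference (M) versus relative difference (N).
contexts = {
    "context1": (lambda a, b: abs(a - b),
                 {"Semantics1": (1.0, 0.5), "Semantics2": (4.0, 0.5)}),
    "context2": (lambda a, b: abs(a - b) / b,
                 {"Semantics1": (10.0, 0.1), "Semantics2": (1.5, 0.3)}),
}
```

Here `interpret(1.2, contexts)` reports `Semantics1` under `context1` and `Semantics2` under `context2`: the same measurement supports two different readings, so the context has to be known before the measurement can be interpreted.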

  34. A step towards taking context into account (while consolidating existing approaches) • Component Algebra: • Image Analysis = Pipeline • Syntax + algorithm = semantics • Syntax and semantics need not be distinguished • [Diagram: Data (syntax) → Algorithm → Data (semantics) → Algorithm → Data (semantics)]

  35. Component Algebra • Components: known and implemented document analysis algorithms, taking input data from one domain and producing data in another domain. • Application Context: set of all available Components. • Semantics: data sets needed by or produced by Components.

  36. Component Algebra is a Graph • [Diagram: Data nodes connected to one another through Component nodes]
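
One way to picture the graph view is to treat data domains as nodes and components as directed edges, then search for a chain of components leading from the available data to the desired data. The sketch below is an assumption-laden illustration: `Component`, `find_pipeline`, and the domain and component names in the usage note are all hypothetical.

```python
from collections import deque
from typing import NamedTuple

class Component(NamedTuple):
    name: str      # hypothetical algorithm name
    source: str    # data domain consumed
    target: str    # data domain produced

def find_pipeline(components, start, goal):
    """Breadth-first search for a shortest chain of components.

    Returns the list of component names leading from the start domain to
    the goal domain, or None if the available components cannot reach it.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        domain, path = queue.popleft()
        if domain == goal:
            return path
        for component in components:
            if component.source == domain and component.target not in seen:
                seen.add(component.target)
                queue.append((component.target, path + [component.name]))
    return None
```

With components such as `Component("binarize", "raster", "binary")`, `Component("vectorize", "binary", "vectors")` and `Component("spot_symbols", "vectors", "symbols")`, the call `find_pipeline(components, "raster", "symbols")` returns the three names in order. A different application context, meaning a different set of components, would yield a different path or none at all.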

  37. Advantages • Each node is a semantic concept, semantic relationships are explicitly expressed. • Structure may support automatic reasoning and knowledge inference. • Context is embedded in components, different contexts give different paths in the graph. • Highly scalable and open architecture. • Bridge between signal-level document analysis and high-level document representation.

  38. However ... The formalism exists, the realization doesn't (yet) • What about parametrization? • How context-independent can you get? • What about “guessing” context appropriateness? • How to design fully interoperable components?

  39. Conclusion • A lot of DA methods – and more specifically GR methods – can be of direct use in IR, indexing and browsing applications • Specific challenges • Scaling and efficiency • Heterogeneous sets of documents • Incomplete domain knowledge • Symbol spotting • On-the-fly symbol searching • Sketch of open framework for including document semantics when context can be heterogeneous
