Chapter 12 Multimedia IR: Indexing and Searching

  1. Chapter 12 Multimedia IR: Indexing and Searching • Date: 11/17/2005

  2. Introduction • Feature extraction, feature indexing, distance, similarity query • Distance • Edit distance: the smallest number of insertions, deletions, and substitutions needed to transform the first string into the second • Euclidean distance: D(x, y) = sqrt(Σi (xi − yi)²) for two sequences x and y of equal length • Similarity query • Whole match: the query and the objects are of the same type • Sub-pattern match: the query is smaller than the objects, e.g. a 16×16 sub-pattern searched for in 512×512 grey-scale images • Nearest neighbors • All pairs
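The two distance functions named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function names `edit_distance` and `euclidean_distance` are chosen here, and the edit distance uses the standard dynamic-programming recurrence with unit cost for each operation.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Smallest number of insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or match for free
        prev = curr
    return prev[-1]


def euclidean_distance(x, y) -> float:
    """Euclidean distance between two equal-length numeric sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


print(edit_distance("kitten", "sitting"))
print(euclidean_distance([0, 0], [3, 4]))
```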

  3. Correctness of query results • False dismissal (a qualifying object is missed): unacceptable • False alarm (a non-qualifying object is returned): can be discarded via post-processing • Spatial access method (SAM): R-tree

  4. A generic indexing approach • The whole match problem • Given objects O1, O2, …, On, a distance function D(Oi, Oj), a query Q, and a tolerance ε, find {Oi | D(Q, Oi) ≤ ε} • Basic idea of the approach • A quick-and-dirty test • Discard non-qualifying objects • Allow false alarms • Use of a SAM
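The filter-then-verify idea can be sketched as follows. Everything here is an illustrative assumption rather than the presentation's own code: the name `range_query`, the toy objects (sets compared by the size of their symmetric difference), and the quick test (difference in set size, which never exceeds the true distance). A linear scan stands in for the SAM range query that a real system would use in the first step.

```python
def range_query(query, objects, eps, distance, feature, feature_distance):
    """Return every object o with distance(query, o) <= eps."""
    q_feat = feature(query)
    # Quick-and-dirty test: cheap comparison in feature space.
    # Objects that fail it are discarded; false alarms may survive.
    candidates = [o for o in objects
                  if feature_distance(q_feat, feature(o)) <= eps]
    # Post-processing: the exact distance removes the false alarms.
    return [o for o in candidates if distance(query, o) <= eps]


def set_distance(a, b):
    """Toy distance: number of elements in exactly one of the two sets."""
    return len(a ^ b)


def size_gap(size_a, size_b):
    """Toy quick test: difference in set sizes, never above set_distance."""
    return abs(size_a - size_b)


objects = [{1, 2, 3}, {1, 2}, {7, 8, 9}, {1, 2, 3, 4, 5, 6}]
result = range_query({1, 2, 3}, objects, eps=1,
                     distance=set_distance, feature=len,
                     feature_distance=size_gap)
print(result)
```

In this toy run `{7, 8, 9}` passes the quick test because it has the same size as the query, and only the exact distance in the second step rejects it.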

  5. An example (yearly stock-price movements) • Average as the quick-and-dirty test • Large difference → cannot be similar • Small difference → possibly similar → possible false alarm • Using f features → fewer false alarms → each object can be mapped to a point in f-dimensional space • No need to test all f-d points?
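A small sketch of the average-based test, assuming equal-length price series compared by Euclidean distance. The function names and the sample data are invented for illustration. The scaling by sqrt(n) is also an assumption added here: by the Cauchy-Schwarz inequality, sqrt(n) times the difference of the averages never exceeds the Euclidean distance, so the test cannot discard a qualifying series.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_filter(query, series_list, eps):
    """Keep series whose scaled average difference is within eps."""
    n = len(query)
    q_mean = mean(query)
    return [s for s in series_list
            if math.sqrt(n) * abs(mean(s) - q_mean) <= eps]


query = [10, 11, 12, 13]
close = [10, 11, 12, 14]      # genuinely similar to the query
far = [20, 21, 22, 23]        # very different average, discarded at once
reversed_q = [13, 12, 11, 10]  # same average, different shape: a false alarm

survivors = mean_filter(query, [close, far, reversed_q], eps=2)
answers = [s for s in survivors if euclidean(query, s) <= 2]
print(survivors)
print(answers)
```

The reversed series shows why one feature is rarely enough: it passes the average test yet lies well outside the tolerance, which is the motivation for using f features instead of one.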

  6. Preservation of distance mapping • Exact preservation • No false alarm, no false dismissal • Difficult to find such features • Dimensionality curse • No false dismissal if Dfeature(F(O1), F(O2)) ≤ D(O1, O2) • With potential false alarms • The lower bounding lemma
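The reasoning behind the lower bounding lemma fits in one line. The notation below follows the slide; writing the tolerance as \varepsilon and the query as Q is an assumption carried over from the whole match problem.

```latex
\begin{align*}
&\text{Assume } D_{\text{feature}}\bigl(F(O_1), F(O_2)\bigr) \le D(O_1, O_2)
  \text{ for all objects } O_1, O_2. \\
&\text{If } D(Q, O) \le \varepsilon, \text{ then }
  D_{\text{feature}}\bigl(F(Q), F(O)\bigr) \le D(Q, O) \le \varepsilon .
\end{align*}
```

Every qualifying object O therefore also qualifies in feature space, so a range query with radius \varepsilon around F(Q) cannot miss it. The converse does not hold, which is where false alarms come from.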

  7. Feature selection • Preserve distance • Carry much information about the corresponding objects to reduce false alarms • Nearest neighbor query • Find the point F(P) that is the nearest neighbor to the query point F(Q) • Issue a range query with Q and ε = D(Q, P)
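The two-step nearest neighbor procedure can be sketched like this, assuming the feature distance never exceeds the actual distance. The function name, the toy set objects, and the linear scans (standing in for SAM searches) are all illustrative assumptions.

```python
def nearest_neighbor(query, objects, distance, feature, feature_distance):
    """Nearest neighbor of query, found through feature space."""
    q_feat = feature(query)
    # Step 1: find P, the object whose feature point is nearest to F(Q).
    p = min(objects, key=lambda o: feature_distance(q_feat, feature(o)))
    # Step 2: the actual distance to P bounds the distance to the true answer.
    eps = distance(query, p)
    # Step 3: range query in feature space with radius eps. Because the
    # feature distance is a lower bound, the true nearest neighbor is included.
    candidates = [o for o in objects
                  if feature_distance(q_feat, feature(o)) <= eps]
    # Step 4: pick the candidate with the smallest actual distance.
    return min(candidates, key=lambda o: distance(query, o))


def set_distance(a, b):
    """Toy distance: number of elements in exactly one of the two sets."""
    return len(a ^ b)


def size_gap(size_a, size_b):
    """Toy feature distance: difference in set sizes."""
    return abs(size_a - size_b)


objects = [{7, 8, 9}, {1, 2, 3, 4}, {1}]
best = nearest_neighbor({1, 2, 3}, objects, set_distance, len, size_gap)
print(best)
```

Here the feature-space nearest neighbor `{7, 8, 9}` is a poor match in actual distance, but the range query issued with ε = D(Q, P) still recovers the true nearest neighbor.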

  8. One-dimensional time series • The first day’s value is a bad feature • 365 values → dimensionality curse • Average is better • Discrete Fourier Transform • For a signal x = [xi], i = 0, 1, …, n-1, let XF denote the DFT coefficient at the F-th frequency, F = 0, 1, …, n-1 • Keeping the first f coefficients of the DFT as the features
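A sketch of DFT-based features in plain Python follows. The slide's own DFT formula did not survive in the transcript, so the 1/sqrt(n) normalization used here is an assumption: with it, the transform preserves Euclidean distance (Parseval's theorem), and the distance over the first f coefficients can only be smaller or equal. The function names and sample series are invented for illustration.

```python
import cmath
import math


def dft(x):
    """DFT of x with 1/sqrt(n) normalization (assumed convention)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * F * i / n) for i in range(n))
            / math.sqrt(n)
            for F in range(n)]


def dft_features(x, f):
    """Feature vector: the first f DFT coefficients of x."""
    return dft(x)[:f]


def feature_distance(a, b):
    """Euclidean distance between two complex feature vectors."""
    return math.sqrt(sum(abs(ca - cb) ** 2 for ca, cb in zip(a, b)))


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


x = [1, 2, 3, 4, 3, 2, 1, 2]
y = [2, 2, 3, 5, 3, 1, 1, 2]
for f in (1, 2, 3):
    d_feat = feature_distance(dft_features(x, f), dft_features(y, f))
    print(f, d_feat, euclidean(x, y))
```

For each f the printed feature distance stays at or below the actual distance, which is the condition needed to rule out false dismissals.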

  9. The fewer the coefficients that contain most of the energy, the fewer the false alarms, and the faster the response time • The dimensionality curse is avoided with the lower bounding lemma and the energy-concentrating property of the DFT • f = 1 to 3
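Energy concentration can be checked numerically. The sketch below measures what fraction of a signal's energy falls in its first f DFT coefficients, using a seeded random walk as a stand-in for a price series. The test signal, the function names, and the 1/sqrt(n) normalization are assumptions made for this illustration, and an all-zero signal is not handled.

```python
import cmath
import math
import random


def dft(x):
    """DFT of x with 1/sqrt(n) normalization (assumed convention)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * F * i / n) for i in range(n))
            / math.sqrt(n)
            for F in range(n)]


def energy_fraction(x, f):
    """Share of the total energy held by the first f DFT coefficients."""
    energy = [abs(c) ** 2 for c in dft(x)]
    return sum(energy[:f]) / sum(energy)


# Hypothetical price-like series: a random walk starting at 100.
random.seed(0)
walk = [100.0]
for _ in range(63):
    walk.append(walk[-1] + random.uniform(-1.0, 1.0))

for f in (1, 2, 3):
    print(f, energy_fraction(walk, f))
```

The higher these fractions are for small f, the closer the feature distance comes to the actual distance, and the fewer false alarms the quick test lets through.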
