Topics for Today

Presentation Transcript

  1. Topics for Today • General Audio • Speech • Music • Presentation of MusicWiz project

  2. General Audio • Mapping audio cues to events • Recognizing sounds related to particular events (e.g. gunshot, falling, scream) • Mapping events to audio cues • Audio debugger to speed up stepping through code • Spatialized audio • Provides additional geographic/navigational channel • Example: Michael Joyce’s Interactive Central Park

  3. Spatialized Audio • Spatialized audio is easier when assuming headphones because the signal reaching each ear can be controlled • Head-related transfer function (HRTF) • Differences in timing and signal strength determine how we identify the position of a sound • Beamforming • Timing for constructive interference to create a stronger signal at the desired location • Crosstalk Cancellation • Destructive interference to remove parts of the signal at the desired location
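  The beamforming bullet above can be illustrated with a minimal sketch of the timing step only. It assumes ideal point-source speakers in free air and a fixed speed of sound; the function name `steering_delays` and the 2-D coordinates are hypothetical and not taken from the slides.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 °C (assumed)

def steering_delays(speakers, target):
    """Return one emission delay (seconds) per speaker so that every
    wavefront reaches `target` at the same instant and adds constructively."""
    distances = [math.dist(speaker, target) for speaker in speakers]
    farthest = max(distances)
    # The farthest speaker fires first (delay 0); nearer ones wait.
    return [(farthest - d) / SPEED_OF_SOUND for d in distances]

# Usage: two speakers on the x axis, listener position at (3, 0).
speakers = [(0.0, 0.0), (2.0, 0.0)]
target = (3.0, 0.0)
for position, delay in zip(speakers, steering_delays(speakers, target)):
    print(position, round(delay * 1000, 3), "ms")
```

  Crosstalk cancellation relies on the same timing idea, but with an inverted signal so that the interference at the chosen location is destructive.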

  4. Audio Signal Analysis • Fast Fourier Transform (FFT) and Discrete Wavelet Transform (DWT) • Transforms commonly used on audio signals • Allow for analysis of frequency features across time (e.g. power contained in a frequency interval) • FFTs have equal-sized windows, whereas wavelet windows can vary based on frequency • Mel-frequency cepstral coefficients (MFCC) • Based on FFTs • Map results into bands approximating the human auditory system
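  As a rough illustration of "power contained in a frequency interval", the sketch below computes a plain discrete Fourier transform of one window and sums the squared magnitudes of the bins inside a band. An FFT produces the same spectrum faster; the direct O(n²) form is used here only to stay dependency-free. The names `dft` and `band_power` are hypothetical.

```python
import cmath
import math

def dft(samples):
    """Direct discrete Fourier transform of one window of samples."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]

def band_power(samples, sample_rate, lo_hz, hi_hz):
    """Sum of squared bin magnitudes for frequencies in [lo_hz, hi_hz].
    Only the non-negative frequency bins (0 .. n/2) are considered."""
    n = len(samples)
    spectrum = dft(samples)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n  # centre frequency of bin k
        if lo_hz <= freq <= hi_hz:
            power += abs(spectrum[k]) ** 2
    return power

# Usage: a pure 8 Hz tone sampled at 64 Hz for one second.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
print(band_power(tone, 64, 6, 10))   # band containing the tone
print(band_power(tone, 64, 20, 30))  # band away from the tone
```

  Repeating this over successive windows gives the "across time" view mentioned above; MFCCs go further by pooling the bins into mel-spaced bands.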

  5. Echology • An interactive soundscape combining human collaboration with aquarium activity • Engage visitors to spend more time with (and learn more about) Beluga whales • Spatialized sound based on whale activity and human interaction

  6. Echology Architecture

  7. Speech • Speaker segmentation • Identify when a change in speaker occurs • Useful for basic indexing or summarization of speech content • Speaker identification • Identify who is speaking during a segment • Enables search (and other features) based on speaker • Speech recognition • Identify the content of speech

  8. Speech Recognition • Start by segmenting utterances and characterizing phonemes • Use gaps to segment • Group segments into words • Limited vocabulary of commands • Classifiers for limited vocabulary (HMMs) • Continuous speech • Language models for disambiguation • Speaker dependent or not
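  The "use gaps to segment" step can be sketched as a simple energy threshold over fixed-size frames. This is only one common way to find pauses, not necessarily the method the slides have in mind; `segment_by_gaps`, the threshold and the minimum gap length are all assumptions chosen for illustration.

```python
def segment_by_gaps(samples, frame_len, threshold, min_gap_frames):
    """Split a signal at quiet gaps.

    Returns (start, end) sample-index pairs (end exclusive). A segment ends
    once at least `min_gap_frames` consecutive frames have mean energy at or
    below `threshold`; shorter pauses stay inside the segment."""
    n_frames = len(samples) // frame_len
    loud = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        loud.append(energy > threshold)

    segments = []
    start = None    # first frame of the segment being built
    quiet_run = 0   # consecutive quiet frames seen inside that segment
    for i, is_loud in enumerate(loud):
        if is_loud:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_gap_frames:
                end = i - quiet_run + 1  # index of the first quiet frame
                segments.append((start * frame_len, end * frame_len))
                start, quiet_run = None, 0
    if start is not None:  # signal ended while a segment was still open
        segments.append((start * frame_len, (n_frames - quiet_run) * frame_len))
    return segments

# Usage: two bursts of sound separated by a long pause.
signal = [0] * 4 + [1] * 4 + [0] * 8 + [1] * 4 + [0] * 2
print(segment_by_gaps(signal, frame_len=2, threshold=0.5, min_gap_frames=2))
```

  The resulting segments would then be passed to the phoneme and word-level stages listed above.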

  9. Music • Music processing can support a variety of activities • Composition • From traditional to interactive • Selection • Example: iTunes, Pandora • Use for shared spaces • Playback • Example: MobiLenin • Management & Summarization • Example: MusicWiz • Games • Guitar Hero, Rock Band, etc.

  10. MobiLenin • Enable interaction with music in a public space • Not karaoke • Voting like in many pub/bar games • Audience can affect which version of music and video is shown

  11. Lessons • Gave a focal point for interaction between members of a group • Content variety is necessary for continued engagement • Lottery for free beer motivated participation

  12. Music Summarization • Most summaries on commercial sites are either the first phrase or a single selected musical phrase • Study of whether 22-second multi-phrase music summaries would be better previews • Three algorithms vary the selection of the components between phrases that are sonically distinct and phrases that are repeated more often • A comparative evaluation study showed that: • Multi-phrase previews were selected in 87% of the cases over the preview representing the first 22 seconds of the song • 90% of the chosen summaries were rated as at least a good representation of the song

  13. Managing Personal Music Collections • Music management is mainly based on: • explicit attributes (e.g. metadata values like the artist, the composer and the genre) • explicit feedback (e.g. ratings of preference and relevance) • Benefits • Easy to understand • Formal: consistent updating and access • Context-free • Question • How can music be accessed based on the feelings or memories it triggers?

  14. Current Practices • Common metadata tags are usually not sufficient to describe mood, feelings, memories and complex concepts • Effort/benefit trade-off issues • Personal reactions to music change • Explicit feedback and usage statistics are helpful in retrieving music of preference • Question • How would people organize music if there were a low-effort way of expressing their personalized interpretation of music?

  15. Preliminary Study • 12 participants asked to organize songs & create playlists using spatial hypertext • In spatial hypertext, information has visual attributes & spatial layout that can be changed to express associations • The majority found spatial hypertext helpful in organizing • Participants appreciated: • expressive power and freedom of the workspace • directly accessible metadata information of music • music previews for remembering music • Participants missed: • interactive hierarchical / tree views • music previews for understanding music

  16. Preliminary Study • Organization using categories & subcategories with labels

  17. Music Access & Implicit Attributes • Considerable research into extracting and using implicit cues for associating music to overcome: • limitations of metadata & statistics to describe music concepts • unwillingness of users to provide explicit feedback • cost of employing human experts to find music similarity • Music Management extended by: • signal features (e.g. intensity, timbre and rhythm) • collaborative filtering • interaction • e.g. Last.fm, Genius, Music Gathering Application, Flytrap, Musicovery, MusicSim, Musicream

  18. MusicWiz Architecture • Music management environment that combines: • explicit information • implicit information • non-verbal expression of personal interpretation • Two basic components: • interface for interacting with the music collection • inference engine for assessing music relatedness • [Architecture diagram labels: MusicWiz Interface; Songs; Related Song Titles; Workspace Status; Music Collection; Songs & Metadata; Artist Module; Audio Signal Module; Metadata Module; Lyrics Module; Workspace Expression Module; Relatedness Assessment; Relatedness Table; Inference Engine; Similarity Values; Statistics of Artist Similarity; Lyrics; Internet]

  19. The MusicWiz Interface • [Screenshot labels: Hierarchical Folder Tree View; Workspace; Playlist Pane; Related Songs & Search Results View; Playback Controls]

  20. MusicWiz Inference Engine • 5 modules for extracting, processing and comparing artists, metadata, audio content, lyrics, and workspace expression • Overall Similarity(S1, S2) = W1 * Overall Metadata Similarity(S1, S2) + W2 * Overall Audio Signal Similarity(S1, S2) + W3 * Overall Lyrics Similarity(S1, S2) + W4 * Overall Workspace Expression Similarity(S1, S2), where S1, S2 are the songs under comparison and Wn, n = 1..4, are the user-adjusted weights of the specialized similarity assessments
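  The weighted sum on this slide can be written out directly. The sketch below assumes each module already yields a similarity score for the pair of songs; the dictionary keys, the example scores and the choice of weights that sum to 1 are illustrative assumptions, not values from MusicWiz.

```python
def overall_similarity(scores, weights):
    """Weighted sum of the per-module similarity scores for one song pair.

    `scores` and `weights` are dicts keyed by module name; the four names
    used below mirror the four terms of the formula."""
    return sum(weights[name] * scores[name] for name in weights)

# Usage with made-up scores for one pair (S1, S2).
scores = {"metadata": 0.5, "audio": 1.0, "lyrics": 0.0, "workspace": 0.5}
weights = {"metadata": 0.25, "audio": 0.25, "lyrics": 0.25, "workspace": 0.25}
print(overall_similarity(scores, weights))
```

  Raising one weight, for example `weights["lyrics"]`, makes that module's judgment count for more in the combined score, which is what the user-adjusted weights allow.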

  21. MusicWiz Inference Engine – Artist Module • Assesses relatedness in music using online resources: • human evaluations of artist similarity from: • Similar Artists lists of the All Music Guide website • co-occurrence of artists in playlists from: • OpenNap file-sharing network • Art of the Mix website

  22. MusicWiz Inference Engine – Metadata Module • Evaluates the pairwise similarity of the metadata values of all songs • String comparison is applied to the title, genre, album name, and year of the songs as well as the file-system path where they are stored • Uses a distance metric that combines the Soundex and the Monge-Elkan algorithms
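  The slide does not say how Soundex and Monge-Elkan are combined. One plausible reading, sketched below purely as an assumption, is to use Soundex-code equality as the inner token similarity inside Monge-Elkan. The function names are hypothetical, and the Soundex here is the standard American variant.

```python
# Letter groups of American Soundex mapped to their digit.
_CODES = {}
for _digit, _letters in (("1", "bfpv"), ("2", "cgjkqsxz"), ("3", "dt"),
                         ("4", "l"), ("5", "mn"), ("6", "r")):
    for _ch in _letters:
        _CODES[_ch] = _digit

def soundex(word):
    """Four-character Soundex code, e.g. 'Robert' and 'Rupert' both give R163."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":  # h and w do not separate equal codes
            prev = digit
    return (code + "000")[:4]

def phonetic_match(a, b):
    """Inner similarity (assumed): 1.0 if the two tokens sound alike, else 0.0."""
    return 1.0 if soundex(a) == soundex(b) else 0.0

def monge_elkan(a, b, inner=phonetic_match):
    """Mean, over tokens of `a`, of the best inner similarity to any token of `b`.
    Note that the measure is not symmetric in its arguments."""
    tokens_a, tokens_b = a.split(), b.split()
    if not tokens_a or not tokens_b:
        return 0.0
    return sum(max(inner(x, y) for y in tokens_b) for x in tokens_a) / len(tokens_a)

# Usage: tolerant comparison of two differently spelled metadata values.
print(monge_elkan("Jon Smith", "John Smyth Band"))
print(monge_elkan("John Smyth Band", "Jon Smith"))
```

  A metric of this kind tolerates spelling variants in tags such as titles or album names, which exact string comparison would treat as unrelated.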

  23. MusicWiz Inference Engine – Audio Signal Module • Uses signal processing techniques to analyze music content • Extracts and compares information about the harmonic structure and acoustic attributes of music • beat, brightness, pitch, starting note and potential key (music scale) of the song

  24. MusicWiz Inference Engine – Lyrics Module • Textually analyzes the lyrics • Lyrics are scraped from a pool of popular websites for: • display in music objects • comparison • Lyrical comparison uses term vector cosine similarity: Overall Lyrics Similarity (S1, S2)= cos(θ) • The more words lyrics have in common, the greater the possibility that the songs are motivated by or describe related themes
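  Term-vector cosine similarity can be sketched in a few lines. The version below uses raw word counts and whitespace tokenization; whether MusicWiz applies stemming, stop-word removal or TF-IDF weighting is not stated on the slide, so those choices are assumptions, as is the name `lyrics_similarity`.

```python
import math
from collections import Counter

def lyrics_similarity(lyrics_a, lyrics_b):
    """Cosine of the angle between the word-count vectors of two lyrics."""
    vec_a = Counter(lyrics_a.lower().split())
    vec_b = Counter(lyrics_b.lower().split())
    # Only words present in both lyrics contribute to the dot product.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one lyric is empty
    return dot / (norm_a * norm_b)

# Usage
print(lyrics_similarity("love me do", "love me"))
print(lyrics_similarity("love me do", "yellow submarine"))
```

  Lyrics that share no words score 0, and the score approaches 1 as the word distributions converge, matching the intuition in the last bullet above.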

  25. MusicWiz Inference Engine – Workspace Expression Module • Music objects can be related visually and spatially • Spatial parser identifies relations between the music objects • Recognizes three types of spatial structures: lists, stacks and composites • [Figure labels: Composite; List; Stack]

  26. MusicWiz Functionality • Music collection can be explored by filtering: • attribute values (i.e. ID3 tags, audio signal attributes and lyrics) • similarity values (i.e. overall similarity) • Playlists can be created: • manually: songs can be added from the left-side views & the workspace • automatically: • filter-based mode: selection based on the ID3 tags • similarity-based mode: selection based on the relatedness of songs on the current playlist
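  The similarity-based mode could work along the following lines. The slide gives no detail on how relatedness to the current playlist is aggregated, so ranking candidates by their mean similarity to the songs already chosen is an assumption, and `suggest_songs` and the toy similarity table are hypothetical.

```python
def suggest_songs(playlist, candidates, similarity, count):
    """Return up to `count` candidates ranked by mean similarity to the
    songs already on the playlist (highest first)."""
    if not playlist:
        return []  # nothing to be related to yet

    def relatedness(song):
        return sum(similarity(song, chosen) for chosen in playlist) / len(playlist)

    pool = [song for song in candidates if song not in playlist]
    return sorted(pool, key=relatedness, reverse=True)[:count]

# Usage with a toy symmetric similarity table (made-up values).
table = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.2,
    frozenset(("a", "d")): 0.6,
}

def toy_similarity(x, y):
    return table.get(frozenset((x, y)), 0.0)

print(suggest_songs(["a"], ["a", "b", "c", "d"], toy_similarity, count=2))
```

  In practice the `similarity` argument would be the overall similarity computed by the inference engine.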

  27. MusicWiz Evaluation • 20 participants were asked to: • Task 1: organize 50 rock songs into sub-collections according to their preference • Task 2: form three twenty-minute-long playlists based on three different moods or occasions of their choice • Task 3: form three six-song-long playlists, each related to a provided “seed” song (not from the original collection of fifty)

  28. MusicWiz Evaluation

  29. Task 1 - Organization of Music

  30. Tasks 2 & 3 – Playlist Creation

  31. Topics From Today • General Audio • Audio cues, spatialized audio • Speech • Segmentation, speaker id, recognition • Music • Interactive music, summarization, organization