
The Mind’s Ear: How the Brain Listens to What the Ear Hears

The Mind’s Ear How the Brain Listens to What the Ear Hears. Shihab Shamma Institute for Systems Research Electrical and Computer Engineering University of Maryland College Park. Auditory Processing. Analyzing sounds in complex reverberant environments requires :.


Presentation Transcript


  1. The Mind’s Ear: How the Brain Listens to What the Ear Hears. Shihab Shamma, Institute for Systems Research, Electrical and Computer Engineering, University of Maryland, College Park

  2. Auditory Processing: Analyzing sounds in complex reverberant environments requires: • Extracting the spectrum of incoming sounds • Estimating the pitch of concurrent sources • Localizing and tracking them accurately • Perceiving and recognizing their timbre

  3. Auditory Scene Analysis Segregating multiple sound sources monaurally

  4. [Figure: spectrogram panels, frequency (Hz) vs. time: Tones; Tone Complex (frequency axis marked at 500, 750, 1000 Hz); Noise; Vowels /a/, /i/, /u/ with formants F1, F2, F3]

  5. Music & Speech Spectrograms [Figure panels, frequency (Hz) vs. time: Violin (vibrato); Piano; speech "right away" with formants F1, F2, F3]

  6. Unnatural Distortions: Normal, Down-Shift, Dilate, Compress

  7. An auditory scene [Figure axes: frequency vs. time]

  8. Two classes of ASA processes: simultaneous processes and sequential processes [Figure axes: frequency vs. time]

  9. Simultaneous ASA Processes • Grouping Concurrent Sounds • (What is it?) The perceptual phenomenon • Harmonicity, Onset • (What’s it good for?) Relationships with other aspects of perception

  10. Harmonicity: Musical Pitch [Figure labels: Residue Pitch, Musical Pitch]

  11. Perceived pitch = Fundamental frequency (regardless!) [Figure panels, frequency (Hz, axis marked from 125 to 4000) vs. time: Full harmonic series; Missing fundamentals]
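
The claim on this slide, that the perceived pitch follows the fundamental even when the fundamental component is absent, can be illustrated numerically. The sketch below is not the method used in the talk: it assumes a plain autocorrelation pitch estimator, and the function names, the 8000 Hz sample rate, the 200 Hz fundamental, and the 80 to 500 Hz search range are all illustrative choices.

```python
import math

def harmonic_complex(f0, harmonics, sample_rate=8000, duration=0.1):
    """Sum of equal-amplitude cosines at the given harmonic numbers of f0."""
    n_samples = int(round(sample_rate * duration))
    return [
        sum(math.cos(2 * math.pi * h * f0 * n / sample_rate) for h in harmonics)
        for n in range(n_samples)
    ]

def autocorr_pitch(signal, sample_rate=8000, f_min=80.0, f_max=500.0):
    """Estimate pitch as sample_rate / lag at the autocorrelation maximum.

    Assumption: a simple time-domain autocorrelation model of pitch,
    chosen only for illustration.
    """
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = int(sample_rate / f_min)  # longest period considered
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag

# Hypothetical stimuli: harmonics 1-4 of 200 Hz, and the same without harmonic 1.
full = harmonic_complex(200.0, [1, 2, 3, 4])
missing = harmonic_complex(200.0, [2, 3, 4])

print(autocorr_pitch(full))
print(autocorr_pitch(missing))
```

Under these settings both signals repeat every 40 samples, so both estimates come out at 200 Hz even though the second signal has no energy at 200 Hz.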

  12. Spectral Grouping or “Fusion” of Harmonics Mistuning a harmonic • Fusion is found in humans and animals alike • Fusion also breaks with onset mismatches

  13. Segregating Harmonic Sets [Figure: two panels, frequency vs. time]
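
One simple way to picture the segregation of harmonic sets, and the breakdown of fusion when a harmonic is mistuned, is a "harmonic sieve" that assigns each partial to the fundamental whose harmonic grid it fits. This is a sketch under stated assumptions, not the model from the talk: the function name, the 3% tolerance, and the example frequencies are hypothetical, and the candidate fundamentals are given rather than estimated.

```python
def harmonic_sieve(partials, f0_candidates, tolerance=0.03):
    """Group partials by the candidate f0 whose nearest harmonic they match.

    A partial joins the f0 with the smallest relative deviation, provided
    the deviation is within `tolerance` (an assumed value); otherwise it is
    left unassigned, like a mistuned harmonic that no longer fuses.
    """
    groups = {f0: [] for f0 in f0_candidates}
    unassigned = []
    for p in partials:
        best_f0, best_dev = None, tolerance
        for f0 in f0_candidates:
            n = round(p / f0)  # nearest harmonic number
            if n < 1:
                continue
            dev = abs(p - n * f0) / (n * f0)  # relative mistuning
            if dev <= best_dev:
                best_f0, best_dev = f0, dev
        if best_f0 is None:
            unassigned.append(p)
        else:
            groups[best_f0].append(p)
    return groups, unassigned

# Hypothetical mixture: harmonics of 200 Hz and 310 Hz plus one stray partial.
mixture = [200, 310, 400, 600, 620, 800, 930, 1060]
groups, unassigned = harmonic_sieve(mixture, [200, 310])
print(groups, unassigned)
```

Here the partials split into the 200 Hz set and the 310 Hz set, while 1060 Hz fits neither grid within the tolerance and is left out of both groups.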

  14. Grouping by Onsets

  15. Sequential ASA processes • Streaming • (What is it?) The perceptual phenomenon • (What’s it good for?) Relationships with other aspects of perception • (How does it come about?) Attention and Plasticity

  16. [Figure: alternating A and B tones separated in frequency by dF, frequency vs. time] Miller & Heise (1950), Bregman & Campbell (1971), … Bregman (1990), …

  17. “1 stream of sounds jumping up and down in pitch” [Figure: alternating A and B tones, frequency vs. time]

  18. [Figure: alternating A and B tones separated in frequency by dF, frequency vs. time]

  19. “2 streams, one high, one low” [Figure: alternating A and B tones, frequency vs. time] Note: you can only attend to one stream at a time

  20. [Figure: A and B tones, with fewer B tones than A tones, frequency vs. time]

  21. “1 stream with a galloping rhythm” [Figure: A and B tones, frequency vs. time]

  22. “2 streams, one high and slow, the other low and fast” [Figure: A and B tones, frequency vs. time] Note: when streamed, the relative timing between A and B tones becomes less important.

  23. Streaming also depends on temporal parameters [Figure: alternating A and B tones with time interval dt marked, frequency vs. time]
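
The A/B tone sequences on the preceding slides are described by two parameters, the frequency separation dF and the timing dt. The sketch below builds such a sequence as a list of (onset time, frequency) events. It is only an illustration: the function name, the use of semitones for dF, the pattern strings, and the default values are assumptions, not values from the talk.

```python
def tone_sequence(pattern="ABA-", n_repeats=2, f_a=500.0,
                  df_semitones=6.0, dt=0.1):
    """Return (onset_seconds, frequency_hz) events for a repeating tone pattern.

    'A' and 'B' are tones; any other symbol (e.g. '-') is a silent slot.
    Assumptions: dF is expressed in semitones above tone A, and dt is the
    onset-to-onset interval between successive slots.
    """
    f_b = f_a * 2 ** (df_semitones / 12)  # B lies dF semitones above A
    freqs = {"A": f_a, "B": f_b}
    events = []
    slot = 0
    for _ in range(n_repeats):
        for symbol in pattern:
            if symbol in freqs:
                events.append((slot * dt, freqs[symbol]))
            slot += 1
    return events

# "AB" gives strict alternation; "ABA-" gives the galloping triplet pattern.
alternating = tone_sequence("AB", n_repeats=3)
galloping = tone_sequence("ABA-", n_repeats=2, df_semitones=12.0)
for onset, freq in galloping:
    print(f"{onset:.1f} s  {freq:.0f} Hz")
```

Varying `df_semitones` and `dt` in such a generator is a convenient way to produce stimuli on either side of the one-stream and two-stream percepts described on the slides.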

  24. Streaming also depends on connectedness [Figure: A and B tone sequences, frequency vs. time]

  25. Streaming based on Pitch differences [Figure: two panels of A and B tone sequences, frequency vs. time, labeled PITCH] Musical melodies also stream [Figure: A and B note sequence vs. time, labeled Telemann]

  26. Streaming Based on Timbre • Different spectral envelopes: Trumpet, Cello, Cello-Trumpet • Alternating vowels /e/ and /a/

  27. Streaming: What’s it good for? [Figure panels, frequency (Hz) vs. time: Rhythmic Masking; Target Masking]

  28. Simultaneous cues help perception of speech [Figure panels, frequency (Hz) with formants F1, F2, F3: Sinewave speech (S1, S2); Pulsed sinewave speech (S1-pulsed, S2-pulsed)]

  29. Courtesy of Dr. Chris Darwin: Speech, Music

  30. Courtesy of Dr. Chris Darwin: Speech, Music

  31. Courtesy of Dr. Chris Darwin: Speech, Music

  32. Continuity Illusion [Figure panels, frequency (Hz) vs. time: Tone in Noise; Glides in Noise; Speech in Noise]

  33. The Biological Bases of Auditory Scene Analysis

  34. Auditory Scene Analysis [Block diagram labels: Acoustic mixture (Two Speakers); Auditory Multi-Scale Representation (frequency vs. time); Disassembling; Primitive cues: * Harmonicity * Onset/Offset; Sorting and Streaming; Representation of Consistent Features: * Timbre (Voice) * Location * Stationary spectra; Auditory “Scene” Objects: Speaker A, Speaker B; “Learning”, Plasticity; Learning and Adaptation: * Multi-resolution temporal cortical dynamics * Plasticity * Attention]

  35. Attributes of Complex Sounds / Anatomy of the Auditory System [Diagram labels: Attributes: Location, Timbre, Pitch; Central Auditory Stages: Spatial maps, Computing pitch (MGB, IC); Collicular Stages: NLL, Harmonic templates, ILD, ITD, Spectral cues; Midbrain Nuclei: LL, TB; Early Auditory Stages: The auditory spectrum (DCN, PVCN, AVCN); sound]

  36. Auditory Scene Analysis [Block diagram labels: Acoustic mixture (Two Speakers); Auditory Multi-Scale Representation (frequency vs. time); Disassembling; Primitive cues: * Harmonicity * Onset/Offset; Sorting and Streaming; Representation of Consistent Features: * Timbre (Voice) * Location * Stationary spectra; Auditory “Scene” Objects: Speaker A, Speaker B; “Learning”, Plasticity; Learning and Adaptation: * Multi-resolution temporal cortical dynamics * Plasticity * Attention]

  37. Experimental Set-up

  38. Multi-Resolution Analysis with Different STRFs [Figure axes: frequency (kHz) vs. time (ms)]

  39. Scale-Rate Decomposition Reconstruction

  40. Patterns of Musical Timbre

  41. Exploring Streaming Mechanisms [Block diagram labels: Input; Spectrogram (frequency vs. time, frames t1..t4); Disassembling; Cortical Multiscale Spectral Representation; Dynamics (2 Hz, 4 Hz, 8 Hz, ...); Sorting and Streaming; Learned Stream ‘A’; Learned Stream ‘B’; Input Selector; “Adaptive Feedback”; Compare prediction with current input]

  42. [Figure panels vs. time: Integrated (Initial STRF) and Streamed (Streamed STRF), with Tone A and Tone B sequences] STRF may evolve or adapt within seconds!

  43. Enhancing Excitatory Fields, Weakening Inhibitory Fields [Figure axis: time (ms)]

  44. Manipulating Temporal and Spectral Modulations [Figure: four spectrogram panels, frequency axis marked from 125 to 2000 vs. time (ms): Normal; Spectrally smeared; Temporally smeared; Temporally sharpened]

  45. Morph Voices

  46. Acknowledgment • Cortical Physiology and Auditory Computations: Jonathan Fritz, Didier Depireux, David Klein, Jonathan Simon • Auditory Speech and Music Processing: Tai Chi, Mounya ElHilali, Powen Ru, Nima Masgarani • Supported by: MURI # N00014-97-1-0501 from the Office of Naval Research; # NIDCD T32 DC00046-01 from the NIDCD; # NSFD CD8803012 from the National Science Foundation
