
Auditory input processing

Cross-sensorial processing – MED7. Auditory input processing. Lecturer: Smilen Dimitrov. Introduction: the immobot base exercise; work on the auditory input. Goal: sound source localization in 3D. Setup: PC, two microphones, sound card. Setup: microphone problems.


Presentation Transcript


  1. Cross-sensorial processing – MED7 Auditory input processing Lecturer: Smilen Dimitrov

  2. Introduction • The immobot base exercise • Work on the auditory input • Goal – sound source localization in 3D • Setup: • PC • Two microphones • Sound card

  3. Setup – microphone problems • We need to use two microphones to obtain a stereo signal • For regular PC microphones (like our Sandbergs): • Take note they are electret! • They demand +5V from the PC in order to work • All PC mic inputs follow this standard: although we have a tip-ring-sleeve jack connector, it is NOT a stereo jack. • Thus a PC mic input will always show as mono (the stereo button will be greyed out in the Recording control of the Windows mixer)

  4. Setup – microphone problems • We need to use two microphones to obtain a stereo signal • For regular PC microphones (like our Sandbergs): • Hence the connection cable below will NOT work (as it assumes that the electret connector is a stereo one)

  5. Setup – microphone problems • Hence, we will have to use: • a dedicated audio card, • with two microphone inputs, even if we want to use cheap electrets for stereo! • One possible sound card: M-Audio MobilePre USB

  6. Setup – microphone problems • Interfacing two electrets for stereo input: • would involve a schematic cable like below: • (assuming we have a stereo plug mic input on the card)

  7. Setup – microphone problems • To avoid these problems with electrets, we are going to use capacitor microphones (Generis) • Note that these microphones must be connected using an XLR cable (the M-Audio card has such mic inputs) • Note that condenser/capacitor microphones demand a power supply, so-called “phantom power” (the M-Audio card provides it) • Thus, we should make sure the sound card and the microphones are compatible.

  8. Setup • Setup for a PC: (In addition to the microphones and the sound card): • M-Audio MobilePre USB drivers • Max/MSP/Jitter • Microphone parameters need not be specified in the algorithm discussed today.

  9. Goal of the auditory processing algorithm • Relation to the model we had for visual input processing • Not really applicable for the algorithm discussed, but could be – here we will directly do tracking • Object detection: • the application needs to detect the presence of a new object whenever it enters the monitored environment (say, a sound louder than a threshold) • Object recognition: • Once a new object is detected, it needs to be classified to determine its type (e.g., a car versus a truck, a tiger versus a deer) (involves comparing sounds – spectrum signatures) • Object tracking: • Assuming the new object is of interest to the application, it can be tracked as it moves through the environment. Tracking involves computing the current location of the object and its trajectory • Diagram labels: Preprocess audio; Estimation of 3D location through ITD / cross-correlation
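
The detection step mentioned above ("a sound louder than a threshold") can be sketched as a block-wise level check. This is only an illustration, not part of the lecture's algorithm: the function names, the block size, and the threshold value are all assumptions, and samples are assumed to be floats in the range -1 to 1.

```python
import math

def rms(block):
    """Root-mean-square level of one block of samples."""
    return math.sqrt(sum(x * x for x in block) / len(block))

def detect_onsets(samples, block_size=4, threshold=0.1):
    """Return the start index of every block where the level rises above threshold.

    block_size and threshold are illustrative values, not taken from the slides.
    """
    onsets = []
    was_loud = False
    for start in range(0, len(samples) - block_size + 1, block_size):
        loud = rms(samples[start:start + block_size]) > threshold
        if loud and not was_loud:
            onsets.append(start)  # a quiet-to-loud transition marks a new "object"
        was_loud = loud
    return onsets

quiet = [0.0, 0.01, -0.01, 0.0]
burst = [0.5, -0.4, 0.6, -0.5]
signal = quiet + burst + burst + quiet + burst
print(detect_onsets(signal))  # [4, 16]
```

Reporting only quiet-to-loud transitions keeps one sustained sound from being counted as many new objects.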

  10. Goal of the auditory processing algorithm

  11. Sound-source localization using ITD and cross-correlation • Brief comparison between a stereo camera system and a two-microphone system • Camera – 2D sensor (2D array of photocells) • Single camera can give a vector of direction to tracked object • Two cameras can give a point (intersection of direction vectors – CPA) • Microphone – 1D sensor (senses values at a single point – corresponds to a single photocell in camera) • Single microphone cannot give any geometric information • Two microphones can only give azimuthal angle – which corresponds to a vector of direction, confined to the “horizontal” plane

  12. Sound-source localization using ITD and cross-correlation • Algorithm – computing the time delay of arrival (TDOA) of the wave front at the two microphones • In biological terms this is the equivalent of the Interaural Time Difference (ITD) • We compute the lag of the wave at a specific point received at both microphones (the Interaural Phase Difference (IPD)) • Must find the time difference between two identical points in the left and right sound signal – using cross-correlation

  13. Sound-source localization using ITD and cross-correlation • Cross-correlation – two arrays, representing the left and right audio signal: g and h – their correlation is also an array • The length of the full cross-correlation array is N + M − 1 for input arrays of lengths N and M (2N − 1 when both have N samples)

  14. Sound-source localization using ITD and cross-correlation • Cross-correlation – in essence, what we are doing is taking one array, and “sliding” it across the other, finding the sum of the products between respective elements at each shift.
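
The "sliding" description can be sketched in plain Python as follows. This is a direct, unoptimized illustration rather than the lecture's MSP object; the function names and the sign convention (a positive lag means the right channel is the delayed one) are assumptions made for this example.

```python
def cross_correlate(left, right):
    """Full cross-correlation: r[lag] = sum over n of left[n] * right[n + lag].

    Returns (lags, values). The lags run from -(len(left) - 1) to
    len(right) - 1, one value per possible shift.
    """
    n_left, n_right = len(left), len(right)
    lags = list(range(-(n_left - 1), n_right))
    values = []
    for lag in lags:
        total = 0.0
        for n in range(n_left):
            m = n + lag
            if 0 <= m < n_right:  # only overlapping samples contribute
                total += left[n] * right[m]
        values.append(total)
    return lags, values

def peak_lag(left, right):
    """Lag (in samples) at which the cross-correlation is largest."""
    lags, values = cross_correlate(left, right)
    best = max(range(len(values)), key=lambda i: values[i])
    return lags[best]

left = [0.0, 1.0, -2.0, 3.0, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 1.0, -2.0, 3.0, 0.0, 0.0]  # left delayed by 2 samples
print(peak_lag(left, right))  # 2
```

The double loop costs on the order of N × M multiplications, which is fine for short blocks; a real-time version would typically restrict the search to the few lags that are physically possible for the given microphone spacing.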

  15. Sound-source localization using ITD and cross-correlation • Cross-correlation – algorithm • First, find the time increment between samples (the reciprocal of the sampling rate): • Assume the sound can be analyzed through the diagram below: • Sound arriving at the left channel will arrive at the right channel after crossing distance b – we know the speed of sound, so we can also calculate the time difference

  16. Sound-source localization using ITD and cross-correlation • Cross-correlation – algorithm • Assume the sound can be analyzed through the diagram below: • Trigonometry:

  17. Sound-source localization using ITD and cross-correlation • Cross-correlation – algorithm • Assume the sound can be analyzed through the diagram below: • The time difference is the product σ · Δ: • Where Δ = time between sound samples, and σ = the number of delay samples returned from the cross-correlation function.
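
The slide's formula image is not preserved in this transcript. From the definitions of Δ and σ, the relation can be written out as below; the symbols f_s (sampling rate) and t (time difference) are names introduced here, and the 44.1 kHz figures are only a worked example.

```latex
\[
\Delta = \frac{1}{f_s}, \qquad t = \sigma \, \Delta
\]
\[
f_s = 44100\ \mathrm{Hz} \;\Rightarrow\; \Delta \approx 22.7\ \mu\mathrm{s},
\qquad
\sigma = 10 \;\Rightarrow\; t \approx 0.227\ \mathrm{ms}
\]
```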

  18. Sound-source localization using ITD and cross-correlation • Cross-correlation – algorithm • Calculate the length of line a • Speed of sound v ≈ 343 m/s in air at room temperature (about 20 °C) • Finally, calculate the angle θ • Where c is the known distance between the microphones
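
A minimal sketch of this last step, under stated assumptions: the source is far enough away that the wave front is effectively planar, θ is measured from the direction straight ahead of the microphone pair, and the extra path length equals v · t. The slide's diagram and formulas are not preserved here, so the arcsine relation and the names `angle_from_lag` and `path_diff` are a reconstruction, not the lecturer's exact notation.

```python
import math

def angle_from_lag(lag_samples, sample_rate, mic_distance, speed_of_sound):
    """Azimuth theta in radians from a cross-correlation lag (far-field assumption).

    lag_samples    : sigma, the delay in samples from the cross-correlation
    sample_rate    : samples per second, so delta = 1 / sample_rate
    mic_distance   : c, the distance between the microphones in metres
    speed_of_sound : v in metres per second
    """
    delta = 1.0 / sample_rate
    time_diff = lag_samples * delta        # t = sigma * delta
    path_diff = speed_of_sound * time_diff # extra distance to the far microphone
    # Noise can push the ratio slightly outside [-1, 1]; clamp before asin.
    ratio = max(-1.0, min(1.0, path_diff / mic_distance))
    return math.asin(ratio)

# Example values (assumed): 44.1 kHz sampling, microphones 0.2 m apart,
# roughly 343 m/s for air at about 20 degrees Celsius.
theta = angle_from_lag(10, 44100, 0.2, 343.0)
print(round(math.degrees(theta), 1))
```

With these example values the largest meaningful lag is about 0.2 / 343 × 44100 ≈ 25 samples, so the angular resolution is coarse; a wider microphone spacing or a higher sampling rate gives finer steps.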

  19. Sound-source localization using ITD and cross-correlation • When θ is finally computed, we obtain a direction vector by rotating the unit vector in the horizontal plane (xz) around the vertical axis (y) by the angle θ • So, the vector DA with components (-sin θ, 0, cos θ) will represent the direction of the detected audio source
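
The rotation can be sketched as below, assuming the starting unit vector is (0, 0, 1), which is what the stated result implies for θ = 0. The sign convention of the rotation is chosen here so that it reproduces the slide's vector (-sin θ, 0, cos θ); the function name is hypothetical.

```python
import math

def rotate_about_y(vec, theta):
    """Rotate a 3D vector around the vertical (y) axis by theta radians.

    The sign convention is picked so that (0, 0, 1) maps to
    (-sin(theta), 0, cos(theta)), matching the slide.
    """
    x, y, z = vec
    return (x * math.cos(theta) - z * math.sin(theta),
            y,
            x * math.sin(theta) + z * math.cos(theta))

forward = (0.0, 0.0, 1.0)  # assumed "straight ahead" unit vector in the xz-plane
da = rotate_about_y(forward, math.radians(30.0))
print(da)  # approximately (-0.5, 0.0, 0.866)
```

Because this is a pure rotation, the result stays a unit vector and its y component stays zero, which reflects that only the azimuth is recovered.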

  20. Sound-source localization using ITD and cross-correlation • Overview of the algorithm (architecture)

  21. Sound-source localization using ITD and cross-correlation • Problems with the approach • We only retrieve a direction vector in a plane (azimuthal angle) – information about the “vertical” position of the sound source is lost • 3D localization of audio as a 3D point is possible using two microphones, if some medium (that changes sound) is placed between the microphones (a “head”), and then a head-related transfer function is calculated.

  22. Implementation in Max/MSP • We will program our own MSP object to perform audio cross-correlation in real time, then proceed to vector calculation and display
