Structural modeling of HRTFs

Title: Techniques for Customized Binaural Audio Rendering with Applications to Virtual Rehabilitation
Ph.D. candidate: Simone Spagnol
Year: 2012
Supervisor: Prof. Giovanni De Poli
Ph.D. School Director: Prof. Matteo Bertocco

Technology-assisted motor rehabilitation is today potentially one of the most interesting application areas for research in sonic interaction design (SID). The final goal of the rehabilitation process is to facilitate the re-integration of patients into social and domestic life by helping them regain the ability to perform activities of daily living autonomously. Such activities, however, involve complex motor tasks, and current rehabilitation systems lack the sophistication needed to assist patients while they perform them. In this context, the robotic rehabilitation community pays very little attention to auditory feedback. As the chart reported on the left depicts, the majority of a number of recent technology-assisted rehabilitation systems do not use any auditory display, whereas the others exploit only a limited set of possibilities.

This PhD programme has developed techniques for an effective, customized rendering of spatial audio, which is nowadays one of the most challenging and interesting research areas for virtual and augmented reality. The final application area of the studied techniques is technology-assisted motor rehabilitation, a field in which the consistent use of auditory feedback is largely underestimated, yet where even simple forms of auditory feedback can enhance performance and learning of a rehabilitative task.
Auditory feedback in technology-assisted motor rehabilitation

Although current technology-assisted rehabilitation systems exploit only a limited set of possibilities from SID research, several studies show that properly designed auditory feedback, able to provide temporal and spatial information, can improve the engagement and performance of subjects in the execution of motor tasks, can improve the motor learning process, and can possibly substitute other feedback modalities (as with visually impaired users). Moreover, the relatively limited computational requirements of audio rendering and the low cost of the related hardware make auditory feedback attractive for home rehabilitation systems. In light of this, research in this field can only benefit from a careful use of the know-how available in SID.

Effects of audio in technology-assisted rehabilitation tasks

Five novel experiments, one for gait training and four for upper limb training, were performed in the context of a joint work with the Department of Mechanical Innovation and Management, University of Padova, together with the University of Delaware (gait training experiment) and the University of California Irvine (upper limb training experiments). These studies were conducted on healthy subjects first, in order to characterize the normative response of the human motor system to auditory feedback and to provide a basis for a future comparison with neurologically impaired patients. The results corroborate the hypothesis that continuous sound feedback can be successfully employed during motor training to provide the subject with additional and/or substitutive information on task and/or error. The addition of a secondary sensory channel that faithfully represents the information already provided by the visual channel helps the user gain a stronger perception of the task, allowing for improved sensory-motor coordination.
In particular, it is found that:
• rendering task-related information through sound helps subjects to increase performance;
• a visuomotor transformation can be learned through a consistent auditory feedback;
• sound spatialization can further enhance performance.

An informed use of binaural spatial sound is expected to bring even more advantages, such as positive effects on patient engagement and effort during movement training, and help in performing and relearning complex functional movements.

Binaural sound rendering techniques

Most of the binaural rendering techniques (i.e. those based on headphone reproduction) currently exploited in research rely on the so-called Head-Related Transfer Functions (HRTFs), i.e. filters that capture the transformations undergone by a sound wave on its path from the source to the eardrum, typically due to reflection and diffraction effects on the torso, head, shoulders and pinnae of the listener. Such a characterization allows virtual positioning of sound sources in the surrounding space by filtering the desired signals through a pair of HRTFs, thus creating left and right ear signals to be delivered by headphones. In this way, three-dimensional sound fields with a high sense of immersion can be simulated and integrated within multimodal frameworks. However, such techniques bear relevant limitations. First, they may require considerable computational resources, especially when several sound sources have to be simulated in the surrounding space. Second, and most important, HRTF filters are usually provided in the form of acoustic signals recorded through dummy heads: this means that anthropometric differences among subjects are not taken into account.
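The filtering step described above, one HRTF per ear applied to the same source signal, can be sketched in the time domain as a pair of convolutions with head-related impulse responses (HRIRs). This is only a minimal illustration: the function names and the two toy impulse responses are assumptions made up for the example, not measured data or code from the thesis, and a real-time renderer would normally use FFT-based or partitioned convolution instead of the direct sum.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono source through a pair of HRIRs (one per ear)."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy HRIRs (hypothetical values): the source is imagined on the listener's
# left, so the right-ear response is delayed by two samples and attenuated.
hrir_left = [1.0, 0.3]
hrir_right = [0.0, 0.0, 0.5, 0.15]

mono = [1.0, 0.0, -1.0]
left, right = render_binaural(mono, hrir_left, hrir_right)
print(left)
print(right)
```

The cost of the direct sum grows with the product of signal and filter lengths for every source, which illustrates the first limitation mentioned above.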
On the contrary, anthropometric features of the human body have a key role in HRTF characterization: listening to non-individualized spatialized sounds is likely to result in evident sound localization errors such as incorrect perception of source elevation, front-back reversals, and lack of externalization, especially in static conditions. On the other hand, individual HRTF measurements on a significant number of subjects are often expensive in terms of both time and resources. Structural modeling of HRTFs (see picture above) represents an attractive solution to these shortcomings. Indeed, if one isolates the contributions of the listener's head, pinnae, ear canals, shoulders, and torso to the HRTF in different subcomponents, each accounting for some well-defined physical phenomenon, then, thanks to linearity, one can reconstruct the global HRTF from a proper combination of all the considered effects.

Structural modeling of HRTFs

In this PhD programme, two novel personalizable models, one for source distance rendering and one that simulates the pinna contribution to the HRTF, were introduced and objectively evaluated. These two models can be thought of as blocks composing a more general structural model, granting a fast real-time rendering of HRTFs thanks to their low complexity. The main purpose of the distance model (see above) was to minimize magnitude differences with respect to the distance-dependent part of an analytical spherical head model through a low-order filter structure. This was done by directly fitting three parameters, easily extracted from the analytical responses, to a number of exponential functions, and by using a first-order shelving filter. The approximation was found to be appropriate; however, more work is needed to further improve the model's accuracy and to correctly tune its phase response, so as to grant a correct ITD estimation when two such models are used in a real-time listening scenario.
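To give an idea of how cheap a first-order shelving structure is, the sketch below builds a high-frequency shelving filter through the bilinear transform and scales it by a DC gain. It assumes that the three parameters are a DC gain, a high-frequency gain, and a cutoff frequency; this choice, the function names, and the numeric values are assumptions for illustration, and the exponential fitting of the parameters against distance is not reproduced here.

```python
import cmath
import math


def high_shelf_coeffs(hf_gain_db, cutoff_hz, fs):
    """First-order high shelf: unit gain at DC, hf_gain_db at Nyquist."""
    g = 10.0 ** (hf_gain_db / 20.0)
    k = math.tan(math.pi * cutoff_hz / fs)  # pre-warped cutoff
    b0 = (g + k) / (1.0 + k)
    b1 = (k - g) / (1.0 + k)
    a1 = (k - 1.0) / (1.0 + k)
    return b0, b1, a1


def magnitude_db(coeffs, dc_gain_db, freq_hz, fs):
    """Magnitude response in dB of the shelf scaled by the DC gain."""
    b0, b1, a1 = coeffs
    z_inv = cmath.exp(-2j * math.pi * freq_hz / fs)
    h = (b0 + b1 * z_inv) / (1.0 + a1 * z_inv)
    return dc_gain_db + 20.0 * math.log10(abs(h))


def filter_signal(coeffs, dc_gain_db, samples):
    """Run the filter sample by sample: one state variable per side."""
    b0, b1, a1 = coeffs
    g0 = 10.0 ** (dc_gain_db / 20.0)
    x_prev = y_prev = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x_prev - a1 * y_prev
        out.append(g0 * y)
        x_prev, y_prev = x, y
    return out


FS = 44100.0
# Hypothetical parameter values, not fitted to any spherical head response.
coeffs = high_shelf_coeffs(hf_gain_db=-6.0, cutoff_hz=2000.0, fs=FS)
dc_db = magnitude_db(coeffs, 3.0, 0.0, FS)
nyquist_db = magnitude_db(coeffs, 3.0, FS / 2.0, FS)
print(dc_db, nyquist_db)
```

Each output sample costs three multiplications for the shelf plus one for the gain, which is consistent with the low complexity claimed for the model.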
A new PRTF database

Further investigations on the correspondence between the anthropometry of the human pinna and PRTF features will soon be carried out using a new PRTF database, collected at the Department of Signal Processing and Acoustics, Aalto University, Finland, as analysis material. The database, accompanied by detailed photographs of the subjects' pinnae and of the measurement setup (see above), consists of median-plane PRIRs measured at 61 different elevation angles from 25 subjects and is the first publicly available PRTF database: http://www.dei.unipd.it/~spagnols/PRTF_db.zip.

Selected publications

• S. Spagnol, M. Geronazzo, and F. Avanzini. Fitting pinna-related transfer functions to anthropometry for binaural sound rendering. In Proc. IEEE International Workshop on Multimedia Signal Processing (MMSP'10), pages 194-199, Saint-Malo, October 2010. Top 10% Paper Award winner.
• G. Rosati, F. Oscari, D. J. Reinkensmeyer, R. Secoli, F. Avanzini, S. Spagnol, and S. Masiero. Improving robotics for neurorehabilitation: enhancing engagement, performance, and learning with auditory feedback. In Proc. IEEE 12th Int. Conf. on Rehabilitation Robotics (ICORR 2011), pages 341-346, Zurich, June-July 2011. Best Poster Award finalist.
• S. Spagnol, M. Hiipakka, and V. Pulkki. A single-azimuth pinna-related transfer function database. In Proc. 14th Int. Conf. on Digital Audio Effects (DAFx-11), Paris, September 2011.
• F. Avanzini, S. Spagnol, A. De Götzen, and A. Rodà. Designing interactive sound for neurorehabilitation systems. Chapter in Sonic Interaction Design book, edited by K. Franinovic and S. Serafin, MIT Press. Accepted for publication.
• D. Zanotto, G. Rosati, S. Spagnol, P. Stegall, and S. K. Agrawal. Effects of auditory feedback in robot-assisted lower extremity motor adaptation. IEEE Transactions on Neural Systems and Rehabilitation Engineering (IEEE TNSRE). Submitted for publication.
The pinna model (see above) was obtained through a considerably more complex analysis. An algorithm that separates the resonant and reflective parts of the spectrum of the pinna-related component of the HRTF (commonly known as Pinna-Related Transfer Function, PRTF) was implemented, and the resulting decomposition drove the design of a low-order model consisting of two peak filters and three notch filters. Moreover, an analysis of real HRTF data was performed in order to study the relationship between PRTF features and anthropometry in the frontal median plane (see left). The findings support the hypothesis that, for design purposes, reflections occurring on pinna surfaces can be reduced to three main contributions, each carrying a negative reflection coefficient. Based on this observation, the PRTF model was parameterized on anthropometric features of the listener extracted from a picture of his/her pinna. The approach is innovative in that the model can be customized through straightforward image processing techniques. Spectral distortion and notch frequency mismatch measures indicated that the approximation is objectively satisfactory.

http://www.dei.unipd.it/~spagnols
spagnols@dei.unipd.it
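As an illustration of why a negative reflection coefficient ties notch frequencies to pinna geometry, consider a simplified ray model with a single reflection: the direct wave is summed with a copy that travels an extra distance of twice the path d between the ear canal entrance and the reflecting surface, and that is multiplied by a coefficient -rho. The magnitude |1 - rho exp(-j 2 pi f t_d)|, with t_d = 2d/c, is then minimal at f = n c / (2d), n = 1, 2, ... The sketch below evaluates this relation; the single-reflection simplification, the function names, and the 2 cm distance are assumptions for the example, not values or formulas quoted from the thesis.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def notch_frequencies(distance_m, count=3, c=SPEED_OF_SOUND):
    """First `count` magnitude minima of the single-reflection model."""
    delay = 2.0 * distance_m / c  # extra travel time of the reflected wave
    return [n / delay for n in range(1, count + 1)]


def magnitude(freq_hz, distance_m, rho, c=SPEED_OF_SOUND):
    """|1 - rho * exp(-j 2 pi f t_d)|: direct wave plus inverted reflection."""
    delay = 2.0 * distance_m / c
    return abs(1.0 - rho * cmath.exp(-2j * math.pi * freq_hz * delay))


# Hypothetical distance of 2 cm between ear canal and reflecting contour.
notches = notch_frequencies(0.02)
depth = magnitude(notches[0], 0.02, rho=0.8)
between = magnitude(notches[0] / 2.0, 0.02, rho=0.8)
print(notches, depth, between)
```

Under these assumptions, a distance measured on a photograph of the pinna maps directly to a notch center frequency, which is the kind of link between image-based anthropometry and filter parameters that the text describes.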
