Sound Analysis and Synthesis Techniques Comparison

Analysis/Synthesis Comparison
James Beauchamp, Maarten de Boer, Kelly Fitz, Lippold Haken, Xavier Rodet, Axel Roebel, Xavier Serra, Gregory Wakefield, and Matthew Wright, moderator
International Computer Music Conference 2000, Berlin, Germany

Overview
• Introduction: Matt Wright (10 min)
• Short presentation from each participant (8 min)
• Panel discussion (40 min)
• Audience questions (20 min)
Analysis/Synthesis Comparison Panel Session, ICMC 2000, Berlin

What is Analysis/Synthesis?
• Analysis: fit a parametric sound model to an input sound
• Synthesis: turn that model back into sound

Goals of Analysis/Synthesis
• “Perfect” reproduction
• “Indistinguishable”?
• Low dimensionality
• Data reduction
• Control
• Intuition
• Afford musically interesting transformations and control

What Happened for this Comparison?
• Not a competition!
• A common set of input sounds
• Each participant analyzed the sounds
• Analysis results were provided in SDIF format

The Input Sounds
• 27 sounds contributed by the panelists
• Trimmed to be as short as possible
• Variety of types of sounds

Input Sounds by Category
• Single trumpet note (1)
• Monophonic harmonic phrases (8)
  • Noisy: shakuhachi, suling flute
  • Reverberant clarinet
• Noisy apple bite (1)
• Percussion (berimbau, bongo, piano, xylophone, gong, bass drum = 6)
• Polyphonic (4)
• Voice (singing, singing into a flute, unison sopranos, yodel = 4)
• Speech (giggle, “research” = 2)
• Angry cat (1)

SDIF
• Sound Description Interchange Format
• Streams/Frames/Matrices
• Extensible collection of types
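
As a rough picture of the Streams/Frames/Matrices hierarchy, the sketch below models it with in-memory Python classes. This is an illustration only, not the SDIF binary layout or the API of any SDIF library: the class and function names are hypothetical, and the row layout (index, frequency, amplitude, phase) assumes the standard `1TRC` sinusoidal-track matrix type.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Matrix:
    signature: str            # four-character type code, e.g. "1TRC"
    rows: List[List[float]]   # one row per partial, one column per parameter


@dataclass
class Frame:
    signature: str            # four-character frame type code
    time: float               # frame time in seconds
    stream_id: int            # frames sharing an ID form one stream
    matrices: List[Matrix] = field(default_factory=list)


def frames_in_stream(frames, stream_id):
    """Return the frames belonging to one stream, ordered by time."""
    selected = [f for f in frames if f.stream_id == stream_id]
    return sorted(selected, key=lambda f: f.time)


# Hypothetical data: two streams interleaved in one list of frames.
# Assumed 1TRC row layout: [index, frequency (Hz), amplitude, phase (rad)].
frames = [
    Frame("1TRC", 0.01, 1, [Matrix("1TRC", [[1, 441.0, 0.48, 0.3]])]),
    Frame("1TRC", 0.00, 1, [Matrix("1TRC", [[1, 440.0, 0.50, 0.0]])]),
    Frame("1TRC", 0.00, 2, [Matrix("1TRC", [[1, 220.0, 0.20, 0.0]])]),
]

stream_one = frames_in_stream(frames, 1)
```

Here `stream_one` holds the two frames of stream 1 in time order, each carrying one matrix with a single partial.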

SNDAN
• Two STFT-based analysis programs, implementing 1) the phase vocoder with sample rate conversion preprocessing and 2) the frequency tracking method, including the ability to convert to harmonic tracks.
• Authors: James Beauchamp, Robert Maher, George Chaltas, Timothy Madden, Jonathan Mohr, and others, University of Illinois at Urbana-Champaign
• Platforms: any Unix platform, including Linux
• Availability: Free
• http://ems.music.uiuc/cmp/software/sndan.html

Loris
• A C++ class library implementing analysis, manipulation, and synthesis of digitized sounds using the Reassigned Bandwidth-Enhanced Additive Sound Model.
• Authors: Kelly Fitz and Lippold Haken, with contributions from Paul Christensen, Malcolm Slaney, Ken Turkowski, and Vladimir Batov.
• Platforms: tested under Linux, IRIX 6.5, and MacOS 9.
• Availability: GNU GPL, http://www.sourceforge.net
• http://www.cerlsoundgroup.org/Loris

SMS: Spectral Modelling Synthesis
• Decomposes a sound into two parts: a collection of frequency and amplitude values representing the partials of the sound (the sinusoidal, or deterministic, component), and either filter coefficients with a gain value or spectral magnitudes and phases representing the residual sound (the non-sinusoidal, or stochastic, component).
• Main authors: Jordi Bonada, Maarten de Boer, Eduard Resina and Xavier Serra, Music Technology Group, Audiovisual Institute, Pompeu Fabra University
• Platforms: Windows and Linux
• http://www.iua.upf.es/~sms/

PartialBench
• The analysis model is a collection of modulated sinusoids with time-varying amplitude and frequency trajectories (B-splines). The model is initialized using the FFT and then adapted by minimizing the model error.
• Author: Axel Roebel, Electronic Studio of the Technical University of Berlin and visiting scholar at CCRMA.
• Platforms: Matlab (on Linux, Sun, Windows, etc.)
• Availability: Still under development (not publicly available)

MDRx: Modal Distribution and Prescription
• MDRx takes advantage of the relatively sparse nature of the time-frequency characteristics of musical instruments. It utilizes a kernel that optimizes the time-frequency localization of partials in the time-frequency surface. This local information can then be projected onto various signal models including additive and formant synthesis.
• Gregory H. Wakefield, Maureen Mellody, Andrew Sterian, Rowena Guevara, William Pielemeier, and Anastasia Yendiki, The MusEn Project, University of Michigan
• Platforms: Windows 98/NT, Matlab
• Availability: http://www.eecs.umich.edu/~ghw

additive, hmm, estimate, filnor, f0, psolab, modRes, chant, transient
• IRCAM's suite of analysis/synthesis software. Includes sinusoidal models, linear prediction analysis, autoregressive lattice filter, fundamental frequency estimates, voiced/unvoiced detection, pitch-synchronous marker finding, resonance modelling, FOF synthesis, and transient detection.
• P.F. Baisnée, P. Chose, P. Depalle, B. Doval, G. Garcia, F. Iovino, F. Jaillet, F. Marti, G. Peeters, G. Poirot, Y. Potard, S. Roux, D. Schwarz, X. Rodet, D. Virolle, IRCAM.
• IRIX, DEC-OSF, Linux, MacOS
• Available from IRCAM Forum: http://www.ircam.fr/forum

Short presentation from each participant
• What software was used? Availability?
• What model(s) did the analysis use?
• What were the analysis parameters and how were they chosen?
• How did you produce SDIF files?
• Present best result and worst result
• What did you learn from the exercise?

Discussion: Mutability
• What transformations does each model afford?
• Who has interesting transformations to play?
• Is there a tradeoff between mutability and accurate resynthesis?

Discussion: Choosing the Appropriate Model
• How can a user (e.g., a composer) choose the appropriate model for analyzing a given sound?
• What are the strengths and weaknesses of sinusoidal models versus PSOLA, LPC, etc.?

Discussion: Choosing the Appropriate Analysis Technique
• Which analysis techniques work best on which classes of sounds?

Discussion: Choosing Analysis Parameters
• What parameters must be set by the user in each analysis tool? (E.g., window size, hop size, pitch estimate.)
• How can a user (e.g., a composer) choose effective settings for these parameters?
• What are the characteristic "artifacts" of various techniques, and how can a user learn to adjust the analysis parameters when she hears these artifacts?
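
One way to reason about window size for a harmonic sound is a common rule of thumb, not something prescribed by the panel: neighbouring partials spaced f0 apart are resolved when the window's main lobe is no wider than f0. The sketch below assumes a Hann window, whose main lobe spans 4 bins, and uses a hypothetical helper name.

```python
import math


def min_window_length(sample_rate, f0, mainlobe_bins=4):
    """Smallest window length (in samples) whose main lobe is no wider than f0.

    Main-lobe width in Hz is mainlobe_bins * sample_rate / N, so requiring
    it to be <= f0 gives N >= mainlobe_bins * sample_rate / f0.
    mainlobe_bins=4 assumes a Hann window.
    """
    return math.ceil(mainlobe_bins * sample_rate / f0)


# Example: a 220 Hz tone sampled at 44.1 kHz.
n = min_window_length(44100, 220.0)
duration_ms = 1000.0 * n / 44100
```

A lower pitch estimate therefore calls for a longer window, at the cost of coarser time resolution, which is one reason the pitch estimate appears among the user-set parameters.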

Comparison: Representing Noise
• What models are used for noise?
  • Loris: “spread” parameter for each partial
  • SMS: Spectral representation of residual
• What determines which parts of the signal will be treated as noise?
• How mutable is each noise model?

Comparison: Inter-frame Interpolation
• How does each synthesizer interpolate parameters between SDIF frames?
  • PartialBench: B-Splines
  • QUASAR (Ding+Qian): Quadratic phase
  • TASS: Linear
• How audible are these differences?
• Should we think of different synthesis interpolation strategies as different sound models?
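
A minimal sketch of the simplest strategy listed above, linear interpolation: amplitude and frequency are ramped linearly between two frames and phase is accumulated sample by sample. The function name and the two-breakpoint interface are assumptions for illustration; this is not the code of TASS or of any other tool in the comparison.

```python
import math


def synth_linear(f0, a0, f1, a1, duration, sample_rate=44100, phase0=0.0):
    """Render one partial between two breakpoints.

    (f0, a0) are frequency in Hz and amplitude at the first frame,
    (f1, a1) at the second. Returns the samples and the phase reached
    at the second frame, so the next segment can continue from it.
    """
    n = int(round(duration * sample_rate))
    samples = []
    phase = phase0
    for i in range(n):
        t = i / n                           # position between frames, 0 <= t < 1
        amp = a0 + (a1 - a0) * t            # linear amplitude ramp
        freq = f0 + (f1 - f0) * t           # linear frequency ramp
        samples.append(amp * math.sin(phase))
        phase += 2.0 * math.pi * freq / sample_rate   # accumulate phase
    return samples, phase


# A 441 Hz partial fading from 0.5 to 0.25 over 10 ms.
segment, end_phase = synth_linear(441.0, 0.5, 441.0, 0.25, 0.01)
```

Because phase is only accumulated, any phase stored in the second frame plays no part in this sketch; strategies that also match the stored phase need a higher-order phase trajectory.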

Comparison: Treatment of Phase
• How does each synth handle phase?
  • TASS: Ignores phase except initial phase
• What happens when a partial’s phase and frequency values are contradictory?
• How do these differences affect the analysis?
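
To make "contradictory" concrete: if frequency is assumed to vary linearly between two frames, the phase at the second frame is predicted by the first frame's phase plus the integral of frequency, and any difference from the stored phase is a mismatch the synthesizer must either absorb or ignore. The sketch below measures that mismatch; the linear-frequency assumption and the function names are illustrative, not taken from any of the tools.

```python
import math


def wrap(phase):
    """Wrap a phase in radians to the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def phase_mismatch(phase0, f0, phase1, f1, dt):
    """Difference between the stored phase1 and the phase implied by frequency.

    Assumes frequency moves linearly from f0 to f1 (Hz) over dt seconds,
    so the phase advance is 2*pi * mean(f0, f1) * dt.
    """
    predicted = phase0 + 2.0 * math.pi * 0.5 * (f0 + f1) * dt
    return wrap(phase1 - predicted)


# A 100 Hz partial completes exactly one cycle in 10 ms,
# so a stored phase of 0 is consistent and 0.5 rad is not.
consistent = phase_mismatch(0.0, 100.0, 0.0, 100.0, 0.01)
contradictory = phase_mismatch(0.0, 100.0, 0.5, 100.0, 0.01)
```

A synthesizer that uses only the initial phase effectively discards this mismatch, while one that honours stored phases has to bend the frequency trajectory to absorb it.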

Discussion: “Dumbing Down” a model for interchange
• Treatment (or lack thereof) of instantaneous phase in additive synthesis
• Discarding Loris’ “noisiness” field
• Uniform time sampling of individual partials’ breakpoints
• Is this a good thing?

Discussion: Standardization vs. Diversity of Models in SDIF
• How conveniently did participants’ sound models map into SDIF’s standard sound description types?
• What extensions to SDIF were required?
• Which extensions should become standard?
• Is “sum of sinusoids” a single model?

Conclusions
• Use SDIF!
• All analysis results are on the web: http://www.cnmat.berkeley.edu/SDIF (follow the link to “Analysis/Synthesis Comparison”)

See These Presentations:
• Friday 10:50: Kelly Fitz, Lippold Haken, Paul Christensen, “A New Algorithm for Bandwidth Association in Bandwidth-Enhanced Sinusoidal Sound Modeling”
• Friday 11:40: Kelly Fitz, Lippold Haken, Paul Christensen, “Transient Preservation under Transformation in an Additive Sound Model”
• Friday 10:30-12:30 (poster): Diemo Schwarz, Matthew Wright, “Extensions and Applications of the SDIF Sound Description Interchange Format”
