Automatic Genre Classification Using Large High-Level Musical Feature Sets Cory McKay and Ichiro Fujinaga Dept. of Music Theory Music Technology Area McGill University Montreal, Canada
Topics • Introduction • Existing research • Taxonomy • Features • Classification methodology • Results • Conclusions 2/27
Introduction • GOAL: Automatically classify symbolic recordings into pre-defined genre taxonomies • This is the first stage of a larger project: • A general music classification system • Also classifies audio • Simple interface 3/27
Why symbolic recordings? • Gives access to valuable high-level features that cannot currently be extracted from audio recordings • Research provides groundwork that can immediately be taken advantage of as transcription techniques improve • Can classify music for which only scores exist (using OMR) • Can aid musicological and psychological research into how humans deal with the notion of musical genre • Chose MIDI because of the diversity of recordings available • Can convert to MusicXML, Humdrum, GUIDO, etc. relatively easily 4/27
Existing research • Automatic audio genre classification is becoming a well-researched field • Pioneering work: Tzanetakis, Essl & Cook • Audio results: • Fewer than 10 categories • Success rates generally below 80% for more than 5 categories • Less research has been done with symbolic recordings: • 84% for 2-way classifications (Shan & Kuo) • 63% for 3-way classifications (Chai & Vercoe) • Relatively little applied musicological work on general feature extraction. Two standouts: • Lomax 1968 (ethnomusicology) • Tagg 1982 (popular musicology) 5/27
Taxonomies used • Used hierarchical taxonomy • A recording can belong to more than one category • A category can be a child of multiple parents in the taxonomical hierarchy • Chose two taxonomies: • Small (9 leaf categories): • Used to loosely compare system to existing research • Large (38 leaf categories): • Used to test system under realistic conditions 6/27
Small taxonomy • Jazz • Bebop • Jazz Soul • Swing • Popular • Rap • Punk • Country • Western Classical • Baroque • Modern Classical • Romantic 7/27
Large taxonomy (taxonomy diagram) 8/27
Training and test data • 950 MIDI files • 5 fold cross-validation • 80% training, 20% testing 9/27
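The 5-fold cross-validation described above (80% training, 20% testing per fold) can be sketched as follows; the shuffle seed and the use of integer IDs to stand in for MIDI files are assumptions for illustration:

```python
import random

def five_fold_splits(items, seed=0):
    """Shuffle once, then yield (train, test) pairs where each of the
    5 folds serves as the 20% test set exactly once."""
    items = list(items)
    random.Random(seed).shuffle(items)
    k = 5
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

# With 950 recordings (the dataset size above), each fold tests on 190
splits = list(five_fold_splits(range(950)))
```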
Features • 111 high-level features implemented: • Instrumentation • e.g. whether modern instruments are present • Musical Texture • e.g. standard deviation of the average melodic leap of different lines • Rhythm • e.g. standard deviation of note durations • Dynamics • e.g. average note-to-note change in loudness • Pitch Statistics • e.g. fraction of notes in the bass register • Melody • e.g. fraction of melodic intervals comprising a tritone • Chords • e.g. prevalence of most common vertical interval • More information available in Cory McKay’s master’s thesis (2004) 10/27
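Two of the feature families above can be sketched on already-parsed note data. The note list and the bass-register cutoff are hypothetical stand-ins for the thesis's actual MIDI parsing and feature definitions:

```python
import statistics

# Hypothetical parsed note events: (MIDI pitch, duration in quarter notes)
notes = [(60, 1.0), (64, 0.5), (67, 0.5), (72, 2.0), (65, 1.0), (40, 1.0)]

def note_duration_std(notes):
    """Rhythm feature: standard deviation of note durations."""
    return statistics.pstdev(d for _, d in notes)

def bass_register_fraction(notes, cutoff=54):
    """Pitch-statistics feature: fraction of notes in the bass register
    (the cutoff pitch is an assumption for illustration)."""
    low = sum(1 for p, _ in notes if p < cutoff)
    return low / len(notes)
```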
A “classifier ensemble” (diagram) 12/27
Feature types • One-dimensional features • Consist of a single number that represents an aspect of a recording in isolation • e.g. an average or a standard deviation • Multi-dimensional features • Consist of vectors of closely coupled statistics • Individual values may have limited significance taken alone, but together may reveal meaningful patterns • e.g. bins of a histogram, instruments present 13/27
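The one-dimensional vs. multi-dimensional distinction above can be illustrated with two toy features; a pitch-class histogram is an assumed example of a multi-dimensional feature, alongside a single-number average:

```python
def pitch_class_histogram(pitches):
    """Multi-dimensional feature: a normalized 12-bin pitch-class
    histogram. Individual bins mean little in isolation, but the
    vector as a whole reveals a meaningful pattern."""
    bins = [0.0] * 12
    for p in pitches:
        bins[p % 12] += 1
    total = sum(bins)
    return [b / total for b in bins] if total else bins

def average_pitch(pitches):
    """One-dimensional feature: a single summary number."""
    return sum(pitches) / len(pitches)

pitches = [60, 62, 64, 60]  # C, D, E, C
hist = pitch_class_histogram(pitches)
```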
Classifiers used • K-nearest neighbour (KNN) • Fast • One for all one-dimensional features • Feedforward neural networks • Can learn complex interrelationships between features • One for each multi-dimensional feature 14/27
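A minimal KNN classifier of the kind used for the one-dimensional features can be sketched in a few lines (this is a generic textbook KNN, not the system's exact implementation):

```python
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training
    points (Euclidean distance over feature vectors).
    `train` is a list of (feature_vector, label) pairs."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D feature space with two genre labels
train = [((0, 0), 'a'), ((0, 1), 'a'),
         ((5, 5), 'b'), ((5, 6), 'b'), ((6, 5), 'b')]
```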
A “classifier ensemble” • Consists of one KNN classifier and multiple neural nets • An ensemble with n candidate categories classifies a recording into 0 to n categories • Input: • All available feature values • Output: • A score for each candidate category based on a weighted average of KNN and neural net output scores 16/27
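The weighted-average combination above can be sketched as follows; the 0.5 KNN weight and the 0.5 acceptance threshold are assumptions for illustration, not the system's trained weights:

```python
def ensemble_scores(knn_scores, nn_scores_list, knn_weight=0.5):
    """Combine one KNN classifier's per-category scores with those of
    several neural nets via a weighted average. Each score dict maps
    category name -> score in [0, 1]."""
    combined = {}
    for cat in knn_scores:
        nn_avg = sum(s[cat] for s in nn_scores_list) / len(nn_scores_list)
        combined[cat] = knn_weight * knn_scores[cat] + (1 - knn_weight) * nn_avg
    return combined

def select_categories(scores, threshold=0.5):
    """A recording may belong to 0..n categories: keep every candidate
    category whose combined score clears the threshold."""
    return [c for c, s in scores.items() if s >= threshold]

knn = {'Jazz': 0.8, 'Punk': 0.2}
nets = [{'Jazz': 0.6, 'Punk': 0.4}, {'Jazz': 1.0, 'Punk': 0.0}]
combined = ensemble_scores(knn, nets)
```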
Feature and classifier selection/weighting • Some features are more useful than others • Context-dependent • e.g. the best features for distinguishing Baroque from Romantic differ from those for comparing Punk and Heavy Metal • Hierarchical and round-robin classifiers only trained on recordings belonging to candidate categories • Feature selection allows specialization to improve performance • Used genetic algorithms to perform: • Feature selection (fast) followed by • Feature weighting of survivors 19/27
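Genetic-algorithm feature selection can be sketched as evolving bitmasks over the feature set. This is a generic toy GA under assumed operators (binary tournament, one-point crossover, bit-flip mutation); in the real system the fitness function would be cross-validated classification accuracy:

```python
import random

def ga_feature_selection(n_features, fitness, generations=30,
                         pop_size=20, mutation_rate=0.05, seed=0):
    """Evolve feature-subset bitmasks: each individual is a 0/1 mask
    over the features, scored by `fitness`. Returns the best mask in
    the final population."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        def pick():  # binary tournament selection
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        for _ in range(pop_size):
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ (rng.random() < mutation_rate)  # bit-flip mutation
                     for b in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)
```

A second GA pass could then weight the surviving features the same way, with real-valued genes instead of bits.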
Complete classifier (diagram) 21/27
Exploration of taxonomy space • Three kinds of classification performed: • Parent (hierarchical) • 1 ensemble for each category with children • Only promising branch(es) of taxonomy explored • Field initially narrowed using relatively easy broad classifications before proceeding to more difficult specialized classifications • Flat • 1 ensemble classifying amongst all leaf categories • Round-robin • 1 ensemble for each pair of leaf categories • Final results arrived at through averaging 22/27
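The round-robin scheme above can be sketched as pairwise voting; `pairwise_winner` is a stand-in for a trained two-way classifier ensemble, and simple vote counting replaces the system's score averaging:

```python
from collections import Counter
from itertools import combinations

def round_robin_classify(categories, pairwise_winner):
    """Round-robin classification: one classifier per pair of leaf
    categories; each pairwise winner receives one vote, and the
    category with the most votes wins overall."""
    votes = Counter()
    for a, b in combinations(categories, 2):
        votes[pairwise_winner(a, b)] += 1
    return votes.most_common(1)[0][0]

# Toy stand-in for trained two-way classifiers: a fixed preference order
ranks = {'Punk': 0, 'Bebop': 1, 'Swing': 2}
winner = round_robin_classify(list(ranks),
                              lambda a, b: a if ranks[a] < ranks[b] else b)
```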
Complete classifier (diagram) 23/27
Overall average success rates across all folds • 9 Category Taxonomy • Leaf: 86% • Root: 96% • 38 Category Taxonomy • Leaf: 57% • Root: 75% 24/27
Importance of number of candidate features • Examined the effect on success rate of providing only subsets of the available features to the feature selection system (results presented as a chart) 25/27
Conclusions • Success rates better than previous research with symbolic recordings and on the upper end of research involving audio recordings • True comparisons impossible to make without standardized testing • Effectiveness of high-level features clearly demonstrated • Large feature library combined with feature selection improves results • Not yet at a point where the system can effectively deal with large realistic taxonomies, but it is approaching that point 26/27