A geometrical distance measure for determining the similarity of musical harmony

Int J Multimed Info Retr
DOI 10.1007/s13735-013-0036-6

REGULAR PAPER

W. Bas de Haas · Frans Wiering · Remco C. Veltkamp

Received: 17 January 2012 / Revised: 25 July 2012 / Accepted: 6 February 2013
© The Author(s) 2013. This article is published with open access at Springerlink.com

Abstract In the last decade, digital repositories of music have undergone an enormous growth. Therefore, the availability of scalable and effective methods that provide content-based access to these repositories has become critically important. This study presents and tests a new geometric distance function that quantifies the harmonic distance between two pieces of music. Harmony is one of the most important aspects of music and we will show in this paper that harmonic similarity can significantly contribute to the retrieval of digital music. Yet, within the music information retrieval field, harmonic similarity measures have received far less attention compared to other similarity aspects. The distance function we present, the Tonal pitch step distance, is based on a cognitive model of tonality and captures the change of harmonic distance to the tonal center over time. This distance is compared to two other harmonic distance measures. We show that it can be efficiently used for retrieving similar chord sequences, and that it significantly outperforms a baseline string matching approach. Although the proposed method is not the best performing distance measure, it offers the best quality–runtime ratio. Furthermore, we demonstrate in a case study how our harmonic similarity measure can contribute to the musicological discussion of the melody and harmony in large-scale corpora.

Keywords Harmony · Music information retrieval · Similarity · Step function · Tonality

W. B. de Haas is supported by the Netherlands Organization for Scientific Research, NWO-VIDI grant 276-35-001.
This article has been made open access with funding of the NWO Stimuleringsfonds Open Access (036.002.298).

W. B. de Haas (✉) · F. Wiering · R. C. Veltkamp
Utrecht University, PO Box 80.089, 3508 TB Utrecht, The Netherlands
e-mail: W.B.deHaas@uu.nl

1 Introduction

Content-based music information retrieval (MIR¹) is a rapidly expanding area within multimedia research. On-line music portals, like last.fm, iTunes, Pandora, Spotify and Amazon, provide access to millions of songs to millions of users around the world. Propelled by these ever-growing digital repositories of music, the demand for scalable and effective methods for providing access to these repositories still increases at a steady rate. Generally, such methods aim to estimate the subset of pieces that is relevant to a specific music consumer. Within MIR, the notion of similarity is therefore crucial: songs that are similar in one or more features to a given relevant song are likely to be relevant as well. In contrast to the majority of approaches to notation-based music retrieval that focus on the similarity of the melody of a song, this paper presents a new method for retrieving music on the basis of its harmony.

Within MIR, two main directions can be discerned: symbolic music retrieval and the retrieval of musical audio. The first direction of research stems from musicology and the library sciences and aims to develop methods that provide access to digitized musical scores. Here music similarity is determined by analyzing the combination of symbolic entities, such as notes, rests, meter signs, etc., that are typically found in musical scores. Musical audio retrieval arose when the digitization of audio recordings started to flourish, and the need for different methods to maintain and unlock digital music collections emerged. Audio-based MIR methods extract features from the audio signal and use these features for estimating whether two pieces of music are musically related.
These features, e.g., chroma features [29] or Mel-Frequency Cepstral Coefficients (MFCCs) [19], do not directly translate to the notes, beats, voices and instruments that are used in the symbolic domain. Of course, much depends on the application or task at hand, but we believe that for judging the musical content of an audio source, translating the audio features into a high-level representation, which contains descriptors that can be musically interpreted, should be preferred. Although much progress has been made, automatic polyphonic music transcription is a difficult problem, and is currently too unreliable to use as a preprocessing step for similarity estimation. Hence, in this paper, we focus on a symbolic musical representation that can be transcribed reasonably well from the audio signal using current technology: chord sequences. As a consequence, for applying our method to audio data, we rely on one of the available chord labeling methods (see Sect. 2.2).

In this paper, we present a novel similarity measure for chord sequences. We will show that such a method can be used to retrieve harmonically related pieces and can aid in musicological discussions. We will discuss related work on harmonic similarity and the research from music theory and music cognition that is relevant for our similarity measure in Sect. 2. Next, we will present the Tonal pitch step distance in Sect. 3. In Sect. 4, we show how our distance measure performs in practice and we show that it can also contribute to musicological discussions in Sect. 5. But first, we will give a brief introduction on what actually constitutes tonal harmony and harmonic similarity.

¹ Within this paper, MIR refers to music (and not multimedia) information retrieval.

1.1 What is harmony?

Within Western tonal music, it is common to represent a sound with a fixed frequency by a note. All notes have a name, e.g., C, D, E, etc. The distance between two notes is called an interval and is measured in semitones, which is the smallest interval in Western tonal music.
Intervals also have names: minor second (1 semitone), second (2 semitones), minor third (3 semitones), etc., up to an octave (12 semitones). When two notes are an octave apart, the higher note will have exactly twice the frequency of the lower. These two notes are perceived by listeners as very similar, so similar even that all notes one or more octaves apart have the same name. Hence, these notes are said to be in the same pitch class.

Harmony arises in music when two or more notes sound at the same time.² These simultaneously sounding notes form chords, which can in turn be used to form chord sequences. The two most important factors that characterize a chord are its structure, determined by the intervals between the notes, and the chord's root. The root note is the note on which the chord is built. The root is often, but not necessarily, the lowest sounding note. The most basic chord is the triad, which consists of a root and two pitch classes a third and a fifth interval above the root. If the third interval in a triad is a major third, the triad is called a major triad; if it is a minor third, the triad is called a minor triad.

² One can even argue that notes played successively within a short time frame also induce harmony.

Fig. 1 (score with chord labels C, F, G, C and scale degrees I, IV, V, I) A very typical and frequently used chord progression in the key of C-major, often referred to as I-IV-V-I. Above the score the chord labels, representing the notes of the chords in the section of the score underneath the label, are printed. The Roman numerals below the score denote the interval between the chord root and the tonic of the key. We discarded voice-leading for simplicity

Figure 1 displays a frequently occurring chord sequence. The first chord is created by taking a C as root and subsequently adding a major third interval (C–E) and a fifth interval (C–G), yielding a C-major chord.
Above the score the names of the chords, which are based on the root notes, are printed.

The internal structure of the chord has a large influence on the consonance or dissonance of a chord: some combinations of simultaneously sounding notes are perceived to have a more tense sound than others. Another important factor that contributes to the perceived tension of a chord is the relation between the chord and the key of the piece. The key of a piece of music is the tonal center of the piece. It specifies the tonic, which is the most stable, and often the last, pitch class in that piece. Moreover, the key specifies the scale, which is the set of pitches that occur most frequently, and that sound reasonably well together. Chords can be created from pitches that belong to the scale, or they can borrow notes from outside the scale, the latter being more dissonant. The root note of a chord has an especially distinctive role, because the interval between the chord root and the key largely determines the harmonic function of the chord. The most important harmonic functions are the dominant (V) that builds up tension, the sub-dominant (IV) that can prepare a dominant, and the tonic (I) that releases tension. In Fig. 1, Roman numerals that denote the interval between the root of the chord and the key, often called scale degrees, are printed underneath the score.

Obviously, this is a rather basic view of tonal harmony. For a thorough introduction to tonal harmony, we refer the reader to [26].
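The notions introduced in this section (pitch classes, intervals in semitones, triads and scale degrees) can be illustrated with a minimal sketch. It is not part of the method presented in this paper; the function names, the sharp-only note spelling and the restriction of scale degrees to the major scale are assumptions made purely for illustration.

```python
# Illustrative sketch only; all names here are hypothetical.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Scale degrees of the major scale, indexed by semitones above the tonic.
# Roots outside the major scale are not covered by this simplified table.
ROMAN = {0: "I", 2: "II", 4: "III", 5: "IV", 7: "V", 9: "VI", 11: "VII"}


def pitch_class(name):
    """Map a note name to its pitch class number (0-11)."""
    return NOTE_NAMES.index(name)


def interval(low, high):
    """Ascending interval in semitones from `low` to `high`, octaves ignored."""
    return (pitch_class(high) - pitch_class(low)) % 12


def triad(root, quality="maj"):
    """Root plus a major (4) or minor (3) third and a fifth (7 semitones)."""
    third = 4 if quality == "maj" else 3
    r = pitch_class(root)
    return [NOTE_NAMES[(r + step) % 12] for step in (0, third, 7)]


def scale_degree(root, tonic):
    """Roman numeral for a chord root relative to the tonic of a major key."""
    return ROMAN[interval(tonic, root)]


# The progression of Fig. 1 in the key of C-major.
progression = ["C", "F", "G", "C"]
chords = [triad(root) for root in progression]
degrees = [scale_degree(root, "C") for root in progression]
```

For the progression of Fig. 1, `chords` holds the notes of the C, F, G and C major triads and `degrees` the corresponding scale degrees I, IV, V and I.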
Harmony is considered a fundamental aspect of Western tonal music by musicians and music researchers. For centuries, the analysis of harmony has aided composers and performers in understanding the tonal structure of music. The harmonic structure of a piece alone can reveal song structure through repetitions, tension and release patterns, tonal ambiguities, modulations (i.e., local key changes), and musical style. For this reason, Western tonal harmony has become one of the most prominently investigated topics in music theory and can be considered a feature of music that is quite as distinctive as rhythm or melody. Nevertheless, harmonic structure as a feature for music retrieval has received far less attention than melody or rhythm.

1.2 Harmonic similarity and its application in MIR

Harmonic similarity depends not only on musical information, but also largely on the interpretation of this information by the human listener. Musicians as well as non-musicians have extensive culture-dependent knowledge about music that needs to be taken into account while modeling music similarity [4,6]. Hence, we believe that music only becomes music in the mind of the listener, and that not all information needed for making good similarity judgments can be found in the musical data alone [10].

In this light, we consider the harmonic similarity of two chord sequences to be the degree of agreement between structures of simultaneously sounding notes including the agreement between global as well as local relations between these structures as perceived by the human listener. By the agreement between structures of simultaneously sounding notes, we denote the similarity that a listener perceives when comparing two chords in isolation and without surrounding musical context. However, chords are rarely compared in isolation and the relations to the global context (the key of a piece) and the relations to the local context play a very important role in the perception of tonal harmony.
The local relations can be considered the relations between functions of chords within a limited time frame, for instance, the preparation of a chord with a dominant function by means of a sub-dominant. All these factors play a role in the perception of tonal harmony and thus contribute to the harmonic similarity of musical works.

Harmonic similarity also has practical value and offers various benefits. It allows for finding different versions of the same song even when melodies vary. This is often the case in cover songs or live performances, especially when these performances contain improvisations. Moreover, playing the same harmony with different melodies is an essential part of musical styles like jazz and blues. Also, variations over standard basses in baroque instrumental music can be harmonically closely related, e.g., chaconnes.

1.3 Contribution

We introduce a distance function that quantifies the dissimilarity between two sequences of musical chords. The distance function is based on a cognitive model of tonality and models the change of chordal distance to the tonic over time. The proposed measure can be computed efficiently and can be used to retrieve harmonically related chord sequences. The retrieval performance is examined in an experiment on 5,028 human-generated chord sequences, in which we compare it to two other harmonic distance functions and measure the effect of the chord representation. Although the proposed distance measure is not the best performing measure, it is much faster and offers the best quality–runtime ratio. We furthermore show in a case study how the proposed measure can contribute to the musicological discussion of the relation between melody and harmony in melodically similar Bach chorales. The work presented here extends and integrates the earlier harmonic similarity work [11,13].

2 Related work

MIR methods that focus on the harmonic information in the musical data are quite numerous.
After all, a lot of music is polyphonic, and limiting a retrieval system to melodic data considerably restricts its application domain. Most research seems to focus on complete polyphonic MIR systems, e.g., [3]. By complete systems, we mean systems that do chord transcription, segmentation, matching and retrieval all at once. The number of papers that purely focus on the development and testing of harmonic similarity measures is much smaller. In the next section, we will review other approaches to harmonic similarity; in Sect. 2.2, we will discuss the current state of automatic chord transcription; in Sects. 2.3 and 2.4, we elaborate on the cognition of tonality and the cognitive model relevant to the similarity measure that will be presented in Sect. 3.

2.1 Harmonic similarity measures

An interesting symbolic MIR system based on the development of harmony over time is the one developed by Pickens and Crawford [25]. Instead of describing a musical segment as a single chord, the authors represent a musical segment as a 24-dimensional vector describing the 'fit' between the segment and every major and minor triad, using the Euclidean distance in the 4-dimensional pitch space as found by Krumhansl [15] in her controlled listening experiments (see Sect. 2.3). The authors use a Markov model to model the transition distributions between these vectors for every piece. Subsequently, these Markov models are ranked using the Kullback–Leibler divergence to obtain a retrieval result.

Other interesting work has been done by Paiement et al. [24]. They define a similarity measure for chords rather than for chord sequences. Their similarity measure is based on the sum of the perceived strengths of the harmonics of the pitch classes in a chord, resulting in a vector of 12 pitch classes for each musical segment. Paiement et al. subsequently define the distance between two chords as the Euclidean distance between two of these vectors representing the chords. Next, they use a graphical model to model the hierarchical dependencies within a chord progression. In this model, they use their chord similarity measure for calculating the substitution probabilities between chords and not for estimating the similarity between sequences of chords.

Besides the distance measure that we will elaborate on in this paper, which was earlier introduced in [11,13], there exist two other methods that solely focus on the similarity of chord sequences: an alignment-based approach to harmonic similarity [14] and a grammatical parse tree matching method [12]. The first two are quantitatively compared in Sect. 4.

The chord sequence alignment system (CSAS) [14] is based on local alignment and computes similarity between two sequences of symbolic chord labels. By performing elementary operations, the one chord sequence is transformed into the other chord sequence. The operations used to transform the sequences are deletion or insertion of a symbol, and substitution of a symbol by another. The most important part in adapting the alignment is how to incorporate musical knowledge and give these operations valid musical meaning. Hanna et al. experimented with various musical data representations and substitution functions and found a key relative representation to work well. For this representation, they rendered the chord root as the difference in semitones between the chord root and the key; substituting a major chord for a minor chord and vice versa yields a penalty. The total transformation from the one string into the other can be solved by dynamic programming in quadratic time. Note that the key relative representation of Hanna et al. requires the global key to be known.

The third harmonic similarity measure using chord descriptions is a generative grammar approach [12]. The authors use a generative grammar of tonal harmony to parse the chord sequences, which results in parse trees that represent harmonic analyses of these sequences. Subsequently, a tree that contains all the information shared by the two parse trees of two compared songs is constructed and several properties of this tree can be analyzed, yielding several similarity measures. The rejection of ungrammatical harmonies by the parser is problematic, but it can be resolved by applying an error-correcting parser [9].

2.2 Automatic chord transcription

The application of harmony matching methods is extended by the extensive work on chord label extraction from raw musical data within the MIR community. Chord transcription algorithms extract chord labels from either musical scores or musical audio. Given a symbolic score, automatically deriving the right chord labels is not trivial. Even if information about the notes, beats, voices, bar lines, key signatures, etc., is available, the algorithm must determine which notes are unimportant passing notes. Moreover, sometimes the right chord can only be determined by taking the surrounding harmonies into account. Several algorithms can correctly segment and label approximately 84% of a symbolic dataset (for review, see [28]).

Although the extraction of chord labels from score data is interesting, most digital music repositories store music as (compressed) digitized waveforms. Therefore, to be able to apply the methods presented in this paper to the audio domain, automatic audio transcription methods are necessary. Ideally, a piece of audio would be automatically transcribed into a representation similar to a musical score. However, although much progress has been made, multiple fundamental frequency (F0) estimation, the holy grail in polyphonic music transcription, is still considered too unreliable and imprecise for many MIR tasks.
Hence, automatic chord transcription has offered a welcome alternative, which transforms polyphonic audio into musically feasible symbolic annotations, and can be used for serious MIR tasks.

In general, most chord transcription systems have a similar outline. First, the audio signal is split into a series of overlapping frames. A frame is a finite observation interval specified by a windowing function. Next, chroma vectors [29], representing the intensities of the 12 different pitch classes, are calculated for every frame. Finally, the chroma vectors are matched with chord profiles, which is often done using the Euclidean distance. The chord structure that best matches the chroma vector is selected to represent the frame. Although the digital signal processing-specific parameters may vary, most approaches toward automatic chord transcription use a chroma vector-based representation and differ in other aspects like chroma tuning, noise reduction, chord transition smoothing and harmonics removal. For an elaborate review of the related work on automatic chord transcription, we refer the reader to [22].

2.3 Cognitive models of tonality

Only part of the information needed for reliable similarity judgment can be found in the musical information. Untrained as well as musically trained listeners have extensive knowledge about music [4,6]; without this knowledge, it might not be possible to grasp the deeper musical meaning that underlies the surface structure. We strongly believe that music should always be analyzed within a broader music cognitive and music theoretical framework, and that systems without such additional musical knowledge are incapable of capturing a large number of important musical features [10].

Of particular interest for the current research are the experiments of Krumhansl [15]. Krumhansl is probably best known for her probe-tone experiments in which subjects rated the stability of a tone, after hearing a preceding short musical passage.
Not surprisingly, the tonic was rated most stable, followed by the fifth, third, the remaining tones of the scale, and finally the non-scale tones. Krumhansl also
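The final step of the transcription outline in Sect. 2.2, matching a chroma vector with chord profiles by Euclidean distance, can be sketched as follows. This is a minimal illustration and not the implementation of any of the cited systems: the binary major and minor triad profiles, the label format and the example chroma values are all assumptions, and framing, chroma extraction and chord transition smoothing are left out.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chord_profile(root_pc, quality):
    """Binary 12-dimensional profile (an assumption): 1.0 on chord tones."""
    third = 4 if quality == "maj" else 3
    profile = [0.0] * 12
    for step in (0, third, 7):
        profile[(root_pc + step) % 12] = 1.0
    return profile


# One profile for each of the 24 major and minor triads.
PROFILES = {
    f"{NOTE_NAMES[pc]}:{quality}": chord_profile(pc, quality)
    for pc in range(12)
    for quality in ("maj", "min")
}


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def label_frame(chroma):
    """Return the label of the profile closest to the given chroma vector."""
    return min(PROFILES, key=lambda label: euclidean(chroma, PROFILES[label]))


# Hypothetical chroma vector of one frame, with most energy on C, E and G.
frame = [0.9, 0.0, 0.1, 0.0, 0.8, 0.1, 0.0, 0.7, 0.0, 0.1, 0.0, 0.0]
label = label_frame(frame)
```

With this hypothetical frame, the closest profile is the C-major triad. Real systems typically use richer profiles and smooth the frame-wise labels over time, which this sketch does not attempt.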