
Composer Classification in Symbolic Data Using PPM

Antonio Deusany de Carvalho Junior
Instituto de Matemática e Estatística
Universidade de São Paulo
São Paulo, SP, Brasil
[email protected]

Leonardo Vidal Batista
Centro de Informática
Universidade Federal da Paraíba
João Pessoa, PB, Brasil
[email protected]

Abstract—The aim of this work is to propose four methods for composer classification in symbolic data, based on melodies and making use of the Prediction by Partial Matching (PPM) algorithm, and also to propose data modeling inspired by psychophysiological aspects. Rhythmic and melodic elements are combined instead of using melody or rhythm alone. The models capture the perception of pitch changes and note duration articulations, and are then used to classify melodies. To evaluate our approach, we applied the PPM method to a small set of monophonic violin melodies by five composers in order to create a model for each composer. The best accuracy achieved was 86%, which is relevant for a problem domain that can by now be considered classic in MIR.

Keywords-PPM; melody; rhythm; pattern

I. INTRODUCTION

The analysis of information by computers has improved steadily since the establishment of Information Theory. Retrieving information from music using concepts from this theory is a challenge for researchers, who must consider aspects of both music representation and musical information content. Thinking about music, and about how the brain obtains musical information, can help in developing algorithms that perform better musical analysis. When we listen to a melody, the auditory system discriminates each pitch sequentially and sends the corresponding signal to the brain until the stimulus comes to an end [1]. This means that each musical note is heard and processed by the auditory cortex in the context of its predecessors [2]. In this work we combine notes and their predecessors with concepts from Information Theory and propose a new kind of analysis, based on sound perception, to build a classifier.

The development of Information Theory enabled forms of data processing that are relevant to many areas of science. Among them, Music Information Retrieval (MIR) benefits from several algorithms derived from this theory. The state of the art in universal data compression is the Prediction by Partial Matching (PPM) method [3], and here we explore the similarities between this method and the brain's perception of musical pitch. PPM acts much like auditory perception in that its main computation runs over a sequence of symbols, and the context of each symbol (its neighbours) is used to estimate the probability of its occurrence.

A similar point of view appears in related works that associate concepts from different areas. Dowling [4], [5] discussed the advantages of psychological and physiological approaches, especially for melody memorization through intervals or scale contours. Londei et al. [6] used similar concepts to compute the differences in pitch between sequential notes. Suyoto et al. [7] applied the Longest Common Subsequence (LCS) algorithm to find similarities among musical pieces, showing that such sequences are useful for information retrieval in the musical context.

Although several works report significant results for musical analysis through melodies, our main goal is to propose a new way of combining sequential analysis with pattern recognition. We consider melodic and rhythmic structures together and apply the PPM method to data modeled using psychophysiological aspects. Our analysis is based on pitch intervals and note duration ratios retrieved from a symbolic data representation, which permits an easy evaluation of each of these aspects over a sequence of notes read from MIDI files. The importance of the MIDI format arises from its wide use and its firm position as an industry standard, so many databases and software tools rely on this encoding.

Using the PPM method in MIR is a long-standing recommendation. Lartillot et al. [8] evaluated different statistical models for modeling musical style and suggested PPM as a statistical technique worth considering. Later, Pearce and Wiggins [9] tested PPM variants on monophonic music, evaluating the advantages of n-gram Markov models in MIR. Hazan et al. [10] evaluated PPM on audio signals, demonstrating yet another use of PPM through pitch and onset analysis. Our results applying PPM to composer identification, evaluated with the four strategies proposed here, may motivate further work on musical pattern recognition based on these ideas.


II. MUSICAL PATTERN ANALYSIS

Recognizing patterns by organizing and interpreting sensory information is a natural way of learning. Considering that every individual develops sensory perception before meaningful perception, one can argue that aspects of perception may aid the interpretation of human habits. With this in mind, the study of music perception, together with some peculiarities of the Gestalt laws, offers many subjects to investigate [11].

Symbolic data formats include pitches and note durations, which can represent the melody, the rhythmic elements of a melody, and other information relevant for pattern recognition. The analysis of how these elements are perceived is grounded in areas such as psychology and physiology, which also provide cues for analyzing musical patterns through note intervals and durations in the same way they are perceived [12].

The computation of intervals still raises important discussions in the literature. Cambouropoulos et al. [13] consider the pros and cons of representing pitch intervals as pitch-class intervals or name-class intervals. Lin et al. [14] distinguish pitch intervals by orientation (up or down) and by kind (considering major and minor conditions). Ruppin and Yeshurun [15], on the other hand, use more mathematical tools, taking derivatives of the pitch differences to ensure invariance under certain transformations.

Shmulevich et al. [16] discuss more elaborate representations of rhythmic elements: for example, the T-Measure, based on quarter-note divisions; the LZ-Measure (Lempel-Ziv), which captures the multi-level redundancy embedded in a rhythmic pattern; and the PS-Measure, which is consistent with the Gestalt simplicity principle. On this basis, they attempted temporal segmentation to reduce the coding complexity of the stimulus. Cambouropoulos et al. [13] present duration ratios as an easy way to reveal variations of rhythmic patterns, and argue that listeners generally remember a rhythmic pattern as a corresponding sequence of durations, independent of absolute tempo. Sioros and Guedes [17] propose a rhythm quantification with reference to the 32nd note, with no treatment of micro-timing deviations, taking relative amplitudes into account. From a similar viewpoint, Correa et al. [18] represent relative note durations as 1 for a quarter note, 0.5 for an eighth note, and so on.

A. Using PPM with symbolic data

The PPM method is an advanced compression method developed by J. Cleary and I. Witten. It is based on an encoder that calculates and maintains a contextual statistical model for a sequence of symbols. When the encoder reads the next symbol from the input, a probability is assigned to it. The statistical model looks at the number of times each symbol was encountered in the past in a given context and assigns a conditional probability accordingly, building a context tree. Depending on the selected context length, the model is thus created from the number of times each symbol followed each context [3].
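To make this mechanism concrete, the sketch below implements a PPM-style adaptive context model in Python. It is a minimal illustration under our own simplifying assumptions (plain symbol counting with PPM-C escape probabilities, no arithmetic coder, and no exclusion mechanism), not the authors' implementation:

    from collections import defaultdict

    class ContextModel:
        # counts[context][symbol]: how often `symbol` followed `context`.
        def __init__(self, max_order):
            self.max_order = max_order
            self.counts = defaultdict(lambda: defaultdict(int))

        def train(self, sequence):
            for pos, symbol in enumerate(sequence):
                history = sequence[max(0, pos - self.max_order):pos]
                # Register the symbol under every context length 0..max_order.
                for k in range(len(history) + 1):
                    self.counts[tuple(history[len(history) - k:])][symbol] += 1

        def probability(self, history, symbol):
            # PPM-C: a symbol seen c times out of n in a context with d
            # distinct successors gets c/(n+d); the escape probability
            # d/(n+d) backs off to the next shorter context.
            history = list(history)[-self.max_order:] if self.max_order else []
            p = 1.0
            for k in range(len(history), -1, -1):
                seen = self.counts.get(tuple(history[len(history) - k:]))
                if not seen:
                    continue
                n, d = sum(seen.values()), len(seen)
                if symbol in seen:
                    return p * seen[symbol] / (n + d)
                p *= d / (n + d)  # escape and back off
            return p / 256  # order -1: uniform over the byte alphabet

Under this reading, the compression rate used later for classification can be approximated by the mean code length per symbol, i.e., the average of -log2 of these probabilities over the evaluated file.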

There are similarities between our application of PPM and human sound perception. The PPM context tree is built while a sequence is interpreted; the notes of a melody are likewise perceived sequentially by the cochlear cells. Perception is based on pitch articulation (the change in which cells are stimulated) and pitch duration (the time a cell remains stimulated). The primary auditory cortex is tonotopically organized, the secondary cortex processes melodic and rhythmic patterns, and the tertiary cortex integrates everything. In the same way, our method processes MIDI files sequentially and interprets the melody through pitch articulation and duration. This information is used to build the context tree, so PPM acts like the secondary auditory cortex. We also try to simulate the tertiary auditory cortex by combining melodic and rhythmic elements in different ways.

The PPM method has several variants; one of the most similar to human perception is PPM-C, since this variant uses a bounded context and an escape symbol to characterize something new. The bounded context mirrors the way humans group melodic patterns from short sequences, and the escape can be compared with the surprise of perceiving a new pattern. PPM-C has also been suggested for music information retrieval by Lartillot et al. [8] and Pearce and Wiggins [9]. In later work, Pearce and Wiggins [19] report improvements from applying PPM* instead of PPM-C, but PPM* considers unbounded context lengths; for that reason PPM* is not analyzed here, since the human brain memorizes short sequences better.

B. Melodic similarity analysis in symbolic data

To analyze melodic and rhythmic sequences, we obtained symbolic information from MIDI files. This format is well established in computer-based musical analysis and is still used by much software as an easy way to record track information, e.g., software for mixing with analog signal quantization and software for creating digital audio.

Aiming at composer identification from symbolic data, we analyze melodic similarity in MIDI files by comparing probability models of sequences of pitch intervals or of note duration ratios across files. As an example, Figure 1 shows a match between two passages of different violin scores composed by J. S. Bach. Pitch intervals are represented as the difference between the MIDI numbers of consecutive notes; the rhythmic elements of the melody, derived from note durations, are shown in the same figure. These approaches find similarities even though the passages have different note and beat representations.

Figure 1. Representation of melodic similarities between melodies.

In the example of Figure 1, the melodic sequence analyzed in BWV 1002 comprises the notes B4, G4, E4, D5, C#5, B4, A#4, and in BWV 1004 the notes D4, A#3, G3, F4, E4, D4, C#4. Neither the notes nor their contexts are the same, but in terms of intervals counted in semitones (e.g., E5 to F5 = +1 and D5 to C5 = -2) the description of both melodies is identical: (-4, -3, +10, -1, -2, -1). For these notes, the relative durations in BWV 1002 are 3/32, 1/32, 3/32, 1/32, 3/32, 1/32, 3/32, and in BWV 1004 they are 3/16, 1/16, 3/16, 1/16, 3/16, 1/16, 3/16. Despite the difference between the duration sequences, the sequence of duration ratios is the same in both: 1/3, 3, 1/3, 3, 1/3, 3. Some sequences thus match under this method, and we can then analyze the occurrences of patterns.
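The extraction of these two sequence types is straightforward. The following sketch (function names are ours, not from the paper) reproduces the worked example above:

    from fractions import Fraction

    def pitch_intervals(midi_notes):
        # Differences in semitones between consecutive MIDI note numbers.
        return [b - a for a, b in zip(midi_notes, midi_notes[1:])]

    def duration_ratios(durations):
        # Ratio of each note duration to its predecessor.
        return [b / a for a, b in zip(durations, durations[1:])]

    bwv1002 = [71, 67, 64, 74, 73, 71, 70]  # B4, G4, E4, D5, C#5, B4, A#4
    bwv1004 = [62, 58, 55, 65, 64, 62, 61]  # D4, A#3, G3, F4, E4, D4, C#4
    assert pitch_intervals(bwv1002) == pitch_intervals(bwv1004) \
        == [-4, -3, 10, -1, -2, -1]

    d1002 = [Fraction(3, 32), Fraction(1, 32)] * 3 + [Fraction(3, 32)]
    d1004 = [Fraction(3, 16), Fraction(1, 16)] * 3 + [Fraction(3, 16)]
    assert duration_ratios(d1002) == duration_ratios(d1004)  # 1/3, 3, 1/3, ...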

III. EVALUATION METHOD DESCRIPTION

The main idea of our evaluation process is to group melodic sequences and create models with the PPM method. The MIDI files used to build the models were normalized to contain the same number of symbols, avoiding disproportionate models. Normalization is done before model creation, and its purpose is to guarantee that all models are built from the same quantity of intervals. Its main drawback arises when a melodic pattern occurs only at the end of a sequence, since such patterns are discarded.

Some considerations about the alphabet had to be taken into account when interpreting each MIDI file. The PPM method operates on symbols from an alphabet; in this work each symbol is one unsigned byte, so every interval has to be represented as a number between 0 and 255. Since MIDI note numbers range from 0 to 127, pitch intervals can take values from -127 to 127, so they map into the byte range simply by adding 127 to each value.

Duration ratios from rhythmic sequences, on the other hand, required a special representation, since ratios are normally expressed as floating-point numbers or fractions. Floating-point values greater than or equal to 1 were truncated after the first decimal digit and multiplied by 10; values smaller than 1 were truncated after the second decimal digit, multiplied by 100, and then added to 127. Below is an example of these truncations, generating PPM symbols from the results of Figure 1:

3.0 ⇒ 3.0 × 10 = 30
1/3 = 0.3333... ⇒ 0.33 × 100 + 127 = 160
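A small sketch of this byte-alphabet mapping follows (function names are ours; values falling outside 0-255 are not handled in this sketch):

    def interval_symbol(interval):
        # Pitch interval in [-127, 127] -> unsigned byte in [0, 254].
        return interval + 127

    def ratio_symbol(ratio):
        # Duration ratio -> unsigned byte, using the truncation rules above.
        if ratio >= 1:
            return int(ratio * 10)        # keep 1 decimal digit, x10
        return int(ratio * 100) + 127     # keep 2 decimal digits, x100, +127

    assert interval_symbol(-4) == 123
    assert ratio_symbol(3.0) == 30
    assert ratio_symbol(1 / 3) == 160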

We now describe the proposed methods for classifying melodies using PPM. Two of the evaluation strategies use pitch intervals alone or note duration ratios alone; the other two combine the results of the first two. We define mathematically four evaluation methods that can be applied to a file F, namely E1m, E2r, E3M, and E4R. In describing the strategies, we use "melody model" and "rhythm model" to indicate that a PPM model was created from pitch information or from note duration ratios, respectively. Consider n models M_i:

E1m  Evaluation using sequences of pitch intervals to create the models:

         E1m(F) = argmax_i [ m_i(F) ],  i = 1, ..., n

     where m_i(F) compresses the file F using PPM with melody model M_i and returns the compression rate.

E2r  Evaluation using sequences of note duration ratios to create the models:

         E2r(F) = argmax_i [ r_i(F) ],  i = 1, ..., n

     where r_i(F) compresses the file F using PPM with rhythm model M_i and returns the compression rate.


E3M  Evaluation using the mean of the file's compression rates under both the melody model and the rhythm model:

         E3M(F) = argmax_i [ (m_i(F) + r_i(F)) / 2 ],  i = 1, ..., n

E4R  Evaluation using the mean of the ranking position of each model when classifying the file under both the melody and the rhythm model:

         E4R(F) = argmax_i [ (#m_i(F) + #r_i(F)) / 2 ],  i = 1, ..., n

     where #m_i(F) and #r_i(F) give the ranking position of model M_i during the classification of file F using PPM with the melody model and the rhythm model, respectively.
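In code, the four decision rules reduce to a few argmax operations. The sketch below is our reading of the definitions; in particular, we assume rankings are oriented so that better-compressing models receive higher positions, which keeps E4R an argmax. All names are ours:

    # m_rate[i] and r_rate[i]: compression rate of file F under composer
    # i's melody and rhythm models; higher rate = better match.
    def rank_positions(rates):
        # Best-compressing model gets the highest position (our assumption).
        pos = [0] * len(rates)
        order = sorted(range(len(rates)), key=lambda i: rates[i])
        for rank, i in enumerate(order, start=1):
            pos[i] = rank
        return pos

    def classify(m_rate, r_rate):
        n = len(m_rate)
        e1m = max(range(n), key=lambda i: m_rate[i])
        e2r = max(range(n), key=lambda i: r_rate[i])
        e3m = max(range(n), key=lambda i: (m_rate[i] + r_rate[i]) / 2)
        rm, rr = rank_positions(m_rate), rank_positions(r_rate)
        e4r = max(range(n), key=lambda i: (rm[i] + rr[i]) / 2)
        return e1m, e2r, e3m, e4r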

The first two evaluation methods (E1m and E2r) are straightforward. The other two combine their results. E3M tests whether the same composer model yields the greater compression rates under both the melody and the rhythm model. E4R tests the ranking position of the composer under both models: while two composer models may compress the same file with very close compression rates, the ranking positions turn this closeness into a full unit of difference.

Some details of the PPM model also deserve consideration. PPM uses context lengths to determine the maximum length of the partial sequences to be examined. If the maximum context length selected is 3, contexts of lengths 0 (no context at all), 1, 2, and 3 are considered. In musical terms, the context length represents how many preceding symbols are considered when evaluating each note.

For pitch intervals, for example, a context length of 3 means evaluating 4 intervals spanning 5 notes. For the simple melodic sequence C, D, E, F, G, the pitch intervals are 2, 2, 1, and 2 semitones. Depending on the PPM variant, arbitrary context lengths are possible. In this work we use the PPM-C variant, since it gave the best results in Pearce and Wiggins's analysis of monophonic music [9]. We considered context lengths from 0 to 6, aiming to simulate recognition based on motifs and short sequences of notes.
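As a small illustration (variable names ours), these are the contexts examined when predicting the last interval of that sequence with maximum context length 3:

    seq = [2, 2, 1, 2]        # intervals of C, D, E, F, G (semitones)
    max_order = 3
    target = seq[-1]          # the interval being predicted
    for k in range(max_order, -1, -1):
        ctx = tuple(seq[len(seq) - 1 - k : len(seq) - 1])
        print("order %d: context %s -> symbol %d" % (k, ctx, target))
    # order 3: context (2, 2, 1) -> symbol 2
    # order 2: context (2, 1)    -> symbol 2
    # order 1: context (1,)      -> symbol 2
    # order 0: context ()        -> symbol 2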

IV. CLASSIFIER ANALYSIS AND RESULTS

The analysis consisted of measuring prediction performance using a program developed in Java, with a database previously prepared using a MIDI editor. The database comprised 50 musical pieces in MIDI format, 10 pieces from each of 5 pre-20th-century classical composers: Albinoni, Bach, Beethoven, Brahms, and Mozart. The files in the data set are freely available from the Kunst der Fuge website (see http://www.kunstderfuge.com/). The selected pieces were composed for solo violin, so the main melody of each piece is performed on violin. The notes of the solo violin track were used to create the sequences of pitch intervals and duration ratios, and each file was manually processed so that the solo violin track is the first track. Since the violin can play polyphonically, only the highest note is considered when notes sound simultaneously.

To analyze the quality of the classifier we used Leave-One-Out Cross-Validation (LOOCV) and tested composer identification with our approach based on sequential sound perception. During the evaluation of the models, an item is considered correctly classified if the file obtains its best result under its own composer's model, and this is used to calculate the hit rate. Ties at first place also count as a correct classification, although they rarely occur, since we compare compression rates.
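A minimal sketch of this LOOCV loop, under our assumptions: files[c] holds the symbol sequences of composer c, train_model() builds a PPM model from such sequences, and rate(model, seq) returns the compression rate. All names here are hypothetical, not from the paper:

    def loocv_hit_rate(files, train_model, rate):
        hits = trials = 0
        composers = sorted(files)
        for held_c in composers:
            for i, held_out in enumerate(files[held_c]):
                # Retrain every composer model without the held-out piece.
                models = {c: train_model([s for j, s in enumerate(files[c])
                                          if (c, j) != (held_c, i)])
                          for c in composers}
                best = max(composers, key=lambda c: rate(models[c], held_out))
                hits += best == held_c
                trials += 1
        return hits / trials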

The hit rate of the models for each context length is presented in Table I. For this testbed we used a notebook with a 2.4 GHz Intel Core 2 Duo processor and 4 GB of 1067 MHz DDR3 memory; the mean time elapsed for each evaluation strategy over all contexts was 28 s for E1m, 27 s for E2r, and 55 s for E3M and E4R.

Table I
RESULTS SUMMARY

Method   Context length
           0     1     2     3     4     5     6
E1m      58%   78%   66%   70%   70%   66%   64%
E2r      54%   76%   76%   70%   76%   68%   70%
E3M      70%   82%   82%   76%   82%   82%   76%
E4R      70%   86%   84%   78%   80%   80%   72%

The best result in Table I is obtained by E4R using PPM-C with context length 1. A first-order model in our testbed covers a maximum sequence of two intervals, which means three notes are considered in each prediction. Even though good results were obtained using only pitch intervals or only rhythmic elements represented by note duration ratios, combining the results improved the matching process. E4R shows that the ranking position in a PPM classification with a multi-view strategy is also important for composer classification.

A. Comparison with other works

Pearce and Wiggins [9] emphasized that their results apply only to chromatic pitch, so here we extend their results to pitch intervals and note duration ratios, applying PPM-C, which showed consistent performance improvements over other variants in their work (e.g., escape method B reached an average entropy of 2.93 against 2.47 for escape method C). Moreover, we compare our method against other methods in the literature. Ruppin and Yeshurun [15] presented 5 different methods using 50 MIDI files to match genre and also composer. For genre matching their results were better than the best results in the literature, with a score of 85% against 80% from Cilibrasi et al. [21], who used 36 files with a compression method, and 49% from Lin et al. [14], who used 500 files with a repetition method.

Although genre matching scored high, composer matching reached only 58% with Ruppin's best method, based on LZW compression. We compared the best results of these works with our method, and on closer inspection our results are promising: as Table I shows, our hit rates vary from 54% to 86%, so the approach seems valid for composer classification.

V. CONCLUSIONS

Methods for musical pattern recognition with symbolic data improve every year, yielding better results with many different algorithms. Here we proposed to use a method not only because it is acknowledged to be useful for pattern recognition, but also with the similarity in mind between this method and concepts from psychology and physiology regarding human music perception. Although considerable research has been carried out in this area, especially in the symbolic domain [9], [13], [22], our improvements are based on considerations of the methods discussed in the literature, with modifications related to multiple viewpoints [22].

In this work, we used pitch intervals and durations inspired by the auditory cortex's perception of melodic sounds. The highest classification accuracy obtained by the resulting classifier was 86%. Uniting the melody model and the rhythm model in a single validation, as we did in E3M and E4R, improved the results significantly. Although we obtained good results analyzing the melody model and the rhythm model separately, their joint consideration produced a strong improvement in classification, showing that there is some relationship between the two models. Compared with other works, we achieved a competitive way of modeling data for composer classification.

Regarding the main test of composer identification, the results are promising, since they can be applied in many areas, e.g., cover identification and artist label checking. The method can also be adapted to detect rhythmic and melodic errors in pieces based on known error patterns, helping composers check for desired particularities of a composition. These possibilities open future work in several directions.

VI. ONGOING AND FUTURE WORKS

Many concepts from music theory and music perception that might bring better results have not yet been used in methods for musical pattern recognition. We are now focused on applying our methods to harmonic and contrapuntal sequences, as these aspects are worthwhile topics for analysis in symbolic music, as shown by Dixon et al.'s research on the modelling of harmony [23].

The database used needs to be expanded. A genre database should be used to check whether melodies preserve characteristics within the same genre; with this kind of evaluation it should be possible to compare results with other works. Many other challenging rhythm attributes can also be considered, for example accents and intensity, or onsets and offsets. Certain interpretations of melodic sequences deserve attention, as do other open music databases.

There are many possibilities for future work with the PPM method, given its application in other fields of pattern recognition. Some works have already tried PPM on audio using pitch information and onset expectation [10], and exploring their database with our methods is a straightforward next step. Polyphonic music can also be addressed, building on Bergeron and Conklin's research [24]. Another step could be to implement a useful shuffle mode in a music player based on the patterns of the last song, preventing punk rock from being played after country or classical music. The time elapsed for PPM computation with small context lengths over many files was not long, which implies that further ideas for data interpretation and intercorrelation can be tried, drawing on models of pattern perception suggested by psychology and physiology research in other areas.

VII. ACKNOWLEDGEMENTS

We would like to thank the members of the Computer Music Group (http://www.compmus.ime.usp.br) of the Computer Science Department of the Institute of Mathematics and Statistics of the University of São Paulo, and especially André Bianchi and Camila Santos for their help. This work has been partially supported by the Brazilian government funding agency CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior).

REFERENCES

[1] A. C. Guyton and J. E. Hall, Textbook of Medical Physiology, 12th ed. Saunders, 2010.

[2] R. Jourdain, Music, the Brain, and Ecstasy: How Music Captures Our Imagination. Harper Perennial, 1998.

[3] D. Salomon and G. Motta, Handbook of Data Compression, 5th ed. Springer, 2010.

[4] W. J. Dowling, "Scale and contour: Two components of a theory of memory for melodies," Psychological Review, 1978. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S0033295X07621136

[5] W. J. Dowling, "Context effects on melody recognition: Scale-step versus interval representations," Music Perception, vol. 3, no. 3, pp. 281-296, 1986. [Online]. Available: http://www.jstor.org/stable/40285338

[6] A. Londei, V. Loreto, and M. O. Belardinelli, "Musical style and authorship categorization by informative compressors," in Proceedings of the 5th Triennial ESCOM Conference, 2003, pp. 200-203.

[7] I. S. H. Suyoto, A. L. Uitdenbogerd, and F. Scholer, "Effective retrieval of polyphonic audio with polyphonic symbolic queries," in Proceedings of the 9th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2007, pp. 105-114.

[8] O. Lartillot, S. Dubnov, G. Assayag, and G. Bejerano, "Automatic modelling of musical style," in Proceedings of the 2001 International Computer Music Conference. San Francisco: ICMA, 2001, pp. 447-454. [Online]. Available: http://www.ircam.fr/equipes/repmus/RMPapers/ICMC99/index.html

[9] M. Pearce and G. Wiggins, "An empirical comparison of the performance of PPM variants on a prediction task with monophonic music," Artificial Intelligence and Creativity in Arts and Science Symposium, 2003.

[10] A. Hazan, R. Marxer, P. Brossier, H. Purwins, P. Herrera, and X. Serra, "What/when causal expectation modelling applied to audio signals," Connection Science, vol. 21, no. 2-3, pp. 119-143, Jun. 2009. [Online]. Available: http://dx.doi.org/10.1080/09540090902733764

[11] M. R. Jones, R. R. Fay, A. N. Popper, and B. L. Giordano, "Music perception (Springer handbook of auditory research)," Journal of the Acoustical Society of America, vol. 123, no. 2, pp. 935-945, 2011. [Online]. Available: http://www.ncbi.nlm.nih.gov/pubmed/21756454

[12] T. J. Tighe and J. W. Dowling, Psychology and Music: The Understanding of Melody and Rhythm. Hillsdale, New Jersey: Lawrence Erlbaum Associates, 1993.

[13] E. Cambouropoulos, T. Crawford, and C. S. Iliopoulos, "Pattern processing in melodic sequences: Challenges, caveats and prospects," in Proceedings of the AISB'99 Convention (Artificial Intelligence and Simulation of Behaviour), 1999, pp. 42-47.

[14] C.-R. Lin, N.-H. Liu, Y.-H. Wu, and A. L. P. Chen, "Music classification using significant repeating patterns," in Proceedings of Database Systems for Advanced Applications, 2004, pp. 506-518.

[15] A. Ruppin and H. Yeshurun, "MIDI music genre classification by invariant features," in ISMIR, 2006, pp. 397-399.

[16] I. Shmulevich, O. Yli-Harja, E. Coyle, D.-J. Povel, and K. Lemström, "Perceptual issues in music pattern recognition: Complexity of rhythm and key finding," in Computers and the Humanities, 2001, pp. 23-35.

[17] G. Sioros and C. Guedes, "Complexity driven recombination of MIDI loops," in ISMIR, 2011, pp. 381-386. [Online]. Available: http://ismir2011.ismir.net/papers/PS3-5.pdf

[18] D. C. Correa, A. L. M. Levada, and L. da F. Costa, "Finding community structure in music genres networks," in ISMIR, 2011, pp. 447-452.

[19] M. T. Pearce and G. A. Wiggins, "Evaluating cognitive models of musical composition," in Proceedings of the 4th International Joint Workshop on Computational Creativity, A. Cardoso and G. A. Wiggins, Eds. Goldsmiths, University of London, 2007, pp. 73-80.

[20] H. Schaffrath, "Essen associative code and folksong database," Jan. 2010. [Online]. Available: http://www.esac-data.org/

[21] R. Cilibrasi, P. Vitanyi, and R. de Wolf, "Algorithmic clustering of music based on string compression," Computer Music Journal, vol. 28, no. 4, pp. 49-67, Dec. 2004. [Online]. Available: http://dx.doi.org/10.1162/0148926042728449

[22] D. Conklin and I. H. Witten, "Multiple viewpoint systems for music prediction," Journal of New Music Research, vol. 24, pp. 51-73, 1995.

[23] S. Dixon, M. Mauch, and A. Anglade, "Probabilistic and logic-based modelling of harmony," in Proceedings of the 7th International Conference on Exploring Music Contents, ser. CMMR'10. Berlin, Heidelberg: Springer-Verlag, 2011, pp. 1-19. [Online]. Available: http://dl.acm.org/citation.cfm?id=2040789.2040791

[24] M. Bergeron and D. Conklin, "Structured polyphonic patterns," in ISMIR, 2008, pp. 69-74.
