Developing a Style Modelling System for Creative Generative Music Improvisation

Nigel Stephen D’Souza

Dissertation submitted to the School of Computer Science and Information Technology, University of Nottingham, in partial fulfilment

for the degree of

Master of Science in

The Management of Information Technology

September 2007

Abstract

This dissertation describes an exploratory style modelling application

intended to simulate a guitar player’s style and replicate it with some degree

of creativity and originality.

This was achieved using a combination of a stochastic technique (Markov

Chains) and a biochemical restructuring process (Autocatalytic Set Theory).

The key objective was to see if natural restructuring processes could

enhance traditional stochastic style modelling techniques in order to promote

creative improvisation in computer generated music. In short, the program

generates improvised solos in the guitar player’s style and adds an element

of originality.

Previous studies into style modelling and generative music are discussed in

the literature review. Based on the arguments put forward, ideas for a

program simulation are suggested. Next, there is a description of the

program developed in Java to implement these ideas. Various tests are

conducted and the results analyzed. Finally, a discussion follows to provide a

theoretical insight into the evaluation of the success of the model.

The tests conducted yielded mostly positive results, suggesting that the

model has a sound theoretical foundation and has the potential for further

development and expansion.

Finally, conclusions are drawn about the reasons for the success of the

model and the techniques used. It is recommended that future studies

should be carried out to expand on the ideas presented here and develop the

model further.

Please note that a basic knowledge of music theory and terminology is helpful for fully understanding some of the content of this dissertation, although it is not essential.

This dissertation comes with a CD containing the program, its technical

documentation, a user manual, test data and output samples.

Contents

1 Introduction
  1.1 Aims of the project
  1.2 Problem Description
  1.3 Objectives
2 Literature Review
  2.1 Generative Music
  2.2 Methods for the Creation of Generative Music
    2.2.1 Mozart’s Dice Game
    2.2.2 Constraint Logic Programming
    2.2.3 Cellular Automata
  2.3 Style Modelling
    2.3.1 Autocatalytic Set Theory
    2.3.2 Stochastic Processes – Markov Chains
      2.3.2.1 Markov Chains in Practice
      2.3.2.2 Case for Markov Chains
      2.3.2.3 Case against Markov Chains
3 Proposed Solution
  3.1 Using the strengths of Markov Chains
  3.2 Using the strengths of Autocatalytic Set Theory
    3.2.1 Addressing the shortcomings of Markov Chains
4 Implementation – Stage 1
  4.1 Markov Chains
    4.1.1 The Markov Chain Program
    4.1.2 The Markov Chain Program – With Music
  4.2 Autocatalytic Sets
    4.2.1 Theory Behind an Autocatalytic Set Program
    4.2.2 Simulating Autocatalysis
  4.3 Combining the Two
5 Results
  5.1 Markov Chains
  5.2 Autocatalytic Set Theory
6 Discussion and Recommendations
  6.1 Markov Chain Music
  6.2 Autocatalytic Set Theory
7 Conclusion
8 References
9 Appendix 1 – Java Documentation
10 Appendix 2 – Program User Manual
  A2.1 Running the Program
  A2.2 Sample Data Sets and Output Samples
  A2.3 Program Source Code and Java Documentation

1. Introduction

The essence of music creation comes from the style of the creator. It is

developed through years of practice, inspiration and emotion that shape a

musician’s ‘voice’. This voice may then become the musician’s signature

playing style. For guitar players, a particular choice of note combinations in various musical phrases often becomes their individual style and is instantly recognizable as their work.

1.1 Aims of this project

This project aims to develop a style modelling system for automated music

improvisation. It seeks to capture a particular player’s improvisation style

and generate an improvised solo using its characteristics. The goal is to

develop a suitable style modelling approach for doing this in a way that

produces the closest possible replication of the player’s style but still

maintains a degree of originality and creativity. The literature shows that

other style modelling music generators seem to lack a creative edge. The

problems faced by these systems will be addressed and a solution that

tackles the problem of encouraging computer creativity within a stylistic

context will be proposed.

The first question here is whether or not a style born out of inspiration and

emotion can actually be quantified and recreated in an original sounding

manner. It is proposed that it can be done and, to achieve this end, a music

generator reflecting the author’s style of guitar playing will be developed.

The key argument in this project is whether or not such a music generator

can be ‘creative’ or ‘original’. To this end, the generator being created will

not merely replicate the author’s style, but rather it will compose music that

has an element of creativity.

The creation of a music generator is secondary to the study of the method of

generating the music. Automated music generation is in itself a large study

that has been approached in many different ways by a number of

researchers such as Francois Pachet, Brian Eno, David Cope, Sean Booth,

John Cage and several others. This project will isolate the problem of the

style modelling stage that comes before the actual automated music

generation. Subsequently, there will be a discussion on how this research

might improve current methods of automated generation.

This research may form the basis for further studies in computer creativity

within a specific stylistic context. When combined with music generation techniques for playing in particular musical contexts, it may be used to perform an analysis

and provide an insight into how a particular musician would approach

improvising in those contexts. It may even lead to the generation of musical

phrases that the musicians themselves do not have the technical skills to

play accurately.

1.2 Problem Description

The primary focus of this research is to validate the proposed method of algorithmic creativity within a musical style, for use by creators of generative music. This method is based on Markov Chains and a biochemical theory called Autocatalytic Set Theory. A Markov chain is a

discrete-time stochastic process that shows the probability of one event

following another event or a sequence of other events. Autocatalytic Set

Theory looks at the way RNA string sequences separate and recombine to

create new sequences with different compositions to the original one. There

will be a discussion on why Markov chains are suitable for style modelling

and a system of Markov chains that is best suited to guitar players will be

proposed. Following that, the shortcomings of Markov Chains will be

addressed and it will be shown how Autocatalytic Set Theory may be used to

make up for these.

Using the project’s author to provide the data set, musical phrases that

incorporate what is called the ‘Dorian flavour’ will be documented. The

‘Dorian’ is a particular collection of notes that form a specific mode. It is

identical to the ‘Minor’ mode except for the sixth note, which is a semitone above that of the ‘Minor’ mode. Additionally, many guitar players add an extra flattened fifth note (a semitone below the fifth note) in between the

fourth and fifth notes of the scale. Playing solos with a ‘Dorian flavour’ over

a song in a minor key will result in a slight dissonance, which, although

technically wrong according to the rules of music, is generally found to be

pleasing and musically interesting. A few short phrases have been composed

in the ‘Minor’ scale that have a similar structure to each other and play the

sixth note of the ‘Dorian’ occasionally to create the dissonance and give

them the unique pattern and sound. If the Markov chain program is

successful in accurately documenting the style, the music generator will

compose musical phrases that have this unique sound.
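
The difference between the two modes can be made concrete with a minimal Java sketch. It is an illustration only and is not part of the program described in this dissertation; the class name `ModeSketch`, the choice of D as the root, and the sharps-only note spelling are assumptions made for the example.

```java
import java.util.Arrays;

/** Illustrative only: spells the natural minor and Dorian modes from a root note. */
class ModeSketch {
    // Semitone offsets from the root (standard music theory, not the thesis program).
    static final int[] NATURAL_MINOR = {0, 2, 3, 5, 7, 8, 10};
    static final int[] DORIAN = {0, 2, 3, 5, 7, 9, 10}; // sixth raised by one semitone
    static final int FLATTENED_FIFTH = 6; // optional extra note between fourth and fifth

    // Sharps only, so B flat is printed as A# and A flat as G#.
    static final String[] NOTE_NAMES =
        {"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"};

    /** Returns the note names of a mode built on the given root pitch class (0 = C). */
    static String[] spell(int root, int[] offsets) {
        String[] names = new String[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            names[i] = NOTE_NAMES[(root + offsets[i]) % 12];
        }
        return names;
    }

    public static void main(String[] args) {
        int d = 2; // pitch class of D (assumed root for the example)
        System.out.println("D minor:  " + Arrays.toString(spell(d, NATURAL_MINOR)));
        System.out.println("D Dorian: " + Arrays.toString(spell(d, DORIAN)));
        System.out.println("Flattened fifth: " + NOTE_NAMES[(d + FLATTENED_FIFTH) % 12]);
    }
}
```

The two spellings differ only in the sixth note, which is the note that produces the ‘Dorian flavour’ when played over a minor key.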

The suggested adaptation of the Markov Chains will be described in a

following section. Reasons for choosing this method will also be provided and

it will be compared to other methods in common use.

1.3 Objectives

The following minor objectives were set at the beginning of the project:

1. To develop a program capable of performing Markovian analyses on a set

of raw data.

2. To develop an algorithm for the generation of further data consistent with

the Markovian characteristics of the raw data and extend the program to

perform Markovian analyses on musical corpora.

3. To develop a program that simulates autocatalytic sets.

The following major objectives were set based on the minor objectives:

1. To develop a program capable of quantifying and capturing a guitar

player’s style.

2. To test a theory that the combination of Markovian principles with

autocatalytic set theory can result in the generation of new, original

sounding music while still maintaining characteristics of the original style.

2. Literature Review

2.1 Generative music

“Generative music is commonly agreed to describe music

in which a system or process is composed to generate

music rather than the composition of the direct musical

event which will result from that system. The generative

composer has only indirect control [over] the final musical result,

and the creativity of the compositional process is found in

the decisions about how the system will operate and the

rules inside the system” (Rich 2003 cited in Brown 2005,

p. 217).

The ‘Illiac Suite’ is generally heralded as the first piece of music

automatically generated by a computer (Baggi 1998). It was created using

the Illinois Automatic Computer in 1956 at the University of Illinois at

Urbana-Champaign. While the produced corpus was not meant to be a piece

of ‘beautiful’ music, the concept of music generation using programming

algorithms was the real triumph of the ‘Illiac Suite’.

Since then, there have been several advancements in the algorithms used to

generate music. Some moderately successful programs, software packages and

products have been created to automatically generate music based on a few

user inputs. Notable ones include “Koan Pro” by SSEYO and GarageBand by

Apple (Brown 2005).

2.2 Methods for the creation of Generative Music

Different methods have been used to automatically generate music. Eacott

(2000) argues that computers cannot really compose music. Any music

generated by a computer is the result of human input and/or interaction. All

methods use some form of user input to generate music. The following is a

sample of the broader groups of methods for automated music generation.

2.2.1 Mozart’s Dice Game

The famous composer W.A. Mozart conceived one of the earliest ideas for

random music generation, “Musikalisches Würfelspiel”. In this game, a series

of short musical phrases is pre-composed and notated. A die is then rolled

to choose from these phrases to form a two-part waltz (Baggi 1998). Wilcox

(2007) created an online music generator based on Mozart’s dice game.

However, instead of a dice roll, the program used the user’s details as input.

The obvious shortcoming of the dice game is that it has a very limited

compositional scope. The composition is merely a recombination of pre-existing phrases entered by a human. The phrases themselves have their

whole structure maintained and are not broken down or separated. No part

of any phrase is analyzed to have its stylistic content isolated and used for

original composition.

Overall, Mozart’s dice game is just that, a game. There is little or no scope

for truly original composition and hence it cannot be considered seriously for

commercial music automation.

2.2.2 Constraint Logic Programming

Descriptions of a piece of music can be expressed in terms of rules. When

modelling a person’s style, these rules may become more complex and

abstract (Anders 2003). However, there is still some amount of recognizable

structure. The rules may apply to a large number of musical parameters

such as the note pitches or the duration of each note. How one note is

played may be dependent on how the preceding notes were played or what

the next note is likely to be.

Setting these rules will not necessarily lead to the creation of a single

musical composition. Rather, it isolates a subset of the infinite possible

musical scores that could be composed in the absence of fixed rules. This

subset conforms to the ‘rules’ behind an individual style. Programs that

define these rules and mutual note dependencies with multiple solutions are

called non-deterministic programs. The rules/restrictions and mutual note

dependencies are called constraints (Anders 2003).

For the simplest general form of music, independent of any style modelling,

these constraints could refer to the tempo of the song or the scale that the song is played in (referred to as the key of the song). The model used in this project is a branch of Constraint Logic Programming that is aimed

specifically at playing within a set stylistic context. It is aimed at setting

constraints based on a guitarist’s probability of playing a certain note based

on the preceding notes in the phrase. This will be explained in full detail in

section 2.3.2.

A simple example of a constraint-based system is as follows. A single

constraint is set that the song should be in the key of ‘D minor’; that is,

every note generated by the computer should be a part of the ‘D minor’

scale. The computer takes a ‘random walk’ through every possible music

note until it finds one that satisfies the constraint (of being in the key of ‘D

minor’) and it plays this note.
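
The single-constraint example above can be sketched in Java as follows. This is a minimal illustration under stated assumptions rather than the program developed for this project: notes are represented as MIDI numbers, the candidate range of 50 to 74 is arbitrary, and the class and method names are hypothetical.

```java
import java.util.Random;

/** Illustrative only: random candidates are rejected until one lies in D minor. */
class ConstraintWalkSketch {
    // Pitch classes of D natural minor: D E F G A Bb C.
    static final boolean[] IN_D_MINOR = new boolean[12];
    static {
        for (int pitchClass : new int[] {2, 4, 5, 7, 9, 10, 0}) {
            IN_D_MINOR[pitchClass] = true;
        }
    }

    /** The single constraint: the MIDI note must belong to the D minor scale. */
    static boolean satisfies(int midiNote) {
        return IN_D_MINOR[midiNote % 12];
    }

    /** Draws random candidates in [low, high] until one satisfies the constraint. */
    static int nextNote(Random rng, int low, int high) {
        while (true) {
            int candidate = low + rng.nextInt(high - low + 1);
            if (satisfies(candidate)) {
                return candidate;
            }
        }
    }

    /** Generates a phrase in which every note passes the constraint. */
    static int[] generate(Random rng, int length) {
        int[] phrase = new int[length];
        for (int i = 0; i < length; i++) {
            phrase[i] = nextNote(rng, 50, 74); // assumed playable range
        }
        return phrase;
    }

    public static void main(String[] args) {
        for (int note : generate(new Random(), 8)) {
            System.out.print(note + " ");
        }
        System.out.println();
    }
}
```

Every note in the output belongs to the scale, but nothing relates one note to the next; that relationship is what the probabilistic constraints discussed in section 2.3.2 are intended to supply.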

A strong criticism of Constraint Logic Programming is that “the methods

employed to produce generative music are merely technical and cannot be

described as artistic” (Brown 2005 p. 224). It may be argued that the music

is derived mathematically and lacks inspiration or emotion, as computers are

incapable of experiencing either.

However, it should be noted that ‘style’ is essentially a function of a person’s

own mental representation of music as per the literature on the cognitive

psychology of music (Franz 1998). Whether or not a phrase is musically

appealing is completely subjective and is determined based on the opinion of

the listener. Thus a good test would be to have a sample of musicians listen

to the output of a program and rate its musical appeal.

2.2.3 Cellular Automata

Cellular automata are important modelling and simulation tools. They form an interesting modelling system based on the evolution and growth of living

cellular organisms. “They are discrete dynamical systems; that is, they

change some feature with time” (McAlpine et al 1999, p. 23). Different

disciplines from physics and chemistry to sociology and philosophy have

used cellular automata principles.

A cellular automaton is an array of elements referred to as cells. Evolution

rules are applied to each cell to determine how each automaton develops in

time. McAlpine et al (1999) document the CAMUS 3D research project that

applies these evolution rules to music composition to model pattern

propagation. Each theme in a composition is a separate pattern, which is

subjected to evolutionary musical transformations (such as transposition,

augmentation, inversion, etc.). Design constructs, similar to rules as in

constraint logic, are set in place to guide the evolution of the composition.

McAlpine et al (1999) suggest that human composers employ pattern

propagation intuitively whereas computer based Cellular Automata formalize

this propagation at a much more advanced level. All the musical patterns

evolve according to the constraints that are set initially. Hence, any

common stylistic musical features that emerge as a result are a sonification

of the evolutionary behaviour of the automaton.

The application of evolutionary rules may be seen as an advanced version of

constraint logic where the ‘constraint’ actually allows for evolution,

reconstitution and growth of a musical idea as represented by a simulated

organic cell. Composition constraints are specified in advance for both constraint logic and cellular automata.

In an automaton, the rules are localized to the existing cells. The ‘random

walk’, as mentioned earlier, will go through all the possible notes at time t

and choose one that meets the set evolution rules as they apply to the note

at time t-1. Thus, like living organic cells, they develop based on existing

states.
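
The dependence of each cell at time t on the existing cells at time t-1 can be illustrated with a minimal one-dimensional automaton in Java. The rule used here (a cell becomes live when exactly one of its two neighbours was live) is a generic textbook rule chosen for brevity; it is an assumption for the example and is not the rule set used by CAMUS 3D. How live cells would be mapped to notes is left open.

```java
/** Illustrative only: a one-dimensional cellular automaton with a local update rule. */
class AutomatonSketch {
    /** Computes generation t from generation t-1 using only each cell's two neighbours. */
    static boolean[] step(boolean[] cells) {
        int n = cells.length;
        boolean[] next = new boolean[n];
        for (int i = 0; i < n; i++) {
            boolean left = cells[(i + n - 1) % n];  // wrap around at the edges
            boolean right = cells[(i + 1) % n];
            next[i] = left ^ right;                 // live if exactly one neighbour was live
        }
        return next;
    }

    /** Renders a generation as text: '#' for a live cell, '.' for a dead one. */
    static String render(boolean[] cells) {
        StringBuilder sb = new StringBuilder();
        for (boolean cell : cells) {
            sb.append(cell ? '#' : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        boolean[] cells = new boolean[11];
        cells[5] = true; // a single live cell as the starting pattern
        for (int generation = 0; generation < 5; generation++) {
            System.out.println(render(cells));
            cells = step(cells);
        }
    }
}
```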

Cellular Automata are an important advancement in the intelligence of generative music. Having ‘cells’ of notes that evolve, change, develop and augment based on previous passages forms the basis for the development of a

stylistic model. This can be done by guiding the evolution of notes to fit

stylistic ‘rules’ behind a player’s performance.

2.3 Style Modelling

Style modelling implies “building a representation of the musical surface that

captures important stylistic features hidden in the way patterns of rhythm,

melody, harmony and polyphonic relationships are interleaved and

recombined in a redundant fashion” (Dubnov et al 2003b, p. 2).

This kind of model makes it possible to generate new musical sequences that

conform to the documented style. Franz (1998) states that the literature indicates a lack of suitable models or analysis techniques for measuring style and creativity in improvisation.

2.3.1 Autocatalytic Set Theory

Autocatalytic Set theory stems from observation of reactions in RNA

sequences. It is best explained with a diagram, as shown on the next page.

According to this theory, specific portions of an RNA sequence (called

‘digestive enzymes’) react with matching enzymes found in another

sequence to break it apart at those matching portions. This process is then

reversed so that these enzymes are used to join other sequences or

fragments of sequences around matching enzymes. Thus, each member in a

set of sequences is the product of at least one reaction catalyzed by at least

one other member.

“Theoreticians at the Santa Fe Institute found that an artificial-life program

could simulate a living metabolism, in which a few chemicals acting as

enzymes digest other molecules and reassemble them into different ones in

complicated cycles” (Zimmer 1993).

It can be seen that Autocatalytic Set theory, when applied to computer

science, is an extension of Cellular Automata. Iverson (1990) proposed the

idea of applying autocatalytic set theory to music as a basis for a style

modelling system. He developed the ‘Metamuse’ program to do this. This

was a completely different approach to style modelling at the time (Zimmer

1993).

Metamuse works by having a musical score fed into it. It then randomly

extracts a string of four notes. That string acts as a digestive enzyme by

identifying the same sequence of notes in the rest of the composition and

separating the composition into two pieces in the middle of that sequence.

The digestive enzyme then reproduces itself, and the two copies then search

for more matches. Upon reproduction, an enzyme has a small chance of mutating, that is, of having its notes changed. Eventually, a number of different

enzymes are at work, digesting and reproducing.

This digestive process ceases once the entire composition has been reduced

to a set of fragments roughly the size of the enzymes themselves. Metamuse

then reverses this process. The fragments reassemble themselves, each one

looking for two other fragments to stitch together end-to-end, as is shown in

the diagram above. Again, a sequence that catalyzes a reaction is allowed to

copy itself and is prone to mutation. (Zimmer 1993)
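
The digestion step described above can be sketched in Java as follows. This is a simplified illustration and not Metamuse's actual code, which is not reproduced in the sources cited here: it uses a single fixed enzyme and omits enzyme reproduction, mutation and the reassembly phase. Notes are represented as integers, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: cuts a note sequence in the middle of every enzyme match. */
class DigestSketch {
    /** True if the enzyme occurs in the piece starting at index start. */
    static boolean matchesAt(int[] piece, int[] enzyme, int start) {
        if (start + enzyme.length > piece.length) {
            return false;
        }
        for (int i = 0; i < enzyme.length; i++) {
            if (piece[start + i] != enzyme[i]) {
                return false;
            }
        }
        return true;
    }

    /** Splits the piece at the midpoint of each non-overlapping enzyme match. */
    static List<int[]> digest(int[] piece, int[] enzyme) {
        if (enzyme.length < 2) {
            throw new IllegalArgumentException("enzyme needs at least two notes");
        }
        List<int[]> fragments = new ArrayList<>();
        int fragmentStart = 0;
        int i = 0;
        while (i < piece.length) {
            if (matchesAt(piece, enzyme, i)) {
                int cut = i + enzyme.length / 2; // middle of the matched sequence
                fragments.add(Arrays.copyOfRange(piece, fragmentStart, cut));
                fragmentStart = cut;
                i += enzyme.length;              // continue after this match
            } else {
                i++;
            }
        }
        fragments.add(Arrays.copyOfRange(piece, fragmentStart, piece.length));
        return fragments;
    }

    public static void main(String[] args) {
        int[] piece = {1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 7};
        int[] enzyme = {1, 2, 3, 4}; // a four-note enzyme, as in the description
        for (int[] fragment : digest(piece, enzyme)) {
            System.out.println(Arrays.toString(fragment));
        }
    }
}
```

Because each cut falls inside a match, the enzyme itself is split across neighbouring fragments, which is what allows it to act as the joining point when the process is reversed.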

Iverson’s general idea was to preserve the unique stylistic characteristics of

the music by essentially dissecting and recombining a given piece of music.

However, the success rate was low as only one in three pieces of music

produced by Metamuse was considered to be a successful representation of

the intended musical style (Iverson 1990).

While the program itself may not have been a success, the ideas behind it

may be useful for further development of a style modelling system. Iverson

made the following important suggestions:

1. Assimilation of a piece of music is best done by first isolating it into different series of elements. These may be note pitches, note durations, rhythm, or any other user-selected element.

2. Having a certain degree of mutation may potentially result in productive additions to the autocatalytic set.

An apparent flaw is the selection of the digestive enzyme. Iverson chose to

select a digestive enzyme at random, rather than based on any specific

reason or logical choice. The digestive enzyme will reappear frequently in

different sections of the improvised piece because it joins sub-strings

together to create larger strings and, barring mutation, is preserved in its

entirety. On dealing with style modelling, Pachet (2002a, p. 77) suggests,

“music, especially improvised music, is made up of…[…]…repetitive

ingredients…[in addition to]…unexpected events or material”. If a stylistic trait of the player were chosen as the enzyme, its frequent reappearance may give the listener a sense of that player’s style.

This is supported by Cope (1999), who states that artists have certain

stylistic traits, or signatures that occur frequently across compositions. He

suggests that these signatures should avoid fragmentation, as doing so

would result in inaccurate style replication.

Hence, careful selection of the initial enzyme may help to preserve stylistic

traits or signatures, and this will be discussed further in section 3.2.

2.3.2 Stochastic Process - Markov Chains

Stochastic processes are the counterparts of deterministic processes

considered in probability theory. They deal with states that can have several

possible outcomes. They are sometimes random processes that choose one

of these possible outcomes based on a set of conditions (Winston 2004).

A ‘discrete-time stochastic process’ amounts to a sequence of random

variables based on certain conditions. Simply, if X is the value of a particular

variable at time ‘t’, a discrete-time stochastic process is a description of the

relation between X at time ‘t’ and X at time ‘n’ for any number of future time

periods (Winston 2004).

A Markov Chain is one such discrete-time stochastic process that shows the

probability of transition from $X_t$ to $X_{t+1}$. The simplest way to present these

chains is in a transition probability matrix.

This matrix shows the probability of going from the vertical states in one

time period to the horizontal states in the next time period. Thus, if the

current time period is ‘t’, the probability of going from $A_t$ to $A_{t+1}$ is 10%, $A_t$ to $B_{t+1}$ is 0%, $A_t$ to $C_{t+1}$ is 90%, $B_t$ to $A_{t+1}$ is 50% and so on (Winston 2004).

Note that the sum of all probabilities in each of the rows is equal to 100%.

The table is said to be normalized, implying that there is absolute certainty

that there will be some output at time ‘t+1’ from any of the states at time ‘t’

(McAlpine et al 1999).
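
A normalized transition probability matrix of this kind can be estimated from an observed sequence by counting transitions and dividing each row by its total. The Java sketch below is a minimal illustration, not the program described in chapter 4; the input sequence is invented for the example, and states A, B and C are encoded as 0, 1 and 2.

```java
import java.util.Arrays;

/** Illustrative only: estimates a first-order transition matrix from a sequence. */
class TransitionMatrixSketch {
    /** Counts transitions between consecutive states, then normalizes each row to sum to 1. */
    static double[][] estimate(int[] sequence, int stateCount) {
        double[][] matrix = new double[stateCount][stateCount];
        for (int i = 0; i + 1 < sequence.length; i++) {
            matrix[sequence[i]][sequence[i + 1]] += 1.0; // row = state at t, column = state at t+1
        }
        for (double[] row : matrix) {
            double total = 0.0;
            for (double count : row) {
                total += count;
            }
            if (total > 0.0) { // a state never seen as a source keeps an all-zero row
                for (int j = 0; j < row.length; j++) {
                    row[j] /= total;
                }
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        int[] sequence = {0, 2, 1, 0, 2, 1, 1}; // invented data: A C B A C B B
        for (double[] row : estimate(sequence, 3)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Each non-empty row sums to 1, which corresponds to the certainty that some state follows at time ‘t+1’.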

Markov chains can also have multiple ‘orders’ or ‘states’ as such:

In this bigram (two order chain), the probability of going from $A_{t-1}A_t$ to $A_{t+1}$ is 10%, $A_{t-1}A_t$ to $C_{t+1}$ is 90%, and so on. Thus if A, B and C are musical notes, the probability that $A_{t-1}$ followed by $B_t$ is then followed by $A_{t+1}$ is 50%. As such, Markov Chains can have any number of states (Winston 2004).

The ‘mind reading machines’ created by Hagelbarger in the early 1950s sought to predict choices made by humans. It was found that this could be done by analysing patterns of previous choices using Markov Chains (Dubnov et al 2003a). Markov Chains provide a method for documenting patterns of previous choices, and thus can be used to predict future choices.

A stream of musical notes can be seen as a stream of events. Such a stream

might be analyzed using a Markov Chain and framed in a probability

transition matrix. To this end, Brooks et al (1957) suggest one of the earliest

musical applications of Markov Chains in their study entitled “An Experiment

in Musical Composition”. They state that computers are capable of performing deductive reasoning and hypothesize that probability states stored by Markov chains could be used as input for the deductive process in order to arrive at a musical output. Analyzing patterns of previous note choices in this way could be used to predict future note choices.

Pachet suggests (2002a, p. 77), “Markov-based models always generate

music, which is consistent with a style, itself defined by recurring patterns

found in some learned corpus”. Thus, he suggests that recurring patterns

can be used to define a style.

‘Style’ is a subjective concept and is therefore difficult to quantify (Franz

1998). One possible approach to recreating a guitar player’s style is to select

highly probable note choices or musical phrasings that she is likely to choose

in a particular musical context. The content of Markov chains shows the

movement from one note to another with some degree of probability. The

difficulty lies in measuring how external context (the style of the guitar

player playing the notes) affects these probabilities.

To solve this problem, we look at Claude Shannon’s ‘Information Theory’. It

considers the predictability or probability of a verbal or written message

using Markov chains (Huron 2001) much like Brooks et al (1957) did with

music notes. An important part of this discussion is the role of context. Shannon sets up a frame of predictability in which previous states constrain the probability of occurrence of the next possible state.

This context of dependency may apply to guitar playing style in the following

way. Perhaps one guitarist’s style is to play each note of the scale in

ascending order. Perhaps she also prefers to skip every third note. Thus, if it

is observed that two consecutive notes have been played in the two

preceding states, there is a high probability that the third will be skipped in

the current state and the fourth will be played instead.

As has been shown, Markov chains can have any number of states. Hence, in

this example the frame of predictability should consider at least two

preceding states (a 2-state Markov chain). It becomes clear that a multiple

state chain will be more successful in predicting a player’s style.
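
A frame of predictability that looks back over two preceding states can be sketched in Java as a table keyed by pairs of notes. This is a minimal illustration under stated assumptions and not the data structure used in the program described later: notes are integers, the context key is a simple string, and the input phrase is invented.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a Markov model conditioned on the two preceding notes. */
class SecondOrderSketch {
    /** Maps each pair of preceding notes to counts of the notes that followed it. */
    static Map<String, Map<Integer, Integer>> count(int[] notes) {
        Map<String, Map<Integer, Integer>> table = new HashMap<>();
        for (int i = 2; i < notes.length; i++) {
            String context = notes[i - 2] + "," + notes[i - 1];
            table.computeIfAbsent(context, k -> new HashMap<>())
                 .merge(notes[i], 1, Integer::sum);
        }
        return table;
    }

    /** Estimated probability that next follows the context (first, second). */
    static double probability(Map<String, Map<Integer, Integer>> table,
                              int first, int second, int next) {
        Map<Integer, Integer> followers = table.get(first + "," + second);
        if (followers == null) {
            return 0.0; // context never observed
        }
        int total = 0;
        for (int c : followers.values()) {
            total += c;
        }
        return followers.getOrDefault(next, 0) / (double) total;
    }

    public static void main(String[] args) {
        int[] phrase = {0, 1, 0, 1, 2}; // invented data: A B A B C
        Map<String, Map<Integer, Integer>> table = count(phrase);
        System.out.println("P(A | A,B) = " + probability(table, 0, 1, 0));
        System.out.println("P(C | A,B) = " + probability(table, 0, 1, 2));
    }
}
```

In the invented phrase, the context A, B is followed once by A and once by C, so each continuation is estimated at 50%. Longer contexts follow the same pattern at the cost of a larger table.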

Constraint Logic requires a set of rules or constraints. Markov chains provide

these ‘constraints’ as a set of probabilities of proceeding to a selection of

future states from a current state. Thus, they are suitable for use within a

Constraint Logic framework.

2.3.2.1 Markov Chains in Practice

Over the years, Markov Chains have been used extensively in music style

modelling. Limpaecher (2007) created a simple program that could compose music based on an input file using a number of probabilistic techniques, primarily Markov models. The output of this

program was evaluated and found to be fairly impressive.

McAlpine et al (1999) report on the aforementioned CAMUS 3D program,

suggesting that, with a little effort, the system can produce pleasing

and much more natural sounding compositions as compared to those

obtained using comparable algorithmic techniques.

However, the most impressive system using Markov chains is the

‘Continuator’, a product developed by Sony Computer Science Laboratories

(Pachet 2002a). This is a MIDI enabled keyboard that stores and analyzes a

player’s performance in real-time as it is played. It then plays back an

improvisation based on the artist’s style. Pachet boasts of the ‘Aha effect’.

Upon listening to the generated improvisation, the musicians react

extremely positively, often exclaiming, “Aha!”

Some have even declared that it plays improvisation of a much higher level

of skill than they are capable of, but still maintains their unique style.

These programs and products are only a few of those that have been designed using Markov chains as the primary form of style modelling documentation. They stand as testament to the fact that Markov chains can

result in successful and considerably accurate preservation and replication of

a musician’s style.

2.3.2.2 Case for Markov Chains

As is shown above, Markov Chains provide an excellent format for style

modelling.

Franz (1998, p.27) proved “Markov chains and the tools for their analysis

can be interpreted to determine quantitative measures of creativity and

style”. Documenting patterns of previous note choices using the chains

provides a means for predicting future note choices consistent with the style.

It has been shown that “expectations based on recent past context guide

musical perception” (Dubnov et al 2003b, p. 3). Franz tested the use of

Markov Chains as analytical tools by transcribing jazz solos played by John

Coltrane in the song ‘Giant Steps’ on several live records. Markov Chains

were found to be useful and effective for the analysis of jazz improvisation.

The specific style (jazz) is unimportant. The study proved that Markov

Chains are an effective format for style documentation, regardless of musical

style.

Stochastic techniques in general offer a useful means of data reduction

(Jones 1981). Constraint Logic requires a detailed specification of rules for

the progression of music and Markov Chains provide these rules (as a set of

possible options for state transitions) in a simple, compact manner.

‘Knowledge engineering’ is a generative music approach that involves

explicit coding of music rules in some form of logic or formal grammar

(Dubnov et al 2003b). While it produces impressive results, it requires

extensive exploitation of musical knowledge of a composer’s style. Statistical

learning methods are far easier to program and use.

These Stochastic models are fast, efficient data structures and can be used

to rapidly match contexts (Conklin 2003). This data reduction combined with

efficient computing techniques has led to the creation of many successful

real-time Markovian music generators. The best example of a real-time

music generator is the Continuator mentioned above.

Markov chains make it possible to compute the ‘optimal’ piece of music by

selecting a path, through the states, with the highest probability (Mahmud

2006). This is useful for pre-processing music. For real-time application,

using the ‘random walk’ method makes it straightforward to generate new

music using a Markov Chain. This may result in unanticipated note choices.

These are note choices or whole phrases that might have a low probability of

occurrence. Jones (1981, p. 45) states, “The bonds of a restrictive and

inaccurate acoustic theory and of a limited aural imagination may be

broken”.
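
The contrast between following the most probable transitions and taking a ‘random walk’ can be sketched in Java as follows. This is a minimal illustration, not the generator described in this dissertation. The matrix entries for A to A, A to B, A to C and B to A follow the example in section 2.3.2; all other entries are assumed for the example. The greedy method shown picks the most probable next state at each step, which is a simplification and is not guaranteed to give the most probable path overall.

```java
import java.util.Random;

/** Illustrative only: greedy versus random-walk generation from a transition matrix. */
class GenerationSketch {
    // Rows: current state A, B, C. Columns: next state A, B, C.
    static final double[][] MATRIX = {
        {0.10, 0.00, 0.90}, // from section 2.3.2
        {0.50, 0.25, 0.25}, // 0.50 from section 2.3.2; the rest assumed
        {0.30, 0.30, 0.40}  // assumed
    };

    /** Greedy step: always take the single most probable next state. */
    static int mostLikely(double[] row) {
        int best = 0;
        for (int j = 1; j < row.length; j++) {
            if (row[j] > row[best]) {
                best = j;
            }
        }
        return best;
    }

    /** Random-walk step: sample the next state in proportion to its probability. */
    static int sample(double[] row, Random rng) {
        double r = rng.nextDouble();
        double cumulative = 0.0;
        int lastPossible = 0;
        for (int j = 0; j < row.length; j++) {
            if (row[j] > 0.0) {
                lastPossible = j;
                cumulative += row[j];
                if (r < cumulative) {
                    return j;
                }
            }
        }
        return lastPossible; // reached only if rounding leaves the row sum just below 1
    }

    /** Generates a sequence of the given length by repeated random-walk steps. */
    static int[] walk(double[][] matrix, int start, int length, Random rng) {
        int[] states = new int[length];
        if (length == 0) {
            return states;
        }
        states[0] = start;
        for (int i = 1; i < length; i++) {
            states[i] = sample(matrix[states[i - 1]], rng);
        }
        return states;
    }

    public static void main(String[] args) {
        String names = "ABC";
        StringBuilder sb = new StringBuilder();
        for (int state : walk(MATRIX, 0, 16, new Random())) {
            sb.append(names.charAt(state));
        }
        System.out.println("Random walk: " + sb);
        System.out.println("Greedy next state after A: " + names.charAt(mostLikely(MATRIX[0])));
    }
}
```

The random walk occasionally takes low-probability transitions such as A to A, which is the source of the unanticipated note choices mentioned above, while transitions with zero probability such as A to B never occur.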

He further suggests electronic music is often criticized for being too sterile,

but stochastic techniques may be used to add a human element to computer

generated music. However, all of this relies on the subjective perception of

the listener and the music generated by stochastic processes may or may

not be found pleasing. This will be discussed further in section 2.3.2.3 which

looks at criticisms of Markov Chains.

Given their ease of use and efficient data storage, Markov Chains

have proven to be a useful tool both for the analysis and for the generation

(both pre-processed and real time) of music. As is shown above, they have

been used in several successful projects, both academic and commercial.


2.3.2.3 Case against Markov Chains

While Markov chains have been used widely in music generation systems for

style modelling, they do have their drawbacks.

Firstly, they are limited to capturing only adjacent relationships between

events, in this case, notes. They cannot capture non-adjacent relationships

in a compact fashion (Chomsky 1957). Having an n-state model may help to

alleviate this problem to some extent as it defines the probability of note A

at time t being followed by note B at time t+n.

As Pachet (2002a) suggested, the style of improvised music is reflected in

recurring patterns found in the corpus. Chomsky (1957) found that Markov

Chains could not model recursive embedding of such phrase structure. This

can be seen as an expansion of a non-adjacent relationship between, in this

case, whole phrases where a particular phrase is repeated in different places

as part of the musician’s style. Again, this problem might be solved to an

extent by having larger n-state chains wherein relationships between certain

combinations of notes (which make up a phrase) over a larger span of time

are evaluated.

This poses another problem. Choosing a suitably sized Markov Chain to

model musical phrases is difficult. As Brooks et al (1957) found, at low


orders such as 2-state chains, the model failed to capture the style

effectively. At very high orders, the phrases were simply replicated with very

little original composition being generated. Thus, finding the optimal order is

difficult and may vary depending on the application it is being used for.

Constraint Logic programs using Markov Chains and other stochastic

processes are dependent on the ‘random walk’ mentioned earlier. While this

may lead to the selection of notes that have some probability of occurrence,

they may not necessarily select the optimal path to provide the most

statistically probable musical phrase (Franz 1998). It is possible for the

‘random walk’ to go through highly improbable note choices and result in the

composition of a highly improbable corpus. Another possible consequence is

that, at some stage, the program might arrive at a situation where

subsequently only low probability events are possible or that the distribution

at subsequent stages has high entropy (Conklin 2003).

Allan (2002) tested the effectiveness of the ‘random walk’ method and found

its output to be significantly lower than the optimal output.

A possible solution would be to create a ‘branch and bound’ algorithm, such

as Dijkstra’s Method, to search every possible corpus (within a specified

corpus length) and identify the optimal corpus beforehand. However, this

would not be feasible for real-time applications such as the Continuator as it

requires pre-processing time that would hinder performance speed when


playing simultaneously with the musician. However, here we can come back

to the highly subjective nature of music and its perception by different

listeners as mentioned earlier. Csikszentmihalyi (1988, p. 325) suggests that

it is difficult to differentiate “what is creative from what is statistically

improbable or bizarre”. This suggests an improbable piece may not

necessarily be aurally displeasing and may be perceived as creative. The

success of the ‘random walk’ would eventually depend upon the opinion of

the listener.

Overall, the literature suggests Markov chains used alone as a means for

generating original and creative pieces of music in a particular style are

ineffective. Instead, they are better used in conjunction with other

algorithms. This can be seen in the Continuator, which combines Markov

chains with Prefix Trees for quicker execution (Pachet 2002b). All in all,

Markov Chains are used at the base of highly successful programs and

therefore should not be dismissed as a usable system for style modelling.

They are currently the most widely utilized method of modelling and

implementing probability analysis (McAlpine 1999).


3. Proposed Solution

The proposed program works as follows:

The guitarist plays a series of musical phrases in a particular style. These are

transcribed into MIDI data and fed into the program. The phrases are

analyzed and Markov chains are created and populated as per the data in

the program. The program then performs a series of algorithmic

computations based on the input data and creates an output MIDI file with

an original phrase.

3.1 Using the strengths of Markov Chains

As has been shown above, Markov chains provide an efficient, fast and data

conservative method of documenting a musician’s style as shown by use in

the Continuator. Thus, Markov Chains were used to store the player’s

phrases and calculate probabilities of state transitions (going from one note

to another).


Choosing an optimal state size for the chain is another challenge. Low

orders such as 2-state chains will not capture the style effectively. High

orders will lead to replication with very little original composition being

generated (Brooks et al 1957). Given the shape of the guitar and the

positioning of the guitarist’s fingers to sound the notes, a guitar player’s

pattern of note choice is likely to reflect to some extent the limited flexibility

in movement from one finger position to another.

A guitarist uses four fingers on the fret board to alter the pitch of the string

being plucked. Having a four state Markov chain would show the probability

of moving from four previous notes to a fifth one. Analyzing how the

guitarist’s fingers have just moved might provide some insight into how they

are likely to move next. Hence a four-state chain was used to store the

data.

The next problem was the selection of notes. Biles (1994) introduced the

idea of fitness of individuals in a population. The individuals here are music

notes (in a population of music notes) in the Markov Chain that have

probability > 0%, implying that they may be chosen as the next step in the

musical path. A method must exist for determining which of the possible

future states is to be chosen in a way that reflects their probability of

occurrence. That is, if B and C have 90% and 10% chances respectively of

following A, B should be chosen 9 out of ten times on average. Biles


suggests that if a ratio scale can be developed for fitness, then greater

sophistication may be used in note selection.

To achieve this sophistication, a ‘random walk’ was then taken through

the chains to form a path (or musical phrase) of a pre-specified length. At

each step in the path, a random number was generated and percentiles were

created based on the probabilities of possible future states in the Markov

Chain. The generated number was matched against these percentiles to

choose the next state. Using this method, on average, the choice of path

was weighted by probability. That is, if the current state was A, B was

chosen 9 out of 10 times on average.
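The weighting described above can be checked empirically. The following is a minimal sketch rather than the code of the program itself: the class and method names (WeightedChoiceDemo, pick, frequencyOfFirst) and the fixed seed are assumptions made for illustration. It repeatedly draws the successor of A, with probability 0.9 for B and 0.1 for C, and reports how often B is chosen.

```java
import java.util.Random;

final class WeightedChoiceDemo {
    /** Returns the index whose cumulative probability band contains r (0 <= r < 1). */
    static int pick(double[] probabilities, double r) {
        double upper = 0.0;
        for (int i = 0; i < probabilities.length; i++) {
            upper += probabilities[i];
            if (r < upper) {
                return i;
            }
        }
        return probabilities.length - 1; // guard against floating-point rounding
    }

    /** Fraction of draws in which the first option (index 0) is chosen. */
    static double frequencyOfFirst(double[] probabilities, int draws, long seed) {
        Random rng = new Random(seed); // fixed seed only to make the run repeatable
        int hits = 0;
        for (int i = 0; i < draws; i++) {
            if (pick(probabilities, rng.nextDouble()) == 0) {
                hits++;
            }
        }
        return (double) hits / draws;
    }

    public static void main(String[] args) {
        double[] afterA = {0.9, 0.1}; // hypothetical chain: A -> B (90%), A -> C (10%)
        System.out.println("B chosen with frequency " + frequencyOfFirst(afterA, 100_000, 42L));
    }
}
```

Over a large number of draws the reported frequency should settle close to 0.9, which is what "9 out of 10 times on average" means in practice; any single short phrase may deviate from it.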

3.2 Using the strengths of Autocatalytic Set Theory

The literature shows that Autocatalytic sets are only successful in creating

music a third of the time. It has been proposed earlier that a possible

shortcoming is the choice of a digestive enzyme. However, this might

actually be a strength of the theory. To see how, first the role of

recombinancy needs to be understood.

Cope (1999) suggests that recombining extant music in a particular style

into new logical successions creates stylistically similar music. Recombinancy


is a natural evolutionary and creative process. Autocatalytic Set Theory

essentially works by recombining extant phrases. The ‘logical succession’

here is only based on finding a matching enzyme when recombining two

strings.

It has been suggested earlier that an artist’s signature or stylistic trait

should be preserved in a piece of music to give a sense of the artist’s style.

As Cope (1999) suggested, signatures and recombination are opposites, as

signatures must preserve their entire structure to give a sense of the artist’s

style. It has also been suggested that frequent repetition of this signature

may reinforce this sense of the artist’s style.

The autocatalytic set program works by letting the digestive enzyme break

down a piece of music into fragments at sections of repetition of the enzyme

and recombine these fragments with each other. Thus, the enzyme will be

preserved in its entirety in the resulting corpus. If a signature trait is chosen

to act as the enzyme, it will be preserved in its entirety and repeated

frequently at points where strings have been recombined.

Thus, a few signature traits were identified and used as digestive

enzymes. The choice of a signature for an enzyme may be the strength of the

autocatalytic set, as it provides a mechanism for the natural preservation of

a stylistic trait that Markov Chains do not.

For the purpose of this project, mutation of the enzyme was ignored.


3.2.1 Addressing the shortcomings of Markov Chains

As the literature suggests, the style of improvised music is reflected in

recurring patterns found in the corpus. Chomsky (1957) found that Markov

Chains could not model recursive embedding of such phrase structure. This

can be seen as an expansion of a non-adjacent relationship between, in this

case, whole phrases where a particular phrase is repeated as part of the

musician’s style. This problem might be solved to an extent by having larger

n-state chains wherein relationships between certain combinations of notes

(which make up a phrase) over a larger span of time are evaluated.

However, autocatalytic sets allow the natural development of recursive

melodies. These melodies are the signatures that act as digestive enzymes

and recur in the piece upon recombination of strings (Iverson 1990).

Choosing a suitably sized Markov Chain to model musical phrases is difficult.

As Brooks et al (1957) found, low orders such as 2-state chains fail to

capture the style effectively whereas very high orders result in replication of

the original corpus with very little original composition being generated.

If a piece of music generated by the constraint logic process is then

subjected to fragmentation under the Autocatalytic Set Theory, strings that


might have been replicated due to choosing a high order will be broken down

and subsequently restructured into different combinations by autocatalysis.


4. Implementation – Stage 1

The program designed to analyze the input data was broken into two main

segments: Markov Chains and Autocatalytic sets. It was created using the

Java programming language. The creation of a program is not the primary

focus of this project. It was done to facilitate calculations and create an

output showing the result of the algorithm. The algorithm is the primary

focus of the project.

4.1 Markov Chains

The first part of the program constructs Markov Chains based on input data.

A flexible Markovian Analyzer was created to produce, evaluate, manipulate

and read from Markov chains. A UML diagram for the Markovian Analyzer is

shown in Figure 1 on the next page.

As MIDI music notes in Java are represented with numbers, the ‘states’ in

the Markov Chains are integers representing the music notes. Examples of

the chains created are shown in section 4.1.2.


4.1.1 The Markov Chain Program

The following diagram shows the classes (representing objects) that were

created in the program.

Fig 1. UML diagram of the Markovian Analyzer program


4.1.1.1 Diagram Explanation

The following is a brief explanation of the classes shown in the diagram and

how they interact with each other. For a more detailed explanation of each

of the classes, their methods and their variables, please refer to the Java

documentation section in appendix 1.

All the classes were placed in a program package called ‘markov’.

MarkovAnalyzer

The Markov analyzer is the main class that takes a raw data set and creates

transition states, Markov chains, transition probability matrices and probable

phrases. It also prints out the information sheets for any of these if needed.

TransitionState

A transition state is the probability of going from present state ‘A’ at time t

to future state 'B' at time t+1. In this class, the present state, future state

and number of occurrences of transitions between particular states are

recorded.

Note that all states are represented by integers. That is, a state can be "1",

"2", "10", etc. but for example purposes states will be referred to by letters

such as "A", "B", etc.


MarkovChain

A Markov chain has a present state (of any order), one or more future states

and a recorded number of total occurrences.

It derives its future states from a list of several transition states. For

example, it will hold transition states for "A B" to "C", "A B" to "D", "A B" to

"A", etc.

The number of total occurrences is the cumulative total of all occurrences of

transition states within the Markov chain. For example, if the chain is made

up of 2 occurrences of "A B" to "C" and 5 occurrences of "A B" to "A", the

cumulative total occurrence is 7.
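The bookkeeping described for a single chain can be sketched as follows. This is an illustrative reconstruction and not the MarkovChain class documented in appendix 1: the class name ChainCounts and its methods are hypothetical, and states are shown as strings only to mirror the lettered example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ChainCounts {
    // future state -> number of times it followed this chain's present state
    private final Map<String, Integer> occurrences = new LinkedHashMap<>();
    private int total = 0;

    /** Records that futureState followed the present state the given number of times. */
    void record(String futureState, int times) {
        occurrences.merge(futureState, times, Integer::sum);
        total += times;
    }

    /** Cumulative total of all transition occurrences in this chain. */
    int totalOccurrences() {
        return total;
    }

    /** Transition probability = occurrences of this transition / total occurrences. */
    double probabilityOf(String futureState) {
        if (total == 0) {
            return 0.0;
        }
        return occurrences.getOrDefault(futureState, 0) / (double) total;
    }

    public static void main(String[] args) {
        ChainCounts ab = new ChainCounts(); // chain whose present state is "A B"
        ab.record("C", 2);                  // "A B" to "C" seen twice
        ab.record("A", 5);                  // "A B" to "A" seen five times
        System.out.println(ab.totalOccurrences()); // 2 + 5 = 7
        System.out.println(ab.probabilityOf("C")); // 2 out of 7
    }
}
```

Storing counts rather than probabilities keeps the total available for later use, and the probabilities can be derived from the counts whenever they are needed.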

TransitionProbabilitySet

A transition probability set is a collection of Markov chains of various orders.

Once collected, the chains can be analyzed and used to generate various

data.

ProbablePhrase

A probable phrase is a 'path' of states (in this case representing a musical

phrase) generated randomly based on Markovian probabilities of a set of

Markov chains. It is a path created through various states based on the

'random walk' method. As the states are represented by integers, the phrase

is just a set of integers, each representing the next step in the path.


A probable phrase extends a transition probability set. This means that it

inherits all the attributes and methods of transition probability sets and has

its own additional attributes and methods.

As such, it contains a set of member Markov chains. The random walk is

taken using the data from these chains.

‘MarkovAnalyzer’ was the main class in the program and the rest of the

classes represent objects that were created to facilitate programming. The

‘rawData’ is an array of music notes (integers) in the data set. An order for

the chains is set so that chains of that order and every lower order are

created. For example, for markovOrder 4, 4-state, 3-state, 2-state and 1-

state chains are created. Java documentation for all the classes showing

descriptions of the methods is provided in appendix 1.

The Markov analyzer was tested using dummy data and the produced chains

were tested against those constructed manually. They were found to be

accurate, indicating that the analyzer works as intended.

The following dummy data was used:

1, 2, 1, 1, 2, 1, 3, 1, 2, 9, 1, 3.

A markovOrder of 4 was set (i.e. 4-state chains and less) and the program

produced the following transition probability matrices (please note that the


following is an exact copy of the output of the program and has not been

formatted in any way):

1-state markov chains | 1 | 2 | 3 | 9 |
1 | 0.17 | 0.5 | 0.33 | 0.0 |
2 | 0.67 | 0.0 | 0.0 | 0.33 |
3 | 1.0 | 0.0 | 0.0 | 0.0 |
9 | 1.0 | 0.0 | 0.0 | 0.0 |

2-state markov chains | 1 | 2 | 3 | 9 |
1, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
1, 2 | 0.67 | 0.0 | 0.0 | 0.33 |
1, 3 | 1.0 | 0.0 | 0.0 | 0.0 |
2, 1 | 0.5 | 0.0 | 0.5 | 0.0 |
2, 9 | 1.0 | 0.0 | 0.0 | 0.0 |
3, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
9, 1 | 0.0 | 0.0 | 1.0 | 0.0 |

3-state markov chains | 1 | 2 | 3 | 9 |
1, 1, 2 | 1.0 | 0.0 | 0.0 | 0.0 |
1, 2, 1 | 0.5 | 0.0 | 0.5 | 0.0 |
1, 2, 9 | 1.0 | 0.0 | 0.0 | 0.0 |
1, 3, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
2, 1, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
2, 1, 3 | 1.0 | 0.0 | 0.0 | 0.0 |
2, 9, 1 | 0.0 | 0.0 | 1.0 | 0.0 |
3, 1, 2 | 0.0 | 0.0 | 0.0 | 1.0 |

4-state markov chains | 1 | 2 | 3 | 9 |
1, 1, 2, 1 | 0.0 | 0.0 | 1.0 | 0.0 |
1, 2, 1, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
1, 2, 1, 3 | 1.0 | 0.0 | 0.0 | 0.0 |
1, 2, 9, 1 | 0.0 | 0.0 | 1.0 | 0.0 |
1, 3, 1, 2 | 0.0 | 0.0 | 0.0 | 1.0 |
2, 1, 1, 2 | 1.0 | 0.0 | 0.0 | 0.0 |
2, 1, 3, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
3, 1, 2, 9 | 1.0 | 0.0 | 0.0 | 0.0 |

In all the chains, the Y axis (left) shows the present state and the X axis

(top) shows the future states. Thus, for example, in the 4 state matrix, the

probability of going from present state ‘1121’ at time t to future state ‘3’ at

time t+1 is 1.0, or 100%. All the chains have been normalized. This means

that the sum of the probabilities of moving to the future states is 100%,


ensuring that there is absolute certainty that some future state will be

reached.
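The construction of these matrices from the dummy data can be sketched as follows. This is a simplified illustration under stated assumptions, not the MarkovAnalyzer itself: the class ChainBuilderSketch, its methods and the use of comma-separated strings as present-state keys are all hypothetical choices made for brevity.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

final class ChainBuilderSketch {
    /** Builds: present state (e.g. "1, 2") -> future state -> number of occurrences. */
    static Map<String, Map<Integer, Integer>> count(int[] data, int order) {
        Map<String, Map<Integer, Integer>> chains = new TreeMap<>();
        // A window of 'order' notes only counts if a note follows it.
        for (int i = 0; i + order < data.length; i++) {
            String present = key(Arrays.copyOfRange(data, i, i + order));
            chains.computeIfAbsent(present, k -> new TreeMap<>())
                  .merge(data[i + order], 1, Integer::sum);
        }
        return chains;
    }

    /** Formats {1, 2} as "1, 2". */
    static String key(int[] states) {
        String s = Arrays.toString(states);
        return s.substring(1, s.length() - 1);
    }

    /** Normalized probability of moving from the present state to the future state. */
    static double probability(Map<String, Map<Integer, Integer>> chains, String present, int future) {
        Map<Integer, Integer> row = chains.get(present);
        if (row == null) {
            return 0.0;
        }
        int total = 0;
        for (int n : row.values()) {
            total += n;
        }
        return row.getOrDefault(future, 0) / (double) total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 1, 1, 2, 1, 3, 1, 2, 9, 1, 3};
        for (int order = 1; order <= 4; order++) {
            System.out.println(order + "-state chains: " + count(data, order));
        }
    }
}
```

Because each row is divided by its own total, the probabilities in a row sum to 1 by construction, which is the normalization property described above.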

In the main program, a ‘ProbablePhrase’ class has been created. This is an

array of integers that was created at random based on weighted Markov

probabilities of moving from a particular state to another. A ProbablePhrase

(named for a ‘phrase’ of music notes that has some probability of being

generated as per the Markovian probabilities) consists of a set of Markov

Chains, a phrase length and an arbitrary starting note set by the user. A

random walk is then taken (starting at the default start note) to generate

the music phrase. At each step, the walker looks for the largest order

Markov chain available based on the immediate preceding notes in the

phrase. For example, if the phrase thus far is 1, 2, 3, 1, 2, 9, the largest

chain in the phrase for the immediate preceding notes is a 4-state chain (3 1

2 9) based on the transition matrices shown above. If the phrase thus far is

1, 2, 9, 3, 1, 2, the walker will try to find a chain for 9 3 1 2. As none exists,

it will look for 3-state chains and will find one for 3 1 2.
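The search for the largest available chain can be sketched as below. The class LongestContextSketch and its methods are hypothetical and are not taken from the ProbablePhrase class; the sketch only assumes that a chain exists for a context whenever that context is followed by at least one note in the data set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class LongestContextSketch {
    /** Collects every context of length 1..maxOrder that is followed by a note in the data. */
    static Set<String> knownContexts(int[] data, int maxOrder) {
        Set<String> contexts = new HashSet<>();
        for (int order = 1; order <= maxOrder; order++) {
            for (int i = 0; i + order < data.length; i++) {
                contexts.add(Arrays.toString(Arrays.copyOfRange(data, i, i + order)));
            }
        }
        return contexts;
    }

    /** Returns the longest suffix of the phrase that has a chain, or an empty array if none does. */
    static int[] longestContext(int[] phrase, Set<String> contexts, int maxOrder) {
        for (int order = Math.min(maxOrder, phrase.length); order >= 1; order--) {
            int[] suffix = Arrays.copyOfRange(phrase, phrase.length - order, phrase.length);
            if (contexts.contains(Arrays.toString(suffix))) {
                return suffix; // first hit is the largest order, since we count down
            }
        }
        return new int[0];
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 1, 1, 2, 1, 3, 1, 2, 9, 1, 3};
        Set<String> contexts = knownContexts(data, 4);
        System.out.println(Arrays.toString(longestContext(new int[] {1, 2, 3, 1, 2, 9}, contexts, 4)));
        System.out.println(Arrays.toString(longestContext(new int[] {1, 2, 9, 3, 1, 2}, contexts, 4)));
    }
}
```

Counting down from the highest order means the walker always uses as much of the preceding phrase as the data supports before falling back to shorter contexts.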


Once a chain is found, a random number is generated between 0 and

0.9999… and is matched against percentiles created out of the future states

of a chain. For example, consider the following chain:

Present state | 1 | 2 | 3 | 9 |
1 | 0.17 | 0.5 | 0.33 | 0.0 |

The percentiles constructed would be as follows:

Future state | Lower boundary | Upper boundary | Explanation |
1 | 0 | 0.17 | The default lower boundary for the first possible future state is set to 0 and the upper boundary is set to its transition probability. |
2 | 0.17 | 0.67 | The lower boundary is set as the upper boundary of the previous percentile and the current upper boundary is set as the new lower boundary + the transition probability (i.e. 0.17 + 0.5 = 0.67). |
3 | 0.67 | 0.99 | Same as above. |
9 | - | - | No matching transition state exists. This step has been ignored. |

The random number is tested against the percentiles. If it is greater than or

equal to the lower boundary and less than the upper boundary of a particular

future state, that state is selected as the next step in the path. For example,

if the random number 0.3214 is generated, it will fit within the percentile for

future state ‘2’ and thus ‘2’ will be chosen. If the random number 0.67 is

chosen, it is equal to the lower boundary for future state ‘3’ and thus ‘3’ will

be chosen.

Thus,

chosenPercentileLowerBoundary <= randomNumber < chosenPercentileUpperBoundary
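The percentile test above can be sketched in a few lines. The class PercentileSelector and the method select are hypothetical names, and the sketch assumes the future states and their transition probabilities are supplied as two parallel arrays; it is not the implementation documented in appendix 1.

```java
final class PercentileSelector {
    /**
     * Returns the future state whose band satisfies lower <= randomNumber < upper.
     * Future states with probability 0 are skipped, as in the table for state 9.
     */
    static int select(int[] futureStates, double[] probabilities, double randomNumber) {
        double lower = 0.0;
        int lastPossible = -1;
        for (int i = 0; i < futureStates.length; i++) {
            if (probabilities[i] <= 0.0) {
                continue; // no matching transition state exists
            }
            double upper = lower + probabilities[i];
            if (randomNumber >= lower && randomNumber < upper) {
                return futureStates[i];
            }
            lower = upper; // the next band starts where this one ended
            lastPossible = i;
        }
        if (lastPossible < 0) {
            throw new IllegalArgumentException("chain has no possible future state");
        }
        // Only reached if rounding leaves a tiny gap below 1.0.
        return futureStates[lastPossible];
    }

    public static void main(String[] args) {
        int[] states = {1, 2, 3, 9};
        double[] probabilities = {0.17, 0.5, 0.33, 0.0};
        System.out.println(select(states, probabilities, 0.3214)); // falls in the band for state 2
    }
}
```

Using a half-open band (inclusive lower boundary, exclusive upper boundary) ensures every random number belongs to exactly one future state, so adjacent bands never both claim a boundary value.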

4.1.2 The Markov Chain Program – With Music

Figure 2 on the next page shows the classes (representing objects) that

were created in phase two of the project.

The program was extended to incorporate MIDI functionality and output

musical notes. The structure and methods of the TransitionState,

MarkovChain and TransitionProbabilitySet classes remained the same.


Fig 2. UML diagram of the Markovian Analyzer program in phase 2


The following changes were made:

JFugue

The JFugue API developed by Koelle (2007) was used to give the program

MIDI capability. This API provided for simple access to Java’s MIDI

functionality without the need to set several parameters and settings. For

example, playing a C major scale could be done by creating a ‘music string’

as such:

“C D E F G A B”

and passing this ‘music string’ to JFugue’s player to play the notes in MIDI

(Koelle 2007).

MidiAnalyzer

The MidiAnalyzer class was created to receive a MIDI file specified by the

user and perform various methods to be used by the MarkovAuto class such

as getting the notes played by the musician in a format usable by the

MarkovAnalyzer.

MarkovAnalyzer

This was no longer the main class. Hence the main() method was removed.

Instead, the program would now be run using the MarkovAuto class as the

main class.


MarkovAuto

This class was added and made the main class. At runtime, the user

specifies a MIDI file containing the dataset (a solo improvised by a

musician), a markovOrder (to be used by the MarkovAnalyzer) and a phrase

length for the phrase to be generated.

ProbablePhrase

A phraseToJFugueString() method was added that converted a phrase to a

‘music string’ compatible with the JFugue MIDI player. This made it easier to

play the phrase generated without having to work directly with the

somewhat tedious Java MIDI functionality.

A package called NSD containing two classes, called NSDUtils and

NSDUtilsMIDI, that had been created previously for general use over several

projects was imported into this project and used by most of the classes in

the program. This package included general useful methods such as

rounding numbers to n decimal places, matching arrays of integers,

automatically printing the contents of vectors, etc. Descriptions of the

classes in this package are included in appendix 1.

For a more detailed explanation of each of the classes, their methods and

their variables, please refer to the Java documentation section in appendix 1.

For a discussion of the results of phase two of the program, please refer

to chapter 5.

4.1.2.1 Limitations

The Markov analyzer program has a number of limitations:

First of all, the program is completely monophonic. It does not recognize two

notes being played simultaneously. In the event that the musician plays two

notes simultaneously, the program will recognize these as two separate

notes played successively. The output of the program is also monophonic.

The program is intended primarily for guitarists. However, the program does

not provide for variables in guitar playing such as pitch bending, vibrato,

string bending, sliding, etc. As is discussed later in this section, the program

can easily be extended to provide for these, but they would require a large

data set for various reasons.

The dataset is read independently of the backing track. The MIDI file being

read should only contain the guitar solo and no other backing tracks. That is,

only the solo played by the musician is analyzed. Any chords or rhythms that

may be played by backing musicians in real life situations are not recorded

or analyzed in any way.


Consequently, the probable phrase is not influenced at any point by the

underlying chord structure or rhythm structure of the song. While a musician

in a real-life scenario might respond to anticipated chord changes or

rhythms in the song structure, the probable phrase does not. The analyzer

does not read the backing tracks in the MIDI file.

Thus, it is recommended that the musician play in a single key throughout

the passage so as to ensure that the improvised passage is not subjected to

random shifts into other keys as a result of overlap of notes between these

keys.

As an example of this, the C major scale is comprised of the following notes:

“C D E F G A B C”

and the D major scale is comprised of the following notes:

“D E F# G A B C# D”

There are several overlapping notes between the two scales. However, if the

D major scale is played over a musical sequence in the key of C major, the

dissonance caused by the F# and C# notes will usually sound displeasing.


Consider the following situation in which a musician improvises over a piece

of music that switches between these keys.

Notes: C E D F# A G G E C D F# D

Keys: C D C D

Of the several Markov chains that can be generated from the sequence

above, consider the following:

Present note | D | E | F# |
C | 0.5 | 0.5 | 0.0 |
D | 0.0 | 0.0 | 1.0 |

These chains would suggest that a C note will be followed by a D note 5 out

of 10 times and that a D note will always be followed by an F# note.

Thus it is stochastically possible that the following phrase might be

generated:

Notes: C D F# A G (etc)

Keys: C D

The F# in the sequence will sound wrong when played in the key

of C. However, according to the Markov analyzer program, this progression

is stochastically correct.


As a side note, it is possible that a musician’s style of phrasing may

incorporate the use of out-of-scale notes in such a way that they will not

sound displeasing. This will be dealt with later in the results section (chapter

5) to see if the Markov analyzer program can generate such phrases.

Another limitation of the Markov analyzer program is that the

‘phraseToJFugueString()’ method in the ‘ProbablePhrase’ class generates all

eighth notes (i.e. 8 notes in every bar consisting of 4 counts, or 2 notes per

count). At the time of analysis of the musician’s solo, the analyzer does not

record the length of each note, but rather just the note itself. Therefore,

there is no context regarding the length of notes. This was left out, as it

would require a large data set to get a full understanding of how a musician

plays different notes at different lengths followed by other notes of other

lengths. For the sake of simplicity, the phrase generated is played as eighth

notes. This will be discussed further in chapter 6.


4.2 Autocatalytic Sets

The second segment of the program design phase dealt with the simulation

of Autocatalytic sets. It was created using the Java programming language.

Again, the creation of a program was not the primary focus of this project. It

was done to facilitate calculations and create an output showing the result of

the algorithm. The algorithm is the primary focus of the project. The

program acts as a scientific test of the success of the algorithm.

Autocatalytic sets are found in many different forms. The process of

autocatalysis occurs among catalytic peptides and catalytic RNA (Kauffman

1993). While it may differ in its form and conditions depending on the

polymer, the general idea behind autocatalysis is the same as the one

proposed in section 2.3.1.

To recapitulate, the idea worked in the following way:

A ‘pool’ of catalytic polymers exists. Initially, all the polymers are joined

together in a ‘sequence’. If each RNA cell is represented by an integer, the

string may look like this:

1 4 7 10 4 2 3 6 8 11 2 3 4 7 5 2 3 4 5 6 2


This RNA sequence exists in the fragment pool in the initial state. The

autocatalytic set also contains one or more 'digestive enzymes' (which are

RNA sequences too) that are reactive and can split a sequence into smaller

sub-sequences. They search RNA sequences for sections that match their

own content and split them around that point.

For example:

The following enzymes might exist in the set: (2 3) and (4 7). Assume the

enzyme (2 3) is active.

The enzyme searches the initial complete sequence for a subsequence that

matches its own sequence. Hence, the active enzyme (2 3) could split the

original sequence in any of the following places:

1 4 7 10 4 2 3 6 8 11 2 3 4 7 5 2 3 4 5 6 2

It randomly finds the second potential split point and splits the original

sequence into two sub-sequences. The fragment pool now looks like this:

1 4 7 10 4 2 3 6 8 11 2 - 3 4 7 5 2 3 4 5 6 2

Next, the (4 7) enzyme might become active. It can potentially split one of

the sub-sequences at one of the following points:

1 4 7 10 4 2 3 6 8 11 2 - 3 4 7 5 2 3 4 5 6 2


It randomly finds the match in the first sub-sequence and splits it. The

fragment pool now contains three sub-sequences:

1 4 - 7 10 4 2 3 6 8 11 2 - 3 4 7 5 2 3 4 5 6 2

The (2 3) enzyme might become active again, randomly finding a match in

the second sub-sequence and splitting it into two smaller sub-sequences

around the match, resulting in the fragment pool looking as such:

1 4 - 7 10 4 2 - 3 6 8 11 2 - 3 4 7 5 2 3 4 5 6 2

This process might continue indefinitely.
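The splitting step walked through above can be sketched as follows. This is a simplified illustration with hypothetical names (EnzymeSplitSketch, matchPoints, splitAt). It assumes, as in the worked example, that an enzyme is a pair of integers and that the cut falls between the two matched elements; the choice of which match to cut at is left to the caller rather than made at random.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class EnzymeSplitSketch {
    /** Indices i at which sequence[i], sequence[i + 1] equal the two-element enzyme. */
    static List<Integer> matchPoints(int[] sequence, int[] enzyme) {
        List<Integer> points = new ArrayList<>();
        for (int i = 0; i + 1 < sequence.length; i++) {
            if (sequence[i] == enzyme[0] && sequence[i + 1] == enzyme[1]) {
                points.add(i);
            }
        }
        return points;
    }

    /** Cuts the sequence between matchIndex and matchIndex + 1, giving two sub-sequences. */
    static int[][] splitAt(int[] sequence, int matchIndex) {
        return new int[][] {
            Arrays.copyOfRange(sequence, 0, matchIndex + 1),
            Arrays.copyOfRange(sequence, matchIndex + 1, sequence.length)
        };
    }

    public static void main(String[] args) {
        int[] initial = {1, 4, 7, 10, 4, 2, 3, 6, 8, 11, 2, 3, 4, 7, 5, 2, 3, 4, 5, 6, 2};
        int[] enzyme = {2, 3};
        List<Integer> points = matchPoints(initial, enzyme);
        System.out.println("potential split points: " + points);
        int[][] parts = splitAt(initial, points.get(1)); // second potential split point
        System.out.println(Arrays.toString(parts[0]) + " - " + Arrays.toString(parts[1]));
    }
}
```

Cutting between the two matched elements leaves the first element at the end of one fragment and the second at the start of the next, which is what later allows the same enzyme to rejoin fragments.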

The digestive enzymes can also facilitate ligase reactions that

result in the joining of two sub-sequences. This process works in reverse to

the one described above.

For example, consider the fragment pool status after the autocatalyzation

process is complete.

1 4 - 7 10 4 2 - 3 6 8 11 2 - 3 4 7 5 2 3 4 5 6 2


If the (2, 3) enzyme became active again, it could potentially rejoin any

two of the sub-sequences at the following points:

2 3

7 10 4 2 and 3 6 8 11 2

or

2 3

7 10 4 2 and 3 4 7 5 2 3 4 5 6 2

or

2 3

3 6 8 11 2 and 3 4 7 5 2 3 4 5 6 2

or

2 3

3 4 7 5 2 3 4 5 6 2 and 3 6 8 11 2

As the example shows, any two RNA fragments that can be matched

around the digestive enzyme might be chosen; each at random.


Assuming it joins the second sub-sequence in the fragment pool with the

fourth one, the pool will look like this:

1 4 - 7 10 4 2 3 4 7 5 2 3 4 5 6 2 - 3 6 8 11 2

Using the (4 7) and (2 3) enzymes, a series of further ligase reactions

could potentially result in the following sequence:

1 4 7 10 4 2 3 4 7 5 2 3 4 5 6 2 3 6 8 11 2

Compare this to the initial sequence:

1 4 7 10 4 2 3 6 8 11 2 3 4 7 5 2 3 4 5 6 2

We can see that the resulting sequence is different to the original. It has

undergone recombination while still preserving the digestive enzymes.
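The joining reactions in this example can be reproduced with a short sketch. The names used (EnzymeJoinSketch, canJoin, join, ligate) are hypothetical, and the sketch again assumes two-element enzymes: two fragments may be joined when the first ends with the enzyme's first element and the second begins with its second element. The fragments to join are chosen explicitly here instead of at random so that the run follows the worked example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class EnzymeJoinSketch {
    /** True if the enzyme would be formed across the join of left and right. */
    static boolean canJoin(int[] left, int[] right, int[] enzyme) {
        return left.length > 0 && right.length > 0
                && left[left.length - 1] == enzyme[0]
                && right[0] == enzyme[1];
    }

    /** Concatenates two fragments. */
    static int[] join(int[] left, int[] right) {
        int[] joined = new int[left.length + right.length];
        System.arraycopy(left, 0, joined, 0, left.length);
        System.arraycopy(right, 0, joined, left.length, right.length);
        return joined;
    }

    /** Returns a new pool with the two fragments joined, or the same pool if the enzyme does not fit. */
    static List<int[]> ligate(List<int[]> pool, int leftIndex, int rightIndex, int[] enzyme) {
        int[] left = pool.get(leftIndex);
        int[] right = pool.get(rightIndex);
        if (leftIndex == rightIndex || !canJoin(left, right, enzyme)) {
            return pool;
        }
        List<int[]> next = new ArrayList<>();
        for (int i = 0; i < pool.size(); i++) {
            if (i == leftIndex) {
                next.add(join(left, right)); // joined fragment takes the left fragment's place
            } else if (i != rightIndex) {
                next.add(pool.get(i));
            }
        }
        return next;
    }

    public static void main(String[] args) {
        List<int[]> pool = Arrays.asList(
                new int[] {1, 4},
                new int[] {7, 10, 4, 2},
                new int[] {3, 6, 8, 11, 2},
                new int[] {3, 4, 7, 5, 2, 3, 4, 5, 6, 2});
        pool = ligate(pool, 1, 3, new int[] {2, 3}); // second fragment joined with the fourth
        pool = ligate(pool, 0, 1, new int[] {4, 7});
        pool = ligate(pool, 0, 1, new int[] {2, 3});
        System.out.println(Arrays.toString(pool.get(0)));
    }
}
```

With this particular order of reactions the pool collapses to a single sequence that contains the same notes as the original in a different order, with each enzyme pair intact at the joins.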


4.2.1 Theory behind an Autocatalytic Set Program

RNA sequences in the autocatalytic set may be compared to a sequence of

music notes. As such, a ‘digestive enzyme’ may be chosen to break up this

sequence of notes and subsequently recombine them to arrive at a new

sequence.

It was suggested in section 3.2 that the ‘digestive enzymes’ chosen to

perform the reactions should be the guitarist’s signature phrases. This is

because digestive enzymes are preserved in their entirety in catalytic and

ligase reactions. This preservation of signatures might give the resulting

corpus more of a feel of the original style while still having an original

recombination of music notes.

This raises four questions:

1. How should the signatures be chosen?

2. How many signatures should be chosen?

3. How long should the signatures be?

4. How many reactions should take place?

To answer the first question, some of Cope’s (1992) work was referenced for

ideas. He suggested that signatures could be mathematical relationships

between notes rather than sequences of particular notes. Consider the


chromatic (or half-step) scale. There are 12 chromatic notes in standard

western music:

C C# D D# E F F# G G# A A# B

In MIDI data, a number represents every note. These numbers rise in

ascending order corresponding with these notes. Thus, the following

numbers correspond to the following notes:

Notes: C C# D D# E F F# G G# A A# B

No: 0 1 2 3 4 5 6 7 8 9 10 11

The numbers increase with the octaves (e.g. C1 is 12, C#1 is 13, etc).

According to Cope, mathematical patterns between notes in a sequence

could be identified. For example, consider the following sequence:

C D G F G D B C#1

The mathematical relationship could be defined as the number of chromatic

notes note ‘N’ is away from note ‘N-1’. Therefore, for the subsequence C, D,

G, the note relationship would be 0, +2, +5. For D, G, F it would be 0, +5,

-2, and so on.


Upon gathering information about the note distance relationships for every

set of notes in the sequence, a signature dictionary can be compiled. The

signatures that occur most often would be considered the best ones.
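The interval idea above can be sketched in a few lines of Java. This is an illustrative sketch only, not the program described in this dissertation: the class name `IntervalSignatures` and its methods are hypothetical, and notes are assumed to be given as the integer note numbers listed earlier (C = 0, D = 2, G = 7, and so on).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class IntervalSignatures {

    /** Interval signature of a note window: 0 for the first note, then each step in semitones. */
    static int[] signature(int[] notes) {
        int[] sig = new int[notes.length];
        for (int i = 1; i < notes.length; i++) {
            sig[i] = notes[i] - notes[i - 1];
        }
        return sig;
    }

    /** Counts how often each interval signature of the given window length occurs. */
    static Map<String, Integer> dictionary(int[] notes, int length) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int start = 0; start + length <= notes.length; start++) {
            int[] window = Arrays.copyOfRange(notes, start, start + length);
            counts.merge(Arrays.toString(signature(window)), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] melody = {0, 2, 7, 5, 7, 2, 11, 13}; // C D G F G D B C#1
        System.out.println(dictionary(melody, 3));
    }
}
```

Under these assumptions, C, D, G gives the signature 0, 2, 5 and D, G, F gives 0, 5, -2, matching the example in the text. The signatures with the highest counts in the dictionary would be the candidates considered "best".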

Applying this conceptual idea of a signature (rather than a definite sequence

of notes) to an autocatalytic set, however, may not be advisable. Kauffman

suggests that “in order for reactions to occur effectively, the reactants must

be confined to a sufficiently small volume” (1993, p. 298). This goes, in

part, towards answering the question of how many signatures should be

chosen. The number of digestive enzymes is best kept low. If the digestive

enzyme was a mathematical relationship, C D D#, A B C1 and E F# G all fit

the same signature (0 +2 +1). If all of them exist in the corpus, they will all

be used eventually, given enough time, breaking the original

sequence into tiny fragments.

This also has implications for the questions about how large a signature

should be and how many reactions should be allowed to take place. This

leads to a difficult situation when choosing enzymes and will be discussed

further in chapter 6.

For the sake of simplicity, it was decided that a particular sequence of notes

would be chosen to act as an enzyme. It is plausible to suggest that the

most commonly occurring sequence of notes could be considered a signature

phrase of the player. Hence it was decided that a few (two or three) of the

most commonly occurring phrases would be chosen as enzymes.

The Markov analyzer already provided a means for identifying this. Each

Markov chain has a number of total occurrences that could be used to

identify the most commonly occurring phrase.

It was decided that the phrase of the highest order chain and that of the

next highest order chain (as long as it is greater than 1) would be used as

enzymes.

4.2.2 Simulating Autocatalysis

It was decided that the user would specify the number of enzymes to be

chosen. Note that the user also specifies the order of the Markov chains. The

program then finds as many Markov chains of the specified order as are

available up to the specified enzyme number. Their present states are

extracted and used as enzymes.
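The selection rule can be illustrated with a short Java sketch. It is a simplified stand-in for the Markov analyzer, not the actual implementation: `EnzymeSelector` and its methods are hypothetical names. It assumes that a chain's total occurrences equal the number of times its present state appears with at least one following note, and that only phrases sharing the highest count are eligible.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EnzymeSelector {

    /** Counts every phrase of the given order that is followed by at least one more note. */
    static Map<List<Integer>, Integer> countPhrases(int[] corpus, int order) {
        Map<List<Integer>, Integer> counts = new LinkedHashMap<>();
        for (int start = 0; start + order < corpus.length; start++) {
            List<Integer> phrase = new ArrayList<>();
            for (int i = start; i < start + order; i++) {
                phrase.add(corpus[i]);
            }
            counts.merge(phrase, 1, Integer::sum);
        }
        return counts;
    }

    /** Returns up to maxEnzymes phrases that share the highest occurrence count. */
    static List<List<Integer>> selectEnzymes(int[] corpus, int order, int maxEnzymes) {
        Map<List<Integer>, Integer> counts = countPhrases(corpus, order);
        int best = 0;
        for (int count : counts.values()) {
            best = Math.max(best, count);
        }
        List<List<Integer>> enzymes = new ArrayList<>();
        for (Map.Entry<List<Integer>, Integer> entry : counts.entrySet()) {
            if (entry.getValue() == best && enzymes.size() < maxEnzymes) {
                enzymes.add(entry.getKey());
            }
        }
        return enzymes;
    }
}
```

If fewer phrases than requested share the highest count, the sketch simply returns the ones it found, mirroring the "only as many as are available" behaviour described above.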

Figure 3 on the next page shows the classes that were created to represent

an autocatalytic set.

Fig 3. UML diagram of the autocatalytic set program in phase 1

MarkovAuto

This class is the same as the one developed for phase 2 of the Markov

analyzer program.

DigestiveEnzyme

This is a 'reactive' array of integers in a particular sequence that can split

or rejoin greater RNA fragments containing a subsequence that matches

this enzyme’s structure within them.

The digestive enzyme has a head (its first half) and a tail (its second half).

In the case of enzymes with odd lengths, the head will be larger than the

tail. During the process of autocatalysis, when a sub-sequence is split into

two (a head and a tail) the digestive enzyme portion of it will be split in

half around its head and tail.

For example, consider a digestive enzyme of (2 1 3) in the following

sequence:

1 4 2 1 3 6 5 2

The sequence will be split around the matching enzyme sub-sequence as

follows:

1 4 2 1 - 3 6 5 2
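A minimal Java sketch of this split is given below. The class `EnzymeSplit` and its method names are hypothetical and are not taken from the program's source. The sketch assumes that the first matching position is used and that an enzyme has at least two notes.

```java
import java.util.Arrays;

final class EnzymeSplit {

    /** Index of the first occurrence of enzyme inside sequence, or -1 if absent. */
    static int indexOf(int[] sequence, int[] enzyme) {
        for (int start = 0; start + enzyme.length <= sequence.length; start++) {
            int[] window = Arrays.copyOfRange(sequence, start, start + enzyme.length);
            if (Arrays.equals(window, enzyme)) {
                return start;
            }
        }
        return -1;
    }

    /** Splits at the enzyme's head/tail boundary; returns {left, right}, or null if no match. */
    static int[][] split(int[] sequence, int[] enzyme) {
        int start = indexOf(sequence, enzyme);
        if (start < 0) {
            return null;
        }
        int headLength = (enzyme.length + 1) / 2; // the head takes the extra note for odd lengths
        int cut = start + headLength;
        return new int[][] {
            Arrays.copyOfRange(sequence, 0, cut),
            Arrays.copyOfRange(sequence, cut, sequence.length)
        };
    }
}
```

For the enzyme (2 1 3), the head is (2 1) and the tail is (3), so the left fragment ends with the head and the right fragment begins with the tail, as in the example above.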

AutocatalyticSet

This represents a fragment pool containing an RNA sequence (just a

sequence of integers) and has a list of digestive enzymes.

The process of autocatalyzation was programmed as per the description

above. The user specifies a number of reactions and enzymes are chosen at

random to find one of the many matching points within the RNA sequence

(and the sub-sequences that are created as a result of catalyzation). The

enzymes continue to break up the sequence and the number of reactions is

counted. Reactions continue to take place either until no more can occur

(the digestive enzymes have split up the sub-sequences in all possible

places) or until the desired reaction count is reached.
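The reaction loop described above might be sketched as follows. This is a simplified illustration rather than the dissertation's code: `CatalysisSketch` is a hypothetical name, and instead of first drawing an enzyme and then a matching point, the sketch draws uniformly from all points where any enzyme can still cut.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class CatalysisSketch {

    /** All {fragmentIndex, cutPosition} pairs where some enzyme can still split a fragment. */
    static List<int[]> cutPoints(List<List<Integer>> pool, List<List<Integer>> enzymes) {
        List<int[]> points = new ArrayList<>();
        for (int f = 0; f < pool.size(); f++) {
            List<Integer> fragment = pool.get(f);
            for (List<Integer> enzyme : enzymes) {
                int head = (enzyme.size() + 1) / 2; // head is the larger half for odd lengths
                for (int s = 0; s + enzyme.size() <= fragment.size(); s++) {
                    if (fragment.subList(s, s + enzyme.size()).equals(enzyme)) {
                        points.add(new int[] {f, s + head});
                    }
                }
            }
        }
        return points;
    }

    /** Splits fragments until maxReactions is reached or no enzyme matches any fragment. */
    static List<List<Integer>> catalyze(List<Integer> rna, List<List<Integer>> enzymes,
                                        int maxReactions, Random random) {
        List<List<Integer>> pool = new ArrayList<>();
        pool.add(new ArrayList<>(rna));
        for (int reactions = 0; reactions < maxReactions; reactions++) {
            List<int[]> points = cutPoints(pool, enzymes);
            if (points.isEmpty()) {
                break; // every possible split has already happened
            }
            int[] point = points.get(random.nextInt(points.size()));
            List<Integer> fragment = pool.remove(point[0]);
            pool.add(point[0], new ArrayList<>(fragment.subList(0, point[1])));
            pool.add(point[0] + 1, new ArrayList<>(fragment.subList(point[1], fragment.size())));
        }
        return pool;
    }
}
```

The sketch keeps the fragments in their original left-to-right order, so concatenating the pool always reproduces the input sequence. Only the later ligation step changes the order.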

The process of ligation (or reverse catalyzation) had to depart from the

polymer concept of ligation. The intention of this program is to arrive at one

final, whole sequence with the same length as the original one. In real life,

the processes of autocatalyzation and ligation occur randomly until some

goal is reached. This goal may not necessarily be the creation of a single

sequence. However, it is entirely possible that, if only ligation took place

without any autocatalyzation, one or more series of ligation reactions could

eventually result in the creation of one large sequence by chance.

Thus, the reverseCatalyze method actually seeks to find all possible series of

reactions that result in a whole final RNA sequence. Depending on which

enzymes react and which sub-sequences they join together, the results in

these cases are most likely to be different to each other.

In the end, one of the possible paths is chosen at random to act as the

series of ligation reactions that results in one large sequence.

This sequence will then be used as the musical sequence to be played.
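The exhaustive search described above can be sketched in Java as a backtracking enumeration. The names `ReverseCatalysisSketch`, `canLigate`, `allWholeSequences` and `chooseAtRandom` are hypothetical, not those of the reverseCatalyze method. The sketch assumes that two fragments may be joined only when the junction recreates an enzyme: the left side must end with the enzyme's head and the right side must begin with its tail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class ReverseCatalysisSketch {

    /** True if appending right to left recreates some enzyme across the junction. */
    static boolean canLigate(List<Integer> left, List<Integer> right, List<List<Integer>> enzymes) {
        for (List<Integer> enzyme : enzymes) {
            int head = (enzyme.size() + 1) / 2;
            int tail = enzyme.size() - head;
            if (left.size() >= head && right.size() >= tail
                    && left.subList(left.size() - head, left.size()).equals(enzyme.subList(0, head))
                    && right.subList(0, tail).equals(enzyme.subList(head, enzyme.size()))) {
                return true;
            }
        }
        return false;
    }

    /** Every whole sequence obtainable by joining all fragments in some valid order. */
    static List<List<Integer>> allWholeSequences(List<List<Integer>> pool, List<List<Integer>> enzymes) {
        List<List<Integer>> results = new ArrayList<>();
        extend(new ArrayList<>(), pool, new boolean[pool.size()], 0, enzymes, results);
        return results;
    }

    private static void extend(List<Integer> built, List<List<Integer>> pool, boolean[] used,
                               int usedCount, List<List<Integer>> enzymes, List<List<Integer>> results) {
        if (usedCount == pool.size()) {
            results.add(new ArrayList<>(built)); // no fragments remain
            return;
        }
        for (int i = 0; i < pool.size(); i++) {
            List<Integer> next = pool.get(i);
            if (used[i] || (usedCount > 0 && !canLigate(built, next, enzymes))) {
                continue;
            }
            used[i] = true;
            built.addAll(next);
            extend(built, pool, used, usedCount + 1, enzymes, results);
            built.subList(built.size() - next.size(), built.size()).clear(); // backtrack
            used[i] = false;
        }
    }

    /** Picks one of the whole sequences at random, or null if none exists. */
    static List<Integer> chooseAtRandom(List<List<Integer>> options, Random random) {
        return options.isEmpty() ? null : options.get(random.nextInt(options.size()));
    }
}
```

In the worst case, where any fragment can follow any other, a pool of k fragments admits k! orderings (already 362,880 for nine fragments), so an enumeration of this kind grows very quickly with the number of fragments.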

Again, these classes, their variables and their methods are explained in

greater depth and detail in the Java documentation section in appendix 1.

The autocatalytic set program was tested with several sample data sets and

was found to provide algorithmically correct results.

4.3 Combining the Two

In the final stage, some minor changes were made to the programs and they

were then combined.

In the ‘markov’ package, the TransitionState and MarkovChain classes were

changed slightly. A new class called RecurringPhrase was created to

represent a ‘phrase’ of integers that may be found in more than one place

within a corpus. The number of occurrences of each recurring phrase is

stored.

A transition state is effectively one such recurring phrase that goes to a

particular future state. The number of times it does this in the corpus (or the

total occurrences of this transition state) is stored for each one.

Similarly, a Markov chain is a recurring phrase that randomly goes to one of

many future states. It also stores the total number of occurrences of this

recurring phrase (regardless of which future state it is going to each time).

Hence, to facilitate further programming, the two were made subclasses of

the ‘RecurringPhrase’ class. The ‘present state’ of a transition state or a

Markov chain is the ‘phrase’ that recurs.
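The relationship between the three classes could look roughly like the following Java sketch. Only the class names come from the text. The fields, constructors and methods (`recordOccurrence`, `observe`, `probabilityOf`) are assumptions made for illustration and may differ from the documented classes in appendix 1.

```java
import java.util.ArrayList;
import java.util.List;

/** A phrase of note numbers that may occur in several places in a corpus. */
class RecurringPhrase {
    private final int[] phrase;
    private int occurrences;

    RecurringPhrase(int[] phrase) {
        this.phrase = phrase.clone();
    }

    int[] getPhrase() { return phrase.clone(); }

    int getOccurrences() { return occurrences; }

    void recordOccurrence() { occurrences++; }
}

/** A recurring phrase (the present state) that goes to one particular future state. */
class TransitionState extends RecurringPhrase {
    private final int futureState;

    TransitionState(int[] presentState, int futureState) {
        super(presentState);
        this.futureState = futureState;
    }

    int getFutureState() { return futureState; }
}

/** A recurring phrase together with every future state observed after it. */
class MarkovChain extends RecurringPhrase {
    private final List<TransitionState> transitions = new ArrayList<>();

    MarkovChain(int[] presentState) {
        super(presentState);
    }

    /** Records one occurrence of the present state followed by futureState. */
    void observe(int futureState) {
        recordOccurrence();
        for (TransitionState t : transitions) {
            if (t.getFutureState() == futureState) {
                t.recordOccurrence();
                return;
            }
        }
        TransitionState created = new TransitionState(getPhrase(), futureState);
        created.recordOccurrence();
        transitions.add(created);
    }

    /** Relative frequency with which futureState followed the present state. */
    double probabilityOf(int futureState) {
        for (TransitionState t : transitions) {
            if (t.getFutureState() == futureState) {
                return (double) t.getOccurrences() / getOccurrences();
            }
        }
        return 0.0;
    }
}
```

Keeping the occurrence count in the shared superclass means that both the chain's total and each transition's count are available through the same accessor, which is what makes the most frequent phrases easy to pick out as enzymes.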

Fig 3. UML diagram of the complete program

For a clearer depiction of how the classes are arranged in their respective

packages, please refer to Figure 4 in Appendix 1.

The functioning of the main method in the MarkovAuto class was also

defined. It completes the combined analyses and uses the Markovian

processes along with autocatalysis and reverse catalysis to arrive at a newly

created corpus of music.

At runtime, the user specifies the name of the MIDI file containing the guitar

solo, the highest Markov order to be used, the number of enzymes to be

used and the length of the output phrase.

The main method works in the following way:

1. Using a MIDI analyzer, it extracts the notes of the solo from the MIDI file

and returns them in integer form (i.e. the integer values that MIDI

assigns to music notes).

2. A Markov analyzer is created to process this data. This process has

already been described in section 4.1. A probable phrase is generated,

which is a new guitar solo based on the probabilistic characteristics of the

original corpus.

3. The sequence of integers that makes up the probable phrase is then sent

to an autocatalytic set. To choose digestive enzymes, the program then

extracts the most frequently occurring Markov chains using the Markov

analyzer. Only chains of the order specified by the user at runtime are

chosen. If there are not enough chains of that order as per the user’s

specification, only as many as are available will be made into enzymes.

4. The set undergoes autocatalysis and reverse catalysis to result in a new

corpus in which the signature styles of the player (used as digestive

enzymes) are preserved but the rest of the corpus has been recombined

to some extent.

5. Finally, it saves this new corpus to a MIDI file.
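Step 1 of the list above, extracting note numbers from a MIDI file, could be approached with the standard `javax.sound.midi` package, as in the sketch below. It is not known whether the program's MIDI analyzer works this way. The class `MidiNoteReader` and the helper `sequenceOf`, which builds a small in-memory sequence for demonstration, are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.MidiEvent;
import javax.sound.midi.MidiMessage;
import javax.sound.midi.MidiSystem;
import javax.sound.midi.Sequence;
import javax.sound.midi.ShortMessage;
import javax.sound.midi.Track;

final class MidiNoteReader {

    /** Collects the note numbers of all sounding note-on events, track by track. */
    static List<Integer> noteNumbers(Sequence sequence) {
        List<Integer> notes = new ArrayList<>();
        for (Track track : sequence.getTracks()) {
            for (int i = 0; i < track.size(); i++) {
                MidiMessage message = track.get(i).getMessage();
                if (message instanceof ShortMessage) {
                    ShortMessage sm = (ShortMessage) message;
                    // A note-on with velocity 0 conventionally means note-off, so it is skipped.
                    if (sm.getCommand() == ShortMessage.NOTE_ON && sm.getData2() > 0) {
                        notes.add(sm.getData1());
                    }
                }
            }
        }
        return notes;
    }

    /** Reads a MIDI file from disk and returns its note numbers. */
    static List<Integer> noteNumbers(File midiFile) throws InvalidMidiDataException, IOException {
        return noteNumbers(MidiSystem.getSequence(midiFile));
    }

    /** Demonstration helper: a one-track sequence playing the given notes one after another. */
    static Sequence sequenceOf(int... notes) {
        try {
            Sequence sequence = new Sequence(Sequence.PPQ, 480);
            Track track = sequence.createTrack();
            long tick = 0;
            for (int note : notes) {
                track.add(new MidiEvent(new ShortMessage(ShortMessage.NOTE_ON, 0, note, 90), tick));
                track.add(new MidiEvent(new ShortMessage(ShortMessage.NOTE_OFF, 0, note, 0), tick + 240));
                tick += 240;
            }
            return sequence;
        } catch (InvalidMidiDataException e) {
            throw new IllegalArgumentException("Invalid MIDI note data", e);
        }
    }
}
```

The sketch treats the solo as monophonic and ignores note lengths, which is consistent with the analyzer recording only the note itself.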

5. Results

This chapter deals with the output of the program at various stages in its

development. In each section, a development phase of the program (as

discussed throughout chapter 4) is analyzed. Its limitations are specified and

the output is analyzed and discussed.

Note: The evaluations and opinions expressed in this section are subjective

and reflect the perception of the author. To gauge the success of the program more

accurately, much larger tests would have to be conducted using a large

sample of musicians, ideally guitarists.

5.1 Markov Chains

This section deals with the output of the Markov analyzer program at various

stages in its development.

5.1.1 The Markov Chain Program – With MIDI

The basic Markov analyzer discussed in section 4.1.1 was extended to have

MIDI functionality. It was tested against two datasets containing improvised

solos played by the author.

In general, it was found that when the Markov chain order was set above 4,

there was a large amount of replication of the original dataset and little

creativity in the generated phrase. Large sections of the original data set

were simply repeated. Markov orders of 2 or 3 yielded better results and

maintained some degree of creativity. A Markov order of 1 resulted in a

phrase that bore little semblance to the original dataset.

The following is a transcription of a solo saved as dataset3.mid (all notes are

eighth notes):

D3, E3, F3, G3, G#3, A3, B3, C4, D4, E4, F4, G4, G#4, A4, C5, D5, F5, E5, D5, C5, E5, D5, C5, D5, C5, A4, G#4, A4, G#4, G4, F4, D4, F4, D4, C4, A3, C4, A3, G#3, G3, F3, E3, D3, C#3, D3, G#3, G3, F3, D3, A3, F3, G3, A3, B3, C4, D4, E4, D4, E4, F4, G4, A4, B4, G4, C5, G4, D5, F4, A4, C5, A4, D5, F5, D5, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4, D4, F4, G#4, G4, F4, E4, D4, G4, A4, B4, C5, C5, D5, D5, F5, G5, F5, D5

The following phrase was generated at Markov order 3:

D3, C#3, D3, G#3, G3, F3, D3, A3, F3, G3, A3, B3, C4, D4, E4, F4, G4, G#4, A4, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4, D4, F4, G#4, G4, F4, D4, F4, G#4, G4

It has some mix of the original characteristics but does not result in

repetitions of large sections of it. It sounds like a fair representation of the

player’s style without a lot of repetition.

The following phrase was generated at Markov order 4:

D3, G#3, G3, F3, D3, A3, F3, G3, A3, B3, C4, D4, E4, F4, G4, A4, B4, G4, C5, G4, D5, F4, A4, C5, A4, D5, F5, D5, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4

Here there is substantial repetition of the original piece with little original construction.

A much longer solo incorporating the ‘Dorian flavour’ a lot more was played in dataset1.mid:

D3, D3, A3, D4, C3, A2, D3, A3, D4, C3, D3, D3, A3, D4, F3, G3, D3, A3, D4, G#3, A3, D3, A3, D4, C4, G3, D3, A3, D4, G#3, A3, D3, A3, D4, D4, A3, D3, A3, D4, D3, F3, D3, A3, D4, G3, A3, D3, A3, D4, B3, C4, D3, A3, D4, D4, E4, D3, A3, D4, D4, E4, D3, A3, D4, F4, G4, D3, A3, D4, A4, B4, D3, A3, D4, G4, C5, D3, A3, D4, G4, D5, D3, A3, D4, F4, F#4, D3, A3, D4, G4, A4, D3, A3, D4, G#4, G4, D3, A3, D4, F#4, F4, D3, A3, D4, C5, F4, D3, A3, D4, B4, A#4, D3, A3, D4, A4, G#4, D3, A3, D4, G4, F#4, D3, A3, D4, F4, F5, D3, A3, D4, F4, E5, D3, A3, D4, D#5, D5, D3, A3, D4, C5, B4, D3, A3, D4, A#4, A4, D3, A3, D4, G4, F#4, D3, A3, D4, D4, C4, D3, A3, D4, A3, D3, D3, A3, D4, E3, F3, D3, A3, D4, G3, G#3, D3, A3, D4, A3, B3, D3, A3, D4, C4, D4, D3, A3, D4, E4, F4, D3, A3, D4, G4, G#4, D3, A3, D4, A4, C5, D3, A3, D4, D5, F5, D3, A3, D4, E5, D5, D3, A3, D4, C5, E5, D3, A3, D4, D5, C5, D3, A3, D4, D5, C5, D3, A3, D4, A4, G#4, D3, A3, D4, A4, G#4, D3, A3, D4, G4, F4, D3, A3, D4, D4, F4, D3, A3, D4, D4, C4, D3, A3, D4, A3, C4, D3, A3, D4, A3, G#3, D3, A3, D4, G3, F3, D3, A3, D4, E3, D3, D3, A3, D4, C#3, D3, D3, A3, D4, G#3, G3, D3, A3, D4, F3, D3

Here the improvisations generated by the computer both at Markov order 3

and 2 were generally found to have roughly the same pleasing balance

between originality and trueness to the original style. Here is one

generated at Markov order 2:

D3, C#3, D3, G#3, G3, F3, D3, E3, F3, G3, A3, B3, C4, D4, E4, F4, G4, A4, G#4, G4, F#4, F4, F5, F4, E5, D#5, D5, C5, D5, F5, E5, D5, C5, E5, D5, C5, B4, A#4, A4, G4, F#4, F4, C5, F4, B4, A#4, A4, G#4, A4, C5, D5, F5, E5, D5, C5, E5, D5, C5, A4, G#4, G4, F4, D4, F4, D4, F4, D4, F4, D4, F4, D4, F4, D4, F4, D4, F4, D4, F4, D4, C4, A3, C4, G3, G#3, A3, D4, A3, D3, F3, G3, G#3, A3, B3, C4, D4, E4, F4, G4, A4, B4

Through similar datasets, it was generally found that Markov orders 2 and 3

resulted in the best improvisations (3 generally being better than 2). With a

large enough data set in which the musician has had time to express herself,

the output at Markov order 3 will yield the best results.

It was hypothesized in section 3.1 that 4 would be the optimal Markov order

for capturing a guitarist’s style as the guitarist uses four fingers to play

notes and her style might be reflected in the limited movement of the

fingers across notes.

It was found that 4-order Markov chains captured the guitarist’s style almost

to the point of replication of the original piece. This does mean that it is a

good method for capturing style, but it does not capture only the essence of

the style. Markov orders 2 and 3 capture the essence without replicating

entire large phrases.

However, upon closer inspection, it was found that the style of the solos in

the datasets was based on guitar patterns that used 2 or 3 notes per string

and broke down sub phrases into groups of 3, i.e. the guitarist would put

together several tiny phrases of, on average, 3 notes each.

To further test this hypothesis, a section of a neo-classical guitar piece by

Paul Gilbert called ‘Scarified’ was analyzed. This piece involves a lot of

movement of all four fingers with large jumps across the guitar fret-board,

thereby giving a balanced dataset that requires the guitarist to increase the

limits of her movement across the fret-board. Here is a transcription of the

main melody:

C#6, A5, G#5, A5, C#6, A5, F#6, C#6, A5, B5, C#6, D6, C#6, F#6, C#6, A5, F#6, D6, C#6, D6, F#6, D6, B6, F#6, D6, E6, F#6, G6, F#6, B6, F#6, D6, B5, G#5, F#5, G#5, B5, G#5, E6, B5, G#6, E6, G#6, E6, B5, E6, B5, G#5, E6, C#6, B5, C#6, E6, C#6, A6, E6, C#7, A6, C#7, A6, E6, A6, E6, C#6, A5, F#5, E5, F#5, A5, F#5, D6, A5, F#6, D6, F#6, D6, A5, D6, A5, F#5, D6, B5, G#5, B5, D6, B5, F#6, D6, B6, F#6, B6, F#6, D6, F#6, D6, B5

It was found that order-3 and order-4 Markov chains produced the best

results in terms of capturing the style while still composing an original

sounding piece. Order-4 Markov chains resulted in a substantial amount of

replication of the original corpus, whereas order-3 Markov chains did not.

Order-2 Markov chains did not capture the essence of the corpus at all.

This would suggest that, on average, order-3 Markov chains generally

captured a wider style of guitar soloing. For best results, the Markov order

should be set according to the individual player.

It was decided that Markov order 3 would be best for capturing the ‘digestive

enzyme’ in the autocatalytic set based on the author’s datasets.

5.2 Autocatalytic Set Theory

This section deals with the output of the autocatalytic set program combined

with the Markov chain program.

Markov chains are generated and a ‘probable phrase’ of music notes is

created based on these chains. The highest number of total occurrences

amongst all the chains with the user specified order is found. It then

identifies any other chains with the same order and number of occurrences.

The user specifies a number of enzymes at runtime. If this number of

enzymes cannot be found (i.e. not enough chains of the user-specified order

have the maximum number of total occurrences), only as many as can be

found will be used as digestive enzymes.

5.2.1 The Autocatalytic Set Program

The autocatalytic set program was tested using the same data sets used to

test the Markov chain program.

Datasets 1 and 3 were tested first as these contained samples of the ‘Dorian

flavour’ phrases. Phrases of 100 notes were generated for both the Markov

chain program and the autocatalytic set program. The first problem was that

the reverse catalyzation method generated hundreds of options (series of

ligation reactions resulting in a single, whole sequence) when more than 3 or

4 enzymes were used. The program took significantly longer than usual to

process all this data and would almost always end with the Java runtime

throwing an OutOfMemoryError.

This was because of the way the reverse catalyzation method works. It

computes every possible series of ligase reactions that could result in the

generation of a single sequence (with no fragments remaining). It stands to

reason that increasing the number of active enzymes and the length of the

phrases would increase the number of possible ligase reactions

exponentially.

To combat this problem, the user-specified output phrase length was

reduced to between 50 and 70 notes. Computation time was still quite high

even above 60 notes.

At lower phrase lengths, it was found that the difference between the

Markovian phrase and the autocatalyzed phrase was very small. Stylistically,

the differences were unnoticeable. No original phrases were easily perceived.

Dataset 3 was much larger than dataset 1 and had more musical variation in

the corpus. When tested, it yielded better results in terms of originality. On

rare occasion, at phrase length 50 the program would result in significantly

longer computation time. This was due to the random nature of the

autocatalyzation and subsequent ligation and the different combinations of

sub-sequences that might have emerged based on the series of random

reactions.

Overall, most of the time the results did, however, seem to suggest a level

of success. One example of this is a test done on dataset 3 with 3-state

Markov chains.

The Markovian analyzer generated the following phrase:

D3, G#3, G3, F3, D3, A3, F3, G3, A3, B3, C4, D4, E4, D4, E4, F4, G4, G#4, A4, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4, D4, F4, D4, C4, A3, C4, A3, G#3, G3, F3, E3, D3, C#3, D3, G#3, G3, F3, D3, A3

This phrase was then broken down and over 100 possible ligation options were

generated. The following one was chosen (note that ‘FP’ is the

sub-sequence’s reference number in the fragment pool):

(FP0)50, 56 - (FP1)55, 53, 50, 57, 53, 55, 57 - (FP4)62, 64, 62, 64 - (FP3)60 - (FP6)62, 65, 62, 60, 57, 60, 57, 56 - (FP5)65, 67, 68, 69, 72, 74, 77, 80, 79, 77, 74, 72, 71, 69, 67, 65 - (FP2)59 - (FP8)55, 53, 50, 57 - (FP7)55, 53, 52, 50, 49, 50, 56

This translated to the following corpus:

D3, G#3, G3, F3, D3, A3, F3, G3, A3, D4, E4, D4, E4, C4, D4, F4, D4, C4, A3, C4, A3, G#3, F4, G4, G#4, A4, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4, B3, G3, F3, D3, A3, G3, F3, E3, D3, C#3, D3, G#3
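The translation from note numbers back to note names can be reproduced with a short Java sketch. The class `NoteNames` is hypothetical, and the octave offset (octave = number / 12 - 1) is an assumption inferred from the pairs shown here, where 50 corresponds to D3 and 80 to G#5. It is not taken from the program itself.

```java
final class NoteNames {

    private static final String[] NAMES = {
        "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"
    };

    /** Name of a MIDI note number in the range 0 to 127, for example 50 becomes "D3". */
    static String name(int midiNumber) {
        return NAMES[midiNumber % 12] + (midiNumber / 12 - 1);
    }

    /** Comma-separated transcription of a sequence of MIDI note numbers. */
    static String transcribe(int[] midiNumbers) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < midiNumbers.length; i++) {
            if (i > 0) {
                text.append(", ");
            }
            text.append(name(midiNumbers[i]));
        }
        return text.toString();
    }
}
```

Applied to the first two fragments above, (FP0) 50, 56 and (FP1) 55, 53, 50, 57, 53, 55, 57, this yields D3, G#3 followed by G3, F3, D3, A3, F3, G3, A3, which matches the start of the corpus.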

The highlighted notes show where two sequence fragments had been

rejoined together. Upon listening to these sequences, a guitar player may

notice that these highlighted sections are large jumps across the fret board,

which would be unintuitive (though not technically impossible) based on the

phrase leading to the jump.

This is because the preceding phrases would be played using certain finger

positions, which would not necessarily lead intuitively into the jump across

notes. These jumps are, however, playable with some forward

thinking and planning on the guitarist’s part about how and where to

position the fingers on the guitar fret board.

Most samples based on dataset 3 provided this kind of output and led to

the composition of original phrases. The following phrase was particularly

interesting:

D3, A3, F3, G3, A3, G3, F3, D3, A3, F4, G4, A4, B4, G4, C5, G4, D5, F4, A4, C5, A4, D5, F5, D5, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4, D4, F4, D4, C4, A3, C4, A3, G#3, B3, C4, D4, E4, D4, E4

The underlined arrangement was quite different to any part of the ‘Dorian

flavour’ datasets but was an aurally pleasing phrase. Furthermore, the ‘B3’

note is the Dorian sixth of the D Dorian scale. In the author’s opinion it

was placed well in the phrase. However, one must bear in mind that it was

generated based on random probability and without any specific conscious

musical intention on the part of the program. This particular improvisation

has been saved as a MIDI file and provided on the CD. It is called ‘dataset3-

out-3order-length-50-sample.mid’.

Dataset 1 was a smaller sample of ‘Dorian flavour’ phrases. While the

Markov analyzer program yielded good results with this dataset, the

autocatalytic set program did not improve on these. The output was not

original sounding at all. Dataset 2 was a linear musical pattern that was not

meant for testing the autocatalytic set program, and so was not used.

6. Discussion and Recommendations

In this section, the results in chapter 5 will be analyzed and evaluated.

Discussions will follow regarding the reasons for these results and what they

might suggest. Recommendations for future studies will be made where

appropriate.

Note: The analyses and suggestions in this section are based on the opinion

and evaluation of the author. It is recommended that larger studies should

be carried out using musicians to evaluate the programs in order to arrive at

a better statistical analysis of the output.

6.1 Markov Chain Music

On the whole, the music generated by the Markov chain program was

satisfactory. Experimenting with different order chains yielded various

results. It was found, with the chosen datasets, that orders two and three

yielded a better mix of style capture and originality than other orders.

Depending on the size of the dataset, orders two and one tended to yield

more random sounding corpora. With orders of four and above (especially

six and above) the dataset was replicated almost note for note.

While the latter half of the analysis agrees with Brooks et al’s (1957)

findings, the former half does not. They suggested that lower order chains

such as two or three would not capture the style faithfully. However, this

test seems to suggest otherwise.

It is worth noting that the datasets used in this test were much smaller and

had less variation than the ones used by Brooks et al. It is possible that

lower order chains will not capture the style well for large datasets with

more variation.

As was noted in section 5.1.1, a limitation of the Markov analyzer program is

that the ‘phraseToJFugueString()’ method in the ‘probablePhrase’ class

generates all quarter notes (i.e. 4 notes for every beat or count). At the time

of analysis of the musician’s solo, the analyzer does not record the length of

each note, but rather just the note itself. Therefore, there is no context

regarding the length of notes.

One possible solution would be to treat the same note at different lengths as

different states. Thus, for example, D1 held for two counts would be a

different state to D1 held for four counts. The immediate drawback,

however, would be that the Markov chains would only recognize the

existence of D1 (two counts) and D1 (four counts) and would generate

either of these. At no point will it consider giving, say, C1 a length of four

counts as no Markov chain would exist for C1 (four counts) and it would be

an unrecognized state.

Further work should be done on developing heuristics and algorithms to

modify note lengths.

Overall, the Markov chain program was not meant to be the main focus of

the project. It was developed to serve the purpose of generating a phrase

with a reasonable degree of originality while staying true to the author’s

style. This goal was achieved and the main program, the autocatalytic set,

could be evaluated.

6.2 Autocatalytic Set Theory

As was shown in chapter 5, the autocatalytic set program had mixed results

depending largely on the dataset being used. The following general

observations were made:

Larger datasets with more variation in the content yield better results

Dataset 3 had more content with a greater degree of variation and this

resulted in the generation of original phrases that were not previously part

of the general style of the corpus. Tests carried out on dataset 1, with its

limited content, did not yield particularly interesting corpora.

Deviation from Markov chains results in originality

In all cases of original sounding phrases that were generated, the particular

parts of the phrases that sounded original were the ones that broke away

from the Markov chain predictability. This agrees with Franz’s (1998)

definition of originality, which is an “uncommonness or statistical

infrequency of a person’s ideas” (p. 60). This stands to reason, as the

Markov chains facilitate predictability and the frequent recurrence of high

probability passages.

However, this may also be a fault of the program algorithms. Recall that the

user specifies a Markov chain order at runtime and all chains from 1-state up

until, and including, the user-specified state are created. When generating a

probable phrase, the program looks for the highest available chain to use

when generating the next note.

For example, consider the following phrase:

1 4 6 2 9 X

If the user-specified chain order is 4, the program will consider the last four

states before generating state X. Thus it will consider the phrase “4 6 2 9”

and generate state X based on its Markovian probabilities. If no chain with

that present state is found, it will then consider “6 2 9”, followed by “2 9”

and finally “9”. Thus, at all times, the highest state chain will be chosen

where possible.
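The fall-back from the highest available chain to lower ones can be sketched in Java as follows. The class `BackoffMarkov` and its methods are hypothetical and simplify the analyzer considerably: each present state simply maps to the list of notes observed after it, so drawing from that list at random reproduces the observed frequencies.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

final class BackoffMarkov {

    // present state (a list of notes) -> every note observed to follow it in the corpus
    private final Map<List<Integer>, List<Integer>> successors = new HashMap<>();
    private final int maxOrder;

    /** Builds chains of every order from 1 up to and including maxOrder. */
    BackoffMarkov(int[] corpus, int maxOrder) {
        this.maxOrder = maxOrder;
        for (int order = 1; order <= maxOrder; order++) {
            for (int start = 0; start + order < corpus.length; start++) {
                List<Integer> state = new ArrayList<>();
                for (int i = start; i < start + order; i++) {
                    state.add(corpus[i]);
                }
                successors.computeIfAbsent(state, k -> new ArrayList<>()).add(corpus[start + order]);
            }
        }
    }

    /** Longest known present state at the end of history, or null if none is known. */
    List<Integer> longestKnownState(List<Integer> history) {
        for (int order = Math.min(maxOrder, history.size()); order >= 1; order--) {
            List<Integer> state = new ArrayList<>(history.subList(history.size() - order, history.size()));
            if (successors.containsKey(state)) {
                return state;
            }
        }
        return null;
    }

    /** Draws the next note in proportion to observed counts, or returns -1 if no chain matches. */
    int next(List<Integer> history, Random random) {
        List<Integer> state = longestKnownState(history);
        if (state == null) {
            return -1;
        }
        List<Integer> options = successors.get(state);
        return options.get(random.nextInt(options.size()));
    }
}
```

With a maximum order of 4 and a history ending in “4 6 2 9”, the sketch first looks for that four-note state and only falls back to “6 2 9”, “2 9” and “9” when the longer states were never seen in the corpus.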

On one hand, this facilitates effective capture of the player’s style. However,

as Brooks et al (1957) suggest, using larger order chains increases

predictability to the point of replication of the original corpus.

Autocatalyzation and subsequent ligation result in the breaking up of these

chain structures by combining smaller-order chains together. Consider the

following original corpus:

1 4 6 2 9 6 2 8 4 6

Markovian probability suggests that the 4-order phrase “1 4 6 2” is always

followed by “9”.

Suppose “6 2” were the digestive enzyme and catalyzed the set. The

resulting fragments would be:

1 4 6 - 2 9 6 - 2 8 4 6

Subsequent ligation around the “6 2” enzyme could result in the following

recombination:

1 4 6 2 8 4 6 2 9 6

Here the phrase “1 4 6 2” is not followed by “9”. It has, thus, broken away

from predictable patterns and resulted in some degree of originality.

However, this new sequence still conforms to lower order Markov chain

probabilities as the phrase “4 6 2”, a three-state chain, is still followed by

“9” in the second half of the phrase.

It is possible, then, that Markov-generated music can be made more creative

by mixing lower state chains with higher state chains when choosing the

next note to generate. However, lower state chains should not be chosen too

often as, keeping in line with the earlier hypothesis, they may not effectively

capture and maintain the player’s style.

The originality is further enhanced when fragments broken up by different

enzymes combine together. Consider the following sequence:

1 4 6 2 8 4 6 2 9 6 2

If two digestive enzymes (“2 8” and “2 9”) split the sequence, the resulting

fragments would look like this:

(2 8) (2 9)

1 4 6 2 - 8 4 6 2 - 9 6 2

It is entirely possible, at the time of ligation, that fragments 1 and 3, which

have come into existence because of two separate enzymes, can recombine

around the “2 9” enzyme and fragments 2 and 3 can do the same around

the “2 8” enzyme. The resulting string would look like this:

1 4 6 2 9 6 2 8 4 6 2

Again, this deviation away from larger state Markov patterns and towards

smaller state Markov patterns results in a sequence with a somewhat

original structure.

The ligation process should be reviewed and approached differently

One severe drawback of the algorithms used for ligation was that, as shown

in section 5.2, computation time was increased substantially and this

sometimes led to memory overload. As had been stated, this difficulty arose

because the true process of ligation (random ligation interspersed with

random autocatalyzation) could not be modelled, as the program’s goal was

to arrive at only one remaining sequence within a short amount of time. Real

life ligation might reach this goal by random chance but it is not intentional

in any way and it may happen after a long period of time.

Revisions to the autocatalytic set simulation are needed. This may even

entail completely changing the algorithms behind the simulation.

The number of digestive enzymes and autocatalyzation reactions should not

be too high

It was observed that the higher the number of reactions and enzymes, the

more random the resulting corpora sounded. This follows Kauffman’s (1993)

suggestion that the number of enzymes should be kept low. It also follows

the logic, described above, that choosing lower state Markov chains (as a

result of recombination of sequences) too often may decrease the player’s

predictability too much, to the point where the essence of her style is lost. At

this point the music becomes more random.

As such, the number of digestive enzymes and autocatalyzation reactions

has diminishing returns to scale. It is suspected that future evaluation may

show it to follow a skewed, bell-shaped graphical pattern.

Using a player’s signatures as digestive enzymes facilitates better style

capturing

It was hypothesized earlier that one of the reasons for Iverson’s (1990)

limited success when simulating autocatalytic sets to create music might

have been that the player’s signatures were not preserved. According to the

literature, the preservation (and frequent recurrence) of certain signatures

provides a good style capturing mechanism. The output of the program

seemed to maintain the style of the original corpus. However, this output

was not compared to models in which signatures were not preserved. An

avenue of future work would be to compare the output of the two models.

Markovian and autocatalytic music generation models can never be truly

creative

A glaring drawback of the Markovian models is that every note that is

generated has been ‘learnt’ from the original corpus. A particular note will

never be generated if it did not exist in the original corpus. The same is true

of autocatalytic sets. They do not add any notes to the phrase but simply

recombine existing notes.

The true process of creativity as found in human nature often entails

exploring new options and gauging the outcome. For guitar players, this

could be trying new notes that may not necessarily be in a particular scale

but may sound pleasing nonetheless. The ‘Dorian flavour’ is one such

example where ‘wrong’ notes (the flatted fifth and the Dorian sixth) are

played in such a way so as to sound pleasing and creative. While humans

may choose to explore these options, Markovian models do not.

However, as has been discussed by Iverson (1990), the digestive enzymes

may mutate in some autocatalytic sets. Randomly changing the structure of

these enzymes to include notes that may not be in the scale might produce more

original results and show characteristics of true creativity and musical

exploration.

It is worth noting that, even amongst humans, this process of musical

exploration is often one of trial and error. New notes can be tried at random

and may be found either to sound good or not. With enough exploration and

trial of different note combinations, aurally pleasing phrases may eventually

be discovered.

Finally,

Markovian music generation models can be enhanced using autocatalytic set

theory

All the observations seem to suggest that Markovian music generation has

latent potential to produce original sounding pieces within a certain stylistic


context. The study seems to suggest, however, that it needs another process

to ‘push’ it towards exploring these options. The process of random

recombination of the corpus using autocatalytic set theory achieved this. If

originality is born out of a lack of predictability, the random recombination

facilitates this creative process by breaking predictable patterns as shown

above.

This would, however, also suggest that any form of randomization would go

some way towards achieving this. The autocatalytic set theory was chosen

for two main reasons. Firstly, it preserves artist signatures. Secondly, it is a

natural process in living organisms and numerous studies promote the use of

algorithms found in nature (Pelta and Krasnogor 2006). Future studies could

be carried out to compare the results of this program to those of others where

Markovian principles are combined with different randomization models. This

would provide a platform to discover further processes and heuristics that

might result in a greater degree of originality within a stylistic context.

All these analyses and evaluations are those of the author. It is suggested

that extensive future work should be carried out to conduct a proper

evaluation of the music generated by the program. Any number of different

combinations of variables may be chosen. These include the Markov chain

orders, the number and lengths of the digestive enzymes, the number of


autocatalyzation reactions and the content of the original corpus. It is

recommended that musicians, specifically guitar players, should be chosen

to evaluate the output of the program at different combinations of these

variables in order to identify which settings work best.
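As a minimal sketch of how such an evaluation might be organised, the following hypothetical helper lists every combination of candidate values for the variables named above, so that each could be rendered and played to evaluators. The class, its field names and the candidate values are assumptions made for illustration; the content of the corpus is left out because it is not a numeric setting.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists parameter combinations for listening tests. */
final class EvaluationGridSketch {

    /** One combination of the variables named in the text. */
    static final class Config {
        final int markovOrder;
        final int enzymeCount;
        final int enzymeLength;
        final int reactions;

        Config(int markovOrder, int enzymeCount, int enzymeLength, int reactions) {
            this.markovOrder = markovOrder;
            this.enzymeCount = enzymeCount;
            this.enzymeLength = enzymeLength;
            this.reactions = reactions;
        }

        @Override
        public String toString() {
            return "order=" + markovOrder + ", enzymes=" + enzymeCount
                    + ", enzymeLength=" + enzymeLength + ", reactions=" + reactions;
        }
    }

    /** Returns the Cartesian product of the candidate values, in nested-loop order. */
    static List<Config> grid(int[] orders, int[] enzymeCounts,
                             int[] enzymeLengths, int[] reactionCounts) {
        List<Config> configs = new ArrayList<>();
        for (int order : orders) {
            for (int count : enzymeCounts) {
                for (int length : enzymeLengths) {
                    for (int reactions : reactionCounts) {
                        configs.add(new Config(order, count, length, reactions));
                    }
                }
            }
        }
        return configs;
    }
}
```

The number of combinations grows multiplicatively, so a listening study would probably need to restrict each variable to a few representative values.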


7. Conclusion

The purpose of this project was to develop a style modelling system capable

of capturing a guitar player’s style and then playing an improvised solo in

that style. The main objective was to develop the foundations of a system

that would maintain this style but add creative and original touches to the

music it generated. To achieve this end, Markov-based music generation and

autocatalytic set-based music generation were tested and evaluated

separately and combined, and the following conclusions were drawn:

1. Larger datasets with greater variation in the content have better potential

for the generation of creative, original sounding music.

2. Deviation from large state Markov chains results in originality. A greater

degree of originality may be achieved by combining lower state chains

with higher state chains when generating phrases.

3. The identification of a player’s signatures and the use of these as

digestive enzymes result in better style capturing.

4. The number of digestive enzymes and autocatalyzation reactions has

diminishing returns to scale. The perceived creativity of the program

peaks at a certain level but too many enzymes and reactions start to


weaken the perception of the original style and the music becomes more

random.

5. Markovian models cannot promote real creativity, as found in humans, as

they inherently will never generate previously unexplored note choices. A

process of randomization and mutation of the original corpus is needed to

give Markovian models this ‘creativity’.

6. Markovian music-generation models can be enhanced using autocatalytic

set theory in order to promote creativity and originality in the music

composed by these systems.
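Conclusion 2 can be illustrated with a minimal sketch of mixed-order generation. It assumes that notes are MIDI pitch numbers, that the chain order is chosen uniformly at random at each step, and that generation backs off to a shorter context when the chosen one was never seen. These choices and all names are hypothetical and are not taken from the program developed for this project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical generator that mixes Markov chains of orders 1..maxOrder. */
final class MixedOrderMarkovSketch {
    // Maps a context (the last k pitches, for any k up to maxOrder) to the
    // pitches that followed that context in the corpus.
    private final Map<List<Integer>, List<Integer>> table = new HashMap<>();
    private final int maxOrder;

    MixedOrderMarkovSketch(List<Integer> corpus, int maxOrder) {
        if (maxOrder < 1) {
            throw new IllegalArgumentException("maxOrder must be at least 1");
        }
        this.maxOrder = maxOrder;
        for (int order = 1; order <= maxOrder; order++) {
            for (int i = 0; i + order < corpus.size(); i++) {
                List<Integer> context = new ArrayList<>(corpus.subList(i, i + order));
                table.computeIfAbsent(context, k -> new ArrayList<>())
                     .add(corpus.get(i + order));
            }
        }
    }

    /** Extends a non-empty seed until the phrase holds 'length' pitches. */
    List<Integer> generate(List<Integer> seed, int length, Random random) {
        if (seed.isEmpty()) {
            throw new IllegalArgumentException("seed must contain at least one pitch");
        }
        List<Integer> phrase = new ArrayList<>(seed);
        while (phrase.size() < length) {
            // Pick an order at random, then back off until the context is known.
            int order = 1 + random.nextInt(Math.min(maxOrder, phrase.size()));
            List<Integer> next = null;
            for (; order >= 1 && next == null; order--) {
                next = table.get(phrase.subList(phrase.size() - order, phrase.size()));
            }
            if (next == null) {
                break; // Even the last single pitch was never continued in the corpus.
            }
            phrase.add(next.get(random.nextInt(next.size())));
        }
        return phrase;
    }
}
```

Steps taken at a high order tend to copy longer stretches of the corpus, while steps taken at a low order allow more surprising continuations. Every transition is still one that occurs somewhere in the corpus.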

In conclusion, the value of this dissertation lies in developing a model for

creative music generation based on a specific style. It has achieved this goal

to a large extent. Future studies bigger in scope and size are required to

further develop and test this model. This project has provided a firm

theoretical and technical basis on which larger studies can be developed.


8. References

Allan, M. (2002) Harmonizing Chorales in the Style of Johann Sebastian

Bach. MSc Thesis, School of Informatics, University of Edinburgh

Anders, T. (2003) Composing Music by Composing Rules: Computer Aided

Composition employing Constraint Logic Programming. Sonic Arts Research

Centre, Queen’s University Belfast, Northern Ireland (unpublished)

Baggi, D. (1998) The Role of Computer Technology in Music and Musicology.

Laboratorio di Informatica Musicale, Università degli Studi di Milano, Italy.

Retrieved on 16th of July, 2007 from

http://lim.dico.unimi.it/eventi/ctama/baggi.htm

Biles, J.A. (1994) “GenJam: A genetic algorithm for generating jazz solos,”

in Proc. Int. Computer Music Conf., pp.131-137

Brooks, F.P., Hopkins, A. L., Neumann, P.G. & Wright, W.V. (1957) ‘An

Experiment in Musical Composition’. In S.M. Schwanauer & D.A. Levitt (eds)

Machine Models of Music. The MIT Press, London, pp. 23-42

Brown, P. (2005) ‘Is the future of music generative?’ Music Therapy Today,

6(2), pp. 215-274


Chomsky, N. (1957) Syntactic Structures. Mouton, The Hague

Conklin, D. (2003) ‘Music Generation from Statistical Models’, Proceedings of

the AISB 2003 Symposium on Artificial Intelligence and Creativity in the Arts

and Sciences, Aberystwyth, Wales, pp. 30–35

Cope, D. (1992) ‘Computer Modelling of Musical Intelligence in EMI’.

Computer Music Journal, 16(2), pp. 69-83

Cope, D. (1999) ‘One approach to musical intelligence’. IEEE Intelligent

Systems, 14(3), pp. 21–25

Csikszentmihalyi, M. (1988) ‘Society, culture, and person: A systems view of

creativity’. In R. J. Sternberg (ed.) The Nature of Creativity: Contemporary

Psychological Perspectives. Cambridge University Press, New York, pp. 325-329

Dubnov, S., Assayag, G., Bejerano, G., & Lartillot, O. (2003a) ‘A System for

Computer Music Generation by Learning and Improvisation in a Particular

Style’. IRCAM (unpublished). Retrieved on 17th July 2007 from

http://mediatheque.ircam.fr/articles/textes/Dubnov03a/

Dubnov, S., Assayag, G., Lartillot, O., & Bejerano, G. (2003b) ‘Using

machine-learning methods for musical style modelling’. IEEE Computer, 36

(10), pp. 73-80.


Eacott, J. (2000) Form and Transience – Generative Music Composition in

Practice. A paper for Generative Art 2000, Milan, Retrieved July 7th, 2007,

from http://www.informal.org/FormandTransience.pdf

Franz, D.M. (1998) Markov Chains as Tools for Jazz Improvisation Analysis.

MSc Thesis, Virginia Polytechnic Institute and State University

Huron, D. (2001) Information Theory and Music. Ohio State University

School of Music, Retrieved on 15th June 2007 from

http://www.music-cog.ohio-state.edu/Music829D/Notes/Infotheory.html

Iverson, E. and Hartley, R.T. (1990) ‘Metabolizing Music’. Proceedings of the

1990 International Computer Music Conference, ICMA, San Francisco, 1990.

Jones, K. (1981) ‘Compositional Applications of Stochastic Processes’.

Computer Music Journal, 5(2), pp. 45-61

Kauffman, S. A. (1993) The Origins of Order. Oxford University Press, USA

Koelle, D. (2007) ‘JFugue’. Java API for Music Programming. Retrieved

August 1st, 2007 from http://jfugue.org/

Limpaecher, A. (2007) Musical Markov. Retrieved July 2nd, 2007, from

http://www.princeton.edu/~alimpaec/alimpaecMarkov/


Mahmud, J. (2006) ‘Grammar Based Modelling and Generation of Tabla

Compositions’. BSc Dissertation, Computer Science Department, University

of Bath

McAlpine, K., Miranda, E. and Hoggar, S. (1999) ‘Making Music with

Algorithms: A Case Study System’, Computer Music Journal, 23(2), pp. 19-30

Pachet, F. (2002a) ‘Multimedia at Work - Playing With Virtual Musicians: The

Continuator in Practice’, IEEE MultiMedia, 9(3), pp. 77-82

Pachet, F. (2002b) ‘The Continuator: Musical Interaction With Style’.

Proceedings of the International Computer Music Conference, ICMA, Sweden

Pelta, D.A. and Krasnogor, N. (eds) (2006) Workshop on Nature Inspired

Cooperative Strategies for Optimisation. University of Granada.

Pfisterer, M. and Bomers, F. (2003) DumpReceiver.java, Java Sound

Resources Code examples. Retrieved on 16th of August, 2007 from

http://www.jsresources.org/examples/DumpReceiver.java.html

Wilcox, A. (2007) Generative Music. Retrieved July 15th, 2007, from

http://www.alexwilcox.co.uk/projects.php?id=10


Winston, W. (1994) Operations Research: Applications and Algorithms.

Brooks/Cole Thomson Learning Inc., USA

Zimmer, C. (1993) ‘Metamusic’. Discover Magazine, Retrieved on 17th July

2007, from the Discover Magazine database


Appendix 1 – Java Documentation

This section presents the Java documentation for the program developed as

part of this project. Each class shown in the diagram has a Java document

that provides an in-depth explanation of what it represents, what attributes

and operations it has and how it works.

All of the classes belong to a package. The documentation has been

arranged in alphabetical order according to class name.

Four packages were developed by the author: ‘autoCatalyticSet’, ‘markov’,

‘markovAuto’ and ‘NSD’. ‘JFugue’ was developed by Koelle (2007).

Figure 4 shows all the classes and packages developed by the author for the

program. It does not show relationships or dependencies between classes.


Fig 4. The classes and packages developed by the author for the program

Each Java document begins with an overall description of the class. This may

then be followed by a ‘nested class summary’, which refers to a class within

a class and will have its own separate Java document.

Next is the ‘field summary’, which has a brief description of the attributes of

the class. For example, a recurring phrase has ‘phrase’ and ‘totalOccurences’


attributes. These are only short descriptions. More details are given in the

‘field detail’ section further down in the document.

The ‘constructor summary’ that follows may be ignored. The ‘method

summary’ describes the operations of the class in short. Again, full

descriptions follow later on in the document.

These Java documents describe attributes and operations in every class that

may not have been shown in figure 4. These are more technical methods

that were used to facilitate programming.

Note that no Java Documentation has been provided for JFugue, as it is

copyright of Koelle (2007). A reference to the JFugue resources can be found

in the References section.

To read the Java documentation, please refer to the CD attached with this

dissertation and read ‘README.pdf’ for instructions. A copy of this instruction

sheet has been attached in Appendix 2.


Appendix 2 – Program User Manual

This user manual guides you through using the program developed for this

project. It has been included on the CD attached with this dissertation.

Before starting, copy the ‘Program’ folder from the root directory of the CD

to any location on your hard drive.

2.1 Running the program

First use the command line to navigate to the location in which you have stored

the ‘Program’ folder. Then change directory to:

“Program/dist/”


To run the program, type the following into the command line:

“java -jar markovAuto.jar yourmidifile.mid X Y Z”

Where:

X = the highest order of Markov chain to be created

Y = the number of digestive enzymes to be selected

Z = the length of the phrase (in notes) to be generated
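As a minimal sketch of how the four arguments might be read and checked, consider the hypothetical class below. It is not the program's own argument handling. The class name, the field names and the lower bounds (X and Z at least 1, Y at least 0) are assumptions made for illustration.

```java
/** Hypothetical holder for the command-line arguments described above. */
final class MarkovAutoArgsSketch {
    final String midiFile;   // path to the input MIDI file
    final int maxOrder;      // X: highest Markov chain order
    final int enzymeCount;   // Y: number of digestive enzymes
    final int phraseLength;  // Z: phrase length in notes

    private MarkovAutoArgsSketch(String midiFile, int maxOrder,
                                 int enzymeCount, int phraseLength) {
        this.midiFile = midiFile;
        this.maxOrder = maxOrder;
        this.enzymeCount = enzymeCount;
        this.phraseLength = phraseLength;
    }

    /** Parses the arguments or throws IllegalArgumentException with a reason. */
    static MarkovAutoArgsSketch parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                    "usage: java -jar markovAuto.jar yourmidifile.mid X Y Z");
        }
        return new MarkovAutoArgsSketch(
                args[0],
                wholeNumber(args[1], "X", 1),
                wholeNumber(args[2], "Y", 0),   // assumed: zero enzymes is allowed
                wholeNumber(args[3], "Z", 1));
    }

    private static int wholeNumber(String text, String name, int minimum) {
        int value;
        try {
            value = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(name + " must be a whole number: " + text);
        }
        if (value < minimum) {
            throw new IllegalArgumentException(name + " must be at least " + minimum);
        }
        return value;
    }
}
```

For example, the arguments “dataset3.mid 3 4 32” would be read as a maximum chain order of 3, 4 digestive enzymes and a phrase of 32 notes.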


2.2 Sample Data Sets and Output Samples

The four sample data sets that were used in the project have been provided

on this CD. They can be found under:

<root>/Program/dist/

The data sets are:

dataset1.mid = a short improvised solo in the key of Dm using phrases

with the “Dorian Flavour”.

dataset2.mid = a slightly modified finger exercise pattern for guitarists.

This is a linear pattern that will mostly generate the same phrase even at

very low Markov chain orders.

dataset3.mid = a longer improvised solo in the key of Dm using phrases

with the “Dorian Flavour”. This was the main file used to test the

program.

dataset-scarified.mid = a sample melody from the song “Scarified” by

Paul Gilbert.
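To inspect the note content of one of these files independently of the program, the standard javax.sound.midi API can be used. The sketch below is not the program's own loader; the class and method names are hypothetical. It collects the pitch of every note-on message with a non-zero velocity.

```java
import java.util.ArrayList;
import java.util.List;
import javax.sound.midi.MidiMessage;
import javax.sound.midi.Sequence;
import javax.sound.midi.ShortMessage;
import javax.sound.midi.Track;

/** Hypothetical helper that lists the pitches sounded in a MIDI sequence. */
final class MidiPitchSketch {

    /**
     * Returns the pitch (0 to 127) of each note-on message, track by track.
     * Note-on messages with velocity 0 are skipped because MIDI files
     * commonly use them as note-off messages.
     */
    static List<Integer> noteOnPitches(Sequence sequence) {
        List<Integer> pitches = new ArrayList<>();
        for (Track track : sequence.getTracks()) {
            for (int i = 0; i < track.size(); i++) {
                MidiMessage message = track.get(i).getMessage();
                if (message instanceof ShortMessage) {
                    ShortMessage sm = (ShortMessage) message;
                    if (sm.getCommand() == ShortMessage.NOTE_ON && sm.getData2() > 0) {
                        pitches.add(sm.getData1());
                    }
                }
            }
        }
        return pitches;
    }
}
```

A sequence can be obtained from a file with MidiSystem.getSequence(new File("dataset1.mid")), which may throw InvalidMidiDataException or IOException. For files with more than one track, the pitches are listed one track after another rather than merged in time order.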

One output sample generated during the testing phase has been provided

under:

<root>/Program/dist/output

This particular file is specifically mentioned in the dissertation.


2.3 Program Source Code and Java Documentation

The source code written for the program has been included under:

<root>/Program/src/

Java files for all of the classes developed have been included. They have

been arranged in their package structure (‘autoCatalyticSet’, ‘markov’, ‘markovAuto’ and ‘NSD’).

The source code for the JFugue package has not been provided as it is

copyright of Koelle (2007) and is available at http://jfugue.org.


Java documents for each of the classes have also been provided. They

describe each class in complete detail including a brief description, its

variables and methods. To navigate through the Java documents, open:

<root>/Program/dist/javadoc/index.html

Again, all the classes have been arranged in their package structure.
