Developing a Style Modelling System for Creative Generative Music Improvisation
Nigel Stephen D’Souza
Dissertation submitted to the School of Computer Science and Information Technology, University of Nottingham, in partial fulfilment
for the degree of
Master of Science in
The Management of Information Technology
September 2007
Abstract
This dissertation describes an exploratory style modelling application
intended to simulate a guitar player’s style and replicate it with some degree
of creativity and originality.
This was achieved using a combination of a stochastic technique (Markov
Chains) and a biochemical restructuring process (Autocatalytic Set Theory).
The key objective was to see if natural restructuring processes could
enhance traditional stochastic style modelling techniques in order to promote
creative improvisation in computer generated music. In short, the program
generates improvised solos in the guitar player’s style and adds an element
of originality.
Previous studies into style modelling and generative music are discussed in
the literature review. Based on the arguments put forward, ideas for a
program simulation are suggested. Next, there is a description of the
program developed in Java to implement these ideas. Various tests are
conducted and the results analyzed. Finally, a discussion follows to provide a
theoretical insight into the evaluation of the success of the model.
The tests conducted yielded mostly positive results, suggesting that the
model has a sound theoretical foundation and has the potential for further
development and expansion.
Finally, conclusions are drawn about the reasons for the success of the
model and the techniques used. It is recommended that future studies
should be carried out to expand on the ideas presented here and develop the
model further.
Please note that a basic knowledge of music theory and terminology is
required in order to fully understand some of the content of this dissertation.
However, this is not essential.
This dissertation comes with a CD containing the program, its technical
documentation, a user manual, test data and output samples.
Contents
1 Introduction
    1.1 Aims of the project
    1.2 Problem Description
    1.3 Objectives
2 Literature Review
    2.1 Generative Music
    2.2 Methods for the Creation of Generative Music
        2.2.1 Mozart’s Dice Game
        2.2.2 Constraint Logic Programming
        2.2.3 Cellular Automata
    2.3 Style Modelling
        2.3.1 Autocatalytic Set Theory
        2.3.2 Stochastic Processes – Markov Chains
            2.3.2.1 Markov Chains in Practice
            2.3.2.2 Case for Markov Chains
            2.3.2.3 Case against Markov Chains
3 Proposed Solution
    3.1 Using the strengths of Markov Chains
    3.2 Using the strengths of Autocatalytic Set Theory
        3.2.1 Addressing the shortcomings of Markov Chains
4 Implementation – Stage 1
    4.1 Markov Chains
        4.1.1 The Markov Chain Program
        4.1.2 The Markov Chain Program – With Music
    4.2 Autocatalytic Sets
        4.2.1 Theory Behind an Autocatalytic Set Program
        4.2.2 Simulating Autocatalysis
    4.3 Combining the Two
5 Results
    5.1 Markov Chains
    5.2 Autocatalytic Set Theory
6 Discussion and Recommendations
    6.1 Markov Chain Music
    6.2 Autocatalytic Set Theory
7 Conclusion
8 References
9 Appendix 1 – Java Documentation
10 Appendix 2 – Program User Manual
    A2.1 Running the Program
    A2.2 Sample Data Sets and Output Samples
    A2.3 Program Source Code and Java Documentation
1. Introduction
The essence of music creation comes from the style of the creator. It is
developed through years of practice, inspiration and emotion that shape a
musician’s ‘voice’. This voice may then become the musician’s signature
playing style. For guitar players, a particular choice of note
combinations in various musical phrases often becomes their individual style
and makes their work instantly recognizable.
1.1 Aims of this project
This project aims to develop a style modelling system for automated music
improvisation. It seeks to capture a particular player’s improvisation style
and generate an improvised solo using its characteristics. The goal is to
develop a suitable style modelling approach for doing this in a way that
produces the closest possible replication of the player’s style but still
maintains a degree of originality and creativity. The literature shows that
other style modelling music generators seem to lack a creative edge. The
problems faced by these systems will be addressed and a solution that
tackles the problem of encouraging computer creativity within a stylistic
context will be proposed.
The first question here is whether or not a style borne out of inspiration and
emotion can actually be quantified and recreated in an original sounding
manner. It is proposed that it can be done and, to achieve this end, a music
generator reflecting the author’s style of guitar playing will be developed.
The key argument in this project is whether or not such a music generator
can be ‘creative’ or ‘original’. To this end, the generator being created will
not merely replicate the author’s style, but rather it will compose music that
has an element of creativity.
The creation of a music generator is secondary to the study of the method of
generating the music. Automated music generation is in itself a large study
that has been approached in many different ways by a number of
researchers such as Francois Pachet, Brian Eno, David Cope, Sean Booth,
John Cage and several others. This project will isolate the problem of the
style modelling stage that comes before the actual automated music
generation. Subsequently, there will be a discussion on how this research
might improve current methods of automated generation.
This research may form the basis for further studies in computer creativity
within a specific stylistic context. When combined with music generation
techniques for playing in contexts, it may be used to perform an analysis
and provide an insight into how a particular musician would approach
improvising in those contexts. It may even lead to the generation of musical
phrases that the musicians themselves do not have the technical skills to
play accurately.
1.2 Problem Description
The primary focus of this research is to validate the proposed method of
applying creativity algorithms within a musical style, for use by generative
music creators. This method is based on Markov Chains and a
biochemistry theory called Autocatalytic Set Theory. A Markov chain is a
discrete-time stochastic process that shows the probability of one event
following another event or a sequence of other events. Autocatalytic Set
Theory looks at the way RNA string sequences separate and recombine to
create new sequences with different compositions to the original one. There
will be a discussion on why Markov chains are suitable for style modelling
and a system of Markov chains that is best suited to guitar players will be
proposed. Following that, the shortcomings of Markov Chains will be
addressed and it will be shown how Autocatalytic Set Theory may be used to
make up for these.
Using the project’s author to provide the data set, musical phrases that
incorporate what is called the ‘Dorian flavour’ will be documented. The
‘Dorian’ is a particular collection of notes that forms a specific mode. It is
identical to the ‘Minor’ mode except for the sixth note, which is a semitone
above that of the ‘Minor’ mode. Additionally, many guitar players add an
extra flatted-fifth note (half a step below the fifth note) in between the
fourth and fifth notes of the scale. Playing solos with a ‘Dorian flavour’ over
a song in a minor key will result in a slight dissonance, which, although
technically wrong according to the rules of music, is generally found to be
pleasing and musically interesting. A few short phrases have been composed
in the ‘Minor’ scale that have a similar structure to each other and play the
sixth note of the ‘Dorian’ occasionally to create the dissonance and give
them the unique pattern and sound. If the Markov chain program is
successful in accurately documenting the style, the music generator will
compose musical phrases that have this unique sound.
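The relationship between the ‘Minor’ mode, the ‘Dorian’ mode and the added flatted fifth can be written down as semitone offsets from the root note. The following is a minimal sketch only. It assumes that ‘Minor’ means the natural minor scale, and the class and method names are illustrative rather than taken from the program on the accompanying CD.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: names are hypothetical, not from the dissertation's program.
final class DorianSketch {
    // Semitone offsets from the root for the natural minor scale (assumed meaning of 'Minor').
    static final int[] NATURAL_MINOR = {0, 2, 3, 5, 7, 8, 10};

    static final String[] NAMES = {"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"};

    // Dorian: identical to natural minor except that the sixth note is raised by one semitone.
    static int[] dorian() {
        int[] offsets = NATURAL_MINOR.clone();
        offsets[5] += 1; // index 5 holds the sixth note of the scale
        return offsets;
    }

    // Dorian with the extra flatted fifth inserted between the fourth and fifth notes.
    static List<Integer> dorianWithFlatFifth() {
        List<Integer> offsets = new ArrayList<>();
        for (int offset : dorian()) {
            if (offset == 7) {
                offsets.add(6); // flatted fifth: one semitone below the fifth
            }
            offsets.add(offset);
        }
        return offsets;
    }

    // Converts offsets to note names for a root given as a pitch class (0 = C, 2 = D, ...).
    static List<String> noteNames(int rootPitchClass, List<Integer> offsets) {
        List<String> names = new ArrayList<>();
        for (int offset : offsets) {
            names.add(NAMES[(rootPitchClass + offset) % 12]);
        }
        return names;
    }
}
```

Calling `DorianSketch.noteNames(2, DorianSketch.dorianWithFlatFifth())` lists the notes for a root of D, with the flatted fifth spelled as G# because the sketch uses sharp names only.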
The suggested adaptation of the Markov Chains will be described in a
following section. Reasons for choosing this method will also be provided and
it will be compared to other methods in common use.
1.3 Objectives
The following minor objectives were set at the beginning of the project:
1. To develop a program capable of performing Markovian analyses on a set
of raw data.
2. To develop an algorithm for the generation of further data consistent with
the Markovian characteristics of the raw data and extend the program to
perform Markovian analyses on musical corpora.
3. To develop a program that simulates autocatalytic sets.
The following major objectives were set based on the minor objectives:
1. To develop a program capable of quantifying and capturing a guitar
player’s style.
2. To test a theory that the combination of Markovian principles with
autocatalytic set theory can result in the generation of new, original
sounding music while still maintaining characteristics of the original style.
2. Literature Review
2.1 Generative music
“Generative music is commonly agreed to describe music
in which a system or process is composed to generate
music rather than the composition of the direct musical
event which will result from that system. The generative
composer has only indirect control [over] the final musical result,
and the creativity of the compositional process is found in
the decisions about how the system will operate and the
rules inside the system” (Rich 2003 cited in Brown 2005,
p. 217).
The ‘Illiac Suite’ is generally heralded as the first piece of music
automatically generated by a computer (Baggi 1998). It was created using
the Illinois Automatic Computer in 1956 at the University of Illinois at
Urbana-Champaign. While the produced corpus was not meant to be a piece
of ‘beautiful’ music, the concept of music generation using programming
algorithms was the real triumph of the ‘Illiac Suite’.
Since then, there have been several advancements in the algorithms used to
generate music. Some moderately successful programs and software
products have been created to automatically generate music based on a few
user inputs. Notable ones include “Koan Pro” by SSEYO and GarageBand by
Apple (Brown 2005).
2.2 Methods for the creation of Generative Music
Different methods have been used to automatically generate music. Eacott
(2000) argues that computers cannot really compose music. Any music
generated by a computer is the result of human input and/or interaction. All
methods use some form of user input to generate music. The following is a
sample of the broader groups of methods for automated music generation.
2.2.1 Mozart’s Dice Game
The famous composer W.A. Mozart conceived one of the earliest ideas for
random music generation, “Musikalisches Würfelspiel”. In this game, a series
of short musical phrases is pre-composed and notated. A die is then rolled
to choose from these phrases to form a two-part waltz (Baggi 1998). Wilcox
(2007) created an online music generator based on Mozart’s dice game.
However, instead of a dice roll, the program used the user’s details as input.
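The selection mechanism of the dice game can be sketched in a few lines. The sketch below is an assumption-laden illustration: it treats the game as one uniform random choice per bar from a table of pre-composed alternatives, it does not reproduce the actual tables or dice rules of the original game, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative sketch: not the historical game tables, and not code from the dissertation's program.
final class DiceGameSketch {
    // For each bar, one pre-composed phrase is chosen at random from that bar's alternatives.
    static List<String> compose(String[][] optionsPerBar, Random die) {
        List<String> piece = new ArrayList<>();
        for (String[] options : optionsPerBar) {
            int roll = die.nextInt(options.length); // uniform choice standing in for the die roll
            piece.add(options[roll]);
        }
        return piece;
    }
}
```

Every phrase in the output is copied whole from the input table, which makes the limitation discussed in this section visible in the code itself: nothing inside a phrase is ever examined or changed.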
The obvious shortcoming of the dice game is that it has a very limited
compositional scope. The composition is merely a recombination of
pre-existing phrases supplied by a human. The phrases themselves keep their
whole structure and are not broken down or separated. No part
of any phrase is analyzed so that its stylistic content can be isolated and
used for original composition.
Overall, Mozart’s dice game is just that, a game. There is little or no scope
for truly original composition and hence it cannot be considered seriously for
commercial music automation.
2.2.2 Constraint Logic Programming
Descriptions of a piece of music can be plotted in terms of rules. When
modelling a person’s style, these rules may become more complex and
abstract (Anders 2003). However, there is still some amount of recognizable
structure. The rules may apply to a large number of musical parameters
such as the note pitches or the duration of each note. How one note is
played may be dependent on how the preceding notes were played or what
the next note is likely to be.
Setting these rules will not necessarily lead to the creation of a single
musical composition. Rather, it isolates a subset of the infinite possible
musical scores that could be composed in the absence of fixed rules. This
subset conforms to the ‘rules’ behind an individual style. Programs that
define these rules and mutual note dependencies with multiple solutions are
called non-deterministic programs. The rules/restrictions and mutual note
dependencies are called constraints (Anders 2003).
For the simplest general form of music, independent of any style modelling,
these constraints could refer to the tempo of the song or the scale that the
song is played in (referred to as the key of the song). The model used in
this project is a branch of Constraint Logic Programming that is aimed
specifically at playing within a set stylistic context. It is aimed at setting
constraints based on a guitarist’s probability of playing a certain note based
on the preceding notes in the phrase. This will be explained in full detail in
section 2.3.2.
A simple example of a constraint-based system is as follows. A single
constraint is set that the song should be in the key of ‘D minor’; that is,
every note generated by the computer should be a part of the ‘D minor’
scale. The computer takes a ‘random walk’ through every possible music
note until it finds one that satisfies the constraint (of being in the key of ‘D
minor’) and it plays this note.
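The single-constraint example above can be sketched as follows. This is only an illustration under stated assumptions: notes are represented as MIDI note numbers, ‘D minor’ is taken to be the natural minor scale, and the class and method names are hypothetical.

```java
import java.util.Random;

// Illustrative sketch: MIDI numbering and all names are assumptions, not from the dissertation's program.
final class KeyConstraintSketch {
    // Pitch classes of D natural minor: D, E, F, G, A, Bb, C.
    static final boolean[] IN_D_MINOR = new boolean[12];
    static {
        for (int pitchClass : new int[]{2, 4, 5, 7, 9, 10, 0}) {
            IN_D_MINOR[pitchClass] = true;
        }
    }

    // The constraint: the note must belong to the D minor scale (midiNote must be non-negative).
    static boolean satisfiesKey(int midiNote) {
        return IN_D_MINOR[midiNote % 12];
    }

    // Draws random candidates in [low, high] until one satisfies the constraint.
    // Assumes the range contains at least one note of the scale, otherwise this never returns.
    static int nextNote(Random rng, int low, int high) {
        while (true) {
            int candidate = low + rng.nextInt(high - low + 1);
            if (satisfiesKey(candidate)) {
                return candidate;
            }
        }
    }
}
```

Further constraints could be added as extra boolean checks in the same loop, which narrows the subset of acceptable notes without fixing a single outcome.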
A strong criticism of Constraint Logic Programming is that “the methods
employed to produce generative music are merely technical and cannot be
described as artistic” (Brown 2005 p. 224). It may be argued that the music
is derived mathematically and lacks inspiration or emotion, as computers are
incapable of experiencing either.
However, it should be noted that ‘style’ is essentially a function of a person’s
own mental representation of music as per the literature on the cognitive
psychology of music (Franz 1998). Whether or not a phrase is musically
appealing is completely subjective and is determined based on the opinion of
the listener. Thus a good test would be to have a sample of musicians listen
to the output of a program and rate its musical appeal.
2.2.3 Cellular Automata
Cellular automata are important modelling and simulation tools. They form an
interesting modelling system based on the evolution and growth of living
cellular organisms. “They are discrete dynamical systems; that is, they
change some feature with time” (McAlpine et al 1999, p. 23). Different
disciplines from physics and chemistry to sociology and philosophy have
used cellular automata principles.
A cellular automaton is an array of elements referred to as cells. Evolution
rules are applied to each cell to determine how each automaton develops in
time. McAlpine et al (1999) document the CAMUS 3D research project that
applies these evolution rules to music composition to model pattern
propagation. Each theme in a composition is a separate pattern, which is
subjected to evolutionary musical transformations (such as transposition,
augmentation, inversion, etc.). Design constructs, similar to rules as in
constraint logic, are set in place to guide the evolution of the composition.
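To make the idea of an array of cells updated by an evolution rule concrete, the sketch below steps a generic one-dimensional, two-state automaton through one generation. It is not the rule set used by CAMUS 3D. The rule numbering follows the common convention for elementary automata, and the class and method names are hypothetical.

```java
// Illustrative sketch of a generic elementary cellular automaton, not the CAMUS 3D rules.
final class CellularAutomatonSketch {
    // Computes the next generation. Each cell's new state depends only on itself and its
    // two neighbours (with wrap-around at the edges), looked up in the bits of 'rule'.
    static boolean[] step(boolean[] cells, int rule) {
        int n = cells.length;
        boolean[] next = new boolean[n];
        for (int i = 0; i < n; i++) {
            int left = cells[(i - 1 + n) % n] ? 4 : 0;
            int self = cells[i] ? 2 : 0;
            int right = cells[(i + 1) % n] ? 1 : 0;
            int neighbourhood = left | self | right; // value 0..7
            next[i] = ((rule >> neighbourhood) & 1) == 1;
        }
        return next;
    }
}
```

In a musical setting the live cells of each generation would be mapped to notes or intervals. That mapping is a separate design decision and is not shown here.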
McAlpine et al (1999) suggest that human composers employ pattern
propagation intuitively whereas computer based Cellular Automata formalize
this propagation at a much more advanced level. All the musical patterns
evolve according to the constraints that are set initially. Hence, any
common stylistic musical features that emerge as a result are a sonification
of the evolutionary behaviour of the automaton.
The application of evolutionary rules may be seen as an advanced version of
constraint logic where the ‘constraint’ actually allows for evolution,
reconstitution and growth of a musical idea as represented by a simulated
organic cell. Composition constraints are specified in advance for both
constraint logic and cellular automata.
In an automaton, the rules are localized to the existing cells. The ‘random
walk’, as mentioned earlier, will go through all the possible notes at time t
and choose one that meets the set evolution rules as they apply to the note
at time t-1. Thus, like living organic cells, they develop based on existing
states.
Cellular automata are an important advancement in the intelligence of
generative music. Having ‘cells’ of notes that evolve, change, develop and
augment based on previous passages forms the basis for the development of a
stylistic model. This can be done by guiding the evolution of notes to fit the
stylistic ‘rules’ behind a player’s performance.
2.3 Style Modelling
Style modelling implies “building a representation of the musical surface that
captures important stylistic features hidden in the way patterns of rhythm,
melody, harmony and polyphonic relationships are interleaved and
recombined in a redundant fashion” (Dubnov et al 2003b, p. 2).
This kind of model makes it possible to generate new musical sequences that
conform to the documented style. Franz (1998) states that the literature
indicates a lack of suitable models or analysis techniques for measuring
style and creativity in improvisation.
2.3.1 Autocatalytic Set Theory
Autocatalytic Set theory stems from observation of reactions in RNA
sequences. It is best explained with a diagram, as shown on the next page.
According to this theory, specific portions of an RNA sequence (called
‘digestive enzymes’) react with matching enzymes found in another
sequence to break it apart at those matching portions. This process is then
reversed so that these enzymes are used to join other sequences or
fragments of sequences around matching enzymes. Thus, each member in a
set of sequences is the product of at least one reaction catalyzed by at least
one other member.
“Theoreticians at the Santa Fe Institute found that an artificial-life program
could simulate a living metabolism, in which a few chemicals acting as
enzymes digest other molecules and reassemble them into different ones in
complicated cycles” (Zimmer 1993).
It can be seen that Autocatalytic Set theory, when applied to computer
science, is an extension of Cellular Automata. Iverson (1990) proposed the
idea of applying autocatalytic set theory to music as a basis for a style
modelling system. He developed the ‘Metamuse’ program to do this. This
was a completely different approach to style modelling at the time (Zimmer
1993).
Metamuse works by having a musical score fed into it. It then randomly
extracts a string of four notes. That string acts as a digestive enzyme by
identifying the same sequence of notes in the rest of the composition and
separating the composition into two pieces in the middle of that sequence.
The digestive enzyme then reproduces itself, and the two copies then search
for more matches. Upon reproduction, an enzyme has a small chance of
mutating, that is, of having its notes changed. Eventually, a number of
different enzymes are at work, digesting and reproducing.
This digestive process ceases once the entire composition has been reduced
to a set of fragments roughly the size of the enzymes themselves. Metamuse
then reverses this process. The fragments reassemble themselves, each one
looking for two other fragments to stitch together end-to-end, as is shown in
the diagram above. Again, a sequence that catalyzes a reaction is allowed to
copy itself and is prone to mutation. (Zimmer 1993)
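The digestion step described above can be sketched as a search-and-split over a sequence of notes. The following is one reading of Zimmer's description and not Iverson's Metamuse code. It uses a single fixed enzyme, ignores overlapping matches, and leaves out reproduction, mutation and the reassembly phase. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the digestion phase only; not Metamuse itself.
final class DigestSketch {
    // Splits the composition in the middle of every non-overlapping occurrence of the enzyme.
    static List<List<String>> digest(List<String> composition, List<String> enzyme) {
        List<List<String>> fragments = new ArrayList<>();
        int fragmentStart = 0;
        int i = 0;
        while (i + enzyme.size() <= composition.size()) {
            if (composition.subList(i, i + enzyme.size()).equals(enzyme)) {
                int cut = i + enzyme.size() / 2; // cut point in the middle of the matched notes
                fragments.add(new ArrayList<>(composition.subList(fragmentStart, cut)));
                fragmentStart = cut;
                i += enzyme.size(); // continue after this match
            } else {
                i++;
            }
        }
        // Whatever remains after the last cut is the final fragment.
        fragments.add(new ArrayList<>(composition.subList(fragmentStart, composition.size())));
        return fragments;
    }
}
```

Because the cut falls in the middle of the matched notes, each half of the enzyme stays attached to the end of one fragment and the start of the next. This is what would allow a later reassembly phase to rejoin fragments around the same notes.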
Iverson’s general idea was to preserve the unique stylistic characteristics of
the music by essentially dissecting and recombining a given piece of music.
However, the success rate was low as only one in three pieces of music
produced by Metamuse was considered to be a successful representation of
the intended musical style (Iverson 1990).
While the program itself may not have been a success, the ideas behind it
may be useful for further development of a style modelling system. Iverson
made the following important suggestions:
Assimilation of a piece of music is best done by first isolating it into
different series of elements. These may be note pitches, note
durations, rhythm, or any other user-selected element.
Having a certain degree of mutation may potentially result in
productive additions to the autocatalytic set.
An apparent flaw is the selection of the digestive enzyme. Iverson chose to
select a digestive enzyme at random, rather than based on any specific
reason or logical choice. The digestive enzyme will reappear frequently in
different sections of the improvised piece because it joins sub-strings
together to create larger strings and, barring mutation, is preserved in its
entirety. On dealing with style modelling, Pachet (2002a, p. 77) suggests,
“music, especially improvised music, is made up of…[…]…repetitive
ingredients…[in addition to]…unexpected events or material”. If a stylistic
trait of the player’s style were chosen to be an enzyme, its frequent
reappearance may give the listener a sense of that player’s style.
This is supported by Cope (1999), who states that artists have certain
stylistic traits, or signatures that occur frequently across compositions. He
suggests that these signatures should avoid fragmentation, as doing so
would result in inaccurate style replication.
Hence, careful selection of the initial enzyme may help to preserve stylistic
traits or signatures, and this will be discussed further in section 3.2.
2.3.2 Stochastic Processes – Markov Chains
Stochastic processes are the counterparts of deterministic processes
considered in probability theory. They deal with states that can have several
possible outcomes. They are sometimes random processes that choose one
of these possible outcomes based on a set of conditions (Winston 2004).
A ‘discrete-time stochastic process’ amounts to a sequence of random
variables based on certain conditions. Simply, if X is the value of a particular
variable at time ‘t’, a discrete-time stochastic process is a description of the
relation between X at time ‘t’ and X at time ‘n’ for any number of future time
periods (Winston 2004).
A Markov Chain is one such discrete-time stochastic process that shows the
probability of transition from X_t to X_{t+1}. The simplest way to present these
chains is in a transition probability matrix.
This matrix shows the probability of going from the vertical states in one
time period to the horizontal states in the next time period. Thus, if the
current time period is ‘t’, the probability of going from A_t to A_{t+1} is 10%,
A_t to B_{t+1} is 0%, A_t to C_{t+1} is 90%, B_t to A_{t+1} is 50% and so on
(Winston 2004).
Note that the sum of all probabilities in each of the rows is equal to 100%.
The table is said to be normalized, implying that there is absolute certainty
that there will be some output at time ‘t+1’ from any of the states at time ‘t’
(McAlpine et al 1999).
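A normalized transition matrix of this kind can be estimated from a sequence of observed notes by counting how often each note follows each other note and then dividing every row by its total. The sketch below illustrates that counting step under stated assumptions: notes are plain strings, the input sequence is invented for illustration, and the class and method names are not those of the program on the accompanying CD.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: estimates first-order transition probabilities from a note sequence.
final class TransitionMatrixSketch {
    // Returns, for each note, a row mapping every observed next note to its probability.
    static Map<String, Map<String, Double>> estimate(String[] notes) {
        Map<String, Map<String, Double>> rows = new LinkedHashMap<>();
        // Count each transition from notes[i] to notes[i + 1].
        for (int i = 0; i + 1 < notes.length; i++) {
            rows.computeIfAbsent(notes[i], k -> new LinkedHashMap<>())
                .merge(notes[i + 1], 1.0, Double::sum);
        }
        // Normalize each row so that its probabilities sum to 1 (that is, 100%).
        for (Map<String, Double> row : rows.values()) {
            double total = 0.0;
            for (double count : row.values()) {
                total += count;
            }
            for (Map.Entry<String, Double> entry : row.entrySet()) {
                entry.setValue(entry.getValue() / total);
            }
        }
        return rows;
    }
}
```

Transitions that never occur in the input are simply absent from a row, which corresponds to a probability of 0% in the matrix.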
Markov chains can also have multiple ‘orders’ or ‘states’, as follows:
In this bigram (second-order chain), the probability of going from AA_t to
A_{t+1} is 10%, AA_t to C_{t+1} is 90%, and so on. Thus if A, B and C are
musical notes, the probability of playing A_{t-1}, B_t and then A_{t+1} is 50%.
As such, Markov Chains can have any number of states (Winston 2004).
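A chain of this kind can be stored as a table keyed by the two preceding notes. The following sketch is a minimal illustration, assuming notes are strings and a context is written as the two notes joined by a space. The example data is invented and the names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: a second-order table mapping "previous two notes" to counts of the next note.
final class SecondOrderSketch {
    static Map<String, Map<String, Integer>> countContexts(String[] notes) {
        Map<String, Map<String, Integer>> table = new HashMap<>();
        for (int i = 0; i + 2 < notes.length; i++) {
            String context = notes[i] + " " + notes[i + 1]; // the two preceding notes
            table.computeIfAbsent(context, k -> new HashMap<>())
                 .merge(notes[i + 2], 1, Integer::sum);
        }
        return table;
    }

    // Probability of 'next' given that 'first' and then 'second' were just played.
    // Returns 0 for a context that was never observed.
    static double probability(Map<String, Map<String, Integer>> table,
                              String first, String second, String next) {
        Map<String, Integer> row = table.get(first + " " + second);
        if (row == null) {
            return 0.0;
        }
        int total = 0;
        for (int count : row.values()) {
            total += count;
        }
        return row.getOrDefault(next, 0) / (double) total;
    }
}
```

For the invented sequence A, B, A, A, B, C, the context ‘A B’ is followed once by A and once by C, so `probability(table, "A", "B", "A")` returns 0.5.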
The ‘mind reading machines’ created by Hagelbarger in the
early 1950s sought to predict choices made by humans. It was found that this
could be done by analysing patterns of previous choices using Markov Chains
(Dubnov et al 2003a). Markov Chains provide a method for documenting
patterns of previous choices, and thus can be used to predict future choices.
A stream of musical notes can be seen as a stream of events. Such a stream
might be analyzed using a Markov Chain and framed in a probability
transition matrix. To this end, Brooks et al (1957) suggest one of the earliest
musical applications of Markov Chains in their study entitled “An Experiment
in Musical Composition”. They state that computers are capable of
performing deductive reasoning and hypothesized that probability states
stored by Markov chains could be used as input for the deductive process in
order to arrive at a musical output. Analyzing patterns of previous note
choices in this way could be used to predict future note choices.
Pachet suggests (2002a, p. 77), “Markov-based models always generate
music, which is consistent with a style, itself defined by recurring patterns
found in some learned corpus”. Thus, he suggests that recurring patterns
can be used to define a style.
‘Style’ is a subjective concept and is therefore difficult to quantify (Franz
1998). One possible approach to recreating a guitar player’s style is to select
highly probable note choices or musical phrasings that she is likely to choose
in a particular musical context. The content of Markov chains shows the
movement from one note to another with some degree of probability. The
difficulty lies in measuring how external context (the style of the guitar
player playing the notes) affects these probabilities.
To solve this problem, we look at Claude Shannon’s ‘Information Theory’. It
considers the predictability or probability of a verbal or written message
using Markov chains (Huron 2001) much like Brooks et al (1957) did with
music notes. One important part of this discussion is the role of context. The
theory sets up a frame of predictability, suggesting that previous states
may constrain the probability of occurrence of the next possible state.
This dependence on context may apply to guitar playing style in the following
way. Perhaps one guitarist’s style is to play each note of the scale in
ascending order. Perhaps she also prefers to skip every third note. Thus, if it
is observed that two consecutive notes have been played in the two
preceding states, there is a high probability that the third will be skipped in
the current state and the fourth will be played instead.
As has been shown, Markov chains can have any number of states. Hence, in
this example the frame of predictability should consider at least two
preceding states (a 2-state Markov chain). It becomes clear that a multiple
state chain will be more successful in predicting a player’s style.
Constraint Logic requires a set of rules or constraints. Markov chains provide
these ‘constraints’ as a set of probabilities of proceeding to a selection of
future states from a current state. Thus, they are suitable for use within a
Constraint Logic framework.
2.3.2.1 Markov Chains in Practice
Over the years, Markov Chains have been used extensively in music style
modelling. Limpaecher (2007) created a simple program that
composes music from an input file using a number of
probabilistic techniques, primarily Markov models. The output of this
program was evaluated and found to be fairly impressive.
McAlpine et al (1999) report on the aforementioned CAMUS 3D program,
suggesting that, with a little effort, the system can produce pleasing
and much more natural-sounding compositions than those
obtained using comparable algorithmic techniques.
However, the most impressive system using Markov chains is the
‘Continuator’, a product developed by Sony Computer Science Laboratories
(Pachet 2002a). This is a MIDI enabled keyboard that stores and analyzes a
player’s performance in real-time as it is played. It then plays back an
improvisation based on the artist’s style. Pachet describes an ‘Aha effect’.
Upon listening to the generated improvisation, the musicians react
extremely positively, often exclaiming, “Aha!”
Some have even declared that it plays improvisation of a much higher level
of skill than they are capable of, but still maintains their unique style.
These programs and products are only a few of those that have been
designed using Markov chains as the primary means of documenting
style. They show that Markov chains can
result in successful and considerably accurate preservation and replication of
a musician’s style.
2.3.2.2 Case for Markov Chains
As is shown above, Markov Chains provide an excellent format for style
modelling.
Franz (1998, p. 27) concluded that “Markov chains and the tools for their analysis
can be interpreted to determine quantitative measures of creativity and
style”. Documenting patterns of previous note choices using the chains
provides a means for predicting future note choices consistent with the style.
It has been shown that “expectations based on recent past context guide
musical perception” (Dubnov et al 2003b, p. 3). Franz tested the use of
Markov Chains as analytical tools by transcribing jazz solos played by John
Coltrane in the song ‘Giant Steps’ on several live records. Markov Chains
were found to be useful and effective for the analysis of jazz improvisation.
The specific style (jazz) is unimportant. The study suggests that Markov
Chains are an effective format for style documentation, regardless of musical
style.
Stochastic techniques in general offer a useful means of data reduction
(Jones 1981). Constraint Logic requires a detailed specification of rules for
the progression of music and Markov Chains provide these rules (as a set of
possible options for state transitions) in a simple, compact manner.
‘Knowledge engineering’ is a generative music approach that involves
explicit coding of music rules in some form of logic or formal grammar
(Dubnov et al 2003b). While it produces impressive results, it requires
extensive exploitation of musical knowledge of a composer’s style. Statistical
learning methods are far easier to program and use.
These stochastic models are fast, efficient data structures and can be used
to rapidly match contexts (Conklin 2003). This data reduction combined with
efficient computing techniques has led to the creation of many successful
real-time Markovian music generators. The best example of a real-time
music generator is the Continuator mentioned above.
Markov chains make it possible to compute the ‘optimal’ piece of music by
selecting a path, through the states, with the highest probability (Mahmud
2006). This is useful for pre-processing music. For real-time application,
using the ‘random walk’ method makes it straightforward to generate new
music using a Markov Chain. This may result in unanticipated note choices.
These are note choices or whole phrases that might have a low probability of
occurrence. Jones (1981, p. 45) states, “The bonds of a restrictive and
inaccurate acoustic theory and of a limited aural imagination may be
broken”.
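The two ways of reading a path out of a transition matrix mentioned above can be contrasted in code. The sketch below is an illustration under stated assumptions. The matrix is indexed by state number, every row sums to 1, and the ‘optimal’ piece is taken to mean the single most probable path of a fixed length from a given start state, found by dynamic programming. These choices, and all names, are assumptions and are not drawn from Mahmud (2006) or from the program on the accompanying CD.

```java
import java.util.Random;

// Illustrative sketch: most probable fixed-length path versus a weighted random walk.
final class PathSketch {
    // Picks the next state at random, weighted by the probabilities in 'row'.
    // Assumes the row is normalized and has at least one positive entry.
    static int sampleNext(double[] row, Random rng) {
        double r = rng.nextDouble();
        double cumulative = 0.0;
        int lastPositive = -1;
        for (int s = 0; s < row.length; s++) {
            if (row[s] <= 0.0) continue; // zero-probability transitions are never taken
            cumulative += row[s];
            lastPositive = s;
            if (r < cumulative) return s;
        }
        return lastPositive; // guards against floating-point rounding in the row sum
    }

    // Generates a path of 'length' states (length >= 1) by repeated weighted sampling.
    static int[] randomWalk(double[][] p, int start, int length, Random rng) {
        int[] path = new int[length];
        path[0] = start;
        for (int t = 1; t < length; t++) {
            path[t] = sampleNext(p[path[t - 1]], rng);
        }
        return path;
    }

    // Finds the single most probable path of 'length' states (length >= 1) from 'start'.
    static int[] mostProbablePath(double[][] p, int start, int length) {
        int n = p.length;
        double[][] best = new double[length][n]; // best[t][s]: highest probability of reaching s at step t
        int[][] previous = new int[length][n];   // previous[t][s]: predecessor on that best path
        best[0][start] = 1.0;
        for (int t = 1; t < length; t++) {
            for (int s = 0; s < n; s++) {
                for (int q = 0; q < n; q++) {
                    double candidate = best[t - 1][q] * p[q][s];
                    if (candidate > best[t][s]) {
                        best[t][s] = candidate;
                        previous[t][s] = q;
                    }
                }
            }
        }
        int end = 0;
        for (int s = 1; s < n; s++) {
            if (best[length - 1][s] > best[length - 1][end]) end = s;
        }
        int[] path = new int[length];
        path[length - 1] = end;
        for (int t = length - 1; t > 0; t--) {
            path[t - 1] = previous[t][path[t]];
        }
        return path;
    }
}
```

The most probable path is the same on every run for a given matrix, whereas the random walk can visit low-probability transitions. This is the source of the unanticipated note choices discussed above.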
He further suggests that electronic music is often criticized for being too sterile,
but stochastic techniques may be used to add a human element to computer
generated music. However, all of this relies on the subjective perception of
the listener and the music generated by stochastic processes may or may
not be found pleasing. This will be discussed further in section 2.3.2.3 which
looks at criticisms of Markov Chains.
Given their ease of use and efficient data storage, Markov Chains
have proven to be a useful tool both for the analysis and for the generation
(both pre-processed and real-time) of music. As is shown above, they have
been used in several successful projects, both academic and commercial.
2.3.2.3 Case against Markov Chains
While Markov chains have been used widely in music generation systems for
style modelling, they do have their drawbacks.
Firstly, they are limited to capturing only adjacent relationships between
events, in this case, notes. They cannot capture non-adjacent relationships
in a compact fashion (Chomsky 1957). Having an n-state model may help to
alleviate this problem to some extent as it defines the probability of note A
at time t being followed by note B at time n.
As Pachet (2002a) suggested, the style of improvised music is reflected in
recurring patterns found in the corpus. Chomsky (1957) found that Markov
Chains could not model recursive embedding of such phrase structure. This
can be seen as an expansion of a non-adjacent relationship between, in this
case, whole phrases where a particular phrase is repeated in different places
as part of the musician’s style. Again, this problem might be solved to an
extent by having larger n-state chains wherein relationships between certain
combinations of notes (which make up a phrase) over a larger span of time
are evaluated.
This poses another problem. Choosing a suitably sized Markov Chain to
model musical phrases is difficult. As Brooks et al (1957) found, at low
orders such as 2-state chains, the model failed to capture the style
effectively. At very high orders, the phrases were simply replicated with very
little original composition being generated. Thus, finding the optimal order is
difficult and may vary depending on the application it is being used for.
Constraint Logic programs using Markov Chains and other stochastic
processes are dependent on the ‘random walk’ mentioned earlier. While this
may lead to the selection of notes that have some probability of occurrence,
they may not necessarily select the optimal path to provide the most
statistically probable musical phrase (Franz 1998). It is possible for the
‘random walk’ to go through highly improbable note choices and result in the
composition of a highly improbable corpus. Another possible consequence is
that, at some stage, the program might arrive at a situation where
subsequently only low probability events are possible or that the distribution
at subsequent stages has high entropy (Conklin 2003).
Allan (2002) tested the effectiveness of the ‘random walk’ method and found
its output to be significantly lower than the optimal output.
A possible solution would be to create a ‘branch and bound’ algorithm, such
as Dijkstra’s Method, to search every possible corpus (within a specified
corpus length) and identify the optimal corpus beforehand. However, this
would not be feasible for real-time applications such as the Continuator as it
requires pre-processing time that would hinder performance speed when
playing simultaneously with the musician. However, here we can come back
to the highly subjective nature of music and its perception by different
listeners as mentioned earlier. Csikszentmihalyi (1988, p. 325) suggests that
it is difficult to differentiate “what is creative from what is statistically
improbable or bizarre”. This suggests an improbable piece may not
necessarily be aurally displeasing and may be perceived as creative. The
success of the ‘random walk’ would eventually depend upon the opinion of
the listener.
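The idea of identifying the most probable path beforehand can be illustrated with a minimal sketch. It is not the branch and bound search discussed above; it swaps in a simple dynamic programming pass over a first-order chain, and the class name, method name and transition matrix are all hypothetical values chosen for illustration. The example matrix is chosen so that always taking the locally most probable step (0 to 1) does not give the most probable three-note path.

```java
import java.util.Arrays;

/** Illustrative sketch only: most probable fixed-length path through a first-order chain. */
final class MostProbablePathSketch {

    /**
     * p[i][j] is the probability of moving from state i to state j.
     * Returns the sequence of `length` states, starting at `start`, whose
     * transition probabilities have the largest product.
     */
    static int[] mostProbablePath(double[][] p, int start, int length) {
        int n = p.length;
        double[][] score = new double[length][n]; // best log-probability of a path ending in each state
        int[][] previous = new int[length][n];    // back-pointers used to rebuild the best path
        for (double[] row : score) {
            Arrays.fill(row, Double.NEGATIVE_INFINITY);
        }
        score[0][start] = 0.0;
        for (int t = 1; t < length; t++) {
            for (int from = 0; from < n; from++) {
                if (score[t - 1][from] == Double.NEGATIVE_INFINITY) continue; // state not reachable
                for (int to = 0; to < n; to++) {
                    if (p[from][to] <= 0.0) continue; // impossible transition
                    double candidate = score[t - 1][from] + Math.log(p[from][to]);
                    if (candidate > score[t][to]) {
                        score[t][to] = candidate;
                        previous[t][to] = from;
                    }
                }
            }
        }
        // Pick the best final state, then walk the back-pointers to the start.
        int end = 0;
        for (int s = 1; s < n; s++) {
            if (score[length - 1][s] > score[length - 1][end]) end = s;
        }
        int[] path = new int[length];
        path[length - 1] = end;
        for (int t = length - 1; t > 0; t--) {
            path[t - 1] = previous[t][path[t]];
        }
        return path;
    }

    public static void main(String[] args) {
        // Hypothetical three-state chain; each row sums to 1.
        double[][] p = { {0.0, 0.6, 0.4}, {0.5, 0.0, 0.5}, {1.0, 0.0, 0.0} };
        System.out.println(Arrays.toString(mostProbablePath(p, 0, 3)));
    }
}
```

Because every step must be scored before the path is known, a pass of this kind suits pre-processing rather than the real-time setting described above.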
Overall, the literature suggests Markov chains used alone as a means for
generating original and creative pieces of music in a particular style are
ineffective. Instead, they are better used in conjunction with other
algorithms. This can be seen in the Continuator, which combines Markov
chains with Prefix Trees for quicker execution (Pachet 2002b). All in all,
Markov Chains are used at the base of highly successful programs and
therefore should not be dismissed as a usable system for style modelling.
They are currently the most widely utilized method of modelling and
implementing probability analysis (McAlpine 1999).
3. Proposed Solution
The proposed program works as follows:
The guitarist plays a series of musical phrases in a particular style. These are
transcribed into MIDI data and fed into the program. The phrases are
analyzed and Markov chains are created and populated as per the data in
the program. The program then performs a series of algorithmic
computations based on the input data and creates an output MIDI file with
an original phrase.
3.1 Using the strengths of Markov Chains
As has been shown above, Markov chains provide an efficient, fast and data
conservative method of documenting a musician’s style as shown by use in
the Continuator. Thus, Markov Chains were used to store the player’s
phrases and calculate probabilities of state transitions (going from one note
to another).
Choosing an optimal state size for the chain is another challenge. Low
orders such as 2-state chains will not capture the style effectively. High
orders will lead to replication with very little original composition being
generated (Brooks et al 1957). Given the shape of the guitar and the
positioning of the guitarist’s fingers to sound the notes, a guitar player’s
pattern of note choice is likely to reflect to some extent the limited flexibility
in movement from one finger position to another.
A guitarist uses four fingers on the fret board to alter the pitch of the string
being plucked. Having a four state Markov chain would show the probability
of moving from four previous notes to a fifth one. Analyzing how the
guitarist’s fingers have just moved might provide some insight into how they
are likely to move next. Hence a four state chain was used to store the
data.
The next problem was the selection of notes. Biles (1994) introduced the
idea of fitness of individuals in a population. The individuals here are music
notes (in a population of music notes) in the Markov Chain that have
probability > 0%, implying that they may be chosen as the next step in the
musical path. A method must exist for determining which of the possible
future states is to be chosen in a way that reflects their probability of
occurrence. That is, if B and C have 90% and 10% chances respectively of
following A, B should be chosen 9 out of ten times on average. Biles
suggests that if a ratio scale can be developed for fitness, then greater
sophistication may be used in note selection.
To achieve this sophistication, a ‘random walk’ was then taken through
the chains to form a path (or musical phrase) of a pre-specified length. At
each step in the path, a random number was generated and percentiles were
created based on the probabilities of possible future states in the Markov
Chain. The generated number was matched against these percentiles to
choose the next state. Using this method, on average, the choice of path
was weighted by probability. That is, if the current state was A, B was
chosen 9 out of 10 times on average.
3.2 Using the strengths of Autocatalytic Set Theory
The literature shows that Autocatalytic sets are only successful in creating
music a third of the time. It has been proposed earlier that a possible
shortcoming is the choice of a digestive enzyme. However, this might
actually be a strength of the theory. To see how, first the role of
recombinancy needs to be understood.
Cope (1999) suggests that recombining extant music in a particular style
into new logical successions creates stylistically similar music. Recombinancy
is a natural evolutionary and creative process. Autocatalytic Set Theory
essentially works by recombining extant phrases. The ‘logical succession’
here is only based on finding a matching enzyme when recombining two
strings.
It has been suggested earlier that an artist’s signature or stylistic trait
should be preserved in a piece of music to give a sense of the artist’s style.
As Cope (1999) suggested, signatures and recombination are opposites, as
signatures must preserve their entire structure to give a sense of the artist’s
style. It has also been suggested that frequent repetition of this signature
may reinforce this sense of the artist’s style.
The autocatalytic set program works by letting the digestive enzyme break
down a piece of music into fragments at sections of repetition of the enzyme
and recombine these fragments with each other. Thus, the enzyme will be
preserved in its entirety in the resulting corpus. If a signature trait is chosen
to act as the enzyme, it will be preserved in its entirety and repeated
frequently at points where strings have been recombined.
Thus, a few signature traits were identified and used as digestive
enzymes. The choice of a signature for an enzyme may be the strength of the
autocatalytic set, as it provides a mechanism for the natural preservation of
a stylistic trait that Markov Chains do not.
For the purpose of this project, mutation of the enzyme was ignored.
3.2.1 Addressing the shortcomings of Markov Chains
As the literature suggests, the style of improvised music is reflected in
recurring patterns found in the corpus. Chomsky (1957) found that Markov
Chains could not model recursive embedding of such phrase structure. This
can be seen as an expansion of a non-adjacent relationship between, in this
case, whole phrases where a particular phrase is repeated as part of the
musician’s style. This problem might be solved to an extent by having larger
n-state chains wherein relationships between certain combinations of notes
(which make up a phrase) over a larger span of time are evaluated.
However, autocatalytic sets allow the natural development of recursive
melodies. These melodies are the signatures that act as digestive enzymes
and recur in the piece upon recombination of strings (Iverson 1990).
Choosing a suitably sized Markov Chain to model musical phrases is difficult.
As Brooks et al (1957) found, low orders such as 2-state chains fail to
capture the style effectively whereas very high orders result in replication of
the original corpus with very little original composition being generated.
If a piece of music generated by the constraint logic process is then
subjected to fragmentation under the Autocatalytic Set Theory, strings that
might have been replicated due to choosing a high order will be broken down
and subsequently restructured into different combinations by autocatalysis.
4. Implementation – Stage 1
The program designed to analyze the input data was broken into two main
segments: Markov Chains and Autocatalytic sets. It was created using the
Java programming language. The creation of a program is not the primary
focus of this project. It was done to facilitate calculations and create an
output showing the result of the algorithm. The algorithm is the primary
focus of the project.
4.1 Markov Chains
The first part of the program constructs Markov Chains based on input data.
A flexible Markovian Analyzer was created to produce, evaluate, manipulate
and read from Markov chains. A UML diagram for the Markovian Analyzer is
shown in Figure 1 on the next page.
As MIDI music notes in Java are represented with numbers, the ‘states’ in
the Markov Chains are integers representing the music notes. Examples of
the chains created are shown in section 4.1.2.
4.1.1 The Markov Chain Program
The following diagram shows the classes (representing objects) that were
created in the program.
Fig 1. UML diagram of the Markovian Analyzer program
4.1.1.1 Diagram Explanation
The following is a brief explanation of the classes shown in the diagram and
how they interact with each other. For a more detailed explanation of each
of the classes, their methods and their variables, please refer to the Java
documentation section in appendix 1.
All the classes were placed in a program package called ‘markov’.
MarkovAnalyzer
The Markov analyzer is the main class that takes a raw data set and creates
transition states, Markov chains, transition probability matrices and probable
phrases. It also prints out the information sheets for any of these if needed.
TransitionState
A transition state is the probability of going from present state ‘A’ at time t
to future state 'B' at time t+1. In this class, the present state, future state
and number of occurrences of transitions between particular states are
recorded.
Note that all states are represented by integers. That is, a state can be "1",
"2", "10", etc., but for the purposes of the examples, states will be referred
to by letters such as "A", "B", etc.
MarkovChain
A Markov chain has a present state (of any order), one or more future states
and a recorded number of total occurrences.
It derives its future states from a list of several transition states. For
example, it will hold transition states for "A B" to "C", "A B" to "D", "A B" to
"A", etc.
The number of total occurrences is the cumulative total of all occurrences of
transition states within the Markov chain. For example, if the chain is made
up of 2 occurrences of "A B" to "C" and 5 occurrences of "A B" to "A", the
cumulative total occurrence is 7.
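The relationship between transition occurrences and the cumulative total can be sketched as follows. This is a minimal illustration and not the MarkovChain class from the program; the class name `ChainSketch` and its methods are hypothetical, and states are written as letters for readability.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of one chain: a single present state with occurrence counts per future state. */
final class ChainSketch {
    private final Map<String, Integer> occurrences = new LinkedHashMap<String, Integer>();
    private int total = 0;

    /** Records one observed transition from the present state to the given future state. */
    void record(String futureState) {
        Integer seen = occurrences.get(futureState);
        occurrences.put(futureState, seen == null ? 1 : seen + 1);
        total++;
    }

    /** Cumulative total of all transition occurrences held by this chain. */
    int totalOccurrences() {
        return total;
    }

    /** Transition probability: occurrences of this future state divided by the cumulative total. */
    double probabilityOf(String futureState) {
        Integer seen = occurrences.get(futureState);
        return (seen == null || total == 0) ? 0.0 : (double) seen / total;
    }

    public static void main(String[] args) {
        ChainSketch ab = new ChainSketch(); // stands for the present state "A B"
        ab.record("C");
        ab.record("C");
        for (int i = 0; i < 5; i++) {
            ab.record("A");
        }
        System.out.println(ab.totalOccurrences());  // 2 + 5 occurrences
        System.out.println(ab.probabilityOf("A"));  // 5 out of 7
    }
}
```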
TransitionProbabilitySet
A transition probability set is a collection of Markov chains of various orders.
Once collected, the chains can be analyzed and used to generate various
data.
ProbablePhrase
A probable phrase is a 'path' of states (in this case representing a musical
phrase) generated randomly based on Markovian probabilities of a set of
Markov chains. It is a path created through various states based on the
'random walk' method. As the states are represented by integers, the phrase
is just a set of integers, each representing the next step in the path.
A probable phrase extends a transition probability set. This means that it
inherits all the attributes and methods of transition probability sets and has
its own additional attributes and methods.
As such, it contains a set of member Markov chains. The random walk is
taken using the data from these chains.
‘MarkovAnalyzer’ was the main class in the program and the rest of the
classes represent objects that were created to facilitate programming. The
‘rawData’ is an array of music notes (integers) in the data set. An order for
the chains is set so that chains of that order and every lower order are
created. For example, for markovOrder 4, 4-state, 3-state, 2-state and 1-
state chains are created. Java documentation for all the classes showing
descriptions of the methods is provided in appendix 1.
The Markov analyzer was tested using dummy data and the produced chains
were compared against those constructed manually. They were found to be
accurate, indicating that the analyzer works as intended.
The following dummy data was used:
1, 2, 1, 1, 2, 1, 3, 1, 2, 9, 1, 3.
A markovOrder of 4 was set (i.e. 4-state chains and less) and the program
produced the following transition probability matrices (please note that the
following is an exact copy of the output of the program and has not been
formatted in any way):
1-state markov chains
| 1 | 2 | 3 | 9 |
1 | 0.17 | 0.5 | 0.33 | 0.0 |
2 | 0.67 | 0.0 | 0.0 | 0.33 |
3 | 1.0 | 0.0 | 0.0 | 0.0 |
9 | 1.0 | 0.0 | 0.0 | 0.0 |

2-state markov chains
| 1 | 2 | 3 | 9 |
1, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
1, 2 | 0.67 | 0.0 | 0.0 | 0.33 |
1, 3 | 1.0 | 0.0 | 0.0 | 0.0 |
2, 1 | 0.5 | 0.0 | 0.5 | 0.0 |
2, 9 | 1.0 | 0.0 | 0.0 | 0.0 |
3, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
9, 1 | 0.0 | 0.0 | 1.0 | 0.0 |

3-state markov chains
| 1 | 2 | 3 | 9 |
1, 1, 2 | 1.0 | 0.0 | 0.0 | 0.0 |
1, 2, 1 | 0.5 | 0.0 | 0.5 | 0.0 |
1, 2, 9 | 1.0 | 0.0 | 0.0 | 0.0 |
1, 3, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
2, 1, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
2, 1, 3 | 1.0 | 0.0 | 0.0 | 0.0 |
2, 9, 1 | 0.0 | 0.0 | 1.0 | 0.0 |
3, 1, 2 | 0.0 | 0.0 | 0.0 | 1.0 |

4-state markov chains
| 1 | 2 | 3 | 9 |
1, 1, 2, 1 | 0.0 | 0.0 | 1.0 | 0.0 |
1, 2, 1, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
1, 2, 1, 3 | 1.0 | 0.0 | 0.0 | 0.0 |
1, 2, 9, 1 | 0.0 | 0.0 | 1.0 | 0.0 |
1, 3, 1, 2 | 0.0 | 0.0 | 0.0 | 1.0 |
2, 1, 1, 2 | 1.0 | 0.0 | 0.0 | 0.0 |
2, 1, 3, 1 | 0.0 | 1.0 | 0.0 | 0.0 |
3, 1, 2, 9 | 1.0 | 0.0 | 0.0 | 0.0 |
In all the chains, the Y axis (left) shows the present state and the X axis
(top) shows the future states. Thus, for example, in the 4 state matrix, the
probability of going from present state ‘1121’ at time t to future state ‘3’ at
time t+1 is 1.0, or 100%. All the chains have been normalized. This means
that the sum of the probabilities of moving to the future states is 100%,
ensuring that there is absolute certainty that some future state will be
reached.
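The construction and normalization of these matrices can be sketched as below. This is a minimal illustration under stated assumptions, not the MarkovAnalyzer code itself: the class `TransitionMatrixSketch` and its methods are hypothetical, and present states are stored as comma-separated strings purely for readability. Unlike the printed matrices, it stores only the future states that actually occur and does not round the probabilities.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: counts transitions of a given order and normalizes each row. */
final class TransitionMatrixSketch {

    /** Joins `order` notes starting at `from` into a key such as "1, 2, 1". */
    static String key(int[] notes, int from, int order) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < from + order; i++) {
            if (i > from) sb.append(", ");
            sb.append(notes[i]);
        }
        return sb.toString();
    }

    /** Maps each present state to the normalized probabilities of its future states. */
    static Map<String, Map<Integer, Double>> build(int[] notes, int order) {
        // First pass: count how often each future state follows each present state.
        Map<String, Map<Integer, Integer>> counts = new TreeMap<String, Map<Integer, Integer>>();
        for (int i = 0; i + order < notes.length; i++) {
            String present = key(notes, i, order);
            Map<Integer, Integer> row = counts.get(present);
            if (row == null) {
                row = new TreeMap<Integer, Integer>();
                counts.put(present, row);
            }
            Integer seen = row.get(notes[i + order]);
            row.put(notes[i + order], seen == null ? 1 : seen + 1);
        }
        // Second pass: divide by the row total so that every row sums to 1.
        Map<String, Map<Integer, Double>> matrix = new TreeMap<String, Map<Integer, Double>>();
        for (Map.Entry<String, Map<Integer, Integer>> e : counts.entrySet()) {
            int total = 0;
            for (int c : e.getValue().values()) {
                total += c;
            }
            Map<Integer, Double> probabilities = new TreeMap<Integer, Double>();
            for (Map.Entry<Integer, Integer> f : e.getValue().entrySet()) {
                probabilities.put(f.getKey(), (double) f.getValue() / total);
            }
            matrix.put(e.getKey(), probabilities);
        }
        return matrix;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 1, 1, 2, 1, 3, 1, 2, 9, 1, 3}; // the dummy data used above
        for (int order = 1; order <= 4; order++) {
            System.out.println(order + "-state: " + build(data, order));
        }
    }
}
```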
In the main program, a ‘ProbablePhrase’ class has been created. This is an
array of integers that was created at random based on weighted Markov
probabilities of moving from a particular state to another. A ProbablePhrase
(named for a ‘phrase’ of music notes that has some probability of being
generated as per the Markovian probabilities) consists of a set of Markov
Chains, a phrase length and an arbitrary starting note set by the user. A
random walk is then taken (starting at the default start note) to generate
the music phrase. At each step, the walker looks for the largest order
Markov chain available based on the immediate preceding notes in the
phrase. For example, if the phrase thus far is 1, 2, 3, 1, 2, 9, the largest
chain in the phrase for the immediate preceding notes is a 4-state chain (3 1
2 9) based on the transition matrices shown above. If the phrase thus far is
1, 2, 9, 3, 1, 2, the walker will try to find a chain for 9 3 1 2. As none exists,
it will look for 3-state chains and will find one for 3 1 2.
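The search for the largest available chain can be sketched as a simple back-off from the highest order downwards. The class `LongestContextSketch` and its methods are hypothetical names used for illustration only, and the set of known present states is rebuilt here from the dummy data rather than taken from the program's own classes.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: finds the highest-order present state that matches the end of a phrase. */
final class LongestContextSketch {

    /** Joins `length` notes starting at `from` into a key such as "3, 1, 2". */
    static String key(int[] notes, int from, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < from + length; i++) {
            if (i > from) sb.append(", ");
            sb.append(notes[i]);
        }
        return sb.toString();
    }

    /** Collects every present state of order 1..maxOrder that is followed by at least one note. */
    static Set<String> knownContexts(int[] notes, int maxOrder) {
        Set<String> known = new HashSet<String>();
        for (int order = 1; order <= maxOrder; order++) {
            for (int i = 0; i + order < notes.length; i++) {
                known.add(key(notes, i, order));
            }
        }
        return known;
    }

    /** Returns the longest known present state that ends the phrase, or null if there is none. */
    static String longestContext(Set<String> known, int[] phrase, int maxOrder) {
        for (int order = Math.min(maxOrder, phrase.length); order >= 1; order--) {
            String candidate = key(phrase, phrase.length - order, order);
            if (known.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 1, 1, 2, 1, 3, 1, 2, 9, 1, 3};
        Set<String> known = knownContexts(data, 4);
        System.out.println(longestContext(known, new int[] {1, 2, 3, 1, 2, 9}, 4));
        System.out.println(longestContext(known, new int[] {1, 2, 9, 3, 1, 2}, 4));
    }
}
```

With the dummy data, the first call finds the 4-state chain for 3 1 2 9 and the second falls back to the 3-state chain for 3 1 2, matching the two cases described above.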
Once a chain is found, a random number is generated between 0 and
0.9999… and is matched against percentiles created out of the future states
of a chain. For example, consider the following chain:
| 1 | 2 | 3 | 9 |
1 | 0.17 | 0.5 | 0.33 | 0.0 |
The percentiles constructed would be as follows:
Future state | Lower Boundary | Upper Boundary | Explanation
1 | 0 | 0.17 | The default lower boundary for the first possible future state is set to 0 and the upper boundary is set to its transition probability.
2 | 0.17 | 0.67 | The lower boundary is set as the upper boundary of the previous percentile and the current upper boundary is set as the new lower boundary + the transition probability (i.e. 0.17+0.5 = 0.67)
3 | 0.67 | 0.99 | Same as above
9 | - | - | No matching transition state exists. This step has been ignored.
The random number is tested against the percentiles. If it is greater than or
equal to the lower boundary and less than the upper boundary of a particular
future state, that state is selected as the next step in the path. For example,
if the random number 0.3214 is generated, it will fit within the percentile for
future state ‘2’ and thus ‘2’ will be chosen. If the random number 0.67 is
chosen, it is equal to the lower boundary for future state ‘3’ and thus ‘3’ will
be chosen.
Thus,
chosenPercentileLowerBoundary <= randomNumber < chosenPercentileUpperBoundary
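The percentile test can be sketched as a running sum over the transition probabilities. This is an illustrative sketch only; the class `PercentileSketch`, the method `choose` and the fall-back to the last percentile (for the case where rounded probabilities sum to slightly less than 1) are assumptions and are not taken from the program.

```java
/** Hypothetical sketch of choosing the next state by matching a random number against percentiles. */
final class PercentileSketch {

    /**
     * Returns the future state whose percentile satisfies
     * lowerBoundary <= randomNumber < upperBoundary.
     */
    static int choose(int[] futureStates, double[] probabilities, double randomNumber) {
        double lower = 0.0;
        int lastPossible = -1;
        for (int i = 0; i < futureStates.length; i++) {
            if (probabilities[i] <= 0.0) {
                continue; // no matching transition state exists, so this step is ignored
            }
            double upper = lower + probabilities[i];
            if (randomNumber >= lower && randomNumber < upper) {
                return futureStates[i];
            }
            lower = upper; // the next lower boundary is the previous upper boundary
            lastPossible = i;
        }
        if (lastPossible < 0) {
            throw new IllegalArgumentException("no future state has a non-zero probability");
        }
        // Assumption: if rounding leaves a gap below 1, fall back to the last percentile.
        return futureStates[lastPossible];
    }

    public static void main(String[] args) {
        int[] futureStates = {1, 2, 3, 9};
        double[] probabilities = {0.17, 0.5, 0.33, 0.0};
        System.out.println(choose(futureStates, probabilities, 0.3214));
        System.out.println(choose(futureStates, probabilities, Math.random()));
    }
}
```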
4.1.2 The Markov Chain Program – With Music
Figure 2 on the next page shows the classes (representing objects) that
were created in phase two of the project.
The program was extended to incorporate MIDI functionality and output
musical notes. The structure and methods of the TransitionState,
MarkovChain and TransitionProbabilitySet classes remained the same.
Fig 2. UML diagram of the Markovian Analyzer program in phase 2
The following changes were made:
JFugue
The JFugue API developed by Koelle (2007) was used to give the program
MIDI capability. This API provided for simple access to Java’s MIDI
functionality without the need to set several parameters and settings. For
example, playing a C major scale could be done by creating a ‘music string’
as such:
“C D E F G A B”
and passing this ‘music string’ to JFugue’s player to play the notes in MIDI
(Koelle 2007).
MidiAnalyzer
The MidiAnalyzer class was created to receive a MIDI file specified by the
user and provide various methods to be used by the MarkovAuto class, such
as getting the notes played by the musician in a format usable by the
MarkovAnalyzer.
MarkovAnalyzer
This was no longer the main class. Hence the main() method was removed.
Instead, the program would now be run using the MarkovAuto class as the
main class.
MarkovAuto
This class was added and made the main class. At runtime, the user
specifies a MIDI file containing the dataset (a solo improvised by a
musician), a markovOrder (to be used by the MarkovAnalyzer) and a phrase
length for the phrase to be generated.
ProbablePhrase
A phraseToJFugueString() method was added that converted a phrase to a
‘music string’ compatible with the JFugue MIDI player. This made it easier to
play the phrase generated without having to work directly with the,
somewhat tedious, Java MIDI functionality.
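A conversion of this kind might look like the sketch below. It is not the program's phraseToJFugueString() method; the class and method names are hypothetical. It also assumes that a JFugue music string accepts a MIDI note number in square brackets followed by a duration character, with `i` denoting an eighth note, so the exact token format should be checked against the JFugue documentation for the version in use.

```java
/** Hypothetical sketch: turns a phrase of MIDI note numbers into a music string of eighth notes. */
final class MusicStringSketch {

    /** Builds tokens of the assumed form "[60]i", separated by single spaces. */
    static String toMusicString(int[] phrase) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < phrase.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append('[').append(phrase[i]).append("]i");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 60, 62 and 64 are used here as arbitrary example note numbers.
        System.out.println(toMusicString(new int[] {60, 62, 64}));
    }
}
```

The resulting string could then be handed to JFugue's player in the same way as the C major scale example given earlier.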
A package called NSD containing two classes, called NSDUtils and
NSDUtilsMIDI, that had been created previously for general use over several
projects was imported into this project and used by most of the classes in
the program. This package included general useful methods such as
rounding numbers to n decimal places, matching arrays of integers,
automatically printing the contents of vectors, etc. Descriptions of the
classes in this package are included in appendix 1.
For a more detailed explanation of each of the classes, their methods and
their variables, please refer to the Java documentation section in appendix
1. For a discussion of the results of phase two of the program, please refer
to chapter 5.
4.1.2.1 Limitations
The Markov analyzer program has a number of limitations to its functioning:
First of all, the program is completely monophonic. It does not recognize two
notes being played simultaneously. In the event that the musician plays two
notes simultaneously, the program will recognize these as two separate
notes played successively. The output of the program is also monophonic.
The program is intended primarily for guitarists. However, the program does
not provide for variables in guitar playing such as pitch bending, vibrato,
string bending, sliding, etc. As is discussed later in this section, the program
can easily be extended to provide for these, but they would require a large
data set for various reasons.
The dataset is read independently of the backing track. The MIDI file being
read should only contain the guitar solo and no other backing tracks. That is,
only the solo played by the musician is analyzed. Any chords or rhythms that
may be played by backing musicians in real life situations are not recorded
or analyzed in any way.
Consequently, the probable phrase is not influenced at any point by the
underlying chord structure or rhythm structure of the song. While a musician
in a real-life scenario might respond to anticipated chord changes or
rhythms in the song structure, the probable phrase does not. The analyzer
does not read the backing tracks in the MIDI file.
Thus, it is recommended that the musician play in a single key throughout
the passage so as to ensure that the improvised passage is not subjected to
random shifts into other keys as a result of overlap of notes between these
keys.
As an example of this, the C major scale is comprised of the following notes:
“C D E F G A B C”
and the D major scale is comprised of the following notes:
“D E F# G A B C# D”
There are several overlapping notes between the two scales. However, if the
D major scale is played over a musical sequence in the key of C major, the
dissonance caused by the F# and C# notes will usually sound displeasing.
Consider the following situation in which a musician improvises over a piece
of music that switches between these keys.
Notes: C E D F# A G G E C D F# D
Keys: C D C D
Of the several Markov chains that can be generated from the sequence
above, consider the following:
| D | E | F# |
C | 0.5 | 0.5 | 0.0 |
D | 0.0 | 0.0 | 1.0 |
These chains would suggest that a C note will be followed by a D note 5 out
of 10 times and that a D note will always be followed by an F# note.
Thus it is stochastically possible that the following phrase might be
generated:
Notes: C D F# A G (etc)
Keys: C D
The F# in the sequence will sound wrong when played in the key
of C. However, according to the Markov analyzer program, this progression
is stochastically correct.
As a side note, it is possible that a musician’s style of phrasing may
incorporate the use of out-of-scale notes in such a way that they will not
sound displeasing. This will be dealt with later in the results section (chapter
5) to see if the Markov analyzer program can generate such phrases.
Another limitation of the Markov analyzer program is that the
‘phraseToJFugueString()’ method in the ‘ProbablePhrase’ class generates all
eighth notes (i.e. 8 notes in every bar consisting of 4 counts, or 2 notes per
count). At the time of analysis of the musician’s solo, the analyzer does not
record the length of each note, but rather just the note itself. Therefore,
there is no context regarding the length of notes. This was left out, as it
would require a large data set to get a full understanding of how a musician
plays different notes at different lengths followed by other notes of other
lengths. For the sake of simplicity, the phrase generated is played as eighth
notes. This will be discussed further in chapter 6.
4.2 Autocatalytic Sets
The second segment of the program design phase dealt with the simulation
of Autocatalytic sets. It was created using the Java programming language.
Again, the creation of a program was not the primary focus of this project. It
was done to facilitate calculations and create an output showing the result of
the algorithm. The algorithm is the primary focus of the project. The
program acts as a scientific test of the success of the algorithm.
Autocatalytic sets are found in many different forms. The process of
autocatalysis occurs among catalytic peptides and catalytic RNA (Kauffman
1993). While it may differ in its form and conditions depending on the
polymer, the general idea behind autocatalysis is the same as the one
proposed in section 2.3.1.
To recapitulate, the idea worked in the following way:
A ‘pool’ of catalytic polymers exists. Initially, all the polymers are joined
together in a ‘sequence’. If each RNA cell is represented by an integer, the
string may look like this:
1 4 7 10 4 2 3 6 8 11 2 3 4 7 5 2 3 4 5 6 2
This RNA sequence exists in the fragment pool in the initial state. The
autocatalytic set also contains one or more 'digestive enzymes' (which are
RNA sequences too) that are reactive and can split a sequence into smaller
sub-sequences. They search RNA sequences for sections that match their
own content and split them around that point.
For example:
The following enzymes might exist in the set: (2 3) and (4 7). Assume the
enzyme (2 3) is active.
The enzyme searches the initial complete sequence for a subsequence that
matches its own sequence. Hence, the active enzyme (2 3) could split the
original sequence in any of the following places (matches are marked with
square brackets):
1 4 7 10 4 [2 3] 6 8 11 [2 3] 4 7 5 [2 3] 4 5 6 2
It randomly finds the second potential split point and splits the original
sequence into two sub-sequences. The fragment pool now looks like this:
1 4 7 10 4 2 3 6 8 11 2 - 3 4 7 5 2 3 4 5 6 2
Next, the (4 7) enzyme might become active. It can potentially split one of
the sub-sequences at one of the following points (matches are marked with
square brackets):
1 [4 7] 10 4 2 3 6 8 11 2 - 3 [4 7] 5 2 3 4 5 6 2
It randomly finds the match in the first sub-sequence and splits it. The
fragment pool now contains three sub-sequences:
1 4 - 7 10 4 2 3 6 8 11 2 - 3 4 7 5 2 3 4 5 6 2
The (2 3) enzyme might become active again, randomly finding a match in
the second sub-sequence and splitting it into two smaller sub-sequences
around the match, resulting in the fragment pool looking as such:
1 4 - 7 10 4 2 - 3 6 8 11 2 - 3 4 7 5 2 3 4 5 6 2
This process might continue indefinitely.
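The splitting step can be sketched as two small operations: finding every point where the enzyme matches, and cutting the fragment in the middle of one chosen match. The class `CleavageSketch` and its methods are hypothetical names for illustration, and the assumption that the first half of the enzyme stays with the left fragment follows the worked example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of an enzyme splitting a sequence around a match. */
final class CleavageSketch {

    /** Returns every index at which the enzyme occurs inside the fragment. */
    static List<Integer> matchPoints(int[] fragment, int[] enzyme) {
        List<Integer> points = new ArrayList<Integer>();
        for (int i = 0; i + enzyme.length <= fragment.length; i++) {
            boolean same = true;
            for (int j = 0; j < enzyme.length && same; j++) {
                same = fragment[i + j] == enzyme[j];
            }
            if (same) {
                points.add(i);
            }
        }
        return points;
    }

    /** Splits the fragment inside the enzyme match that starts at matchStart. */
    static int[][] split(int[] fragment, int matchStart, int enzymeLength) {
        // Assumption: the first half of the enzyme (rounded up) stays on the left-hand fragment.
        int cut = matchStart + (enzymeLength + 1) / 2;
        return new int[][] {
            Arrays.copyOfRange(fragment, 0, cut),
            Arrays.copyOfRange(fragment, cut, fragment.length)
        };
    }

    public static void main(String[] args) {
        int[] sequence = {1, 4, 7, 10, 4, 2, 3, 6, 8, 11, 2, 3, 4, 7, 5, 2, 3, 4, 5, 6, 2};
        int[] enzyme = {2, 3};
        List<Integer> points = matchPoints(sequence, enzyme);
        System.out.println("Potential split points: " + points);
        // Take the second potential split point, as in the worked example.
        int[][] parts = split(sequence, points.get(1), enzyme.length);
        System.out.println(Arrays.toString(parts[0]) + " - " + Arrays.toString(parts[1]));
    }
}
```

In the program the match is chosen at random; the sketch fixes the choice so that the result can be compared with the example.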
The digestive enzymes can also facilitate ligase reactions that
result in the joining of two sub-sequences. This process works in reverse to
the cleavage process described above.
For example, consider the fragment pool status after the autocatalyzation
process is complete.
1 4 - 7 10 4 2 - 3 6 8 11 2 - 3 4 7 5 2 3 4 5 6 2
If the (2 3) enzyme became active again, it could potentially rejoin any
two of the sub-sequences at the following points (the two halves of the
enzyme are marked with square brackets):
7 10 4 [2] and [3] 6 8 11 2
or
7 10 4 [2] and [3] 4 7 5 2 3 4 5 6 2
or
3 6 8 11 [2] and [3] 4 7 5 2 3 4 5 6 2
or
3 4 7 5 2 3 4 5 6 [2] and [3] 6 8 11 2
As the example shows, any two RNA fragments that can be matched
around the digestive enzyme might be chosen, each at random.
Assuming it joins the second sub-sequence in the fragment pool with the
fourth one, the pool will look like this:
1 4 - 7 10 4 2 3 4 7 5 2 3 4 5 6 2 - 3 6 8 11 2
Using the (4 7) and (2 3) enzymes, a series of further ligase reactions
could potentially result in the following sequence:
1 4 7 10 4 2 3 4 7 5 2 3 4 5 6 2 3 6 8 11 2
Compare this to the initial sequence:
1 4 7 10 4 2 3 6 8 11 2 3 4 7 5 2 3 4 5 6 2
We can see that the resulting sequence is different to the original. It has
undergone recombination while still preserving the digestive enzymes.
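The joining step can be sketched in the same spirit. Two fragments can be joined when the left one ends with the first half of the enzyme and the right one begins with the second half. The class `LigationSketch` and its methods are hypothetical, and the sequence of joins in `main` is fixed by hand rather than chosen at random so that it reproduces the recombined sequence shown above.

```java
import java.util.Arrays;

/** Hypothetical sketch of an enzyme rejoining two fragments. */
final class LigationSketch {

    /** True if `left` ends with the enzyme's first half and `right` starts with its second half. */
    static boolean canJoin(int[] left, int[] right, int[] enzyme) {
        int head = (enzyme.length + 1) / 2; // first half, larger when the length is odd
        int tail = enzyme.length - head;
        if (left.length < head || right.length < tail) {
            return false;
        }
        for (int i = 0; i < head; i++) {
            if (left[left.length - head + i] != enzyme[i]) return false;
        }
        for (int i = 0; i < tail; i++) {
            if (right[i] != enzyme[head + i]) return false;
        }
        return true;
    }

    /** Concatenates the two fragments, restoring the enzyme across the join. */
    static int[] join(int[] left, int[] right) {
        int[] joined = Arrays.copyOf(left, left.length + right.length);
        System.arraycopy(right, 0, joined, left.length, right.length);
        return joined;
    }

    public static void main(String[] args) {
        // The fragment pool after the three splits in the worked example.
        int[] a = {1, 4};
        int[] b = {7, 10, 4, 2};
        int[] c = {3, 6, 8, 11, 2};
        int[] d = {3, 4, 7, 5, 2, 3, 4, 5, 6, 2};
        int[] bd = join(b, d);      // (2 3) joins the second and fourth fragments
        int[] abd = join(a, bd);    // (4 7) joins the first fragment to the result
        int[] result = join(abd, c); // (2 3) joins the remaining fragment
        System.out.println(Arrays.toString(result));
    }
}
```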
4.2.1 Theory behind an Autocatalytic Set Program
RNA sequences in the autocatalytic set may be compared to a sequence of
music notes. As such, a ‘digestive enzyme’ may be chosen to break up this
sequence of notes and subsequently recombine them to arrive at a new
sequence.
It was suggested in section 3.2 that the ‘digestive enzymes’ chosen to
perform the reactions should be the guitarist’s signature phrases. This is
because digestive enzymes are preserved in their entirety in catalytic and
ligase reactions. This preservation of signatures might give the resulting
corpus more of a feel of the original style while still having an original
recombination of music notes.
This raises four questions:
1. How should the signatures be chosen?
2. How many signatures should be chosen?
3. How long should the signatures be?
4. How many reactions should take place?
To answer the first question, some of Cope’s (1992) work was referenced for
ideas. He suggested that signatures could be mathematical relationships
between notes rather than sequences of particular notes. Consider the
chromatic (or half-step) scale. There are 12 chromatic notes in standard
western music:
C C# D D# E F F# G G# A A# B
In MIDI data, a number represents every note. These numbers rise in
ascending order corresponding with these notes. Thus, the following
numbers correspond to the following notes:
Notes: C C# D D# E F F# G G# A A# B
No: 0 1 2 3 4 5 6 7 8 9 10 11
The numbers increase with the octaves (e.g. C1 is 12, C#1 is 13, etc).
According to Cope, mathematical patterns between notes in a sequence
could be identified. For example, consider the following sequence:
C D G F G D B C#1
The mathematical relationship could be defined as the number of chromatic
steps note ‘N’ is away from note ‘N-1’. Therefore, for the subsequence C, D,
G, the note relationship would be 0, +2, +5. For D, G, F it would be 0, +5,
-2, and so on.
Upon gathering information about the note distance relationships for every
set of notes in the sequence, a signature dictionary can be compiled. The
signatures that occur most often would be considered the best ones.
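A signature dictionary of this kind can be sketched as follows. The class `IntervalSignatureSketch` and its methods are hypothetical, the signature length of three notes is an assumption taken from the examples above, and notes are given as the numbers from the table (C = 0, C# = 1, and so on).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: describes note groups by chromatic distances and counts each pattern. */
final class IntervalSignatureSketch {

    /** Describes `length` notes starting at `from` by their distances, e.g. "0 +2 +5". */
    static String signature(int[] notes, int from, int length) {
        StringBuilder sb = new StringBuilder("0"); // the first note is the reference point
        for (int i = from + 1; i < from + length; i++) {
            int step = notes[i] - notes[i - 1];
            sb.append(step >= 0 ? " +" : " ").append(step);
        }
        return sb.toString();
    }

    /** Counts how often each signature of the given length occurs in the sequence. */
    static Map<String, Integer> dictionary(int[] notes, int length) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (int i = 0; i + length <= notes.length; i++) {
            String s = signature(notes, i, length);
            Integer seen = counts.get(s);
            counts.put(s, seen == null ? 1 : seen + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] sequence = {0, 2, 7, 5, 7, 2, 11, 13}; // C D G F G D B C#1
        System.out.println(dictionary(sequence, 3));
    }
}
```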
Applying this conceptual idea of a signature (rather than a definite sequence
of notes) to an autocatalytic set, however, may not be advisable. Kauffman
suggests that “in order for reactions to occur effectively, the reactants must
be confined to a sufficiently small volume” (1993, p. 298). This goes, in
part, towards answering the question of how many signatures should be
chosen. The number of digestive enzymes is best kept low. If the digestive
enzyme were a mathematical relationship, C D D#, A B C1 and E F# G would all
fit the same signature (0 +2 +1). If all of them exist in the corpus, they will all
be used eventually over a large enough period of time, breaking the original
sequence into tiny fragments.
This also has implications for the questions about how large a signature
should be and how many reactions should be allowed to take place. This
leads to a difficult situation when choosing enzymes and will be discussed
further in chapter 6.
For the sake of simplicity, it was decided that a particular sequence of notes
would be chosen to act as an enzyme. It is plausible to suggest that the
most commonly occurring sequence of notes could be considered a signature
phrase of the player. Hence it was decided that a few (two or three) of the
most commonly occurring phrases would be chosen as enzymes.
The Markov analyzer already provided a means for identifying this. Each
Markov chain has a number of total occurrences that could be used to
identify the most commonly occurring phrase.
It was decided that the phrase of the highest order chain and that of the
next highest order chain (as long as it is greater than 1) would be used as
enzymes.
4.2.2 Simulating Autocatalysis
It was decided that the user would specify the number of enzymes to be
chosen. Note that the user also specifies the order of the Markov chains. The
program then finds as many Markov chains of the specified order as are
available up to the specified enzyme number. Their present states are
extracted and used as enzymes.
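A minimal sketch of this selection step is given below. It is an assumption-laden illustration rather than the program's actual code: the class `EnzymeChooser` and its methods are hypothetical, and for simplicity it counts every contiguous phrase of the given order instead of using the Markov analyzer's own chain objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class EnzymeChooser {

    // Counts every contiguous phrase of `order` notes in the corpus.
    static Map<List<Integer>, Integer> countPhrases(int[] corpus, int order) {
        Map<List<Integer>, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i + order <= corpus.length; i++) {
            List<Integer> phrase = new ArrayList<>();
            for (int j = i; j < i + order; j++) phrase.add(corpus[j]);
            counts.merge(phrase, 1, Integer::sum);
        }
        return counts;
    }

    // Returns up to `maxEnzymes` phrases, most frequent first. If fewer
    // phrases exist, only as many as are available are returned.
    static List<List<Integer>> chooseEnzymes(int[] corpus, int order, int maxEnzymes) {
        Map<List<Integer>, Integer> counts = countPhrases(corpus, order);
        List<List<Integer>> phrases = new ArrayList<>(counts.keySet());
        // Stable sort: ties keep their order of first appearance.
        phrases.sort((a, b) -> counts.get(b) - counts.get(a));
        return phrases.subList(0, Math.min(maxEnzymes, phrases.size()));
    }
}
```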
Figure 3 on the next page shows the classes that were created to represent
an autocatalytic set.
Fig 3. UML diagram of the autocatalytic set program in phase 1
MarkovAuto
This class is the same as the one developed for phase 2 of the Markov
analyzer program.
DigestiveEnzyme
This is a 'reactive' array of integers in a particular sequence that can split
or rejoin larger RNA fragments that contain a subsequence matching
this enzyme’s structure.
The digestive enzyme has a head (its first half) and a tail (its second half).
In the case of enzymes with odd lengths, the head will be larger than the
tail. During the process of autocatalysis, when a sub-sequence is split into
two, the cut is made inside the matching enzyme portion, between the
enzyme’s head and its tail.
For example, consider a digestive enzyme of (2 1 3) in the following
sequence:
1 4 2 1 3 6 5 2
The sequence will be split around the matching enzyme sub-sequence as
follows:
1 4 2 1 - 3 6 5 2
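The split shown above can be reproduced with a short sketch. The class `EnzymeSplit` and its methods are hypothetical names used only for illustration. The sketch assumes enzymes of length two or more and splits at the first match only.

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only.
class EnzymeSplit {

    // Index of the first occurrence of `enzyme` in `seq`, or -1 if absent.
    static int indexOf(int[] seq, int[] enzyme) {
        for (int i = 0; i + enzyme.length <= seq.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(seq, i, i + enzyme.length), enzyme)) return i;
        }
        return -1;
    }

    // Splits `seq` between the enzyme's head and tail. For odd lengths the
    // head gets the extra element. Returns null if the enzyme does not match.
    static int[][] split(int[] seq, int[] enzyme) {
        int at = indexOf(seq, enzyme);
        if (at < 0) return null;
        int headLength = (enzyme.length + 1) / 2;
        int cut = at + headLength;
        return new int[][] {
            Arrays.copyOfRange(seq, 0, cut),
            Arrays.copyOfRange(seq, cut, seq.length)
        };
    }

    public static void main(String[] args) {
        int[][] parts = split(new int[] {1, 4, 2, 1, 3, 6, 5, 2}, new int[] {2, 1, 3});
        System.out.println(Arrays.toString(parts[0]) + " - " + Arrays.toString(parts[1]));
    }
}
```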
AutocatalyticSet
This represents a fragment pool containing an RNA sequence (just a
sequence of integers) and has a list of digestive enzymes.
The process of autocatalyzation was programmed as per the description
above. The user specifies a number of reactions and enzymes are chosen at
random to find one of the many matching points within the RNA sequence
(and the sub-sequences that are created as a result of catalyzation). The
enzymes continue to break up the sequence and the number of reactions is
counted. Reactions continue to take place either until no more can occur
(the digestive enzymes have split up the sub-sequences in all possible
places) or until the desired reaction count is reached.
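The reaction loop described above might look roughly like the following sketch. It is not the program's actual code: the class `Autocatalysis` is hypothetical, enzymes are assumed to have at least two elements, and one simplification is made, namely that a single (enzyme, matching point) pair is drawn uniformly at random from all current matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, for illustration only.
class Autocatalysis {

    // True if `enzyme` occurs in `frag` starting at index `at`.
    static boolean matchesAt(List<Integer> frag, int[] enzyme, int at) {
        if (at + enzyme.length > frag.size()) return false;
        for (int k = 0; k < enzyme.length; k++) {
            if (frag.get(at + k) != enzyme[k]) return false;
        }
        return true;
    }

    // Splits fragments at randomly chosen enzyme matches until no match is
    // left or `maxReactions` splits have happened. Fragments stay in their
    // original left-to-right order in the returned pool.
    static List<List<Integer>> catalyze(List<Integer> rna, int[][] enzymes,
                                        int maxReactions, Random random) {
        List<List<Integer>> pool = new ArrayList<>();
        pool.add(new ArrayList<>(rna));
        for (int reactions = 0; reactions < maxReactions; reactions++) {
            // Each candidate is {fragment index, cut position inside that fragment}.
            List<int[]> candidates = new ArrayList<>();
            for (int f = 0; f < pool.size(); f++) {
                for (int[] enzyme : enzymes) {
                    int head = (enzyme.length + 1) / 2;
                    for (int at = 0; at < pool.get(f).size(); at++) {
                        if (matchesAt(pool.get(f), enzyme, at)) {
                            candidates.add(new int[] {f, at + head});
                        }
                    }
                }
            }
            if (candidates.isEmpty()) break; // nothing left to split
            int[] pick = candidates.get(random.nextInt(candidates.size()));
            List<Integer> whole = pool.remove(pick[0]);
            pool.add(pick[0], new ArrayList<>(whole.subList(0, pick[1])));
            pool.add(pick[0] + 1, new ArrayList<>(whole.subList(pick[1], whole.size())));
        }
        return pool;
    }
}
```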
The process of ligation (or reverse catalyzation) had to depart from the
polymer concept of ligation. The intention of this program is to arrive at one
final, whole sequence with the same length as the original one. In real life,
the processes of autocatalyzation and ligation occur randomly until some
goal is reached. This goal may not necessarily be the creation of a single
sequence. However, it is entirely possible that, if only ligation took place
without any autocatalyzation, one or more series of ligation reactions could
eventually result in the creation of one large sequence by chance.
Thus, the reverseCatalyze method actually seeks to find all possible series of
reactions that result in a whole final RNA sequence. Depending on which
enzymes react and which sub-sequences they join together, the results in
these cases are most likely to be different to each other.
In the end, one of the possible paths is chosen at random to act as the
series of ligation reactions that take place to result in one large sequence.
This sequence will then be used as the musical sequence to be played.
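One way to enumerate every series of joins that ends in a single whole sequence is a depth-first search over fragment orderings, sketched below. This is a hedged illustration, not the reverseCatalyze method itself: the class `ReverseCatalysis` and its methods are hypothetical, and it assumes that two fragments may be joined only when the join recreates an enzyme (the left fragment ends with the enzyme's head and the right fragment starts with its tail).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, for illustration only.
class ReverseCatalysis {

    // True if joining `left` and `right` recreates `enzyme` across the join.
    static boolean canJoin(List<Integer> left, List<Integer> right, int[] enzyme) {
        int head = (enzyme.length + 1) / 2;
        int tail = enzyme.length - head;
        if (left.size() < head || right.size() < tail) return false;
        for (int k = 0; k < head; k++) {
            if (left.get(left.size() - head + k) != enzyme[k]) return false;
        }
        for (int k = 0; k < tail; k++) {
            if (right.get(k) != enzyme[head + k]) return false;
        }
        return true;
    }

    // True if at least one enzyme can catalyze the join.
    static boolean canJoin(List<Integer> left, List<Integer> right, int[][] enzymes) {
        for (int[] enzyme : enzymes) {
            if (canJoin(left, right, enzyme)) return true;
        }
        return false;
    }

    // Depth-first search over fragment orderings in which every join is
    // catalyzed by some enzyme. Each complete ordering is flattened and stored.
    static void search(List<List<Integer>> pool, int[][] enzymes, boolean[] used,
                       List<List<Integer>> chain, List<List<Integer>> results) {
        if (chain.size() == pool.size()) {
            List<Integer> whole = new ArrayList<>();
            for (List<Integer> fragment : chain) whole.addAll(fragment);
            results.add(whole);
            return;
        }
        for (int i = 0; i < pool.size(); i++) {
            if (used[i]) continue;
            if (!chain.isEmpty()
                    && !canJoin(chain.get(chain.size() - 1), pool.get(i), enzymes)) continue;
            used[i] = true;
            chain.add(pool.get(i));
            search(pool, enzymes, used, chain, results);
            chain.remove(chain.size() - 1);
            used[i] = false;
        }
    }

    static List<List<Integer>> allWholeSequences(List<List<Integer>> pool, int[][] enzymes) {
        List<List<Integer>> results = new ArrayList<>();
        search(pool, enzymes, new boolean[pool.size()], new ArrayList<>(), results);
        return results;
    }

    // Chooses one of the whole sequences at random.
    static List<Integer> pickOne(List<List<Integer>> results, Random random) {
        return results.get(random.nextInt(results.size()));
    }
}
```

Because every valid ordering is kept in memory before one is picked, a search of this kind can become expensive as the number of fragments grows.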
Again, these classes, their variables and their methods are explained in
greater depth and detail in the Java documentation section in appendix 1.
The autocatalytic set program was tested with several sample data sets and
was found to provide algorithmically correct results.
4.3 Combining the Two
In the final stage, some minor changes were made to the programs and they
were then combined.
In the ‘markov’ package, the TransitionState and MarkovChain classes were
changed slightly. A new class called RecurringPhrase was created to
represent a ‘phrase’ of integers that may be found in more than one place
within a corpus. The number of occurrences of each recurring phrase is
stored.
A transition state is effectively one such recurring phrase that goes to a
particular future state. The number of times it does this in the corpus (or the
total occurrences of this transition state) is stored for each one.
Similarly, a Markov chain is a recurring phrase that randomly goes to one of
many future states. It also stores the total number of occurrences of this
recurring phrase (regardless of which future state it is going to each time).
Hence, to facilitate further programming, the two were made subclasses of
the ‘RecurringPhrase’ class. The ‘present state’ of a transition state or a
Markov chain is the ‘phrase’ that recurs.
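The relationship between the three classes can be sketched as follows. Only the class names RecurringPhrase, TransitionState and MarkovChain come from the program described here. All fields and methods in the sketch are assumptions made for illustration. The actual members are documented in appendix 1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A phrase of integers and how many times it occurs in the corpus.
class RecurringPhrase {
    protected final int[] phrase;      // the 'present state'
    protected int totalOccurrences;

    RecurringPhrase(int[] phrase) {
        this.phrase = phrase.clone();
    }

    int[] getPhrase() { return phrase.clone(); }
    int getTotalOccurrences() { return totalOccurrences; }
    void recordOccurrence() { totalOccurrences++; }
}

// A recurring phrase that leads to one particular future state.
class TransitionState extends RecurringPhrase {
    private final int futureState;

    TransitionState(int[] phrase, int futureState) {
        super(phrase);
        this.futureState = futureState;
    }

    int getFutureState() { return futureState; }
}

// A recurring phrase together with all of the future states it can lead to.
class MarkovChain extends RecurringPhrase {
    private final Map<Integer, TransitionState> transitions = new LinkedHashMap<>();

    MarkovChain(int[] phrase) { super(phrase); }

    // Records one observation of this phrase being followed by `futureState`.
    void observe(int futureState) {
        recordOccurrence();
        transitions.computeIfAbsent(futureState, f -> new TransitionState(phrase, f))
                   .recordOccurrence();
    }

    // Relative frequency of `futureState` among all observations of this phrase.
    double probabilityOf(int futureState) {
        TransitionState t = transitions.get(futureState);
        return t == null ? 0.0 : (double) t.getTotalOccurrences() / totalOccurrences;
    }
}
```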
Fig 3. UML diagram of the complete program
For a clearer depiction of how the classes are arranged in their respective
packages, please refer to Figure 4 in Appendix 1.
The functioning of the main method in the MarkovAuto class was also
defined. It completes the combined analyses and uses the Markovian
processes along with autocatalysis and reverse catalysis to arrive at a newly
created corpus of music.
At runtime, the user specifies the name of the MIDI file containing the guitar
solo, the highest Markov order to be used, the number of enzymes to be
used and the length of the output phrase.
The main method works in the following way:
1. Using a MIDI analyzer, it extracts the notes of the solo from the MIDI file
and returns them in integer form (i.e. the integer values that MIDI
assigns to music notes).
2. A Markov analyzer is created to process this data. This process has
already been described in section 4.1. A probable phrase is generated,
which is a new guitar solo based on the probabilistic characteristics of the
original corpus.
3. The sequence of integers that makes up the probable phrase is then sent
to an autocatalytic set. To choose digestive enzymes, the program then
extracts the most frequently occurring Markov chains using the Markov
analyzer. Only chains of the order specified by the user at runtime are
chosen. If there are not enough chains of that order as per the user’s
specification, only as many as are available will be made into enzymes.
4. The set undergoes autocatalysis and reverse catalysis to result in a new
corpus in which the signature styles of the player (used as digestive
enzymes) are preserved but the rest of the corpus has been recombined
to some extent.
5. Finally, it saves this new corpus to a MIDI file.
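Step 1 of the list above could be approximated with the standard javax.sound.midi API, as in the sketch below. The dissertation's own MIDI analyzer is not reproduced here, so the class `MidiNoteExtractor` and its method are hypothetical, and the choice of javax.sound.midi is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.sound.midi.MidiMessage;
import javax.sound.midi.Sequence;
import javax.sound.midi.ShortMessage;
import javax.sound.midi.Track;

// Hypothetical sketch, for illustration only.
class MidiNoteExtractor {

    // Collects the note number of every NOTE_ON event with non-zero velocity,
    // track by track, in the order the events are stored.
    static List<Integer> extractNotes(Sequence sequence) {
        List<Integer> notes = new ArrayList<>();
        for (Track track : sequence.getTracks()) {
            for (int i = 0; i < track.size(); i++) {
                MidiMessage message = track.get(i).getMessage();
                if (message instanceof ShortMessage) {
                    ShortMessage sm = (ShortMessage) message;
                    // A NOTE_ON with velocity 0 is conventionally a note-off.
                    if (sm.getCommand() == ShortMessage.NOTE_ON && sm.getData2() > 0) {
                        notes.add(sm.getData1());
                    }
                }
            }
        }
        return notes;
    }
}
```

A Sequence for a file on disk can be obtained with `MidiSystem.getSequence(new File(name))`. The resulting list of integers would then be passed on to the Markov analyzer in step 2.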
5. Results
This chapter deals with the output of the program at various stages in its
development. In each section, a development phase of the program (as
discussed throughout chapter 4) is analyzed. Its limitations are specified and
the output is analyzed and discussed.
Note: The evaluations and opinions expressed in this section are subjective
and reflect the perception of the author. To gauge the success of the program more
accurately, much larger tests would have to be conducted using a large
sample of musicians, ideally guitarists.
5.1 Markov Chains
This section deals with the output of the Markov analyzer program at various
stages in its development.
5.1.1 The Markov Chain Program – With MIDI
The basic Markov analyzer discussed in section 4.1.1 was extended to have
MIDI functionality. It was tested against two datasets containing improvised
solos played by the author.
In general, it was found that when the Markov chain order was set above 4,
there was a large amount of replication of the original dataset and little
creativity in the generated phrase. Large sections of the original data set
were simply repeated. Markov orders of 2 or 3 yielded better results and
maintained some degree of creativity. A Markov order of 1 resulted in a
phrase that bore little semblance to the original dataset.
The following is a transcription of a solo saved as dataset3.mid (all notes are
eighth notes):
D3, E3, F3, G3, G#3, A3, B3, C4, D4, E4, F4, G4, G#4, A4, C5, D5, F5, E5, D5, C5, E5, D5, C5, D5, C5, A4, G#4, A4, G#4, G4, F4, D4, F4, D4, C4, A3, C4, A3, G#3, G3, F3, E3, D3, C#3, D3, G#3, G3, F3, D3, A3, F3, G3, A3, B3, C4, D4, E4, D4, E4, F4, G4, A4, B4, G4, C5, G4, D5, F4, A4, C5, A4, D5, F5, D5, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4, D4, F4, G#4, G4, F4, E4, D4, G4, A4, B4, C5, C5, D5, D5, F5, G5, F5, D5
The following phrase was generated at Markov order 3:
D3, C#3, D3, G#3, G3, F3, D3, A3, F3, G3, A3, B3, C4, D4, E4, F4, G4, G#4, A4, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4, D4, F4, G#4, G4, F4, D4, F4, G#4, G4
It has some mix of the original characteristics but does not result in
repetitions of large sections of it. It sounds like a fair representation of the
player’s style without a lot of repetition.
The following phrase was generated at Markov order 4:
D3, G#3, G3, F3, D3, A3, F3, G3, A3, B3, C4, D4, E4, F4, G4, A4, B4, G4, C5, G4, D5, F4, A4, C5, A4, D5, F5, D5, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4
Here there is substantial repetition of the original piece with little original construction.
A much longer solo incorporating the ‘Dorian flavour’ a lot more was played in dataset1.mid:
D3, D3, A3, D4, C3, A2, D3, A3, D4, C3, D3, D3, A3, D4, F3, G3, D3, A3, D4, G#3, A3, D3, A3, D4, C4, G3, D3, A3, D4, G#3, A3, D3, A3, D4, D4, A3, D3, A3, D4, D3, F3, D3, A3, D4, G3, A3, D3, A3, D4, B3, C4, D3, A3, D4, D4, E4, D3, A3, D4, D4, E4, D3, A3, D4, F4, G4, D3, A3, D4, A4, B4, D3, A3, D4, G4, C5, D3, A3, D4, G4, D5, D3, A3, D4, F4, F#4, D3, A3, D4, G4, A4, D3, A3, D4, G#4, G4, D3, A3, D4, F#4, F4, D3, A3, D4, C5, F4, D3, A3, D4, B4, A#4, D3, A3, D4, A4, G#4, D3, A3, D4, G4, F#4, D3, A3, D4, F4, F5, D3, A3, D4, F4, E5, D3, A3, D4, D#5, D5, D3, A3, D4, C5, B4, D3, A3, D4, A#4, A4, D3, A3, D4, G4, F#4, D3, A3, D4, D4, C4, D3, A3, D4, A3, D3, D3, A3, D4, E3, F3, D3, A3, D4, G3, G#3, D3, A3, D4, A3, B3, D3, A3, D4, C4, D4, D3, A3, D4, E4, F4, D3, A3, D4, G4, G#4, D3, A3, D4, A4, C5, D3, A3, D4, D5, F5, D3, A3, D4, E5, D5, D3, A3, D4, C5, E5, D3, A3, D4, D5, C5, D3, A3, D4, D5, C5, D3, A3, D4, A4, G#4, D3, A3, D4, A4, G#4, D3, A3, D4, G4, F4, D3, A3, D4, D4, F4, D3, A3, D4, D4, C4, D3, A3, D4, A3, C4, D3, A3, D4, A3, G#3, D3, A3, D4, G3, F3, D3, A3, D4, E3, D3, D3, A3, D4, C#3, D3, D3, A3, D4, G#3, G3, D3, A3, D4, F3, D3
Here the improvisations generated by the computer both at Markov order 3
and 2 were generally found to have roughly the same pleasing balance
between originality and trueness to the original style. Here is one
generated at Markov order 2:
D3, C#3, D3, G#3, G3, F3, D3, E3, F3, G3, A3, B3, C4, D4, E4, F4, G4, A4, G#4, G4, F#4, F4, F5, F4, E5, D#5, D5, C5, D5, F5, E5, D5, C5, E5, D5, C5, B4, A#4, A4, G4, F#4, F4, C5, F4, B4, A#4, A4, G#4, A4, C5, D5, F5, E5, D5, C5, E5, D5, C5, A4, G#4, G4, F4, D4, F4, D4, F4, D4, F4, D4, F4, D4, F4, D4, F4, D4, F4, D4, F4, D4, C4, A3, C4, G3, G#3, A3, D4, A3, D3, F3, G3, G#3, A3, B3, C4, D4, E4, F4, G4, A4, B4
Through similar datasets, it was generally found that Markov orders 2 and 3
resulted in the best improvisations (3 generally being better than 2). With a
large enough data set in which the musician has had time to express herself,
the output at Markov order 3 will yield the best results.
It was hypothesized in section 3.1 that 4 would be the optimal Markov order
for capturing a guitarist’s style as the guitarist uses four fingers to play
notes and her style might be reflected in the limited movement of the
fingers across notes.
It was found that 4-order Markov chains captured the guitarist’s style almost
to the point of replication of the original piece. This does mean that it is a
good method for capturing style, but it does not capture only the essence of
the style. Markov orders 2 and 3 capture the essence without replicating
entire large phrases.
However, upon closer inspection, it was found that the style of the solos in
the datasets was based on guitar patterns that used 2 or 3 notes per string
and broke down sub phrases into groups of 3, i.e. the guitarist would put
together several tiny phrases of, on average, 3 notes each.
To further test this hypothesis, a section of a neo-classical guitar piece by
Paul Gilbert called ‘Scarified’ was analyzed. This piece involves a lot of
movement of all four fingers with large jumps across the guitar fret-board,
thereby giving a balanced dataset that requires the guitarist to increase the
limits of her movement across the fret-board. Here is a transcription of the
main melody:
C#6, A5, G#5, A5, C#6, A5, F#6, C#6, A5, B5, C#6, D6, C#6, F#6, C#6, A5, F#6, D6, C#6, D6, F#6, D6, B6, F#6, D6, E6, F#6, G6, F#6, B6, F#6, D6, B5, G#5, F#5, G#5, B5, G#5, E6, B5, G#6, E6, G#6, E6, B5, E6, B5, G#5, E6, C#6, B5, C#6, E6, C#6, A6, E6, C#7, A6, C#7, A6, E6, A6, E6, C#6, A5, F#5, E5, F#5, A5, F#5, D6, A5, F#6, D6, F#6, D6, A5, D6, A5, F#5, D6, B5, G#5, B5, D6, B5, F#6, D6, B6, F#6, B6, F#6, D6, F#6, D6, B5
It was found that 3-order and 4-order Markov chains produced the best
results in terms of capturing the style while still composing an original
sounding piece. 4-order Markov chains resulted in a substantial amount of
replication of the original corpus, whereas 3-order Markov chains did not.
2-order Markov chains did not capture the essence of the corpus at all.
This would suggest that, on average, 3-order Markov chains generally
captured a wider style of guitar soloing. For best results, the Markov order
should be set according to the individual player.
It was decided that Markov order 3 would be best for capturing the ‘digestive
enzyme’ in the autocatalytic set based on the author’s datasets.
5.2 Autocatalytic Set Theory
This section deals with the output of the autocatalytic set program combined
with the Markov chain program.
Markov chains are generated and a ‘probable phrase’ of music notes is
created based on these chains. The highest number of total occurrences
amongst all the chains with the user specified order is found. It then
identifies any other chains with the same order and number of occurrences.
The user specifies a number of enzymes at runtime. If this number of
enzymes cannot be found (i.e. not enough chains of the user-specified order
have the maximum number of total occurrences), only as many as can be
found will be used as digestive enzymes.
5.2.1 The Autocatalytic Set Program
The autocatalytic set program was tested using the same data sets used to
test the Markov chain program.
Datasets 1 and 3 were tested first as these contained samples of the ‘Dorian
flavour’ phrases. Phrases of 100 notes were generated for both the Markov
chain program and the Autocatalytic Set program. The first problem was that
the reverse catalyzation method generated hundreds of options (series of
ligation reactions resulting in a single, whole sequence) with enzyme numbers
greater than 3 or 4. The program took significantly longer than usual to
process all this data and would almost always end with the Java program
throwing an OutOfMemoryError.
This was because of the way the reverse catalyzation method works. It
computes every possible series of ligase reactions that could result in the
generation of a single sequence (with no fragments remaining). It follows
logically that increasing the number of active enzymes and the length of the
phrases would increase the number of possible ligase reactions
exponentially.
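The scale of this growth can be illustrated with a rough upper bound. Assuming, purely for illustration, a worst case in which every one of m fragments could be joined to every other, there would be m! possible orderings to enumerate. The sketch below (the class `LigationGrowth` is hypothetical) prints this bound for a few fragment counts. It is not a measurement of the program itself.

```java
import java.math.BigInteger;

// Hypothetical sketch, for illustration only.
class LigationGrowth {

    // Number of orderings of `fragments` mutually joinable fragments (m!).
    static BigInteger worstCaseOrderings(int fragments) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= fragments; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        for (int m = 5; m <= 20; m += 5) {
            System.out.println(m + " fragments: " + worstCaseOrderings(m));
        }
    }
}
```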
To combat this problem, the user-specified output phrase length was
reduced to between 50 and 70 notes. Computation time was still quite high
even above 60 notes.
At lower phrase lengths, it was found that the difference between the
Markovian phrase and the autocatalyzed phrase was very small. Stylistically,
the differences were unnoticeable. No original phrases were easily perceived.
Dataset 3 was much larger than dataset 1 and had more musical variation in
the corpus. When tested, it yielded better results in terms of originality. On
rare occasions, even at phrase length 50, the program took significantly
longer to compute. This was due to the random nature of the
autocatalyzation and subsequent ligation and the different combinations of
sub-sequences that might have emerged based on the series of random
reactions.
Overall, however, the results did seem to suggest a level of success most of
the time. One example of this is a test done on dataset 3 with 3-state
Markov chains.
The Markovian analyzer generated the following phrase:
D3, G#3, G3, F3, D3, A3, F3, G3, A3, B3, C4, D4, E4, D4, E4, F4, G4, G#4, A4, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4, D4, F4, D4, C4, A3, C4, A3, G#3, G3, F3, E3, D3, C#3, D3, G#3, G3, F3, D3, A3
These were then broken down and over 100 possible ligase options were
generated. The following one was chosen (note that ‘FP’ is the sub-sequence’s
reference number in the fragment pool).
(FP0)50, 56 - (FP1)55, 53, 50, 57, 53, 55, 57 - (FP4)62, 64, 62, 64 - (FP3)60 - (FP6)62, 65, 62, 60, 57, 60, 57, 56 - (FP5)65, 67, 68, 69, 72, 74, 77, 80, 79, 77, 74, 72, 71, 69, 67, 65 - (FP2)59 - (FP8)55, 53, 50, 57 - (FP7)55, 53, 52, 50, 49, 50, 56
This translated to the following corpus:
D3, G#3, G3, F3, D3, A3, F3, G3, A3, D4, E4, D4, E4, C4, D4, F4, D4, C4, A3, C4, A3, G#3, F4, G4, G#4, A4, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4, B3, G3, F3, D3, A3, G3, F3, E3, D3, C#3, D3, G#3
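The translation from integers to note names shown above is consistent with the widely used convention in which MIDI number 60 is written as C4 (so that 50 is D3 and 56 is G#3). A minimal sketch of such a conversion follows. The class `NoteNames` is a hypothetical helper and is not taken from the program.

```java
// Hypothetical helper, for illustration only.
class NoteNames {
    private static final String[] NAMES =
        {"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"};

    // Assumes the convention in which MIDI note 60 is written C4.
    static String name(int midiNumber) {
        return NAMES[midiNumber % 12] + (midiNumber / 12 - 1);
    }

    // Joins the names of several notes with ", ".
    static String names(int... midiNumbers) {
        StringBuilder sb = new StringBuilder();
        for (int n : midiNumbers) {
            if (sb.length() > 0) sb.append(", ");
            sb.append(name(n));
        }
        return sb.toString();
    }
}
```

For example, `NoteNames.names(50, 56)` gives the first fragment (FP0) of the ligation option above.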
The highlighted notes show where two sequence fragments had been
rejoined together. Upon listening to these sequences, a guitar player may
notice that these highlighted sections are large jumps across the fret board,
which would be unintuitive (though not technically impossible) based on the
phrase leading to the jump.
This is because the preceding phrases would be played using certain finger
positions, which would not necessarily lead intuitively into the jump across
notes. These jumps are, however, playable with some amount of forward
thinking and planning on the guitarist’s part about how and where to
position the fingers on the guitar fret board.
Most samples based on dataset 3 provided this kind of output and led to
the composition of original phrases. The following phrase was particularly
interesting:
D3, A3, F3, G3, A3, G3, F3, D3, A3, F4, G4, A4, B4, G4, C5, G4, D5, F4, A4, C5, A4, D5, F5, D5, C5, D5, F5, G#5, G5, F5, D5, C5, B4, A4, G4, F4, D4, F4, D4, C4, A3, C4, A3, G#3, B3, C4, D4, E4, D4, E4
The underlined arrangement was quite different to any part of the ‘Dorian
flavour’ datasets but was an aurally pleasing phrase. Furthermore, the ‘B3’
note is the Dorian sixth note of the D Dorian scale. In the author’s opinion it
was placed well in the phrase. However, one must bear in mind that it was
generated based on random probability and without any specific conscious
musical intention on the part of the program. This particular improvisation
has been saved as a MIDI file and provided on the CD. It is called
‘dataset3-out-3order-length-50-sample.mid’.
Dataset 1 was a smaller sample of ‘Dorian flavour’ phrases. While the
Markov analyzer program yielded good results with this dataset, the
autocatalytic set program did not improve on these. The output was not
original sounding at all. Dataset 2 was a linear musical pattern that was not
meant for testing the autocatalytic set program, and so was not used.
6. Discussion and Recommendations
In this section, the results in chapter 5 will be analyzed and evaluated.
Discussions will follow regarding the reasons for these results and what they
might suggest. Recommendations for future studies will be made where
appropriate.
Note: The analyses and suggestions in this section are based on the opinion
and evaluation of the author. It is recommended that larger studies should
be carried out using musicians to evaluate the programs in order to arrive at
a better statistical analysis of the output.
6.1 Markov Chain Music
On the whole, the music generated by the Markov chain program was
satisfactory. Experimenting with different order chains yielded various
results. It was found, with the chosen datasets, that orders two and three
yielded a better mix of style capture and originality than other orders.
Depending on the size of the dataset, orders two and one tended to yield
more random sounding corpora. With orders of four and above (especially
six and above) the dataset was replicated almost note for note.
While the latter half of the analysis agrees with Brooks et al’s (1957)
findings, the former half does not. They suggested that lower order chains
such as two or three would not capture the style faithfully. However, this
test seems to suggest otherwise.
It is worth noting that the datasets used in this test were much smaller and
had less variation than the ones used by Brooks et al. It is possible that
lower order chains will not capture the style well for large datasets with
more variation.
As was noted in section 5.1.1, a limitation of the Markov analyzer program is
that the ‘phraseToJFugueString()’ method in the ‘probablePhrase’ class
generates all quarter notes (i.e. 4 notes for every beat or count). At the time
of analysis of the musician’s solo, the analyzer does not record the length of
each note, but rather just the note itself. Therefore, there is no context
regarding the length of notes.
One possible solution would be to treat the same note at different lengths as
different states. Thus, for example, D1 held for two counts would be a
different state to D1 held for four counts. The immediate drawback,
however, would be that the Markov chains would only recognize the
existence of D1 (two counts) and D1 (four counts) and would generate
either of these. At no point will it consider giving, say, C1 a length of four
counts as no Markov chain would exist for C1 (four counts) and it would be
an unrecognized state.
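The drawback can be made concrete with a small sketch in which a state is a (pitch, length) pair. The class `TimedStates` is hypothetical, and the example uses the simplified numbering from chapter 4 in which C1 is 12 and D1 is therefore 14.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, for illustration only.
class TimedStates {

    // A state is a (pitch, length in counts) pair, stored as a two-element list.
    static List<Integer> state(int pitch, int counts) {
        return Arrays.asList(pitch, counts);
    }

    // The set of states the analyzer would have seen in the corpus.
    // Each corpus entry is {pitch, counts}.
    static Set<List<Integer>> learnStates(int[][] corpus) {
        Set<List<Integer>> states = new HashSet<>();
        for (int[] note : corpus) {
            states.add(state(note[0], note[1]));
        }
        return states;
    }
}
```

With a corpus containing D1 held for two counts, D1 held for four counts and C1 held for two counts, the learnt set contains no state for C1 held for four counts, so no chain could ever generate it.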
Further work should be done on developing heuristics and algorithms to
modify note lengths.
Overall, the Markov chain program was not meant to be the main focus of
the project. It was developed to serve the purpose of generating a phrase
with a reasonable degree of originality while staying true to the author’s
style. This goal was achieved and the main program, the autocatalytic set,
could be evaluated.
6.2 Autocatalytic Set Theory
As was shown in chapter 5, the autocatalytic set program had mixed results
depending largely on the dataset being used. The following general
observations were made:
Larger datasets with more variation in the content yield better results
Dataset 3 had more content with a greater degree of variation and this
resulted in the generation of original phrases that were not previously part
of the general style of the corpus. Tests carried out on dataset 1, with its
limited content, did not yield particularly interesting corpora.
Deviation from Markov chains results in originality
In all cases of original sounding phrases that were generated, the particular
parts of the phrases that sounded original were the ones that broke away
from the Markov chain predictability. This agrees with Franz’s (1998)
definition of originality, which is an “uncommonness or statistical
infrequency of a person’s ideas” (p. 60). This stands to reason, as the
Markov chains facilitate predictability and the frequent recurrence of high
probability passages.
However, this may also be a fault of the program algorithms. Recall that the
user specifies a Markov chain order at runtime and all chains from 1-state up
until, and including, the user-specified state are created. When generating a
probable phrase, the program looks for the highest available chain to use
when generating the next note.
For example, consider the following phrase:
1 4 6 2 9 X
If the user-specified chain order is 4, the program will consider the last four
states before generating state X. Thus it will consider the phrase “4 6 2 9”
and generate state X based on its Markovian probabilities. If no chain with
that present state is found, it will then consider “6 2 9”, followed by “2 9”
and finally “9”. Thus, at all times, the highest state chain will be chosen
where possible.
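The fallback from the highest available chain to lower ones can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the program's code: the class `BackoffMarkov` and its methods are hypothetical, and the next note is drawn from the list of observed followers so that more frequent followers are proportionally more likely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch, for illustration only.
class BackoffMarkov {
    // Maps each present state (of any order up to maxOrder) to the notes seen after it.
    private final Map<List<Integer>, List<Integer>> chains = new HashMap<>();
    private final int maxOrder;

    BackoffMarkov(List<Integer> corpus, int maxOrder) {
        this.maxOrder = maxOrder;
        for (int order = 1; order <= maxOrder; order++) {
            for (int i = 0; i + order < corpus.size(); i++) {
                List<Integer> present = new ArrayList<>(corpus.subList(i, i + order));
                chains.computeIfAbsent(present, k -> new ArrayList<>())
                      .add(corpus.get(i + order));
            }
        }
    }

    // Longest suffix of `phrase` (at most maxOrder notes) that has a chain, or null.
    List<Integer> longestKnownContext(List<Integer> phrase) {
        for (int order = Math.min(maxOrder, phrase.size()); order >= 1; order--) {
            List<Integer> context =
                new ArrayList<>(phrase.subList(phrase.size() - order, phrase.size()));
            if (chains.containsKey(context)) return context;
        }
        return null;
    }

    // Picks the next note from the highest-order chain available; null if none exists.
    Integer nextNote(List<Integer> phrase, Random random) {
        List<Integer> context = longestKnownContext(phrase);
        if (context == null) return null;
        List<Integer> followers = chains.get(context);
        return followers.get(random.nextInt(followers.size()));
    }
}
```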
On one hand, this facilitates effective capture of the player’s style. However,
as Brooks et al. (1957) suggest, using larger order chains increases
predictability to the point of replication of the original corpus.
Autocatalyzation and subsequent ligation result in the breaking up of these
chain structures by combining smaller-order chains together. Consider the
following original corpus:
1 4 6 2 9 6 2 8 4 6
Markovian probability suggests that the 4-order phrase “1 4 6 2” is always
followed by “9”.
Suppose “6 2” were the digestive enzyme and autocatalyzed the set. The
resulting fragments would be:
1 4 6 - 2 9 6 - 2 8 4 6
Subsequent ligation around the “6 2” enzyme could result in the following
recombination:
1 4 6 2 8 4 6 2 9 6
Here the phrase “1 4 6 2” is not followed by “9”. It has, thus, broken away
from predictable patterns and resulted in some degree of originality.
However, this new sequence still conforms to lower order Markov chain
probabilities as the phrase “4 6 2”, a three-state chain, is still followed by
“9” in the second half of the phrase.
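The claim in the example above can be checked mechanically by listing the notes that follow a given phrase in the original and in the recombined sequence. The class `Followers` below is a hypothetical helper written only for this check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class Followers {

    // Every note that directly follows an occurrence of `context` in `sequence`.
    static List<Integer> followers(int[] sequence, int[] context) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i + context.length < sequence.length; i++) {
            boolean match = true;
            for (int k = 0; k < context.length; k++) {
                if (sequence[i + k] != context[k]) {
                    match = false;
                    break;
                }
            }
            if (match) result.add(sequence[i + context.length]);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] original = {1, 4, 6, 2, 9, 6, 2, 8, 4, 6};
        int[] recombined = {1, 4, 6, 2, 8, 4, 6, 2, 9, 6};
        System.out.println(followers(original, new int[] {1, 4, 6, 2}));
        System.out.println(followers(recombined, new int[] {1, 4, 6, 2}));
        System.out.println(followers(recombined, new int[] {4, 6, 2}));
    }
}
```

In the recombined sequence the 4-order phrase is followed by 8 rather than 9, while the 3-order phrase “4 6 2” is followed by both 8 and 9.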
It is possible, then, that Markov-generated music can be made more creative
by mixing lower state chains with higher state chains when choosing the
next note to generate. However, lower state chains should not be chosen too
often as, keeping in line with the earlier hypothesis, they may not effectively
capture and maintain the player’s style.
The originality is further enhanced when fragments broken up by different
enzymes combine together. Consider the following sequence:
1 4 6 2 8 4 6 2 9 6 2
If two digestive enzymes (“2 8” and “2 9”) split the sequence, the resulting
fragments would look like this:
(2 8) (2 9)
1 4 6 2 - 8 4 6 2 - 9 6 2
It is entirely possible, at the time of ligation, that fragments 1 and 3, which
have come into existence because of two separate enzymes, can recombine
around the “2 9” enzyme and fragments 2 and 3 can do the same around
the “2 8” enzyme. The resulting string would look as such:
1 4 6 2 9 6 2 8 4 6 2
Again, this deviation away from larger state Markov patterns and towards
smaller state Markov patterns results in a sequence with a somewhat
original structure.
The ligation process should be reviewed and approached differently
One severe drawback of the algorithms used for ligation was that, as shown
in section 5.2, computation time was increased substantially and this
sometimes led to memory overload. As has been stated, this difficulty arose
because the true process of ligation (random ligation interspersed with
random autocatalyzation) could not be modelled, as the program’s goal was
to arrive at only one remaining sequence within a short amount of time. Real
life ligation might reach this goal by random chance but it is not intentional
in any way and it may happen after a long period of time.
Revisions to the autocatalytic set simulation are needed. This may even
entail completely changing the algorithms behind the simulation.
The number of digestive enzymes and autocatalyzation reactions should not
be too high
It was observed that the higher the number of reactions and enzymes, the
more random the resulting corpora sounded. This follows Kauffman’s (1993)
suggestion that the number of enzymes should be kept low. It also follows
the logic, described above, that choosing lower state Markov chains (as a
result of recombination of sequences) too often may decrease the player’s
predictability too much, to the point where the essence of her style is lost. At
this point the music becomes more random.
As such, the number of digestive enzymes and autocatalyzation reactions
has diminishing returns to scale. It is suspected that future evaluation may
show it to follow a skewed, bell-shaped graphical pattern.
Using a player’s signatures as digestive enzymes facilitates better style
capturing
It was hypothesized earlier that one of the reasons for Iverson’s (1990)
limited success when simulating autocatalytic sets to create music might
have been that the player’s signatures were not preserved. According to the
literature, the preservation (and frequent recurrence) of certain signatures
provides a good style capturing mechanism. The output of the program
seemed to maintain the style of the original corpus. However, this output
was not compared to models in which signatures were not preserved. An
avenue of future work would be to compare the output of the two models.
Markovian and autocatalytic music generation models can never be truly
creative
A glaring drawback of the Markovian models is that every note that is
generated has been ‘learnt’ from the original corpus. A particular note will
never be generated if it did not exist in the original corpus. The same is true
of autocatalytic sets. They do not add any notes to the phrase but simply
recombine existing notes.
The true process of creativity as found in human nature often entails
exploring new options and gauging the outcome. For guitar players, this
could be trying new notes that may not necessarily be in a particular scale
but may sound pleasing nonetheless. The ‘Dorian flavour’ is one such
example where ‘wrong’ notes (the flatted fifth and the Dorian sixth) are
played in such a way so as to sound pleasing and creative. While humans
may choose to explore these options, Markovian models do not.
However, as has been discussed by Iverson (1990), the digestive enzymes
may mutate in some autocatalytic sets. Randomly changing the structure of
these enzymes to notes that may not be in the scale might produce more
original results and show characteristics of true creativity and musical
exploration.
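One possible form of such a mutation is sketched below. This is speculative and was not implemented in the program: the class `EnzymeMutation`, the `maxShift` parameter (assumed to be at least 1) and the choice to shift a single note by a few semitones are all assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical sketch, for illustration only.
class EnzymeMutation {

    // Returns a copy of `enzyme` with one randomly chosen note shifted by a
    // non-zero number of semitones in [-maxShift, +maxShift].
    static int[] mutate(int[] enzyme, int maxShift, Random random) {
        int[] mutated = enzyme.clone();
        int position = random.nextInt(enzyme.length);
        int shift = 1 + random.nextInt(maxShift); // 1..maxShift
        if (random.nextBoolean()) shift = -shift;
        mutated[position] += shift;
        return mutated;
    }
}
```

Since the shift is not restricted to notes of the scale, a mutated enzyme may introduce notes that never occurred in the original corpus.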
It is worth noting that, even amongst humans, this process of musical
exploration is often one of trial and error. New notes can be tried at random
and may be found either to sound good or not. With enough exploration and
trial of different note combinations, aurally pleasing phrases may eventually
be discovered.
Finally,
Markovian music generation models can be enhanced using autocatalytic set
theory
All the observations seem to suggest that Markovian music generation has
latent potential to produce original sounding pieces within a certain stylistic
context. The study seems to suggest, however, that it needs another process
to ‘push’ it towards exploring these options. The process of random
recombination of the corpus using autocatalytic set theory achieved this. If
originality is born out of a lack of predictability, the random recombination
facilitates this creative process by breaking predictable patterns as shown
above.
This would, however, also suggest that any form of randomization would go
some way towards achieving this. The autocatalytic set theory was chosen
for two main reasons. Firstly, it preserves artist signatures. Secondly, it is a
natural process in living organisms and numerous studies promote the use of
algorithms found in nature (Pelta and Krasnogor 2006). Future studies could
be carried out to compare the results of this program to those of others where
Markovian principles are combined with different randomization models. This
would provide a platform to discover further processes and heuristics that
might result in a great degree of originality within a stylistic context.
All these analyses and evaluations are those of the author. It is suggested
that extensive future work should be carried out to conduct a proper
evaluation of the music generated by the program. Any number of different
combinations of variables may be chosen. These include the Markov chain
orders, the number and lengths of the digestive enzymes, the number of
autocatalyzation reactions and the content of the original corpus. It is
recommended that musicians, specifically guitar players, should be chosen
to evaluate the output of the program at different combinations of these
variables in order to gain a better estimate of the best ones.
7. Conclusion
The purpose of this project was to develop a style modelling system capable
of capturing a guitar player’s style and then playing an improvised solo in
that style. The main objective was to develop the foundations of a system
that would maintain this style but add creative and original touches to the
music it generated. To achieve this end, Markov-based music generation and
autocatalytic set-based music generation were tested and evaluated both
separately and in combination, and the following conclusions were drawn:
1. Larger datasets with greater variation in the content have better potential
for the generation of creative, original sounding music.
2. Deviation from large state Markov chains results in originality. A greater
degree of originality may be achieved by combining lower state chains
with higher state chains when generating phrases.
3. The identification of a player’s signatures and the use of these as
digestive enzymes result in better style capturing.
4. The number of digestive enzymes and autocatalyzation reactions is subject
to diminishing returns. The perceived creativity of the program
peaks at a certain level, but too many enzymes and reactions start to
weaken the perception of the original style and the music becomes more
random.
5. Markovian models cannot promote real creativity, as found in humans, as
they inherently will never generate previously unexplored note choices. A
process of randomization and mutation of the original corpus is needed to
give Markovian models this ‘creativity’.
6. Markovian music-generation models can be enhanced using autocatalytic
set theory in order to promote creativity and originality in the music
composed by these systems.
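Conclusion 2 can be illustrated with a minimal Java sketch of one possible way to combine lower and higher order chains during generation. It is not the implementation used in this project: the class `VariableOrderMarkov`, its methods, the example corpus and the strategy of picking a random order and backing off to shorter contexts are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

final class VariableOrderMarkov {
    private final int maxOrder;
    private final Random rng;
    // Maps a context (1..maxOrder notes joined by spaces) to the notes that followed it.
    private final Map<String, List<String>> table = new HashMap<>();

    VariableOrderMarkov(List<String> corpus, int maxOrder, Random rng) {
        if (maxOrder < 1) {
            throw new IllegalArgumentException("maxOrder must be at least 1");
        }
        this.maxOrder = maxOrder;
        this.rng = rng;
        for (int order = 1; order <= maxOrder; order++) {
            for (int i = 0; i + order < corpus.size(); i++) {
                String context = String.join(" ", corpus.subList(i, i + order));
                table.computeIfAbsent(context, k -> new ArrayList<>())
                     .add(corpus.get(i + order));
            }
        }
    }

    // Choose an order at random, then back off to shorter contexts until one is known.
    // Returns null when even the last note alone has no recorded continuation.
    String next(List<String> history) {
        int order = Math.min(1 + rng.nextInt(maxOrder), history.size());
        for (; order >= 1; order--) {
            List<String> context = history.subList(history.size() - order, history.size());
            List<String> candidates = table.get(String.join(" ", context));
            if (candidates != null) {
                return candidates.get(rng.nextInt(candidates.size()));
            }
        }
        return null;
    }

    // Generate a phrase of up to `length` notes, starting from a seed note.
    List<String> generate(String seed, int length) {
        List<String> phrase = new ArrayList<>();
        phrase.add(seed);
        while (phrase.size() < length) {
            String note = next(phrase);
            if (note == null) {
                break; // dead end: no continuation recorded for the last note
            }
            phrase.add(note);
        }
        return phrase;
    }

    public static void main(String[] args) {
        // Hypothetical corpus of note names; a real corpus would come from a MIDI file.
        List<String> corpus = List.of("D", "E", "F", "E", "D", "E", "F", "G", "F", "E", "D");
        VariableOrderMarkov model = new VariableOrderMarkov(corpus, 3, new Random());
        System.out.println(model.generate("D", 16));
    }
}
```

Because every continuation is drawn from the corpus, each adjacent pair of notes in the output also occurs in the corpus, whatever order is chosen. This is the limitation noted in conclusion 5, and it is why a separate recombination step is needed to introduce new transitions.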
In conclusion, the value of this dissertation lies in developing a model for
creative music generation based on a specific style. It has achieved this goal
to a large extent. Future studies of greater scope and size are required to
further develop and test this model. This project has provided a firm
theoretical and technical basis on which larger studies can be developed.
5. References
Allan, M. (2002) Harmonizing Chorales in the Style of Johann Sebastian
Bach. MSc Thesis, School of Informatics, University of Edinburgh
Anders, T. (2003) Composing Music by Composing Rules: Computer Aided
Composition employing Constraint Logic Programming. Sonic Arts Research
Centre, Queens University Belfast, Northern Ireland (unpublished)
Baggi, D. (1998) The Role of Computer Technology in Music and Musicology.
Laboratorio di Informatica Musicale, Università degli Studi di Milano, Italy.
Retrieved on 16th of July, 2007 from
http://lim.dico.unimi.it/eventi/ctama/baggi.htm
Biles, J.A. (1994) “GenJam: A genetic algorithm for generating jazz solos,”
in Proc. Int. Computer Music Conf., pp.131-137
Brooks, F.P., Hopkins, A. L., Neumann, P.G. & Wright, W.V. (1957) ‘An
Experiment in Musical Composition’. In S.M. Schwanauer & D.A. Levitt (eds)
Machine Models of Music. The MIT Press, London, pp. 23-42
Brown, P. (2005) ‘Is the future of music generative?’ Music Therapy Today,
6(2), pp. 215-274
Chomsky, N. (1957) Syntactic Structures. Mouton, The Hague
Conklin, D. (2003) ‘Music Generation from Statistical Models’, Proceedings of
the AISB 2003 Symposium on Artificial Intelligence and Creativity in the Arts
and Sciences, Aberystwyth, Wales, pp. 30–35
Cope, D. (1992) ‘Computer Modelling of Musical Intelligence in EMI’.
Computer Music Journal, 16(2), pp. 69-83
Cope, D. (1999) ‘One approach to musical intelligence’. IEEE Intelligent
Systems, 14(3), pp. 21–25
Csikszentmihalyi, M. (1988) ‘Society, culture, and person: A systems view of
creativity’. In R. J. Sternberg (ed.) The Nature of Creativity: Contemporary
Psychological Perspectives. Cambridge University Press, New York, pp. 325-329
Dubnov, S., Assayag, G., Bejerano, G., & Lartillot, O. (2003a) ‘A System for
Computer Music Generation by Learning and Improvisation in a Particular
Style’. IRCAM (unpublished). Retrieved on 17th July 2007 from
http://mediatheque.ircam.fr/articles/textes/Dubnov03a/
Dubnov, S., Assayag, G., Lartillot, O., & Bejerano, G. (2003b) ‘Using
machine-learning methods for musical style modelling’. IEEE Computer, 36
(10), pp. 73-80.
Eacott, J. (2000) Form and Transience – Generative Music Composition in
Practice. A paper for Generative Art 2000, Milan. Retrieved July 7th, 2007,
from http://www.informal.org/FormandTransience.pdf
Franz, D.M. (1998) Markov Chains as Tools for Jazz Improvisation Analysis.
MSc Thesis, Virginia Polytechnic Institute and State University
Huron, D. (2001) Information Theory and Music. Ohio State University
School of Music. Retrieved on 15th June 2007 from
http://www.music-cog.ohio-state.edu/Music829D/Notes/Infotheory.html
Iverson, E. and Hartley, R.T. (1990) ‘Metabolizing Music’. Proceedings of the
1990 International Computer Music Conference, ICMA, San Francisco, 1990.
Jones, K. (1981) ‘Compositional Applications of Stochastic Processes’.
Computer Music Journal, 5(2), pp. 45-61
Kauffman, S. A. (1993) The Origins of Order. Oxford University Press, USA
Koelle, D. (2007) ‘JFugue’. Java API for Music Programming. Retrieved
August 1st, 2007 from http://jfugue.org/
Limpaecher, A. (2007) Musical Markov. Retrieved July 2nd, 2007, from
http://www.princeton.edu/~alimpaec/alimpaecMarkov/
Mahmud, J. (2006) ‘Grammar Based Modelling and Generation of Tabla
Compositions’. BSc Dissertation, Computer Science Department, University
of Bath
McAlpine, K., Miranda, E. and Hoggar, S. (1999) ‘Making Music with
Algorithms: A Case Study System’, Computer Music Journal, 23(2), pp. 19-30
Pachet, F. (2002a) ‘Multimedia at Work - Playing With Virtual Musicians: The
Continuator in Practice’, IEEE MultiMedia, 9(3), pp. 77-82
Pachet, F. (2002b) ‘The Continuator: Musical Interaction With Style’.
Proceedings of the International Computer Music Conference, ICMA, Sweden
Pelta, D.A. and Krasnogor, N. (eds) (2006) Workshop on Nature Inspired
Cooperative Strategies for Optimisation. University of Granada.
Pfisterer, M. and Bomers, F. (2003) DumpReceiver.java, Java Sound
Resources Code examples. Retrieved on 16th of August, 2007 from
http://www.jsresources.org/examples/DumpReceiver.java.html
Wilcox, A. (2007) Generative Music. Retrieved July 15th, 2007, from
http://www.alexwilcox.co.uk/projects.php?id=10
Winston, W. (1994) Operations Research: Applications and Algorithms.
Brooks/Cole Thomson Learning Inc., USA
Zimmer, C. (1993) ‘Metamusic’. Discover Magazine, Retrieved on 17th July
2007, from the Discover Magazine database
Appendix 1 – Java Documentation
This section presents the Java documentation for the program developed as
part of this project. Each class shown in the diagram has a Java document
that provides an in-depth explanation of what it represents, what attributes
and operations it has and how it works.
All of the classes belong to a package. The documentation has been
arranged in alphabetical order according to class name.
Four packages were developed by the author: ‘autoCatalyticSet’, ‘markov’,
‘markovAuto’ and ‘NSD’. ‘JFugue’ was developed by Koelle (2007).
Figure 4 shows all the classes and packages developed by the author for the
program. It does not show relationships or dependencies between classes.
Fig 4. The classes and packages developed by the author for the program
Each Java document begins with an overall description of the class. This may
then be followed by a ‘nested class summary’, which refers to a class within
a class and will have its own separate Java document.
Next is the ‘field summary’, which has a brief description of the attributes of
the class. For example, a recurring phrase has ‘phrase’ and ‘totalOccurences’
attributes. These are only short descriptions. More details are given in the
‘field detail’ section further down in the document.
The ‘constructor summary’ that follows may be ignored. The ‘method
summary’ describes the operations of the class in short. Again, full
descriptions follow later on in the document.
These Java documents describe attributes and operations in every class that
may not have been shown in figure 4. These are more technical methods
that were used to facilitate programming.
Note that no Java Documentation has been provided for JFugue, as it is
copyright of Koelle (2007). A reference to the JFugue resources can be found
in the references section.
To read the Java documentation, please refer to the CD attached to this
dissertation and read ‘README.pdf’ for instructions. A copy of this instruction
sheet is included in Appendix 2.
Appendix 2 – Program User Manual
This user manual guides you through using the program developed for this
project. It has been included on the CD attached with this dissertation.
Before starting, copy the ‘Program’ folder from the root directory of the CD
to any location on your hard drive.
2.1 Running the program
First use the command line to navigate to the location in which you have
stored the ‘Program’ folder. Then change directory to:
“Program/dist/”
To run the program, type the following into the command line:
“java -jar markovAuto.jar yourmidifile.mid X Y Z”
Where:
X = the highest order Markov chains to be created
Y = the number of digestive enzymes to be selected
Z = the length of the phrase (in notes) to be generated
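For readers who want to see how the four arguments fit together, the following is a minimal Java sketch of how such a command line could be read and validated. It does not reproduce the program's actual entry point: the class `RunArguments`, its field names and the minimum values enforced (X and Z at least 1, Y at least 0) are assumptions, and the values in `main` are purely illustrative.

```java
final class RunArguments {
    final String midiFile;   // path to the input MIDI file
    final int maxOrder;      // X: highest order of Markov chain to create
    final int enzymeCount;   // Y: number of digestive enzymes to select
    final int phraseLength;  // Z: length of the generated phrase, in notes

    private RunArguments(String midiFile, int maxOrder, int enzymeCount, int phraseLength) {
        this.midiFile = midiFile;
        this.maxOrder = maxOrder;
        this.enzymeCount = enzymeCount;
        this.phraseLength = phraseLength;
    }

    // Parse "yourmidifile.mid X Y Z"; the bounds below are assumed, not documented.
    static RunArguments parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException("Expected: yourmidifile.mid X Y Z");
        }
        int x = parseInt("X", args[1], 1);
        int y = parseInt("Y", args[2], 0);
        int z = parseInt("Z", args[3], 1);
        return new RunArguments(args[0], x, y, z);
    }

    private static int parseInt(String name, String text, int minimum) {
        int value;
        try {
            value = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(name + " must be a whole number, got: " + text);
        }
        if (value < minimum) {
            throw new IllegalArgumentException(name + " must be at least " + minimum);
        }
        return value;
    }

    public static void main(String[] args) {
        // Illustrative values only: order 3, 2 enzymes, a 16-note phrase.
        RunArguments run = parse(new String[] {"dataset3.mid", "3", "2", "16"});
        System.out.println(run.midiFile + " order=" + run.maxOrder
                + " enzymes=" + run.enzymeCount + " notes=" + run.phraseLength);
    }
}
```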
2.2 Sample Data Sets and Output Samples
The four sample data sets that were used in the project have been provided
on this CD. They can be found under:
<root>/Program/dist/
The data sets are:
dataset1.mid = a short improvised solo in the key of Dm using phrases
with the “Dorian Flavour”.
dataset2.mid = a slightly modified finger exercise pattern for guitarists.
This is a linear pattern that will mostly generate the same phrase even at
very low Markov chain orders.
dataset3.mid = a longer improvised solo in the key of Dm using phrases
with the “Dorian Flavour”. This was the main file used to test the
program.
dataset-scarified.mid = a sample melody from the song “Scarified” by
Paul Gilbert.
One output sample generated during the testing phase has been provided
under:
<root>/Program/dist/output
This particular file is specifically mentioned in the dissertation.
2.3 Program Source Code and Java Documentation
The source code written for the program has been included under:
<root>/Program/src/
Java files for all of the classes developed have been included. They have
been arranged in the following package structure:
The source code for the JFugue package has not been provided as it is
copyright of Koelle (2007) and is available at http://jfugue.org.
Java documents for each of the classes have also been provided. They
describe each class in complete detail including a brief description, its
variables and methods. To navigate through the Java documents, open:
<root>/Program/dist/javadoc/index.html
Again, all the classes have been arranged in their package structure.