IAML Annual Study Weekend, 12 April 2010
musicSpace: Music and the Semantic Web
http://musicspace.mspace.fm
Contents
1. Motivation
2. Solutions
3. Evaluation
4. The Future
1. Motivation (the problem we’re addressing)
Centuries of material...
... is now increasingly digitised
Yet, with exceptions like Copac, ...
... data is typically siloed.
Geographical dispersal has been replaced by a virtual dispersal on the web. Data is now segregated into a plethora of online resources by:
– Media type (text, image, audio, video)
– Date of creation/publication
– Subject
– Language
– Copyright holder
– Ad hoc/insecure nature of project funding
Crucially:
Data sources are not usually interoperable.
Data and metadata from one source cannot be used (automatically) as the basis for a query of another source.
The Google/textbox search paradigm is limiting.
You’d better know what you want!
What tickles your fancy?
So, interacting with current data resources presents barriers at all stages of the research process:
Inchoate ideas:
– Hmm… something on Monteverdi’s madrigals?
Specific complex questions:
– Which scribes have created manuscripts of Monteverdi’s works, and which other composers’ works have they inscribed?
– What recordings of works by Cage exist, which performers have recorded a particular work by Cage, and what else have they recorded?
Which scribes have created manuscripts of Monteverdi’s works, and which other composers’ works have they inscribed?
We’d use RISM for this question.
1. Execute a ‘People’ search for ‘Monteverdi’.
2. Manually filter out results returned where the composer’s name appears in reference to a role other than that of composer (although for Monteverdi this isn’t likely to take long).
3. Examine the remaining records to identify the scribe in each case.
4. Execute a ‘People’ search for each scribe of interest.
5. Manually filter out records where the scribe is named in reference to a role other than that of scribe (e.g. ‘former owner’, ‘composer’ etc.).
6. Collate a list of other composers whose works the scribes have inscribed.
What recordings of works by Cage exist, which performers have recorded a particular work by Cage, and what else have they recorded?
We’d use BLSA, Copac and Naxos.
1. Search for recordings where the composer is ‘Cage’ in:
i. BLSA
ii. Copac
iii. Naxos
(Each requiring a different search formulation)
2. Collate results and make a list of performers.
3. Search Naxos for recordings where these performers are given as the ‘performer’ and ‘Cage’ is given as the ‘composer’.
4. Search BLSA and Copac for records that name these performers and Cage, and then manually filter out results where names do not occur in relation to the appropriate role.
5. Manually collate repertoire lists for each Cage performer.
The barriers to tractability and their solutions
Need to consult several sources … and metadata from one source cannot guide searches of another source.
Insufficient granularity of data and/or search option.
Multi-part queries have to be broken down and results collated manually. Pen and paper!
Solutions:
Integration
Increase granularity
Optimally interactive UI (‘mSpace’)
2. Solutions
Integration
Rather than using many portals ...
... what if you could use just one?
Our partners use a variety of data formats
MARC-XML
MODS-XML
Custom MARC
Source-specific XML
Tables/CSV
We import these as RDF
Why RDF?
1. Standard format for the Semantic Web.
2. It’s modular; we can add records and record fields without having to start from scratch.
3. RDF can be created using lots of different tools.
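The modularity point can be illustrated with a minimal sketch in plain Python. It assumes a flat source record (as might come from a CSV export) and writes N-Triples by hand; the `example.org` namespace, the field names and the sample record are all placeholders, not musicSpace's actual vocabulary or data.

```python
# Hypothetical namespace: a placeholder, not musicSpace's real vocabulary.
EX = "http://example.org/musicspace/"

def uri(local):
    """Wrap a local name as an N-Triples URI reference."""
    return f"<{EX}{local}>"

def literal(text):
    """Wrap a string as an N-Triples literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def record_to_triples(record_id, fields):
    """Turn one flat source record (a dict) into (subject, predicate, object) triples."""
    subject = uri(f"record/{record_id}")
    return [(subject, uri(f"vocab/{name}"), literal(value))
            for name, value in fields.items()]

def to_ntriples(triples):
    """Serialise triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

# Invented record, for illustration only.
triples = record_to_triples("0001", {"title": "Example Work",
                                     "composer": "Cage, John"})

# Modularity: a new field is one more triple; nothing existing is rewritten.
triples.append((uri("record/0001"), uri("vocab/performer"),
                literal("Performer, Example")))

print(to_ntriples(triples))
```

Because each statement stands alone, records from MARC-XML, MODS-XML or CSV sources could in principle be appended to the same store without redesigning a schema first.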
Granularity
Metadata hierarchy
We use a two-level hierarchy based on metadata type.
Person
– Composer
– Scribe
– Performer
– Author
– etc.
Crucially, our search UI exposes this hierarchy so that both broad and narrow searching is possible.
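A rough sketch of how a two-level hierarchy allows both broad and narrow searching. The role list is taken from the slide and is not exhaustive; the function names and sample records are invented for illustration and do not describe musicSpace's implementation.

```python
# Two-level hierarchy from the slide: a broad type and its narrower roles.
HIERARCHY = {"Person": ["Composer", "Scribe", "Performer", "Author"]}

def expand(role):
    """Return the set of roles that a search on `role` should match."""
    return {role, *HIERARCHY.get(role, [])}

def search(records, name, role):
    """Find records where `name` appears in `role` (broad or narrow)."""
    wanted = expand(role)
    return [r for r in records if r["name"] == name and r["role"] in wanted]

# Invented sample records, for illustration only.
records = [
    {"id": 1, "name": "Monteverdi, Claudio", "role": "Composer"},
    {"id": 2, "name": "Monteverdi, Claudio", "role": "Author"},
    {"id": 3, "name": "Immyns, John", "role": "Scribe"},
]

broad = search(records, "Monteverdi, Claudio", "Person")     # ids 1 and 2
narrow = search(records, "Monteverdi, Claudio", "Composer")  # id 1 only
```

A search on ‘Person’ behaves like the traditional name search; a search on ‘Composer’ removes the manual filtering step.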
Adding/exposing granularity
Where possible we add to/expose the granularity of the metadata.
Person ‘Immyns, John [scr]’ becomes Scribe ‘Immyns, John’.
Book ‘Come Death, I shall not fear thee ...’ with author ‘Monteverdi, Claudio’ becomes Book of printed music ‘Come Death, I shall not fear thee ...’ with composer ‘Monteverdi, Claudio’ (because of MARC leader information).
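The two transformations above might be sketched as follows. This assumes the bracketed suffix follows MARC relator codes (‘scr’ is the relator code for scribe) and that the record type is read from MARC leader position 06, where ‘a’ is language material and ‘c’ is notated music. The labels, function names and sample leaders are illustrative simplifications, not the project's actual rules.

```python
import re

# A small subset of MARC relator codes.
RELATORS = {"scr": "Scribe", "cmp": "Composer", "prf": "Performer",
            "aut": "Author", "fmo": "Former owner"}

def split_role(heading):
    """Split 'Immyns, John [scr]' into a clean name and a specific role."""
    match = re.fullmatch(r"(.*?)\s*\[(\w+)\]", heading.strip())
    if not match:
        return heading.strip(), "Person"  # no role marker: stay at the broad level
    name, code = match.groups()
    return name, RELATORS.get(code, "Person")

def record_type(leader):
    """Classify a record by MARC leader position 06 (simplified labels)."""
    labels = {"a": "Book", "c": "Book of printed music", "d": "Manuscript music"}
    return labels.get(leader[6], "Other")

def creator_role(leader):
    """Treat the main creator of notated music as a composer, otherwise an author."""
    return "Composer" if leader[6] in "cd" else "Author"

# Hypothetical leaders, differing only in position 06.
music_leader = "00000ncm a2200000 a 4500"
book_leader = "00000nam a2200000 a 4500"

print(split_role("Immyns, John [scr]"))
print(record_type(music_leader), creator_role(music_leader))
print(record_type(book_leader), creator_role(book_leader))
```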
Generating metadata: Grove works lists
Our Tool for the Works Lists
User Interface
‘musicSpace’ is a faceted browser
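As a hedged illustration of what faceted browsing means here, the sketch below treats each metadata field as a column, narrows the other columns when a value is selected, and then pivots on the result. It is modelled on the Monteverdi scribes question; the records and the scribe and composer placeholders are invented, and the code is not the mSpace implementation.

```python
from collections import Counter

# Invented manuscript records; 'Scribe X', 'Composer B' etc. are placeholders.
RECORDS = [
    {"composer": "Monteverdi, Claudio", "scribe": "Scribe X"},
    {"composer": "Composer B", "scribe": "Scribe X"},
    {"composer": "Composer C", "scribe": "Scribe Z"},
    {"composer": "Monteverdi, Claudio", "scribe": "Scribe Y"},
]

def select(records, **choices):
    """Keep the records that match every chosen facet value."""
    return [r for r in records if all(r[f] == v for f, v in choices.items())]

def facet(records, name):
    """Values available in one facet column, with record counts."""
    return Counter(r[name] for r in records)

# Step 1: choose a composer; the scribe column narrows to match.
scribes = facet(select(RECORDS, composer="Monteverdi, Claudio"), "scribe")

# Step 2: pivot on each scribe to see which other composers they copied.
others = {
    s: sorted(set(facet(select(RECORDS, scribe=s), "composer"))
              - {"Monteverdi, Claudio"})
    for s in scribes
}

print(scribes)
print(others)
```

The six manual steps of the RISM workflow collapse into two selections, with no pen-and-paper collation in between.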
Screencast 1: Which scribes have created manuscripts of Monteverdi’s works, and which other composers’ works have they inscribed?
Screencast 2: What recordings of works by Cage exist, which performers have recorded a particular work by Cage, and what else have they recorded?
3. Evaluation
Feedback on speed and ease of use:
‘All the information showed up very quickly, and it was easy to find material. It was really good to have different kinds of material in the same place.’
‘[musicSpace offers] a speedier way to research crossed search pathways.’
‘Excellent interface – very simple to understand.’
Feedback on browsing around a subject or changing the search paradigm:
‘I would recommend musicSpace for its ability to manipulate queries in order to get results that you wouldn’t otherwise be able to get [without starting over].’
‘I liked the ability to explore around a topic once you’ve identified something of interest.’
‘The ability to switch columns around and add new columns was most useful.’
Feedback on improved data granularity:
‘[Without using musicSpace] it would not be at all easy to do an opera character search. You would have to use printed reference books like Pipers Enzyklopädie des Musiktheaters, but even this does not have an index of characters, so you’d have to look at the entry for each opera and manually collate information. You would also have to know what you were looking for before starting out!’
‘I used musicSpace to explore how many operas have a character named Alceste. This information simply isn’t get-at-able using other search interfaces – you’d have to sort through the information by hand.’
An invitation to try our demo
Musicologists:
– Monteverdi recordings
– C19th opera buffa
– Schubert’s songs
– C20th electroacoustic music
Music librarians / library scientists
Music technologists / web scientists
4. The Future
Thank you!