IAML Annual Study Weekend 12 April 2010 musicSpace: Music and the Semantic Web


musicSpace: Music and the Semantic Web http://musicspace.mspace.fm


Contents

1. Motivation

2. Solutions

3. Evaluation

4. The Future


1. Motivation (the problem we’re addressing)


Centuries of material...


... is now increasingly digitised

Yet, with exceptions like Copac, ...


... data is typically siloed.


Geographical dispersal has been replaced by a virtual dispersal on the web. Data is now segregated into a plethora of online resources by:
– Media type (text, image, audio, video)
– Date of creation/publication
– Subject
– Language
– Copyright holder
– Ad hoc/insecure nature of project funding


Crucially:

Data sources are not usually interoperable.

Data and metadata from one source cannot be used (automatically) as the basis for a query of another source


The Google/textbox search paradigm is limiting.


You’d better know what you want!


What tickles your fancy?

So, interacting with current data resources presents barriers at all stages of the research process:


Inchoate ideas:
– Hmm… something on Monteverdi’s madrigals?

Specific complex questions:
– Which scribes have created manuscripts of Monteverdi’s works, and which other composers’ works have they inscribed?
– What recordings of works by Cage exist, which performers have recorded a particular work by Cage, and what else have they recorded?


Which scribes have created manuscripts of Monteverdi’s works, and which other composers’ works have they inscribed?

We’d use RISM for this question.

1. Execute a ‘People’ search for ‘Monteverdi’.

2. Manually filter out results returned where the composer’s name appears in reference to a role other than that of composer (although for Monteverdi this isn’t likely to take long).

3. Examine the remaining records to identify the scribe in each case.


4. Execute a ‘People’ search for each scribe of interest.

5. Manually filter out records where the scribe is named in reference to a role other than that of scribe (e.g. ‘former owner’, ‘composer’ etc.).

6. Collate a list of other composers whose works the scribes have inscribed.
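To make the cost of these six manual steps concrete, here is a minimal Python sketch of the same two-hop lookup, assuming the records were already available as (manuscript, role, name) triples. The record layout, the function names and all sample data are invented for illustration; this is not how RISM exposes its data.

```python
from collections import defaultdict

# Invented sample records: (manuscript id, role, name).
RECORDS = [
    ("ms1", "composer", "Monteverdi, Claudio"),
    ("ms1", "scribe", "Scribe A"),
    ("ms2", "composer", "Composer X"),
    ("ms2", "scribe", "Scribe A"),
    ("ms3", "composer", "Composer Y"),
    ("ms3", "former owner", "Scribe A"),  # same name, different role
    ("ms4", "composer", "Monteverdi, Claudio"),
    ("ms4", "scribe", "Scribe B"),
]


def names_by_role(records, role):
    """Index manuscript id -> set of names that appear in the given role."""
    index = defaultdict(set)
    for manuscript, record_role, name in records:
        if record_role == role:
            index[manuscript].add(name)
    return index


def scribes_and_their_composers(records, composer):
    """Map each scribe of the composer's works to the other composers they copied."""
    composers = names_by_role(records, "composer")
    scribes = names_by_role(records, "scribe")

    # Steps 1-3: find the scribes of manuscripts attributed to the composer.
    result = {}
    for manuscript, names in composers.items():
        if composer in names:
            for scribe in scribes.get(manuscript, ()):
                result.setdefault(scribe, set())

    # Steps 4-6: for each such scribe, collect the other composers they copied.
    for manuscript, manuscript_scribes in scribes.items():
        for scribe in manuscript_scribes:
            if scribe in result:
                result[scribe] |= composers.get(manuscript, set()) - {composer}
    return result


found = scribes_and_their_composers(RECORDS, "Monteverdi, Claudio")
```

Because the role is stored as a separate field, the ‘former owner’ entry for Scribe A is ignored without any manual filtering, which is exactly the filtering that steps 2 and 5 otherwise require by hand.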


What recordings of works by Cage exist, which performers have recorded a particular work by Cage, and what else have they recorded?

We’d use BLSA, Copac and Naxos.

1. Search for recordings where the composer is ‘Cage’ in:
i. BLSA
ii. Copac
iii. Naxos
(Each requires a different search formulation.)

2. Collate results and make a list of performers.


3. Search Naxos for recordings where these performers are given as the ‘performer’ and ‘Cage’ is given as the ‘composer’.

4. Search BLSA and Copac for records that name these performers and Cage, and then manually filter out results where names do not occur in relation to the appropriate role.

5. Manually collate repertoire lists for each Cage performer.
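The steps above can be sketched in Python as a normalise-then-query routine, under the assumption that each source's records could be mapped onto one shared set of fields. The per-source field names in `FIELD_MAPS`, the sample records and the helper names are all hypothetical; the real BLSA, Copac and Naxos schemas are not described in these slides.

```python
# Hypothetical mapping from our common field names to each source's own names.
FIELD_MAPS = {
    "BLSA": {"work": "title", "composer": "composer_name", "performer": "artist"},
    "Copac": {"work": "ttl", "composer": "author", "performer": "contributor"},
    "Naxos": {"work": "work", "composer": "composer", "performer": "performer"},
}

# Invented sample records, one per source, each in its own shape.
RAW = [
    ("BLSA", {"title": "Work 1", "composer_name": "Cage, John", "artist": "Performer P"}),
    ("Copac", {"ttl": "Work 2", "author": "Composer Z", "contributor": "Performer P"}),
    ("Naxos", {"work": "Work 1", "composer": "Cage, John", "performer": "Performer Q"}),
]


def normalise(source, raw):
    """Rename a source-specific record into the common field names."""
    record = {common: raw[local] for common, local in FIELD_MAPS[source].items()}
    record["source"] = source
    return record


def performers_of(records, composer):
    """Steps 1-2: performers who recorded anything by the composer, in any source."""
    return sorted({r["performer"] for r in records if r["composer"] == composer})


def repertoire(records, performer):
    """Steps 3-5: everything the performer has recorded, as (composer, work) pairs."""
    return sorted({(r["composer"], r["work"]) for r in records if r["performer"] == performer})


records = [normalise(source, raw) for source, raw in RAW]
cage_performers = performers_of(records, "Cage, John")
repertoire_p = repertoire(records, "Performer P")
```

Once the three sources share field names, one query formulation replaces three, and the collation in steps 2 and 5 becomes a set operation instead of pen and paper.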


The barriers to tractability and their solutions

Need to consult several sources … and metadata from one source cannot guide searches of another source.

Insufficient granularity of data and/or search option.

Multi-part queries have to be broken down and results collated manually. Pen and paper!

Solutions:

Integration

Increase granularity

Optimally interactive UI (‘mSpace’)

2. Solutions


Integration


Rather than using many portals ...


... what if you could use just one?


Our partners use a variety of data formats

MARC-XML

MODS-XML

Custom MARC

Source-specific XML

Tables/CSV

We import these as RDF

Why RDF?

1. It is a standard format for the Semantic Web.

2. It’s modular; we can add records and record fields without having to start from scratch.

3. RDF can be created using lots of different tools.
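The modularity point can be illustrated with a small Python sketch that models RDF statements as plain (subject, predicate, object) tuples. The `EX` namespace, the property names and the pairing of sample values are assumptions made for illustration; the slides do not say which vocabulary or tooling the project uses.

```python
# Hypothetical namespace; any real import would use an agreed vocabulary.
EX = "http://example.org/"


def record_to_triples(record_id, fields):
    """Turn a flat record (field -> value or list of values) into a set of triples."""
    subject = EX + "record/" + record_id
    triples = set()
    for field, value in fields.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.add((subject, EX + field, item))
    return triples


graph = set()

# First import: two fields from one source (sample values, pairing is illustrative).
graph |= record_to_triples("r1", {
    "title": "Come Death, I shall not fear thee",
    "composer": "Monteverdi, Claudio",
})

# Later import: a new field for the same record. Nothing already in the
# graph has to be rewritten or restructured; the new triple is simply added.
graph |= record_to_triples("r1", {"scribe": "Immyns, John"})
```

Adding a field is just adding a statement about the same subject, which is why a new source or a new record field does not force a fresh start.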

Granularity


Metadata hierarchy

We use a two-level hierarchy based on metadata type.

Person
– Composer
– Performer
– Author
– Scribe etc.

Crucially, our search UI exposes this hierarchy so that both broad and narrow searching is possible.
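A minimal Python sketch of how a two-level hierarchy allows both broad and narrow searching follows. The `ROLE_PARENT` table mirrors the roles named on the slide, but the record layout, the function names and the sample data are invented, and this is not a description of the musicSpace implementation.

```python
# Second-level roles and the first-level type they belong to.
ROLE_PARENT = {
    "Composer": "Person",
    "Performer": "Person",
    "Author": "Person",
    "Scribe": "Person",
}

# Invented sample records: each lists (role, name) pairs.
RECORDS = [
    {"id": "rec1", "people": [("Composer", "Monteverdi, Claudio"), ("Scribe", "Immyns, John")]},
    {"id": "rec2", "people": [("Author", "Monteverdi, Claudio")]},
]


def matches(facet, role):
    """A facet matches its own role and every role directly beneath it."""
    return role == facet or ROLE_PARENT.get(role) == facet


def search(records, facet, name):
    """Return ids of records where the name occurs under the chosen facet."""
    return [
        record["id"]
        for record in records
        if any(n == name and matches(facet, role) for role, n in record["people"])
    ]


broad = search(RECORDS, "Person", "Monteverdi, Claudio")
narrow = search(RECORDS, "Composer", "Monteverdi, Claudio")
```

Searching at the ‘Person’ level returns both records, while searching at the ‘Composer’ level returns only the first; the user chooses the level instead of filtering results by hand.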


Adding/exposing granularity

Where possible we add to/expose the granularity of the metadata.

Person ‘Immyns, John [scr]’ becomes Scribe ‘Immyns, John’.

Book ‘Come Death, I shall not fear thee ...’ with author ‘Monteverdi, Claudio’ becomes Book of printed music ‘Come Death, I shall not fear thee ...’ with composer ‘Monteverdi, Claudio’ (because of the MARC leader information).
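Both refinements can be sketched in a few lines of Python. The sketch assumes that the role sits in a trailing bracketed code on the name heading and that notated music is flagged by the character `c` at position 06 of the MARC leader. The one-entry code table, the function names and the synthetic leader strings are illustrative only, and the slides do not describe the project's actual rules.

```python
import re

# Assumed code-to-role table; only the code shown on the slide is included.
RELATOR_ROLES = {"scr": "Scribe"}

# A name followed by a bracketed lowercase code, e.g. "Immyns, John [scr]".
_RELATOR = re.compile(r"^(?P<name>.*?)\s*\[(?P<code>[a-z]+)\]$")


def split_role(heading):
    """Split 'Name [code]' into (role, name); fall back to the broad 'Person'."""
    heading = heading.strip()
    match = _RELATOR.match(heading)
    if match and match.group("code") in RELATOR_ROLES:
        return RELATOR_ROLES[match.group("code")], match.group("name")
    return "Person", heading


def refine_book(leader, creator_role="author"):
    """Use leader position 06 to tell notated music from other books."""
    if len(leader) > 6 and leader[6] == "c":
        return "Book of printed music", "composer"
    return "Book", creator_role


# Synthetic 24-character leaders: only position 06 matters for this sketch.
music_leader = "00000ncm" + " " * 16
text_leader = "00000nam" + " " * 16

scribe = split_role("Immyns, John [scr]")
plain = split_role("Monteverdi, Claudio")
music = refine_book(music_leader)
text = refine_book(text_leader)
```

Headings with no recognised code stay at the broad ‘Person’ level, so nothing is lost when the finer role cannot be determined.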


Generating metadata: Grove works lists


Our Tool for the Works Lists

User Interface


‘musicSpace’ is a faceted browser


Screencast 1: Which scribes have created manuscripts of Monteverdi’s works, and which other composers’ works have they inscribed?


Screencast 2: What recordings of works by Cage exist, which performers have recorded a particular work by Cage, and what else have they recorded?

3. Evaluation


Feedback on speed and ease of use:

‘All the information showed up very quickly, and it was easy to find material. It was really good to have different kinds of material in the same place.’

‘[musicSpace offers] a speedier way to research crossed search pathways.’

‘Excellent interface – very simple to understand.’


Feedback on browsing around a subject or changing the search paradigm:

‘I would recommend musicSpace for its ability to manipulate queries in order to get results that you wouldn’t otherwise be able to get [without starting over].’

‘I liked the ability to explore around a topic once you’ve identified something of interest.’

‘The ability to switch columns around and add new columns was most useful.’


Feedback on improved data granularity:

‘[Without using musicSpace] it would not be at all easy to do an opera character search. You would have to use printed reference books like Pipers Enzyklopädie des Musiktheaters, but even this does not have an index of characters, so you’d have to look at the entry for each opera and manually collate information. You would also have to know what you were looking for before starting out!’

‘I used musicSpace to explore how many operas have a character named Alceste. This information simply isn’t get-at-able using other search interfaces – you’d have to sort through the information by hand.’

An invitation to try our demo

Musicologists:
– Monteverdi recordings
– C19th opera buffa
– Schubert’s songs
– C20th electroacoustic music

Music librarians / library scientists

Music technologists / web scientists


musicspace.mspace.fm

(needs Firefox)


4. The Future


Works lists project with Grove Music Online

Composer URI project


Thank you!
