Upload
david-de-roure
View
897
Download
11
Tags:
Embed Size (px)
DESCRIPTION
"From SALAMI to Social Machines: Music Information Retrieval as an Exemplar of Digital Research, or Fourth Quadrant Research". Keynote by David De Roure at Semantic Media Launch, Barbican, 3 October 2012
Citation preview
David De Roure
From SALAMI to Social Machines:
Music Information Retrieval as an
Exemplar of Digital Research
From SALAMI to Social Machines:
Music Information Retrieval as an
Exemplar of Digital Research, or
Fourth Quadrant Semantic Media
• Digital Research
• SALAMI as an exemplar
• Reconstruct / Reuse / Repurpose
• Social Machines
Overview
...the imminent flood of scientific data expected from the next generation of experiments, simulations, sensors and satellites
Source: CERN, CERN-EX-0712023, http://cdsweb.cern.ch/record/1203203
Bio
Ess
ays,
, 26
(1):
99–
105,
Jan
uary
200
4
http://research.microsoft.com/en-us/collaboration/fourthparadigm/
More people
Mor
e m
achi
nes
A Big Picture
Big DataBig Compute
Conventional Computation
The Future!
SocialNetworking
e-infrastructure
onlineR&D
The Fourth Quadrant
• Data Analysis Pipelines• Workflows are the new
rock and roll• Machinery for
coordinating the execution of services and linking together resources
• Repetitive and mundane boring stuff made easier
E. Science laboris
Carole Goble
• Paul writes workflows for identifying biological pathways implicated in resistance to Trypanosomiasis in cattle
• Paul meets Jo. Jo is investigating Whipworm in mouse.
• Jo reuses one of Paul’s workflow without change.• Jo identifies the biological pathways involved in
sex dependence in the mouse model, believed to be involved in the ability of mice to expel the parasite.
• Previously a manual two year study by Jo had failed to do this.
Reuse, Recycling, Repurposing
Carole Goble
“A biologist would rathershare their toothbrush than
their gene name”
“Data mining: mydata’s mine and
your data’s mine”
Results
Logs
Results
Metadata PaperSlides
Feeds into
produces
Included in
produces Published in
produces
Included in
Included in Included in
Published in
Workflow 16
Workflow 13
Common pathways
QTLPaul’s PackPaul’s Research
Object
Research Objects Reproducibility, Integrated Publishing
• Workflow– Provenance– Conservation & Preservation– Executable Publication
• Human– Credit Tracking– Unit of Scholarship– Crowd management
• Semantics– Acquisition & Publishing– Encoding, Encapsulation & Annotation: OAI-ORE, AO…
Distributed Third Party Tenancy
Alien Store
Technical Objects Social Objects
Carole Goble
1. Use URIs as names for things
2. Use HTTP URIs so that people
can look up those names
3. When someone looks up
a URI, provide useful
information, using the
standards (RDF*, SPARQL)
4. Include links to other URIs
so that they can discover
more things
Linked Data support rdf.myexperiment.org
SELECT?pack ?contribWHERE { ?pack rdf:type mepack:Pack. ?pack ore:aggregates ?contrib.}
SELECT?wf ?uriWHERE { ?wf mebase:has-current-version ?v. ?v mecomp:executes-dataflow ?d. ?d mecomp:has-component ?c. ?c rdf:type mecomp:WSDLProcessor. ?c mecomp:processor-uri ?uri.}
Sean Bechhofer
A digital lab book
replacement that chemists were able to
use, and liked.
1 1 2 2 1 3 1 4
Sample of 4-flourinatedbiphenyl
Add CoolReflux
Butanone Sample ofK2CO3Powder
Weigh
grammes0.9031
Measure
40 ml
Add
Weigh
2.0719 g
text
3 5
Add
g
Sample ofBr11OCB
2 6
Reflux
2 7
Cool
Water
Measure
30 ml
9
Liquid-liquid
extraction
DCM
Measure
3 of 40 ml
10
Dry
MgSO4
11
Filter(Buchner)
12
RemoveSolvent
by RotaryEvaporation
13
Fuse
Silica
14
ColumnChromatography
Ether/PetrolRatio
Butanone dried via silica column andmeasured into 100ml RB flask.
Used 1ml extra solvent to wash outcontainer.
Started reflux at 13.30. (Had tochange heater stirrer) Only reflux
for 45min, next step 14:15.
Inorganics dissolve 2layers. Added brine
~20ml.
Organics are yellowsolution
Washed MgSO4 withDCM ~ 50ml
Measure
excess
Observation Types
weight - grammes
measure - ml, drops
annotate - text
temperature - K, °C
Key
Process
Input
Literal
Observation
Add CoolRefluxAddAdd Reflux Cool Dry Filter Remove
Solventby Rotary
Evaporation
Fuse ColumnChromatography
Dissolve 4-flourinatedbiphenyl inbutanone
Add K2CO3powder
Heat at refluxfor 1.5 hours
Cool and addBr11OCB
Heat atreflux untilcompletion
Cool and addwater (30ml)
Combine organics,dry over MgSO4 &filter
Removesolvent invacuo
Liquid-liquid
extraction
Extract withDCM(3x40ml)
Fuse compound to silica &column in ether/petrol
4 8
Add
Add
text
Annotate
Annotate
text
Weigh
Annotate
g
Annotate Annotate
text text
Future Questions
Whether to have many subclasses of processes or fewer with annotations
How to depict destructive processes
How to depict taking lots of samples
What is the observation/process boundary? e.g. MRI scan
1.5918
Combechem
30 January 2004gvh, hrm, gms
Ingredient List
Fluorinated biphenyl 0.9 gBr11OCB 1.59 gPotassium Carbonate 2.07 gButanone 40 ml
image
To
Do
Lis
tP
lan
Pro
ce
ss
Re
co
rd
Jeremy Frey
• Digital Research
• SALAMI as an exemplar
• Reconstruct / Reuse / Repurpose
• Social Machines
Overview
• Structural Analysis of Large amounts of Music Information
• Musical analysis has traditionally been conducted by individuals and on a small scale
• Computational approach, combined with the huge volume of data now available, will 1. Deliver substantive corpus of musical analyses in
common framework for music scholars and students2. Establish a methodology and tooling so that
community can sustain and enhance this resource
SALAMI
www.diggingintodata.org
Digital Music Collections
Student-sourced ground truth
Community Software
Linked Data Repositories
Supercomputer
23,000 hours ofrecorded music
Music InformationRetrieval Community
Structural Analysis of Large Amounts of Music Information
class structure
Ontology models properties from musicological domain• Independent of Music Information Retrieval research and
signal processing foundations• Maintains an accurate and complete description of
relationships that link them
Segment Ontology
Kevin Page and Ben Fields
Digital Music Collections
ground truth
Evaluation Infrastructure
(sociotechnical)
ExpertiseExpertiseExpertiseExpertise
Community SoftwareCommunity
SoftwareCommunity
Software
Digital Music CollectionsDigital Music CollectionsDigital Music Collections
ground truthground truth
Evaluation Infrastructure
(sociotechnical)
ResultsResultsResultsResults
EvaluationsEvaluationsEvaluations
paperspaperspapersPapers
• Digital Research
• SALAMI as an exemplar
• Reconstruct / Reuse / Repurpose
• Social Machines
Overview
Reusable. The key tenet of Research Objects is to support the sharing and reuse of data, methods and processes. Repurposeable. Reuse may also involve the reuse of constituent parts of the Research Object. Repeatable. There should be sufficient information in a Research Object to be able to repeat the study, perhaps years later. Reproducible. A third party can start with the same inputs and methods and see if a prior result can be confirmed.
Replayable. Studies might involve single investigations that happen in milliseconds or protracted processes that take years.Referenceable. If research objects are to augment or replace traditional publication methods, then they must be referenceable or citeable.Revealable. Third parties must be able to audit the steps performed in the research in order to be convinced of the validity of results.Respectful. Explicit representations of the provenance, lineage and flow of intellectual property.
The R dimensions
Replacing the Paper: The Twelve Rs of the e-Research Record” on http://blogs.nature.com/eresearch/
Machine
Machine
repeat
Machine
repeat
REPRODUCE
MachineMachine
softwarepaper
paper
ResearchRecord
software
SoftwareREPRODUCE OR REPEAT?
software
workflowpaper
Software
wf
Machine
software
workflow
software
blogs.nature.com/eresearch/
Research Objects that are1. The research record for repeatable, reproducible, ... etc2. Describe process (method) for enactment/execution3. Usable by machines as well as humans
– Social Objects– Semantically described– Programmatically accessible– Designed for assistance and automation– Designed for scale and heterogeneity
4. Composable with a distributed computational model?
Computational Research Objects
Notifications and automatic re-runs
Machines are users too
Autonomic
Curation
Self-repair
New research?
• Digital Research
• SALAMI as an exemplar
• Reconstruct / Reuse / Repurpose
• Social Machines
Overview
Real life is and must be full of all kinds of social constraint – the very processes from which society arises. Computers can help if we use them to create abstract social machines on the Web: processes in which the people do the creative work and the machine does the administration… The stage is set for an evolutionary growth of new social engines. Berners-Lee, Weaving the Web, 1999
The Order of Social Machines
• Number of people• Number of machines• Scale of data• Varieties of data• Type of machine
problem solving• Type of human
problem solving
• Empowering of individuals, groups, crowds
• Time criticality• Extent of wide area
communication• Need for urgent
mobilization • Specification of goal state
Dimensions
SOCIAM – The Theory and Practice of Social Machines – commences October 2012, led by Nigel Shadbolt at University of Southampton.
Physical World(people and devices)
Building a Social Machine
Design andComposition
Participation andData supply
Model of social interaction
Virtual World(Network of social interactions) Dave Robertson
The users of a website, the website, and the interactions between them, together form our fundamental notion of a “machine”
More people
Mor
e m
achi
nes
That Big Picture
Big DataBig Compute
Conventional Computation
The Future!
SocialNetworking
e-infrastructure
onlineR&D
The Fourth Quadrant
An Agenda
1. Science has much to learn from an industry / R&D that is already digital end-to-end
• Insights into ICT challenges2. What can we learn from (e-)Science?
• Metadata capture at source, end-to-end semantics• Social objects, semantic objects, audio objects? • Reproducible/reconstructable/machine-assisted
analysis/production• Interactivity, intersection of digital and physical
3. Designing the Social Machines of music*• Human generated metadata / music?
* And also the music of Social Machines!
[email protected]/people/dder
www.scilogs.com/eresearch
@dder
Slide credits: Christine Borgman, Carole Goble, Faith Lawrence & Mike Jewell, Sean Bechhofer, Jeremy Frey, Ichiro Fujinaga, Stephen Downie, Ashley Burgoyne, Kevin Page, Ben Fields, Nigel Shadbolt, Dave Robertson
www.myexperiment.org/packs/337http://www.slideshare.net/davidderoure/fourth-quadrant-semantic-media
• Semantic Mediahttp://semanticmedia.org.uk/
• myExperiment project wikihttp://wiki.myexperiment.org/
• Workflow Forever project (Wf4Ever)http://www.wf4ever-project.org/
• Future of Research Communication (FORCE11)http://force11.org/
• Theory and Practice of Social Machines (SOCIAM)http://sociam.org/
Links
• D. De Roure, C. Goble and R. Stevens. The Design and Realisation of the myExperiment Virtual Research Environment for Social Sharing of Workflows Future Generation Computer Systems 25, pp. 561-567.
• S. Bechhofer, I. Buchan, D De Roure et al. Why linked data is not enough for scientists, Future Generation Computer Systems
• D. De Roure, David and C. Goble, Anchors in Shifting Sand: the Primacy of Method in the Web of Data. WebSci10, April 26-27th, 2010, Raleigh, NC, US.
• D. De Roure, S. Bechhofer, C. Goble and D. Newman, Scientific Social Objects, 1st International Workshop on Social Object Networks (SocialObjects 2011).
• D. De Roure, K. Belhajjame, P. Missier, P. et al Towards the preservation of scientific workflows. 8th International Conference on Preservation of Digital Objects (iPRES 2011).
• Carole A. Goble, David De Roure and Sean Bechhofer Accelerating scientists’ knowledge turns. Will be available at www.springerlink.com
• Khalid Belhajjame, Oscar Corcho, Daniel Garijo et al Workflow-Centric Research Objects: First Class Citizens in Scholarly Discourse, SePublica2012 at ESWC2012, Greece, May 2012
• Kevin R. Page, Ben Fields, David De Roure et al Reuse, Remix, Repeat: The Workflows of MIR, 13th International Society for Music Information Retrieval Conference (ISMIR 2012) Porto, Portugal, October 8th-12th, 2012
• Jun Zhao, Jose Manuel Gomez-Perezy, Khalid Belhajjame et al, Why Workflows Break - Understanding and Combating Decay in Taverna Workflows, eScience 2012, Chicago, October 2012