LinkSphere: P2P Cross Database Search -- Architecture and Issues

Hugo Mills, University of Reading


LinkSphere

• Linking Researchers and their Data

• Social networking for researchers

• Cross-database search

– Mostly Arts and Humanities datasets

– “Promoting serendipity”

– Access by and presentation of datasets to wider audiences

Datasets

Museums, Archives, Archaeology:

• Silchester Excavation, IADB

• Ure Museum of Classical Archaeology

• CentAUR: ePrints Library

• Beckett Collection

• Cole Museum of Zoology

• Film Collection

• Herbarium

• Typography Collections

Tycho

• Fully asynchronous peer-to-peer communications framework

• Written in Java

• Fully distributed

• Robust

– “A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” (Leslie Lamport)

• Has a simple distributed data store (“Virtual Registry”) for client metadata

• (Relatively) lightweight

– 3MiB for a fully functional system

• Fast

• Flexible, Extensible

– Bootstrap handlers

– Additional message types

– VR extensions

– Alternative communication protocols

– Discovery of core mediators via Bonjour/ZeroConf

XDB System Architecture

[Architecture diagram. Components shown: Search Apps, REST search API, Tycho Core, several VR (Virtual Registry) instances, several Meta components, and several Repos accessed via JDBC, Web API, SPARQL, ...]

User Interface

• Main UI is web-based

– Uses AJAX

– Currently embedded within the LinkSphere project site

– Will ultimately move to the social networking site (SNS)

• Any UI possible using the REST API

Issues

• Getting the data is hard

– Implementation problems

– Maintenance problems

– Admin problems

– Social problems

– Legal problems

“Muddling along”

• Archive of material for intra-departmental use only

– Some legal issues involved

• Group of technicians administering the data

– Poor quality data

• Excel spreadsheet(!)

• Reluctant to have index of material made public

“Not ready yet”

• Big university projects

• New systems, (potentially) large data sets

• MERL museums archive (AdLib)

– Data all loaded from previous systems

– Access modules not yet installed

• CentAUR publications archive (ePrints 3)

– Very little data available yet

“Works For Me”

• Custom web application

– PHP, sophisticated

• External developer

• No documentation

• MySQL underneath

“It works, but...” (part 1)

• Non-technical users

• Admins are Mac-only, desktop-only people

• FileMaker Pro

• DB structure and UI developed externally

– No documentation

– This has bad implications

“It works, but...” (part 2)

• Completely custom application

– External developer

– No documentation (again)

– Large lump of write-only Perl

• Custom data store

– Not SQL. Not XML. Not RDF.

• No external access

Unreachable data

• Uncommunicative systems

• Custom applications

– Developers/administrators AWOL

• Custom data models

• Lost passwords

• Excel spreadsheets

– See also, “Uncommunicative”

Unreachable data (continued)

• Private data

– Legal issues

– Possessive owners

• Internal use only

• Poor quality

• No data!

Conclusions

• Building the software is easy

• There is still lots of hard-to-reach data out there

• Issues are largely not technical

• More outreach to A&H areas needed

Acknowledgements and thanks

• LinkSphere team: Mark Baker, Shirley Williams, Pat Parslow (Reading), Claire Warwick, Melissa Terras, Claire Ross (UCL)

• Repository owners at Reading: Amy Smith (Ure Museum), Guy Baxter (University Archivist), Mary Dyson, Hadj Messelles (Typography), Jonathan Bignell (Film Studies), Alison Sutton (CentAUR), Mike Fulford, Amanda Clarke (Silchester)

• JISC VRE 3 programme

Tycho Architecture

REST Interface

• /api/query

– POST to start new query asynchronously

• /api/query/query_id

– GET for query metadata

– DELETE to cancel query (or it will time-out naturally)

• /api/query/query_id/start/finish

– GET a range of results from the query

• Feedback API coming soon
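The query endpoints listed above can be sketched as a small Python client. This is only an illustration under stated assumptions: the base URL, the form-encoded body, and the `q` field name are hypothetical, since the slides give just the paths and HTTP methods. The functions build requests without sending them; passing one to `urllib.request.urlopen` would perform the call.

```python
import urllib.parse
import urllib.request

# Hypothetical base URL; the slides specify only the paths under /api.
BASE_URL = "http://localhost:8080"


def start_query(terms: str) -> urllib.request.Request:
    """POST /api/query: start a new query asynchronously.

    The form-encoded body and the "q" field name are assumptions.
    """
    body = urllib.parse.urlencode({"q": terms}).encode("utf-8")
    return urllib.request.Request(f"{BASE_URL}/api/query", data=body, method="POST")


def query_metadata(query_id: str) -> urllib.request.Request:
    """GET /api/query/query_id: fetch metadata for a running query."""
    quoted = urllib.parse.quote(query_id, safe="")
    return urllib.request.Request(f"{BASE_URL}/api/query/{quoted}", method="GET")


def cancel_query(query_id: str) -> urllib.request.Request:
    """DELETE /api/query/query_id: cancel a query before it times out."""
    quoted = urllib.parse.quote(query_id, safe="")
    return urllib.request.Request(f"{BASE_URL}/api/query/{quoted}", method="DELETE")


def result_range(query_id: str, start: int, finish: int) -> urllib.request.Request:
    """GET /api/query/query_id/start/finish: fetch a range of results."""
    if start < 0 or finish < start:
        raise ValueError("expected 0 <= start <= finish")
    quoted = urllib.parse.quote(query_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/api/query/{quoted}/{start}/{finish}", method="GET"
    )
```

Because the query runs asynchronously, a client would typically start it, then fetch ranges of results as they become available, and either cancel it explicitly or let it time out.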

REST Interface (continued)

• /api/repository

– GET list of repositories currently online

• /api/repository/repo_id

– GET for repository metadata

• Link to repository itself

• Link to LinkSphere description of it
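A matching sketch for the repository endpoints, again in Python. The base URL is hypothetical, and the slides do not state the response format: the JSON encoding and the key names `repository` and `description` below are assumptions chosen only to illustrate the two links the metadata is said to carry.

```python
import json
import urllib.parse
import urllib.request
from dataclasses import dataclass

# Hypothetical base URL; the slides specify only the paths under /api.
BASE_URL = "http://localhost:8080"


def list_repositories() -> urllib.request.Request:
    """GET /api/repository: list the repositories currently online."""
    return urllib.request.Request(f"{BASE_URL}/api/repository", method="GET")


def repository_metadata(repo_id: str) -> urllib.request.Request:
    """GET /api/repository/repo_id: fetch metadata for one repository."""
    quoted = urllib.parse.quote(repo_id, safe="")
    return urllib.request.Request(f"{BASE_URL}/api/repository/{quoted}", method="GET")


@dataclass
class RepositoryLinks:
    repository_url: str   # link to the repository itself
    description_url: str  # link to the LinkSphere description of it


def parse_links(payload: str) -> RepositoryLinks:
    """Parse repository metadata, assuming JSON with hypothetical keys."""
    doc = json.loads(payload)
    return RepositoryLinks(doc["repository"], doc["description"])
```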
