Event-Based Infrastructure for Reconciling Distributed Annotation Records
Ahmet Fatih Mustacoglu
Advisor: Prof. Geoffrey C. Fox
Outline
- Introduction
- Motivations and research issues
- Architecture
- Event-Based Infrastructure
- Measurements and Analysis
- Conclusions
- Contributions and Future Works
04/19/23 Ahmet Fatih Mustacoglu
Online Collaboration
- Rapid development of annotation tools and services
- Aimed at fostering online collaboration and sharing between users and communities:
  - Bookmarking tools, which support sharing and annotation using keywords called tags (e.g. del.icio.us)
  - Tools for annotation and sharing of scholarly publications: Connotea, CiteULike, BibSonomy
  - Social networking tools (e.g. MySpace and Facebook)
  - Video sharing and annotation (e.g. YouTube)
Motivations
- Various annotation tools, each with different and limited metadata storage
  - Multiple instances of metadata about the same document
  - No time-stamp information for updated records, causing inconsistencies
- Lack of interoperability between annotation sites
  - Applying a service-based architecture to annotation systems
- Unification and federation of major annotation tools, to use them with added capabilities for scientific research
  - Management of metadata coming from different sources
  - Adding missing services
    - Upload and extract metadata to/from a repository
Research Issues I
- Need for an infrastructure to manage metadata
  - Dealing with metadata coming from several sources
- Issues with using annotation tools and their services with added capabilities
  - Extracting and uploading data from/to tools
  - More metadata support for documents
  - Providing communication between annotation tools
- Issues with document tracking and access to previous versions of documents
- Consistency enforcement
  - Issues with maintaining consistency between copies of a record stored at various annotation tools
Research Issues II
- Unification
  - How to combine different annotation tools under the same umbrella?
- Federation
  - How to federate major annotation tools?
- Scalability
  - System behavior under an increasing message rate (messages per second)
- Flexibility and extensibility
  - Interoperability with other clients
  - Ease of integrating an annotation tool
Event-Based Infrastructure and Consistency Enforcement Architecture
KEY CONCEPTS
- Distributed Annotation Record (DAR): a collection of metadata stored at an annotation tool.
- Digital Entity (DE): a digital collection of metadata for a citation, stored in a system database, that forms the primary copy of a DAR.
- Event: a time-stamped action on a digital entity
  - Major events: insertion or deletion of a digital entity
  - Minor events: modifications to an existing digital entity
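The event vocabulary above could be modeled roughly as follows. This is a minimal sketch rather than the system's actual data model: the names `Event` and `EventKind`, the action strings, and the field layout are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class EventKind(Enum):
    MAJOR = "major"  # insertion or deletion of a digital entity
    MINOR = "minor"  # modification of an existing digital entity


# Hypothetical action names; the slides only separate insert/delete from modify.
MAJOR_ACTIONS = {"insert", "delete"}


@dataclass(frozen=True)
class Event:
    """A time-stamped action on a digital entity (DE)."""
    entity_id: str
    action: str  # "insert", "delete", or "modify"
    changes: dict = field(default_factory=dict)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @property
    def kind(self) -> EventKind:
        return EventKind.MAJOR if self.action in MAJOR_ACTIONS else EventKind.MINOR


created = Event("de-1", "insert", {"title": "Web 2.0 for Grids and e-Science"})
retagged = Event("de-1", "modify", {"tags": ["grid", "web2.0"]})
print(created.kind.value, retagged.kind.value)  # prints "major minor"
```

Deriving the kind from the action, instead of storing it, keeps the major/minor distinction from drifting out of sync with what the event actually did.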
Communication Manager
- Responsible for providing communication between the annotation tools and the Update Manager and Digital Entity Manager via gateways (e.g. the Connotea gateway)
- Utilizes a gateway for each annotation tool, and a parser
  - Retrieves records in XML format
  - Parses records and passes them to the Update Manager
  - Posts updates coming from the Digital Entity Manager to the annotation tools
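The retrieve-and-parse step might look like the sketch below. The XML layout is invented for illustration; the slides do not show the format any annotation tool returns, so the element names, the `id` attribute, and the helper `parse_records` are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: element and attribute names are made up and do not
# reflect any annotation tool's real export format.
SAMPLE_XML = """
<records>
  <record id="r1">
    <title>Web 2.0 for Grids and e-Science</title>
    <tag>grid</tag>
    <tag>web2.0</tag>
  </record>
  <record id="r2">
    <title>Hybrid Consistency Framework</title>
  </record>
</records>
"""


def parse_records(xml_text: str) -> list:
    """Turn an XML document of records into plain dictionaries."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for node in root.findall("record"):
        records.append({
            "id": node.get("id"),
            "title": node.findtext("title", default=""),
            "tags": [tag.text for tag in node.findall("tag")],
        })
    return records


parsed = parse_records(SAMPLE_XML)
print([r["id"] for r in parsed])  # prints ['r1', 'r2']
```

Normalizing every tool's records into the same dictionary shape at this point means the later update and consistency logic does not need to know which tool a record came from.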
Communication Manager
- Gateway
  - Interface between the Event-Based Infrastructure (EBI) and each annotation tool
  - Provides extensibility
  - A gateway needs to be deployed for each annotation tool that is to be integrated into the system
[Figure: diagram with the labels Gateways, EBI Modules, EBI, and Annotation Tools]
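One way to picture the gateway-per-tool design is an abstract interface with one implementation per annotation tool. The sketch below is an assumption about shape only: `Gateway`, `InMemoryGateway`, `fetch_records`, `post_update`, and `broadcast` are hypothetical names, and the stand-in gateway stores records in a dictionary instead of calling a real service.

```python
from abc import ABC, abstractmethod


class Gateway(ABC):
    """Interface between the infrastructure and one annotation tool."""

    @abstractmethod
    def fetch_records(self) -> list:
        """Pull the tool's current records."""

    @abstractmethod
    def post_update(self, record_id: str, changes: dict) -> None:
        """Push a change to the tool."""


class InMemoryGateway(Gateway):
    """Stand-in gateway that keeps records in a dict instead of calling a service."""

    def __init__(self):
        self.records = {}

    def fetch_records(self) -> list:
        return [{"id": rid, **fields} for rid, fields in self.records.items()]

    def post_update(self, record_id: str, changes: dict) -> None:
        self.records.setdefault(record_id, {}).update(changes)


# Integrating another tool means registering one more gateway here.
gateways = {"connotea": InMemoryGateway(), "citeulike": InMemoryGateway()}


def broadcast(record_id: str, changes: dict) -> None:
    """Send the same update through every registered gateway."""
    for gateway in gateways.values():
        gateway.post_update(record_id, changes)


broadcast("r1", {"title": "Event-Based Infrastructure"})
print(gateways["citeulike"].fetch_records())
```

Because callers only see the `Gateway` interface, adding a tool touches the registry and nothing else, which is the extensibility property the slide describes.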
Annotation Tools Update Manager
- Responsible for:
  - Retrieving the records from annotation tools periodically (time-based consistency approach, by pulling records)
  - Finding out the updates
  - Passing the updates to the Digital Entity Manager so that they can be applied to the primary copy of each record
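A single pull-and-compare cycle could be sketched as below. The slides do not say how updates are detected or how often records are pulled, so the field-by-field comparison and the names `find_updates` and `poll_once` are assumptions; a real deployment would run the cycle on a timer.

```python
def find_updates(primary: dict, pulled: dict) -> dict:
    """Return, per record id, the fields whose pulled value differs from the primary copy."""
    updates = {}
    for record_id, fields in pulled.items():
        known = primary.get(record_id, {})
        changed = {k: v for k, v in fields.items() if known.get(k) != v}
        if changed:
            updates[record_id] = changed
    return updates


def poll_once(fetch, primary: dict, apply_update) -> int:
    """One polling cycle: pull, diff, hand updates on. Returns the number of updated records."""
    updates = find_updates(primary, fetch())
    for record_id, changed in updates.items():
        apply_update(record_id, changed)
    return len(updates)


primary_copy = {"r1": {"title": "Old title", "tags": ["grid"]}}
tool_copy = {"r1": {"title": "New title", "tags": ["grid"]}, "r2": {"title": "Added"}}


def apply_to_primary(record_id, changed):
    """Stand-in for handing an update to the Digital Entity Manager."""
    primary_copy.setdefault(record_id, {}).update(changed)


print(poll_once(lambda: tool_copy, primary_copy, apply_to_primary))  # prints 2
print(poll_once(lambda: tool_copy, primary_copy, apply_to_primary))  # prints 0
```

The second cycle finds nothing to do, which is the behavior a time-based approach relies on: between polls the copies may diverge, but each poll brings the primary copy back in line.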
Digital Entity Manager
- Responsible for:
  - Event and dataset creation
  - Event processing
- Manages updates made to the primary copy of a digital entity
  - Updates the primary copy located in a system database
  - Passes updates to the Communication Manager (strict consistency, by pushing updates immediately)
- Handles periodic update management
- Deals with history and rollback management of a digital entity
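History and rollback follow naturally if the primary copy is kept as a log of events. The sketch below assumes that design; the class `DigitalEntityStore`, its methods, and the use of small integers as logical timestamps are illustrative choices, not details taken from the slides.

```python
class DigitalEntityStore:
    """Keeps the primary copy of each entity as an append-only list of events."""

    def __init__(self, push=None):
        self.history = {}  # entity id -> list of (timestamp, changes)
        self.push = push or (lambda entity_id, changes: None)

    def apply(self, entity_id: str, timestamp: int, changes: dict) -> None:
        self.history.setdefault(entity_id, []).append((timestamp, dict(changes)))
        # Strict consistency: propagate immediately instead of waiting for a poll.
        self.push(entity_id, changes)

    def state_at(self, entity_id: str, timestamp: int) -> dict:
        """Rebuild the entity as it was at `timestamp` by replaying earlier events."""
        state = {}
        events = sorted(self.history.get(entity_id, []), key=lambda e: e[0])
        for ts, changes in events:
            if ts > timestamp:
                break
            state.update(changes)
        return state


pushed = []
store = DigitalEntityStore(push=lambda eid, ch: pushed.append((eid, ch)))
store.apply("de-1", 1, {"title": "Draft"})
store.apply("de-1", 2, {"title": "Final", "year": 2007})

print(store.state_at("de-1", 1))  # prints {'title': 'Draft'}
print(store.state_at("de-1", 2))  # prints {'title': 'Final', 'year': 2007}
```

Rolling back never deletes anything here: an earlier state is recomputed from the log, so later events stay available if the rollback itself needs to be undone.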
Key Design Features
- Representation of the metadata of documents coming from various sources as events
  - Major and minor events
  - More metadata support than the major current annotation tools
  - Ability to access and roll back to previous versions of documents
- Unification and federation of the Connotea, Delicious, and CiteULike tools, and support for web-based academic search tools for scientific research
  - Using annotation tools' existing services with added capabilities
  - Support for major online search tools to collect metadata
  - Provides communication among annotation tools
- Leveraging interoperability via a service-enabled architecture
- Keeps records located at annotation tools and in a system database consistent with each other
  - Adopting time-based and strict consistency approaches
Use Cases
- Collaborative tagging
  - Updating or assigning keywords to records
- Collecting and managing citation metadata
  - Obtaining metadata about a publication through online scholarly search tools or annotation tools
- Unification and federation of the Connotea, CiteULike, and Delicious annotation tools
  - Providing schema and communication among them
- Tracking updates to documents
  - Rolling back to previous states
- Building versions of documents based on users, groups, or all events
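Building a version "based on users, groups, or all events" can be read as replaying a filtered event log. The sketch below takes that reading; the event dictionaries, the user names, and the function `build_version` are hypothetical, and a group is represented simply as a set of its members.

```python
def build_version(events, users=None):
    """Replay events in time order, optionally keeping only those made by `users`."""
    state = {}
    for event in sorted(events, key=lambda e: e["time"]):
        if users is None or event["user"] in users:
            state.update(event["changes"])
    return state


# Hypothetical event log for one document; user names and fields are made up.
log = [
    {"time": 1, "user": "alice", "changes": {"title": "EBI", "tags": ["events"]}},
    {"time": 2, "user": "bob", "changes": {"tags": ["events", "consistency"]}},
    {"time": 3, "user": "alice", "changes": {"year": 2007}},
]

print(build_version(log))                   # version built from all events
print(build_version(log, users={"alice"}))  # version built from one user's events
```

In the second call the tags stay as alice left them, because bob's retagging event is filtered out before replay.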
Benchmarks and Environments
- Message rate scalability investigation
  - MoreInfo operation
    - With DB access
    - With memory utilization
  - Update DE operation
- We have used:
  - Java 2 Standard Edition compiler, version 1.5.0_12
  - Maximum heap size of the Java Virtual Machine (JVM) set to 1024 MB
  - Apache Tomcat Server, version 5.0.28
  - Apache Axis, version 1.2
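A message-rate test of this kind can be approximated with a paced sender that times each request. The sketch below is only an illustration of the measurement idea, not the harness used for these benchmarks: it runs in a single thread, `fake_more_info` is a stand-in for a real service call, and all names are assumptions.

```python
import statistics
import time


def measure_round_trip(handler, n_messages: int, rate_per_sec: float) -> float:
    """Send `n_messages` at roughly `rate_per_sec`; return the mean round trip time in msec."""
    interval = 1.0 / rate_per_sec
    samples = []
    next_send = time.perf_counter()
    for i in range(n_messages):
        # Pace the sender so requests go out at the target rate.
        delay = next_send - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
        start = time.perf_counter()
        handler(i)
        samples.append((time.perf_counter() - start) * 1000.0)
        next_send += interval
    return statistics.mean(samples)


def fake_more_info(message_id: int) -> dict:
    """Stand-in for a service call; a real benchmark would invoke the web service here."""
    return {"id": message_id, "status": "ok"}


for rate in (200, 400, 800):
    mean_ms = measure_round_trip(fake_more_info, n_messages=50, rate_per_sec=rate)
    print(f"{rate} msg/sec -> {mean_ms:.4f} msec")
```

A sequential sender cannot exceed the rate at which the handler itself completes, so a realistic load test would issue requests from several concurrent clients.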
Message rate scalability investigation result (DB Usage) - I
[Figure: average round trip time (msec) against message rate (messages per second), series "more info message rate"; x axis from 200 to 1000, y axis from 2.5 to 7]
Message rate scalability investigation result (Memory Utilization) - II
[Figure: average round trip time (msec) against message rate (messages per second), series "more info message rate"; x axis from 200 to 1600, y axis from 1.5 to 4]
Message rate scalability investigation result (Update DE) - III
[Figure: average round trip time (msec) against message rate (messages per second), series "update message rate"; x axis from 150 to 650, y axis from 2 to 7]
Overheads for updating Memory and DB

Message rate    Overhead time, DB   STDev for DB   Overhead time, memory   STDev for memory
(message/sec)   (msec)                             (msec)
266             6.88                0.85           0.93                    0.37
432             6.79                0.75           0.98                    0.34
593             6.85                0.74           0.96                    0.35
715             6.75                0.74           0.96                    0.34
803             6.82                0.75           0.96                    0.35
877             6.88                0.71           0.96                    0.36
963             6.89                0.79           0.98                    0.35
1030            6.75                0.74           0.97                    0.34
1088            6.86                0.72           0.97                    0.35
1115            6.74                0.72           0.96                    0.35
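The table can be summarized with a few lines of arithmetic. The snippet below only recomputes averages from the figures listed above; the variable names are arbitrary and no new measurements are introduced.

```python
from statistics import mean

# (message rate, DB overhead in msec, memory overhead in msec), copied from the table.
rows = [
    (266, 6.88, 0.93), (432, 6.79, 0.98), (593, 6.85, 0.96), (715, 6.75, 0.96),
    (803, 6.82, 0.96), (877, 6.88, 0.96), (963, 6.89, 0.98), (1030, 6.75, 0.97),
    (1088, 6.86, 0.97), (1115, 6.74, 0.96),
]

db_mean = mean(r[1] for r in rows)
mem_mean = mean(r[2] for r in rows)
db_spread = max(r[1] for r in rows) - min(r[1] for r in rows)

print(f"mean DB overhead:     {db_mean:.2f} msec")
print(f"mean memory overhead: {mem_mean:.2f} msec")
print(f"DB / memory ratio:    {db_mean / mem_mean:.1f}x")
print(f"DB overhead spread:   {db_spread:.2f} msec across all rates")
```

On these figures the database overhead averages about 6.8 msec against just under 1 msec for memory, and neither changes much as the message rate rises from 266 to 1115 messages per second.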
Contributions
- System research
  - Event-Based Infrastructure
  - Unification, federation, and interoperability of the Connotea, Delicious, and CiteULike annotation tools
  - Strategies for increasing performance and scalability via a top-to-bottom approach and memory utilization
  - Handling various types of metadata coming from several sources
  - Flexibility to access previous versions of a document
  - Adopting consistency enforcement approaches to maintain consistency
  - Comprehensive benchmarks to evaluate the scalability of the prototype system
- System software
  - An implementation of the Event-Based Infrastructure of the Internet Documentation and Integration of Metadata (IDIOM) system
  - An implementation of a consistency maintenance mechanism for the IDIOM system
Future Works
- Applying the Event-Based Infrastructure to a broader range of application use cases
  - Supporting video collaboration tools (e.g. YouTube)
  - Social networking (e.g. Facebook)
- Unification and federation of other academic collaboration and publication tools into EBI (e.g. BibSonomy)
- Moving from a single metadata store to distributed storage
Publications
Book Chapters
1. Web 2.0 for Grids and e-Science; Geoffrey C. Fox, Rajarshi Guha, Donald F. McMullen, Ahmet Fatih Mustacoglu, Marlon E. Pierce, Ahmet E. Topcu, David J. Wild. Published by Springer, 2007, in Grid Enabled Remote Instrumentation (Chapter: Web 2.0 for Grids and e-Science)
Publications
1. Hybrid Consistency Framework for Distributed Annotation Records in a Collaborative Environment; Ahmet Fatih Mustacoglu and Geoffrey Fox
2. Web 2.0 for E-Science Environments Keynote Presentation; Geoffrey C. Fox, Marlon E. Pierce, Ahmet Fatih Mustacoglu, Ahmet E. Topcu
3. Integration of Collaborative Information Systems in Web 2.0; Ahmet E. Topcu, Ahmet Fatih Mustacoglu, Geoffrey Fox, Aurel Cami
4. SRG: A Digital Document-Enhanced Service Oriented Research Grid; Geoffrey Fox, Ahmet Fatih Mustacoglu, Ahmet E. Topcu, Aurel Cami
5. AJAX Integration Approach for Collaborative Calendar-Server Web Services; Ahmet Fatih Mustacoglu, Geoffrey Fox
6. A Novel Event-Based Consistency Model for Supporting Collaborative Cyberinfrastructure Based Scientific Research; Ahmet Fatih Mustacoglu, Ahmet E. Topcu, Aurel Cami, Geoffrey Fox
7. iCalendar (RFC2445) Compatible Collaborative Calendar-Server Services; Ahmet Fatih Mustacoglu, Wenjun Wu, Geoffrey Fox
Tools for Annotation and Sharing Publications
- They are used for:
  - Collecting data and metadata
  - Annotating data
  - Sharing papers
- Limitations of these tools:
  - Different and limited metadata storage
  - Need to enter the same entry into each tool
  - No timing information for updated records
  - Lack of ability to transfer data between tools
  - Lack of services to extract and import data into a repository
  - Lack of services to upload data from a repository