A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Page 1: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Weisong Chen, Cho-Li Wang, Francis Lau
The University of Hong Kong

Page 2: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Presentation Outline

Ubiquitous Data Management
Design Philosophy
System Design
  Ontology-based Metadata
  Incentive-based Routing
  Cooperative Caching
Performance Evaluation
Conclusion

Page 3: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Data Dilemma

We are drowning in an ocean of data, but thirsty for useful information.

Can you recall the last time you wanted to access a document sent by one of your friends? You knew it must be somewhere, but you just could not find it.

Even the computer's search cannot help, since you forgot the file/directory name.

Page 4: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Ubiquitous Data

The data dilemma will only become more severe in ubiquitous environments.
Users roam from smart space to smart space, leaving all kinds of data there.
At any time and in any location, users may want to access certain data.

A scenario: A person uses his mobile phone to take photos as he moves around. Because of limited storage space, he cannot store all the photos taken, so he chooses to offload photos to nearby stable stations. Some time later, he would like to view the photos, regardless of his location. He may also want to share his photos with friends.

[Figure labels: "Tsuruga Castle"; "Where is the photo of Alice?"]

Page 5: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Data Management Challenges

Users move from place to place. (High Mobility)
Data are stored everywhere. (High Distribution)
Devices have different capabilities and use different means to store and access data. (High Heterogeneity)
A user cannot consistently control all the smart spaces he or she has ever interacted with. (High Autonomy)
Users generate certain data and may want to access data generated by others. (Sharing and Collaboration)
Others: resource-constrained devices, unreliable connectivity, ...

Page 6: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Design Philosophy

We incorporate the following philosophy in our system design:
Computers should mimic human society to locate data of interest.
Computers share not only data but also indexing information (metadata), and contribute to the overall infrastructure.
Metadata are widely propagated, while data are moved only to the locations where they are needed.
Incentives should be granted to devices that help others find the required data, to foster cooperation and encourage contribution. (If I help more, I shall find things faster!)

Page 7: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

The Abstract Model

[Figure: the abstract model of a pervasive computing environment. Labels: Shared/Public Devices; Private Devices; Smart Space 1 (LAN-connected); Smart Space 2 (WLAN-connected); Smart Space 3 (Ad-Hoc Network); Internet]

Page 8: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Node Architecture

All devices adopt the same architecture, consisting of a routing table, a cache store, and ontology knowledge.
Routing table: a list of routing knowledge, where routing knowledge = metadata + node ID.
Cache store: cached data.
Ontology knowledge: used to organize the routing table and the cache store.
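The three components of a node can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, the tuple encoding of metadata, and the `learn` helper are hypothetical and do not come from the slides, which only say that routing knowledge pairs metadata with a node ID.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RoutingKnowledge:
    """One routing entry: metadata about a data object plus the node believed to hold it."""
    metadata: tuple  # hypothetical flat encoding, e.g. (("concept", "Photo"), ...)
    node_id: str


@dataclass
class Node:
    node_id: str
    routing_table: list = field(default_factory=list)  # list of RoutingKnowledge
    cache_store: dict = field(default_factory=dict)    # data ID -> cached data
    ontology: dict = field(default_factory=dict)       # concept -> parent concept

    def learn(self, knowledge):
        """Add a routing entry unless an identical one is already stored."""
        if knowledge not in self.routing_table:
            self.routing_table.append(knowledge)


phone = Node("N1", ontology={"Photo": "Data"})
entry = RoutingKnowledge((("concept", "Photo"), ("place", "Tsuruga Castle")), "N5")
phone.learn(entry)
phone.learn(entry)  # duplicate entry, ignored
print(len(phone.routing_table))  # 1
```

Keeping the routing table and the cache store as separate fields mirrors the slide's distinction between widely propagated metadata and data that is moved only where needed.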

Page 9: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Ontology

An ontology is a formal, explicit specification of a shared conceptualization.*
"Formal" implies that the ontology should be machine processable.
"Explicit" means that ontology knowledge is explicitly defined.
"Shared" indicates that the ontology is agreed-upon, consensual knowledge.

*R. Studer, V.R. Benjamins, and D. Fensel, "Knowledge Engineering: Principles and Methods," Data and Knowledge Engineering, vol. 25, 1998, pp. 161-197.

Page 10: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Sample Ontology

Page 11: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Representation of Ontology

*http://www.semanticweb.org/ontologies/swrc-onto-2001-12-11.daml

Page 12: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Usage of Ontology in Ubiquitous Environments

Shared conceptualization
  Abstracting data diversity
  Facilitating information exchange
Formal and explicit
  Providing context-awareness support
  Describing user profiles and preferences
  Maintaining accurate routing knowledge
  Promoting reasoning and automatic processing
  ...

Page 13: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Ontology-based Metadata

[Figure labels: Concept layer; Instance layer]

Metadata (indexing information) are aggressively and widely propagated!

Page 14: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Metadata Similarity Function

The calculation of metadata similarity consists of calculating concept similarity, instance similarity, and literal value similarity.

Page 15: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Incentive-based Routing

"Free riding" is a prevalent problem in existing P2P systems.
Selfish nodes exploit the resources of other nodes without making any contribution.
Generous nodes share resources with others but gain no benefit.

Page 16: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Incentive-based Routing (Cont'd)

All devices interact in a peer-to-peer fashion.
Devices forward received queries to the devices that are most likely to have the required data.
Once query results are found, the corresponding metadata are sent back to the initiating device through the reverse query path.
The intermediate nodes that helped to find the results incorporate the returned metadata into their routing tables, based on their current knowledge (the stored ontology), which enhances their ability to serve their own subsequent queries.

Page 17: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Incentive-based Routing (Cont'd)

There is no restriction on the search mechanism: DFS, BFS, etc.

On a hit, metadata is sent back to the initiating device (N1) through the reverse query path; all nodes on the path (N5, N3, N2, N1) update their routing entries.
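The reverse-path update can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: it assumes a depth-first search, reduces metadata to a plain string key, and uses a hypothetical topology (including an extra node N4 that is off the successful path) and hypothetical names such as `route_query`.

```python
def route_query(nodes, links, start, wanted, ttl=7):
    """Depth-first search for `wanted`; returns the query path on a hit, else None."""
    def visit(node, path, ttl_left):
        if wanted in nodes[node]["data"]:
            return path
        if ttl_left == 0:
            return None
        for nxt in links.get(node, []):
            if nxt not in path:  # avoid revisiting nodes already on this path
                found = visit(nxt, path + [nxt], ttl_left - 1)
                if found:
                    return found
        return None

    path = visit(start, [start], ttl)
    if path:
        holder = path[-1]
        # Metadata travels back along the reverse query path; every node
        # on that path records which node holds the data.
        for node in reversed(path):
            nodes[node]["routing"][wanted] = holder
    return path


nodes = {n: {"data": set(), "routing": {}} for n in ["N1", "N2", "N3", "N4", "N5"]}
nodes["N5"]["data"].add("photo-of-Alice")
links = {"N1": ["N2"], "N2": ["N4", "N3"], "N3": ["N5"]}

path = route_query(nodes, links, "N1", "photo-of-Alice")
print(path)                      # ['N1', 'N2', 'N3', 'N5']
print(nodes["N2"]["routing"])    # {'photo-of-Alice': 'N5'}
print(nodes["N4"]["routing"])    # {}
```

N4 was visited but did not lie on the successful path, so it learns nothing; only nodes that actually helped gain routing knowledge.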

Page 18: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Incentive-based Routing (Cont'd)

Devices gain routing knowledge by helping others find the required data.
The more a device contributes to the success of others' information access, the more (accurate) routing knowledge it gains.
Therefore, devices are given incentives to contribute to others' information access.
The net effect is that all devices become more generous, and all of them benefit.

Page 19: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Incentive-based Routing (Cont'd)

Ubiquitous devices are mostly small, resource-constrained devices. User profiles described by the ontology can be used to select and retain the most important knowledge.
On the other hand, devices may expand received queries using concept generalization and specialization.
Devices handle received routing knowledge and queries according to their respective capabilities.
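One way a resource-constrained device might use a user profile to decide which routing knowledge to retain is sketched below. The scoring rule, the interest weights, and the `retain` function are all assumptions made for illustration; the slides do not specify how the selection is done.

```python
def retain(entries, profile, capacity):
    """Keep the `capacity` routing entries whose concepts best match the user profile."""
    def score(entry):
        concept, _node_id = entry
        return profile.get(concept, 0.0)  # concepts absent from the profile score 0
    # sorted() is stable, so equally scored entries keep their original order.
    return sorted(entries, key=score, reverse=True)[:capacity]


profile = {"Photo": 1.0, "Report": 0.4}  # hypothetical interest weights
entries = [("Software", "N7"), ("Photo", "N5"), ("Report", "N3"), ("Photo", "N2")]

kept = retain(entries, profile, capacity=2)
print(kept)  # [('Photo', 'N5'), ('Photo', 'N2')]
```

A device with more storage would simply call `retain` with a larger `capacity`.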

Page 20: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Incentive-based Routing (Cont'd)

Encourage devices that are richer in:
  Network bandwidth: to forward queries to more neighbors
  Processing power: to expand queries with more levels of generalization and specialization
  Storage space: to retain more received routing knowledge

Page 21: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Cooperative Caching

[Figure labels: Receiving Query; Stored Routing Knowledge; Shared Cached Data; Original Data; Reusing the Cached Data]

Page 22: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Performance Evaluation

We modify the simulation system used by NeuroGrid.
Use TTL to control the termination of the experiment.
Run each experiment for 20,000 iterations.
Measure the percentage of queries that were served and the number of messages sent for each query.
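The two measured quantities can be computed from per-query records as in the sketch below. The record format and the `summarize` function are assumptions, and the sample values are made up for illustration; they are not results from the experiments.

```python
def summarize(results):
    """results: list of (served, messages_sent) pairs, one per query."""
    total = len(results)
    served = sum(1 for ok, _ in results if ok)
    messages = sum(count for _, count in results)
    return {"hit_ratio": served / total, "avg_messages": messages / total}


# Four made-up query records: (was the query served?, messages sent).
sample = [(True, 3), (False, 7), (True, 5), (True, 1)]
stats = summarize(sample)
print(stats)  # {'hit_ratio': 0.75, 'avg_messages': 4.0}
```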

Page 23: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Parameter Settings

Parameter                         Base Value
Number of shared devices          100
Number of private devices         100
Total number of data objects      3000
Size of cache memory              5 MB
Starting TTL of the queries       7
Total number of queries issued    20,000
Disconnection probability         20%
Metadata similarity threshold     0.9
All tuning parameters             1

Page 24: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Evaluation: Ontology-based Metadata

To test the effect of ontology-based metadata, we conduct three simulations using the same parameter settings but with:
  Ontology-based metadata
  Keyword-based metadata
  ID-based metadata
We compare hit ratios, average search messages sent per query, and processing overheads.

Page 25: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Ontology-based Metadata

[Figure: Comparison with Keyword-based and ID-based Metadata. (a) Comparison of Hit Ratios; (b) Comparison of Search Messages Sent]

40% higher hit ratio
4 fewer hops required

Page 26: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Ontology-based Metadata: Processing Time Overhead

[Figure: Comparison of Processing Overhead]

Uses 1000 fewer operations

Page 27: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Incentive-based Routing

User devices are divided into three groups with different participation levels:
  Level 1: devices do not forward queries for others (selfish).
  Level 2: devices forward received queries to randomly selected peers.
  Level 3: devices forward received queries using their best routing knowledge.
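The three participation levels amount to three forwarding policies, which can be sketched as follows. The function name, the `fanout` parameter, and the numeric routing scores are hypothetical; the slides only describe the behavior of each level.

```python
import random


def choose_next_hops(level, neighbors, routing_scores, fanout=1, rng=random):
    """Pick the neighbors a device forwards a received query to, by participation level."""
    if level == 1:
        return []  # selfish: never forwards queries for others
    if level == 2:
        # forwards to randomly selected peers
        return rng.sample(neighbors, min(fanout, len(neighbors)))
    # level 3: forwards to the neighbors its routing knowledge ranks highest
    ranked = sorted(neighbors, key=lambda n: routing_scores.get(n, 0.0), reverse=True)
    return ranked[:fanout]


neighbors = ["N2", "N3", "N4"]
scores = {"N2": 0.2, "N3": 0.9}  # hypothetical match scores from the routing table

print(choose_next_hops(1, neighbors, scores))  # []
print(choose_next_hops(3, neighbors, scores))  # ['N3']
print(choose_next_hops(2, neighbors, scores))  # one randomly chosen neighbor
```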

Page 28: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Incentive-based Routing Protocol

[Figure: Comparison of Query Hit Ratios by Devices with Different Participation Levels]

Level 3 has the highest hit ratio: incentives for contribution!

Page 29: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Cooperative Caching

[Figure: Query Hits Contributed by Peer Caches]

Peer caches serve a significant portion of the queries.

Page 30: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Overall Performance

We compare the overall performance of our system against:
  FreeNet: ID-based metadata, best-effort routing
  NeuroGrid: keyword-based metadata, no routing incentive
  Random Walk: keyword-based metadata, queries forwarded to randomly selected peers

Page 31: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Overall Performance

[Figure: Comparison with FreeNet, NeuroGrid, and Random Walk]

40% higher hit ratio
2 fewer hops required

Page 32: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Conclusion

The use of ontology-based metadata can increase the hit ratio and reduce the number of search messages sent.
Queries issued by more generous devices are more likely to be served.
Peer caches can be used to serve a significant portion of the queries generated.
Our system significantly outperforms other similar systems.

Page 33: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Thank You!

Q & A

Page 34: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Similarity Between Two Metadata

For simplicity, we let all tuning parameters equal 1.

Msim(M1, M2) = (Isim(book0001, report0002) + Isim(Data, Software) + LVsim(2000, 2002)) / 3
             = (Csim(Book, Report) + Csim(Data, Software) + LVsim(2000, 2002)) / 3
             = (1/3 + 1/2 + 1/(2002 - 2000 + 1)) / 3
             = (1/3 + 1/2 + 1/3) / 3
             = 7/18
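The arithmetic of this example can be checked with exact fractions. The sketch below makes several assumptions: the literal value similarity is taken to be 1 / (|a - b| + 1), as the term 1/(2002-2000+1) suggests; the two concept similarities are hard-coded from the slide because the ontology they derive from is not shown; and the function names are hypothetical. Evaluating the three terms as written, (1/3 + 1/2 + 1/3) / 3, gives 7/18.

```python
from fractions import Fraction


def lv_sim(a, b):
    """Literal value similarity for numbers, assumed to be 1 / (|a - b| + 1)."""
    return Fraction(1, abs(a - b) + 1)


# Concept similarities copied from the slide; their derivation from the
# ontology is not shown there, so they are hard-coded here.
CONCEPT_SIM = {
    frozenset({"Book", "Report"}): Fraction(1, 3),
    frozenset({"Data", "Software"}): Fraction(1, 2),
}


def c_sim(c1, c2):
    """Concept similarity: 1 for identical concepts, otherwise a table lookup."""
    if c1 == c2:
        return Fraction(1)
    return CONCEPT_SIM[frozenset({c1, c2})]


def m_sim(terms):
    """Average the term similarities (all tuning parameters equal to 1)."""
    return sum(terms) / len(terms)


result = m_sim([c_sim("Book", "Report"), c_sim("Data", "Software"), lv_sim(2000, 2002)])
print(result)  # 7/18
```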

Page 35: A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

Semantic Matching

New queries with substituted synonyms
New queries with concept generalization and specialization
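Query expansion by synonyms, generalization, and specialization can be sketched over a small concept hierarchy. The taxonomy, the synonym table, and the `expand` function below are invented for illustration and are not taken from the ontology used in the presentation.

```python
# Hypothetical taxonomy (child -> parent) and synonym table.
PARENT = {"Book": "Publication", "Report": "Publication", "TechnicalReport": "Report"}
SYNONYMS = {"Report": {"Paper"}}


def expand(concept, levels=1):
    """Return the concept plus its synonyms, and its ancestors and descendants within `levels` steps."""
    expanded = {concept} | SYNONYMS.get(concept, set())
    frontier_up, frontier_down = {concept}, {concept}
    for _ in range(levels):
        # Generalization: move one step up to parent concepts.
        frontier_up = {PARENT[c] for c in frontier_up if c in PARENT}
        # Specialization: move one step down to child concepts.
        frontier_down = {child for child, parent in PARENT.items() if parent in frontier_down}
        expanded |= frontier_up | frontier_down
    return expanded


print(sorted(expand("Report")))
# ['Paper', 'Publication', 'Report', 'TechnicalReport']
```

The `levels` argument corresponds to how many levels of generalization and specialization a device can afford; a device with more processing power could use a larger value.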