Functions of a Web Warehouse
Kai Cheng, Yahiko Kambayashi, Seok Tae Lee
Graduate School of Informatics, Kyoto University, Japan
and Mukesh Mohania
Western Michigan University, USA
13-16 November 2000
ICDL 2000 2
Table of Contents
Survival from “Information Explosion”
Warehouse-Mediated Content Delivery
Community-Oriented Web Warehouses
Technical Issues
Warehouse-Enhanced Web Caching
Related Work
Concluding Remarks
Survival from “Information Explosion”
Web Traffic Doubled Every 3-6 Months
Exponential Growth of the Web
– 1 Billion Pages, January 2000
– 2 Billion Pages, June 2000
– 100 Times Increase in the Next 2 Years
Information Overload for both Networks and Users
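As a rough check on these figures, the sketch below computes the growth multiple implied by a fixed doubling period. The function name `growth_factor` and the 24-month horizon are illustrative assumptions; the slide gives only the headline numbers.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Growth multiple after `months` if the quantity doubles every period."""
    return 2.0 ** (months / doubling_period_months)


# Traffic doubling every 3 to 6 months, projected over 2 years (24 months).
slow = growth_factor(24, 6)   # doubling every 6 months
fast = growth_factor(24, 3)   # doubling every 3 months
print(f"2-year traffic growth: {slow:.0f}x to {fast:.0f}x")

# Page counts: 1 billion (January 2000) to 2 billion (June 2000)
# is one doubling in about 5 months (an approximation).
pages = growth_factor(24, 5)
print(f"One doubling per 5 months: about {pages:.0f}x in 2 years")
```

Under these assumptions, a 100-fold increase in 2 years lies between the 6-month rate (16x) and the 3-month rate (256x).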
Scale up the Web and Internet
More Bandwidth
– Never Keeps Pace with the Traffic Growth
More Server Capacity
– How to Deal with “Hot-Spots”?
Site Replication
– Only Benefits Replicated Servers
Our Approach
Tame the Chaotic Information Streams
– Saving Redundant Data Transfers
Unite the Individual Users
– Sharing Findings and Efforts of Each Other
Warehouse-Mediated Content Delivery
Direct Delivery
– QoS: Server, Network Overloaded
– Personalized Services Unrealistic
– Information Hunting Difficult
[Figure label: Internet]
Indirect Content Delivery
[Figure: Web Warehouse with Input from the WWW and Output to users. Labels: Resource Discovery, Buffering, Transformation, Analysis, Notification, Storage, Clustering, Searching, Navigation, Filtering]
Community-Oriented Web Warehousing
Sharing
Contribution
The Community of Users
– People with Special Information Needs/Interests
Examples of User Community
Sports Fans
Patients
Businessmen
Researchers
Real/Cyber Communities
(a) Real Communities: Dependent on Location
(b) Cyber Communities: Independent of Location
Technical Issues
Functions of a Web Warehouse
Web Caching vs. Web Warehousing
Data Warehousing vs. Web Warehousing
Dynamic Hierarchical Web Warehouses
Functions of a Web Warehouse
Buffering
– Resource Discovery → Storage → Reusing
Transformation
1. Transcoding: Format A → Transform → Format B
2. Summarizing: Content A → Transform → Content B
Content Analysis
– Data/Information → Analysis → Knowledge
Notification
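A minimal sketch of how these four functions might fit together, assuming an in-memory store and keyword subscriptions. The class `WebWarehouse`, its method names, and the example URL are hypothetical, not taken from the authors' system; only the summarizing form of transformation is shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class WebWarehouse:
    """Toy warehouse: buffering, transformation, analysis, notification."""
    store: Dict[str, str] = field(default_factory=dict)  # url -> content
    subscribers: Dict[str, List[Callable[[str, str], None]]] = field(
        default_factory=dict)                            # keyword -> callbacks

    def buffer(self, url: str, content: str) -> None:
        """Buffering: keep a discovered resource so later requests reuse it."""
        self.store[url] = content
        self.notify(url)

    def summarize(self, url: str, max_words: int = 5) -> str:
        """Transformation (summarizing): Content A -> shorter Content B."""
        return " ".join(self.store[url].split()[:max_words])

    def analyze(self, url: str) -> Dict[str, int]:
        """Content analysis: derive term counts from the stored text."""
        counts: Dict[str, int] = {}
        for word in self.store[url].lower().split():
            counts[word] = counts.get(word, 0) + 1
        return counts

    def subscribe(self, keyword: str,
                  callback: Callable[[str, str], None]) -> None:
        self.subscribers.setdefault(keyword.lower(), []).append(callback)

    def notify(self, url: str) -> None:
        """Notification: alert subscribers whose keyword occurs in the content."""
        terms = self.analyze(url)
        for keyword, callbacks in self.subscribers.items():
            if keyword in terms:
                for callback in callbacks:
                    callback(keyword, url)


alerts = []
wh = WebWarehouse()
wh.subscribe("tennis", lambda kw, url: alerts.append((kw, url)))
wh.buffer("http://example.org/a",
          "Tennis final results and tennis rankings for the season")
print(wh.summarize("http://example.org/a"))
print(alerts)
```

Notifying at buffering time is one possible design choice; a real warehouse could equally batch notifications after a periodic analysis pass.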
Research Program
[Figure labels: Web Caching, Content Analysis, Transformation, Warehousing]
From Web Caching to Web Warehousing
Items        Web Caching    Web Warehousing
Object       Data           Information
Objective    Reusing        Sharing
Storage      Bounded        Bound-Free
Population   Responses      Web View
Model        FS Dependent   Hypermedia
From Data Warehousing to Web Warehousing
Items          Data WH                Web WH
1 Objective    Decision Support       Information Sharing
2 Model        RDB/OORDB              Hypermedia
3 Population   View Materialization   Resource Discovery, Content Localization
4 Resource     Operational Data       Web Documents
5 Data Type    Structured             Semi-/Un-structured
6 Tie to Web   DWH Web                WWH Web
Warehouse as Shared Information Repository
Real Communities
– Centralized Management of Warehouses
– Unicast Data Transfer
Cyber Communities
– Distributed Management of Warehouses
– Multicast Data Transfer
Hierarchy of Web Warehouses
[Figure: hierarchy of warehouses. Node labels: HP Design, Sports, Skiing, Tennis. Member lists: “Mr. A, Ms. C, Mrs. D, …” and “Mr. A, Mr. D, …”]
Dynamic Formation of Web Warehouses (Split)
[Figure labels: Sports, Tennis, Skiing; members A, B]
Dynamic Formation of Web Warehouses (Union)
[Figure labels: Painting, Drawing, Painting & Drawing; members A, B]
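The split and union operations on a warehouse hierarchy can be sketched as a small tree structure. This is an illustration under stated assumptions, not the authors' implementation: the `Warehouse` class, the `split` and `union` signatures, and the assignment of member A to Tennis and B to Skiing are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Warehouse:
    topic: str
    members: Set[str] = field(default_factory=set)
    children: List["Warehouse"] = field(default_factory=list)


def split(parent: Warehouse, interests: Dict[str, Set[str]]) -> Warehouse:
    """Split: add one child warehouse per sub-topic under `parent`.

    `interests` maps each sub-topic to the members who follow it
    (an assumed input; the slides do not say how interests are found).
    """
    for topic, members in interests.items():
        parent.children.append(Warehouse(topic, set(members) & parent.members))
    return parent


def union(a: Warehouse, b: Warehouse) -> Warehouse:
    """Union: merge two warehouses into one serving both member sets."""
    return Warehouse(
        topic=f"{a.topic} & {b.topic}",
        members=a.members | b.members,
        children=a.children + b.children,
    )


sports = split(Warehouse("Sports", {"A", "B"}),
               {"Tennis": {"A"}, "Skiing": {"B"}})
print([(c.topic, sorted(c.members)) for c in sports.children])

arts = union(Warehouse("Painting", {"A"}), Warehouse("Drawing", {"B"}))
print(arts.topic, sorted(arts.members))
```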
Current Status: Content-Sensitive Caching
[Figure labels: Web Caching, Content-Sensitive Caching, Warehousing]
Content-Sensitive Cache Replacement Policy
Cache Replacement: Keep? Replace?
Traditional Caching
– Long Time Observation → Replacement Decision
– 60% One-Access Objects: How to Differentiate?
Content-Sensitive Caching
– LRU-SP+
LRU-SP+: Content-Sensitive, Size-Adjusted & Popularity-Aware LRU
Daily Indexing: Cache Content → Indices
Indices → Popular Topics
New Document vs. Popular Topics: How Similar?
Benefit/Size Model: “Observed” Pop. + “Inherent” Pop.
Implement this Model
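One way to read the benefit/size idea is sketched below. It is not the published LRU-SP+ formula: the Jaccard similarity used for “inherent” popularity, the `weight` parameter, the division by idle time, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class CachedDoc:
    url: str
    size: int            # bytes
    hits: int            # "observed" popularity: requests seen so far
    terms: Set[str]      # index terms extracted from the content
    last_access: float   # timestamp of the most recent request


def inherent_popularity(terms: Set[str], popular_topics: Set[str]) -> float:
    """Jaccard similarity between a document's terms and popular topic terms."""
    if not terms or not popular_topics:
        return 0.0
    return len(terms & popular_topics) / len(terms | popular_topics)


def score(doc: CachedDoc, popular_topics: Set[str], now: float,
          weight: float = 10.0) -> float:
    """Popularity per byte per second of idleness; higher means keep."""
    popularity = doc.hits + weight * inherent_popularity(doc.terms,
                                                         popular_topics)
    age = max(now - doc.last_access, 1.0)
    return popularity / (doc.size * age)


def choose_victim(cache: Dict[str, CachedDoc], popular_topics: Set[str],
                  now: float) -> str:
    """Return the URL of the document with the lowest score."""
    return min(cache, key=lambda url: score(cache[url], popular_topics, now))


topics = {"tennis", "skiing"}
cache = {
    "a": CachedDoc("a", size=1000, hits=1, terms={"tennis", "open"},
                   last_access=90.0),
    "b": CachedDoc("b", size=1000, hits=1, terms={"stocks", "bonds"},
                   last_access=90.0),
}
print(choose_victim(cache, topics, now=100.0))
```

Both example documents have been requested only once, so observed popularity alone cannot separate them; the content term is what marks "b" for replacement here.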
Related Work
LSAM’s Proxy Cache (Push)
– Multicast-Based Virtual Cache
– Affinity Groups and Push Channels
INTELSAT’s Wormhole Content Delivery
– Warehouse-Kiosk Model
– Satellite-Based Delivery Platform
Concluding Remarks
Proposed to Cope with the Scaling Problems by Web Warehouse-Mediated Content Delivery
Discussed the Basic Functions of a Web Warehouse: Buffering, Transformation, Notification and Content Analysis
Introduced our Current Work: Warehouse-Enhanced Web Caching