Semantic Web Mining
Combination of Semantic Web and Web Mining
Improve Web Mining using Semantic Web
Improve Semantic Web using Web Mining
Overview
• Web Mining
• Extracting Semantics from the Web
• Exploiting Semantics for Web Mining
• Mining the Semantic Web
• Closing the Loop
• Conclusion/Assessment
Web Mining
• Discovers local and global structure
• Structured data
• Goals
  • Improvement of site design
  • Generate dynamic recommendations
  • Improve marketing
• Main Areas
  • Web Content Mining
  • Web Structure Mining
  • Web Usage Mining
Content Mining
• Type of text mining
• Uses tags
• Detect co-occurrences
• Event detection
• Reconstruction of page content
• Relations in a domain
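To make "detect co-occurrences" concrete, here is a minimal sketch that counts how often two terms appear in the same page text. The sample texts and the whitespace tokenizer are assumptions for illustration only; a real content-mining pipeline would typically add tag handling, stop-word removal and stemming.

```python
from collections import Counter
from itertools import combinations


def cooccurrences(documents):
    """Count, for every unordered term pair, the number of documents containing both."""
    counts = Counter()
    for text in documents:
        # Use each term once per document, in sorted order so pairs are canonical.
        terms = sorted(set(text.lower().split()))
        counts.update(combinations(terms, 2))
    return counts


# Hypothetical page texts, not taken from the slides.
docs = ["semantic web mining", "web usage mining", "semantic web ontology"]
pair_counts = cooccurrences(docs)
print(pair_counts.most_common(3))
```

Pairs with high counts are candidates for related terms in a domain; the threshold for "high" is left open here.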
Web Structure Mining
• Web pages as a whole
  • Uses hyperlinks
  • Identify relevance
• Single pages
  • Five types of Web pages
    • Head Pages
    • Navigation Pages
    • Content Pages
    • Look-up Pages
    • Personal Pages
Disadvantages of Web Mining
• Content/Structure
  • False positives
  • Unused
  • Human understandable
  • Large amount of data
• Usage
  • Usage tracked by URLs
  • General concepts
  • Multiplicity of events and URLs
Relational Metadata
[Figure: ontology and relational metadata example. Ontology labels: TOP, PERSON, RESEARCHER, PROJECT, NAME, TITLE, WORKS-IN, COOPERATES-WITH. Instance labels: URI-GST, URI-AHO (Andreas Hotho), URI-SWMining (Semantic Web Mining), DAMLPROJ, WWW. Rule labels: cooperateswith(X,Y), cooperateswith(Y,X).]
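The figure pairs cooperateswith(X,Y) with cooperateswith(Y,X), which reads as a symmetry rule over relational metadata. The sketch below applies such a rule to a small set of triples. The triple representation, the function name apply_symmetry and the sample facts are assumptions for illustration; the slides do not state which instances actually cooperate.

```python
# Hypothetical facts as (subject, relation, object) triples; the identifiers
# reuse labels from the figure, but the facts themselves are assumed.
facts = {
    ("URI-GST", "cooperates-with", "URI-AHO"),
    ("URI-AHO", "works-in", "URI-SWMining"),
}


def apply_symmetry(triples, relation):
    """Return the triples plus a mirrored triple for every use of `relation`."""
    mirrored = {(o, r, s) for (s, r, o) in triples if r == relation}
    return set(triples) | mirrored


closed = apply_symmetry(facts, "cooperates-with")
for triple in sorted(closed):
    print(triple)
```

Only the named relation is mirrored, so WORKS-IN stays directional.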
Extracting Semantics
• Ontology Learning
  • Learn structures of ontologies
• Instance Learning
  • Populates the ontologies
Extracting Semantics
• Ontology Learning
  • Semi-automatic approach
  • Merging
    • FCA-Merge
    • TITANIC
• Instance Learning
  • Information Extraction
Web Content/Structure Mining
• Content Mining
  • Preprocess the input data
  • Apply heuristics
  • Creates a cluster
• Web Structure Mining
  • PageRank
  • Keyword Analysis
  • CLEVER
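As a reminder of how PageRank uses hyperlinks to identify relevance, here is a minimal power-iteration sketch. The link graph, the damping factor of 0.85 and the fixed iteration count are assumptions for illustration; this is a textbook formulation, not the implementation of any particular search engine.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping each page to its outgoing links."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly over its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page site.
links = {"home": ["about", "papers"], "about": ["home"], "papers": ["home", "about"]}
ranks = pagerank(links)
print(sorted(ranks, key=ranks.get, reverse=True))
```

The ranks always sum to 1, so they can be read as the share of attention each page receives under this model.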
Conceptual Clustering of Emails (and Bookmarks) using IE and Formal Concept Analysis for supporting navigation and retrieval.
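To illustrate what Formal Concept Analysis produces, the sketch below enumerates all formal concepts (extent, intent pairs) of a tiny context by brute force. The e-mail identifiers and attributes are hypothetical, and the exhaustive enumeration is only workable for toy inputs; it is not the TITANIC algorithm or any specific tool's method.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs of a context {object: set of attributes}."""
    objects = list(context)
    all_attributes = set().union(*context.values())
    concepts = set()
    for size in range(len(objects) + 1):
        for group in combinations(objects, size):
            # Intent: attributes shared by every object in the group.
            intent = set(all_attributes)
            for obj in group:
                intent &= context[obj]
            # Extent: all objects that carry every attribute of the intent.
            extent = frozenset(o for o in objects if intent <= context[o])
            concepts.add((extent, frozenset(intent)))
    return concepts


# Hypothetical e-mails described by extracted keywords.
emails = {
    "mail1": {"project", "meeting"},
    "mail2": {"project"},
    "mail3": {"meeting"},
}
concepts = formal_concepts(emails)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Each concept is a cluster with an explicit description: the intent lists the attributes that every member of the extent shares, which is what makes the lattice usable for navigation.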
Web Usage Mining
• Goal
  • Better understand users' tendencies
• Problem
  • Dynamic pages
• How to take advantage of this?
  • Generate queries
  • Create usage paths
  • Classification scheme
Semantic Web/Structure Mining
• Intertwined
• Relational Data Mining
  • Looks for patterns
  • Classification, regression, clustering and associations
• Challenges
  • Scalability
  • Distributed
Semantic Web Usage Mining
• Goal
  • Requested page = ontology entity
  • Log files
• Advantages
  • Understand search strategies
  • Improve navigation design
  • Personalize
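The idea "requested page = ontology entity" can be sketched as a lookup that turns each logged URL into a concept before any counting or pattern mining takes place. The log format (session id and path separated by a space), the URL prefixes and the prefix-to-concept table are all assumptions for illustration; the concept names reuse labels from the ontology figure.

```python
from collections import Counter

# Hypothetical mapping from URL path prefixes to ontology concepts.
URL_TO_CONCEPT = {
    "/projects/": "PROJECT",
    "/people/": "PERSON",
}


def concept_for(url_path):
    """Map a requested path to a concept, falling back to the root concept TOP."""
    for prefix, concept in URL_TO_CONCEPT.items():
        if url_path.startswith(prefix):
            return concept
    return "TOP"


def concept_counts(log_lines):
    """Count requests per concept from lines of the form '<session> <path>'."""
    counts = Counter()
    for line in log_lines:
        _session_id, url_path = line.split()
        counts[concept_for(url_path)] += 1
    return counts


# Hypothetical log excerpt.
log = ["s1 /people/aho", "s1 /projects/swmining", "s2 /people/gst", "s2 /index.html"]
counts = concept_counts(log)
print(counts)
```

Working at the concept level means that many different URLs, including dynamically generated ones, collapse into a small number of interpretable events.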
Mining to Learn Ontologies
• Establish a concept hierarchy
  • OntEx
• Determine association rules
  • Discover combinations of concepts
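Association rules over concepts are usually judged by support and confidence. The sketch below computes both for a rule such as PERSON → PROJECT over a handful of sessions. The sessions and the PUBLICATION concept are hypothetical, and the code assumes the antecedent occurs in at least one session (otherwise the confidence is undefined).

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by the support of its antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical sessions, each reduced to the set of concepts visited.
sessions = [
    {"PERSON", "PROJECT"},
    {"PERSON", "PROJECT", "PUBLICATION"},
    {"PERSON"},
    {"PROJECT"},
]
rule_support = support(sessions, {"PERSON", "PROJECT"})
rule_confidence = confidence(sessions, {"PERSON"}, {"PROJECT"})
print(rule_support, rule_confidence)
```

Concept combinations whose rules pass chosen support and confidence thresholds are candidates for new relations in the ontology.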
Conclusion/Assessment
• Semantic Structures in the Web can help Web mining
• Web Mining can build the Semantic Web
• Combine the two together
• Different Idea
• Combination of Products