Scientific Web Intelligence: The Birth of a New Research Field
Mike Thelwall
Statistical Cybermetrics Research Group
University of Wolverhampton, UK
The Problem
To map patterns of communication between researchers in a country based upon university web sites
Patterns of communication are also mapped based upon journal citations or journal title words
  Provides useful information about the structure and evolution of research fields
  Can identify previously unknown field connections
Web analysis could illustrate wider and more current patterns
Part 1: Hyperlink Analysis
Citation counts are known to be reasonable indicators of research quality, but is the same true for inlink counts?
Counts of links to universities within a country can correlate significantly with measures of research productivity
The significance of this result is in giving ‘permission’ to investigate the use of inter-university links for researching scholarly communication
Links to UK universities against their research productivity
The reason for the strong correlation is the quantity of Web publication, not its quality
This is different to citation analysis
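The correlation test described above can be sketched in a few lines. This is a minimal illustration only: the slides do not say which statistic was used, so Spearman's rank correlation is an assumption here, and the inlink counts and productivity scores are made-up numbers, not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five universities (illustration only).
inlinks = [120, 450, 300, 80, 900]
productivity = [1.1, 2.8, 3.0, 0.9, 4.5]
rho = spearman(inlinks, productivity)
```

A rank-based statistic is a cautious choice for link counts, which tend to be heavily skewed, but it says nothing about why the two quantities move together.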
Most links are only loosely related to research
90% of links between UK university sites have some connection with scholarly activity, including teaching and research
But less than 1% are equivalent to citations
So link counts do not measure research dissemination but are more a natural by-product of scholarly activity
  Cannot use link counts to assess research
  Can use link counts to track an aspect of communication
Language is a factor in international interlinking
English the dominant language for Web sites in the Western EU
In a typical country, 50% of pages are in the national language(s) and 50% in English
Non-English-speaking countries extensively interlink in English
{Research with Rong Tang & Liz Price}
Can map patterns of international communication
Counts of links between EU universities in Swedish are represented by arrow thickness.
Linking patterns vary enormously by discipline
No evidence of a significant geographic trend
Disciplinary differences in the extent of interlinking: e.g., History Web use is very low, Chemistry is very high
Individual research projects can have an enormous impact upon individual departments
  E.g., Arts web sites are often for specific exhibitions or for digital media projects
Links not frequent enough to reliably reveal patterns of interdisciplinarity
Stretching links: colinks, couplings
For the UK academic Web, about 42% of domains connected by links alone host similar disciplines, and about 43% connected by links, colinks and couplings
But over 100 times more domains are colinked or coupled than are directly linked
Links in any form are less than 50% reliable as indicators of subject similarity
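The three kinds of connection compared above can be derived from a plain list of site-to-site links. The slides do not define the terms, so the sketch below assumes the usual reading: two sites are colinked when some third site links to both, and coupled when both link to the same third site. The function name and the single-letter site names are hypothetical.

```python
from itertools import combinations


def related_pairs(links):
    """Split site pairs into directly linked, colinked and coupled sets.

    links: iterable of (source, target) site identifiers.
    Returns three sets of frozenset({a, b}) pairs.
    """
    out_links = {}  # site -> set of sites it links to
    in_links = {}   # site -> set of sites linking to it
    for source, target in links:
        if source == target:
            continue  # ignore self-links
        out_links.setdefault(source, set()).add(target)
        in_links.setdefault(target, set()).add(source)

    direct = {
        frozenset((s, t)) for s, targets in out_links.items() for t in targets
    }
    # Colinked (assumed definition): both sites are targets of one source.
    colinked = {
        frozenset(pair)
        for targets in out_links.values()
        for pair in combinations(sorted(targets), 2)
    }
    # Coupled (assumed definition): both sites link to one common target.
    coupled = {
        frozenset(pair)
        for sources in in_links.values()
        for pair in combinations(sorted(sources), 2)
    }
    return direct, colinked, coupled


# Hypothetical toy link list.
toy_links = [("a", "b"), ("c", "a"), ("c", "b"),
             ("a", "d"), ("b", "d"), ("e", "d")]
direct, colinked, coupled = related_pairs(toy_links)
```

Because one site with many outlinks produces a colinked pair for every two of its targets, indirect connections multiply quickly, which fits the observation that far more domains are colinked or coupled than directly linked.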
Text Mining Approaches
Hyperlinks are not frequent enough or systematic enough to yield reliable evidence of connections at a low level
Alternative is to look for words in common
  E.g., the frequency with which words associated with psychology are found in computer science web sites
Clustering web pages/sites based upon word occurrences (cf. journal title word clustering)
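As a rough illustration of the word-based approach, the sketch below counts how often each word occurs and in how many domains it appears. It assumes "frequency" means total occurrences and "domains" means the number of distinct domains containing the word. The tokenizer, the function name and the example pages are hypothetical, and no importance weighting is computed because the slides do not give a formula for it.

```python
import re
from collections import Counter


def word_stats(pages):
    """Count word occurrences and the number of domains using each word.

    pages: iterable of (domain, text) pairs.
    Returns {word: (total_frequency, domain_count)}.
    """
    frequency = Counter()
    domains = {}  # word -> set of domains in which it occurs
    for domain, text in pages:
        # Crude tokenizer (an assumption): lower-case runs of ASCII letters.
        for word in re.findall(r"[a-z]+", text.lower()):
            frequency[word] += 1
            domains.setdefault(word, set()).add(domain)
    return {word: (frequency[word], len(domains[word])) for word in frequency}


# Hypothetical pages from two made-up domains.
pages = [
    ("x.example", "Business and marketing business"),
    ("y.example", "business finance"),
]
stats = word_stats(pages)
```

Counts of this kind are only the raw input: a clustering step would still need some weighting to separate discipline words from site furniture such as navigation or copyright text.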
Text clustering – early results

Word        Frequency  Domains  Importance
business        59806      408    0.005902
marketing       16987      242    0.004476
finance          8300      217    0.002826
economics       15509      261    0.002726
banking          2010      123    0.002717
management      76754      465    0.002569
sitemap          2419       62    0.001874
accounting       8162      197    0.001613
auckland        55604      414    0.001546
Which discipline?

Word        Frequency  Domains  Importance
template         3356      147    0.001355
assignment      15610      240    0.001186
copyright       16780      278    0.001166
changed          7172      284    0.001152
sst               199       33    0.001071
semester        18364      319    0.001009
systems         44521      451    0.000949
lab              7709      261    0.000861
comments        16931      354    0.000842
Scientific Web Intelligence
Standard hyperlink and text mining approaches are inadequate for identifying low level inter-subject connections
Either extensive human intervention or artificial intelligence techniques needed to extract useful information
Hence the founding of Scientific Web Intelligence
Scientific Web Intelligence
Objective: to combine techniques from Information Science, Web Mining and Web Intelligence to extract patterns of interdisciplinarity from university Web sites