The Cooperative Web: A Step towards Web Intelligence

Daniel Gayo Avello, University of Oviedo


Web Intelligence?

• Multidisciplinary effort

– Artificial Intelligence

– Information Retrieval

– Software Agents

– ...

• Early stages

• Goal: The Wisdom Web

– New web.

– More useful.

– Truly “intelligent”


The Semantic Web (in a nutshell)

• Standardized conventions (ontologies)

– objects

– attributes

– relations

• Semantic tags

– Document authors: mark up

– Software agents: (basic) reasoning


So...

• Semantic Web ~ Web Intelligence Approach

• Cooperative Web ~ Web Intelligence Approach


Is the Cooperative Web just-another-proposal?

• Not really...

• Semantic Web

– beginning...

– human made (ontologies - at this moment)

– time to reach the whole Web (5-10 years?)

• “I know what I want and I want it now!”

• The Web ~ Legacy System

• Something...

– fully automatic

– simple

– built on top of the current web (legacy)

– between the current web (legacy) and The Wisdom Web (future)

• ...wouldn’t it be nice?


Cooperative Web proposal (in a nutshell)

• Simple, cheap, automatic

• Intermediate: Web → ? → Wisdom Web

• “Squeeze out” the current Web a little more...

• Main ideas:

– Concept extraction

– Automatic document taxonomies

– Computational biology


Concepts

• Let’s study these samples...

...Betelgeuse, a red supergiant star about 600 light years distant, is seen in this Hubble Space Telescope image - the first direct picture of the surface of a star other than the Sun...

...Designer Jim Wallace, who is developing the PlayStation 2 fighting title "Rise to Honor" with martial-arts star Jet Li, said celebrity involvement boosts the reputation of gaming in general...

...His reputation as one of America's greatest actors secured, Hoffman proceeded to star in a series of films that disappointed at the box office...

...The actor Arnold Schwarzenegger has signed for a record-setting $30 million to star in "Terminator 3"...


• They’re results from the Google query “star”...


• But they talk about different kinds of “stars”...


• From those (and other) documents we could extract something like these “word bags”...

0:{red supergiant, star, Sun, ...}

1:{actor, actors, celebrity, films, star, ...}

• Plenty of techniques to obtain these “word bags” or “concepts”, for instance:

– Latent Semantic Indexing (Foltz, 1990)

– Concept Indexing (Karypis and Han, 2000)

Conceptually related documents

• The documents shown before could be transformed by dropping the “stop words”...
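Dropping stop words can be sketched in a few lines of Python. This is only an illustration: the stop-word list below is a tiny hand-picked assumption (real lists are much longer), and `drop_stop_words` is a hypothetical helper name, not something taken from the talk.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are far longer).
STOP_WORDS = {
    "a", "about", "an", "as", "at", "for", "has", "in", "is", "of",
    "one", "than", "that", "the", "this", "to", "who", "with",
}

def drop_stop_words(text):
    """Lowercase the text, split it into alphanumeric tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

sample = ('The actor Arnold Schwarzenegger has signed for a record-setting '
          '$30 million to star in "Terminator 3"')
print(drop_stop_words(sample))
```

With this particular list, the words "the", "has", "for", "a", "to" and "in" disappear from the sample, while content words such as "actor" and "star" survive.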

• And then into this...

?00???????00

????????1?1????

??1??11??

1???1?

• The last three documents are closely related, while the first one has nothing to do with them...
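One possible way to obtain such strings is sketched below. It is a guess at the idea, not the talk's actual procedure: the two word bags are the ones listed earlier (without their elided "..." entries), the phrase "red supergiant" is split into single words, and a word that sits in both bags (such as "star") is assigned to whichever concept the rest of the document votes for. The name `encode` and the voting rule are assumptions made for illustration, so the output will not necessarily match the strings shown above.

```python
# Word bags from the "Concepts" slide, without the elided "..." entries.
# Assumption: the phrase "red supergiant" is split into single words.
CONCEPTS = {
    "0": {"red", "supergiant", "star", "sun"},
    "1": {"actor", "actors", "celebrity", "films", "star"},
}

def owners_of(token):
    """Return the ids of every word bag that contains the token."""
    return [cid for cid, bag in CONCEPTS.items() if token in bag]

def encode(tokens):
    """Map each token to a concept id, or '?' when no bag contains it."""
    # First pass: tokens found in exactly one bag vote for that concept.
    votes = {cid: 0 for cid in CONCEPTS}
    for tok in tokens:
        owners = owners_of(tok)
        if len(owners) == 1:
            votes[owners[0]] += 1
    # Without any vote, ambiguous tokens stay unknown.
    dominant = max(votes, key=votes.get) if any(votes.values()) else "?"

    # Second pass: ambiguous tokens take the dominant concept.
    symbols = []
    for tok in tokens:
        owners = owners_of(tok)
        if not owners:
            symbols.append("?")
        elif len(owners) == 1:
            symbols.append(owners[0])
        else:
            symbols.append(dominant)
    return "".join(symbols)

print(encode(["betelgeuse", "red", "supergiant", "star", "sun"]))
print(encode(["actor", "signed", "star", "terminator"]))
```

The first call prints `?0000` and the second `1?1?`: the same word "star" ends up as concept 0 in the astronomy snippet and as concept 1 in the film snippet.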


Text strings...

• This way of representing free text could be well-suited to determine the distance between documents.

• Let’s see a simpler technique to get the distance between text strings...


• Three simple strings:

– BENJI

– DANI

– HENRY

• How closely are they related?

• Let’s define the distance between two strings as the smallest possible sum of

– the number of letters to delete +

– the number of letters to change +

– the number of letters to insert

• ...needed to transform one string into the other.


• Distance between BENJI and DANI: 3

BENJI → DENJI (1), DENJI → DANJI (2), DANJI → DANI (3)

• Distance between DANI and HENRY: 4

DANI → HANI (1), HANI → HENI (2), HENI → HENRI (3), HENRI → HENRY (4)

• Distance between BENJI and HENRY: 3

BENJI → HENJI (1), HENJI → HENRI (2), HENRI → HENRY (3)

• This is known as the Levenshtein distance, and it will help us understand the next step...
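The three distances above can be checked with the standard dynamic-programming formulation of the Levenshtein distance. The sketch below is a common textbook version rather than anything specific to this talk; the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a, b):
    """Smallest number of deletions, changes and insertions turning a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j symbols of b.
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]  # turning i symbols into an empty string costs i deletions
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # change x into y (free if equal)
            ))
        previous = current
    return previous[-1]

for a, b in [("BENJI", "DANI"), ("DANI", "HENRY"), ("BENJI", "HENRY")]:
    print(a, b, levenshtein(a, b))
```

Running it prints 3, 4 and 3 for the three pairs, matching the step-by-step transformations listed above.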


Someone’s in the kitchen with DNA

• DNA: a highly complex molecule made from only 4 different kinds of components:

– Adenine - A

– Cytosine - C

– Guanine - G

– Thymine - T

• So, DNA molecules ~ simple (but huge) text strings

– CCAAGGA...

– CCAAGGAAACTCACTA...

– GATTACA...


• If DNA ~ text string then distances between two or more strings can be easily computed...

(Ursing and Arnason, 1998)
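As a toy illustration of treating DNA as text, the sketch below computes pairwise edit distances between short fragments. It uses only the visible prefixes from the earlier slide (the real sequences are elided there), and plain edit distance is a stand-in chosen for simplicity: it is not claimed to be the method used in the cited study. The names `edit_distance` and `fragments` are hypothetical.

```python
from itertools import combinations

def edit_distance(a, b):
    """Levenshtein distance between two sequences (standard dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # change x into y
            ))
        previous = current
    return previous[-1]

# Only the visible prefixes from the slide; the full sequences are elided ("...").
fragments = ["CCAAGGA", "CCAAGGAAACTCACTA", "GATTACA"]

# Distance for every unordered pair of fragments.
distances = {
    (a, b): edit_distance(a, b) for a, b in combinations(fragments, 2)
}
for (a, b), d in distances.items():
    print(f"{a} vs {b}: {d}")
```

Because the first fragment is a prefix of the second, their distance is simply the 9 letters that have to be inserted. Such a table of pairwise distances is the kind of input from which a tree of related sequences can then be built.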


What if...

Could it be possible to adapt computational biology algorithms to distill semantics from the web in an automatic fashion?
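A minimal sketch of what that could look like, under loudly stated assumptions: the four concept strings from the "Conceptually related documents" slide are treated like sequences, the "?" placeholders are dropped (a simplification introduced here, not something the talk specifies), and the remaining concept ids are compared with the Levenshtein distance. The helper names are hypothetical.

```python
def levenshtein(a, b):
    """Smallest number of deletions, changes and insertions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # change x into y
            ))
        previous = current
    return previous[-1]

def concept_only(encoded):
    """Drop the '?' placeholders, keeping only the recognised concept ids."""
    return encoded.replace("?", "")

# Concept strings of the four sample documents, as shown on the slide.
docs = ["?00???????00", "????????1?1????", "??1??11??", "1???1?"]
reduced = [concept_only(d) for d in docs]

# Full pairwise distance matrix, one row per document.
matrix = [[levenshtein(a, b) for b in reduced] for a in reduced]
for i, row in enumerate(matrix):
    print(i, row)
```

Under these assumptions the first document is at distance 4 from each of the others, while the last three are at distance 0 or 1 from one another, which agrees with the earlier observation that they are closely related.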


Cooperative Web architecture

[Figure: architecture diagram; surviving labels: User, Software agent, Browsing history, Document taxonomy]


So, the Cooperative Web would be...

A layer over the Web to provide semantics in an automatic fashion, “inspired” by computational biology


Work in progress...

• Cooperative Web is just a proposal (at this moment)

• Some prototypes soon (I hope...)

Thank you! Any questions?


References

• Foltz, P.W. (1990), "Using Latent Semantic Indexing for Information Filtering", Proceedings of the ACM Conference on Office Information Systems, Boston, USA, pp. 40-47.

• Karypis, G., and Han, E. (2000), "Concept indexing: A fast dimensionality reduction algorithm with applications to document retrieval and categorization", Technical Report TR-00-0016, University of Minnesota.

• Ursing, B.M., and Arnason, U. (1998), "Analyses of mitochondrial genomes strongly support a hippopotamus-whale clade", Proceedings of the Royal Society of London. Series B, Biological Sciences, 265:2251-2255.