
Knowledge Organization in the Light of Intertextual Semantics

A Natural-Language Analysis of Controlled Vocabularies

Yves MARCOUX, Élias RIZKALLAH

GRDS – EBSI, Université de Montréal

ISKO 2008 - Montréal

Overview

• Intertextual semantics (IS)

• IS's view of controlled vocabularies (CVs)

• Example

• Consequences of IS view

• Future work

Intertextual semantics (IS)

• A way to envision how meaning is conveyed by information-bearing objects

• Based on natural language (NL)

• Not a semantics for natural language

• Rather a natural-language semantics for artificial information-bearing objects

• Goal: design "better" information-bearing objects (more effective and usable)

Scope of IS reflection

• Information-bearing objects
  – Primarily structured documents (e.g., XML)
  – Any data structure designed to hold information in an information system
    • Ex.: database table / record / field

• Communication of meaning to human persons interacting with the object through any kind of interface

IS – Background (1/2)

• Introduced at Extreme Markup Languages (EML) 2006
  – valid XML documents only
  – modeler-author communication
  – further development (EML 2007)

• Applied to classical data structure for information exchange (SIGDOC 2007)

IS – Background (2/2)

• One in a series of semiotics-based approaches to improve systems design
  – Knuth (1984), De Souza (2005)

• One in a series of semantic frameworks for structured documents (XML, etc.)
  – Sperberg-McQueen et al. (2000), Renear et al. (2002), Wrightson (2005)

Example

Facts about some US cities

City          Population   Annual snowfall (inches)
Denver        850,000      23
Rochester     240,000      88
Palm Spring   48,000       0

Modeler prepares “peritext” segments

Element                     text-before                              text-after
facts-about-US-cities       "Here are facts about some US cities."   empty
city                        " The city "                             "."
name                        "named "                                 empty
population                  " has a population of "                  " inhabitants "
annual-snowfall-in-inches   " and an annual snowfall of "            " inches"

Possible “semantic” (or IS) view for authors

Here are facts about some US cities. The city named Denver has a population of 850,000 inhabitants and an annual snowfall of 23 inches. The city named Rochester has a population of 240,000 inhabitants and an annual snowfall of 88 inches. The city named Palm Spring has a population of 48,000 inhabitants and an annual snowfall of 0 inches.
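
The way the peritext segments wrap each element to produce this view can be sketched in a few lines of Python. This is only an illustration, not the authors' tool: the XML serialization of the city table (CITIES_XML) and the function names are assumptions, since the slides show only the table and the peritexts. Indentation whitespace is ignored and runs of spaces are collapsed.

```python
import xml.etree.ElementTree as ET

# Peritexts from the modeler's table: element name -> (text-before, text-after).
PERITEXTS = {
    "facts-about-US-cities": ("Here are facts about some US cities.", ""),
    "city": (" The city ", "."),
    "name": ("named ", ""),
    "population": (" has a population of ", " inhabitants "),
    "annual-snowfall-in-inches": (" and an annual snowfall of ", " inches"),
}

# Assumed XML serialization of the city facts (not shown on the slides).
CITIES_XML = """
<facts-about-US-cities>
  <city>
    <name>Denver</name>
    <population>850,000</population>
    <annual-snowfall-in-inches>23</annual-snowfall-in-inches>
  </city>
  <city>
    <name>Rochester</name>
    <population>240,000</population>
    <annual-snowfall-in-inches>88</annual-snowfall-in-inches>
  </city>
  <city>
    <name>Palm Spring</name>
    <population>48,000</population>
    <annual-snowfall-in-inches>0</annual-snowfall-in-inches>
  </city>
</facts-about-US-cities>
"""


def _significant(text):
    """Return text unless it is missing or whitespace-only (indentation)."""
    return text if text and text.strip() else ""


def render(element, peritexts):
    """Concatenate text-before, the element's content, and text-after, recursively."""
    before, after = peritexts.get(element.tag, ("", ""))
    parts = [before, _significant(element.text)]
    for child in element:
        parts.append(render(child, peritexts))
        parts.append(_significant(child.tail))
    parts.append(after)
    return "".join(parts)


def is_view(xml_text, peritexts):
    """Build the IS view of a document and collapse runs of whitespace."""
    raw = render(ET.fromstring(xml_text.strip()), peritexts)
    return " ".join(raw.split())


if __name__ == "__main__":
    print(is_view(CITIES_XML, PERITEXTS))
```

Keeping the peritexts in a table separate from the rendering code mirrors the division of labour in the slides: the modeler writes the table once, and any valid document can then be read through it.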

Example

• Raw XML document:

<billing>
  <amount-burial>1205.47</amount-burial>
  <payable-burial>D</payable-burial>
  <amount-cremation>788.00</amount-cremation>
  <payable-cremation>F</payable-cremation>
</billing>

IS view
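
As a rough sketch, an IS view of the billing document can be produced by an event-based pass over the raw XML, using the peritexts from the IS specification table for this model. The handler class and function names here are illustrative assumptions, not part of the presentation; the point is that text-before is emitted when an element opens and text-after when it closes.

```python
import xml.sax

# Peritexts for the billing model: element name -> (text-before, text-after).
BILLING_PERITEXTS = {
    "billing": ("This section gives the billing information for this order. ",
                " End of billing information section."),
    "amount-burial": ("Amount charged for the burial service: ",
                      " Canadian dollars;  "),
    "payable-burial": ("this amount is payable by: ",
                       " (D = Funeral director; F = Family)."),
    "amount-cremation": ("Amount charged for the cremation service: ",
                         " Canadian dollars;  "),
    "payable-cremation": ("this amount is payable by: ",
                          " (D = Funeral director; F = Family)."),
}

BILLING_XML = """<billing>
  <amount-burial>1205.47</amount-burial>
  <payable-burial>D</payable-burial>
  <amount-cremation>788.00</amount-cremation>
  <payable-cremation>F</payable-cremation>
</billing>"""


class PeritextHandler(xml.sax.ContentHandler):
    """Collect text-before on element start and text-after on element end."""

    def __init__(self, peritexts):
        super().__init__()
        self.peritexts = peritexts
        self.parts = []

    def startElement(self, name, attrs):
        self.parts.append(self.peritexts.get(name, ("", ""))[0])

    def characters(self, content):
        self.parts.append(content)

    def endElement(self, name):
        self.parts.append(self.peritexts.get(name, ("", ""))[1])


def billing_is_view(xml_text, peritexts):
    """Return the IS view as one paragraph with whitespace collapsed."""
    handler = PeritextHandler(peritexts)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return " ".join("".join(handler.parts).split())


if __name__ == "__main__":
    print(billing_is_view(BILLING_XML, BILLING_PERITEXTS))
```

Under these assumptions the coded values D and F are never expanded by the program; the reader decodes them from the legend carried in the text-after segment, which is where the cognitive cost of the code becomes visible.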

IS specification of the model (peritexts prepared by modeler)

Element             text-before                                                     text-after
billing             "This section gives the billing information for this order. "   " End of billing information section."
amount-burial       "Amount charged for the burial service: "                       " Canadian dollars;  "
payable-burial      "this amount is payable by: "                                   " (D = Funeral director; F = Family)."
amount-cremation    "Amount charged for the cremation service: "                    " Canadian dollars;  "
payable-cremation   "this amount is payable by: "                                   " (D = Funeral director; F = Family)."

IS – Key ideas

• The semantic (IS) view is the reference interpretation and should convey, in NL, to humans, all the meaning intended / expected by the modeler

• The semantic (IS) view can (and should) contain hyperlinks to material not already known by target community of users, but necessary to make sense of the data structure

IS – Hypothesis (ISH-1)

• The IS view of a document is one of the most workable incarnations of its meaning
  – Wittgensteinian position

• The (human) task of interpreting the IS view of a document is representative of the task of "understanding" the document

IS – Consequences on design

• An intricate prose structure in the IS view, or a high number of hyperlink traversals, indicates that the document (or data structure) is hard to understand
  – Gaps imply incomprehensible document!

• Design goals for modelers are thus:
  – Prose as simple as possible (but no more)
  – Low number of hyperlink traversals

IS – Notes

• The network of resources anchored (via hyperlinks) in the semantic view suggests an actual interpretation (sense-making) path, but does not impose it

• Any specific reading of a document yields more information than the IS view, but the IS view is considered a minimum for all readings, and thus, serves as a reference

Controlled vocabularies (CVs)

• Same scope as SKOS concept schemes:
  – Thesauri, classification schemes, subject heading systems, subject indexes, taxonomies

• CVs are data structures
  – Designed by information professionals
  – Populated by corpus analysts ("authors")
  – Used by document analysts to index documents, and by users to find documents

CVs in IS

• SKOS allows CVs to be expressed as XML documents
  – Eases the thought experiment of applying IS

• A CV can be expressed as a single XML document
  – Not as reductive as it sounds...
  – Example will concentrate on designer-author communication

SKOS example

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:definition>Manmade waterway used by watercraft or for drainage, irrigation, or water power</skos:definition>
    <skos:scopeNote>A feature type category for places such as the Erie Canal</skos:scopeNote>
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:altLabel>drainage canals</skos:altLabel>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://www.my.com/#hydrographic%20structures">
    <skos:prefLabel>hydrographic structures</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>

IS view of same example

[… Introductory section for the whole CV: background, purpose, scope, etc. (omitted) …]

Section for concept with formal identifier: http://www.my.com/#canals
This concept can be defined as Manmade waterway used by watercraft or for drainage, irrigation, or water power. It can be used as A feature type category for places such as the Erie Canal. The official accepted word or expression for referring to this concept is canals. Another word or expression commonly used to refer to this concept is drainage canals. canals are special cases of hydrographic structures.
End of section

Section for concept with formal identifier: http://www.my.com/#hydrographic%20structures
The official accepted word or expression for referring to this concept is hydrographic structures.
End of section
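
A minimal sketch of how such sections could be generated from the SKOS example follows. The peritext strings in TEXT_PERITEXTS are reverse-engineered from the IS view above and are an assumption, not the authors' actual specification table; the function names are likewise hypothetical. Unlike the plain text-before / text-after case, the skos:broader sentence needs a lookup: the rdf:resource attribute is resolved to the preferred label of the target concept, and the sentence is phrased with the "is-a" reading ("are special cases of").

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

SKOS_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:definition>Manmade waterway used by watercraft or for drainage, irrigation, or water power</skos:definition>
    <skos:scopeNote>A feature type category for places such as the Erie Canal</skos:scopeNote>
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:altLabel>drainage canals</skos:altLabel>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://www.my.com/#hydrographic%20structures">
    <skos:prefLabel>hydrographic structures</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""

# Assumed peritexts for text-valued SKOS elements: tag -> (text-before, text-after).
TEXT_PERITEXTS = {
    f"{{{SKOS}}}definition": ("This concept can be defined as ", ". "),
    f"{{{SKOS}}}scopeNote": ("It can be used as ", ". "),
    f"{{{SKOS}}}prefLabel": (
        "The official accepted word or expression for referring to this concept is ", ". "),
    f"{{{SKOS}}}altLabel": (
        "Another word or expression commonly used to refer to this concept is ", ". "),
}


def concept_section(concept, pref_labels):
    """Render one skos:Concept as a natural-language section."""
    about = concept.get(f"{{{RDF}}}about")
    parts = [f"Section for concept with formal identifier: {about} "]
    for child in concept:
        if child.tag == f"{{{SKOS}}}broader":
            # Resolve the target URI to its preferred label (fall back to the URI).
            target = child.get(f"{{{RDF}}}resource")
            parts.append(
                f"{pref_labels.get(about, about)} are special cases of "
                f"{pref_labels.get(target, target)}. ")
        elif child.tag in TEXT_PERITEXTS:
            before, after = TEXT_PERITEXTS[child.tag]
            parts.append(before + (child.text or "").strip() + after)
    parts.append("End of section")
    return "".join(parts)


def skos_is_view(xml_text):
    """Return one rendered section per concept in the vocabulary."""
    root = ET.fromstring(xml_text)
    concepts = root.findall(f"{{{SKOS}}}Concept")
    pref_labels = {}
    for c in concepts:
        uri = c.get(f"{{{RDF}}}about")
        pref_labels[uri] = c.findtext(f"{{{SKOS}}}prefLabel") or uri
    return [concept_section(c, pref_labels) for c in concepts]


if __name__ == "__main__":
    print("\n\n".join(skos_is_view(SKOS_XML)))
```

If a concept had no preferred label, this sketch would fall back to its URI in the "special cases of" sentence, which makes concrete the cost of identifying concepts by artificial codes rather than names.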

IS specification

• Table of text-before and text-after for all SKOS elements and attributes

• Specified by designer (modeler) of CV before it is populated

IS specification

• Makes explicit the often hidden complexity of the CV model for users

• Is an opportunity for specifying extra semantics of the CV model, over and above SKOS semantics
  – Ex.: "is-a" instead of just "broader term"

• Clearly shows the cognitive price of using artificial codes, e.g., numbers instead of names to identify concepts

Extensions

• If SKOS extensions are used (e.g., custom relationships), IS specification is even more useful, because there is no "standard" interpretation of extensions

Future work (1/2)

• Development of IS framework
  – From intertexts to geometrized text
  – Application to interface / interaction design

• Application to CVs
  – IS analysis of other uses of CVs, e.g., for indexing and searching
  – Work out an IS specification for a real CV and experiment

Future work (2/2)

• Integration of IS in SKOS
  – IS-peritexts are not by refinement of SKOS documentation properties
  – Rather domain-specific XML elements and/or attributes

Thank you!

Questions?

[email protected]