Building a Top Down Ontology From the Bottom Up: Step by Step Approach for Identifying & Constructing Dimensions of an Ontology. Draft (v0.8): Denise Bedford / 2006.06.08

Building a Top Down Ontology From the Bottom Up

Step by Step Approach for Identifying & Constructing

Dimensions of an Ontology

draft (v0.8): Denise Bedford / 2006.06.08

Step by Step Approach

• Step 1: Identify the boundaries of the ontology
  – What will be ‘ontologized’ (broad definition of content)?
  – Who will use the ontology?
  – How will they use the ontology?

• The following steps pertain to creating one dimension of an ‘ontology’ to apply to content:

• Step 2: Create a content inventory
  – Identify the sources of content
  – Use an inventory tool (COAST) to generate a full inventory
  – The working group weeds/selects from the inventory to create a core set of content to work with

• Step 3. Extract a list of concepts from the content
  – Use the inventory to capture content items as a training set
  – Identify the types of concepts to be extracted: noun-phrase descriptors, entity identifiers, names, institutions, etc.
  – Configure and run the concept extraction
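Steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventory records, the stopword list, and the function names are hypothetical, and stopword-free word n-grams are used as a crude stand-in for a real noun-phrase extractor. It does not represent how COAST or any particular concept extraction tool works.

```python
import re
from collections import Counter

# Hypothetical inventory records; a real inventory tool would supply these.
INVENTORY = [
    {"id": "doc-1", "source": "intranet",
     "text": "A controlled vocabulary supports metadata schema design."},
    {"id": "doc-2", "source": "library",
     "text": "The metadata schema maps each controlled vocabulary to a class hierarchy."},
    {"id": "doc-3", "source": "library",
     "text": "A class hierarchy organizes concepts."},
]

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "to", "of", "each", "and", "in", "for"}


def candidate_phrases(text, max_len=2):
    """Yield word n-grams (1..max_len words) that contain no stopword."""
    words = re.findall(r"[a-z]+", text.lower())
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if not any(w in STOPWORDS for w in gram):
                yield " ".join(gram)


def extract_concepts(inventory, min_docs=2):
    """Return phrases that occur in at least min_docs inventory items."""
    doc_freq = Counter()
    for item in inventory:
        # Count each phrase once per item (document frequency).
        doc_freq.update(set(candidate_phrases(item["text"])))
    return sorted(p for p, n in doc_freq.items() if n >= min_docs)


concepts = extract_concepts(INVENTORY)
print(concepts)
```

Requiring a phrase to appear in more than one content item is one simple way to keep the concept list short enough for the review in Step 4; the threshold is a tuning choice, not a rule.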

Step by Step Approach

• Step 4. Review the list of concepts
  – Quickly scan the concept list to determine concentrations of concepts
  – Check whether these concentrations make sense in terms of ‘categories’
  – If so, begin to build a categorization profile and organize the concepts within it
  – Determine what is missing from the list of concepts (domain experts help us here)
  – Determine what is in the list that is not pertinent to the topic, i.e. peripheral or out of bounds (domain experts help us here, too)
  – Prune the list of concepts; in some cases, find new content and repeat the process

• Step 5. Build the categorizer profile
  – Build a rule-based categorizer around the concept clusters (manual bunching at a very coarse level)
  – Or, check the clustering of concepts using a clustering engine (here you can feed the refined list of concepts back into a clustering engine and run them against the training set)
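The rule-based option in Step 5 can be sketched as follows. This is a minimal sketch, assuming a profile is simply a mapping from a category name to the concepts bunched under it by hand; the category names, concepts, and function names are hypothetical, and plain substring matching stands in for whatever rule language a real categorizer provides.

```python
# Hypothetical categorization profile: category -> concepts bunched by hand.
PROFILE = {
    "Vocabulary control": {"controlled vocabulary", "thesaurus", "synonym ring"},
    "Structure": {"class hierarchy", "taxonomy", "metadata schema"},
}


def categorize(text, profile, min_hits=1):
    """Return {category: matched concepts} for categories with enough hits."""
    lowered = text.lower()
    results = {}
    for category, concepts in profile.items():
        # Crude rule: a concept "fires" if it appears as a substring.
        hits = sorted(c for c in concepts if c in lowered)
        if len(hits) >= min_hits:
            results[category] = hits
    return results


def uncategorized(concepts, profile):
    """Concepts from the extracted list that no category covers yet."""
    covered = set().union(*profile.values())
    return sorted(set(concepts) - covered)


sample = "The thesaurus links each taxonomy to a metadata schema."
print(categorize(sample, PROFILE))
print(uncategorized(["taxonomy", "inference engine"], PROFILE))
```

The `uncategorized` helper supports the Step 4 review: concepts that fall outside every category are candidates either for pruning or for a new category.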

Questions for Domain Expert Review

1. If you were talking about ontology with an expert, are all of the concepts you would use included in the list? If not, what is missing?

2. Are there a few concepts missing, or is there a larger domain or knowledge area that is missing?

3. What is in the list that doesn’t pertain to ontologies?

4. If you were looking for information about ontologies from an expert’s point of view, would you use any of these concepts to search? Which ones are missing? What shouldn’t be in the list?

5. If you were looking for information about ontologies from a novice’s point of view, what is missing from the list of concepts? What shouldn’t be there?

Step by Step Approach

• Step 6. Test the Categorization Profile against the content

– Define the XML output structure for the metadata

– Run the profile against the content

– Review the categorization results

– Accept/refine the profile

• Other steps to creating the full ontology
  – Determining what kind of functionality you need to support use of the ontologized content
  – …search & discovery system
  – …browse of categories of content
  – …reporting
  – …recommender engines
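Step 6 (define the XML output, run the profile, review the results) might look like the sketch below. The element and attribute names (`categorization`, `item`, `category`, `concept`) are assumptions made for illustration, as are the sample profile and content; the slides do not specify an output schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile and content, for illustration only.
PROFILE = {"Structure": ["taxonomy", "metadata schema"]}
CONTENT = {
    "doc-1": "A taxonomy needs a metadata schema.",
    "doc-2": "Quarterly budget report.",
}


def run_profile(content, profile):
    """Run the profile against each content item and return an XML tree."""
    root = ET.Element("categorization")
    for doc_id, text in content.items():
        item = ET.SubElement(root, "item", id=doc_id)
        lowered = text.lower()
        for category, concepts in profile.items():
            hits = [c for c in concepts if c in lowered]
            if hits:
                cat = ET.SubElement(item, "category", name=category)
                for concept in hits:
                    ET.SubElement(cat, "concept").text = concept
    return root


root = run_profile(CONTENT, PROFILE)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Review aid: items that received no category point at gaps in the profile
# or at content that is out of bounds for the ontology.
unmatched = [item.get("id") for item in root.findall("item")
             if item.find("category") is None]
print(unmatched)
```

Listing the items that matched no category gives the working group a concrete starting point for the accept/refine decision.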