INTRO TO DATABASE CLASSES: EXERCISING THE COMPUTING ONTOLOGY
Lois Delcambre and Felicia Decker
with Dave Maier, Len Shapiro, Rafael Fernandez
HIGH LEVEL GOALS
Index the topics described in syllabi for DB class
Exercise the Computing Ontology (CO):
  to describe lecture topics
  to describe (lots of small) things – including small-grained pieces of lectures, assignments, tests, …
Build analysis tools to determine whether different classes covered the same topics, in the same depth, in the same order
Start of DB portion of the Computing Ontology
A bit more of the DB portion of the CO
BACKGROUND WORK FOR FELICIA
Learn XML, RDF, OWL
Become familiar with the CO
Use various OWL tools (Protégé)
Find sample data:
  Search for DB course syllabi (using CITIDEL, Google)
  Check to see if slides, assignments, and tests were available online
  Select 6 syllabi
REPRESENT SYLLABI IN XML W/XML SCHEMA
Define initial XML schema
Investigate the use of a tool that would allow forms-based data entry using an XML schema (ultimately didn’t find one to use)
Put the 6 syllabi into the XML schema (iterate)
Prepare an XML summary of each syllabus using XSL
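As a rough illustration of the summary step, the sketch below reads a syllabus in XML and prints one line per lecture. It uses Python's standard xml.etree.ElementTree module rather than XSL, and the element and attribute names (syllabus, lecture, topic, course, number) are assumptions made for the example, not the schema defined in this project.

```python
import xml.etree.ElementTree as ET

# Hypothetical syllabus document. The element and attribute names are
# assumptions for illustration, not the project's actual XML schema.
SAMPLE = """
<syllabus course="CS 145">
  <lecture number="1">
    <topic>relational model</topic>
  </lecture>
  <lecture number="2">
    <topic>SQL queries</topic>
    <topic>views</topic>
  </lecture>
</syllabus>
"""


def summarize(xml_text):
    """Return a list of summary lines: the course, then one line per lecture."""
    root = ET.fromstring(xml_text.strip())
    lines = [f"Course: {root.get('course')}"]
    for lecture in root.findall("lecture"):
        topics = [t.text.strip() for t in lecture.findall("topic")]
        lines.append(f"Lecture {lecture.get('number')}: {', '.join(topics)}")
    return lines


if __name__ == "__main__":
    for line in summarize(SAMPLE):
        print(line)
```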
CS 145 Syllabus – in XML, with our XML Schema
LABEL/INDEX LECTURE TOPICS W/CO TERMS
Hand label each topic (as listed on the syllabus) with one or more CO terms
CS 145 Syllabus – 1st class, with topic shown
LABEL/INDEX LECTURE TOPICS W/CO TERMS (CONT.)
Identify list of questions:
  Terms with multiple paths
  Missing terms
  Terms where Felicia wasn’t sure they were right
Prepare a “vanilla” spreadsheet of all questions
Have 4 DB profs answer the questions/choose terms
Compile feedback
MULTIPLE PATH EXAMPLE
Lecture topic: authorization
Ontology term: Ownership_&_Access_Control_-_Authorization_Techniques
Paths:
  Information_Topics/Database_Systems/Components_of_Database_Systems/Database_Administration/Ownership_&_Access_Control_-_Authorization_Techniques
  Information_Topics/Managing_the_Database_Environment/Database_Administration/Ownership_&_Access_Control_-_Authorization_Techniques
MULTIPLE PATHS – EXAMPLE 2
Lecture topic: buffer replacement policy
Ontology term: Buffer_Management
Paths:
  Information_Topics/File_Processing/Buffer_Management
  Information_Topics/Database_Systems/Physical_Database_Design/File_Processing/Buffer_Management
MULTIPLE PATHS – EXAMPLE 3
Lecture topic: query optimization
Ontology term: Query_Optimization
Paths:
  Information_Topics/Database_Systems/Database_Languages/Query_Languages/Query_Optimization
  Information_Topics/Storage_and_Retrieval_of_Semistructured_Information/Database_Languages/Query_Languages/Query_Optimization
  Programming_Languages/Programming_Language_Classifications/Query_Languages/Query_Optimization
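The multiple-path cases in the three examples above were found by hand. A minimal sketch of how they could be detected automatically is shown here, assuming the ontology is available as a flat list of slash-separated path strings (an assumption; the CO itself is expressed in OWL). The function name is hypothetical.

```python
from collections import defaultdict


def terms_with_multiple_paths(paths):
    """Group paths by their final segment and keep terms reachable by 2+ paths."""
    by_term = defaultdict(list)
    for path in paths:
        term = path.rsplit("/", 1)[-1]  # the last path segment is the term name
        by_term[term].append(path)
    return {term: ps for term, ps in by_term.items() if len(ps) > 1}


# Paths taken from the slides; the flat-list representation is an assumption.
PATHS = [
    "Information_Topics/File_Processing/Buffer_Management",
    "Information_Topics/Database_Systems/Physical_Database_Design/"
    "File_Processing/Buffer_Management",
    "Information_Topics/Database_Systems/Database_Languages/SQL/Triggers",
]

if __name__ == "__main__":
    for term, term_paths in terms_with_multiple_paths(PATHS).items():
        print(term)
        for p in term_paths:
            print("  " + p)
```

Grouping by the last segment treats two nodes with the same name as the same term, which matches how the examples above are presented but may over-report if the ontology reuses a name for unrelated concepts.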
(POSSIBLE) MISSING TERMS
Embedded SQL
Views
External Sorting
SQL Queries
This path: Information_Topics/Database_Systems/Database_Languages/SQL has these children:
  SQL_Optimization_Techniques
  SQL_as_DDL
  SQL_as_DML
  Stored_Procedures
  Triggers
PRODUCE REPORTS
Write a Java program that accepts multiple syllabi as input (in XML) and produces:
  3-part summary of each syllabus
  Topical comparison report – showing common topics and other topics – across the input syllabi
  Rank comparison report – showing the 2nd through last syllabi compared to topics listed in order of the first class
  Full syllabus report – showing CO terms, with terms that have questions highlighted (in yellow)
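The topical and rank comparison reports listed above can be sketched as follows. This is an illustration in Python, not the Java program itself; it assumes each syllabus has already been reduced to an ordered list of CO terms, and the class names and function names are hypothetical.

```python
def topical_comparison(syllabi):
    """Return (common terms, other terms per class) for a dict of term lists."""
    term_sets = {name: set(terms) for name, terms in syllabi.items()}
    common = set.intersection(*term_sets.values())
    other = {name: sorted(terms - common) for name, terms in term_sets.items()}
    return sorted(common), other


def rank_comparison(syllabi):
    """List the first class's terms in order, with each term's 1-based
    position in every later class (None if the class does not cover it)."""
    names = list(syllabi)
    first, rest = names[0], names[1:]
    rows = []
    for term in syllabi[first]:
        positions = [
            syllabi[n].index(term) + 1 if term in syllabi[n] else None
            for n in rest
        ]
        rows.append((term, positions))
    return rows


# Hypothetical input: the class names and topic orderings are made up.
SYLLABI = {
    "Class A": ["SQL_as_DDL", "SQL_as_DML", "Query_Optimization", "Buffer_Management"],
    "Class B": ["SQL_as_DML", "Triggers", "Buffer_Management", "Query_Optimization"],
}

if __name__ == "__main__":
    common, other = topical_comparison(SYLLABI)
    print("Common:", common)
    print("Other:", other)
    for term, positions in rank_comparison(SYLLABI):
        print(term, positions)
```

With more than two syllabi, "common" here means covered by every class; a report that counts how many classes cover each term would be a reasonable alternative.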
Comparing two syllabi for coverage of topics: common topics highlighted in yellow
Unique topics, for each class, shown below
Comparing the order of topics covered: topics listed in order of 1st class
Additional topics from 2nd class shown below
LABELING LECTURES – AT FINE GRAIN
Hand-label fine-grained bits of lecture slides – using terms from the CO
Hand-label individual questions from a midterm test using terms from the CO
Investigated the use of the Apache POI tool to extract text from PowerPoint, programmatically
Work in progress
NOVEMBER 2009
Explain what this project is about
Invite people to:
  Contribute their DB class syllabus to Ensemble
  Enter their syllabus into XML
  Index their topics using the computing ontology
We could use the Syllabus Collection in Citidel
We could use the CO in Drupal
Write a description/story; build a UI to do the XML and indexing