Towards Semantic Business Intelligence
Semantic Technology 2010, San Francisco
Paul Haley
Automata, Inc.
(412) 716-6420
Business needs more intelligence
• Natural logic:
– Only full page color ads may run on the last page of the Times.
• Some business rules to enforce constraints:
– If an ad that is not full page is to be run on the last page of the Times then refuse the run.
– If an ad that is not color is to be run on the last page of the Times then refuse the run.
• Business rules for user interfaces:
– If asking for the size of an ad that is to be run on the last page of the Times then the only choice should be full page.
– If asking for the type of an ad that is to be run on the last page of the Times then the only choice should be color.
• More general business rules (without if):
– Ads run on the last page of the Times must be full page.
– Ads run on the last page of the Times must be color.
Copyright © 2010, Automata, Inc. SemTech 2010, San Francisco
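The observation on this slide, that one general statement can stand in for separate enforcement rules and user-interface rules, can be sketched in Python. This is only an illustration under stated assumptions: the `Ad` fields, the option lists, the `LAST_PAGE_REQUIREMENTS` table, and both function names are hypothetical and do not come from the talk.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical encoding of "ads run on the last page must be full page and color".
# The knowledge is stated once; the names and option lists are assumptions.
LAST_PAGE_REQUIREMENTS = {"size": "full page", "type": "color"}

SIZES = ["full page", "half page", "quarter page"]
TYPES = ["color", "black and white"]


@dataclass
class Ad:
    size: str
    type: str
    on_last_page: bool


def violations(ad: Ad) -> list[str]:
    """Enforcement view: list the requirements the ad fails to satisfy."""
    if not ad.on_last_page:
        return []
    return [
        f"{attr} must be {required}"
        for attr, required in LAST_PAGE_REQUIREMENTS.items()
        if getattr(ad, attr) != required
    ]


def choices(attr: str, on_last_page: bool) -> list[str]:
    """User-interface view: the options to offer when asking for an attribute."""
    options = {"size": SIZES, "type": TYPES}[attr]
    if on_last_page and attr in LAST_PAGE_REQUIREMENTS:
        return [o for o in options if o == LAST_PAGE_REQUIREMENTS[attr]]
    return list(options)


print(violations(Ad("half page", "color", on_last_page=True)))
print(choices("size", on_last_page=True))
```

Both views read the same table, so changing the requirement in one place changes the refusal logic and the pick lists together, which is the maintenance problem the if/then formulations leave open.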
Knowledge is power
• rules are too context sensitive
• too hard to conceive in context
• too hard to manage across contexts
Two convergent themes
• Semantics, IT and enterprises
• Knowledge management & acquisition
Knowledge “engineering”
• Metaphors – accessibility; sizzle
• Expressiveness – functionality, adequacy
• Utility – suitability, flexibility, reusability
Business “intelligence”
• business rules vs. logic
• behavior vs. inference
• action vs. truth
• PRR vs. RIF
• SBVR?
Business Processes and Rules
• highly conditional behavior
• many events & processes
• little semantics or inference
• little or no problem solving
Tactical and strategic
• Narrow perspectives
– business rule or decision management
– business process management
– event processing
• Broader perspectives
– business intelligence
– performance management
Semantics chasm
• lack of semantics or knowledge
– in BRMS / BPMS / CEP
– in business intelligence / performance management
• what is driving the enterprise towards…?
– W3C OWL, RIF
– OMG SBVR, BMM
  • Semantics of business: vocabulary & rules
  • Business motivation model
• if this is knowledge, how will it be managed?
BRMS / BPMS / CEP don’t understand
• Underwriting must precede approval.
• Marital status is the state of people with respect to their participation in marriage.
• A plane flies from when it takes off to when it lands.
• A plane taxis between landing and taking off except when it is parked.
• Call a customer who has not responded to a notice within the applicable period.
• If a validated application has been submitted forward it to originations (or underwriting).
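Statements such as "underwriting must precede approval" and "a plane flies from when it takes off to when it lands" need events that carry time. The following Python sketch shows one way that could look. It is an assumption-laden illustration, not the talk's model: the `Event` class, the numeric timestamps, and the choice to run a flight from the start of takeoff to the end of landing are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An occurrence with a start and an end time (hypothetical model)."""
    kind: str
    start: float
    end: float


def precedes(log: list[Event], first: str, then: str) -> bool:
    """True if every `then` event starts after some `first` event has ended."""
    first_ends = [e.end for e in log if e.kind == first]
    return all(
        any(end <= e.start for end in first_ends)
        for e in log
        if e.kind == then
    )


def flight(takeoff: Event, landing: Event) -> Event:
    """Derive the flight interval from a takeoff and a landing (assumed bounds)."""
    return Event("flight", takeoff.start, landing.end)


log = [Event("underwriting", 1, 3), Event("approval", 4, 5)]
print(precedes(log, "underwriting", "approval"))
print(flight(Event("takeoff", 10, 11), Event("landing", 20, 21)))
```

The constraint is checked against a log of occurrences rather than coded into a process flow, which is the kind of knowledge the slide says BRMS, BPMS, and CEP products do not capture.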
Processes and events
• No [adequate] ontology exists
• Very commonly used in language
• Adjectives commonly reference state
• Tense and perfection reference process
• Imperative action and past or progressive occurrence is common
• Processes and events are not just “things”
Events are primitive
• Events occur.
– They happen.
– They are temporal.
– Processes are a kind of event.
– Actions are processes.
• It’s all about the verbs.
– Tense is context for BPM & CEP
– De-verbal nouns are not just “objects”!
• A request is an action, process, and event.
– see the blog for more details
Crossing the chasm
• Rules as knowledge
• Semantics of action
• Processes as knowledge
• Semantics of events (and time)
PRR.RIF.SBVR.BRMS.BPMS.CEP convergence?!
Reach the enterprise majority
• Processes as means to objectives
• Pursuing goals and achieving objectives
• Semantics of business motivation
• Semantics of business performance
• Leverage inference
– Active business intelligence
– Active performance optimization
Concrete steps
• Increase focus on knowledge
– unify BRMS, BPMS, CEP
• Mature OWL, RIF, SBVR, et al.
– events (including processes) and time
– quantities, including time and money, …
• Increasingly dictate and govern decisions and behavior using knowledge
– increasing knowledge-driven process engines
– increasing automated decision management
• Knowledge-driven BI & performance mgmt
Semantic Business Intelligence
• Process semantics includes causality
– good or bad events, activity, and outcomes
• Process semantics will include motivation
– goals, purposes, & expectations
• Static BI vs. automated discovery
– systems will learn to predict performance
– without understanding, learning is too hard
• Semantics enables learning & prediction
Two convergent themes
• Semantics, IT and enterprises
• Knowledge management & acquisition
Knowledge “engineering”
• Metaphors – accessibility; sizzle
• Expressiveness – functionality, adequacy
• Utility – suitability, flexibility, reusability
Utility vs. Usability
[Chart: Rule Metaphor Analysis. Horizontal axis: Intuitive / Ease of Use (0 to 10). Vertical axis: Power / Expressiveness (0 to 10). Plotted metaphors: Natural Logical Rules, If / Then Rules, Tabular Rules, Decision Trees, Lookup Tables.]
• Graphical metaphors are less usable than tabular rules
• Logics may be slightly more or less expressive than rules but are not more accessible than tabular rules
Lookup tables
• Determine one of the arguments to a predicate (typically binary or ternary)
• Min/max cardinality may be 0 or >1
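A lookup table in this sense can be sketched as a mapping from the known arguments of a predicate to the values of the argument it determines. The Python below is a minimal illustration under assumptions: the ternary predicate `rate(section, size, price)`, the table name, and every value in it are invented for the example.

```python
from __future__ import annotations

# Hypothetical ternary predicate rate(section, size, price): the table is keyed
# on the first two arguments and determines the third. All values are made up.
RATE_TABLE: dict[tuple[str, str], set[float]] = {
    ("sports", "full page"): {1200.0},
    ("sports", "half page"): {650.0, 700.0},  # more than one value is allowed
}


def lookup(section: str, size: str) -> set[float]:
    """Return every value of the determined argument; possibly none at all."""
    return RATE_TABLE.get((section, size), set())


print(lookup("sports", "full page"))
print(lookup("news", "full page"))
```

Returning a set rather than a single value reflects the cardinality point on the slide: a key may map to no value or to several.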
Other tabular metaphors
• Primitive
• Spreadsheet rules
– one rule per row or column
– multiple antecedents or consequents
– variable scoping (givens) per sheet
– binary predicate limitation
• Tabular organization of statements
Primitive metaphor
• Tabular metaphor
– rules correspond to 3 column tables
  • leverage properties’ maximum arity of 2
  • predicate, domain, range (prefix or infix feasible)
  • support for literals, variables (amounts feasible)
  • negated, disjunctive, and inequality restrictions feasible
  • a 4th [tree] column supporting more than conjunction is feasible
– first or last row(s) are consequent(s)
– other rows are antecedents
• Textual metaphor
– wordings of properties as 5 column tables
  • text before | optional domain or range | text between | optional range or domain | text after
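The three-column tabular metaphor can be sketched as a list of (predicate, domain, range) rows in which the last row is the consequent and the other rows are conjoined antecedents. The Python below is an illustrative sketch only: the `?` prefix for variables, the example rule, the facts, and the function names are assumptions, and only conjunction is supported (no negation, disjunction, or fourth column).

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (predicate, domain, range)
Bindings = Dict[str, str]

# A rule as a three-column table; the last row is the consequent (assumed layout).
RULE: List[Triple] = [
    ("runs on", "?ad", "last page"),
    ("has size", "?ad", "full page"),
    ("has type", "?ad", "color"),
    ("is", "?ad", "acceptable"),
]

FACTS: Set[Triple] = {
    ("runs on", "ad1", "last page"),
    ("has size", "ad1", "full page"),
    ("has type", "ad1", "color"),
    ("runs on", "ad2", "last page"),
    ("has size", "ad2", "half page"),
    ("has type", "ad2", "color"),
}


def unify(row: Triple, fact: Triple, env: Bindings) -> Optional[Bindings]:
    """Match one table row against one fact, extending the variable bindings."""
    env = dict(env)
    for pattern, value in zip(row, fact):
        if pattern.startswith("?"):
            if env.setdefault(pattern, value) != value:
                return None
        elif pattern != value:
            return None
    return env


def fire(rule: List[Triple], facts: Set[Triple]) -> Set[Triple]:
    """Conjoin the antecedent rows and instantiate the consequent row."""
    *antecedents, consequent = rule
    envs: List[Bindings] = [{}]
    for row in antecedents:
        next_envs = []
        for env in envs:
            for fact in facts:
                extended = unify(row, fact, env)
                if extended is not None:
                    next_envs.append(extended)
        envs = next_envs
    return {tuple(env.get(term, term) for term in consequent) for env in envs}


print(fire(RULE, FACTS))
```

Because every row is a binary property, the whole rule fits in three columns, which is the "maximum arity of 2" point made above.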
Textual metaphors
• Formal syntax
– technical rule languages
– FOL, SILK, etc.
– SBVR’s “formalized” English
• Structured textual metaphors
– Top-down structure editing
– Left to right authoring or parsing
• Grammars ranging from pseudo-code to “business language” to “natural language”
Textual versus linguistic
• SBVR defines wordings as structured text, not linguistically
– vocabulary and wordings are (generally) “just” text (e.g., without parts of speech)
– SBVR does not manage its vocabulary in a dictionary with parts of speech, conjugations, or plurals, nor with numbered definitions (i.e., word senses) corresponding to wordings
• SBVR does provide for ontology
Top-Down Structure Editing
• IBM within Word
• Clauses correspond to relations with noun phrases for roles
• Structured editing and pick lists avoid ambiguity
• Simple and effective
Existing structured “language”
• No grammar or vocabulary required
– Very little natural language processing involved
– Easily internationalized using text
• Generally limited to “if” formulations
– Reference is by choice not inference
– No ambiguity, the parse is implicit (or explicit)
• Expressiveness limited by small equivalent grammar
– limited use of adjectives
– relative clauses limited (e.g., “that”/“which”/“who”)
– passive voice and possessive or prepositional forms limited
– limited (if any) support for plurals (e.g., sets and aggregates)
– such limitations result in logic limitations and wordiness
Document metaphors
• Microsoft Word
– Microsoft acquired SBVR from Unisys
– Oracle supports Microsoft Word via Haley
• Wiki metaphor
– supporting collaborative KA from Semantic MediaWiki
• Typical functionality and feedback loop:
– ambiguous or clearly understood fragments readily distinguished from draft content
– rendering of draft content drives vocabulary acquisition
– defining vocabulary or phrasings fleshes out ontology
– content becomes increasingly understood as linguistic ontology is acquired
VocaWiki
• Linguistic feedback improves authoring as in Simplified English
• Encourages authors to contribute more formal knowledge
• Facilitates knowledge acquisition from content through gardening
• Facilitates community / collaborative KA
• Simplifies and improves query
Logical interpretation
• Only 6x21” ads can run on the back page of any section in the newspaper.
• For color reservations, pick-ups and multiple appearance ads are not allowed.
• When Vulcan runs a 2x7” on pgs 2-3 they should only be charged for a 2x5.25” at their contract rate.
• Book ads shall receive a lower per column inch rate for a 6x21” than for other sizes.
• Ad schedules using the same material with the same reservation number must always be the same size.
Clarification or Disambiguation
• only full page color ads run on back pages
– only ads run on back pages
– ads run on back pages must be color
– ads run on back pages must be full page
• show plausible implications of interpretations
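Showing the implications of each reading can be as simple as running every candidate interpretation against a few sample items and reporting which ones it would refuse. The Python below is a hedged sketch of that idea, not the tool shown in the talk: the `Item` class, the sample items, and the encoding of each reading as a predicate are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    is_ad: bool
    full_page: bool
    color: bool


# Each reading is a constraint on anything that runs on a back page.
READINGS = {
    "only ads run on back pages": lambda i: i.is_ad,
    "ads run on back pages must be color": lambda i: not i.is_ad or i.color,
    "ads run on back pages must be full page": lambda i: not i.is_ad or i.full_page,
}

SAMPLES = [
    Item("editorial", is_ad=False, full_page=True, color=True),
    Item("half-page color ad", is_ad=True, full_page=False, color=True),
    Item("full-page b/w ad", is_ad=True, full_page=True, color=False),
]


def implications(items: list[Item]) -> dict[str, list[str]]:
    """For each reading, name the sample items it would refuse on a back page."""
    return {
        reading: [i.name for i in items if not allowed(i)]
        for reading, allowed in READINGS.items()
    }


for reading, refused in implications(SAMPLES).items():
    print(f"{reading}: refuses {refused}")
```

An author who did not intend to ban editorial content from back pages would see the first reading refuse the editorial sample and could reject that interpretation.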
NLP and Linguistic Ontology
[Diagram: sentence authoring loop, with these steps]
• dictate a sentence
• analyze lexemes used in the sentence
• hypothesize concepts referenced by noun phrases
• hypothesize relationships referenced by phrases / clauses
• determine plausible interpretations of sentence
• identify ambiguities or grammatical or lexical issues
• restate or edit the sentence
• update repository with formal logic for sentence
• update repository of sentences