
Kansas State University

Department of Computing and Information Sciences

Real-Time Bayesian Network Inference for Decision Support in Personnel Management:

Report on Research Activities

William H. Hsu, Computing and Information Sciences

Haipeng Guo, Computing and Information Sciences

Shing I Chang, Industrial and Manufacturing Systems Engineering

Kansas State University
http://groups.yahoo.com/group/onr-mpp

This presentation is available at:

http://www.kddresearch.org/KSU/CIS/ONR-2002-Jun-04.ppt


Overview

• Knowledge Discovery in Databases (KDD)
– Towards scalable data mining

– Applications of KDD: learning and reasoning

• Building Causal Models for Decision Support

• Time Series and Model Integration
– Prognostic (prediction and monitoring) applications

– Crisis monitoring and simulation

– Anomaly, intrusion, fraud detection

– Web log analysis

– Applying high-performance neural, genetic, Bayesian computation

• Information Retrieval: Document Categorization, Text Mining
– Business intelligence applications (e.g., patents)

– “Web mining”: dynamic indexing and document analysis

• High-Performance KDD Program at K-State


High-Performance Database Mining and KDD:
Current Research Programs at K-State

• Laboratory for Knowledge Discovery in Databases (KDD)

– Research emphases: machine learning, reasoning under uncertainty

– Applications
• Decision support

• Digital libraries and information retrieval

• Remote sensing, robot vision and control

• Human-Computer Interaction (HCI) - e.g., simulation-based training

• Computational science and engineering (CSE)

• Curriculum and Research Development

– Real-time automated reasoning (inference)

– Machine learning

– Probabilistic models for multi-objective optimization

– Intelligent displays: visualization of diagrammatic models

– Knowledge-based expert systems, data modeling for KDD


Stages of Data Mining and Knowledge Discovery in Databases


Visual Programming: Java-Based Software Development Platform

D2K © 2002 National Center for Supercomputing Applications (NCSA)

Used with permission.


[Figure: the "Sprinkler" BBN. X1 Season: Spring, Summer, Fall, Winter; X2 Sprinkler: On, Off; X3 Rain: None, Drizzle, Steady, Downpour; X4 Ground: Wet, Dry; X5 Ground: Slippery, Not-Slippery]

P(Summer, Off, Drizzle, Wet, Not-Slippery) = P(S) · P(O | S) · P(D | S) · P(W | O, D) · P(N | W)

• Conditional Independence
– X is conditionally independent (CI) of Y given Z (sometimes written X ⊥ Y | Z) iff

P(X | Y, Z) = P(X | Z) for all values of X, Y, and Z

– Example: P(Thunder | Rain, Lightning) = P(Thunder | Lightning), i.e., T ⊥ R | L

• Bayesian Network
– Directed graph model of conditional dependence assertions (or CI assumptions)

– Vertices (nodes): denote events (each a random variable)

– Edges (arcs, links): denote conditional dependencies

• General Product (Chain) Rule for BBNs

• Example (“Sprinkler” BBN)

P(X1, X2, …, Xn) = ∏i=1..n P(Xi | parents(Xi))

Bayesian Belief Networks (BBNs): Definition
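To make the chain rule concrete, here is a minimal Java sketch (illustrative only; the Node record and CPT key format are assumptions, not BNJ's API) that evaluates P(X1, …, Xn) = ∏ P(Xi | parents(Xi)) for one complete assignment, using a two-node fragment of the sprinkler network with made-up CPT entries:

```java
import java.util.*;

/** Minimal sketch (illustrative, not BNJ's API): evaluate the chain rule
 *  P(x1, ..., xn) = prod_i P(xi | parents(xi)) for one full assignment. */
public class ChainRuleDemo {
    /** Node = indices of parent variables + CPT keyed by "parentVals->ownVal". */
    record Node(int[] parents, Map<String, Double> cpt) {}

    static double joint(List<Node> net, String[] x) {
        double p = 1.0;
        for (int i = 0; i < net.size(); i++) {
            StringBuilder key = new StringBuilder();
            for (int j : net.get(i).parents()) key.append(x[j]).append(',');
            key.append("->").append(x[i]);
            p *= net.get(i).cpt().getOrDefault(key.toString(), 0.0); // P(Xi | parents(Xi))
        }
        return p;
    }

    public static void main(String[] args) {
        // Two-node fragment of the sprinkler BBN: Season -> Sprinkler (CPT values made up).
        Node season = new Node(new int[]{}, Map.of("->Summer", 0.25));
        Node sprinkler = new Node(new int[]{0}, Map.of("Summer,->On", 0.6, "Summer,->Off", 0.4));
        // P(Summer, Off) = P(Summer) * P(Off | Summer) = 0.25 * 0.4 = 0.1
        System.out.println(joint(List.of(season, sprinkler), new String[]{"Summer", "Off"}));
    }
}
```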


Bayesian Networks and Recommender Systems

• Current Research

– Efficient BBN inference (parallel, multi-threaded Lauritzen-Spiegelhalter in D2K)

– Hybrid quantitative and qualitative inference (“simulation”)

– Continuous variables and hybrid (discrete/continuous) BBNs

– Induction of hidden variables

– Local structure: localized constraints and assumptions, e.g., Noisy-OR BBNs

– Online learning

• Incrementality (aka lifelong, situated, in vivo learning)

• Ability to change network structure during inferential process

– Polytree structure learning (tree decomposition): alternatives to Chow-Liu

– Complexity of learning, inference in restricted classes of BBNs

• Future Work

– Decision networks aka influence diagrams (BBN + utility)

– Anytime / real-time BBN inference for time-constrained decision support

– Some temporal models: Dynamic Bayesian Networks (DBNs)


Data Mining: Development Cycle

[Bar chart: relative effort (%) per stage, 0-60 scale: Objective Determination, Data Preparation, Machine Learning, Analysis & Assimilation]

• Model Identification
– Queries: classification, assignment

– Specification of data model

– Grouping of attributes by type

• Prediction Objective Identification
– Assignment specification

– Identification of metrics

• Reduction
– Refinement of data model

– Selection of relevant data (quantitative, qualitative)

• Synthesis: New Attributes

• Integration: Multiple Data Sources (e.g., Enlisted Master File, Surveys)

[Diagram: Environment (Data Model), Learning Element, Knowledge Base, Decision Support System]


Learning Bayesian Networks: Gradient Ascent

• Algorithm Train-BN (D)

– Let wijk denote one entry in the CPT for variable Yi in the network

• wijk = P(Yi = yij | parents(Yi) = <the list uik of values>)

• e.g., if Yi ≡ Campfire, then (for example) uik ≡ <Storm = T, BusTourGroup = F>

– WHILE termination condition not met DO // perform gradient ascent

• Update all CPT entries wijk using training data D

• Renormalize wijk to assure invariants:

• Applying Train-BN

– Learns CPT values

– Useful in case of known structure

– Key problems: learning structure from data, approximate inference

[Figure: example BBN over Storm, BusTourGroup, Lightning, Campfire, Thunder, ForestFire]

Update rule: wijk ← wijk + η Σx ∈ D Ph(yij, uik | x) / wijk

Invariants: 0 ≤ wijk ≤ 1 and Σj wijk = 1
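A minimal sketch of one update pass, assuming the inferred posterior Ph(yij, uik | x) for each training case has already been computed by an inference engine and is supplied as post[d][j][k]; the array layout and learning rate eta are illustrative, not BNJ code:

```java
/** One gradient-ascent step on a single variable's CPT, following the update
 *  rule above. w[j][k] = P(Yi = yij | parents = uik); post[d][j][k] stands in
 *  for the inferred Ph(yij, uik | x_d). Illustrative sketch, not BNJ code. */
public class CptGradientStep {
    static void update(double[][] w, double[][][] post, double eta) {
        for (int j = 0; j < w.length; j++)
            for (int k = 0; k < w[j].length; k++) {
                double grad = 0.0;
                for (double[][] pd : post) grad += pd[j][k] / w[j][k]; // sum over x in D
                w[j][k] += eta * grad;
            }
        // Renormalize to restore the invariants 0 <= w[j][k] <= 1, sum_j w[j][k] = 1.
        for (int k = 0; k < w[0].length; k++) {
            double z = 0.0;
            for (double[] row : w) z += Math.max(row[k], 0.0);
            for (double[] row : w) row[k] = Math.max(row[k], 0.0) / z;
        }
    }
}
```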


• General-Case BBN Structure Learning: Use Inference to Compute Scores

• Recall: Bayesian Inference aka Bayesian Reasoning

– Assumption: hypotheses h ∈ H are mutually exclusive and exhaustive

– Optimal strategy: combine predictions of hypotheses in proportion to likelihood

• Compute conditional probability of hypothesis h given observed data D

• i.e., compute expectation over unknown h for unseen cases

• Let h ≡ structure, parameters Θ ≡ CPTs

Scores for Learning Structure: The Role of Inference

P(xm+1 | D) = P(xm+1 | x1, x2, …, xm) = Σh ∈ H P(xm+1 | D, h) P(h | D)

P(h | D) ∝ P(D | h) P(h)   [Posterior Score ∝ Marginal Likelihood × Prior over Structures]

P(D | h) = ∫ P(D | Θ, h) P(Θ | h) dΘ   [Marginal Likelihood = ∫ Likelihood × Prior over Parameters]


Learning Structure: K2 Algorithm and ALARM

• Algorithm Learn-BBN-Structure-K2 (D, Max-Parents)

FOR i ← 1 to n DO // arbitrary ordering of variables {x1, x2, …, xn}

WHILE (Parents[xi].Size < Max-Parents) DO // find best candidate parent

Best ← argmax j < i P(D | Parents[xi] ∪ {xj}) // max Dirichlet score

IF ((Parents[xi] + Best).Score > Parents[xi].Score) THEN Parents[xi] += Best

RETURN ({Parents[xi] | i ∈ {1, 2, …, n}})
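The greedy loop above, sketched in Java; the Bayesian Dirichlet score is abstracted as a supplied function rather than implemented, and all names are illustrative:

```java
import java.util.*;
import java.util.function.BiFunction;

/** Sketch of K2's greedy parent search (per the pseudocode above). The
 *  Dirichlet score P(D | Parents[xi]) is passed in, not implemented here. */
public class K2Sketch {
    static List<Set<Integer>> k2(int n, int maxParents,
                                 BiFunction<Integer, Set<Integer>, Double> score) {
        List<Set<Integer>> parents = new ArrayList<>();
        for (int i = 0; i < n; i++) {                // fixed variable ordering x0..x(n-1)
            Set<Integer> pi = new HashSet<>();
            double old = score.apply(i, pi);
            boolean improved = true;
            while (improved && pi.size() < maxParents) {
                improved = false;
                int best = -1;
                double bestScore = old;
                for (int j = 0; j < i; j++) {        // candidate parents: predecessors only
                    if (pi.contains(j)) continue;
                    pi.add(j);
                    double s = score.apply(i, pi);   // score with xj added
                    pi.remove(j);
                    if (s > bestScore) { bestScore = s; best = j; }
                }
                if (best >= 0) { pi.add(best); old = bestScore; improved = true; }
            }
            parents.add(pi);
        }
        return parents;
    }
}
```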

• A Logical Alarm Reduction Mechanism [Beinlich et al., 1989]

– BBN model for patient monitoring in surgical anesthesia

– Vertices (37): findings (e.g., esophageal intubation), intermediates, observables

– K2: found BBN different in only 1 edge from gold standard (elicited from expert)

[Figure: ALARM network graph, 37 numbered vertices]


Major Software Releases, FY 2002

• Bayesian Network Tools in Java (BNJ)
– v1.0a released Wed 08 May 2002 to www.Sourceforge.net

– Key features

• Standardized data format (XML)

• Existing algorithms: inference, structure learning, data generation

– Experimental results

• Improved structure learning using K2, inference-based validation

• Adaptive importance sampling (AIS) inference competitive with best published algorithms

• Machine Learning in Java (MLJ)
– v1.0a released Fri 10 May 2002 to www.Sourceforge.net

– Key features: three (3) inductive learning algorithms from MLC++, two (2) inductive learning wrappers (1 from MLC++, 1 from the GA literature)

– Experimental results

• Genetic wrappers for feature subset selection: Jenesis, MLJ-CHC

• Overfitting control in supervised inductive learning for classification


• About BNJ
– v1.0a, 08 May 2002: 26000+ lines of Java code, GNU General Public License (GPL)
– http://www.kddresearch.org/Groups/Probabilistic-Reasoning/BNJ
– Key features [Perry, Stilson, Guo, Hsu, 2002]

• XML BN Interchange Format (XBN) converter – to serve 7 client formats (MSBN, Hugin, SPI, IDEAL, Ergo, TETRAD, Bayesware)

• Full exact inference: Lauritzen-Spiegelhalter (Hugin) algorithm
• Five (5) importance sampling algorithms (see the sketch after this list): forward simulation (likelihood weighting) [Shachter and Peot, 1990], probabilistic logic sampling [Henrion, 1986], backward sampling [Fung and del Favero, 1995], self-importance sampling [Shachter and Peot, 1990], adaptive importance sampling [Cheng and Druzdzel, 2000]

• Data generator
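A sketch of the first of the five samplers, likelihood weighting (forward simulation); the Net interface, evidence encoding, and query signature are assumptions for illustration, not BNJ's actual classes:

```java
import java.util.*;

/** Likelihood weighting (forward simulation) sketch. dist(i, x) must return
 *  the distribution P(Xi | parents(Xi)) under partial assignment x; nodes are
 *  assumed to be indexed in topological order. Illustrative, not BNJ code. */
public class LikelihoodWeighting {
    interface Net { double[] dist(int i, int[] x); }

    /** Estimate P(Xq = qVal | evidence); evidence[i] = -1 means Xi unobserved. */
    static double query(Net net, int n, int[] evidence, int q, int qVal, int samples) {
        Random rng = new Random(0);
        double num = 0, den = 0;
        for (int s = 0; s < samples; s++) {
            int[] x = evidence.clone();
            double w = 1.0;
            for (int i = 0; i < n; i++) {
                double[] p = net.dist(i, x);
                if (evidence[i] >= 0) {
                    w *= p[evidence[i]];                 // weight by likelihood of the evidence
                } else {
                    double u = rng.nextDouble(), c = 0.0;
                    x[i] = p.length - 1;                 // fallback against rounding error
                    for (int v = 0; v < p.length; v++) {
                        c += p[v];
                        if (u < c) { x[i] = v; break; }  // sample Xi from P(Xi | parents)
                    }
                }
            }
            den += w;
            if (x[q] == qVal) num += w;
        }
        return num / den;                                // assumes den > 0
    }
}
```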

• Published Research with Applications to Personnel Science
– Recent work

• GA for improved structure learning: results in [HGPS02a; HGPS02b]
• Real-time inference framework – multifractal analysis [GH02b]

– Current work: prediction – migration trends (EMF); Sparse Candidate
– Planned continuation: (dynamic) decision networks; continuous BNs

Bayesian Network Tools in Java (BNJ)


Change of Representation and Inductive Bias Control

[Diagram: [A] Genetic Algorithm proposes a candidate representation α; [B] Representation Evaluator for Learning Problems takes training data D (Dtrain for inductive learning, Dval for inference) and an inference specification eI, returning representation fitness f(α); the search yields an optimized representation α̂]

GA for BN Structure Learning [Hsu, Guo, Perry, Stilson, GECCO-2002]
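A sketch of the wrapper loop in the diagram above; for brevity, a simple mutate-and-keep-if-better search stands in for the actual GA (Jenesis/CHC), and the fitness function is assumed to encapsulate "learn on Dtrain, measure inferential loss on Dval":

```java
import java.util.*;
import java.util.function.Function;

/** Wrapper-loop sketch: [A] proposes candidate representations (attribute
 *  inclusion masks alpha); [B], wrapped in `fitness`, returns f(alpha).
 *  A (1+1)-style hill climber stands in for the real GA. Illustrative only. */
public class GaWrapperSketch {
    static boolean[] search(int nAttrs, int gens, Function<boolean[], Double> fitness) {
        Random rng = new Random(0);
        boolean[] best = new boolean[nAttrs];
        Arrays.fill(best, true);                    // start from the full representation
        double bestF = fitness.apply(best);
        for (int g = 0; g < gens; g++) {
            boolean[] cand = best.clone();
            cand[rng.nextInt(nAttrs)] ^= true;      // flip one attribute's inclusion bit
            double f = fitness.apply(cand);
            if (f > bestF) { best = cand; bestF = f; }
        }
        return best;                                // optimized representation (alpha-hat)
    }
}
```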


[Diagram: a candidate input specification α feeds [i] inductive learning (parameter estimation from training data Dtrain), producing hypothesis h; [ii] validation measures inferential loss on Dval under evidence specification eI; [B] the Representation Evaluator for Input Specifications returns specification fitness f(α), the inferential loss]

Model-Based Validation [Hsu, Guo, Perry, Stilson, GECCO-2002]


BNJ: Integrated Tool for Bayesian Network Learning and Inference

XML Bayesian Network Learned from Data using K2 in BNJ


• About MLJ
– v1.0a, 10 May 2002: 24000+ lines of Java code, GNU General Public License (GPL)
– http://www.kddresearch.org/Groups/Machine-Learning/MLJ
– Key features [Hsu, Schmidt, Louis, 2002]

• Conformant to MLC++ input-output specification
• Three (3) inductive learning algorithms: ID3, C4.5, discrete Naïve Bayes
• Two (2) wrapper inducers: feature subset selection [Kohavi and John, 1997], CHC [Eshelman, 1990; Guerra-Salcedo and Whitley, 1999]

• Published Research with Applications to Personnel Science
– Recent work

• Multi-agent learning [GH01, GH02a]
• Genetic feature selection wrappers [HSL02, HWRC02, HS02]

– Current work: WEKA compatibility, parallel online continuous arcing
– Planned continuations

• New inducers: instance-based (k-nearest-neighbor), sequential rule covering, feedforward artificial neural network (multi-layer perceptron)

• New wrappers: theory-guided constructive induction, boosting (Arc-x4, AdaBoost.M1, POCA)

• Integration of reinforcement learning (RL) inducers

Machine Learning in Java (MLJ)


Infrastructure for High-Performance Computation in Data Mining

Rapid KDD Development Environment: Operational Overview


National Center for Supercomputing Applications (NCSA) D2K


Visual Programming Interface (Java): Parallel Genetic Algorithms


Time Series Modeling and Prediction: Integration with Information Visualization

New Time Series Visualization System (Java3D)


Demographics-Based Clustering for Prediction (Continuing Research)

Cluster Formation and Segmentation Algorithm (Sketch)

[Diagram: dimensionality-reducing projection (x′), clusters of similar records, Delaunay triangulation, Voronoi (nearest neighbor) diagram (y)]


Data Clustering in Interactive Real-Time Decision Support

15 × 15 Self-Organizing Map (U-Matrix Output)

Cluster Map (Personnel Database)
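For reference, a compact sketch of the SOM training step behind such a map; the grid size, decay schedules, and seeding here are arbitrary illustrative choices, not the configuration actually used:

```java
import java.util.Random;

/** Self-organizing map training sketch: repeatedly pick a record, find the
 *  best-matching unit (BMU) on the grid, and pull the BMU and its neighbors
 *  toward the record. w[r][c] is the prototype vector of grid cell (r, c). */
public class SomSketch {
    static void train(double[][][] w, double[][] data, int iters) {
        Random rng = new Random(0);
        int n = w.length;                                     // n x n grid, e.g., 15 x 15
        for (int t = 0; t < iters; t++) {
            double[] x = data[rng.nextInt(data.length)];
            int br = 0, bc = 0;
            double best = Double.MAX_VALUE;
            for (int r = 0; r < n; r++)                       // 1. find the BMU
                for (int c = 0; c < n; c++) {
                    double d = 0.0;
                    for (int k = 0; k < x.length; k++)
                        d += (x[k] - w[r][c][k]) * (x[k] - w[r][c][k]);
                    if (d < best) { best = d; br = r; bc = c; }
                }
            double alpha = 0.5 * (1.0 - (double) t / iters);  // decaying learning rate
            double sigma = 1.0 + (n / 2.0) * (1.0 - (double) t / iters); // shrinking radius
            for (int r = 0; r < n; r++)                       // 2. update the neighborhood
                for (int c = 0; c < n; c++) {
                    double g = Math.exp(-((r - br) * (r - br) + (c - bc) * (c - bc))
                                        / (2.0 * sigma * sigma));
                    for (int k = 0; k < x.length; k++)
                        w[r][c][k] += alpha * g * (x[k] - w[r][c][k]);
                }
        }
    }
}
```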


• Laboratory for Knowledge Discovery in Databases (KDD)
– Applications: interdisciplinary research programs at K-State, FY 2002

• Decision support, optimization (Hsu, CIS; Chang, IMSE)

• (NSF EPSCoR) Bioinformatics – gene expression modeling (Hsu, CIS; Welch, Agronomy; Roe, Biology; Das, EECE)

• Digital libraries, info retrieval (Hsu, CIS; Zollman, Physics; Math, Art)

• Human-Computer Interaction (HCI) - e.g., simulation-based training

• Curriculum Development
– Real-time intelligent systems (Chang, Hsu, Neilsen, Singh)

– Machine learning and artificial intelligence; info visualization (Hsu)

– Other: bioinformatics, digital libraries, robotics, DBMS

• Research Partnerships
– NCSA: National Computational Science Alliance, National Center for Supercomputing Applications

– Defense (ONR, ARL, DARPA), Industry (Raytheon)

• Publications, More Info: http://www.kddresearch.org

Summary: State of High-Performance KDD at KSU-CIS