MIS 696 Final Presentation Fall 2008 Mary Burns Katherine Carl Jiesi Cheng Soomi Cheong Koren Elder Li Fan Chun-neng Huang Brent Langhals Matthew Pickard Nathan Twyman Shuo Zeng Xinlei Zhao


Page 1:

MIS 696 Final Presentation, Fall 2008

Mary Burns, Katherine Carl, Jiesi Cheng, Soomi Cheong, Koren Elder, Li Fan, Chun-neng Huang, Brent Langhals, Matthew Pickard, Nathan Twyman, Shuo Zeng, Xinlei Zhao

Page 2:

What is MIS?

As a discipline? As a field of research?

Page 3:

MIS: A Conventional Definition

Management

Information (Computer Science)

Systems (Engineering)

Page 4:

The Quest: From the Seven Pillars to the Tree of Decision

• 1998: Seven Pillars
• 1999: A Simple Model and Key Researchers
• 2000: Additional Pillars
• 2001: Another 2D Model, A Timeline of Researchers
• 2002: Researchers, More of the Same
• 2003: A 3D Model, Timeline, Endnote Library
• 2004: A 2D Model, Research Institutions
• 2005: Another Model, Publication Trends
• 2006: Methodological Approach
• 2007: A Normative Approach, Decision Tree
• 2008: An IS approach to MIS?

Page 5:

The Brainstorm

“Discovery consists of seeing what everybody has seen and thinking what nobody has thought.” –Albert Szent-Gyorgyi

Nathan

Page 6:

The Ideagora

Validation of 2007 Decision Tree

Web of Science

Graphical Representation

Clustering

Journal Trends

Page 7:

The Realization

We are a large and intelligent group of people, but can we deliver all of these analyses in a semester?

We need a way to manage a large quantity of data.

Page 8:

Contribution: Database

[Figure: repeated "Data" labels]

Page 9:

Contribution: Database

• Basic article info

• Category

Page 10:

Contribution: Database

Page 11:

Contribution: Database

Page 12:

Contribution: Database

• Web of Knowledge and Google Citations

Page 13:

Our Contribution: A Database

• Article Dimensions
– Rigor vs. Relevance
– Theoretical vs. Applied
– Innovation vs. Review
– Behavioral vs. Technical

Page 14:

Analysis 1: Statistical Analysis of the Corpus

Page 15:

[Figure: Percentage of Articles in Corpus by Year Published. x-axis: year published (1937 to 2008); y-axis: percentage of articles (0.00% to 7.00%).]

Page 16:

[Figure: chart (shown twice) with decades on the x-axis (1930-1939, 1950-1959, 1960-1969, 1970-1979, 1980-1989, 1990-1999, 2000-2009), a y-axis from 0 to 14, and one series per category: Artificial Intelligence, Collaboration, Data Management, Decision Sciences, eCommerce, Economics of Information, HCI, Information Assurance, Knowledge Management, Operations Management, OTHER, Social Informatics, Supply Chain Management, System Analysis & Design, Workflow/Business Process Management.]

Page 17:

Top 10 Categories by % of WK Citations

• OTHER, 16.12%
• Data Management, 14.67%
• System Analysis & Design, 12.14%
• Knowledge Management, 10.77%
• Collaboration, 9.38%
• Economics of Information, 8.96%
• Decision Sciences, 8.84%
• Workflow/Business Process Management, 8.60%
• Artificial Intelligence, 6.30%
• Social Informatics, 4.22%

Page 18:

Top 10 Categories by % of Article Count

• Data Management, 22.67%
• System Analysis & Design, 19.77%
• Collaboration, 9.88%
• OTHER, 9.30%
• Decision Sciences, 8.14%
• Artificial Intelligence, 7.56%
• Economics of Information, 6.40%
• HCI, 5.81%
• Social Informatics, 5.81%
• Knowledge Management, 4.65%

Page 19:

Top 10 Articles by Web of Knowledge Citations

Title | Author | Year | Category | WK Citations | WK Citations per Year
Dynamic capabilities and strategic management | Teece et al | 1997 | OTHER | 1532 | 139
A Relational Model of Data for Large Shared Data Banks | Codd | 1970 | Data Management | 1269 | 33
Organizational Information Requirements, Media Richness and Structural Design | Daft and Lengel | 1986 | Workflow/Business Process Management | 1100 | 50
A Dynamic Theory of Organizational Knowledge Creation | Nonaka | 1994 | Knowledge Management | 1098 | 78
On the Criteria To Be Used in Decomposing Systems into Modules | Parnas | 1972 | System Analysis & Design | 680 | 19
Machine learning in automated text categorization | Sebastiani | 2002 | Artificial Intelligence | 663 | 111
The Lagrangian Relaxation Method for Solving Integer Programming Problems | Fisher | 1981 | Decision Sciences | 658 | 24
Electronic Markets and Electronic Hierarchies | Malone et al | 1987 | Economics of Information | 575 | 27
A Foundation for the Study of Group Decision Support Systems | DeSanctis and Gallupe | 1987 | Collaboration | 543 | 26
Internet paradox: A social technology that reduces social involvement and psychological well-being? | Kraut et al | 1998 | Social Informatics | 509 | 51

Top 10 Articles by Google Citations

Title | Author | Year | Category | Google Citations | Google Citations per Year
The Nature of the Firm | Coase | 1937 | OTHER | 11151 | 157
Dynamic capabilities and strategic management | Teece et al | 1997 | OTHER | 5536 | 503
A Dynamic Theory of Organizational Knowledge Creation | Nonaka | 1994 | Knowledge Management | 5121 | 366
The entity-relationship model toward a unified view of data | Chen | 1976 | Data Management | 4583 | 143
A Relational Model of Data for Large Shared Data Banks | Codd | 1970 | Data Management | 4197 | 110
As We May Think | Bush | 1945 | Knowledge Management | 2972 | 47
Organizational Information Requirements, Media Richness and Structural Design | Daft and Lengel | 1986 | Workflow/Business Process Management | 2853 | 130
A Spiral Model of Software Development and Enhancement | Boehm | 1988 | System Analysis & Design | 2848 | 142
On the Criteria To Be Used in Decomposing Systems into Modules | Parnas | 1972 | System Analysis & Design | 2729 | 76
A Note on Two Problems in Connection with Graphs | Dijkstra | 1959 | OTHER | 2190 | 45

Page 20:

Top 10 Articles by Average Web of Knowledge Citations per Year

Title | Author | Year | Category | WK Citations | WK Citations per Year
Review: Knowledge Management and Knowledge Management Systems: Conceptual Foundations and Research Issues | Alavi and Leidner | 2005 | Knowledge Management | 418 | 139
Dynamic capabilities and strategic management | Teece et al | 1997 | OTHER | 1532 | 139
Machine learning in automated text categorization | Sebastiani | 2002 | Artificial Intelligence | 663 | 111
A Dynamic Theory of Organizational Knowledge Creation | Nonaka | 1994 | Knowledge Management | 1098 | 78
User acceptance of information technology: Toward a unified view | Venkatesh et al | 2003 | HCI | 379 | 76
Internet paradox: A social technology that reduces social involvement and psychological well-being? | Kraut et al | 1998 | Social Informatics | 509 | 51
Organizational Information Requirements, Media Richness and Structural Design | Daft and Lengel | 1986 | Workflow/Business Process Management | 1100 | 50
A Relational Model of Data for Large Shared Data Banks | Codd | 1970 | Data Management | 1269 | 33
Frictionless Commerce? A Comparison of Internet and Conventional Retailers | Brynjolfsson and Smith | 2000 | eCommerce | 263 | 33
Design Science in Information Systems Research | Hevner et al | 2004 | OTHER | 131 | 33

Top 10 Articles by Average Google Citations per Year

Title | Author | Year | Category | Google Citations | Google Citations per Year
Review: Knowledge Management and Knowledge Management Systems: Conceptual Foundations and Research Issues | Alavi and Leidner | 2005 | Knowledge Management | 1593 | 531
Dynamic capabilities and strategic management | Teece et al | 1997 | OTHER | 5536 | 503
A Dynamic Theory of Organizational Knowledge Creation | Nonaka | 1994 | Knowledge Management | 5121 | 366
Machine learning in automated text categorization | Sebastiani | 2002 | Artificial Intelligence | 2056 | 343
User acceptance of information technology: Toward a unified view | Venkatesh et al | 2003 | HCI | 1055 | 211
The Nature of the Firm | R. H. Coase | 1937 | OTHER | 11151 | 157
The entity-relationship model toward a unified view of data | Peter Pin-Shan Chen | 1976 | Data Management | 4583 | 143
A Spiral Model of Software Development and Enhancement | Boehm | 1988 | System Analysis & Design | 2848 | 142
Frictionless Commerce? A Comparison of Internet and Conventional Retailers | Brynjolfsson and Smith | 2000 | eCommerce | 1129 | 141
Organizational Information Requirements, Media Richness and Structural Design | Daft and Lengel | 1986 | Workflow/Business Process Management | 2853 | 130
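The per-year columns in these tables appear consistent with dividing the citation count by the number of years between publication and 2008, rounded to the nearest integer. The slides do not state the formula, so the sketch below is an inference from the table values; the function name and the `as_of_year` default are illustrative assumptions.

```python
def citations_per_year(citations, year_published, as_of_year=2008):
    """Average citations per year since publication, rounded half up.

    Assumption: the divisor is (as_of_year - year_published); this is
    inferred from the table values, not stated in the slides.
    """
    years = as_of_year - year_published
    if years <= 0:
        raise ValueError("article must be published before as_of_year")
    # Integer arithmetic avoids float rounding surprises at exact halves.
    return (2 * citations + years) // (2 * years)


# (author, year, Web of Knowledge citations) taken from the tables above
rows = [("Teece et al", 1997, 1532), ("Codd", 1970, 1269), ("Sebastiani", 2002, 663)]
for author, year, wk_citations in rows:
    print(author, citations_per_year(wk_citations, year))
```

Ranking by this ratio rather than by raw counts is what moves recent articles up the list relative to older, heavily cited ones.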

Page 21:

Attribute-Based Clustering & Analysis of MIS Papers

Page 22:

Purpose and Methodology

• Purpose
– Classify the MIS papers from a different perspective: the general attributes of the papers
– Provide useful information to assist trend analysis and prediction about MIS research
• Methodology
– Clustering: Use the Fuzzy k-Means Clustering Algorithm
– Validation: Use the Partition Index (SC) to determine the best number of clusters
– Cluster Evaluation: Label the papers with cluster numbers
– Analysis: Analyze the clustering results

Page 23:

Attributes of Papers

• 8 Attributes / 4 Attribute Pairs
– Theoretical vs. Applied
– Rigor vs. Relevance
– Review vs. Innovation
– Technical vs. Behavior
• Scoring and Data Processing
– Every attribute of a paper is given a score from 1 to 5
– The score of one attribute is treated as the negated score of the other attribute in the pair (e.g. scoreTheoretical = 3 is equivalent to scoreApplied = -3)

Page 24:

MIS-Paper Space Definition: A 4-dimensional space

[Figure: four axes, X1 to X4, one per attribute pair: Theoretical/Applied, Rigor/Relevance, Review/Innovation, Technical/Behavior]
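One reading of the scoring rule, assuming each axis value is the average of one attribute's score and the negated score of its opposite, is sketched below. The sign convention (first attribute of each pair positive) and the names `PAIRS` and `paper_coordinates` are assumptions for illustration; the slides do not specify them.

```python
# Attribute pairs in the order they are listed on the slides.
PAIRS = [
    ("Theoretical", "Applied"),
    ("Rigor", "Relevance"),
    ("Review", "Innovation"),
    ("Technical", "Behavior"),
]


def paper_coordinates(scores):
    """Map eight 1-5 attribute scores to a point in the 4-D MIS-Paper Space.

    Assumption: the first attribute of each pair counts as positive and
    the second is negated before averaging.
    """
    coords = []
    for first, second in PAIRS:
        coords.append((scores[first] + (-scores[second])) / 2)
    return coords


# Hypothetical scores for one paper.
example = {
    "Theoretical": 4, "Applied": 2,
    "Rigor": 3, "Relevance": 3,
    "Review": 1, "Innovation": 5,
    "Technical": 5, "Behavior": 1,
}
print(paper_coordinates(example))  # [1.0, 0.0, -2.0, 2.0]
```

Under this reading every coordinate falls between -2 and 2, with 0 meaning the two attributes of a pair were scored equally.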

Page 25:

Fuzzy k-Means Clustering

• The average of the scores in each pair is used as the paper's coordinate on the corresponding axis of the 4-dimensional MIS-Paper Space

• The coordinates of all papers are used as the raw data in the clustering procedure

• Because the best number of clusters cannot be decided in advance, the clustering procedure is run several times, with the number of clusters preset to each value from 3 to 15
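The procedure described above can be sketched with a plain-Python fuzzy k-means (also known as fuzzy c-means). This is a minimal illustration, not the tool the authors used: the fuzzifier `m = 2`, the random initialization, the stopping tolerance, and the toy data are all assumptions.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(points, centers, m):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    rows = []
    for p in points:
        d = [max(dist2(p, c), 1e-12) for c in centers]  # guard against zero distance
        rows.append([1.0 / sum((di / dl) ** (1.0 / (m - 1.0)) for dl in d) for di in d])
    return rows


def fuzzy_k_means(points, k, m=2.0, iters=100, tol=1e-9, seed=0):
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # assumed initialization
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(k):
            w = [row[i] ** m for row in u]  # weights u_ij^m
            total = sum(w)
            new_centers.append(
                [sum(wj * p[t] for wj, p in zip(w, points)) / total for t in range(dim)]
            )
        shift = max(dist2(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)


# Toy 4-D data: two well-separated groups (hypothetical, not the corpus).
demo_points = [
    [0.0, 0.0, 0.0, 0.0], [0.1, 0.0, 0.0, 0.0], [0.0, 0.1, 0.0, 0.0],
    [2.0, 2.0, 2.0, 2.0], [2.1, 2.0, 2.0, 2.0], [2.0, 2.1, 2.0, 2.0],
]
demo_centers, demo_u = fuzzy_k_means(demo_points, k=2)
print(demo_centers)
```

To mirror the slides, the same function would be called once for each `k` in `range(3, 16)` on the paper coordinates, keeping every result for validation.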

Page 26:

Validation

• Goal of clustering
– Group papers that are as similar as possible
– Separate different groups as far away from each other as possible
• Choice of validation index
– Partition Index: the ratio of the compactness of each cluster to its separation from the other clusters, summed over all clusters
– The lower the ratio, the better
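The slides do not give the formula. One common definition of the Partition Index (SC) sums, over clusters, the membership-weighted squared distance of points to the cluster center, divided by the cluster's fuzzy cardinality times the summed squared distances from that center to the centers of all clusters. The sketch below assumes that definition and a fuzzifier `m = 2`; the function names are illustrative.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition_index(points, centers, u, m=2.0):
    """Partition Index (SC) under the assumed definition; lower is better.

    u[j][i] is the membership of point j in cluster i.
    """
    total = 0.0
    for i, v in enumerate(centers):
        # Compactness: membership-weighted spread of points around the center.
        compactness = sum((u[j][i] ** m) * dist2(x, v) for j, x in enumerate(points))
        # Fuzzy cardinality: how many points the cluster "owns" in total.
        cardinality = sum(row[i] for row in u)
        # Separation: squared distances from this center to all centers.
        separation = sum(dist2(v, other) for other in centers)
        total += compactness / (cardinality * separation)
    return total


# Tiny crisp example: two 1-D clusters centered at 0.5 and 4.5.
pts = [[0.0], [1.0], [4.0], [5.0]]
ctrs = [[0.5], [4.5]]
memb = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(partition_index(pts, ctrs, memb))  # 0.03125
```

Tight clusters lower the numerator and distant centers raise the denominator, which is why a smaller value indicates a better partition.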

Page 27:

Validation (Cont’d)

[Figure: Validation curve using Partition Index (SC). x-axis: Number of Clusters (2 to 16); y-axis: Partition Index Value (1.4 to 3); marked point at X: 7, Y: 1.47.]

Page 28:

Validation (Cont’d)

• Best number of clusters: 7
• Reasons
– It is the “elbow” point: the improvement in the index after 7 is not as prominent as before 7
– Although 12 has the lowest index value, too many clusters (too few papers in each cluster) would make it harder to generalize the characteristics of each cluster

Page 29:

Cluster Evaluation

• Choose the largest membership value and label the paper with the corresponding cluster number

• Center (v) and number of papers (N) of every cluster

Cluster | v | N
1 | (0.6052, 0.5752, 0.8217, 1.1602) | 28
2 | (1.2773, 0.5368, 0.2232, 1.4327) | 24
3 | (0.2477, 0.0712, 0.3298, -1.3236) | 29
4 | (-0.9300, -0.3595, 0.9106, -1.7341) | 26
5 | (-0.1861, 0.6770, -0.8188, -0.4133) | 28
6 | (1.0888, 0.6974, 0.9997, -1.6909) | 29
7 | (0.1839, 0.9297, -0.8653, 1.1222) | 21
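The labeling step, taking the cluster with the largest membership for each paper, can be sketched as follows. The membership rows here are made-up values for illustration, and numbering clusters from 1 is an assumption chosen to match the slides.

```python
from collections import Counter


def hard_labels(u):
    """Assign each paper the 1-based index of its largest membership value."""
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in u]


# Hypothetical membership rows (one per paper, one column per cluster).
u = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.3, 0.5],
    [0.8, 0.1, 0.1],
]
labels = hard_labels(u)
sizes = Counter(labels)  # number of papers per cluster, the N column
print(labels, dict(sizes))
```

Counting the labels yields the per-cluster paper totals reported alongside each center.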

Page 30:

Characteristics Table

Cluster | Theoretical | Applied | Rigorous | Relevant | Review | Innovation | Technical | Behavior
Cluster1 | --- | Moderate | --- | Moderate | --- | Moderate | --- | Normal
Cluster2 | Normal | --- | --- | Moderate | --- | Moderate | --- | Normal
Cluster3 | --- | Moderate | Slight | Slight | --- | Moderate | Normal | ---
Cluster4 | Moderate | --- | Moderate | --- | --- | Moderate | Extreme | ---
Cluster5 | Moderate | --- | --- | Moderate | Moderate | --- | Moderate | ---
Cluster6 | --- | Normal | --- | Moderate | --- | Moderate | Extreme | ---
Cluster7 | --- | Moderate | --- | Moderate | Moderate | --- | --- | Normal

Page 31:

Data Visualization

Page 32:

Data Visualization (Cont’d)

Page 33:

Data Visualization (Cont’d)

Page 34:

Data Visualization (Cont’d)

Page 35:

Domain-Cluster Paper Distribution

Domain | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 | Sum
Artificial Intelligence | 0 | 1 | 5 | 1 | 2 | 4 | 0 | 13
Collaboration | 4 | 5 | 1 | 1 | 0 | 1 | 5 | 17
Data Management | 1 | 1 | 12 | 10 | 7 | 8 | 0 | 39
Decision Sciences | 2 | 1 | 1 | 4 | 3 | 0 | 3 | 14
eCommerce | 2 | 0 | 1 | 0 | 1 | 0 | 1 | 5
Economics of Information | 4 | 2 | 0 | 0 | 4 | 0 | 1 | 11
HCI | 3 | 2 | 1 | 0 | 2 | 1 | 1 | 10
Information Assurance | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
Knowledge Management | 1 | 5 | 1 | 0 | 0 | 0 | 1 | 8
Operations Management | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 2
Social Informatics | 3 | 1 | 0 | 0 | 2 | 0 | 4 | 10
Supply Chain Management | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 2
System Analysis & Design | 6 | 0 | 4 | 4 | 3 | 13 | 4 | 34
Workflow/Business Process Management | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 3
OTHER | 0 | 5 | 3 | 2 | 4 | 1 | 1 | 16
Sum | 28 | 24 | 29 | 26 | 28 | 29 | 21 | 185
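A distribution table of this shape can be produced by counting (domain, cluster) pairs. The sketch below uses a handful of made-up records rather than the actual corpus; the variable names are illustrative.

```python
from collections import Counter

# Hypothetical (domain, cluster) labels, one tuple per paper.
papers = [
    ("Data Management", 3),
    ("Data Management", 4),
    ("Collaboration", 2),
    ("Data Management", 3),
    ("System Analysis & Design", 6),
]

cell_counts = Counter(papers)                             # one table cell per pair
row_sums = Counter(domain for domain, _ in papers)        # the "Sum" column
col_sums = Counter(cluster for _, cluster in papers)      # the "Sum" row

print(cell_counts[("Data Management", 3)], row_sums["Data Management"], col_sums[3])
```

Because `Counter` returns 0 for missing keys, empty cells such as a domain with no papers in a cluster need no special handling.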

Page 36:

Possible Analysis Results

• By analyzing the paper distribution across domains and clusters, we can generate
– Authors’ research map
– Universities’ research map
– Journals’ preference on paper types
• By analyzing the above results as a time series, we can generate
– Trends and predictions of authors’ and universities’ research
– Journals’ preference

Page 37:

Benefits

• Catch the latest research hotspot in every domain

• Follow changes in the preferences of journals

• Acquire real-time information about changes in universities’ and professors’ roles in the MIS community

• Discover unexplored domains in the MIS area

Page 38:

Discussion & Future Work

• Two difficulties
– Need information from perspectives to reasonably explain the results
– Attribute scores may contain bias, which will affect the performance of the clustering
• Future work
– Select new attributes to evaluate papers
– Examine the effect of score bias and design a better approach
– Replace manual analysis with automatic processes, such as Text Mining and Social Network Analysis

Page 39:

Text Mining

SQL 2005 Data Mining

Page 40:

Data Mining Algorithms

[Table: algorithms (Decision Trees, Naïve Bayes, Clustering, Seq. Clustering, Time Series, Association rules, Neural Network) against tasks (Classification, Regression, Segmentation, Assoc. Analysis, Anomaly Detect., Seq. Analysis, Time series). Check marks indicate whether an algorithm is a first choice or a second choice for a task.]

Page 41:

Naïve Bayesian

• Based on Bayes' theorem with the “naïve” assumption
• The fastest algorithm, and gives reasonable accuracy
• Best used for
– Advanced data exploration (correlation, attribute discrimination, etc.)
– Manual feature selection
• Parallel correlation counting
• Parameters:
– MAXIMUM_INPUT_ATTRIBUTES
– MAXIMUM_OUTPUT_ATTRIBUTES
– MINIMUM_NODE_SCORE
– MAXIMUM_STATES
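To illustrate the idea behind the “naïve” assumption (attributes are treated as independent given the class), here is a minimal categorical Naïve Bayes classifier in plain Python. It is a generic sketch with add-one smoothing, not the SQL Server implementation; the function names and training rows are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    vocab = defaultdict(set)             # attribute index -> values seen
    for row, label in zip(rows, labels):
        for f, value in enumerate(row):
            value_counts[(f, label)][value] += 1
            vocab[f].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return the label with the highest log posterior for one row."""
    class_counts, value_counts, vocab = model
    n = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / n)  # log prior
        for f, value in enumerate(row):
            # Add-one (Laplace) smoothing so unseen values do not zero the product.
            numerator = value_counts[(f, label)][value] + 1
            denominator = count + len(vocab[f])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: (technical/behavioral, review/innovation) -> category.
rows = [
    ("technical", "review"),
    ("technical", "innovation"),
    ("behavioral", "innovation"),
    ("behavioral", "review"),
]
labels = ["Data Management", "Data Management", "Collaboration", "Collaboration"]
model = train(rows, labels)
print(predict(model, ("technical", "innovation")))  # Data Management
```

Because each attribute contributes an independent term to the score, training reduces to counting, which is one reason the method is fast.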

Page 42:

Decision Trees

• Best accuracy for classification, regression, association prediction in many cases.
• Multiple internal algorithms
– Bayesian with K2 prior, Uniform prior
– Entropy-based
– Bayesian Gaussian for regression trees
– Complete/simple-binary splits
• Patent-pending technologies
– Automatic feature-selection
– High cardinality attribute handling
– Continuous attribute handling
• Parallel correlation counting
• Parameters:
– COMPLEXITY_PENALTY
– MAXIMUM_INPUT_ATTRIBUTES
– MAXIMUM_OUTPUT_ATTRIBUTES
– MINIMUM_LEAF_CASES
– FORCE_REGRESSORS
– SCORE_METHOD
– SPLIT_METHOD

Page 43:

Clustering

• Segmentation, profiling
• Multiple internal algorithms
– K-means
– EM
• Automatic feature selection on input attributes, automatic high cardinality attribute handling
• Parameters
– CLUSTER_COUNT
– MAXIMUM_INPUT_ATTRIBUTES
– CLUSTER_METHOD
– MAXIMUM_STATES
– MINIMUM_CLUSTER_CASES
– MODELLING_CARDINALITY
– STOPPING_TOLERANCE

Page 44:

Neural Network

• Classification, segmentation, association prediction.
• Conjugate gradient method
• 0-1 hidden layer
• Early stopping criteria
• Automatic feature selection
• Parameters
– MAXIMUM_INPUT_ATTRIBUTES
– MAXIMUM_OUTPUT_ATTRIBUTES
– MAXIMUM_STATES
– HIDDEN_NODE_RATIO
– HOLDOUT_PERCENTAGE

Page 45:
Page 46:
Page 47:
Page 48:
Page 49:

SQL 2005 Data Mining

Page 50:

SQL 2005 Data Mining

Page 51:

SQL 2005 Data Mining

Page 52:

SQL 2005 Data Mining

Page 53:

MIS Space Data Model

Page 54:

Improve the Corpus

• Make it more Representative
– Consider citations
– Bring up to date
– Broaden Horizons

[Figure: Number of Articles vs. Number of Citations. Annotation: CACM & AI: 400+ articles with 100+ citations]

Page 55:

Improve the Corpus

• The Corpus is the most critical
– Consider citations
– Bring up to date
– Broaden horizons

[Figure labels: Improve the Corpus; Magnify the Human Judgment; Enhance the Analyses; Expand the Database]

Page 56:

Expand the Database

• Citations
• Abstract
• Journal Rankings
• Impact Factors

[Figure labels: Computer Science; Engineering; Management; MIS]

Page 57:

Magnify the Human Judgment

• Article Categorization
– Improve or Replace the Decision Tree
  • Inflexible: articles can have more than one category, but there is no “well, sort of…” allowed in a decision tree
  • Create a rubric for each category and rank each article against each rubric
• Article Dimensions
– Expand the Scale: 7 pt.
– Add structure to the Decision Process

[Figure labels: Knowledge Mgmt; Collaboration; Rigor: What % of the paper is (A), (B), (C); Applied: What % of the paper is (A), (B), (C); Theoretical: What % of the paper is (A), (B), (C)]

Page 58:

Enhance the Analyses

• Linguistic Analysis
• Citation Analysis
– See what disciplines are most influential to MIS
– See what disciplines are most influenced by MIS
– Inbreeding
• Fusion of Categorization Methods: Human Judgment, Journal Analysis, Author Analysis, Citation Analysis, Linguistic Analysis, Others

Page 59:

Our Experience, Concluded

Page 60:

Learn to Love the Question

Choose an appropriate definition of MIS

As a student

As a teacher

As a researcher

As a journal editor

Page 61:

What is MIS?

[Figure labels: Computer Science; Engineering; Management; MIS]

[Figure: Top 10 Categories by % of WK Citations. OTHER, 16.12%; Data Management, 14.67%; System Analysis & Design, 12.14%; Knowledge Management, 10.77%; Collaboration, 9.38%; Economics of Information, 8.96%; Decision Sciences, 8.84%; Workflow/Business Process Management, 8.60%; Artificial Intelligence, 6.30%; Social Informatics, 4.22%]

[Figure: Percentage of Articles in Corpus by Year Published. x-axis: year published (1937 to 2008); y-axis: percentage of articles (0.00% to 7.00%).]

Page 62:

Thank You