Enriching Literature Reviews with Text Mining Tools
Case: Group Support Systems

Ph.D. Johanna Bragge / HSE / Business Technology / ISS

http://www.hse.fi/EN/HKI/B/Johanna_Bragge

The presentation is based on Bragge J., Relander S., Sunikka A. and Mannonen P. (2007) “Enriching Literature Reviews with Computer-Assisted Research Mining. Case: Profiling Group Support Systems Research”, Proceedings of 40th HICSS Conference, Hawaii, USA.


Structure of presentation

• Objectives of our HICSS'07 paper
• What is “research profiling”?
• Profiling research on Group Support Systems
• Dimensions of research profiling
– From undergrad class papers and “one-nighters” to dissertation or other major research projects
– Possibilities of ISI WoS Analysis tool
• Conclusions

Objectives of our HICSS´07 paper

• Extend the notion of a traditional literature review into the domain of research profiling
• Present a practical case study
– Group (Decision) Support Systems research
• Briefly overview the capabilities of VantagePoint
– Text-mining software originally developed at the Georgia Institute of Technology, USA (currently SearchTech Inc.)

”Can we improve on the traditional ’literature review’?”

• Science and technology abstracts are literally at our fingertips in R&D databases

• Search engines enable rapid and effective collection of records relating to one’s research interests

• Analytical software helps elicit useful information from the searches (even from thousands of abstracts) to gain perspective on one’s research context.

• “This enhanced literature review – ‘research profiling’ – should become standard research practice.”

Source: Porter, Kongthon and Lu (2002), “Research Profiling: Improving the Literature Review”, Scientometrics, Vol. 53, No. 3, 351-370. Available in the SpringerLink database.

Augmenting, not replacing!

• Research profiling aims to augment, not replace, the traditional literature review

– helping to fulfill purposes of understanding the structure of the subject, important variables, pertinent methods, and key needs

• These aims can be better served by analyzing the whole, rather than just a few parts of the research milieu.

– students often limit their perspective to specialized slices of the literature

– research streams often lose connection to other research activities

Porter et al. (2002, p. 351).

Comparison of traditional literature reviews and research profiling

Old (Traditional Literature Review) | New (Research Profiling)
Micro focus (paper-by-paper) | Macro focus (patterns in the literature as a body)
Narrow range (~20 references) | Wide range (~20 – 20,000 references)
Tightly restricted to the topic | Encompassing the topic + related areas
Text discussion | Text, numerical, and graphical depiction

Porter et al. (2002, p. 353).

The research profiling process based on Herbert Simon’s Decision Phases

Phase A: Intelligence
(1) Issue Identification
(2) Selection of Information Sources
(3) Search Refinement and Data Retrieval
(4) Data Cleaning

Phase B: Analysis & Design
(5) Basic Analyses
(6) Advanced Analyses

Phase C: Choice
(7) Representation
(8) Interpretation
(9) Utilization

Source: Adapted from Porter and Cunningham (2005).

Selection of databases

Fundamental research >>> Commercial application

- Science Citation Index, ISI
- MEDLINE
- Chem abstracts
- INSPEC (by IEE)
- EI Compendex
- Derwent World Patent Index
- ABI Inform (ProQuest)
- Lexis Nexis

Source: Porter & Cunningham (2005, p. 83)

Case: Profiling Group Support Systems Research

• Research question: Past, present and future of GSS
– Level of maturity / saturation?
– Who? What? When?
– What’s hot? Are there any emerging themes?
– Etc.
• Search words used:
– group support system(s), group decision support system(s), electronic meeting system(s)
• Database used: INSPEC by IEE
– Covers more than 3,850 journals and 2,200 conference proceedings
• Final sample: 2,000 publications from 1982-2005
– The sample was collected in April 2006
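The three search phrases, each with an optional plural, can be written as a single case-insensitive pattern. The following is only a minimal sketch in Python: the names `SEARCH_PHRASES`, `GSS_PATTERN` and `matches_search` and the sample titles are hypothetical, and a real INSPEC search would use the database's own query syntax rather than a regular expression.

```python
import re

# The three search phrases from the slide; the trailing "s?" below
# covers the optional plural written as "(s)".
SEARCH_PHRASES = [
    "group support system",
    "group decision support system",
    "electronic meeting system",
]

# One alternation over all phrases, matched case-insensitively.
GSS_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in SEARCH_PHRASES) + r")s?\b",
    re.IGNORECASE,
)


def matches_search(text: str) -> bool:
    """Return True if the text contains any of the search phrases."""
    return GSS_PATTERN.search(text) is not None


# Made-up titles, for illustration only.
titles = [
    "Evaluating an electronic meeting system in the classroom",
    "Group Support Systems and anonymity",
    "Expert systems in medicine",
]
hits = [t for t in titles if matches_search(t)]
print(hits)
```

Only the first two made-up titles are kept, since the third contains none of the phrases.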

G(D)SS Publications yearly in INSPEC

[Chart: number of publications per year; horizontal axis: year (ticks 1982-2004), vertical axis: 0-180]

Top-12 Institutions of GSS research

Key affiliations | # | %
University of Arizona, AZ, USA | 93 | 27
National Univ. of Singapore, Singapore | 37 | 11
City University of Hong Kong, Hong Kong | 32 | 9
New Jersey Inst. of Technology, NJ, USA | 32 | 9
University of Mississippi, MS, USA | 26 | 8
University of Georgia, GA, USA | 25 | 7
Indiana University, IN, USA | 19 | 6
University of Minnesota, MN, USA | 18 | 5
University of Baltimore, MD, USA | 17 | 5
Delft Univ. of Technology, Netherlands | 16 | 5
University of Calgary, Canada | 16 | 5
Naval Postgraduate School, CA, USA | 16 | 5
Total | 347 | 100
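The percentages in this table are shares of the top-12 total (347), not of the whole sample. As a hedged illustration of that arithmetic, the sketch below recomputes the shares from the counts in the table. The function name `shares` is hypothetical, and because it rounds to one decimal, some values can differ by about a point from the whole-number figures on the slide.

```python
# Publication counts for the twelve institutions, taken from the table above.
counts = {
    "University of Arizona": 93,
    "National Univ. of Singapore": 37,
    "City University of Hong Kong": 32,
    "New Jersey Inst. of Technology": 32,
    "University of Mississippi": 26,
    "University of Georgia": 25,
    "Indiana University": 19,
    "University of Minnesota": 18,
    "University of Baltimore": 17,
    "Delft Univ. of Technology": 16,
    "University of Calgary": 16,
    "Naval Postgraduate School": 16,
}


def shares(counts):
    """Return (name, count, percent of the listed total), largest count first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, n, round(100 * n / total, 1)) for name, n in ranked]


for name, n, pct in shares(counts):
    print(f"{name:<32} {n:>4} {pct:>5.1f} %")
```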

Top-12 Outlets of GSS research

Outlet | # | %
Proceedings of Hawaii International Conf. on System Sciences, IEEE (HICSS) | 317 | 41.8
Decision Support Systems (DSS) | 84 | 11.1
Journal of Management Information Systems (JMIS) | 69 | 9.1
Information and Management (I&M) | 57 | 7.5
European J. of Operational Res. (EJOR) | 43 | 5.7
The International Conference on Systems, Man and Cybernetics, IEEE (SMC) | 34 | 4.5
Proceedings of the Information Resources Management Association International Conference (IRMA) | 31 | 4.1
Journal of Organizational Computing and Electronic Commerce (JOC&EC) | 29 | 3.8
IFIP Working Groups Conference Proc’s | 28 | 3.7
MIS Quarterly (MISQ) | 25 | 3.3
Proceedings of Decision Sciences Institute Annual Meeting (DSI) | 22 | 2.9
IFIP Transactions A: Computer Science and Technology | 20 | 2.6
Total | 759 | 100

Top-7 authors and selected details

Per author: name [# of publ.] and affiliation(s), 5 main outlets, 5 main descriptors (other than GDSS, DSS or groupware).
[Column “Temporal publication activity”: one small chart per author]

Nunamaker, J.F., Jr [85], University of Arizona
– Outlets: Proc. of HICSS [36], JMIS [16], DSS [8], I & M [3], MISQ [3]
– Descriptors: Teleconferencing [15], Human factors [11], Systems analysis [9], Military computing [6], Social aspects of automation [6]

Vogel, D. R. [51], University of Arizona (current affiliation City University of Hong Kong)
– Outlets: Proc. of HICSS [18], JMIS [7], DSS [4], MISQ [3], I & M [2]
– Descriptors: Teleconferencing [12], Human factors [7], Social aspects of automation [7], Computer-aided instruction [3], DP-management [3]

Aiken, M. [42], University of Mississippi
– Outlets: I & M [10], DSS [4], Proc. of Dec. Sci. Inst. [3], J. of Int. Inf. Mgmt [2], SIGOIS Bulletin [2]
– Descriptors: Human factors [11], Teleconferencing [7], User interfaces [7], Expert systems [6], Language-translation [5]

Briggs, R. O. [36], University of Arizona (current affiliation University of Alaska)
– Outlets: Proc. of HICSS [22], JMIS [8], CACM [1], J. of Educ. Tech. Syst. [1], J. of End-User Comp. [1]
– Descriptors: Human factors [9], Social aspects of automation [8], Business data processing [5], Computer-aided instruction [4], Military computing [4]

Dennis, A. R. [36], University of Georgia (current affiliation Indiana University)
– Outlets: Proc. of HICSS [9], JMIS [7], MISQ [7], DSS [2], ISR [2]
– Descriptors: Human factors [10], Teleconferencing [8], Systems analysis [5], Social aspects of automation [4], Idea processors [3]

Wei, K. K. [29], National University of Singapore (current affil. City U. of Hong Kong)
– Outlets: Proc. of HICSS [5], DSS [3], EJOR [3], IEEE T. on SCM [3], I & M [3]
– Descriptors: Human factors [12], User interfaces [6], Social aspects of automation [5], Task analysis [3], Business data processing [2]

Vreede, G. J. [28], Delft University of Technology (current affil. University of Omaha)
– Outlets: Proc. of HICSS [12], JMIS [5], Proc. of ACM SIGCPR [2], Data Base for Adv. in IS [1], IT for Development [1]
– Descriptors: Police data processing [6], Business data processing [5], Teleconferencing [4], Human factors [3], Information Technology [3]

Hiltz, S. R. [26], New Jersey Institute of Technology
– Outlets: Proc. of HICSS [15], JMIS [5], DSS [2], J. of Org. Comp. & EC [2], MISQ [1]
– Descriptors: Teleconferencing [7], Human factors [6], Social aspects of automation [5], Office automation [2], Systems analysis [2]

Key figures of GSS research in 2000-2005 (from 688 out of 2000 publications)

2000-2005 Top-13 Authors (position in the whole sample presented in parentheses)
de Vreede, G. J. [15] (7.); Briggs, R. O. [14] (4.); Nunamaker, J.F. [11] (1.); Dennis, A. R. [8] (4.); Ma, J. [8] (18.); Vogel, D. R. [8] (2.); Huang, W. W. [7] (18.); Kwok, R. C. W. [7] (18.); Reinig, B. A. [7] (18.); Tuominen, M. [7] (27.); Aiken, M. [6] (3.); Hiltz, S. R. [6] (8.); Ito, T. [6] (29.)

2000-2005 Top-13 Countries *)
USA [234]; China [102]; UK [38]; Japan [34]; Australia [24]; Taiwan [24]; Netherlands [21]; Canada [14]; Finland [14]; Brazil [12]; France [12]; Germany [11]; Portugal [10]
*) Country of publication is determined based on the first author

2000-2005 Top-10 Outlets
Proc. of HICSS [82]; DSS [26]; EJOR [22]; Int. Conf. on CSCW in Design [18]; JMIS [17]; Proc. of IRMA [15]; I & M [14]; Int. Conf. on Systems, Man and Cybernetics [12]; Journal of the OR Society [11]; Proc. of World Multiconference on Systemics, Cybernetics and Informatics [10]

2000-2005 Top-14 Descriptors
Group Decision Support Systems [544]; Groupware [126]; Decision making [92]; Internet [79]; Human factors [50]; Business data processing [42]; Fuzzy set theory [39]; Negotiation Support Systems [38]; Multi-agent syst. [35]; Teleconferencing [33]; Social aspects of automation [31]; User-interfaces [31]; Military comp. [26]; Knowledge Management [25]

2000-2005 Top-10 Classifications
C7102-Decision-Support-systems [506]; C6130G-Groupware [499]; C6150N-Distributed-systems-software [100]; C7210N-Information-networks [79]; C7100-Business-and-administration [63]; C6170-Expert-systems-and-other-AI-software-and-techniques [62]; C6170K-Knowledge-engineering-techniques [60]; C7810C-Computer-aided-instruction [42]; C6180-User-interfaces [39]; C0230-Economic,-social-and-political-aspects-of-computing [38]
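The rule used for the country counts (country of publication is determined by the first author) can be sketched as follows. The record layout, the names `records` and `country_of_publication`, and the sample data are all hypothetical; the field structure of the actual INSPEC export is not shown in the presentation.

```python
from collections import Counter

# Hypothetical records: each publication lists its authors in order,
# paired with the country of each author's affiliation.
records = [
    {"title": "Paper A", "authors": [("Author 1", "USA"), ("Author 2", "China")]},
    {"title": "Paper B", "authors": [("Author 3", "China")]},
    {"title": "Paper C", "authors": [("Author 4", "USA"), ("Author 5", "Finland")]},
    {"title": "Paper D", "authors": []},  # no author data: skipped below
]


def country_of_publication(record):
    """Return the country of the first listed author, or None if there is none."""
    authors = record["authors"]
    return authors[0][1] if authors else None


# Tally one country per publication, ignoring records without authors.
country_counts = Counter(
    c for c in map(country_of_publication, records) if c is not None
)
print(country_counts.most_common())
```

Under this rule co-authors from other countries do not contribute to the tally, so in the made-up data Finland is never counted.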

Trends in authors’ keywords (non-technology)

Keyword | 1986-1990 | 1991-1995 | 1996-2000 | 2001-2005
distributed-* | 15 | 65 | 62 | 63
group-decision-making | 10 | 30 | 57 | 58
decision-making | 9 | 28 | 35 | 38
face-to-face* | 9 | 23 | 28 | 27
consensus-* | 6 | 31 | 31 | 16
virtual-* | 0 | 7 | 32 | 35
information-technology | 5 | 25 | 18 | 15
idea-generation* | 7 | 10 | 25 | 13
web-based-* | 0 | 0 | 14 | 33
Internet | 0 | 4 | 25 | 18
anonymity | 5 | 10 | 21 | 8
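A trend table of this kind can be approximated by binning records into five-year periods and counting keyword matches per period. The sketch below assumes that a trailing `*` denotes a prefix match and that each record is counted at most once per keyword pattern. Neither assumption is stated in the presentation, and the function names and sample records are hypothetical.

```python
from collections import Counter

PERIODS = [(1986, 1990), (1991, 1995), (1996, 2000), (2001, 2005)]


def period_of(year):
    """Return the 'start-end' label of the period containing year, or None."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    return None


def matches(pattern, keyword):
    """Prefix match if the pattern ends with '*', exact match otherwise."""
    pattern, keyword = pattern.lower(), keyword.lower()
    if pattern.endswith("*"):
        return keyword.startswith(pattern[:-1])
    return keyword == pattern


def trend(pattern, records):
    """Count, per period, the records with at least one matching keyword."""
    counts = Counter()
    for year, keywords in records:
        label = period_of(year)
        if label and any(matches(pattern, k) for k in keywords):
            counts[label] += 1
    return [counts[f"{s}-{e}"] for s, e in PERIODS]


# Made-up (year, author keywords) records, for illustration only.
records = [
    (1989, ["distributed-systems", "anonymity"]),
    (1993, ["distributed-meetings"]),
    (1998, ["virtual-teams", "distributed-groups"]),
    (2003, ["web-based-gss", "virtual-teams"]),
    (1984, ["distributed-systems"]),  # outside the four periods: ignored
]
print(trend("distributed-*", records))  # [1, 1, 1, 0]
print(trend("virtual-*", records))      # [0, 0, 1, 1]
```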

5-year trends of various terms

[Chart: values per period (1982-1985, 1986-1990, 1991-1995, 1996-2000, 2001-2005); vertical axis 0-300; series: Countries, Cited refer. avg., Journal papers, Conf. papers, Outlets, Classifications, Publications, Descriptors, Authors]

Illustration of VantagePoint’s ”DSS-capabilities”

Dimensions of research profiling

Dimension | Less >>>>>>>>>>>>>>>>>>>>>>>>> More
Data availability | Counts only > Restricted download > Single rich dataset > Multiple datasets
Time & resources | “One-nighter” > Limited > Rich
Tool availability | Search engine > Text-mining software
Text-mining expertise | None > Limited > Extensive
Subject expertise | Novice > Knowledgeable > Multiple experts

Source: Porter et al. (2002, p. 366)

ISI Web of Science’s Analysis tool – suitable for ”one-nighters”

Problems occur because author names may be entered inconsistently in the database!

With VantagePoint, these variants can be cleaned for the purposes of advanced analyses; for simple frequency counts, ISI works well.

More options are available via the “Analyze Results: Analyze” button (in the right-hand column of the main search results page). Records can also be saved to a file for producing graphs.
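To illustrate why inconsistent author entries distort frequency counts, the sketch below merges simple name variants before counting. It is a crude heuristic written for this illustration and is not VantagePoint's cleaning method. The function `normalize_author` and the sample entries are hypothetical, and the approach would wrongly merge different people who share a surname and initials.

```python
import re
from collections import Counter


def normalize_author(name: str) -> str:
    """Reduce 'Surname, Initials[, suffix]' variants to 'surname, initials'."""
    surname, _, rest = name.partition(",")
    # Drop generational suffixes such as "Jr" or "Sr".
    rest = re.sub(r"\b(jr|sr)\b\.?", "", rest, flags=re.IGNORECASE)
    # Keep only the first letter of each given-name token.
    initials = "".join(word[0] for word in re.findall(r"[^\W\d_]+", rest))
    return f"{surname.strip().lower()}, {initials.lower()}"


# Made-up database entries showing typical variants of the same names.
entries = [
    "Nunamaker, J.F., Jr",
    "Nunamaker, J. F.",
    "NUNAMAKER, J.F.",
    "Vogel, D. R.",
    "Vogel, D.R.",
]

raw_counts = Counter(entries)                                # five distinct strings
clean_counts = Counter(normalize_author(e) for e in entries)  # two authors
print(clean_counts.most_common())
```

Counting the raw strings gives five "authors" with one publication each, whereas the cleaned counts give two authors with three and two publications.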

Conclusions

• Increasing availability and amount of information
– Modern search engines developed

• Research profiling uses sophisticated text mining tools for structured science information resources

– i.e. abstracts from ISI WoS, Ebsco, ProQuest, INSPEC etc.

• Emphasis on content - uncovering research gaps and new scientific domains

– Emphasis not on co-citation analysis (SNA tools better for that)

• Does not replace traditional literature reviews!
– Database limitations, e.g. publication delays and non-standardized contents of different databases

• Questions or comments?

More references on the foundations and applications of research profiling

• Porter, A. L., Kongthon, A. and Lu, J.-C. (2002) “Research Profiling: Improving the Literature Review”, Scientometrics, vol. 53, no. 3, pp. 351-370.

• Porter, A. L. and Cunningham, S. W. (2005) Tech Mining: Exploiting New Technologies for Competitive Advantage, Wiley Series in System Engineering and Management, New Jersey: John Wiley & Sons, Inc.

• Bragge, J. and Storgårds, J. (2007) “Profiling Academic Research on Digital Games Using Text Mining Tools”, Proceedings of the Digital Games Research Association’s DIGRA Conference, Tokyo, Japan.

• Bragge, J. and Storgårds, J. (2007) “Utilizing Text-Mining Tools to Enrich Traditional Literature Reviews. Case: Digital Games”, Proceedings of the 30th Information Systems Research Seminar in Scandinavia IRIS, Tampere, Finland.

• Sunikka, A. and Bragge, J. (2008) “What, Who and Where: Insight into Personalization”, forthcoming in the Proceedings of HICSS-41, Hawaii, USA, January 2008.