Automatic Classification of Accounting Literature Nineteenth Annual Strategic and Emerging Technologies Workshop Vasundhara Chakraborty, Victoria Chiu, and Miklos Vasarhelyi San Francisco July 31, 2010


Automatic Classification of Accounting Literature

Nineteenth Annual Strategic and Emerging Technologies Workshop

Vasundhara Chakraborty, Victoria Chiu, and Miklos Vasarhelyi

San Francisco July 31, 2010

OUTLINE


• Introduction and Background
  Motivation and Research Questions

• Literature Review
  Classification of Accounting Research- The Manual Method
  Development of Automatic Classification Method

• Methodology- A Two-Phase Experiment
  Phase I: Keywords
  Phase II: Full Abstract

• Results and Analysis
  Treatment
  Mode of Reasoning
  Accounting Area

• Conclusion and Implication

Introduction (1/5)

• Purpose

This study explores the possibility of developing a methodology to classify academic accounting publications automatically.

• Research Questions

Can we automate the classification process (of accounting literature) by using the keywords from academic journal articles?

Can we automate the classification process (of accounting literature) by using the full abstracts from academic journal articles?

Do results vary depending on which elements we use to automate the literature classification process, and to what extent do they differ?


Introduction (2/5)

• Contribution

Extending the usefulness of automatic text classification methods to academic publications, and

Exploring ways to improve the methodology applied in research that investigates the attributes and development of knowledge in the accounting discipline.


Introduction (3/5)

• Motivation

Literature taxonomization is a critical element for revealing the development and evolution of knowledge in disciplines (Brown et al. 1987, 1989, Vasarhelyi et al. 1988, Brinberg and Shields 1989, Meyer and Rigsby 2001, Heck and Jensen 2007).

Traditionally, the taxonomization process has been manually performed in this research area (Vasarhelyi et al. 1984, Brown et al. 1985, 1989, and Badua 2005).


Introduction (4/5)

• Motivation (cont.)

The rapid growth of collections in online academic databases indicates that there is increasing difficulty for professionals to access information in a timely and efficient way (Nobata 1999).

Gangolly and Wu (2000): the development of methods for automatic indexing and classification of concepts has been necessitated by

  The increase in the size of text databases.
  The high cost of the domain expertise needed to develop classifications.


Introduction (5/5)


• Taxonomy for Classifying Accounting Publications

Accounting Research Directory: The Database of Accounting Literature (ARD) (Brown, Gardner and Vasarhelyi, 1993).

Taxons: Treatment, Accounting Area, Mode of Reasoning, Research Method, Inference Style, Mode of Analysis, School of Thought, Information, Geography, Objective, Applicability, and Foundation Discipline.

Treatment: identifies the major factor (independent variable) or other accounting phenomenon that is associated with or causes the Information taxon (dependent variable).

Accounting Area: identifies the major accounting field to which the article belongs.

Mode of Reasoning: identifies the technique used to formally arrive at the conclusion in the article.


Literature Review (1/4)

I. Accounting Literature Classification- The Manual Method

• The literature examining the attributes and development of accounting research manually classifies publications that represent the core knowledge of the accounting discipline.

• Kinard and Putney 1968, Gonedes and Dopuch 1974, Hofstedt 1975, 1976, Brown et al. 1987, Vasarhelyi et al. 1988, Rigsby 2001.

• Vasarhelyi et al. (1988) examine trends in accounting research between 1963 and 1984.

• Taxonomy: Research Method, Foundation Discipline, School of Thought, and Mode of Reasoning.


Literature Review (2/4)

I. Accounting Literature Classification- The Manual Method (cont.)

• Brown et al. (1989) studied accounting publications in academic journals (AOS, TAR, JAE, and JAR) from 1976 to 1984.

Taxonomy: Accounting Area, Research Method, School of Thought, and Geography.

• Fleming et al. (2000) studied the evolution of research in The Accounting Review (TAR) between 1966 and 1985.

Attributes examined: Research Methods, Financial Accounting subtopics, article length, citations, and author background.


Literature Review (3/4)

II. Development of Automatic Classification Method

• Crouch and Yang (1992) found that an automatic classification method produces useful thesaurus classes for supplementing query terms.

• Chen et al. (1995) automatically generated a thesaurus to evaluate the Worm Community System (WCS) by adopting the algorithmic approach developed by Chen and Lynch (1992), which was applied to generate a concept network.

• Nobata (1999) used statistical and decision tree methods to identify and classify biology terms automatically.

The algorithms applied to automate the classification process still need refinement.


Literature Review (4/4)

II. Development of Automatic Classification Method (cont.)

• Classifying financial accounting concepts automatically (Gangolly 2000).

Term frequency in financial accounting standards was analyzed, and clusters of concepts were derived with an agglomerative nesting algorithm.

• Automatically grouping related accounting concepts (Garnsey 2006).

Semantic parsing techniques and statistical methods were used.
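
The term-frequency clustering idea mentioned above can be sketched in a few lines. This is a minimal illustration, not the cited authors' implementation: it assumes whitespace tokenization, cosine distance between per-document term-frequency vectors, and average linkage as the merge rule. The function names and the toy sentences are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def term_vectors(docs):
    """Map each term to its vector of per-document frequencies."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    return {term: [c[term] for c in counts] for term in vocab}


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def agglomerate(vectors, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[term] for term in vectors]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[x], vectors[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical stand-ins for passages from accounting standards.
docs = [
    "lease lessee lease",
    "lessee lease",
    "pension benefit pension",
    "benefit pension",
]
clusters = agglomerate(term_vectors(docs), n_clusters=2)
```

Terms that occur in the same passages end up with similar frequency vectors and are merged first, which is what groups related concepts together.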


Methodology (1/3)

• Sample Collection

Three hundred and fifty-eight articles published in accounting journals were downloaded.


Methodology (2/3)

Phase I: Using Keywords

Journal articles → Parse out keywords → Database → Term frequency → Word count → Attribute selection → Create document-term matrix → Apply classification algorithms → Validation of results

Phase II: Using Full Abstract

Journal articles → Parse out full abstract → Database → Create phrases → Term frequency → Word count → Attribute selection → Create document-term matrix → Apply classification algorithms → Validation of results
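
The core of this flow (build a document-term matrix, apply a classification algorithm, validate the results) can be illustrated with a small sketch. The slides do not name the classification algorithms used, so a multinomial naive Bayes classifier with add-one smoothing is swapped in here as a stand-in, and leave-one-out accuracy stands in for the validation step. The keyword strings, the class labels `audit` and `tax`, and all function names are hypothetical.

```python
from collections import Counter, defaultdict
import math


def document_term_matrix(docs):
    """Return (vocab, rows), where rows[i][j] counts term j in document i."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for doc in tokenized for term in doc})
    index = {term: j for j, term in enumerate(vocab)}
    rows = []
    for doc in tokenized:
        row = [0] * len(vocab)
        for term in doc:
            row[index[term]] += 1
        rows.append(row)
    return vocab, rows


def train_naive_bayes(rows, labels):
    """Fit multinomial naive Bayes (stand-in classifier) with add-one smoothing."""
    n_terms = len(rows[0])
    class_docs = Counter(labels)
    term_totals = defaultdict(lambda: [0] * n_terms)
    for row, label in zip(rows, labels):
        for j, count in enumerate(row):
            term_totals[label][j] += count
    model = {}
    for label, totals in term_totals.items():
        denom = sum(totals) + n_terms
        log_prior = math.log(class_docs[label] / len(labels))
        log_probs = [math.log((c + 1) / denom) for c in totals]
        model[label] = (log_prior, log_probs)
    return model


def predict(model, row):
    """Return the label with the highest log posterior score for this row."""
    def score(label):
        log_prior, log_probs = model[label]
        return log_prior + sum(c * lp for c, lp in zip(row, log_probs))
    return max(model, key=score)


def leave_one_out_accuracy(rows, labels):
    """Validate by training on all documents but one and testing on the held-out one."""
    hits = 0
    for i in range(len(rows)):
        model = train_naive_bayes(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:])
        hits += predict(model, rows[i]) == labels[i]
    return hits / len(rows)


# Hypothetical keyword strings parsed from four articles, with manual labels.
keywords = [
    "audit fees auditor independence",
    "auditor rotation audit quality",
    "tax avoidance corporate tax",
    "tax planning tax rates",
]
labels = ["audit", "audit", "tax", "tax"]

vocab, rows = document_term_matrix(keywords)
model = train_naive_bayes(rows, labels)
new_row = ["audit quality fees".split().count(term) for term in vocab]
prediction = predict(model, new_row)
accuracy = leave_one_out_accuracy(rows, labels)
```

For Phase II the same functions would apply once each abstract has been reduced to a string of phrases or terms; only the input to `document_term_matrix` changes.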

Methodology (3/3)

Examples of Keywords used for Treatment Taxon Classification

Results and Analysis (1/5)

Analysis on Treatment Taxon

Phase I: Keywords Analysis with all Subclasses

Results and Analysis (2/5)

Analysis on Treatment Taxon

Phase I: Keywords Analysis with Class Modification

Results and Analysis (3/5)

Analysis on Treatment Taxon

Phase II: Full Abstracts Analysis with Class Modification

Results and Analysis (4/5)

Analysis on Accounting Area Taxon

Results and Analysis (5/5)

Analysis on Mode of Reasoning Taxon

Conclusion (1/2)

Comparison of Results

Conclusion (2/2)

• Findings Summary

This study shows that, using semantic parsing and data mining techniques, we can classify academic publications.

The Treatment and Accounting Area taxons can be classified with relatively higher accuracy.

• Limitations

A limited number of articles was used for the experiments.

A more comprehensive database with sufficient representation of articles belonging to different subclasses is needed.

• Future Research

Use of a larger data corpus.

Use of full text.

Use of semantic parsing and data mining methods to discover newly emerging paradigms in the accounting literature.


Thank You!
