43
Combining LSI with other Classifiers to Improve Accuracy of Single-label Text Categorization Ana Cardoso-Cachopo Arlindo Oliveira Instituto Superior T´ ecnico — Technical University of Lisbon / INESC-ID EWLSATEL, March 2007 (IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 1 / 11

Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Combining LSI with other Classifiers to ImproveAccuracy of Single-label Text Categorization

Ana Cardoso-Cachopo Arlindo Oliveira

Instituto Superior Tecnico — Technical University of Lisbon / INESC-ID

EWLSATEL, March 2007

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 1 / 11

Page 2: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Outline

1 Introduction

2 Classification Methods

3 Combinations Between Methods

4 Experimental Setup

5 Experimental Results

6 Conclusions and Future Work

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 2 / 11

Page 3: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Outline

1 Introduction

2 Classification Methods

3 Combinations Between Methods

4 Experimental Setup

5 Experimental Results

6 Conclusions and Future Work

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 2 / 11

Page 4: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Outline

1 Introduction

2 Classification Methods

3 Combinations Between Methods

4 Experimental Setup

5 Experimental Results

6 Conclusions and Future Work

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 2 / 11

Page 5: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Outline

1 Introduction

2 Classification Methods

3 Combinations Between Methods

4 Experimental Setup

5 Experimental Results

6 Conclusions and Future Work

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 2 / 11

Page 6: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Outline

1 Introduction

2 Classification Methods

3 Combinations Between Methods

4 Experimental Setup

5 Experimental Results

6 Conclusions and Future Work

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 2 / 11

Page 7: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Outline

1 Introduction

2 Classification Methods

3 Combinations Between Methods

4 Experimental Setup

5 Experimental Results

6 Conclusions and Future Work

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 2 / 11

Page 8: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Introduction

Text Classification

Single-label

Classification MethodsI VectorI k-NNI SVMI LSI

Goal: improve Accuracy

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 3 / 11

Page 9: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Introduction

Text Classification

Single-label

Classification MethodsI VectorI k-NNI SVMI LSI

Goal: improve Accuracy

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 3 / 11

Page 10: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Introduction

Text Classification

Single-label

Classification MethodsI VectorI k-NNI SVMI LSI

Goal: improve Accuracy

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 3 / 11

Page 11: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Introduction

Text Classification

Single-label

Classification MethodsI VectorI k-NNI SVMI LSI

Goal: improve Accuracy

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 3 / 11

Page 12: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Introduction

Text Classification

Single-label

Classification MethodsI VectorI k-NNI SVMI LSI

Goal: improve Accuracy

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 3 / 11

Page 13: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Introduction

Text Classification

Single-label

Classification MethodsI VectorI k-NNI SVMI LSI

Goal: improve Accuracy

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 3 / 11

Page 14: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Introduction

Text Classification

Single-label

Classification MethodsI VectorI k-NNI SVMI LSI

Goal: improve Accuracy

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 3 / 11

Page 15: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Introduction

Text Classification

Single-label

Classification MethodsI VectorI k-NNI SVMI LSI

Goal: improve Accuracy

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 3 / 11

Page 16: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Classification Methods

p dimensionalterm space

s << pdimensional

concept space

SVM

k-NN

Vector

LSI

Cosine similarity

k-NN + Cosine similarity

Kernel Voting strategy

SVD Cosine similarity

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 4 / 11

Page 17: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Combinations Between Methods

p dimensionalterm space

s << pdimensional

concept space

k-NN-LSI

SVM-LSI

k-NN + Cosine similarity

Kernel + Voting strategy

SVD

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 5 / 11

Page 18: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Setup

Methods (6 already mentioned + Dumb)

DatasetsI Bank´s Data - Bank37I Reuters 21578 - R8, R52I 20 Newsgroups - 20NgI Web Knowledge Base - Web4I Cade - Cade12

Evaluation Measure

Accuracy =#Correctly classified documents

#Total documents

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 6 / 11

Page 19: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Setup

Methods (6 already mentioned + Dumb)

DatasetsI Bank´s Data - Bank37I Reuters 21578 - R8, R52I 20 Newsgroups - 20NgI Web Knowledge Base - Web4I Cade - Cade12

Evaluation Measure

Accuracy =#Correctly classified documents

#Total documents

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 6 / 11

Page 20: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Setup

Methods (6 already mentioned + Dumb)

DatasetsI Bank´s Data - Bank37I Reuters 21578 - R8, R52I 20 Newsgroups - 20NgI Web Knowledge Base - Web4I Cade - Cade12

Evaluation Measure

Accuracy =#Correctly classified documents

#Total documents

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 6 / 11

Page 21: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Setup

Methods (6 already mentioned + Dumb)

DatasetsI Bank´s Data - Bank37I Reuters 21578 - R8, R52I 20 Newsgroups - 20NgI Web Knowledge Base - Web4I Cade - Cade12

Evaluation Measure

Accuracy =#Correctly classified documents

#Total documents

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 6 / 11

Page 22: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Setup

Methods (6 already mentioned + Dumb)

DatasetsI Bank´s Data - Bank37I Reuters 21578 - R8, R52I 20 Newsgroups - 20NgI Web Knowledge Base - Web4I Cade - Cade12

Evaluation Measure

Accuracy =#Correctly classified documents

#Total documents

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 6 / 11

Page 23: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Setup

Methods (6 already mentioned + Dumb)

DatasetsI Bank´s Data - Bank37I Reuters 21578 - R8, R52I 20 Newsgroups - 20NgI Web Knowledge Base - Web4I Cade - Cade12

Evaluation Measure

Accuracy =#Correctly classified documents

#Total documents

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 6 / 11

Page 24: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Setup

Methods (6 already mentioned + Dumb)

DatasetsI Bank´s Data - Bank37I Reuters 21578 - R8, R52I 20 Newsgroups - 20NgI Web Knowledge Base - Web4I Cade - Cade12

Evaluation Measure

Accuracy =#Correctly classified documents

#Total documents

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 6 / 11

Page 25: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Setup

Methods (6 already mentioned + Dumb)

DatasetsI Bank´s Data - Bank37I Reuters 21578 - R8, R52I 20 Newsgroups - 20NgI Web Knowledge Base - Web4I Cade - Cade12

Evaluation Measure

Accuracy =#Correctly classified documents

#Total documents

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 6 / 11

Page 26: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Characteristics of the Datasets

Train Test Total Smallest LargestDocs Docs Docs Class Class

Bank37 928 463 1391 5 346

20Ng 11293 7528 18821 628 999

R8 5485 2189 7674 51 3923

R52 6532 2568 9100 3 3923

Web4 2803 1396 4199 504 1641

Cade12 27322 13661 40983 625 8473

Numbers of documents for the datasets: number of training documents,number of test documents, total number of documents, number ofdocuments in the smallest class, and number of documents in the largestclass.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 7 / 11

Page 27: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Characteristics of the Datasets

Train Test Total Smallest LargestDocs Docs Docs Class Class

Bank37 928 463 1391 5 346

20Ng 11293 7528 18821 628 999

R8 5485 2189 7674 51 3923

R52 6532 2568 9100 3 3923

Web4 2803 1396 4199 504 1641

Cade12 27322 13661 40983 625 8473

Numbers of documents for the datasets: number of training documents,number of test documents, total number of documents, number ofdocuments in the smallest class, and number of documents in the largestclass.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 7 / 11

Page 28: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Characteristics of the Datasets

Train Test Total Smallest LargestDocs Docs Docs Class Class

Bank37 928 463 1391 5 346

20Ng 11293 7528 18821 628 999

R8 5485 2189 7674 51 3923

R52 6532 2568 9100 3 3923

Web4 2803 1396 4199 504 1641

Cade12 27322 13661 40983 625 8473

Numbers of documents for the datasets: number of training documents,number of test documents, total number of documents, number ofdocuments in the smallest class, and number of documents in the largestclass.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 7 / 11

Page 29: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Characteristics of the Datasets

Train Test Total Smallest LargestDocs Docs Docs Class Class

Bank37 928 463 1391 5 346

20Ng 11293 7528 18821 628 999

R8 5485 2189 7674 51 3923

R52 6532 2568 9100 3 3923

Web4 2803 1396 4199 504 1641

Cade12 27322 13661 40983 625 8473

Numbers of documents for the datasets: number of training documents,number of test documents, total number of documents, number ofdocuments in the smallest class, and number of documents in the largestclass.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 7 / 11

Page 30: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

0.0

0.2

0.4

0.6

0.8

1.0Dumb

Vector

k-NN

SVM

LSI

k-NN-LSI

SVM-LSI

Bank37 20Ng R8 R52 Web4 Cade12

Accuracy values for the six datasets using each method.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 8 / 11

Page 31: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

0.0

0.2

0.4

0.6

0.8

1.0Dumb

Vector

k-NN

SVM

LSI

k-NN-LSI

SVM-LSI

Bank37 20Ng R8 R52 Web4 Cade12

Accuracy values for the six datasets using each method.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 8 / 11

Page 32: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

0.0

0.2

0.4

0.6

0.8

1.0Dumb

Vector

k-NN

SVM

LSI

k-NN-LSI

SVM-LSI

Bank37 20Ng R8 R52 Web4 Cade12

Accuracy values for the six datasets using each method.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 8 / 11

Page 33: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

0.0

0.2

0.4

0.6

0.8

1.0Dumb

Vector

k-NN

SVM

LSI

k-NN-LSI

SVM-LSI

Bank37 20Ng R8 R52 Web4 Cade12

Accuracy values for the six datasets using each method.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 8 / 11

Page 34: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

0.0

0.2

0.4

0.6

0.8

1.0Dumb

Vector

k-NN

SVM

LSI

k-NN-LSI

SVM-LSI

Bank37 20Ng R8 R52 Web4 Cade12

Accuracy values for the six datasets using each method.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 8 / 11

Page 35: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

0.0

0.2

0.4

0.6

0.8

1.0Dumb

Vector

k-NN

SVM

LSI

k-NN-LSI

SVM-LSI

Bank37 20Ng R8 R52 Web4 Cade12

Accuracy values for the six datasets using each method.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 8 / 11

Page 36: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

Dataset Dumb Vector k-NN SVM LSIk-NNLSI

SVMLSI

Bank37 0.2505 0.8359 0.8423 0.9071 0.8531 0.8488 0.917920Ng 0.0530 0.7240 0.7593 0.8284 0.7491 0.7557 0.7775R8 0.4947 0.7889 0.8524 0.9698 0.9411 0.9488 0.9680R52 0.4217 0.7687 0.8322 0.9377 0.9093 0.9100 0.9311Web4 0.3897 0.6447 0.7256 0.8582 0.7357 0.7908 0.8897Cade12 0.2083 0.4142 0.5120 0.5284 0.4329 0.4880 0.5465

Average 0.3030 0.6961 0.7540 0.8383 0.7702 0.7904 0.8385

Accuracy values for the six datasets using each method, and averageAccuracy for each method over all the datasets.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 9 / 11

Page 37: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

Dataset Dumb Vector k-NN SVM LSIk-NNLSI

SVMLSI

Bank37 0.2505 0.8359 0.8423 0.9071 0.8531 0.8488 0.917920Ng 0.0530 0.7240 0.7593 0.8284 0.7491 0.7557 0.7775R8 0.4947 0.7889 0.8524 0.9698 0.9411 0.9488 0.9680R52 0.4217 0.7687 0.8322 0.9377 0.9093 0.9100 0.9311Web4 0.3897 0.6447 0.7256 0.8582 0.7357 0.7908 0.8897Cade12 0.2083 0.4142 0.5120 0.5284 0.4329 0.4880 0.5465

Average 0.3030 0.6961 0.7540 0.8383 0.7702 0.7904 0.8385

Accuracy values for the six datasets using each method, and averageAccuracy for each method over all the datasets.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 9 / 11

Page 38: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

Dataset Dumb Vector k-NN SVM LSIk-NNLSI

SVMLSI

Bank37 0.2505 0.8359 0.8423 0.9071 0.8531 0.8488 0.917920Ng 0.0530 0.7240 0.7593 0.8284 0.7491 0.7557 0.7775R8 0.4947 0.7889 0.8524 0.9698 0.9411 0.9488 0.9680R52 0.4217 0.7687 0.8322 0.9377 0.9093 0.9100 0.9311Web4 0.3897 0.6447 0.7256 0.8582 0.7357 0.7908 0.8897Cade12 0.2083 0.4142 0.5120 0.5284 0.4329 0.4880 0.5465

Average 0.3030 0.6961 0.7540 0.8383 0.7702 0.7904 0.8385

Accuracy values for the six datasets using each method, and averageAccuracy for each method over all the datasets.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 9 / 11

Page 39: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

Dataset Dumb Vector k-NN SVM LSIk-NNLSI

SVMLSI

Bank37 0.2505 0.8359 0.8423 0.9071 0.8531 0.8488 0.917920Ng 0.0530 0.7240 0.7593 0.8284 0.7491 0.7557 0.7775R8 0.4947 0.7889 0.8524 0.9698 0.9411 0.9488 0.9680R52 0.4217 0.7687 0.8322 0.9377 0.9093 0.9100 0.9311Web4 0.3897 0.6447 0.7256 0.8582 0.7357 0.7908 0.8897Cade12 0.2083 0.4142 0.5120 0.5284 0.4329 0.4880 0.5465

Average 0.3030 0.6961 0.7540 0.8383 0.7702 0.7904 0.8385

Accuracy values for the six datasets using each method, and averageAccuracy for each method over all the datasets.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 9 / 11

Page 40: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Experimental Results

Dataset Dumb Vector k-NN SVM LSIk-NNLSI

SVMLSI

Bank37 0.2505 0.8359 0.8423 0.9071 0.8531 0.8488 0.917920Ng 0.0530 0.7240 0.7593 0.8284 0.7491 0.7557 0.7775R8 0.4947 0.7889 0.8524 0.9698 0.9411 0.9488 0.9680R52 0.4217 0.7687 0.8322 0.9377 0.9093 0.9100 0.9311Web4 0.3897 0.6447 0.7256 0.8582 0.7357 0.7908 0.8897Cade12 0.2083 0.4142 0.5120 0.5284 0.4329 0.4880 0.5465

Average 0.3030 0.6961 0.7540 0.8383 0.7702 0.7904 0.8385

Accuracy values for the six datasets using each method, and averageAccuracy for each method over all the datasets.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 9 / 11

Page 41: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Conclusions and Future Work

Very good Accuracy for some datasets.

It is worth pursuing this line of research by testing more combinationsbetween the method´s parameters.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 10 / 11

Page 42: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Conclusions and Future Work

Very good Accuracy for some datasets.

It is worth pursuing this line of research by testing more combinationsbetween the method´s parameters.

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 10 / 11

Page 43: Combining LSI with other Classifiers to Improve Accuracy ...web.ist.utl.pt/~acardoso/docs/...presentation.pdf · Combining LSI with other Classifiers to Improve Accuracy of Single-label

Thank You.

Any Questions?

(IST-TULisbon/INESC-ID) Ana Cardoso-Cachopo EWLSATEL, March 2007 11 / 11