Classification of microarray gene expression data using support vector machines (SVM)

A presentation for the CIS 595 Bioinformatics course by Despina Kontos, Spring 2003 – Temple University



Page 1: Classification of microarray gene expression data using support vector machines (SVM)

A presentation for the CIS 595 Bioinformatics course

by Despina Kontos
Spring 2003 – Temple University

Page 2: Overview

• What are microarray gene expression data?

• What are support vector machines?

• How can we use them to exploit these gene expression data?

CLASSIFICATION EXPERIMENTS!

Page 3: Microarrays

• What are they anyway?

Gene expression levels measured in a tissue or cell sample under varying environmental conditions

Page 4: Microarrays

• From a machine learning point of view…

                Genes
Experiment   g-1   g-2   ……   g-n
ex-1
ex-2
…….
ex-m

Tissue classification: assign a label to each experiment (row)

Function classification: assign a label to each gene (column)
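The two views of the same matrix can be sketched in a few lines. This is only an illustration: the numbers, the labels, and the helper name `transpose` are made up and do not come from the slides.

```python
# Toy expression matrix: rows are experiments (samples), columns are genes.
# All values and labels below are invented for illustration.
expression = [
    [2.1, 0.3, 1.7, 0.2],   # ex-1
    [1.9, 0.4, 1.5, 0.1],   # ex-2
    [0.2, 1.8, 0.3, 2.0],   # ex-3
]

def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]

# Tissue classification: one example per experiment, one feature per gene.
tissue_examples = expression
tissue_labels = ["cancer", "cancer", "normal"]    # hypothetical labels

# Function classification: one example per gene, one feature per experiment.
gene_examples = transpose(expression)
gene_labels = ["F1", "F2", "F1", "F2"]            # hypothetical labels

print(len(tissue_examples), "examples with", len(tissue_examples[0]), "features")
print(len(gene_examples), "examples with", len(gene_examples[0]), "features")
```

The only difference between the two tasks is which axis of the matrix supplies the examples and which supplies the features.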

Page 5: Support Vector Machines (SVM)

• Linear classifiers

• Attempt to avoid overfitting by finding the optimal hyperplane that separates the data

HOW?

By maximizing the margin, the distance between the hyperplane and the closest training points. Those closest points are the support vectors.

Introduced by V. Vapnik and co-workers in 1995
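To make "maximizing the margin" concrete, the sketch below computes the margin of two candidate hyperplanes on a toy data set and shows that one of them leaves more room than the other. The points, the labels, and the function name `geometric_margin` are assumptions chosen for illustration. The code does not train an SVM; it only evaluates the quantity an SVM would maximize.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any labelled point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

# Toy 2-D data, invented for this example.
points = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0), (-2.0, -1.0)]
labels = [1, 1, -1, -1]

# Two candidate hyperplanes that both separate the data.
margin_diagonal = geometric_margin((1.0, 1.0), 0.0, points, labels)
margin_vertical = geometric_margin((1.0, 0.0), 0.0, points, labels)

print(margin_diagonal > margin_vertical)
```

Both hyperplanes classify every training point correctly, but the diagonal one keeps the closest points farther away. The points that attain the minimum distance, here (1, 1) and (-1, -1), play the role of support vectors.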

Page 6: Support Vector Machines (SVM)

• And what about datasets that are not linearly separable?

Map the data into a higher-dimensional space and do linear classification there (theorem!)
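A small example of the mapping idea, under assumptions of my own choosing: four 1-D points that no single threshold can separate become linearly separable after the map phi(x) = (x, x^2). The data, the map, and the hyperplane below are illustrative and are not taken from the slides.

```python
def feature_map(x):
    """Map a 1-D input to 2-D: phi(x) = (x, x^2)."""
    return (x, x * x)

# Toy 1-D data: the outer points are positive, the inner points negative.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [1, -1, -1, 1]

def threshold_separates(threshold, sign):
    """Check whether the rule 'predict sign if x > threshold, else -sign' fits all points."""
    return all((sign if x > threshold else -sign) == y for x, y in zip(xs, ys))

# Try a threshold in every gap between the points, in both orientations.
separable_in_1d = any(
    threshold_separates(t, s)
    for t in (-3.0, -1.5, 0.0, 1.5, 3.0)
    for s in (1, -1)
)

# In the mapped space the hyperplane w.z + b = 0 with w = (0, 1), b = -2.5 separates the data.
w, b = (0.0, 1.0), -2.5
predictions = [
    1 if w[0] * z[0] + w[1] * z[1] + b > 0 else -1
    for z in map(feature_map, xs)
]

print(separable_in_1d, predictions == ys)
```

The classifier in the mapped space is still linear. Only the representation of the data has changed.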

Page 7: Support Vector Machines (SVM)

We need ONLY the support vectors for computations!

We can use KERNEL functions to avoid computations in the higher-dimensional space
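Both statements can be checked on a small case. The sketch below uses the degree-2 polynomial kernel K(x, z) = (x . z)^2 as an example kernel (the slides do not say which kernel was used) and verifies that it equals an inner product in an explicit 3-D feature space. The `decision` function then evaluates a classifier from support vectors alone. The support vectors, coefficients, and bias are made-up values, not the result of training.

```python
import math

def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def phi(x):
    """Explicit feature map matching the degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """K(x, z) = (x . z)^2, computed entirely in the original 2-D space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))   # inner product after mapping to 3-D
implicit = poly2_kernel(x, z)    # same value without ever building phi

def decision(x_new, support_vectors, coefficients, bias, kernel):
    """f(x) = sum_i c_i K(sv_i, x) + b, where c_i = alpha_i * y_i, over support vectors only."""
    return sum(c * kernel(sv, x_new) for sv, c in zip(support_vectors, coefficients)) + bias

# Hypothetical support vectors and coefficients, chosen by hand for illustration.
support_vectors = [(1.0, 0.0), (0.0, 1.0)]
coefficients = [0.5, -0.5]
score = decision((2.0, 1.0), support_vectors, coefficients, 0.0, poly2_kernel)

print(abs(explicit - implicit) < 1e-9, score)
```

Training points that are not support vectors have zero coefficients, so they never appear in the sum.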

Some mathematical formulations…
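For reference, the standard hard-margin formulation as found in SVM textbooks is sketched below. It is offered as a reminder of the usual notation, not as a reconstruction of this slide's own equations. The symbols are assumptions: N training examples x_i with labels y_i in {-1, +1}, weights w, bias b, multipliers alpha_i, and kernel K.

```latex
% Primal problem: maximize the margin by minimizing the norm of w
\min_{w,\,b} \; \frac{1}{2}\,\|w\|^{2}
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1, \qquad i = 1, \dots, N

% Dual problem: the data enter only through the kernel K(x_i, x_j)
\max_{\alpha} \; \sum_{i=1}^{N} \alpha_i
 - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\quad \text{subject to} \quad
\alpha_i \ge 0, \qquad \sum_{i=1}^{N} \alpha_i y_i = 0

% Decision function: only support vectors (alpha_i > 0) contribute
f(x) = \operatorname{sign}\!\Big( \sum_{i \,:\, \alpha_i > 0} \alpha_i \, y_i \, K(x_i, x) + b \Big)
```

With the linear kernel K(x_i, x_j) = x_i . x_j, the dual solution gives back w as the sum of alpha_i y_i x_i.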

Page 8: Some experiments…

M.P.S. Brown, W.N. Grundy, D. Lin, N. Cristianini, C.W. Sugnet, T.S. Furey, M. Ares Jr. and D. Haussler, "Knowledge-based analysis of microarray gene expression data by using support vector machines", Proc. Natl. Acad. Sci. USA, 97(1), pp. 262-267, 2000.

Classification of gene function from microarray data using SVM

2,467 genes

79 DNA hybridization experiments

6 gene function families

SVM gave the best classification among the methods compared!

Function classification: each gene (column) gets a function label

                Genes
Experiment   g-1   g-2   ……   g-n
ex-1
ex-2
…….
ex-m
Function     F1    F2    F3   ...
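An SVM is a binary classifier, so several function families are commonly handled by training one classifier per family, with that family as the positive class and everything else as negative. The sketch below only builds those label vectors. The gene annotations are hypothetical, the names "F1" to "F3" mirror the placeholders on the slide, and the slide itself does not state how the six families were handled.

```python
def one_vs_rest_labels(gene_classes, target_class):
    """Return +1 for genes annotated with target_class and -1 for all others."""
    return [1 if c == target_class else -1 for c in gene_classes]

# Hypothetical annotations for five genes.
gene_classes = ["F1", "F2", "F1", "F3", "F2"]

# One binary labelling per function family.
binary_problems = {
    name: one_vs_rest_labels(gene_classes, name)
    for name in sorted(set(gene_classes))
}

for name, labels in binary_problems.items():
    print(name, labels)
```

Each label vector would be paired with the same gene-by-experiment feature matrix to train a separate classifier.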

Page 9: More experiments…

T.S. Furey, N. Cristianini, N. Duffy, D. Bednarski, M. Schummer and D. Haussler, "Support Vector Machine Classification and Validation of Cancer Tissue Samples Using Microarray Expression Data", Bioinformatics, 2000.

Gene expression data on tissue

97,802 DNA clones

31 tissue samples

Sample types: cancer ovarian, normal ovarian, normal non-ovarian

Tissue classification: each experiment (row) gets a label

                Genes
Experiment   g-1   g-2   ……   g-n     Tissue
ex-1                                   Cancer
ex-2                                   Not Cancer
…….                                    ...
…….                                    ...
ex-m                                   Cancer
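With only a few dozen samples, a common way to estimate accuracy is leave-one-out validation: train on all samples but one, test on the held-out sample, and repeat for every sample. The sketch below shows the label collapse from three tissue types to cancer versus not cancer, and the index bookkeeping for the splits. The sample ordering and the function names are assumptions, and the slide does not state which validation scheme the experiment used.

```python
def to_binary(tissue_type):
    """Collapse the three tissue types on the slide into cancer (+1) vs. not cancer (-1)."""
    return 1 if tissue_type == "cancer ovarian" else -1

def leave_one_out_splits(n_samples):
    """Yield (train_indices, test_index) pairs, holding out each sample exactly once."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, held_out

# Hypothetical ordering of four samples; the data set on the slide has 31.
tissue_types = ["cancer ovarian", "normal ovarian", "normal non-ovarian", "cancer ovarian"]
labels = [to_binary(t) for t in tissue_types]
splits = list(leave_one_out_splits(len(labels)))

print(labels)
print(splits[1])
```

Each split would be used to train a classifier on the `train` indices and score it on the single held-out sample.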

Page 10: Conclusions

• Microarray gene expression data are a very useful form of biological information (but expensive to obtain!)

• SVM is a new and very promising classification approach

• A lot of research remains to be done on biological information processing using techniques developed in fields such as machine learning and data mining

Page 11: Additional resources

E. Osuna, R. Freund, and F. Girosi. Support vector machines: Training and applications. A.I. Memo, MIT A.I. Lab, 1996.

N. Cristianini. ICML'01 tutorial, 2001.

http://www.kernel-machines.org/

http://research.microsoft.com/users/jplatt/svm.html

http://www.isis.ecs.soton.ac.uk/resources/svminfo/

Page 12

THANK YOU!