Exhaustive Signature Algorithm Guy Harari

Exhaustive Signature Algorithm

Guy Harari

Outline

• ISA biclustering algorithm
• Bimax biclustering algorithm
• Exhaustive Signature Algorithm
• Results and future work

ISA algorithm

• Was developed by Sven Bergmann in 2003.
• Goal: find genes/conditions having correlated expression.
• Frequently used, compared and improved.
• Good results in real data.

ISA - details

• Input – expression matrix E, initial gene set.
• Compute E_G by normalizing each column.
• For each condition:
  – z-test avg. normalized expression in gene subset against avg. expression in condition.
  – If above a threshold, select the condition.
• Do the same for resulting condition set.
• Repeat until convergence of gene set.
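
The iteration above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the reference ISA implementation: the names `zscore` and `isa`, the thresholds `t_c` and `t_g`, and the row-normalized matrix `E_C` used for the gene step are hypothetical. The z-score of a subset mean is taken as mean times sqrt(n), which holds for independent, standardized values.

```python
import numpy as np

def zscore(m, axis):
    """Standardize m along `axis` to zero mean and unit variance."""
    mu = m.mean(axis=axis, keepdims=True)
    sd = m.std(axis=axis, keepdims=True)
    sd[sd == 0] = 1.0  # constant rows/columns map to 0 instead of dividing by 0
    return (m - mu) / sd

def isa(E, genes, t_c=2.0, t_g=2.0, max_iter=100):
    """E: genes x conditions matrix; genes: boolean mask of the initial gene set."""
    E = np.asarray(E, dtype=float)
    genes = np.asarray(genes, dtype=bool)
    E_G = zscore(E, axis=0)  # each column (condition) normalized over genes
    E_C = zscore(E, axis=1)  # each row (gene) normalized over conditions (assumption)
    conds = np.zeros(E.shape[1], dtype=bool)
    for _ in range(max_iter):
        if not genes.any():
            break  # empty gene set: no bicluster found
        # z-score of the gene subset's mean normalized expression, per condition
        z_c = E_G[genes].mean(axis=0) * np.sqrt(genes.sum())
        conds = z_c > t_c
        if not conds.any():
            genes = np.zeros_like(genes)
            break
        # same test with the roles of genes and conditions swapped
        z_g = E_C[:, conds].mean(axis=1) * np.sqrt(conds.sum())
        new_genes = z_g > t_g
        if np.array_equal(new_genes, genes):
            break  # gene set converged
        genes = new_genes
    return genes, conds
```

Starting from a seed of three genes inside a noise-free 5 x 3 block of high values, this sketch expands the gene set to all five genes and selects the three block conditions.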

ISA - drawbacks

• Initial gene set should be given.
• Few biclusters for specific parameter value.
• Parameter values are hard to optimize.
• Expression values aren’t normally distributed.
• Genes might not be independent.

Exhaustive approach

• Use Bimax algorithm to find seeds.
• For each seed apply ISA with random parameters.
• Drop similar seeds while running.
• Drop similar biclusters from ISA.
• Observation: applying the algorithm separately for positive and negative values improves results.

Bimax algorithm

• Input – expression matrix.
• Binarize matrix (1 value for b% highest and lowest values).
• Goal – find all submatrices which:
  – Contain only 1’s.
  – Are inclusion-maximal.
• Method:
  – Drop areas in matrix with 0’s only.
  – Recursively apply Bimax on other areas.
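
The binarization step can be sketched as below. The slide does not say whether b% is applied to each tail or to both tails together; this sketch assumes the top b% and the bottom b% of all matrix entries each map to 1. The function name `binarize` is hypothetical.

```python
import numpy as np

def binarize(E, b=10.0):
    """Return a 0/1 matrix with 1 for the b% lowest and b% highest entries of E.

    Assumption: the cutoffs are global percentiles over the whole matrix.
    """
    E = np.asarray(E, dtype=float)
    lo, hi = np.percentile(E, [b, 100.0 - b])
    return ((E <= lo) | (E >= hi)).astype(int)

# Example: a 10 x 10 matrix holding the values 0..99.
B = binarize(np.arange(100).reshape(10, 10), b=10.0)
```

With `b=10.0` the example marks the ten smallest and ten largest values, so `B` contains twenty 1's.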

Bimax - illustration

1 0 1 1 0 1 0 0
0 0 0 0 1 0 1 1
1 0 0 1 0 1 0 0
0 0 1 1 1 0 1 1
0 1 0 0 0 0 0 1
0 0 0 1 0 1 0 1
1 1 1 0 1 1 0 1

Bimax - illustration

1 1 1 1 0 0 0 0
0 0 0 0 1 0 1 1
1 0 1 1 0 0 0 0
0 1 1 0 1 0 1 1
0 0 0 0 0 1 0 1
0 0 1 1 0 0 0 1
1 1 0 1 1 1 0 1

Bimax - illustration

1 1 1 1 0 0 0 0
1 0 1 1 0 0 0 0
0 0 0 0 1 0 1 1
0 1 1 0 1 0 1 1
0 0 0 0 0 1 0 1
0 0 1 1 0 0 0 1
1 1 0 1 1 1 0 1

Bimax - illustration

1 1 1 1 0 0 0 0
1 0 1 1 0 0 0 0
0 1 1 0 1 0 1 1
0 0 1 1 0 0 0 1
1 1 0 1 1 1 0 1
0 0 0 0 1 0 1 1
0 0 0 0 0 1 0 1
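
The reordering in this illustration appears to be a divide step driven by the first row: columns where that row has a 1 are moved to the front, and rows are grouped by where their 1's fall. The sketch below reproduces that grouping for the first matrix of the illustration. The names `partition_rows`, `U`, `W`, `V`, `C_U` and `C_V` are hypothetical labels chosen here, and the full recursive algorithm is not shown.

```python
import numpy as np

# First matrix of the illustration (rows and columns are 0-based in the code).
M = np.array([
    [1, 0, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
])

def partition_rows(M, template=0):
    """Group rows by where their 1's fall relative to the template row's 1-columns."""
    in_cu = M[template] == 1
    inside = M[:, in_cu].any(axis=1)    # row has a 1 within the template's columns
    outside = M[:, ~in_cu].any(axis=1)  # row has a 1 in the remaining columns
    U = np.flatnonzero(inside & ~outside)  # 1's only within the template's columns
    W = np.flatnonzero(inside & outside)   # 1's on both sides
    V = np.flatnonzero(~inside)            # no 1's within the template's columns
    return U, W, V, np.flatnonzero(in_cu), np.flatnonzero(~in_cu)

U, W, V, C_U, C_V = partition_rows(M)
```

Listing the rows in the order `U`, `W`, `V` gives original rows 1, 3, then 4, 6, 7, then 2, 5, which is the row order of the last matrix in the illustration. The blocks `M[U, C_V]` and `M[V, C_U]` contain only 0's, which are the areas that can be dropped.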

Bimax - drawbacks

• Information loss due to binarization.
• Binarization parameter is hard to control.
• Runtime depends linearly on no. of biclusters.
• Usually returns millions of biclusters.
• Poor results on real data.

Exhaustive Signature Algorithm

• Apply Bimax on the input expression matrix.
• Keep biclusters that:
  – Do not overlap with other biclusters.
  – Have low p-value w.r.t. a bicluster score.
• Sort resulting biclusters by size.
• Begin with the largest, apply ISA for each one.
• Keep new biclusters that do not overlap with previous ones.
• Stop if more than N biclusters found.
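
The control flow of the last four steps can be sketched as a greedy loop. Everything named here is an assumption for illustration: `esa_select`, the callables `refine` (standing in for one ISA run from a seed) and `overlap`, the cutoff `max_overlap`, and measuring bicluster size as genes times conditions. The sketch stops once `n_max` biclusters have been kept.

```python
from typing import Callable, FrozenSet, List, Tuple

# A bicluster as (gene indices, condition indices).
Bicluster = Tuple[FrozenSet[int], FrozenSet[int]]

def esa_select(
    seeds: List[Bicluster],
    refine: Callable[[Bicluster], Bicluster],
    overlap: Callable[[Bicluster, Bicluster], float],
    max_overlap: float = 0.5,
    n_max: int = 100,
) -> List[Bicluster]:
    """Refine seeds from largest to smallest, keeping results that are new."""
    kept: List[Bicluster] = []
    # Largest seeds first; size is taken as genes x conditions (assumption).
    ordered = sorted(seeds, key=lambda b: len(b[0]) * len(b[1]), reverse=True)
    for seed in ordered:
        candidate = refine(seed)  # e.g. an ISA run started from this seed
        # Keep the candidate only if it does not overlap any earlier result.
        if all(overlap(candidate, k) <= max_overlap for k in kept):
            kept.append(candidate)
        if len(kept) >= n_max:
            break
    return kept

def gene_jaccard(a: Bicluster, b: Bicluster) -> float:
    """Toy overlap measure for the example: Jaccard index of the gene sets."""
    union = a[0] | b[0]
    return len(a[0] & b[0]) / len(union) if union else 0.0

seeds = [
    (frozenset({0, 1}), frozenset({0})),
    (frozenset({0, 1, 2}), frozenset({0, 1})),
    (frozenset({5, 6}), frozenset({3, 4})),
]
result = esa_select(seeds, refine=lambda b: b, overlap=gene_jaccard)
```

In the example the identity function replaces ISA, so the smallest seed is dropped because its genes overlap the largest one too strongly.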

ESA – details

• Overlaps – use Jaccard index, take the larger.
• Score – average abs. Pearson correlation between gene pairs.
• P-value:
  – Randomize input matrix using edge shuffling.
  – Apply ESA on randomized matrix.
  – Keep score distribution of all biclusters found.
  – P-value = right tail of score distribution of resulting biclusters.
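
The three quantities above can be sketched as follows. The interpretation is an assumption in two places: "take the larger" is read as the larger of the gene-set and condition-set Jaccard indices, and the correlation is computed over the bicluster's own conditions. Edge shuffling is not implemented; the null scores are assumed to be given. All function names are hypothetical.

```python
import numpy as np

def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| of two index collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def overlap(b1, b2):
    """Larger of the gene-set and condition-set Jaccard indices (assumption)."""
    return max(jaccard(b1[0], b2[0]), jaccard(b1[1], b2[1]))

def score(E, genes, conds):
    """Average absolute Pearson correlation over all gene pairs of a bicluster.

    Needs at least two genes and two conditions, with non-constant gene profiles.
    """
    sub = np.asarray(E, dtype=float)[np.ix_(sorted(genes), sorted(conds))]
    r = np.corrcoef(sub)                   # gene-by-gene correlation matrix
    upper = np.triu_indices_from(r, k=1)   # each unordered gene pair once
    return float(np.abs(r[upper]).mean())

def empirical_pvalue(s, null_scores):
    """Right-tail fraction of null scores that are at least as large as s."""
    return float((np.asarray(null_scores, dtype=float) >= s).mean())

E = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 2.0, 1.0]])
s = score(E, {0, 1, 2}, {0, 1, 2})
```

In the example, genes 0 and 1 are perfectly correlated and gene 2 is perfectly anti-correlated with both, so every pair contributes an absolute correlation of 1.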

ESA – details

• Observation: anti-correlated genes usually do not pass enrichment tests simultaneously.

• So apply ESA separately on positive and negative expression values.

• Also change ISA:
  – For positive run, test: score > threshold
  – For negative run, test: –score > threshold
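
The signed test can be written as one selection rule with a sign argument. This is a small sketch; the name `select` and the `sign` parameter are hypothetical, and how the matrix itself is split into positive and negative values is not specified on the slide and is not shown.

```python
import numpy as np

def select(z, threshold, sign=+1):
    """Select entries whose signed score exceeds the threshold.

    sign=+1 gives the positive run (score > threshold),
    sign=-1 gives the negative run (-score > threshold).
    """
    z = np.asarray(z, dtype=float)
    return sign * z > threshold

scores = np.array([3.0, -3.0, 0.5])
positive_run = select(scores, 2.0, sign=+1)
negative_run = select(scores, 2.0, sign=-1)
```

The positive run picks only the strongly positive score and the negative run only the strongly negative one, so anti-correlated genes end up in different biclusters.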

ESA - experiments

• Apply the algorithms: SAMBA, Bimax, ISA, ESA and ESANP (negative and positive values separately).

• Datasets:
  – Gasch 2001 (yeast heat shock)
  – Whitfield 2002 (human cell cycle)

• Evaluation: GO, TF and KEGG enrichment tests

Results – Yeast, GO

[Plot: curves for ESANP, SAMBA, ISA, Bimax and ESA; axes: -log(pval) and #Terms]

Results – Yeast, TF

[Plot: curves for ESANP, SAMBA, ISA, Bimax and ESA; axes: -log(pval) and #TFs]

Results – Yeast, KEGG

[Plot: curves for ESANP, SAMBA, ISA, Bimax and ESA; axes: -log(pval) and #PWs]

Results – Human, GO

[Plot: curves for ESANP, SAMBA, ISA, Bimax and ESA; axes: -log(pval) and #Terms]

Results – Human, KEGG

[Plot: curves for ESANP, SAMBA, ISA, Bimax and ESA; axes: -log(pval) and #PWs]

Conclusions

• ESA exploits both Bimax’s power and ISA’s accuracy.
• ESA avoids ISA’s parameter selection.
• ESA avoids ISA’s seed generation.
• ESA reduces #biclusters from Bimax.
• ESA shows good results on real data.

Future work

• Test the algorithm on other datasets.
• Initialize the binarization parameter automatically.
• Evaluate results with other criteria.
• Avoid bias towards large biclusters.