I SUPPORT OPEN ACCESS PubMed Central www.pubmedcentral.nih.gov/ Public Library of Science www.publiclibraryofscience.org Nucleic Acids Research nar.oupjournals.org/


I SUPPORT OPEN ACCESS

PubMed Central: www.pubmedcentral.nih.gov/

Public Library of Science: www.publiclibraryofscience.org

Nucleic Acids Research: nar.oupjournals.org/


Thanks to the authors and reviewers who support NAR

Please introduce yourselves!


rebase.neb.com/rebase/rebase.html


RM Systems

Type     Functional   Genes
I        94           R S M
II       3701         R M
III      10           res mod
IV ( )   3            R R


Type II Subtypes

Enzyme   Recognition sequence
EcoRI    GAATTC
BamHI    GGATCC
HphI     GGTGA
BsrDI    GCAATG
AhdI     GAC(N)5GTC
BcgI     CGA(N)6TGC
HpaII    CCGG

[Gene-organization diagram; labels: R M; C; M1 M2; S; RM; V; M R; R; M R C; S; M R; R1 R2 M1 M2]
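
The recognition sequences on this slide use a compact notation in which (N)5 means five unspecified bases. As a minimal sketch, the notation can be translated into a regular expression and used to locate sites in a DNA string. The function names site_to_regex and find_sites are hypothetical, and the scan covers only the given strand, so an asymmetric site such as GGTGA would also need a search of the reverse complement.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def site_to_regex(site):
    """Translate notation such as 'GAC(N)5GTC' into a regex string."""
    parts = []
    # Each token is either '(X)n' (code X repeated n times) or a single code.
    tokens = re.findall(r"\(([A-Z])\)(\d+)|([A-Z])", site.upper())
    for code, count, single in tokens:
        if single:
            parts.append(IUPAC[single])
        else:
            parts.append(f"{IUPAC[code]}{{{count}}}")
    return "".join(parts)

def find_sites(site, dna):
    """Return 0-based start positions of all matches, overlaps included."""
    # The lookahead makes matches zero-width so overlapping sites are found.
    pattern = re.compile(f"(?=({site_to_regex(site)}))")
    return [m.start() for m in pattern.finditer(dna.upper())]

print(site_to_regex("GAC(N)5GTC"))
print(find_sites("GAATTC", "AAGAATTCTTGAATTC"))
```

The test strings here are toy sequences chosen for illustration, not data from the talk.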


REBASE Entries

[Chart of entries for Types I, II, III and IV. Legend: R, M, S, Predicted. Values shown: 3701, 893, 650, 2652, 94, 830, 65, 714, 82, 753, 10, 199, 12, 232, 3, 354]


Type II Restriction Enzymes and Methylases

Total number of R specificities: 262

Number of sequenced examples: 188

Total number of M specificities: 253

Number of sequenced examples: 193


The Bioinformatics Problems of RM Systems

1. M genes
   Easy to find using motifs

2. S and V genes
   Easy to find using motifs

3. C genes
   Some are easy (C.BamHI, etc.)
   Some are difficult

4. R genes
   Very difficult unless homologs exist
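
To make "easy to find using motifs" concrete, here is a minimal sketch of a motif scan over a protein sequence. The three patterns are simplified versions of motifs commonly cited for DNA methyltransferases (an FxGxG-like SAM-binding motif, the (D/N/S)PP(Y/F/W) motif of amino-methyltransferases, and the PCQ motif of 5-methylcytosine methyltransferases). They are illustrative assumptions, not the profiles used for REBASE, and the function names and the toy peptide are hypothetical.

```python
import re

# Simplified, illustrative patterns (assumptions, not REBASE's own profiles).
MOTIFS = {
    "motif_I_FxGxG": re.compile(r"F.G.G"),
    "motif_IV_amino": re.compile(r"[DNS]PP[YFW]"),
    "motif_IV_m5C": re.compile(r"PCQ"),
}

def scan_motifs(protein):
    """Return {motif name: [0-based start positions]} for motifs that occur."""
    hits = {}
    for name, pattern in MOTIFS.items():
        positions = [m.start() for m in pattern.finditer(protein.upper())]
        if positions:
            hits[name] = positions
    return hits

def looks_like_methylase(protein):
    """Crude flag: a SAM-binding motif plus at least one catalytic motif."""
    hits = scan_motifs(protein)
    has_sam = "motif_I_FxGxG" in hits
    has_catalytic = "motif_IV_amino" in hits or "motif_IV_m5C" in hits
    return has_sam and has_catalytic

toy_peptide = "MKLFAGSGGAAANPPYTT"  # invented sequence for illustration
print(scan_motifs(toy_peptide))
print(looks_like_methylase(toy_peptide))
```

Short patterns like these match by chance in unrelated proteins, which is one reason a real search would rely on profiles or alignments rather than bare regular expressions.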

 

 


Sequenced Restriction Enzyme Genes

Recognition sequence   Family 1
AATT                   1
ACGT                   1 (2)
AGCT                   1
ATAT                   None known
CATG                   2 (68)
CCGG                   1 (1)
CGCG                   2 (2)
CTAG                   3 (1)
GATC                   21 (13)
GCGC                   2
GGCC                   5 (4)
GTAC                   1 (1)
TATA                   None known
TCGA                   6 (1)
TGCA                   1 (4)
TTAA                   1

[Families 2 to 4: row positions of the remaining entries are not preserved. Entries: 1 (1), 1 (1), 1, 3 (7), 1, 2 (2), 1, 1, 1, 2 (5), 1, 1, 1. Family 5: 1]
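
The sixteen recognition sequences in this table are exactly the 4-bp sequences that equal their own reverse complement. A short sketch that enumerates them (the function names are hypothetical):

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]

def palindromic_sites(length=4):
    """All DNA words of the given length equal to their reverse complement."""
    words = ("".join(p) for p in product("ACGT", repeat=length))
    return sorted(w for w in words if w == reverse_complement(w))

sites = palindromic_sites(4)
print(len(sites))
print(sites)
```

For even length n there are 4^(n/2) such words, because the first half of the word determines the second half; for n = 4 that gives 16.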


Analysis of new M gene hits

 

1. Is the overall sequence of the M gene similar to a known M gene?

 

2. Is the variable region (DNA recognition domain) highly similar to a known variable region?

 

3. Are there genes nearby that are similar to known S, V, C, R or other M genes?

 

4. Are the flanking genes similar to known non-R genes?
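
The four questions can be read as a simple triage of each new hit. The sketch below records the answers as booleans and turns them into two provisional labels. The slide does not say how the answers are combined, so the decision rules, the class name MGeneHit and the label wording are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class MGeneHit:
    overall_similar_to_known_m: bool      # question 1
    variable_region_matches_known: bool   # question 2
    rm_like_genes_nearby: bool            # question 3
    flanks_match_non_r_genes: bool        # question 4

def triage(hit):
    """Return (specificity label, context label); rules are assumptions."""
    if not hit.overall_similar_to_known_m:
        return ("weak candidate", "not assessed")
    if hit.variable_region_matches_known:
        specificity = "specificity probably shared with a known methylase"
    else:
        specificity = "possibly a new specificity"
    if hit.rm_like_genes_nearby:
        context = "likely part of an RM system"
    elif hit.flanks_match_non_r_genes:
        context = "possibly a solitary methylase"
    else:
        context = "unassigned neighbours: candidate R gene nearby"
    return (specificity, context)

example = MGeneHit(True, False, False, False)
print(triage(example))
```

The last branch reflects the idea that an unidentified gene next to a methylase is a natural place to look for a restriction enzyme gene with no known homologs.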

  


Problems 

Methylases

 

  1. What cutoff value will distinguish true positives from spurious hits?

 

a) How do we avoid just populating the database with more examples of the same?

 

b) How do we avoid the degeneration of the database by including marginal examples?

 

2. The HemK group of “apparent” methylases
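
The cutoff question in item 1 is a trade-off between missed true positives and admitted spurious hits. A minimal sketch of how that trade-off can be tabulated, assuming each hit carries a similarity score (higher is better) and a curated true/false label. The scores and labels below are invented toy values, not REBASE data, and sweep_cutoffs is a hypothetical helper.

```python
def sweep_cutoffs(scored_hits, cutoffs):
    """For each cutoff, count accepted true and spurious hits.

    scored_hits: list of (score, is_true) pairs; a hit is accepted
    when its score is at or above the cutoff.
    """
    rows = []
    for cutoff in cutoffs:
        accepted = [is_true for score, is_true in scored_hits if score >= cutoff]
        true_positives = sum(accepted)
        false_positives = len(accepted) - true_positives
        rows.append((cutoff, true_positives, false_positives))
    return rows

# Invented toy data: (score, curated as a real methylase?)
toy_hits = [(250, True), (180, True), (95, True), (90, False),
            (60, True), (55, False), (40, False)]

result = sweep_cutoffs(toy_hits, [50, 100, 200])
for cutoff, tp, fp in result:
    print(f"cutoff {cutoff}: {tp} true, {fp} spurious")
```

In this toy set no single cutoff separates the two classes: a strict cutoff loses real but divergent hits, while a permissive one admits marginal examples, which is the dilemma the slide describes.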

 


Problems 

 

Restriction enzymes

  

1. Even “true” matches are often very poor.

 

2. Good matches are “usually”, but not always, real isoschizomers. How do we distinguish?

 

3. Can we identify the “real” candidates in the absence of sequence similarity?

 

 


SHOTGUN SEQUENCING

R M


HindII


HindII


HindVP

» HindVP


HindVP


1. digestion of λ DNA using McaTI expression lysate only
2. digestion using BssHII (NEB) only
3. double digestion using BssHII and McaTI expression lysate

BssHII and McaTI have no significant sequence similarity!
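
The logic of the three-lane comparison can be sketched in silico: if two enzymes recognize the same site, a double digest produces the same fragments as either single digest, whereas an enzyme with a different site adds new cuts. The sketch assumes the comparison is meant to show that McaTI cuts at the BssHII site; the slide does not state McaTI's site. BssHII's site GCGCGC is taken from general knowledge rather than from the slides. The sequence is a toy string (not λ DNA), cuts are placed at the start of each site for simplicity, and the function names are hypothetical.

```python
import re

def cut_positions(dna, site):
    """0-based start positions of an exact site, overlaps included."""
    return [m.start() for m in re.finditer(f"(?={re.escape(site)})", dna)]

def fragments(dna, sites):
    """Fragment lengths of a linear molecule cut at every listed site."""
    cuts = sorted({pos for site in sites for pos in cut_positions(dna, site)})
    bounds = [0] + cuts + [len(dna)]
    # Skip zero-length pieces, e.g. when a site sits at position 0.
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]

BSSHII = "GCGCGC"
SAME_SITE_ENZYME = "GCGCGC"   # assumption: stands in for McaTI
ECORI = "GAATTC"              # contrast: an enzyme with a different site

toy_dna = "AAAA" + "GCGCGC" + "TTGAATTCTT" + "GCGCGC" + "AA"

single = fragments(toy_dna, [BSSHII])
double_same = fragments(toy_dna, [BSSHII, SAME_SITE_ENZYME])
double_other = fragments(toy_dna, [BSSHII, ECORI])
print(single, double_same, double_other)
```

An unchanged pattern in the double digest is therefore evidence that both activities act at the same sites, even when the two proteins share no detectable sequence similarity.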


 

 

Acknowledgements 

Janos Posfai Computer Scientist - Sequence Analysis

 

Tamas Vincze Programmer - Sequence Analysis

 

Yu Zheng Postdoctoral Fellow – in vitro experiments

 

Rick Morgan Staff Scientist – Experimental RE discovery

 

Dana Macelis Programmer - REBASE

 

  

 

 

