Corrections

N-linked glycosylation (GlcNAc):

Look at the Swiss-Prot annotation (in a random ‘glycosylated’ entry)

Query:

annotation:(type:carbohyd "N-linked (GlcNAc...)" confidence:experimental) reviewed:yes

Taxonomic distribution

TPNLINDTME

Multiple alignment (ClustalW)

-[LAPIQ]-N-[HAYRCS]-[ST]-[KLESGM]

N-glycosylation does not occur in Bacteria: these hits are false positives!

301 proteins (within the set of 1000 proteins) are N-glycosylated according to the UniProtKB annotation!

Scan Prosite with the official pattern

The official pattern also matches bacterial sequences (false positives)
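
A minimal sketch of why such false positives appear: a pattern scan is purely syntactic and knows nothing about taxonomy. It assumes the official pattern is PROSITE PS00001, N-{P}-[ST]-{P} (the slide does not spell it out), and reuses the peptide TPNLINDTME from the earlier slide. The function name `find_sequons` is made up for illustration.

```python
import re

# Assumption: the official pattern is PROSITE PS00001, N-{P}-[ST]-{P},
# written here as a regular expression. The lookahead allows overlapping
# sites to be reported.
SEQUON = re.compile(r"(?=N[^P][ST][^P])")

def find_sequons(sequence):
    """Return the 1-based position of every N that starts a pattern match."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Peptide shown on the slide; the second N is followed by D-T-M.
print(find_sequons("TPNLINDTME"))
```

Any sequence containing the motif is reported, whatever organism it comes from, so the taxonomy of each hit still has to be checked separately.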

PRATT pattern with 20 sequences:

D-K-T-G-T-[IL]-T-x(3)-[ILMV]-x-[FILV]

AT31_HUMAN:

SIMILARITY: Belongs to the cation transport ATPase (P-type) family. Type V subfamily.

The pattern is a discriminator for the cation-transporting ATPase family.

C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H

Pattern scan

The pattern missed some Zn fingers in the same protein, e.g. Q24174

Pattern

Profile

Not found with the pattern

The pattern:

C-X(2,4)-C-X(3)-[LIVMFYWC]-X(8)-H-X(3,5)-H

Should include:

YRCVLCGTVAKSRNSLHSHMSrQHRGIST

C-X(2,4)-C-X(3)-[LIVMFYWCA]-X(8)-H-X(3,5)-H
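
The effect of adding A to the residue class can be checked with a short sketch that translates the two patterns into regular expressions and tests them against the finger sequence above. The helper `prosite_to_regex` is hypothetical and covers only the syntax used on these slides (single residues, [..] classes, {..} exclusions, x wildcards and repeat counts). It is not the ScanProsite implementation.

```python
import re

ELEMENT = re.compile(
    r"([A-Za-z]|\[[A-Za-z]+\]|\{[A-Za-z]+\})"   # residue, [class] or {exclusion}
    r"(?:\((\d+)(?:,(\d+))?\))?"                  # optional (n) or (n,m) repeat
)

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        if residue in ("x", "X"):
            token = "."                           # x means any residue
        elif residue.startswith("{"):
            token = "[^" + residue[1:-1].upper() + "]"
        else:
            token = residue.upper()
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)

ORIGINAL = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"
EXTENDED = "C-x(2,4)-C-x(3)-[LIVMFYWCA]-x(8)-H-x(3,5)-H"
FINGER = "YRCVLCGTVAKSRNSLHSHMSrQHRGIST".upper()

for name, pattern in (("original", ORIGINAL), ("extended", EXTENDED)):
    hit = re.search(prosite_to_regex(pattern), FINGER)
    print(name, prosite_to_regex(pattern), hit.group(0) if hit else "no match")
```

The original pattern fails on this finger because the residue three positions after the second C is an A, which is outside [LIVMFYWC]. The extended class accepts it.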

Yes!

But:

The pattern becomes less restrictive. You get more sequences which should not be here. (As the results are limited to 1000, the number of hits is not the same…)

Discriminators (Signatures, descriptors) for the Zinc finger C2H2 type domain can be found in Prosite (Pattern and Profile) and Pfam (HMM)

Step 1: Scan UniProtKB/Swiss-Prot with the pattern

Use the ‘scanprosite’ tool at http://www.expasy.org/tools/scanprosite/

Step 2: Retrieve the matched human entries at UniProt (go to the end of the Scan Prosite result page and click on ‘Matched UniProtKB entries’)

Step 3: Retrieve the sequences annotated as being ‘phosphorylated on a Thr’

-> 19 candidates to be manually checked.

InterPro scan results

InterPro: other schema (graphical view from UniProtKB)

InterPro schema

PFAM Graphical view

Prosite Graphical view

Blast @ NCBI against Swiss-Prot

NCBI: Color key for alignment scores

The NCBI copy of Swiss-Prot does not contain the alternative sequences (e.g. P28175-2)!

NCBI gives the ‘version number’ of the Swiss-Prot sequence (e.g. Q8BU25.2).
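
The two identifier styles on this slide are easy to confuse, so here is a small sketch that separates them. It assumes only the shapes shown above: a hyphen followed by a number marks an alternative (isoform) sequence, and a dot followed by a number marks a sequence version. The function name `split_identifier` is made up for illustration, and the regular expression is deliberately loose rather than a full accession validator.

```python
import re

# Assumed identifier shapes, taken from the two examples on the slide:
#   P28175-2  -> accession plus alternative-sequence (isoform) number
#   Q8BU25.2  -> accession plus sequence version, as displayed by NCBI
IDENTIFIER = re.compile(r"^([A-Z0-9]+)(?:-(\d+))?(?:\.(\d+))?$")

def split_identifier(identifier):
    """Split an identifier into accession, isoform number and version."""
    match = IDENTIFIER.match(identifier.strip())
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    accession, isoform, version = match.groups()
    return {
        "accession": accession,
        "isoform": int(isoform) if isoform else None,
        "version": int(version) if version else None,
    }

for example in ("P28175-2", "Q8BU25.2"):
    print(example, split_identifier(example))
```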

UniProt: Color code for identity scores (not alignment !)

ProDom database

List of proteins sharing at least one common domain…

1) BLAST at www.uniprot.org

2) PROSITE tools

You are lucky: it is rare for a domain not to be annotated in the different domain/family databases!

3) Construct a profile with MyHits at SIB

Use PSI-BLAST

Run PSI-BLAST against UniProtKB

Select sequences with an E-value > 0.001 and run a second cycle

Look at the MSA

Construct a profile with the MSA
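
To give an idea of what a profile contains, here is a minimal sketch that turns an alignment into one column of log-odds scores per position. It is not the method used by MyHits or PROSITE: it assumes a uniform background, a simple pseudocount, no sequence weighting and no gap penalties. The three aligned sequences are invented for the example, and `build_profile` and `score` are hypothetical names.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(alignment, pseudocount=1.0):
    """Return one dict of log-odds scores (residue -> score) per column."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)           # assumed uniform background
    profile = []
    for column in zip(*alignment):
        # Gap characters and unknown residues are simply ignored.
        counts = Counter(res for res in column if res in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile

def score(profile, segment):
    """Sum the column scores for a segment of the same length as the profile."""
    return sum(column[res] for column, res in zip(profile, segment))

# Invented toy alignment, for illustration only.
toy_msa = ["DKTGT", "DKTGS", "DRTGT"]
profile = build_profile(toy_msa)
print(score(profile, "DKTGT"), score(profile, "AAAAA"))
```

Unlike a pattern, a profile gives every residue a score at every position, so a sequence with one unusual residue can still be detected if the other positions score well.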

The profile

The profile hits

Construct an HMM with the MSA

The HMM

The HMM hits

Look at the Goloco data in InterPro. How many proteins (and/or hits) are found by the different methods?

http://www.ebi.ac.uk/interpro/

According to InterPro, the Goloco domain is described by at least one of the different methods (PFAM, Prosite, SMART).

PFAM: 167 proteins
Prosite: 192 proteins
SMART: 1 protein

These different numbers are a consequence of the interval between the releases of the different databases (including the sequence database, UniProtKB). They may also be due to the different methods used (HMM, profile…).

Look for the HMM for the Goloco domain in PFAM

Download the HMM matrix

The HMM matrix