The craft of annotation
Carole Goble
Based on observations of the PRINTS protein fingerprint database
Primary & Secondary databases
Primary: source generated by experimentalists. Role: standards, quality thresholds, dissemination.
• Sequence databases: EMBL, GenBank
• Increasingly other data types: micro-array
Secondary: source derived from repositories, other secondary databases, analysis and expertise. Role: distilled and accumulated specialist knowledge; value-added commentary.
• Swiss-Prot, PRINTS, CATH, PAX6, Enzyme, dbSNP…
Role: warehouses to support analysis over replicated data.
• GIMS, aMAZE, InterPro…
The “Annotation Pipeline”
[Figure: the annotation pipeline. Databases shown: EMBL, TrEMBL, Swiss-Prot, PRINTS, BLOCKS, InterPro and GPCRDB, linked by Analysis steps.]
Annotation Distillation
Expressed Sequence Tags   millions
nrdb                      503,479
TrEMBL                    234,059
Swiss-Prot                85,661
InterPro                  2,990
PRINTS                    1,310
PRINTS
PRINTS: a database of protein family “fingerprints”
Fingerprints: groups of motifs excised from alignments
– used to provide diagnostic signatures for protein families
PRINTS forms the basis of derived resources
– e.g., BLOCKS, eMOTIF, InterPro
Used in gene family analysis, genome annotation, etc.
ID   PRIO_HUMAN     STANDARD;      PRT;   253 AA.
AC   P04156;
DE   MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) (ASCR).
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX   NCBI_TaxID=9606;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=86300093 [NCBI, ExPASy, Israel, Japan]; PubMed=3755672;
RA   Kretzschmar H.A., Stowring L.E., Westaway D., Stubblebine W.H., Prusiner S.B., Dearmond S.J.
RT   "Molecular cloning of a human prion protein cDNA.";
RL   DNA 5:315-324(1986).
RN   [6]
RP   STRUCTURE BY NMR OF 23-231.
RX   MEDLINE=97424376 [NCBI, ExPASy, Israel, Japan]; PubMed=9280298;
RA   Riek R., Hornemann S., Wider G., Glockshuber R., Wuethrich K.;
RT   "NMR characterization of the full-length recombinant murine prion protein, mPrP(23-231).";
RL   FEBS Lett. 413:282-288(1997).
CC   -!- FUNCTION: THE FUNCTION OF PRP IS NOT KNOWN. PRP IS ENCODED IN THE HOST GENOME AND IS
CC       EXPRESSED BOTH IN NORMAL AND INFECTED CELLS.
CC   -!- SUBUNIT: PRP HAS A TENDENCY TO AGGREGATE YIELDING POLYMERS CALLED "RODS".
CC   -!- SUBCELLULAR LOCATION: ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR.
CC   -!- DISEASE: PRP IS FOUND IN HIGH QUANTITY IN THE BRAIN OF HUMANS AND ANIMALS INFECTED WITH
CC       NEURODEGENERATIVE DISEASES KNOWN AS TRANSMISSIBLE SPONGIFORM ENCEPHALOPATHIES OR PRION
CC       DISEASES, LIKE: CREUTZFELDT-JAKOB DISEASE (CJD), GERSTMANN-STRAUSSLER SYNDROME (GSS),
CC       FATAL FAMILIAL INSOMNIA (FFI) AND KURU IN HUMANS; SCRAPIE IN SHEEP AND GOAT; BOVINE
CC       SPONGIFORM ENCEPHALOPATHY (BSE) IN CATTLE; TRANSMISSIBLE MINK ENCEPHALOPATHY (TME);
CC       CHRONIC WASTING DISEASE (CWD) OF MULE DEER AND ELK; FELINE SPONGIFORM ENCEPHALOPATHY
CC       (FSE) IN CATS AND EXOTIC UNGULATE ENCEPHALOPATHY (EUE) IN NYALA AND GREATER KUDU. THE
CC       PRION DISEASES ILLUSTRATE THREE MANIFESTATIONS OF CNS DEGENERATION: (1) INFECTIOUS (2)
CC       SPORADIC AND (3) DOMINANTLY INHERITED FORMS. TME, CWD, BSE, FSE, EUE ARE ALL THOUGHT TO
CC       OCCUR AFTER CONSUMPTION OF PRION-INFECTED FOODSTUFFS.
CC   -!- SIMILARITY: BELONGS TO THE PRION FAMILY.
DR   HSSP; P04925; 1AG2. [HSSP ENTRY / SWISS-3DIMAGE / PDB]
DR   MIM; 176640; -. [NCBI / EBI]
DR   InterPro; IPR000817; -.
DR   Pfam; PF00377; prion; 1.
DR   PRINTS; PR00341; PRION.
KW   Prion; Brain; Glycoprotein; GPI-anchor; Repeat; Signal; Polymorphism; Disease mutation.
Swiss-Prot annotation
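The Swiss-Prot entry above is line-oriented: each line starts with a two-letter code (ID, AC, CC, DR, KW and so on), and comment topics inside CC lines are introduced by a `-!-` marker. As a minimal sketch of how such an entry might be split up before any annotation is gathered, the following groups lines by code and then splits the CC block into topics. The function names are hypothetical and the sample is a shortened copy of the PRIO_HUMAN entry; this is an illustration, not the tooling used by PRINTS.

```python
from collections import defaultdict


def group_by_line_code(entry_text):
    """Group the lines of a flat-file entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in entry_text.splitlines():
        if len(line) < 2:
            continue  # skip blank lines
        code, content = line[:2], line[2:].strip()
        fields[code].append(content)
    return dict(fields)


def comment_topics(fields):
    """Split CC lines into {topic: text}; a '-!-' marker starts a new topic."""
    topics = {}
    current = None
    for content in fields.get("CC", []):
        if content.startswith("-!-"):
            topic, _, text = content[3:].strip().partition(":")
            current = topic.strip()
            topics[current] = text.strip()
        elif current is not None:
            # Continuation line: append to the topic currently being read.
            topics[current] += " " + content
    return topics


SAMPLE = """\
ID   PRIO_HUMAN     STANDARD;      PRT;   253 AA.
AC   P04156;
CC   -!- FUNCTION: THE FUNCTION OF PRP IS NOT KNOWN. PRP IS ENCODED IN THE HOST GENOME AND IS
CC       EXPRESSED BOTH IN NORMAL AND INFECTED CELLS.
CC   -!- SIMILARITY: BELONGS TO THE PRION FAMILY.
DR   PRINTS; PR00341; PRION.
"""

fields = group_by_line_code(SAMPLE)
topics = comment_topics(fields)
```

Here `topics` maps each comment topic (FUNCTION, SIMILARITY) to its text with continuation lines joined.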
gc;
gx;
gn;
ga;
gt;
gp;      [callout: manual annotation]
bb;
gr;
bb;
gd;
bb;
si; SUMMARY INFORMATION
si; -------------------
sd; 37 codes involving 8 elements
sd; 0 codes involving 7 elements
sd; 0 codes involving 6 elements
sd; 0 codes involving 5 elements
sd; 0 codes involving 4 elements
sd; 1 codes involving 3 elements
sd; 0 codes involving 2 elements
bb;
ci; COMPOSITE FINGERPRINT INDEX
ci; ---------------------------
cr;
cd;  8| 37 37 37 37 37 37 37 37
cd;  7|  0  0  0  0  0  0  0  0
cd;  6|  0  0  0  0  0  0  0  0
cd;  5|  0  0  0  0  0  0  0  0
cd;  4|  0  0  0  0  0  0  0  0
cd;  3|  1  0  0  0  1  1  0  0
cd;  2|  0  0  0  0  0  0  0  0
cd; --+-------------------------
cd;   |  1  2  3  4  5  6  7  8
bb;
tp; PRIO_COLGU PRIO_MACFA PRIO_CEREL PRIO_ODOHE
KA; P40251 M1 P40254 M1 P79142 M1 P47852 M1
tp; PRIO_GORGO PRIO_PANTR PRIO_HUMAN O46648      [callout: SWISS-PROT IDs]
KA; P40252 M1 P40253 M1 P04156 M1 O46648 M1
tp; PRIO_SHEEP PRIO_CALJA PRIO_BOVIN PRP2_BOVIN
KA; P23907 M1 P40247 M1 P10279 M1 Q01880 M1
bb;
tt; PRIO_COLGU MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) - COLOBUS GUEREZA.
tt; PRIO_MACFA MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) - MACACA FASCICULARIS (CRAB EATING MACAQUE)
tt; PRIO_CEREL MAJOR PRION PROTEIN PRECURSOR (PRP) - CERVUS ELAPHUS (RED DEER).
tt; PRIO_ODOHE MAJOR PRION PROTEIN PRECURSOR (PRP) - ODOCOILEUS HEMIONUS (MULE DEER) (BLACK-TAILED DEER).
tt; PRIO_GORGO MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) - GORILLA GORILLA GORILLA (LOWLAND GORILLA)
tt; PRIO_PANTR MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) - PAN TROGLODYTES (CHIMPANZEE)
tt; PRIO_HUMAN MAJOR PRION PROTEIN PRECURSOR (PRP) (PRP27-30) (PRP33-35C) (ASCR) - HOMO SAPIENS (HUMAN).
Nude PRINTS entry
Low Level Annotation
Prion protein signature
PROSITE; PS00291 PRION_1; PS00706 PRION_2
BLOCKS; BL00291
PFAM; PF00377 prion
INTERPRO; IPR000817
1. STAHL, N. AND PRUSINER, S.B.
Prions and prion proteins.
FASEB J. 5 2799-2807 (1991).
Annotation: “High-level”
Semi-structured text-based annotation, representing the accumulated knowledge of the biological community about the data entry.
Intellectually formed: the accumulated knowledge of an expert distilling the aggregated information drawn from multiple data sources and analyses, and the annotator's own knowledge.
Culled from other sources, such as other database entries' annotations and the literature.
Intended to be human readable rather than machine processable.
gc; PRION
gx; PR00341
gt; Prion protein signature
gp; INTERPRO; IPR000817
gp; PROSITE; PS00291 PRION_1; PS00706 PRION_2
gp; BLOCKS; BL00291
gp; PFAM; PF00377 prion
bb;
gr; 1. STAHL, N. AND PRUSINER, S.B.
gr; Prions and prion proteins.
gr; FASEB J. 5 2799-2807 (1991).
gr;
gr; 2. BRUNORI, M., CHIARA SILVESTRINI, M. AND POCCHIARI, M.
gr; The scrapie agent and the prion hypothesis.
gr; TRENDS BIOCHEM.SCI. 13 309-313 (1988).
gr;
gr; 3. PRUSINER, S.B.
gr; Scrapie prions.
gr; ANNU.REV.MICROBIOL. 43 345-374 (1989).
bb;
gd; Prion protein (PrP) is a small glycoprotein found in high quantity in the brain of animals infected with
gd; certain degenerative neurological diseases, such as sheep scrapie and bovine spongiform encephalopathy (BSE),
gd; and the human dementias Creutzfeldt-Jacob disease (CJD) and Gerstmann-Straussler syndrome (GSS). PrP is
gd; encoded in the host genome and is expressed both in normal and infected cells. During infection, however, the
gd; PrP molecules become altered and polymerise, yielding fibrils of modified PrP protein.
gd;
gd; PrP molecules have been found on the outer surface of plasma membranes of nerve cells, to which they are
gd; anchored through a covalent-linked glycolipid, suggesting a role as a membrane receptor. PrP is also
gd; expressed in other tissues, indicating that it may have different functions depending on its location.
gd;
gd; The primary sequences of PrP's from different sources are highly similar: all bear an N-terminal domain
gd; containing multiple tandem repeats of a Pro/Gly rich octapeptide; sites of Asn-linked glycosylation; an
gd; essential disulphide bond; and 3 hydrophobic segments. These sequences show some similarity to a chicken
gd; glycoprotein, thought to be an acetylcholine receptor-inducing activity (ARIA) molecule. It has been
gd; suggested that changes in the octapeptide repeat region may indicate a predisposition to disease, but it is
gd; not known for certain whether the repeat can meaningfully be used as a fingerprint to indicate susceptibility.
gd;
gd; PRION is an 8-element fingerprint that provides a signature for the prion proteins. The fingerprint was
gd; derived from an initial alignment of 5 sequences: the motifs were drawn from conserved regions spanning
gd; virtually the full alignment length, including the 3 hydrophobic domains and the octapeptide repeats
gd; (WGQPHGGG). Two iterations on OWL18.0 were required to reach convergence, at which point a true set comprising
gd; 9 sequences was identified. Several partial matches were also found: these include a fragment (PRIO_RAT)
gd; lacking part of the sequence bearing the first motif, and the PrP homologue found in chicken - this matches
gd; well with only 2 of the 3 hydrophobic motifs (1 and 5) and one of the other conserved regions (6), but has an
gd; N-terminal signature based on a sextapeptide repeat (YPHNPG) rather than the characteristic PrP octapeptide.
PRINTS Annotation (manual)
High level annotation
Prion protein (PrP) is a small glycoprotein found in high quantity in the brain of animals infected with certain degenerative neurological diseases, such as sheep scrapie and bovine spongiform encephalopathy (BSE), and the human dementias Creutzfeldt-Jacob disease (CJD) and Gerstmann-Straussler syndrome (GSS).
PRINTS Annotation Process
[Figure: the PRINTS annotation process. Stages shown: Fingerprint Process; Blank Annotation; Annotation gathering; Editorial culling; Tag Decoration; Filled Annotation. Sources shown: SWISS-PROT, MEDLINE, OMIM, GRAP, PRINTS. Knowledge shown: heuristics, mapping rules.]
For all matches to a fingerprint, the full SWISS-PROT entry is retrieved:
tp; PRIO_COLGU PRIO_MACFA PRIO_CEREL PRIO_ODOHE
tp; PRIO_GORGO PRIO_PANTR PRIO_HUMAN O46648
tp; PRIO_SHEEP PRIO_CALJA PRIO_BOVIN PRP2_BOVIN

ID analysis determines whether the entry is a super-family, family or domain. This is essential, as it influences how the annotation is processed:
tp; URIC_RAT URIC_MOUSE URIC_RABIT URIC_PAPHA
tp; URIC_PIG URIC_DROPS URIC_DROME URIC_DROVI
tp; URIC_SOYBN URIC_EMENI URIC_ASPFL URID_CANLI

tp; MUP5_MOUSE LACB_BOVIN LACB_BUBAR LACB_CAPHI
tp; MUP_RAT RET1_ONCMY RET2_ONCMY PURP_CHICK
tp; RETB_HUMAN ICYA_MANSE ICYB_MANSE CRA2_HOMGA

tp; UROT_HUMAN PLMN_PIG PLMN_HUMAN PLMN_BOVIN
tp; APOA_HUMAN UROK_HUMAN APOA_MACMU UROK_PIG
tp; THRB_BOVIN HGFL_MOUSE THRB_HUMAN HGFL_HUMAN
PRINTS Annotation Process
ID analysis usually reveals families unambiguously; the comment field helps to resolve super-families from domains:
CC -!- SIMILARITY: BELONGS TO THE PRION FAMILY
CC -!- SIMILARITY: BELONGS TO THE URICASE FAMILY
CC -!- SIMILARITY: BELONGS TO THE LIPOCALIN FAMILY
CC -!- SIMILARITY: CONTAINS 38 KRINGLE REGIONS
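The two-step decision described here (ID analysis first, SIMILARITY comments second) can be sketched roughly as follows. This is an illustrative guess at the logic, not the actual PRINTS rules: the function names, the 0.8 share threshold for calling a family, and the use of the words "BELONGS TO" and "CONTAINS" as cues are all assumptions made for the example.

```python
from collections import Counter


def id_prefixes(ids):
    """Return the part before '_' for IDs written as NAME_SPECIES."""
    return [i.split("_")[0] for i in ids if "_" in i]


def classify_entry(ids, similarity_comments, family_share=0.8):
    """Guess whether a fingerprint describes a family, superfamily or domain.

    family_share is an assumed threshold: if that fraction of the matched
    IDs share one name prefix, the ID analysis alone says 'family'.
    """
    prefixes = Counter(id_prefixes(ids))
    if prefixes:
        _, top_count = prefixes.most_common(1)[0]
        if top_count / len(ids) >= family_share:
            return "family"
    # Mixed IDs: fall back on the SIMILARITY comments.
    upper = [c.upper() for c in similarity_comments]
    contains = sum("CONTAINS" in c for c in upper)
    belongs = sum("BELONGS TO" in c for c in upper)
    if contains > belongs:
        return "domain"
    if belongs:
        return "superfamily"
    return "unknown"


PRION_IDS = ("PRIO_COLGU PRIO_MACFA PRIO_CEREL PRIO_ODOHE PRIO_GORGO PRIO_PANTR "
             "PRIO_HUMAN O46648 PRIO_SHEEP PRIO_CALJA PRIO_BOVIN PRP2_BOVIN").split()
LIPOCALIN_IDS = ("MUP5_MOUSE LACB_BOVIN LACB_BUBAR LACB_CAPHI MUP_RAT RET1_ONCMY "
                 "RET2_ONCMY PURP_CHICK RETB_HUMAN ICYA_MANSE ICYB_MANSE CRA2_HOMGA").split()
KRINGLE_IDS = ("UROT_HUMAN PLMN_PIG PLMN_HUMAN PLMN_BOVIN APOA_HUMAN UROK_HUMAN "
               "APOA_MACMU UROK_PIG THRB_BOVIN HGFL_MOUSE THRB_HUMAN HGFL_HUMAN").split()

prion_type = classify_entry(PRION_IDS, ["BELONGS TO THE PRION FAMILY"])
lipocalin_type = classify_entry(LIPOCALIN_IDS, ["BELONGS TO THE LIPOCALIN FAMILY"])
kringle_type = classify_entry(KRINGLE_IDS, ["CONTAINS 38 KRINGLE REGIONS"])
```

With the ID lists from the slides, the prion set is settled by its IDs alone, while the lipocalin and kringle sets have mixed IDs and are decided by their SIMILARITY comments.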
Once the entry type is established, an appropriate precis is constructed. Shared annotation is engineered to provide a report detailing:
• the function & structure of the protein
• the disease(s) with which it is associated
• the family to which it belongs
• a set of literature references
• a list of keywords
• any other remarks
The precis is then fed into a naked pre-PRINTS file. Output is English.
PRINTS Annotation Process
Swiss-Prot tag | Heuristics | PRINTS tag
Description | Copy | gt (title)
RAuthor, RTitle, RLocation | Common + filters: top four, date priority; mixed paper subject portfolio | gr (reference)
Database cross-reference fields | Common + filters: preferred links; preferred order | gp (other databases)
KeyWords | Up to a threshold of common keywords | gd (general annotation)
Comment field: Function | Majority vote | function
Comment field: Subcellular location | Majority vote | subcellular location
Comment field: Disease | Golden vote; sequence provenance | disease
Comment field: Similarity tag | Cluster on SWISS-PROT codes; majority vote for families; even distribution for superfamilies and domains | family
Comment field: Subunit | An indication of structure | subunit (structure)
RP Structure | Paper type classification: 1 crystallographic, 1 NMR | structure
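Two of the heuristics in this mapping, the majority vote used for function and subcellular location and the threshold on common keywords, can be sketched as below. The function names and the 0.5 share threshold are assumptions for illustration; the slides do not state the real threshold or how ties are broken.

```python
from collections import Counter


def majority_vote(statements):
    """Return the statement made by the most entries (case-insensitive).

    Ties go to the statement seen first. Returns None for no input.
    """
    counts = Counter(s.strip().lower() for s in statements if s.strip())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def common_keywords(keyword_lists, min_share=0.5):
    """Keep keywords used by at least min_share of the entries.

    min_share is an assumed threshold, not a documented PRINTS value.
    """
    n = len(keyword_lists)
    counts = Counter(k for kws in keyword_lists for k in set(kws))
    return [k for k, c in counts.most_common() if c / n >= min_share]


functions = [
    "THE FUNCTION OF PRP IS NOT KNOWN.",
    "The function of PrP is not known.",
    "May act as a membrane receptor.",  # invented minority statement, for the example only
]
voted_function = majority_vote(functions)

keywords = common_keywords([
    ["Prion", "Brain", "Repeat"],
    ["Prion", "Brain"],
    ["Prion", "Signal"],
])
```

Lowercasing before counting means that statements differing only in case are treated as the same vote.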
Swiss-Prot Redundancy
OPSD_SHEEP  DR PRINTS; PR00237; GPCRRHODOPSN.
OPSD_HUMAN  DR PRINTS; PR00237; GPCRRHODOPSN.
OPSD_MOUSE  DR PRINTS; PR00237; GPCRRHODOPSN.
OPSD_SHEEP  VISUAL PIGMENTS ARE THE LIGHT-ABSORBING MOLECULES THAT MEDIATE VISION
OPSD_HUMAN  VISUAL PIGMENTS ARE THE LIGHT-ABSORBING MOLECULES THAT MEDIATE VISION
OPSD_MOUSE  VISUAL PIGMENTS ARE THE LIGHT-ABSORBING MOLECULES THAT MEDIATE VISION
Impact on provenance.
Redundancy elimination
ACM1_HUMAN  Primary transducing effect is pi turnover.
ACM4_HUMAN  Primary transducing effect is inhibition of adenylate cyclase.
ACM2_HUMAN  Primary transducing effect is adenylate cyclase inhibition.
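The ACM4 and ACM2 statements above say the same thing in a different word order, so a plain string comparison would keep both. One simple way to catch this, sketched here under stated assumptions, is to compare statements as sets of content words. The stopword list and function names are invented for the example, and the real PRECIS rules may differ. The sketch also keeps the list of contributing entry IDs with each surviving statement, since dropping them is exactly where provenance would be lost.

```python
import re

# Assumed stopword list, chosen only to make the example work.
STOPWORDS = {"is", "of", "the", "a", "an", "to", "in"}


def content_key(sentence):
    """Reduce a sentence to an order-independent set of content words."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    return frozenset(w for w in words if w not in STOPWORDS)


def eliminate_redundancy(statements):
    """Merge statements with the same content words.

    statements: list of (entry_id, text) pairs.
    Returns a list of (text, [entry_ids]) keeping the first wording seen.
    """
    merged = {}
    for entry_id, text in statements:
        key = content_key(text)
        if key in merged:
            merged[key][1].append(entry_id)
        else:
            merged[key] = (text, [entry_id])
    return list(merged.values())


statements = [
    ("ACM1_HUMAN", "Primary transducing effect is pi turnover."),
    ("ACM4_HUMAN", "Primary transducing effect is inhibition of adenylate cyclase."),
    ("ACM2_HUMAN", "Primary transducing effect is adenylate cyclase inhibition."),
]
distilled = eliminate_redundancy(statements)
```

A bag-of-words key is crude: it would also merge sentences that use the same words to say different things, so it is a starting point rather than a safe rule.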
Databases: majority vote
Major prion protein precursor (PRP)
PRINTS; PR00341 PRION
PROSITE; PS00291 PRION_1; PS00706 PRION_2
PFAM; PF00377 prion
INTERPRO; IPR000817
PDB; 1B10; 1AG2
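For database cross-references, the vote runs over the DR lines of the matched entries and is followed by filters for preferred links and preferred order. A minimal sketch, assuming a strict majority and taking the preferred order from the order shown on this slide (both are assumptions, as are the function and constant names):

```python
from collections import Counter

# Assumed preference list, taken from the order of the lines on the slide.
PREFERRED_ORDER = ["PRINTS", "PROSITE", "PFAM", "INTERPRO", "PDB"]


def vote_cross_references(dr_lines_per_entry, preferred=PREFERRED_ORDER):
    """Keep (database, accession) pairs cited by more than half the entries.

    dr_lines_per_entry: one list of DR line contents per Swiss-Prot entry,
    e.g. "Pfam; PF00377; prion; 1."
    Only databases on the preferred list are kept, in preferred order.
    """
    n = len(dr_lines_per_entry)
    counts = Counter()
    for dr_lines in dr_lines_per_entry:
        refs = set()
        for line in dr_lines:
            parts = [p.strip() for p in line.rstrip(".").split(";")]
            if len(parts) >= 2:
                refs.add((parts[0].upper(), parts[1]))
        counts.update(refs)  # each entry votes once per cross-reference
    winners = [ref for ref, c in counts.items()
               if c * 2 > n and ref[0] in preferred]
    return sorted(winners, key=lambda r: (preferred.index(r[0]), r[1]))


entries = [
    ["InterPro; IPR000817; -.", "Pfam; PF00377; prion; 1."],
    ["InterPro; IPR000817; -.", "Pfam; PF00377; prion; 1.", "MIM; 176640; -."],
    ["InterPro; IPR000817; -."],
]
links = vote_cross_references(entries)
```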
References: date ranking ++
1. CERVENAKOVA, L., [...]
Infectious amyloid precursor gene sequences in primates used for experimental transmission of human spongiform encephalopathy.
PROC. NATL. ACAD. SCI. USA 91 12159-12162 (1994).
2. LOWENSTEIN, D.H., [...]
Three hamster species with different scrapie incubation times and neuropathological features encode distinct prion proteins.
MOL. CELL. BIOL. 10 1153-1163 (1990).
3. KALUZ, S., [...]
Sequencing analysis of prion genes from red deer and camel.
GENE 199 283-286 (1997).
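The reference heuristic combines commonness across entries with a date priority and a cut-off of four. The sketch below is one possible reading: rank by how many entries cite a paper, break ties by year with newer first, and keep the top four. Whether "date priority" favours newer or older papers is not stated in the slides, and the "mixed paper subject portfolio" filter is not modelled, so treat the sort key as an assumption. The entry-to-paper assignments in the sample are invented for the example.

```python
from collections import Counter


def rank_references(refs_per_entry, top_n=4):
    """Rank references by citation count across entries, then by year.

    refs_per_entry: one list of (first_author, year) tuples per entry.
    Newer-first tie-breaking is an assumption about 'date priority'.
    """
    counts = Counter(ref for refs in refs_per_entry for ref in set(refs))
    ranked = sorted(counts, key=lambda ref: (-counts[ref], -ref[1], ref[0]))
    return ranked[:top_n]


# Which entry cites which paper is made up here; only the papers are real.
refs_per_entry = [
    [("CERVENAKOVA", 1994), ("LOWENSTEIN", 1990)],
    [("CERVENAKOVA", 1994), ("KALUZ", 1997)],
    [("CERVENAKOVA", 1994), ("LOWENSTEIN", 1990)],
]
top_refs = rank_references(refs_per_entry)
```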
Disease – Golden Voting.
(PRIO_HUMAN; P04156): Prp is found in high quantity in the brain of humans and animals infected with neurodegenerative diseases known as transmissible spongiform encephalopathies or prion diseases [...]
(PRIO_HUMAN; P04156): Kuru is transmitted during ritualistic cannibalism, among natives of the new guinea highlands. [...]
(PRIO_SHEEP; P23907): Polymorphism at position 171 may be related to the alleles of scrapie [...]
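Unlike the majority vote, the disease statements above are each kept even when only one entry makes them, and each is tagged with the entry it came from. Reading "golden vote" that way (an interpretation of the slides, not a documented definition), a minimal sketch with a hypothetical function name looks like this:

```python
def golden_vote(disease_comments):
    """Keep every distinct disease statement, tagged with its source entry.

    disease_comments: list of (entry_id, accession, text) tuples.
    A single entry is enough for a statement to be kept; exact repeats
    (ignoring case and spacing) are dropped.
    """
    kept = []
    seen = set()
    for entry_id, accession, text in disease_comments:
        key = " ".join(text.lower().split())
        if key in seen:
            continue
        seen.add(key)
        kept.append(f"({entry_id}; {accession}): {text}")
    return kept


comments = [
    ("PRIO_HUMAN", "P04156", "Kuru is transmitted during ritualistic cannibalism."),
    ("PRIO_SHEEP", "P23907", "Polymorphism at position 171 may be related to scrapie."),
    ("PRIO_GORGO", "P40252", "Kuru is transmitted during  ritualistic cannibalism."),
]
disease_lines = golden_vote(comments)
```

The sample sentences are shortened from the slide, and the third one is an invented repeat to show the de-duplication step.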
gc; PRIO
gx;
gt; Major prion protein precursor (PRP) signature
gp; PROSITE; PS00291 PRION_1; PS00706 PRION_2
gp; INTERPRO; IPR000817
gp; PFAM; PF00377 prion
gp; PDB; 1B10; 1AG2
gp; SCOP; 1B10; 1AG2
gp; CATH; 1B10; 1AG2
gp; MIM; 176640; 123400; 137440; 245300; 600072
bb;
gr; 1. LOWENSTEIN, D.H., BUTLER, D.A., WESTAWAY, D., MCKINLEY, M.P., DEARMOND, S.J. AND PRUSINER, S.B.
gr; Three hamster species with different scrapie incubation times and neuropathological features encode distinct
gr; prion proteins.
gr; MOL.CELL.BIOL. 10 1153-1163 (1990).
gr;
gr; 5. RIEK, R., HORNEMANN, S., WIDER, G., GLOCKSHUBER, R. AND WUETHRICH, K.
gr; NMR characterization of the full-length recombinant murine prion protein, mPrP(23-231).
gr; FEBS LETT. 413 282-288 (1997).
bb;
gd; The function of prp is not known. Prp is encoded in the host genome and is expressed both in normal and
gd; infected cells.
gd;
gd; (PRIO_HUMAN; P04156):
gd; Prp is found in high quantity in the brain of humans and animals infected with neurodegenerative diseases
gd; known as transmissible spongiform encephalopathies or prion diseases, like: creutzfeldt-jakob disease (cjd),
gd; gerstmann-straussler syndrome (gss), fatal familial insomnia (ffi) and kuru in humans; scrapie in sheep and
gd; goat; bovine spongiform encephalopathy (bse) in cattle; transmissible mink encephalopathy (tme); chronic
gd; wasting disease (cwd) of mule deer and elk; feline spongiform encephalopathy (fse) in cats and exotic ungulate
gd; encephalopathy (eue) in nyala and greater kudu. The prion diseases illustrate three manifestations of cns
gd; degeneration: (1) infectious (2) sporadic and (3) dominantly inherited forms. Tme, cwd, bse, fse, eue are all
gd; thought to occur after consumption of prion-infected foodstuffs.
gd;
gd; Prp has a tendency to aggregate yielding polymers called "rods".
gd;
gd; The structure has been determined, e.g. "NMR characterization of the full-length recombinant murine prion
gd; protein, mPrP(23-231)" [5].
gd;
gd; Belongs to the prion family.
gd;
gd; Keywords: GPI-anchor; Repeat; Signal; Prion; Brain; Glycoprotein; Polymorphism; Disease mutation; 3D-structure.
gd;
gd; PRIO is an 8-element fingerprint that provides a signature for the Major prion protein precursor (PRP). The
gd; fingerprint was derived from an initial alignment of 6 sequences: the motifs were drawn from conserved regions
gd; spanning virtually the full alignment length. Two iterations on SPTR37_9f were required to reach convergence,
gd; at which point a true set comprising 37 sequences was identified. A single partial match was also found:
gd; (PRIO_CHICK; P27177).
PRECIS annotation
Human annotation (the manually written PRION entry, PR00341)
Implications for provenance
Tools used by the service providers can be sophisticated.
Provenance information may be recorded in those tools,
but it is not passed on into the annotation (e.g. SWISS-PROT and PRINTS).
• Why?
Implications for provenance
Mining, aggregating, distilling, summarising and generating phrases and texts from comment fields.
Distillation to create a compact and comprehensive summary.
Urge to be non-redundant.
• How to represent the provenance?
• How does the provenance get aggregated?
• How does it get propagated?
• Degrees of evidence -> degrees of provenance
Implications for provenance
gr; 5. RIEK, R., HORNEMANN, S., WIDER, G., GLOCKSHUBER, R. AND WUETHRICH, K.
gr; NMR characterization of the full-length recombinant murine prion protein, mPrP(23-231).
gr; FEBS LETT. 413 282-288 (1997).
bb;
gd; The function of prp is not known. Prp is encoded in the host genome and is expressed both in normal and
gd; infected cells.
gd;
gd; (PRIO_HUMAN; P04156):
gd; Prp is found in high quantity in the brain of humans and animals infected with neurodegenerative diseases
gd; known as transmissible spongiform encephalopathies or prion diseases, like: creutzfeldt-jakob disease (cjd),
gd; gerstmann-straussler syndrome (gss), fatal familial insomnia (ffi) and kuru in humans; scrapie in sheep and
gd; goat; bovine spongiform encephalopathy (bse) in cattle; transmissible mink encephalopathy (tme); chronic
gd; wasting disease (cwd) of mule deer and elk; feline spongiform encephalopathy (fse) in cats and exotic ungulate
gd; encephalopathy (eue) in nyala and greater kudu. The prion diseases illustrate three manifestations of cns
gd; degeneration: (1) infectious (2) sporadic and (3) dominantly inherited forms. Tme, cwd, bse, fse, eue are all
gd; thought to occur after consumption of prion-infected foodstuffs.
gd;
gd; Prp has a tendency to aggregate yielding polymers called "rods".
gd;
gd; The structure has been determined, e.g. "NMR characterization of the full-length recombinant murine prion
gd; protein, mPrP(23-231)" [5].
• Inter and intra provenance
Swiss-Prot
Inheritance of errors. E.g. SWISS-PROT errors:
gd; Polymorphism at position 171 may be related to the
gd; alleles of scarpie incubation-control (sic) gene in this species.
Poor quality begets poor quality. E.g. SWISS-PROT annotation poor or inconsistent:
gd; Visual pigments are the light-absorbing molecules that mediate vision. They consist
gd; of an apoprotein, opsin, covalently linked to cis-retinal. This receptor is coupled
gd; to the activation of phospholipase c.
gd;
gd; Visual pigments are the light-absorbing molecules that mediate vision. They consist
gd; of an apoprotein, opsin, covalently linked to cis-retinal. This receptor is coupled
gd; to the activation of phospholipase c (by similarity).
• How do we record that it’s a copy but it’s been corrected, and why?
Implications for provenance
Hugely subjective.
E.g. if only one annotation claims that the family is implicated in a disease, and that annotation was by a group Terri Attwood respects, then it gets in.
• How to capture that subjectivity and use it when using the annotation?
• The workflow is complex – how to capture this?
• It’s more like argumentation than reproducible derivation.
Questions, questions …
Where does provenance come from?
– Incidental vs supplied by the scientist, somehow.
What is provenance used for?
– Reliability & quality
– Justification & audit
– Reusability, reproducibility & repeatability
– Change & evolution
– Ownership, security, credit & copyright
– Identity - LSID
– Immutability
– Migration & storage
– Aggregation
– Versioning
Spares