Biological Integrated Knowledge Environment

Page 1: Biological Integrated Knowledge Environment

Biological Integrated Knowledge Environment

This demonstration is best viewed as a slide show. To do this, click Slide Show on the top tool bar, then View show.

What follows is an illustration of how BioBIKE can solve interesting scientific problems that would be very difficult for a nonprogramming biologist to approach in any other way. The example uses BioBIKE's graphical programming language, which is under development. However, every operation is already possible using the text-based language, which you can access from http://ramsites.net/~biobike

Click to move on to the next slide.

Page 2: Biological Integrated Knowledge Environment

What Defines Cyanobacteria?

Cyanobacteria comprise an ancient and coherent group of organisms that happen to look alike (at least to our eyes). In fact, the genetic diversity of cyanobacteria greatly exceeds that of vertebrates, a group we think of as consisting of very different creatures. Amid all this diversity, what makes cyanobacteria cyanobacteria?

Page 3: Biological Integrated Knowledge Environment

What Defines Cyanobacteria?

Another way to put this question is: What genes are in common amongst all cyanobacteria?

By analyzing the functions of these genes, we may deduce the characteristics that unite cyanobacteria.
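In programming terms, "genes in common amongst all" is a set intersection. The following is a minimal Python sketch of that idea only. The genome names and the gene assignments are invented toy data, and nothing here is BioBIKE syntax or BioBIKE data.

```python
from functools import reduce

# Hypothetical toy data: each genome is represented by the set of gene
# names it contains. The assignments are invented for illustration.
genomes = {
    "genome_a": {"atpA", "psaF", "trxA", "nifH"},
    "genome_b": {"atpA", "psaF", "trxA"},
    "genome_c": {"atpA", "psaF", "hetR"},
}


def common_to_all(gene_sets):
    """Return the genes present in every one of the given sets."""
    sets = list(gene_sets)
    if not sets:
        return set()
    # Intersect the sets one after another, starting from the first.
    return reduce(lambda acc, s: acc & s, sets[1:], set(sets[0]))


core = common_to_all(genomes.values())
print(sorted(core))
```

A gene survives only if it appears in every genome, so adding more genomes can shrink the core set but never enlarge it.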

Page 4: Biological Integrated Knowledge Environment

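[Screenshot of the BioBIKE interface. The Palette holds the function buttons: Definition, Genomes, Genes/Proteins, Lists/Sets, String/Sequences, Analysis, Tables, Arithmetic, Logic, Input/Output, Favorites, Other, Help, and Exit. Other labeled areas: Workspace, Results (with an Expand control), and History. Status line: "Click a button above to choose a function. Press Help for general and specific advice."]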

We'll use BioBIKE to accomplish the following tasks:
- Define a set of proteins common to all cyanobacteria
- Examine this set to discern core cyanobacterial functions

The first step is to define a set, so click the Definition button in the Palette.

Page 5: Biological Integrated Knowledge Environment

APPLY-FUNCTION - Applies a function to a list of values

ASSIGN - Sets a variable to a given value

DEFINE - Sets a variable to a given value for the first time

DEFINE-FUNCTION - Defines the operation of a new function

MY-FUNCTIONS - Lists functions you have defined

MY-VARIABLES - Lists variables you have defined

SWAP - Gives x's value to y and y's value to x

Clicking Definition brings up a menu related to defining sets, variables, and functions. Click the green triangle next to DEFINE (clicking DEFINE itself would bring up a description of the function).

Page 6: Biological Integrated Knowledge Environment

DEFINE [variable name] AS [value]

A function box appears, with internal boxes telling you what the function needs. Click on the first entry box to provide the name of the set you're defining.

Page 7: Biological Integrated Knowledge Environment

DEFINE core-proteins AS [value]

The box becomes selected (outlined in red). Type the name of the set in the box (or for now, just click). Then click on the value box, where you'll provide the definition of the set.

Page 8: Biological Integrated Knowledge Environment

DEFINE core-proteins AS [value]

You want to define this set as all proteins common to all cyanobacterial genomes. Click the Genomes button in the Palette to bring up functions that work on genomes.

Page 9: Biological Integrated Knowledge Environment

DEFINE core-proteins AS

Auto

CHROMOSOME-OF - Returns chromosome's frame

CODING-GENES-OF - Returns those genes that encode proteins

COMMON-ORTHOLOGS-OF - Returns orthologs common to all organisms in the given list

DOWNSTREAM-SEQUENCES-OF - Undocumented

GENES-OF - Returns the gene of given organism, replicon, gene, protein

INTERGENIC-SEQUENCES-OF - Returns sequences between genes of an organism

NICKNAMES-OF - Returns nicknames of given organism

NONCODING-GENES-OF - Returns those genes that do not encode proteins

ORGANISM-OF - Returns organism's frame

PROTEINS-OF - Returns proteins' frames

REPLICON-OF - Returns chromosome, replicon (plasmid) or contig's frame

SEQUENCE-OF - Returns the sequence of the given gene, protein, contig, replicon, or genome

UPSTREAM-SEQUENCES-OF - Returns sequences upstream of a set of genes

COMMON-ORTHOLOGS-OF sounds promising ("orthologs" are, in brief, those proteins related by common evolutionary descent). You could find out more by clicking on the function, but for now, click on the triangle next to the function.

Page 10: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF [set of organisms]

all-cyanobacteria
filamentous-cyanobacteria
heterocystous-cyanobacteria
marine-cyanobacteria
unicellular-cyanobacteria

The function fills the definition hole and, as before, tells you what it needs to know in order to function. Click on the green down arrow to see appropriate choices. Common orthologs amongst all-cyanobacteria is what you want, so click on that.

Page 11: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

The definition seems to be complete, so click the rightward green GO triangle next to DEFINE to accomplish the task.

Page 12: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 1: A7120.p-All0004 A7120.p-All0005 A7120.p-All0006 A7120.p-All0007 A7120.p-All0008 A7120.p-Asl0009 A7120.p-All0010 A7120.p-All0011 A7120.p-Alr0033 A7120.p-All0036 A7120.p-Alr0044 A7120.p-Alr0045 ...(744)

The result appears in the Results window below. Evidently there are 744 proteins in common amongst cyanobacteria (they're listed in terms of the proteins of Anabaena PCC 7120). But what are they? To find the descriptions of these proteins, click the Genes/Proteins button in the Palette.
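One way to picture a result that is "listed in terms of" a single organism's proteins is shown below. This is only a Python sketch under invented assumptions: the ortholog table, the organism keys, and the protein IDs are hypothetical placeholders, and the source does not say how BioBIKE computes or stores orthologs.

```python
# Hypothetical ortholog table: each row is one ortholog group, mapping an
# organism name to that organism's member protein. All names are placeholders.
ortholog_groups = [
    {"ref": "ref.p-0001", "org_b": "b.p-17", "org_c": "c.p-90"},
    {"ref": "ref.p-0002", "org_b": "b.p-03"},              # absent from org_c
    {"ref": "ref.p-0003", "org_b": "b.p-44", "org_c": "c.p-12"},
    {"org_b": "b.p-51", "org_c": "c.p-77"},                # absent from ref
]


def common_orthologs(groups, organisms, reference):
    """Report, via the reference organism's protein, each group that has a
    member in every listed organism. `reference` must be in `organisms`."""
    required = set(organisms)
    return [
        group[reference]
        for group in groups
        if required.issubset(group)  # iterating a dict yields its keys
    ]


core_proteins = common_orthologs(
    ortholog_groups, organisms=["ref", "org_b", "org_c"], reference="ref"
)
print(core_proteins, len(core_proteins))
```

Each surviving group is represented by one protein ID from the reference organism, which is why a single list of IDs can stand for proteins shared by many genomes.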

Page 13: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 1: A7120.p-All0004 A7120.p-All0005 A7120.p-All0006 A7120.p-All0007 A7120.p-All0008 A7120.p-Asl0009 A7120.p-All0010 A7120.p-All0011 A7120.p-Alr0033 A7120.p-All0036 A7120.p-Alr0044 A7120.p-Alr0045 ...(744)

Description-analysis – Properties of genes and proteins

Gene-Neighborhood – Genes/sequences around a given gene or coordinate

Translation – Codon amino acid conversions and properties of amino acids

Gene-protein-types – Type checks related to genes and proteins

You're given four categories of choices. Description-analysis looks like the most likely choice, so click the green GO triangle next to that.

Page 14: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 1: A7120.p-All0004 A7120.p-All0005 A7120.p-All0006 A7120.p-All0007 A7120.p-All0008 A7120.p-Asl0009 A7120.p-All0010 A7120.p-All0011 A7120.p-Alr0033 A7120.p-All0036 A7120.p-Alr0044 A7120.p-Alr0045 ...(744)

Description-analysis – Properties of genes and proteins

Gene-Neighborhood – Genes/sequences around a given gene or coordinate

Translation – Codon amino acid conversions and properties of amino acids

Gene-protein-types – Type checks related to genes and proteins

COG-ID-OF - Returns the COG ID of given gene or protein

DESCRIPTION-OF - Returns best description of given gene[s] or protein[s]

GENE-DESCRIBED-BY - Returns gene annotated as given

GENE-NAMED - Converts string name into gene frame

GENE-OF - Returns the gene of given organism, replicon, gene, protein

HYDROPHOBICITY-OF - Returns hydrophobicity score for amino acids

LENGTH-OF - Returns the length of a given entity

LENGTHS-OF - Returns the length of each entity in a list of entities

MW-OF - Calculates molecular weight of amino acid sequence

ORTHOLOG-OF – Returns phylogenetically related gene or protein

PROTEIN-OF - Returns protein of given organism, replicon, gene, protein

SEQUENCE-OF - Returns the sequence of the given entity

Good guess! That got you to DESCRIPTION-OF, which seems very likely to do what you want. Click the green GO triangle next to that.

Page 15: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 1: A7120.p-All0004 A7120.p-All0005 A7120.p-All0006 A7120.p-All0007 A7120.p-All0008 A7120.p-Asl0009 A7120.p-All0010 A7120.p-All0011 A7120.p-Alr0033 A7120.p-All0036 A7120.p-Alr0044 A7120.p-Alr0045 ...(744)

DESCRIPTIONS-OF [entity]

The usual function box appears. You could type in the name of your set, core-proteins, but you might misspell it. Click on the downward green arrow to get appropriate choices to put in the entity box.

Page 16: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 1: A7120.p-All0004 A7120.p-All0005 A7120.p-All0006 A7120.p-All0007 A7120.p-All0008 A7120.p-Asl0009 A7120.p-All0010 A7120.p-All0011 A7120.p-Alr0033 A7120.p-All0036 A7120.p-Alr0044 A7120.p-Alr0045 ...(744)

DESCRIPTIONS-OF [entity]

(specific gene)
(previous result)
core-proteins

Sets and variables you've previously defined appear as appropriate choices. Click on core-proteins.

Page 17: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 1: A7120.p-All0004 A7120.p-All0005 A7120.p-All0006 A7120.p-All0007 A7120.p-All0008 A7120.p-Asl0009 A7120.p-All0010 A7120.p-All0011 A7120.p-Alr0033 A7120.p-All0036 A7120.p-Alr0044 A7120.p-Alr0045 ...(744)

DESCRIPTIONS-OF core-proteins

You could execute the function now (clicking the GO triangle), but this will get you just the descriptions. You'd like them labeled with the protein name. Click on the white/green MORE arrow in the upper right corner to see what optional capabilities the function supports.

Page 18: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 1: A7120.p-All0004 A7120.p-All0005 A7120.p-All0006 A7120.p-All0007 A7120.p-All0008 A7120.p-Asl0009 A7120.p-All0010 A7120.p-All0011 A7120.p-Alr0033 A7120.p-All0036 A7120.p-Alr0044 A7120.p-Alr0045 ...(744)

DESCRIPTIONS-OF core-proteins

LABELED
LENGTH

Labeling is evidently one of the options. Click on LABELED.

Page 19: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 1: A7120.p-All0004 A7120.p-All0005 A7120.p-All0006 A7120.p-All0007 A7120.p-All0008 A7120.p-Asl0009 A7120.p-All0010 A7120.p-All0011 A7120.p-Alr0033 A7120.p-All0036 A7120.p-Alr0044 A7120.p-Alr0045 ...(744)

DESCRIPTIONS-OF core-proteins LABELED

That's all you want to specify for the descriptions, so click on the right GO triangle to get the result.
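The effect of the LABELED option can be pictured with a small Python analogy. The function `descriptions_of` and its `labeled` parameter are hypothetical stand-ins written for this sketch, not the BioBIKE function itself. The protein IDs and descriptions are a few of those shown in this demonstration.

```python
# A small lookup table of descriptions, using entries that appear in the
# demonstration's results.
descriptions = {
    "A7120.p-All0004": "ATP synthase subunit gamma",
    "A7120.p-All0005": "ATP synthase subunit alpha",
    "A7120.p-Alr0044": "hypothetical protein",
}


def descriptions_of(proteins, table, labeled=False):
    """Return one description per protein. With labeled=True, pair each
    description with the protein it belongs to."""
    if labeled:
        return [(protein, table[protein]) for protein in proteins]
    return [table[protein] for protein in proteins]


core_proteins = ["A7120.p-All0004", "A7120.p-All0005", "A7120.p-Alr0044"]
plain = descriptions_of(core_proteins, descriptions)
labeled = descriptions_of(core_proteins, descriptions, labeled=True)
print(plain)
print(labeled)
```

Without labels you get only the description text. With labels, each description stays attached to its protein, which matters once you start filtering the list.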

Page 20: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

The result appears in the Results window, or at least part of it does. To see the whole thing, click on RESULT 2.

Page 21: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

Result 2
A7120.p-All0004 "ATP synthase subunit gamma"
A7120.p-All0005 "ATP synthase subunit alpha"
A7120.p-All0006 "ATP synthase subunit delta"
A7120.p-All0007 "ATP synthase subunit b"
A7120.p-All0008 "ATP synthase subunit b"
A7120.p-Asl0009 "ATP synthase subunit c"
A7120.p-All0010 "ATP synthase subunit a"
A7120.p-All0011 "ATP synthase subunit 1"
A7120.p-Alr0033 "DHNA phythltransferase"
A7120.p-All0036 "UDP-N-acetylmuramoylalanyl-D-g
A7120.p-Alr0044 "hypothetical protein"
A7120.p-Alr0045 "hypothetical protein"
A7120.p-All0049 "DNA mismatch repair protein"
A7120.p-Alr0051 "IMP dehydrogenase"
A7120.p-Alr0052 "thioredoxin"
A7120.p-Alr0063 "ribosome binding factor A"
A7120.p-All0082 "riboflavin biosynthesis protei
A7120.p-Alr0088 "single-stranded DNA-binding pr
A7120.p-Alr0094 "glutamate racemase"
A7120.p-Alr0096 "solanesyl diphosphate synthase
A7120.p-All0109 "photosystem I subunit III prec
A7120.p-Alr0110 "sialoglycoprotease"
A7120.p-Alr0115 "N5-glutamine methyltransferase
A7120.p-Alr0116 "hypothetical protein"

You can use the scroll bar of the Results window to scroll through all of the descriptions, but even with what you see right now, certain things stand out. First, many of the descriptions are of expected proteins (e.g. ATP synthase). But some entries are surprising: "hypothetical protein" (click to continue).

Page 22: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

Hypothetical protein? Proteins shared by cyanobacteria that are separated by perhaps more than two billion years of evolution must have critical functions. Seeing this result, you decide to change course and identify all core proteins annotated "hypothetical". Perhaps you can learn the most from them. Click the white/red X to leave the Results window.

Page 23: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

To look for those highly conserved hypothetical proteins, you need to search for the words "hypothetical protein" in the descriptions. The descriptions are strings (collections of letters and other characters). So click on the String/Sequences button of the Palette.

Page 24: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

String-analysis - Functions that analyze the properties of strings

String-extraction - Functions that deliver parts of strings

String-production - Functions that manipulate and produce strings

String-type-checks - Functions that determine the type of a string

String-analysis looks good. Click on its GO triangle to find a function that might work to identify "hypothetical protein" in the descriptions.

Page 25: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

String-analysis - Functions that analyze the properties of strings

String-extraction - Functions that deliver parts of strings

String-production - Functions that manipulate and produce strings

String-type-checks - Functions that determine the type of a string

ALIGNMENT-OF – Aligns sequences via Clustal

ALPHABET-OF – Returns the minimal string of characters contained in string

COUNT-OF - Returns the number of times the query appears in the target

COUNTS-OF – Returns the number of times the query(s) appear in the target(s)

LENGTH-OF – Returns the length of a given entity

LENGTHS-OF – Returns the length of each entity in a list of entities

MATCH-OF – Searches target for first instance of query

MATCHES-OF – Searches target for all instances of query

MATCHES-OF seems perfect. Click on the GO triangle next to it.
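The menu distinguishes MATCH-OF (first instance of the query) from MATCHES-OF (all instances). A rough Python analogy of that distinction follows. The function names mirror the menu entries only for readability. What BioBIKE actually returns (positions, matched text, or something else) is not shown in this demonstration, so the 0-based positions here are an assumption borrowed from Python.

```python
def match_of(query, target):
    """Return the position of the first occurrence of query, or None."""
    index = target.find(query)
    return None if index == -1 else index


def matches_of(query, target):
    """Return the positions of all occurrences of query, including
    overlapping ones."""
    positions = []
    start = target.find(query)
    while start != -1:
        positions.append(start)
        # Resume the search one character past the previous hit.
        start = target.find(query, start + 1)
    return positions


sequence = "GATCGGATCCGATC"
print(match_of("GATC", sequence))
print(matches_of("GATC", sequence))
```

For the task at hand, every occurrence matters, because you want all of the entries annotated "hypothetical protein" rather than just the first one.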

Page 26: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

MATCHES-OF [value or category]

MATCHES-OF calls for something to match. Click on the down triangle to see choices that might be appropriate.

Page 27: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

MATCHES-OF [value or category]

NOT
EACH-OF
PATTERN
PATTERNS
ENZYME
ENZYMES
PROPERTY
PROPERTIES
(specific entity)
(previous result)
core-proteins

The function can find various kinds of matches – patterns, restriction sites, etc. But "hypothetical protein" is not on the list. You'll have to type it in yourself. X out of the box.

Page 28: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

MATCHES-OF

Type "hypothetical protein" (click for this to be done automatically).

Page 29: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

MATCHES-OF "hypothetical protein"

You need to specify where to look for these words. Click on the right green MORE arrow to extend the choices of this function.

Page 30: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

MATCHES-OF "hypothetical protein"

IN-EACH
IN

Choose IN-EACH, because you are looking not for "hypothetical protein" in the list of results but rather for those words within each of the individual subresults.
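The distinction between the two options can be illustrated with a Python analogy. This is a sketch of the idea only, using a few labeled descriptions from this demonstration. It is an assumption that BioBIKE's IN and IN-EACH behave comparably, since the source describes them only in the sentence above.

```python
# A few (protein, description) sub-results, as in RESULT 2.
labeled = [
    ("A7120.p-All0004", "ATP synthase subunit gamma"),
    ("A7120.p-Alr0044", "hypothetical protein"),
    ("A7120.p-Alr0051", "IMP dehydrogenase"),
]
query = "hypothetical protein"

# "IN"-style search: is the query itself one of the list's elements?
# Every element is a (protein, description) pair, so nothing matches.
found_in_list = query in labeled

# "IN-EACH"-style search: look inside each sub-result separately.
found_in_each = [query in description for _, description in labeled]

print(found_in_list)
print(found_in_each)
```

Searching the list as a whole asks the wrong question, because the words you want sit one level down, inside each sub-result.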

Page 31: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

MATCHES-OF "hypothetical protein" IN-EACH [value]

(specific entity)
(previous result)
core-proteins

Click the green MENU triangle in the value box to get a list of appropriate choices. Then, choose previous result.

Page 32: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 2: ((A7120.p-All0004 "ATP synthase subunit gamma") (A7120.p-All0005 "ATP synthase subunit alpha") (A7120.p-All0006 "ATP synthase subunit delta")...(744)

DESCRIPTIONS-OF core-proteins LABELED

MATCHES-OF "hypothetical protein" IN-EACH previous result

That seems to describe what you want to do, so click the green GO arrow to the left of MATCHES-OF.

Page 33: Biological Integrated Knowledge Environment

DEFINE core-proteins AS COMMON-ORTHOLOGS-OF all-cyanobacteria

RESULT 3: ((A7120.p-Alr0044 hypothetical protein) (A7120.p-Alr0045 hypothetical protein) (A7120.p-Alr0116 hypothetical protein) ...(196)

DESCRIPTIONS-OF core-proteins LABELED

MATCHES-OF "hypothetical protein" IN-EACH previous result

The Results window gives you a list of those proteins that are conserved amongst all cyanobacteria and are currently annotated as "hypothetical". You've gotten your wish and can consider each carefully to try to find clues as to their functions... later. For now, click the Exit button in the Palette.
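The final filtering step can be sketched end to end in Python. The helper name `annotated_as` is hypothetical, and the five entries are a small sample taken from the results shown earlier. The demonstration itself reports 196 matches among 744 core proteins, which this toy sample does not attempt to reproduce.

```python
# A sample of labeled descriptions from the demonstration's RESULT 2.
labeled = [
    ("A7120.p-All0004", "ATP synthase subunit gamma"),
    ("A7120.p-Alr0044", "hypothetical protein"),
    ("A7120.p-Alr0045", "hypothetical protein"),
    ("A7120.p-Alr0051", "IMP dehydrogenase"),
    ("A7120.p-Alr0116", "hypothetical protein"),
]


def annotated_as(pairs, phrase):
    """Keep the (protein, description) pairs whose description contains
    the given phrase."""
    return [
        (protein, description)
        for protein, description in pairs
        if phrase in description
    ]


hypothetical = annotated_as(labeled, "hypothetical protein")
for protein, description in hypothetical:
    print(protein, description)
print(len(hypothetical), "of", len(labeled))
```

Because the labels were kept in the previous step, the filtered list still says which proteins to follow up on.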

Page 34: Biological Integrated Knowledge Environment

Biological Integrated Knowledge Environment

This example illustrated:

- The power of an integrated knowledge environment
  You didn't need to read in any data or worry about formats. It was all there for you.

- The power of a language that knows the concepts of biology
  You didn't need to teach the language the concept of "ortholog". This and many other biological concepts are built in.

- The power of a graphical interface
  You were programming a computer. It wasn't that challenging.

- The power of creative programming in solving biological problems
  You didn't use a premade program but examined each intermediate result as you went. This enabled you to see an unexpected and fascinating result that would have passed you by if you got only the final result.

BioBIKE may be accessed from ramsites.net/~biobike