MapMan Help - GabiPDgabipd.org/projects/MapMan/help/MapMan_Help.pdf · 1.6 Statistics (Wilcoxon Rank Sum Test) 2 View your own Arabidopsis 22KAffymetrix data 2.1 Data format 2.2 Create

MapMan Help

1 Short Introduction

1.1 View data included in the package

1.2 Display options

1.3 Print

1.4 Save

1.5 Search Functions

1.6 Statistics (Wilcoxon Rank Sum Test)

2 View your own Arabidopsis 22K Affymetrix data

2.1 Data format

2.2 Create an experiment folder

2.3 Add data files to your new folder

2.4 Visualise your data

3 Use MapMan to visualise any gene expression, metabolite or other data

3.1 Creation of a mapping file

3.2 Create your own pathway (Annotation of an image file)

3.3 Histogram frequency chart

4 Structure of MapMan

5 Available Pathways Overview

6 Contact information

1 Short introduction

MapMan is a user-driven tool that displays large datasets, e.g. from gene expression experiments, on diagrams of metabolic pathways or other processes.

You can use the provided diagrams or create your own, and display your own or other people's data on them.

MapMan needs three different types of files. They are located in three different folders inside the directory of example data.

(i) data files: contain experimental results (.xls or .txt file format)

(ii) pathways: diagrams (.bmp file format) which serve as context

(iii) mapping files: connect gene/metabolite/protein identifiers to the BINs (“functional categories”) which are displayed (.xls or .txt file format)

1.1 View data included in the package

The MapMan download includes example data.

After starting the MapMan software you will find data files in the Experiments folder, pathway image files in the Pathways folder (see Table 1 for an overview of the available predefined pathways) and mapping files in the Mappings folder of the selection directory on the left.

You can view the included datasets in the context of different metabolic pathways:

(i) Double-click an image file from the folder "Pathways", e.g. "Metabolism_overview" or "Glycolysis". Choose a mapping file from the pop-up box and click "OK". Choose "AFFY2005" for "Metabolism_overview" and for all other maps showing biological processes (this is most often the right choice).

Hint: Different images need different mapping files (Table 1). Response_images need a Response_mapping file. (In fact, if you don’t care about the statistics you can combine both files.)

(ii) A single click activates the experiment files from the folder "Experiment:DiurnalCycle" (or another experiment folder) one by one. All data files from one experiment can now be viewed in sequence. Once a file has been loaded for the first time, a mouse click calls it up again in a fraction of a second.

Each gene is symbolised by a box, and the gene expression level is colour-encoded (red = downregulation, blue = upregulation).

Moving the mouse over an individual box calls up the gene annotation beneath the pointer, while a click copies the information to the text window below the pathway.

Right-clicking an individual box brings up further options, such as opening a web browser (link out) with additional information about the particular spot from the GABI website (http://gabi.rzpd.de), which also leads to further information resources.

There you can get UniGene information and information on how well a spot reflects a given transcript.

Number | Filename | Description/Keywords | Visualization | Data | Standard/Response
1 | photosynthesis | Light reaction, Calvin cycle, photorespiration | P | T | S
2 | cell functions overview | cellular functions | H | T | S
3 | Cell Wall precursors | NDP sugar pathways (used for the cell wall) | P | T | S
4 | Cellular response overview | stresses, redox, development, cell cycle and division | P | T | S
5 | Glycolysis | glycolysis | P | T | S
6 | Large enzyme families | Large enzyme families like oxidases, GDSL lipases, etc. | P | T | S
7 | Lignin | Monolignol pathway starting from phenylalanine | P | T | S
8 | metabolism overview | overview of metabolic reactions | P | T | S
9 | mitochondrial e-transport | mitochondrial overview including transporters | P | T | S
10 | N-metabolism | Nitrogen metabolism | P | T | S
11 | RNA-Protein Synthesis | Protein synthesis, targeting and degradation as well as RNA processing | P/H | T | S
12 | Sucrose Starch | Sucrose and starch degradation and synthesis | P | T | S
13 | Transcription Regulation | Potential transcription factors and regulators of transcription | P | T | S
14 | Transport Overview | different transporters | P | T | S
15 | Proteasome | Ubiquitin dependent protein degradation pathway | P | T | S
16 | Regulation overview | TFs, protein modification and degradation, hormone regulation, receptor kinases, G proteins, MAPKs etc. | P | T | S
17 | TCA | TCA cycle including mitochondrial genes | P | T | S
18 | Glycolysis-TCA | Combination of glycolysis, TCA, and mitochondrial genes | P | T | S
19 | Secondary metabolism | Secondary metabolism like flavonols, chalcones, lignins, etc. | P | T | S
20 | Pentose phosphate | Pentose phosphate pathway, Warburg way, Shunt | P | T | S
21 | Metabolites | most metabolites that can be measured; for a conversion of your metabolite name to the canonical MM one see: http://csbdb.mpimp-golm.mpg.de/csbdb/gmd/tools/gmd_conv.html | P | M | S
22 | C_TCA | only the TCA cycle including some metabolites | P | T/M | S
23 | Sulfate assimilation | Sulfate assimilation | P | T | S
24 | C_Lignin | Monolignol pathways (slightly different layout from Lignin, which shows transcripts only) | P | T/M | S
25 | Response_nutrients | Response to starvation and readdition of nutrients: phosphate, nitrate, sulphate, carbon | P | T | R
26 | Response_stress | Response to abiotic stresses from AtGenExpress (roots) | P | T | R

Visualization types: P(oints), H(istogram)

Data types: T(ranscripts), M(etabolites), P(roteins)

Standard/Response: S(tandard), R(esponse)

Table 1: Available image, mapping and experiment files.

Note: Some of the mapping files included in this package might be encrypted. To receive the original files, please contact Mark Stitt ([email protected]).

1.2 Display options

In default mode, each individual gene (protein, metabolite) is symbolised by a small square box in which the expression (concentration) level is colour-encoded.

In gene expression experiments, up-regulated genes are shown in blue, while down-regulated genes are shown in red. [Colour coding will be extended with planned additions]

The display options can be changed:

(i) Select "Options" from the "Pathway" menu.

(ii) Adjust the following settings:

Scaling: Changes the colour intensity of your data points. Lower the value for more colour intensity, raise it for less. The default value is 3.0.

Datasize: Reduces or enlarges the box size. Medium size (M) is the default.

Background Colour: Changes the colour of the image background.

Visualization Type: Changes the display mode. The default mode uses the visualization specified for each data area; the other modes override the modes specified for each data area.

Marked Visualization Type: Changes the visualization for marked elements. Changes are only visible for elements which have been marked via the search function (1.5).
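The effect of the Scaling option can be illustrated with a minimal Python sketch. It assumes a linear colour ramp from white to full colour that saturates at plus or minus the scaling value; MapMan's actual colour function is not described in this help text, and value_to_rgb is a hypothetical helper, not part of MapMan.

```python
def value_to_rgb(value, scale=3.0):
    """Map an expression value to an (R, G, B) triple.

    Assumption: white at 0, pure blue at +scale (up-regulated),
    pure red at -scale (down-regulated), linear in between.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Values beyond +/-scale are drawn fully saturated.
    intensity = min(abs(value) / scale, 1.0)
    fade = round(255 * (1.0 - intensity))
    if value > 0:
        return (fade, fade, 255)  # up-regulated: blue
    if value < 0:
        return (255, fade, fade)  # down-regulated: red
    return (255, 255, 255)        # no change: white


# A smaller scaling value saturates earlier, i.e. gives more colour intensity.
print(value_to_rgb(1.0, scale=3.0))
print(value_to_rgb(1.0, scale=1.0))
```

Under this assumption, a value of 1.0 is pale blue at the default scaling of 3.0 and fully blue at a scaling of 1.0, which matches the rule that lowering the value increases colour intensity.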

1.3 Print

To print your image-data-file, select "Print" from the "Pathway" menu.

1.4 Save

You can save the result of an experiment depicted on a pathway as an image-data-file:

Select "Save as Image" from the "Pathway" menu. The image can be saved as a .jpg or .png file.

After selecting a file type, enter the file name including the matching extension (.jpg or .png).

1.5 Search Functions

The search function searches the descriptions of all genes/metabolites/proteins which are present on the selected pathway.

Type your search term into the search field.

Press the binocular button right beside the search field to start.

The number of matches found is displayed in the "Log" tab (bottom pane).

Additionally, all spots matched by the search are highlighted in their individual boxes according to the display option selected under "Pathway->Options->MarkedVisualizationType" (see 1.2).

(Experienced users can make use of regular expression, a.k.a. regex, syntax.)

Press the recycle button to reset the marking on the pathway diagram.

The default display is "3D Rectangles". In the example shown, the option "greying" was chosen, which greys out all items that do not match the search term. Other available options are "Inner Rectangles" and "Triangles".
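As an illustration of a regular-expression search over descriptions, here is a minimal Python sketch. It uses Python's re module, whose syntax may differ in detail from the regex dialect MapMan accepts; the case-insensitive matching and the helper name search_descriptions are assumptions made for this example only.

```python
import re


def search_descriptions(descriptions, pattern):
    """Return the identifiers whose description matches the pattern.

    descriptions: dict mapping identifier -> description text.
    Matching is case-insensitive here (an assumption of this sketch).
    """
    regex = re.compile(pattern, re.IGNORECASE)
    return [ident for ident, text in descriptions.items() if regex.search(text)]


# Example descriptions taken from the mapping file example in Table 2.
spots = {
    "EST_1B05": "photosystem II G protein",
    "EST_11H01": "subunit of rubisco",
    "EST_7K12": "fructose-bisphosphate aldolase",
    "EST_5F03": "sucrose-phosphate synthase",
}

# The alternation "synthase|aldolase" marks spots matching either word.
print(search_descriptions(spots, r"synthase|aldolase"))
```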

1.6 Statistics (Wilcoxon Rank Sum Test)

MapMan uses the Wilcoxon Rank Sum Test to identify BINs whose expression profile differs from that of all the remaining BINs.

The test is performed automatically every time a new experiment is loaded.

For each BIN, the results (BIN, Elements, Probability) are displayed in the Statistics panel, next to the Log panel in the bottom pane below the pathway diagram.
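The idea of comparing one BIN against all remaining values can be sketched in Python as follows. This is a textbook rank sum test using the normal approximation, without tie or continuity correction; whether MapMan applies such corrections, or any multiple-testing adjustment, is not stated in this help text. The function name rank_sum_p is hypothetical.

```python
import math


def rank_sum_p(bin_values, other_values):
    """Two-sided Wilcoxon rank sum p-value (normal approximation).

    bin_values: expression values of the elements in one BIN.
    other_values: expression values of all remaining elements.
    """
    n1, n2 = len(bin_values), len(other_values)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one value")
    # Pool both groups, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in bin_values] + [(v, 1) for v in other_values])
    rank_sum = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        in_bin = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum += average_rank * in_bin
        i = j
    expected = n1 * (n1 + n2 + 1) / 2.0
    spread = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum - expected) / spread
    return math.erfc(abs(z) / math.sqrt(2.0))


# A BIN whose values all lie above the rest yields a small probability.
print(rank_sum_p([5, 6, 7, 8, 9], [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]))
```

A small probability suggests that the BIN as a whole responds differently from the remaining elements; with very few elements per BIN the normal approximation becomes unreliable.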

2 View your own Arabidopsis 22K Affymetrix data

Users of MapMan can visualize their own Arabidopsis 22K Affymetrix microarray experiments using the mapping and pathway image files included in the MapMan package.

New files with experimental data have to be in a specific format (2.1) and are loaded into MapMan in two steps.

The first step is to create a new experiment folder (2.2) and the second step is to add data files to this folder (2.3).

You can create as many experiment folders as you like and reference the data files belonging to each experiment.

2.1 Data format

MapMan expects data as either an EXCEL file or a tab-delimited text file. Files can have one of the following two structures.

Single assay file format:

IDENTIFIER  VALUE
EST_1A01    -1,50
EST_1C05    1,90
EST_1G03    X¹
EST_1F08    -1,80

Multiple assay file format:

agi       exp3a  exp3b  exp4a  exp4b
at1g6635  -1,50  -1,20  -1,70  -1,10
at2g6678  1,90   1,50   X¹     1,40
at2g2431  1,90   1,50   1,40   1,40
at4g1786  1,20   -1,50  0,40   0,30

¹Values of 'X' mean absent, which will be displayed as empty squares.

The "IDENTIFIER" column contains the unique EST or oligo sequence identifier, for example the Affymetrix identifiers represented on your filter/array.

The "VALUE" column contains the measured expression value for each identifier. A negative/positive value represents downward/upward regulation.

We recommend using logarithms of measured expression ratios between two treatment conditions in an experiment (e.g. log2 expression values representing fold change).

Values in the "VALUE" column should typically be in the range -10 to 10, but values may be higher or lower.

Missing or absent values are marked by a capital "X".

Alternatively, MapMan can load files containing multiple assays at once. MapMan expects the probe identifier in the first column; all following columns contain the individual experiment values.

The tables can contain a header row with the names of the experiments, but this is not necessary.
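To make the format concrete, the following Python sketch reads a tab-delimited data file of the kind described above. It is not MapMan's own loader: the function name parse_data_file is hypothetical, and treating empty cells like "X" is an assumption of this sketch.

```python
import csv
import io


def parse_data_file(text, has_header=True, decimal_comma=True):
    """Parse tab-delimited expression data.

    Returns (header, data), where data maps each identifier to a list
    of floats, with None for absent values marked by a capital "X".
    """
    rows = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(rows) if has_header else None
    data = {}
    for row in rows:
        if not row or not row[0].strip():
            continue  # skip blank lines
        values = []
        for cell in row[1:]:
            cell = cell.strip()
            if cell == "X" or cell == "":
                values.append(None)  # absent value
            else:
                if decimal_comma:
                    cell = cell.replace(",", ".")
                values.append(float(cell))
        data[row[0].strip()] = values
    return header, data


example = "agi\texp3a\texp3b\nat1g6635\t-1,50\t-1,20\nat2g6678\t1,90\tX\n"
header, data = parse_data_file(example)
print(header)
print(data)
```

The same function handles the single assay format, because that is simply the case of one value column.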

2.2 Create an experiment folder

Experimental data can be organized in folders, and data files can then be added to those folders. Alternatively, folders from the file system can be selected and the data files they contain will be added automatically.

(i) Right-click any experiment folder in the folder structure in the left pane and select "new sub folder" to generate a new experiment folder in MapMan.

(ii) An option box appears. Select "by name" to add files individually. (Advanced users can instead reference a directory by selecting an existing folder, which loads all data files within that directory.)

(iii) The new experiment folder will show up in the folder structure.

2.3 Add data files to your new folder

(i) Right-click the newly created experiment folder and select "add data file" from the menu.

(ii) Choose your data files one by one. (Tab-delimited text files are much faster to load than Excel files! You can export your data from Excel as tab-delimited text files.)

(iii) A dialog opens in which you can configure your data file. You can deselect data columns you are not going to use, which speeds up loading of the data and prevents errors from unreadable data. MapMan usually recognizes the format of your data automatically; check that the number format (decimal point or decimal comma) matches. Moreover, you can force MapMan to use or ignore a header row by checking or unchecking the check box "first row contains header".

(iv) All configured data files are now listed within your experiment folder.
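Since tab-delimited text files load faster and unused columns can cause errors, it may be convenient to prepare a reduced file before adding it. The Python sketch below writes selected columns as tab-delimited text; it is an optional preprocessing step outside MapMan, and the function name to_tab_delimited and the sample column "note" are hypothetical.

```python
import csv
import io


def to_tab_delimited(rows, keep_columns):
    """Return tab-delimited text containing only the chosen columns.

    rows: list of rows, each a list of cell strings.
    keep_columns: zero-based column indices to keep, in output order.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow([row[i] for i in keep_columns])
    return out.getvalue()


table = [
    ["agi", "exp3a", "note", "exp3b"],
    ["at1g6635", "-1.5", "ok", "-1.2"],
]
# Keep the identifier and the two value columns, drop the text column.
print(to_tab_delimited(table, [0, 1, 3]))
```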

2.4 Visualise your data

Once you have imported your data files into MapMan, you can display these data sets on an image (diagram). Please follow the instructions outlined in section 1.1.

3 Use MapMan to visualise any gene expression, metabolite or other data

In general, MapMan can be used to display any data on user-defined images. Data can be gene expression data, metabolite or protein concentrations, enzyme activities etc.

Image files (.png, .bmp ?) can be metabolic pathways, cellular processes, regulatory networks etc.

Before MapMan can be used in this way, two steps are necessary:

(i) creation of a mapping file (3.1),

(ii) annotation of new image files (3.2).

3.1 Creation of a custom mapping file

The mapping file structures your genes, metabolites, enzymes etc. in discrete classes in a hierarchical way.

The mapping file has to be in "MS-EXCEL", tab-delimited ".txt" or ".xml" format.

Define the first five columns as "BINCODE", "NAME", "IDENTIFIER", "DESCRIPTION", "TYPE".

The "BINCODE" column contains the identifier for all your main classes (1, 2, 3, 4 ...), subclasses (1.1, 1.2, ..., 2.1, 2.2 ...), subsubclasses (1.1.1, 1.1.2 ... 1.2.1, 1.2.2 ...) and so on. The dot between classes and their subclasses is important. The BINCODE is used to annotate the image files (3.2).

The "NAME" column contains the names of each class (e.g. Photosynthesis) and subclass (e.g. Photosynthesis.lightreaction). Again a dot separates classes and subclasses.

The "IDENTIFIER" column lists the identifier of a gene, metabolite, enzyme etc. These identifiers have to match the identifiers in your data file.

The "DESCRIPTION" column contains a user-defined description of the gene, metabolite, enzyme etc. There is no space limitation.

The "TYPE" column specifies whether the item is a transcript (T), metabolite (M), enzyme (E) or protein (P).

Hint: Currently you can leave out the "TYPE" column for backward compatibility reasons.

BINCODE | NAME | IDENTIFIER | DESCRIPTION | TYPE
1 | Photosynthesis | | |
1.1 | Photosynthesis.lightreaction | Chlorophyll a | Approximate chlorophyll content with assay XY | M
1.1 | Photosynthesis.lightreaction | EST_1B05 | photosystem II G protein | T
1.2 | Photosynthesis.photorespiration | EST_2A08 | glycine hydroxymethyltransferase (EC 2.1.2.1) | T
1.3 | Photosynthesis.calvin cycle | EST_11H01 | subunit of rubisco | T
1.3 | Photosynthesis.calvin cycle | EST_7K12 | fructose-bisphosphate aldolase | T
2 | major CHO metabolism | | |
2.1 | major CHO metabolism.synthesis | | |
2.1.1 | major CHO metabolism.synthesis.sucrose | EST_5F03 | sucrose-phosphate synthase | T
2.1.2 | major CHO metabolism.synthesis.starch | EST_3A09 | putative ADP-glucose pyrophosphorylase large subunit |

Table 2: Example for a .xls mapping file.

Note: Some of the mapping files included in the package might be encrypted. Please contact Mark Stitt ([email protected]) to get one of the original mapping files.
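The role of the dot in the BINCODE can be illustrated with a short Python sketch that reads a tab-delimited mapping table and collects all identifiers belonging to a BIN and its sub-BINs. The helper names load_mapping and identifiers_in_bin are hypothetical, and the row with BINCODE 10 is invented for this example to show why matching on "1." rather than on "1" matters.

```python
import csv
import io


def load_mapping(text):
    """Parse a tab-delimited mapping table into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def identifiers_in_bin(rows, bincode):
    """Identifiers assigned to a BIN or to any of its sub-BINs."""
    prefix = bincode + "."  # "1." matches 1.1 and 1.3, but not 10
    return [
        row["IDENTIFIER"]
        for row in rows
        if row.get("IDENTIFIER")
        and (row["BINCODE"] == bincode or row["BINCODE"].startswith(prefix))
    ]


mapping_text = (
    "BINCODE\tNAME\tIDENTIFIER\tDESCRIPTION\tTYPE\n"
    "1\tPhotosynthesis\n"
    "1.1\tPhotosynthesis.lightreaction\tEST_1B05\tphotosystem II G protein\tT\n"
    "1.3\tPhotosynthesis.calvin cycle\tEST_11H01\tsubunit of rubisco\tT\n"
    "2.1.1\tmajor CHO metabolism.synthesis.sucrose\tEST_5F03\tsucrose-phosphate synthase\tT\n"
    "10\thypothetical class\tEST_X\tinvented row for illustration\tT\n"
)
rows = load_mapping(mapping_text)
print(identifiers_in_bin(rows, "1"))
print(identifiers_in_bin(rows, "2"))
```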

3.1.1 Add mappings to the "Mappings" folder

Once a new mapping file has been created, it can be loaded into MapMan:

(i) Right-click "Mappings" and then select "Add mapping file".

(ii) A box appears: select "from file" to add files individually, or "download" to add mapping files from the MapManStore server, which has updated mappings.

(iii) Select the appropriate mapping file, i.e. your newly created .txt, .xls or .xml file, from the source folder.

(iv) The mapping name is shown in the "Mappings" tree structure without the file extension (.txt or .xls).

(v) Double-click the mapping file to display it and browse through it. A mouse click on a (sub)class or identifier shows the information from the "DESCRIPTION" field of your mapping file as well as the "TYPE" of your spot.

3.2 Create your own pathway (Annotation of an image file)

(i) Select "Add pathway" from the "File" menu and select an image file from your directory. The new file will appear in the "Pathways" folder in the left pane.

(ii) To annotate the new image, right-click the image where you want to place your annotation and select "add" from the "Annotation" menu.

(iii) A dialog box opens in which you are asked to type in the numerical identifier (BINCODE) of the BINs/subBINs whose data should be displayed at this position, as defined in your mapping file.

The "Block Format" can be set by typing xa or ya (a = 1...n), e.g. x20 or y16. This format determines the arrangement of the boxes (if "default" or "points" is chosen as "Visualization Type").

Choose "histogram" as "Visualization Type" to view the data in a histogram frequency chart (3.3).

(iv) You can specify what kind of data you want to show. Currently ImageAnnotator supports four different kinds of data points, which can be nested. However, your mapping file has to support these different data types.
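One plausible reading of the Block Format from step (iii) is that xa places at most a boxes side by side before starting a new row, and ya stacks at most a boxes before starting a new column. The help text does not spell this out, so the Python sketch below is an interpretation only, and block_positions is a hypothetical helper.

```python
def block_positions(count, block_format):
    """Grid positions (column, row) for `count` boxes.

    Assumed meaning: "x20" fills rows of up to 20 boxes,
    "y16" fills columns of up to 16 boxes.
    """
    axis = block_format[:1].lower()
    limit = int(block_format[1:])
    if axis not in ("x", "y") or limit < 1:
        raise ValueError("block format must look like x20 or y16")
    positions = []
    for index in range(count):
        major, minor = divmod(index, limit)
        if axis == "x":
            positions.append((minor, major))  # fill a row, then wrap
        else:
            positions.append((major, minor))  # fill a column, then wrap
    return positions


print(block_positions(5, "x3"))
print(block_positions(3, "y2"))
```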

Annotated areas are marked by a dot. To adjust its position exactly, drag the dot with the mouse (hold down the left mouse button on an annotation dot and move the mouse), or click the annotation dot and then hold down the Alt key while using the arrow keys. Add as many annotation points to the image as you want.

All annotation points can be changed or deleted:

(i) To change the annotation text or to switch to the histogram frequency chart, right-click the point you want to change and select "Edit" from the menu. Type in your changes.

(ii) To delete an annotation, right-click the point and select "Delete".

3.3 Histogram frequency chart

The genes (enzymes, proteins) in a selected group can be treated as a population, and their collective response displayed as a frequency histogram. Genes (enzymes, proteins) that change by less than a filter value (e.g. between -0.33 and 0.33 on a log2 scale) are grouped into the central white bar. Genes that increase are displayed as a series of blue bars on the right-hand side (corresponding on this scale to changes of 0.33-0.99, 0.99-1.66, 1.66-2.33, 2.33-3.00 and >3.0 respectively), and genes that decrease are shown by a similar set of red bars on the left-hand side.
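The grouping described above can be written down as a small Python sketch using the example thresholds from the text. How values falling exactly on a threshold are assigned is an assumption here (a value equal to an upper threshold stays in the lower bar), and the names histogram_bar and frequency_histogram are hypothetical.

```python
from collections import Counter

# Example thresholds on a log2 scale, as quoted in the text above.
EDGES = (0.33, 0.99, 1.66, 2.33, 3.0)


def histogram_bar(value):
    """Signed bar index: 0 = central white bar,
    +1..+5 = blue bars (increase), -1..-5 = red bars (decrease)."""
    magnitude = abs(value)
    if magnitude < EDGES[0]:
        return 0
    bar = 1 + sum(1 for edge in EDGES[1:] if magnitude > edge)
    return bar if value > 0 else -bar


def frequency_histogram(values):
    """Count how many values fall into each bar, skipping absent ones."""
    return Counter(histogram_bar(v) for v in values if v is not None)


counts = frequency_histogram([0.1, -0.2, 0.5, None, 3.5, -1.2])
for bar in sorted(counts):
    print(bar, counts[bar])
```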

4 Structure of MapMan

MapMan relies on SCAVENGER modules to build mapping files. These SCAVENGER modules group data on metabolites, transcripts etc. into the BINs. The SCAVENGER modules are completely independent of the ImageAnnotator module and vice versa. The ImageAnnotator module uses mapping files from the SCAVENGER modules or user-built mapping files as its data source. It then paints the experimental data onto maps (images with accompanying XML files) according to the hierarchical structure of the mapping files.
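The data flow described here, in which a mapping assigns identifiers to BINs and experimental values are then grouped by BIN for display, can be sketched in Python. This is only a conceptual illustration, not the ImageAnnotator implementation; values_by_bin and its input shapes are assumptions of this example.

```python
from collections import defaultdict


def values_by_bin(mapping, data):
    """Group experimental values by BINCODE.

    mapping: iterable of (bincode, identifier) pairs from a mapping file.
    data: dict mapping identifier -> value, or None for absent values.
    Identifiers without a measured value are skipped.
    """
    grouped = defaultdict(list)
    for bincode, identifier in mapping:
        value = data.get(identifier)
        if value is not None:
            grouped[bincode].append(value)
    return dict(grouped)


mapping = [
    ("1.1", "EST_1B05"),
    ("1.2", "EST_2A08"),
    ("1.3", "EST_11H01"),
    ("1.3", "EST_7K12"),
]
experiment = {"EST_1B05": 1.9, "EST_11H01": -1.5, "EST_7K12": None}
print(values_by_bin(mapping, experiment))
```

Keeping the mapping and the experimental data separate in this way mirrors the independence of the two modules: either input can be replaced without changing the other.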

6 Contact information

Axel Nagel, [email protected] (application design, software development)

GABI Primary Database

RZPD Deutsches Ressourcenzentrum für Genomforschung GmbH

Heubnerweg 6

D-14059 Berlin

Bjoern Usadel, [email protected] (annotation computing, pathways)

Max Planck Institute of Molecular Plant Physiology

Am Muehlenberg 1

D-14476 Golm