
__________________________________________________________________________________________________Fall, 2017 GCBA/MGCB/BMI 815

Tools and Algorithms in Bioinformatics
GCBA815/MCGB815/BMI815, Fall 2017

Week 11: IPA: Ingenuity Pathway Analysis

Babu Guda, PhD
Professor

Department of Genetics, Cell Biology and Anatomy

University of Nebraska Medical Center

(Some slides were borrowed from IPA Tutorial with permission)


Ingenuity Software
http://www.ingenuity.com

• A leading commercial software package for biological pathway and network analysis
• Ingenuity Knowledge Base (expert-curated information from the literature)
• Integrated with public databases (Entrez Gene, RefSeq, OMIM, UniProt, GO, HMDB, GNF, KEGG, IntAct, BIND, Clinical trials, DrugBank, etc.)
• Major products
  • Ingenuity Pathway Analysis (UNMC has a site license)
  • IPA Variant Analysis: To identify variants from sequencing data

Ingenuity Content

• Ingenuity Knowledge Base

The Ingenuity Knowledge Base
• The Ingenuity Ontology
• Ingenuity Findings
  – Ingenuity® Expert Findings: manually curated Findings that are reviewed, drawn from the full text, rich with contextual details, and derived from top journals.
  – Ingenuity® ExpertAssist Findings: automated text Findings that are reviewed, drawn from abstracts, timely, and covering a broad range of publications.
• Ingenuity Modeled Knowledge
  – Ingenuity® Expert Knowledge: content we model, such as pathways, toxicity lists, etc.
  – Ingenuity® Supported Third Party Information: content areas include protein-protein, miRNA, biomarker, clinical trial information, and others.

How can IPA help you?

• IPA
  – Deep pathway understanding of a single gene/protein
    • Drug/therapeutic target discovery
  – Biological understanding of large data sets, including
    • Transcriptomics: differential gene expression (array and RNA-seq)
      – IsoProfiler: filter for transcript expression and annotation of interest
    • Proteomics: differential protein expression
    • Phosphoproteomics: differential protein phosphorylation
    • Genes with loss/gain-of-function variants
    • Metabolomics
    • miRNA expression
    • Methylation
    • Gene lists
      – ChIP-Seq
      – siRNA screening


Different types of analyses

• Browsing the database
  • Genes, molecules, diseases, pathways and functions using the IPA knowledgebase
• Core Analysis
  • Gene expression changes, cellular processes
  • Upstream regulator analysis
  • Pathway analysis (signaling, metabolic, disease)
  • Interaction network analysis
• Explorative Analysis
  • Grow/build networks from seed molecules
  • Mechanistic network analysis
• Human Isoform view
• Molecule activity predictor (MAP)
  • Predicted cascading effects of altering one molecule in a pathway
  • Hypothesis generation

Identify key cellular pathways most likely to be affected

• Canonical Pathways Analysis
  • Which metabolic and cell signaling pathways show significant enrichment for a group of genes?
  • What are the predicted upstream and/or downstream effects of activation or inhibition of molecules in a pathway, given molecules with "known" activity? (Molecule Activity Predictor)
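The enrichment question above can be illustrated with a small calculation. The slides do not state which test IPA applies, so the sketch below assumes a right-tailed Fisher's exact test (a hypergeometric tail probability), which is a common choice for pathway over-representation. The function name and the example counts are hypothetical.

```python
from math import comb


def right_tailed_enrichment_p(overlap, pathway_size, dataset_size, background_size):
    """Probability of seeing at least `overlap` pathway genes in the dataset by chance.

    Assumes genes are drawn without replacement from a background of
    `background_size` genes, of which `pathway_size` belong to the pathway.
    """
    max_overlap = min(pathway_size, dataset_size)
    total = comb(background_size, dataset_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, dataset_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Made-up counts: 8 of 200 dataset genes fall in a 50-gene pathway,
# against a background of 20,000 genes.
p_value = right_tailed_enrichment_p(8, 50, 200, 20000)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value here only says the overlap is larger than expected by chance; it says nothing about whether the pathway is activated or inhibited.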

Identify likely upstream regulators and their activity state

• Upstream Analysis
  • Use published experimental molecular interactions to identify upstream regulators
  • Identify upstream regulators by determining gene enrichment in downstream genes
  • Predict the activity state of regulators by correlating literature-reported effects with observed gene expression

Create de novo pathways of regulators and genes

• Upstream Analysis: Mechanistic Networks
  • Identify potential upstream regulator signal transduction
  • Using shared downstream gene effects and gene-gene interactions, pathways (mechanistic networks) are created.

Visualize and predict the biological impact of gene expression changes

• Diseases & Functions
  • Identify key biological processes influenced by differentially expressed genes
  • Understand whether cellular processes are being driven up or down by correlating observed expression with reported experimental gene effects

• Each box represents a biological process or disease
• The size of the box represents gene enrichment
• The color of the box indicates the predicted increase or decrease

Connect upstream, downstream analyses

• Regulator Effects
  • Hypothesis for how a phenotype, function or disease is regulated in the dataset by activated or inhibited upstream regulators
  • Explain impact of upstream molecules on downstream biology
  • Explain potential mechanism for a phenotype or drug
  • Define drug targets
  • Discover novel (or confirm known) regulator → disease/phenotype/function relationships

Discover additional interconnectivity within your data

• Networks
  • To show as many interactions as possible between user-specified molecules in a given data set and how they might work together at the molecular level
  • Highly interconnected networks are likely to represent significant biological function


Some special features of IPA

• Search feature against the entire IPA Knowledgebase
• Molecule pages sorted by species (human/mouse)
• Directly access PubMed records for each annotation
• Share your analysis results with other IPA registered users
  • Can also share only part of the result
• Import external pathways into IPA (in SBML format)
• Use Path Designer to edit pathway/network diagrams
• MicroRNA target filter
• Biomarker analysis
• Export data and images for reports/publications


After Login – Landing page


Upload the data

• Data in a structured format (Excel, tab-delimited, etc.)
• Limit the header to only the first row
• Keep the 'flexible format' option (default)
• Identifier type: Automatic, but change if inappropriate
• Select only the columns that you want to upload and ignore the rest
• Select appropriate units like fold change, z-score, p-value, etc.
• IPA prompts you to create a new project; all your analysis results will be stored in this project folder, which can be accessed from the left panel
• General default settings are good unless you want to customize
• Very detailed step-by-step directions are available at
  • http://ingenuity.force.com/ipa/IPATutorials?id=kA150000000TPy1#

What gets uploaded to IPA?

• IPA Core Analysis input
  – RNA-seq, microarray, miRNA, proteomic, genomic, SNP, or metabolic data
  – Measurement calculations (e.g. differential expression and significance) are made outside of IPA prior to upload
  – Observation: for a given experimental condition…
    • A list of molecule identifiers (gene, protein, etc.)
    • Corresponding measurement values (fold change, p-value, etc.)
  – Single-observation data sets: one experimental comparison
    • Case vs. control
    • Mutant vs. wild-type
    • Treated vs. untreated
  – Multiple-observation data sets: more than one experimental condition
    • A time course experiment with multiple time points
    • Dose response experiment with multiple doses
    • Measurement of multiple cell types or disease subtypes
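As an illustration of a multiple-observation data set, the sketch below merges per-condition measurements into one wide table with a single identifier column. The function name, the column labels, the gene identifiers and all values are made up for the example; the limit of 20 observations is the one quoted in the upload best practices of this lecture.

```python
def merge_observations(observations, max_observations=20):
    """Combine per-condition measurements into one table keyed by identifier.

    `observations` maps an observation name (e.g. a time point) to a dict of
    identifier -> (log_ratio, p_value). Missing measurements are left blank.
    """
    if len(observations) > max_observations:
        raise ValueError(f"more than {max_observations} observations")

    header = ["ID"]
    for name in observations:
        header += [f"{name} Expr Log Ratio", f"{name} Expr p-value"]

    all_ids = sorted({mol_id for obs in observations.values() for mol_id in obs})
    rows = []
    for mol_id in all_ids:
        row = [mol_id]
        for obs in observations.values():
            log_ratio, p_value = obs.get(mol_id, ("", ""))
            row += [log_ratio, p_value]
        rows.append(row)
    return header, rows


# Hypothetical time course with two time points.
time_course = {
    "6h": {"GENE_A": (1.5, 0.001), "GENE_B": (-0.7, 0.03)},
    "24h": {"GENE_A": (2.1, 0.0005)},
}
header, rows = merge_observations(time_course)
```

Keeping one row per identifier and one block of columns per observation matches the "one header row, one ID column" layout that the upload step expects.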


Data upload format examples

ID (required)
  Identifier examples: Array IDs, dbSNP, Ensembl, Entrez Gene, GenBank, IPI, KEGG, PubChem, RefSeq, UniProt, …

Measurements (recommended)
  Directional comparisons: Expr Ratio, Expr Fold Change, Expr Log Ratio, Variant Loss/Gain, Phospho Ratio, Phospho Fold Change, Phospho Log Ratio
  Other measurements: Expr p-value, Expr FDR (q-value), Expr Intensity/RPKM/FPKM, Variant ACMG Classification, Phospho p-value, Phospho FDR (q-value), Phospho Intensity, Phospho Site

Additional Observations (optional)
  Experimental comparison examples: mutant vs. wild-type, treated vs. untreated, other case vs. control
  Additional time points, multiple dose responses, various cell lines

Data Upload

• Best practices
  – Calculate metrics outside of IPA (e.g. fold-change, p-value)
  – Create an Excel spreadsheet or tab-delimited file
    • Only 1 header row allowed
    • One column must have identifiers, preferably the left-most column
    • IPA will only look at the top worksheet in an Excel workbook
  – Group related observations into a single spreadsheet if possible
    • Time course, drug concentration, cell lines, etc.
    • Can have up to 20 observations
  – Specify array platform (chip) if possible
    • It is OK to use "Not specified/applicable"
  – Pre-filter data at the lowest threshold that you have confidence in
    • For example, probe measurement p-value of 0.05 or other criteria
    • Further filter in Core Analysis
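The pre-filtering and file-format advice above can be sketched with the Python standard library. This is a minimal example, not part of IPA: the helper names, column labels, identifiers and values are hypothetical, and the 0.05 cutoff is just the example threshold mentioned in the bullet.

```python
import csv
import io


def prefilter(rows, p_column, p_max=0.05):
    """Keep only rows whose p-value (at index `p_column`) is at or below `p_max`."""
    return [row for row in rows if float(row[p_column]) <= p_max]


def to_tab_delimited(header, rows):
    """Render one header row plus data rows as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)  # exactly one header row
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up measurements; identifiers sit in the left-most column.
header = ["ID", "Expr Log Ratio", "Expr p-value"]
rows = [
    ["GENE_A", 1.8, 0.001],
    ["GENE_B", -0.2, 0.40],
    ["GENE_C", -2.3, 0.049],
]
kept = prefilter(rows, p_column=2)
text = to_tab_delimited(header, kept)
print(text)
```

Filtering loosely here and tightening the cutoffs later in Core Analysis avoids having to re-upload the data set when a threshold changes.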


Running Core Analysis

• Mapped IDs: Identified by the IPA database
• Unmapped IDs: Unidentified by the IPA database
  • Check for typos and try to map as many as possible
• Analysis-ready molecules: Identified and also contain information in the database
• Run the analysis for individual observations
• If more than one observation is selected, analysis is run independently for each and saved under separate names such as control, wildtype, treated, untreated, etc.

Core Analysis Steps

1. Launch Core Analysis: File > New > Core Analysis
2. Upload Data (gene expression, protein expression, metabolomics, etc.)
3. Set Core Analysis Settings and Run Analysis
4. Interpret Results

Expression Value Calculation

• Verify the differential expression calculations
  – Ratio differential expression
  – Log2(ratio) differential expression (recommended)
  – Fold Change
    • If increased differential expression
    • If decreased differential expression

Fold change will never have values between -1 and 1
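The three ways of expressing differential expression can be converted into one another. The slide's formulas did not survive extraction, so the sketch below assumes the usual signed fold-change convention that is consistent with the statement "fold change will never have values between -1 and 1": the ratio itself when expression increases, and the negative reciprocal of the ratio when it decreases. The function names are hypothetical.

```python
import math


def ratio_to_fold_change(ratio):
    """Signed fold change: ratio if ratio >= 1, otherwise -1/ratio (assumed convention)."""
    if ratio <= 0:
        raise ValueError("expression ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio


def log2_ratio_to_fold_change(log2_ratio):
    """Convert a log2(ratio) value to signed fold change."""
    return ratio_to_fold_change(2.0 ** log2_ratio)


def fold_change_to_log2_ratio(fold_change):
    """Convert signed fold change back to log2(ratio)."""
    if -1 < fold_change < 1:
        raise ValueError("signed fold change cannot lie between -1 and 1")
    ratio = fold_change if fold_change >= 1 else -1.0 / fold_change
    return math.log2(ratio)


# A halving of expression: ratio 0.5, log2(ratio) -1, fold change -2.
print(ratio_to_fold_change(0.5), log2_ratio_to_fold_change(-1.0))
```

Log2(ratio) is symmetric around zero, which is one reason it is the recommended input; signed fold change has a gap between -1 and 1, so a value inside that gap usually means a ratio was uploaded under the wrong column type.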

Gene/Protein Expression Analysis, Overview

• IPA Core Analysis
  – Pathway Analysis
    • Identifies enriched canonical pathways and scores directional changes based on gene expression
  – Upstream Regulator Analysis
    • Predicts which regulators caused changes in gene expression and the directional state of each regulator
  – Diseases and Functions Analysis
    • Predicts affected biology (cellular processes, biological functions) based on gene expression and predicts the directional change of that effect
  – Regulator Effects
    • Models pathway interactions from predicted upstream regulators, through differentially expressed genes, to biological processes
  – Networks
    • Predicts non-directional gene interaction map

Canonical Pathway Z-scores

• Pathway activity analysis
  – Allows you to quickly determine if Canonical Pathways, including functional end-points, are increased or decreased based on differentially expressed genes or proteins in your data set
  – Certain pathways within the knowledge base are directional (proceed from "A" to "Z")
  – As part of pathway curation, a subset of genes is selected to be active
    • Allows the directionality of other genes to be predicted
    • Result defines an "activated" state for a given pathway
  – Z-scores are calculated based on the data set's correlation with the activated state
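The idea of scoring a data set against a pathway's "activated" state can be sketched as follows. The slides do not give the formula, so this is a simplified, unweighted version that assumes z = (consistent - inconsistent) / sqrt(N), where N is the number of pathway genes with a non-zero measurement. The function name, gene labels and values are hypothetical, and IPA's actual calculation may differ.

```python
import math


def activation_z_score(expected_direction, observed_log_ratio):
    """Simplified activation z-score (assumed formula, not IPA's exact method).

    `expected_direction` maps gene -> +1 or -1 in the pathway's activated state.
    `observed_log_ratio` maps gene -> measured log ratio in the data set.
    Returns None when no pathway gene has a usable measurement.
    """
    consistent = 0
    inconsistent = 0
    for gene, direction in expected_direction.items():
        value = observed_log_ratio.get(gene)
        if value is None or value == 0:
            continue  # unmeasured or unchanged genes carry no directional evidence
        observed_sign = 1 if value > 0 else -1
        if observed_sign == direction:
            consistent += 1
        else:
            inconsistent += 1
    total = consistent + inconsistent
    if total == 0:
        return None
    return (consistent - inconsistent) / math.sqrt(total)


# Made-up pathway with four genes, all changing as in the activated state.
expected = {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1, "GENE_D": 1}
observed = {"GENE_A": 1.2, "GENE_B": 0.8, "GENE_C": -2.0, "GENE_D": 1.5}
z = activation_z_score(expected, observed)
```

A positive score means the data agree with the activated state and a negative score means they agree with the opposite (inhibited) state; scores near zero indicate mixed evidence.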

Pathway Activity Analysis

[Figure: example pathway in three panels: Expected Activation State (Knowledge Base), Positive Z-score (Example Data), Negative Z-score (Example Data)]

IPA Upstream Regulator Analysis

• Directional Effects: Molecule Activity Predictor

Examine Expression Relationship Consistency


Network/Pathway Analysis

• Grow: Connected molecules for a selected molecule can be grown based on the known connections in the literature
• Trim: Connections can be removed from a larger network based on certain criteria
• Connect: Two sets of unconnected genes can be connected via the shortest path
• Path Explorer: Above tools can be used to explore pathways
• Path Designer: Offers graphical representation of molecules for generating publication-quality figures
• Canonical pathways can be selected and genes from the dataset can be overlaid
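The "Connect" idea of joining two sets of molecules via the shortest path can be illustrated with a breadth-first search over an interaction graph. This is a generic sketch, not IPA's implementation: the function name, the toy graph and the node labels are all made up.

```python
from collections import deque


def shortest_connection(graph, sources, targets):
    """Return the shortest path from any source node to any target node, or None.

    `graph` maps each node to an iterable of its neighbours (unweighted edges).
    """
    targets = set(targets)
    parent = {source: None for source in sources}
    queue = deque(parent)
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


# Toy undirected interaction graph with placeholder molecule names.
toy_graph = {
    "A": ["B", "E"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["A", "D"],
    "F": [],
}
path = shortest_connection(toy_graph, sources=["A"], targets=["D"])
```

Breadth-first search is enough here because every interaction counts as one step; a weighted search would be needed if edges carried confidence scores.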