Wright State University CORE Scholar
Kno.e.sis Publications, The Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis)
6-15-2010

Semantically Annotated RESTful Services for Large-scale Metabolomics Data Analysis

Ashwin Manjunatha, Wright State University - Main Campus
Paul E. Anderson, Wright State University - Main Campus
Satya S. Sahoo, Wright State University - Main Campus
Ajith H. Ranabahu, Wright State University - Main Campus
Michael L. Raymer, Wright State University - Main Campus, [email protected]
See next page for additional authors

Follow this and additional works at: http://corescholar.libraries.wright.edu/knoesis

Part of the Bioinformatics Commons, Communication Technology and New Media Commons, Databases and Information Systems Commons, OS and Networks Commons, and the Science and Technology Studies Commons

This Presentation is brought to you for free and open access by The Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis) at CORE Scholar. It has been accepted for inclusion in Kno.e.sis Publications by an authorized administrator of CORE Scholar. For more information, please contact [email protected].

Repository Citation
Manjunatha, A., Anderson, P. E., Sahoo, S. S., Ranabahu, A. H., Raymer, M. L., & Sheth, A. P. (2010). Semantically Annotated RESTful Services for Large-scale Metabolomics Data Analysis. http://corescholar.libraries.wright.edu/knoesis/96




Authors: Ashwin Manjunatha, Paul E. Anderson, Satya S. Sahoo, Ajith H. Ranabahu, Michael L. Raymer, and Amit P. Sheth

This presentation is available at CORE Scholar: http://corescholar.libraries.wright.edu/knoesis/96


"GTPS" is acronym of Gene Trek in Procaryote Space. Various complete genomes of eubacteria and archaea have been registered in the International <span class="sem-class" title="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Nucleotides">Nucleotide </span> Sequence Databases (INSD) of DDBJ/EMBL/GenBank. The annotation and sequence data are available from GIB (Genome Information Broker; http://gib.genes.nig.ac.jp/).

Semantically Annotated RESTful Services for Large-scale Metabolomics Data Analysis Ashwin Manjunatha, Paul Anderson, Satya Sahoo, Ajith Ranabahu, Michael Raymer, Amit Sheth

Kno.e.sis Center, Wright State University {ashwin,paul,satya,ajith,raymer,amit}@knoesis.org

1. Introduction

2. What is the problem?

3. How about scalability?

4. Annotation and SA-REST: adding metadata to point to richer models

5. Advantages of Annotation

6. What is the bottom line for the biologist?

7. Tools

8. References

• Large data sets
• Standard post-instrumental processing
• Quantification of spectral features
• Normalization
• Scaling
• Multivariate statistical modeling

• All are computationally intensive processes
• A variety of algorithms exists for each step

Need a robust and flexible analysis platform

A common solution for flexibility

Move to a service-based architecture!
• Provide Web Services for each algorithm
• Assemble workflows as required

Taverna: an open source family of tools for designing and executing workflows

Metabolomics
The term metabolomics is defined as a comprehensive analysis in which metabolites of a biological system are identified and quantified. Any technique that can quantify metabolites can be used for metabolomics, but there are two primary techniques seen in the literature: nuclear magnetic resonance (NMR) and mass spectrometry with a prior on-line separation step such as high performance liquid chromatography (HPLC) or gas chromatography (GC). Neither technique is strictly superior; each has its own advantages and disadvantages. Existing applications include the identification of biomarkers associated with responses to toxins and pathophysiologic changes, sample classification based on the type of toxic exposure, large scale human studies, clinical diagnosis, and the study of genetic disorders.

Hadoop
An open-source software framework for reliable, scalable, distributed computing. [http://hadoop.apache.org]
• Uses the map-reduce computational paradigm
• Runs on computing clouds

Computing Cloud
Shared hardware resources, software and information are provided to computers and other devices on demand.
• Many vendors

Use Apache Hadoop on computing clouds to run processes in parallel. This approach is applicable to many common mathematical operations, such as summing and averaging.
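As an illustration of why summing and averaging fit the map-reduce paradigm, the sketch below computes a per-key mean in plain Python. It is not Hadoop code and uses no Hadoop API; the `chunks` data, the `bin_1`/`bin_2` keys, and all function names are assumptions made up for this example. The point is that each chunk is mapped to partial (sum, count) pairs independently, and the partial results can be merged in any order.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical input: (spectral_bin, intensity) records split into chunks,
# standing in for the input splits a cluster would give to separate workers.
chunks = [
    [("bin_1", 2.0), ("bin_2", 4.0)],
    [("bin_1", 4.0), ("bin_2", 8.0)],
    [("bin_1", 6.0)],
]


def map_chunk(chunk):
    """Map step: produce a partial (sum, count) per key for one chunk."""
    partial = defaultdict(lambda: (0.0, 0))
    for key, value in chunk:
        total, count = partial[key]
        partial[key] = (total + value, count + 1)
    return dict(partial)


def reduce_partials(left, right):
    """Reduce step: merge two partial results by adding sums and counts."""
    merged = dict(left)
    for key, (total, count) in right.items():
        prev_total, prev_count = merged.get(key, (0.0, 0))
        merged[key] = (prev_total + total, prev_count + count)
    return merged


def mean_per_key(all_chunks):
    """Combine the map and reduce steps, then divide sum by count per key."""
    # Each map_chunk call is independent, so it could run on a separate worker.
    partials = map(map_chunk, all_chunks)
    totals = reduce(reduce_partials, partials, {})
    return {key: total / count for key, (total, count) in totals.items()}


means = mean_per_key(chunks)
print(means)
```

Carrying (sum, count) pairs rather than partial means is the design choice that makes the reduce step order-independent: a mean of means would be wrong when chunks have different sizes.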

Faceted Search
A technique for accessing a collection of information represented using a faceted classification, allowing users to explore by filtering the available information. When documents are annotated with richer models, the indexing software can easily create faceted indexes to support fine-grained search. Even regular keyword search can be improved.

1. Query by concept, not by keyword. Search for “NCI:FASTA” instead of just FASTA. This yields documents that indicate the term FASTA as defined by the NCI Thesaurus.

2. Filter by multiple facets. Issue queries indicating many facets, say “type:soap binding:java include:NCI:FASTA”, to look for service descriptions that are SOAP services with Java bindings and that mention NCI:FASTA.
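The multi-facet query above can be sketched as a simple in-memory filter. This is only an illustration of the idea, not the poster's indexing framework (which is built on Apache Lucene); the record layout, the service names, and the `NCI:Spectrum` concept identifier are hypothetical.

```python
# Hypothetical index records; field names and values are assumptions.
SERVICES = [
    {"name": "AlignService", "type": "soap", "binding": "java",
     "concepts": {"NCI:FASTA"}},
    {"name": "FormatService", "type": "rest", "binding": "java",
     "concepts": {"NCI:FASTA"}},
    {"name": "PeakService", "type": "soap", "binding": "java",
     "concepts": {"NCI:Spectrum"}},
]


def parse_query(query):
    """Split each 'facet:value' token on the first colon only,
    so 'include:NCI:FASTA' keeps the concept 'NCI:FASTA' intact."""
    facets = []
    for token in query.split():
        facet, _, value = token.partition(":")
        facets.append((facet, value))
    return facets


def matches(record, facets):
    """A record matches only if every facet in the query is satisfied."""
    for facet, value in facets:
        if facet == "include":
            # 'include' tests for an annotated concept, not a keyword.
            if value not in record["concepts"]:
                return False
        elif record.get(facet) != value:
            return False
    return True


def search(query, records=SERVICES):
    """Return the names of all records that satisfy the whole query."""
    facets = parse_query(query)
    return [record["name"] for record in records if matches(record, facets)]


print(search("type:soap binding:java include:NCI:FASTA"))
```

Dropping a facet widens the result set: `search("include:NCI:FASTA")` returns both services annotated with that concept, regardless of type.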

Semi-Automated Composition
When service interface documents are annotated, service compositions can be done more intelligently.

1. A composition tool can warn the creator of incompatible connections: "Output of Service A cannot be input to Service B!"

2. Supplement transformations by suggesting matching elements: create transformations, or indicate the difficulty of the transformation to the human (see Mediatability [1]).
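A minimal sketch of the first point, assuming each service description carries one annotated input concept and one annotated output concept. The `Service` class, the service names, and the `ex:` concept identifiers are hypothetical, and exact string equality stands in for whatever concept matching a real composition tool would perform against an ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical annotated service: one input and one output concept."""
    name: str
    input_concept: str
    output_concept: str


def check_connection(producer, consumer):
    """Return None if the connection is compatible, else a warning string."""
    if producer.output_concept == consumer.input_concept:
        return None
    return (f"Output of {producer.name} ({producer.output_concept}) "
            f"cannot be input to {consumer.name} ({consumer.input_concept})")


def check_workflow(services):
    """Check every adjacent pair in a linear workflow and collect warnings."""
    warnings = []
    for producer, consumer in zip(services, services[1:]):
        warning = check_connection(producer, consumer)
        if warning is not None:
            warnings.append(warning)
    return warnings


normalize = Service("Normalize", "ex:RawSpectrum", "ex:NormalizedSpectrum")
scale = Service("Scale", "ex:NormalizedSpectrum", "ex:ScaledSpectrum")
classify = Service("Classify", "ex:NormalizedSpectrum", "ex:ClassLabel")

# Scale produces ex:ScaledSpectrum, but Classify expects ex:NormalizedSpectrum.
for message in check_workflow([normalize, scale, classify]):
    print(message)
```

The second point, suggesting transformations between mismatched elements, would need more than this equality test and is not attempted here.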

Firefox Plug-in
Annotate web pages inside the browser and submit them to the index.

Indexing/Search Framework
1. Built using the technology made for faceted classification of Web APIs [2].
2. Multiple Apache Lucene indexes in the back end.

1. K. Gomadam, A. Ranabahu, L. Ramaswamy, A. P. Sheth. Mediatability: Estimating the degree of human involvement in XML schema mediation.

2. K. Gomadam, A. Ranabahu, M. Nagarajan, A. P. Sheth, K. Verma. A Faceted Classification Based Approach to Search and Rank Web APIs.

3. SA-REST: Semantic Annotation of Web Resources. W3C member submission by Wright State University. http://www.w3.org/Submission/SA-REST/

Better Search for Biological Web Services
Services can be searched with more precise terms and concepts. Search by ontology concept and add facets for precise filtering.

Convenience in Creating Workflows
Find and mash services together with ease. The tools can suggest the degree of match and also create data mappings. Workflows can be built graphically and then executed with a point and click. There is no need to download, install, and configure a number of applications.

Faster Processing and Result Generation
The backend services can be cloud based, providing results much faster than any single computer.

No Need for Heavy In-house Computing Facilities
Use services that are hosted on clouds and avoid equipment costs and the hassle of hardware maintenance. The pay-per-use pricing model is convenient for sporadic usage.

SA-REST
W3C member submission on Semantic Annotation of RESTful services [3]. Three basic properties:
• domain-rel: marks the top-level domain of a document, e.g. Nucleotides
• sem-rel: marks the domain of a linked document
• sem-class: marks the meaning of a selected word
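To show how an indexer might read such annotations, the sketch below extracts sem-class annotations from an HTML fragment using Python's standard `html.parser` module. It assumes the markup form seen in the annotated "GTPS" excerpt earlier on this poster, where the `class` attribute names the property and the `title` attribute holds the concept URI. The `SemClassExtractor` class is hypothetical, and nested `<span>` elements are not handled.

```python
from html.parser import HTMLParser


class SemClassExtractor(HTMLParser):
    """Collect (term, concept URI) pairs from <span class="sem-class"> tags.

    Simplification: a span nested inside an annotated span is not handled.
    """

    def __init__(self):
        super().__init__()
        self.annotations = []   # list of (term, concept URI) tuples
        self._concept = None    # URI of the annotated span being read, if any
        self._text = []         # text pieces seen inside that span

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "span" and "sem-class" in classes and self._concept is None:
            self._concept = attr_map.get("title")
            self._text = []

    def handle_data(self, data):
        if self._concept is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._concept is not None:
            term = "".join(self._text).strip()
            self.annotations.append((term, self._concept))
            self._concept = None


# Fragment taken from the annotated example on this poster.
fragment = (
    'registered in the International <span class="sem-class" '
    'title="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Nucleotides">'
    'Nucleotide </span> Sequence Databases (INSD)'
)

extractor = SemClassExtractor()
extractor.feed(fragment)
print(extractor.annotations)
```

The extracted pairs are the kind of input a faceted index could use to answer concept queries instead of keyword queries.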

Toxicology
Toxicology is the branch of pharmacology that deals with poisons and their effects on plant, animal and human life.

http://www.taverna.org.uk/

NMR
Nuclear magnetic resonance (NMR) spectroscopy is an experimental technique that exploits the properties of an atom’s nucleus. It can be used to obtain information about the concentration and structure of molecules. NMR studies magnetic nuclei by applying a static magnetic field and then a second, oscillating magnetic field. Only nuclei with an odd number of protons or neutrons can be measured using NMR; the two most commonly studied nuclei are ¹H and ¹³C.

Figure: NMR Spectrometer

Figure: an annotation links an ontology concept with a term in a web page.