By: Richa Sahni (09609072), Snigdha Nehru (09609117), Himanshu Johari (06104663)

Impliance: A Next Generation Information Management


8/8/2019 Impliance: A Next Generation Information Management

http://slidepdf.com/reader/full/impliance-a-next-generation-information-management 1/23



SQL

Today's systems take too long and require too much expertise to set up, configure, tune, test, deploy, maintain, and enhance.

They rarely exceed a few hundred nodes.

Redundant layers

Competing components

7/8/21010 Impliance: A Next Generation Information Management Appliance


A "next-generation" information management system

Currently being designed and prototyped at the IBM Almaden Research Center

"Outside In" design methodology

Integrated hardware and software components

Easy-to-administer appliance

To store, retrieve, and analyze all types of structured, semi-structured, and unstructured information

Low Total Cost of Ownership (TCO)


All Data Types: Need to store, manage, and uniformly query and transform ALL data, not just structured records

PDF, XML, text, audio, and video

Search, classify, aggregate, and analyze the content of semi-structured and unstructured data

Accelerating Data Volume Growth: Need to scale out as the volume of this data grows

Documents

Digital media

Sensors (RFID etc.)

Long-term business trend analysis


Total Cost of Ownership:

Software and hardware costs are falling.

TCO is dominated by labour costs, which include set-up, configuration, tuning, testing, deployment, maintenance, and enhancement.

Modularity and simplicity are needed right from inception to keep costs low.

Information Integration: one version of the "truth"

Individual data silos can be accessed, BUT queries need to run across the silos.

This requires one schema/format for data across the whole organization.


Hardware-Software Mismatch

Hardware has become more scalable:

low-power multi-core blade servers,

large memories and layers of large on-chip caches,

ultra-dense storage systems,

commodity low-latency networks.

Software is based on hardware designed decades ago.


Exploiting Customer Relationship Management

At least one center handles customer phone calls, e-mails, and/or web-page comments, questions, and complaints.

This is an opportunity to sell more products to existing and prospective customers.

It requires trained, motivated, and hence expensive operators.

Conversation text can be recorded, and the information extracted from the transcripts can be correlated with the profiles of similar customers.

The result is a customized offer to the customer through a combination of services and products.


Integrating Content and Data

The usual content management products have very limited awareness of the semantics of the content and limited capabilities to search it, usually restricting search to the content's metadata.

Impliance gives the ability to search the actual content and relate it to structured information from other sources.

E.g., insurance companies.

Legal Compliance

If an enterprise is involved in legal action with another enterprise, it needs to preserve broad classes of information that may be pertinent to the litigation.

Most of this information is unstructured.


Semantics:

In databases, what the data actually means is provided by humans via the logical schema.

It can also be derived automatically through text analytics and annotation, image recognition algorithms, etc.

Search/Query:

The data is too voluminous to use as a whole.

Use begins by obtaining a subset of the data that meets certain conditions on its content, context, or metadata.


Composition:

Relating objects to each other and composing them into new objects creates new information; this is at the heart of the value proposition of information management.

Needs well-defined schematics.

Aggregation:

To be consumable by humans, large bodies of data must be reduced through aggregation along various dimensions, to discover higher-level models, trends, and exceptions that facilitate business decisions.

Because most of the data is unstructured, it is difficult to analyze.

Impliance will make it easy.
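
As a rough illustration of aggregation along a dimension, the sketch below reduces a handful of records to per-region totals and then flags an exception. The records, the field names (region, quarter, amount), and the threshold are invented for the example; this is not Impliance's actual interface.

```python
from collections import defaultdict

# Hypothetical records; the field names are assumptions for illustration.
records = [
    {"region": "north", "quarter": "Q1", "amount": 120},
    {"region": "north", "quarter": "Q2", "amount": 80},
    {"region": "south", "quarter": "Q1", "amount": 40},
    {"region": "south", "quarter": "Q2", "amount": 10},
]


def aggregate(rows, dimension, measure):
    """Sum `measure` for each distinct value of `dimension`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


def exceptions(totals, threshold):
    """Return the keys whose total falls below `threshold`, sorted by name."""
    return sorted(key for key, total in totals.items() if total < threshold)


by_region = aggregate(records, "region", "amount")
low_regions = exceptions(by_region, threshold=100)
```

The same aggregate function can be pointed at a different dimension (for example quarter) without changing the data, which is the sense in which data is reduced "along various dimensions".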


The data infused into Impliance is mapped from its initial format to a uniform data model.

The query processing engine can then store it and execute queries over it.

The discovery process executes queries over the data and uses the results to derive annotations that are added to the data.

The end user uses an interactive retrieval interface to find the desired information, optionally making use of the annotations added by the discovery process.

The query processing engine does not "understand" the annotations; instead, it supports a mixed data/metadata model that relies on smart query construction.
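
The flow on this slide can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the Item class, the infuse, query, and discover functions, and the toy "complaint" rule are hypothetical names, not part of Impliance. The point is only that annotations are stored as ordinary metadata, so the query function needs no special knowledge of them.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Uniform representation: text content plus free-form metadata."""
    item_id: str
    content: str
    metadata: dict = field(default_factory=dict)


def infuse(item_id, raw, source_format):
    """Map incoming data from its initial format to the uniform model."""
    text = raw.decode("utf-8") if isinstance(raw, bytes) else str(raw)
    return Item(item_id, text, {"format": source_format})


def query(items, keyword=None, filters=None):
    """Match on content and metadata together; annotations are just metadata."""
    filters = filters or {}
    hits = []
    for item in items:
        if keyword is not None and keyword.lower() not in item.content.lower():
            continue
        if all(item.metadata.get(k) == v for k, v in filters.items()):
            hits.append(item)
    return hits


def discover(items):
    """Toy discovery pass: run a query, then annotate the results."""
    for item in query(items, keyword="complaint"):
        item.metadata["topic"] = "complaint"


store = [
    infuse("d1", b"Customer complaint about billing", "email"),
    infuse("d2", "Quarterly sales report", "pdf"),
]
discover(store)
complaints = [i.item_id for i in query(store, filters={"topic": "complaint"})]
billing_emails = [
    i.item_id for i in query(store, keyword="billing", filters={"format": "email"})
]
```

Here the "smart query construction" lives in the caller, which decides to filter on the topic annotation; the query function treats that key like any other metadata field.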


Supports economies of scale

Reduces the "time to value" (TTV):

Pre-installation: The necessary software is pre-installed and automatically detects which hardware components are available.

Reconfiguring: The system reconfigures itself if there are changes.

Better integration of different software components.

Tight integration among layers of software to improve efficiency.

Data Reduction/Pushing Down: Higher-level functionality such as aggregation and predicate application can be more easily "pushed down" closer to the storage for early data reduction.

Use of open source and commercial software to accelerate system development.
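
To make "pushing down" concrete, the sketch below compares two ways of summing order amounts over partitioned data. The partitions, field names, and function names are hypothetical; each inner list merely stands in for the rows held by one storage node. It shows only the general idea of early data reduction, not how Impliance implements it.

```python
# Hypothetical partitions; each inner list stands in for one storage node.
partitions = [
    [{"kind": "order", "amount": 30}, {"kind": "refund", "amount": 5}],
    [{"kind": "order", "amount": 70}, {"kind": "order", "amount": 20}],
]


def is_order(row):
    """Predicate to be applied either centrally or next to the storage."""
    return row["kind"] == "order"


def scan_without_pushdown(parts, predicate):
    """Ship every row to the upper layer, then filter and sum there."""
    shipped = [row for part in parts for row in part]
    total = sum(row["amount"] for row in shipped if predicate(row))
    return total, len(shipped)


def scan_with_pushdown(parts, predicate):
    """Filter and pre-aggregate per partition; ship one partial sum each."""
    partials = [
        sum(row["amount"] for row in part if predicate(row)) for part in parts
    ]
    return sum(partials), len(partials)


total_a, shipped_a = scan_without_pushdown(partitions, is_order)
total_b, shipped_b = scan_with_pushdown(partitions, is_order)
```

Both variants compute the same total, but the second moves two partial sums instead of four rows; with realistic data volumes that gap is what early data reduction is meant to exploit.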


Databases typically manage highly structured data with a common format and relatively small attributes, which conforms nicely to the tables of relational database systems.

Impliance unifies the management of all data under one umbrella.


It provides interfaces to search structured and unstructured content and metadata alike.

All types of data can be incorporated into Impliance.

Impliance treats each new version of a data item as immutable, which reduces the problem of determining whether any replica has the most recent version.
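
A minimal sketch of immutable versioning, assuming a simple in-memory store: the Version and VersionedStore names and their methods are hypothetical and chosen only for this example. An update never overwrites anything; it appends a new, frozen version with the next number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One immutable version of a data item."""
    item_id: str
    number: int
    content: str


class VersionedStore:
    """Append-only store: updates create new versions, never modify old ones."""

    def __init__(self):
        self._history = {}  # item_id -> list of Version, oldest first

    def put(self, item_id, content):
        versions = self._history.setdefault(item_id, [])
        version = Version(item_id, len(versions) + 1, content)
        versions.append(version)
        return version

    def latest(self, item_id):
        return self._history[item_id][-1]

    def get(self, item_id, number):
        return self._history[item_id][number - 1]


store = VersionedStore()
first = store.put("doc", "draft")
second = store.put("doc", "final")
```

Because a given (item_id, number) pair never changes, a replica holding version 1 is old rather than wrong, and comparing version numbers is enough to tell which copy is newer.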


Additional metadata will be extracted for each document by running different kinds of annotators.

Using schema mapping technologies, structures from different sources can be consolidated.

Additional relationships across documents can be identified by running various analyses on all pairs of documents.
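
One possible pairwise analysis is sketched below, using word overlap (Jaccard similarity) as a stand-in for whatever analyses a real deployment would run. The documents, identifiers, and threshold are invented for illustration, and the slide does not say which similarity measure Impliance uses.

```python
from itertools import combinations

# Hypothetical documents; identifiers and text are invented for illustration.
documents = {
    "claim-17": "water damage claim for policy 4411",
    "email-03": "customer asks about policy 4411 renewal",
    "memo-09": "office relocation schedule",
}


def tokens(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_pairs(docs, threshold):
    """Score every pair of documents and keep those at or above threshold."""
    token_sets = {doc_id: tokens(text) for doc_id, text in docs.items()}
    pairs = []
    for left, right in combinations(sorted(token_sets), 2):
        score = jaccard(token_sets[left], token_sets[right])
        if score >= threshold:
            pairs.append((left, right, round(score, 2)))
    return pairs


links = related_pairs(documents, threshold=0.15)
```

Comparing all pairs costs n(n-1)/2 comparisons for n documents, which is one reason such analyses suit a background discovery process rather than interactive queries.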


A typical Impliance installation will consist of several instances of Impliance deployed in geographically separated locations for disaster recovery as well as load balancing.

Impliance needs an efficient way of organizing the storage, the computations, and the topology of the underlying hardware.


Data nodes have direct ownership of a subset of the persistent storage and are the most efficient when performing operations on that storage.

Grid nodes perform analytic computations. They may be pulled into a "work crew" to perform long- or short-term operations, and have no long-term state.

Cluster nodes are responsible for making consistent locking and caching decisions on data within data consistency groups.
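
The three roles can be pictured with a small Python sketch. The class names mirror the slide, but every method, field, and value is a hypothetical simplification: real nodes would be separate machines, and the locking shown here is a single in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Owns a subset of persistent storage and scans it locally."""
    name: str
    owned: dict = field(default_factory=dict)  # key -> stored value

    def scan(self, predicate):
        return [value for value in self.owned.values() if predicate(value)]


@dataclass
class GridNode:
    """Stateless worker: computes over the values it is handed."""
    name: str

    def compute(self, values):
        return sum(values)


@dataclass
class ClusterNode:
    """Makes locking decisions for the keys in its consistency group."""
    name: str
    locks: dict = field(default_factory=dict)  # key -> current holder

    def acquire(self, key, holder):
        if self.locks.get(key, holder) != holder:
            return False  # someone else already holds the lock
        self.locks[key] = holder
        return True

    def release(self, key, holder):
        if self.locks.get(key) == holder:
            del self.locks[key]


data = DataNode("data-1", {"a": 4, "b": 9, "c": 15})
crew = [GridNode("grid-1"), GridNode("grid-2")]  # a small "work crew"
coordinator = ClusterNode("cluster-1")

granted = coordinator.acquire("a", holder="grid-1")
denied = coordinator.acquire("a", holder="grid-2")
large = data.scan(lambda value: value > 5)
total = crew[0].compute(large)
```

The split keeps long-term state in the data nodes only, so grid nodes can join or leave a work crew freely.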


Google Base?

A primary data store that allows various types of data to be published in a simple way.

Impliance focuses more on proactive information discovery, richer business analytics, and data management.

Database "Appliances" (Netezza, DATAllegro, etc.)?

Both offer appliances for business intelligence applications on relational data.

Impliance differs:

Not just structured (relational) data

Discovery of semantics

More proactive

Also, both Oracle Secure Enterprise Search (OSES) and IBM WebSphere Information Integrator enable many data types to be crawled, but the interface used was not as advanced as Impliance's.


Impliance

A box with software pre-installed

Delivered to the enterprise as an appliance or a service

Functions?

Store and manage all information: accept all types of enterprise data

Deliver all intelligence: integrate cross-silo information; advanced analytics with richer semantics

Properties?

Low TCO: easy to deploy ("plug & play"), simple and stable

Scalability: from SMB to very large (petabytes)

 

[Figure labels: Data + Content + Digital Media; relational data; SQL; content; JCR; XML; XSLT; web page; native retrieval interface; native update/load interface; HTTP; video; archive; ILM; ...]
