8/8/2019 Impliance: A Next Generation Information Management
http://slidepdf.com/reader/full/impliance-a-next-generation-information-management 1/23
By: Richa Sahni (09609072)
Snigdha Nehru (09609117)
Himanshu Johari (06104663)
SQL
Takes too long and requires too much expertise to:
set up,
configure,
tune,
test,
deploy,
maintain,
and enhance.
Rarely exceed a few hundred nodes.
Redundant layers
Competing components
7/8/21010Impliance : A Next Generation Information Management Appliance 2
A "next-generation" information management system
Currently being designed and prototyped at the IBM Almaden Research Center
"Outside In" design methodology
Integrated hardware and software components
Easy-to-administer appliance
To store, retrieve, and analyze all types of structured, semi-structured, and unstructured information
Low Total Cost of Ownership (TCO)
All Data Types: Need to store, manage, and uniformly query and transform ALL data, not just structured records
PDF, XML, text, audio, and video
Need to search, classify, aggregate, and analyze the content of semi-structured and unstructured data
Accelerating Data Volume Growth: Need to scale out as the volume of this data grows
Documents
Digital media
Sensors (RFID etc.)
Long-term business trend analysis
Total cost of ownership:
The cost of software and hardware is falling
TCO is dominated by labour costs, which include the effort to:
set up,
configure,
tune,
test,
deploy,
maintain,
and enhance
Need for modularity and simplicity right from inception, leading to low costs.
Information Integration: one version of "truth"
You can access the various data silos individually, BUT
you need to query across them.
Requires one schema/format of data across the whole organization
Hardware-Software Mismatch
Hardware has become more scalable:
low-power, multi-core blade servers
large memories and layers of large on-chip caches
ultra-dense storage systems
commodity low-latency networks
Software is still based on hardware designed decades ago
Exploiting Customer Relationship Management
Most enterprises have at least one center for handling customer phone calls, e-mails, and/or web-page comments, questions, and complaints.
This is an opportunity for selling more products to existing and prospective customers.
Requires trained, motivated, and hence expensive operators.
The system can record the text of conversations and correlate the information extracted from the transcripts with the profiles of similar customers.
This enables a customized offer to a customer through a combination of services and products.
Integrating Content and Data
The usual content management products have very limited awareness of the semantics of the content or capabilities to search it, usually restricting search to the content's metadata.
Impliance gives the ability to search the actual content and relate it to structured information from other sources.
E.g. insurance companies.
Legal Compliance
If an enterprise is involved in legal action with another enterprise, it needs to preserve broad classes of information that may be pertinent to the litigation.
Most of this information is unstructured.
Semantics:
In databases, what the data actually means is provided by humans via its logical schema.
For other data, it can be derived automatically through text analytics and annotation, image recognition algorithms, etc.
Search/Query:
Data is too voluminous to consume in full.
Use begins by obtaining a subset of the data that meets certain conditions on its content, context, or metadata.
Composition.
Relating objects to each other and composing them into new objects creates new information that is at the heart of the value proposition of information management.
Needs well-defined schematics
Aggregation.
To be consumable by humans, large bodies of data must be reduced through aggregation along various dimensions, to discover higher-level models, trends, and exceptions that facilitate business decisions.
Because most of the data is unstructured, it becomes difficult to analyze.
Impliance will make it easy.
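As a rough illustration of aggregation along a dimension, the sketch below sums a measure for each value of a chosen dimension. The record fields (region, product, amount) and the function name are hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical records; the field names are illustrative assumptions only.
sales = [
    {"region": "north", "product": "A", "amount": 120},
    {"region": "north", "product": "B", "amount": 80},
    {"region": "south", "product": "A", "amount": 50},
    {"region": "south", "product": "B", "amount": 30},
]

def aggregate(records, dimension, measure):
    """Sum `measure` over each distinct value of `dimension`."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec[dimension]] += rec[measure]
    return dict(totals)

by_region = aggregate(sales, "region", "amount")
by_product = aggregate(sales, "product", "amount")
```

The same records reduce to different summaries depending on the dimension chosen, which is what lets a reader spot trends and exceptions without scanning every row.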
The data infused into Impliance is mapped from its initial format to a uniform data model.
The query processing engine can now store it and execute queries over it.
The discovery process executes queries over the data and uses the results to derive annotations that are added to the data.
The end user uses an interactive retrieval interface to find the desired information, optionally making use of the annotations added by the discovery process.
The query processing engine does not "understand" the annotations; instead, it supports a mixed data/meta-data model that relies on smart query construction.
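The flow described above can be sketched in a few lines. This is a toy model under stated assumptions, not Impliance's design: the item layout, the function names (`ingest`, `query`, `discover_emails`), and the e-mail annotator are all hypothetical stand-ins.

```python
import re

store = []  # stand-in for storage: a plain list of items in one uniform shape

def ingest(raw, source_format):
    """Map an incoming item to a single uniform shape (hypothetical model)."""
    item = {"id": len(store), "format": source_format,
            "body": str(raw), "annotations": {}}
    store.append(item)
    return item

def query(predicate):
    """The engine only evaluates predicates; it has no special
    knowledge of what an annotation means."""
    return [item for item in store if predicate(item)]

def discover_emails():
    """Discovery: query the data, then write derived annotations back."""
    for item in query(lambda it: "@" in it["body"]):
        item["annotations"]["emails"] = re.findall(r"\S+@\S+", item["body"])

ingest("Contact alice@example.com about claim 42", "text")
ingest("<note>no address here</note>", "xml")
discover_emails()

# "Smart query construction": the caller, not the engine, decides to
# look inside the annotations when building the predicate.
hits = query(lambda it: "alice@example.com" in it["annotations"].get("emails", []))
```

Here data and annotations live side by side in the same item, and the engine treats both as ordinary fields, which is one way to read the mixed data/meta-data model.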
Supports economies of scale
Reduces the "time to value" (TTV):
Pre-installation: the necessary software is pre-installed, automatically detecting which hardware components are available and reconfiguring itself if there are changes.
Better integration of different software components.
Tight integration among layers of software to improve efficiency
Data reduction / pushing down: higher-level functionality such as aggregation and predicate application can be more easily "pushed down" closer to the storage for early data reduction.
Use of open source and commercial software to accelerate system development.
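To make the "pushing down" point concrete, the sketch below compares two ways of totalling an amount for one year across storage partitions. The partition layout and function names are assumptions made for illustration; the second value each function returns counts how many objects had to leave the storage layer.

```python
# Hypothetical partitions, each standing in for the data held by one storage node.
partitions = [
    [{"year": 2006, "amount": 10}, {"year": 2007, "amount": 5}],
    [{"year": 2007, "amount": 7}, {"year": 2007, "amount": 3}],
]

def total_without_pushdown(parts, year):
    """Ship every row to the caller, then filter and sum there."""
    shipped = [row for part in parts for row in part]
    total = sum(r["amount"] for r in shipped if r["year"] == year)
    return total, len(shipped)

def total_with_pushdown(parts, year):
    """Apply the predicate and a partial sum next to the data,
    so only one number per partition is shipped."""
    partials = [
        sum(r["amount"] for r in part if r["year"] == year)
        for part in parts
    ]
    return sum(partials), len(partials)

plain = total_without_pushdown(partitions, 2007)
pushed = total_with_pushdown(partitions, 2007)
```

Both approaches give the same total, but the pushed-down version moves one partial result per partition instead of every row.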
Databases typically manage highly structured data with a common format and relatively small attributes, which conforms nicely to the tables of relational database systems.
Impliance unifies the management of all data under one umbrella.
It provides interfaces to search structured and unstructured content and metadata alike.
All types of data can be incorporated into Impliance.
Impliance treats each new version of a data item as immutable. This reduces the problem of determining whether any replica has the most recent version.
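A minimal sketch of immutable versioning follows, assuming an append-only history per item. The class and method names are hypothetical; the slides do not describe Impliance's actual storage interface.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Version:
    item_id: str
    number: int
    body: str

class VersionedStore:
    """Append-only store: an update adds a version, it never overwrites one."""

    def __init__(self):
        self._versions = {}  # item_id -> list of Version, oldest first

    def put(self, item_id, body):
        history = self._versions.setdefault(item_id, [])
        version = Version(item_id, len(history) + 1, body)
        history.append(version)
        return version

    def latest(self, item_id):
        return self._versions[item_id][-1]

    def get(self, item_id, number):
        return self._versions[item_id][number - 1]

vs = VersionedStore()
vs.put("doc", "draft")
vs.put("doc", "final")
```

Because a given version number always denotes the same content, a replica either holds that version or does not; comparing version numbers is enough to tell which replica is behind.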
Additional metadata will be extracted for each document by running
different kinds of annotators.
Using schema mapping technologies, structures from different sources can be consolidated.
Additional relationships across documents can be identified by
running various analyses on all pairs of documents.
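One possible pairwise analysis is sketched below. The slides do not name a specific technique, so Jaccard similarity over word sets is used here purely as an illustrative assumption, and the document ids, texts, and threshold are made up.

```python
from itertools import combinations

# Hypothetical documents keyed by id.
docs = {
    "claim-1": "water damage to kitchen floor",
    "claim-2": "kitchen floor water damage repair",
    "memo-7": "quarterly sales forecast",
}

def jaccard(a, b):
    """Share of distinct words the two texts have in common."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb)

def related_pairs(documents, threshold=0.5):
    """Compare every pair of documents once and keep the similar ones."""
    return [
        (x, y)
        for x, y in combinations(sorted(documents), 2)
        if jaccard(documents[x], documents[y]) >= threshold
    ]

pairs = related_pairs(docs)
```

Comparing all pairs grows quadratically with the number of documents, which is one reason this kind of analysis benefits from a system that scales out.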
A typical Impliance installation will consist of several instances of Impliance deployed in geographically separated locations for disaster recovery as well as load balancing.
Impliance needs an efficient way of organizing the storage, computations, and the topology of the underlying hardware.
Data nodes have direct ownership of a subset of the persistent storage and are the most efficient when performing operations on that storage.
Grid nodes perform analytic computations. They may be pulled into a "work crew" to perform long or short-term operations, and have no long-term state.
Cluster nodes are responsible for making consistent locking and caching decisions on data within data consistency groups.
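The division of labour between data nodes and grid nodes can be sketched as follows. This is a simplified model with hypothetical class and function names; cluster nodes and their locking and caching decisions are deliberately left out.

```python
from dataclasses import dataclass, field

@dataclass
class DataNode:
    """Owns a subset of the persistent storage and works on it locally."""
    name: str
    owned: dict = field(default_factory=dict)

    def local_count(self, predicate):
        return sum(1 for value in self.owned.values() if predicate(value))

@dataclass
class GridNode:
    """Runs a computation and keeps nothing afterwards (no long-term state)."""
    name: str

    def run(self, task, *args):
        return task(*args)

def count_matching(data_nodes, crew, predicate):
    # Each data node scans only the storage it owns.
    partials = [node.local_count(predicate) for node in data_nodes]
    # A grid node pulled into the "work crew" combines the partial results.
    return crew[0].run(sum, partials)

data_nodes = [
    DataNode("d1", {"a": 5, "b": 12}),
    DataNode("d2", {"c": 20}),
]
crew = [GridNode("g1")]
matches = count_matching(data_nodes, crew, lambda v: v > 10)
```

Since the grid node holds no state between calls, it can be returned to the pool as soon as the operation ends.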
Google Base?
A primary data store which allows various types of data to be published in a simple way.
Impliance focuses more on proactive information discovery, richer business analytics, and data management.
Database "Appliances" (Netezza, DATAllegro, etc.)?
Both offer appliances for business intelligence applications on relational data.
Impliance differs:
Not just structured (relational) data
Discovery of semantics
More proactive
Also, both Oracle Secure Enterprise Search (OSES) and IBM WebSphere Information Integrator enable many data types to be crawled, but the interface used was not as advanced as that of Impliance.
Impliance
A box with software pre-installed
Delivered to the enterprise as an appliance or a service
Functions?
Store and manage all information: accept all types of enterprise data
Deliver all intelligence: integrate cross-silo information; advanced analytics with richer semantics
Properties?
Low TCO: easy to deploy ("plug & play"), simple and stable
Scalability: from SMB to very large (petabytes)
[Figure: Data + Content + Digital Media. Labels recoverable from the diagram: relational data, SQL, content, JCR, XML, XSLT, web page, HTTP, video, archive, ILM, native retrieval interface, native update/load interface.]