girish-srivastava
Netezza is a data warehousing appliance that can process terabytes of data in moments.
04/12/2023 1
IBM PureData System for Analytics
Objective
At the end of this session, participants will understand the basic concepts of IBM PureData System for Analytics (Netezza):
IBM PureData System models and their components.
IBM PureData System architecture.
How the system works.
“If you'd like to take us on, make our day.”
- Larry Ellison, Oct 2009
Our Obsession
“Our goal is to become number one in the high-end server business for both Online Transaction Processing and Data Warehousing, both of those segments.”
- Larry Ellison, Dec 2010
The Results
86%
One of “The five most important M&A Deals of 2010”
- Wall Street Journal
Built-In Expertise Makes This as Simple as an Appliance
Dedicated device
Optimized for purpose
Complete solution
Fast installation
Very easy operation
Standard interfaces
Low cost
Analytics without constraint
PureData for Analytics – Where Big Data Meets Deep Analytics
Why Netezza?
Seamless integration with Informatica, Business Objects, SAS and SQL Server (SSIS packages)
Very little DDL & SQL conversion
• Used same table structures
• Converted primary index to distribution column
10 to 200X performance improvements in BO reporting
Fast to deploy
Price to performance very appealing
Ease of use
• Administrative
• DBA tasks
• Supports all DB structures (3NF, star, denormalized tables)
IBM PureData System for Analytics N1001
The IBM PureData System for Analytics N1001 models include single-rack and multi-rack configurations.
The N1001 model family is an update to the IBM Netezza 1000 model family, with the same architectural and interface specifications.
Each N1001 storage array contains either two or four disk enclosures, depending upon the model.
Each disk enclosure has 12 disks. For example, an N1001-005 system has one storage array with 48 disks (four enclosures of 12 disks each).
IBM PureData System for Analytics N2001
The IBM® PureData™ System for Analytics N2001 family is the latest generation of data warehouse appliances.
It increases the capacity and performance of the N1001 models.
Within each rack are numerous components that work together to provide the asymmetric massively parallel processing of the Netezza® architecture.
The key hardware components include:
• Snippet blades (S-Blades)
• Hosts
• Storage arrays
The following figure summarizes the IBM PureData System for Analytics N2001 half-rack, full-rack, and two-rack models.
Snippet Blades (S-Blades)
The S-Blade is responsible for the snippet processing functions.
The S-Blade is a specialized processing board that combines the CPU processing power of a blade server with the query analysis intelligence of the Netezza Database Accelerator card.
This dual-board component occupies two slots of the S-Blade chassis.
Each chassis can contain up to 7 S-Blades.
The Netezza Database Accelerator card contains the FPGA query engines, memory, and I/O for processing the data from the disks where user data is stored.
Netezza Hosts
The host server is a Linux server that runs the Netezza software and utilities.
The host controls and coordinates the activity of the appliance.
It performs query optimization; controls table and database operations; consolidates and returns query results; and monitors the Netezza system components to detect and report problems.
The host is a highly redundant, highly available server.
The Netezza 1000 systems have two hosts in a highly available (HA) configuration.
Storage Arrays
The storage arrays contain the disks that store the user data and related processing files to support the query activity on the system.
In the N2001 model family, each disk enclosure has 24 disks.
There are 12 disk enclosures in each full rack, or 6 enclosures in a half-rack model.
In the N2001 family, each rack is one storage array.
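The enclosure figures quoted for the N1001 and N2001 families reduce to simple multiplication. A minimal Python sketch, assuming only the numbers given on the slides; the function name is hypothetical, and the totals count every disk in the enclosures (the slides do not say how many of them are spares):

```python
def disks_per_array(enclosures: int, disks_per_enclosure: int) -> int:
    """Total disks = number of enclosures x disks in each enclosure."""
    return enclosures * disks_per_enclosure

# N1001: 12 disks per enclosure; a four-enclosure storage array (e.g. N1001-005).
n1001_005 = disks_per_array(4, 12)

# N2001: 24 disks per enclosure; 12 enclosures per full rack, 6 per half rack.
n2001_full = disks_per_array(12, 24)
n2001_half = disks_per_array(6, 24)

print(n1001_005, n2001_full, n2001_half)  # 48 288 144
```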
Technology
Netezza's appliances use a proprietary Asymmetric Massively Parallel Processing (AMPP) architecture that combines open, blade-based servers and disk storage with a proprietary data filtering process using field-programmable gate arrays (FPGAs).
The AMPP architecture is a two-tiered system designed to quickly handle very large queries from multiple users.
The first tier is a high-performance Linux SMP host that compiles data query tasks received from business intelligence applications and generates query execution plans.
It then divides a query into a sequence of sub-tasks, or snippets, that can be executed in parallel, and distributes the snippets to the second tier for execution.
The second tier consists of multiple snippet processing blades, or S-Blades, where all the primary processing work of the appliance is executed.
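The two-tier flow described above (the host hands out snippets, the blades work in parallel, the host combines the results) can be illustrated with a small scatter-gather sketch in Python. This is only a toy model under stated assumptions: `run_snippet`, `host_query`, the thread pool, and the sample rows are all hypothetical stand-ins, not Netezza software.

```python
from concurrent.futures import ThreadPoolExecutor

def run_snippet(data_slice, predicate):
    """Second tier: one blade scans its own slice and returns a partial sum."""
    return sum(row["amount"] for row in data_slice if predicate(row))

def host_query(slices, predicate):
    """First tier: send the same snippet to every slice, then combine results."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(lambda s: run_snippet(s, predicate), slices))
    return sum(partials)

# Hypothetical table split into three slices, one per blade.
slices = [
    [{"region": "east", "amount": 10}, {"region": "west", "amount": 5}],
    [{"region": "east", "amount": 7}],
    [{"region": "west", "amount": 3}, {"region": "east", "amount": 1}],
]

total_east = host_query(slices, lambda row: row["region"] == "east")
print(total_east)  # 18
```

Each worker touches only its own slice, which is the property that lets the second tier scale by adding blades.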
IBM PureData System for Analytics
The Simple Appliance for Serious Analytics
Built-in Expertise
• No indexes or tuning
• Data model agnostic
• Fully parallel, optimized in-database analytics
Integration by Design
• Server, storage, database in one easy-to-use package
• Automatic parallelization and resource optimization to scale economically
• Enterprise-class security and platform management
Simplified Experience
• Up and running in hours
• Minimal ongoing administration
• Standard interfaces to best-of-breed analytics, BI, and data integration tools
• Built-in analytics capabilities allow users to derive insight from data quickly
• Easy connectivity to other Big Data Platform components
IBM PureData System for Analytics
Optimized exclusively for analytic data workloads
Delivering data services for analytics
Speed
• 10-100x faster than traditional custom systems*
• Patented MPP (Massively Parallel Processing) hardware acceleration
Simplicity
• Data load ready in hours
• No database indexes
• No tuning
• No storage administration
Scalability
• Peta-scale data capacity
Smart
• Designed to run complex analytics in minutes, not hours
• Richest set of in-database analytics
* Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
Simplify
Move analytics into the Data Warehouse
Integrate the server, storage and database into one optimized package
Move complex analytics into the database
Integrated, high-performance analytics within the data warehouse
Server, Storage, Database, Analytics
Traditional Data Warehouse Complexity
Data Warehousing – Simplified
Spend Less Time Managing and More Time Innovating
Simplicity and Ease of Administration
• No dbspace/tablespace sizing and configuration
• No redo/physical/logical log sizing and configuration
• No page/block sizing and configuration for tables
• No extent sizing and configuration for tables
• No temp space allocation and monitoring
• No logical volume creation for files
• No integration of OS kernel recommendations
• No maintenance of OS-recommended patch levels
Data Experts, not Database Experts
• Easy administration portal
• No software installation
• No indexes and tuning
• No storage administration
Traditional Complexity … Netezza Simplicity
0. CREATE DATABASE TEST
     LOGFILE 'E:\OraData\TEST\LOG1TEST.ORA' SIZE 2M,
             'E:\OraData\TEST\LOG2TEST.ORA' SIZE 2M,
             'E:\OraData\TEST\LOG3TEST.ORA' SIZE 2M,
             'E:\OraData\TEST\LOG4TEST.ORA' SIZE 2M,
             'E:\OraData\TEST\LOG5TEST.ORA' SIZE 2M
     EXTENT MANAGEMENT LOCAL
     MAXDATAFILES 100
     DATAFILE 'E:\OraData\TEST\SYS1TEST.ORA' SIZE 50M
     DEFAULT TEMPORARY TABLESPACE temp TEMPFILE 'E:\OraData\TEST\TEMP.ORA' SIZE 50M
     UNDO TABLESPACE undo DATAFILE 'E:\OraData\TEST\UNDO.ORA' SIZE 50M
     NOARCHIVELOG
     CHARACTER SET WE8ISO8859P1;
1. Oracle* table and indexes
2. Oracle tablespace
3. Oracle datafile
4. Veritas file
5. Veritas file system
6. Veritas striped logical volume
7. Veritas mirror/plex
8. Veritas sub-disk
9. SunOS raw device
10. Brocade SAN switch
11. EMC Symmetrix volume
12. EMC Symmetrix striped meta-volume
13. EMC Symmetrix hyper-volume
14. EMC Symmetrix remote volume (replication)
15. Days/weeks of planning meetings
Netezza: Low (ZERO) Touch:
CREATE DATABASE my_db;
ORACLE
CREATE TABLE "MRDWDDM"."RDWF_DDM_ROOMS_SOLD" ("ID_PROPERTY" NUMBER(5,
0) NOT NULL ENABLE, "ID_DATE_STAY" NUMBER(5, 0) NOT NULL ENABLE,
"CD_ROOM_POOL" CHAR(4) NOT NULL ENABLE, "CD_RATE_PGM" CHAR(4) NOT
NULL ENABLE, "CD_RATE_TYPE" CHAR(1) NOT NULL ENABLE,
"CD_MARKET_SEGMENT" CHAR(2) NOT NULL ENABLE, "ID_CONFO_NUM_ORIG"
NUMBER(9, 0) NOT NULL ENABLE, "ID_CONFO_NUM_CUR" NUMBER(9, 0) NOT
NULL ENABLE, "ID_DATE_CREATE" NUMBER(5, 0) NOT NULL ENABLE,
"ID_DATE_ARRIVAL" NUMBER(5, 0) NOT NULL ENABLE, "ID_DATE_DEPART"
NUMBER(5, 0) NOT NULL ENABLE, "QY_ROOMS" NUMBER(5, 0) NOT NULL
ENABLE, "CU_REV_PROJ_NET_LOCAL" NUMBER(21, 3) NOT NULL ENABLE,
"CU_REV_PROJ_NET_USD" NUMBER(21, 3) NOT NULL ENABLE,
"QY_DAYS_STAY_CUR" NUMBER(3, 0) NOT NULL ENABLE, "CD_BOOK_SOURCE"
CHAR(1) NOT NULL ENABLE) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE( FREELISTS 6) TABLESPACE "DDM_ROOMS_SOLD_DATA" NOLOGGING
PARTITION BY RANGE ("ID_PROPERTY" ) (PARTITION "PART1" VALUES LESS
THAN (600) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE
"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART2" VALUES
LESS THAN (1200) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE
"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART3" VALUES
LESS THAN (1800) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE
"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART4" VALUES
LESS THAN (2400) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE
"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART5" VALUES
LESS THAN (3000) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE
"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART6" VALUES
LESS THAN (MAXVALUE) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE
"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS ) ;
ORACLE Indexes
CREATE INDEX "MRDWDDM"."RDWF_DDM_ROOMS_SOLD_IDX1" ON "RDWF_DDM_ROOMS_SOLD"
("ID_PROPERTY" , "ID_DATE_STAY" , "CD_ROOM_POOL" , "CD_RATE_PGM" ,
"CD_RATE_TYPE" , "CD_MARKET_SEGMENT" ) PCTFREE 10 INITRANS 6 MAXTRANS 255
STORAGE( FREELISTS 10) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING
PARALLEL ( DEGREE 4 INSTANCES 1) LOCAL(PARTITION "PART1" PCTFREE 10
INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1
MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL
DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART2"
PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840
MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS
1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING,
PARTITION "PART3" PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL
4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0
FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE
"DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART4" PCTFREE 10 INITRANS 6
MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS
100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART5" PCTFREE 10
INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1
MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL
DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART6"
PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840
MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS
1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING ) ;
ORACLE Bitmap index
CREATE BITMAP INDEX "CRDBO"."SNAPSHOT_MONTH_IDX13" ON
"SNAPSHOT_OPPTY_MONTH_HIST" ("SNAPSHOT_YEAR" ) PCTFREE 10 INITRANS 2
MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4194304 MINEXTENTS 2 MAXEXTENTS
2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL
DEFAULT) TABLESPACE "SFA_DATAMART_INDEX" NOLOGGING ;
ORACLE Table Clusters
CREATE CLUSTER "MRDW"."CT_INTRMDRY_CAL" ("ID_YEAR_CAL" NUMBER(4, 0),
"ID_MONTH_CAL" NUMBER(2, 0), "ID_PROPERTY" NUMBER(5, 0)) SIZE 16384
PCTFREE 10 PCTUSED 90 INITRANS 3 MAXTRANS 255 STORAGE(INITIAL
83886080 NEXT 41943040 MINEXTENTS 1 MAXEXTENTS 1017 PCTINCREASE 0
FREELISTS 4 FREELIST GROUPS 1 BUFFER_POOL RECYCLE) TABLESPACE
"TSS_FACT" ;
Netezza
CREATE TABLE MRDWDDM.RDWF_DDM_ROOMS_SOLD (
ID_PROPERTY numeric(5, 0) NOT NULL ,
ID_DATE_STAY integer NOT NULL ,
CD_ROOM_POOL CHAR(4) NOT NULL ,
CD_RATE_PGM CHAR(4) NOT NULL ,
CD_RATE_TYPE CHAR(1) NOT NULL ,
CD_MARKET_SEGMENT CHAR(2) NOT NULL ,
ID_CONFO_NUM_ORIG integer NOT NULL ,
ID_CONFO_NUM_CUR integer NOT NULL ,
ID_DATE_CREATE integer NOT NULL ,
ID_DATE_ARRIVAL integer NOT NULL ,
ID_DATE_DEPART integer NOT NULL ,
QY_ROOMS integer NOT NULL ,
CU_REV_PROJ_NET_LOCAL numeric(21, 3) NOT NULL ,
CU_REV_PROJ_NET_USD numeric(21, 3) NOT NULL ,
QY_DAYS_STAY_CUR smallint NOT NULL ,
CD_BOOK_SOURCE CHAR(1) NOT NULL)
distribute on random;
•No indexes
•No Physical Tuning/Admin
•Stripe data randomly, or by Columns
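The difference between striping rows randomly and striping them by a column can be sketched in a few lines of Python. This is an illustration under stated assumptions: the slice count, the CRC32 hash, and the helper names are hypothetical, and the appliance's real hash function is not given in the slides.

```python
import random
import zlib

NUM_SLICES = 4  # hypothetical; a real appliance has many more data slices

def slice_for_key(value, num_slices=NUM_SLICES):
    """Distribute by column: the same key value always maps to the same slice."""
    return zlib.crc32(str(value).encode("utf-8")) % num_slices

def slice_at_random(rng, num_slices=NUM_SLICES):
    """Random distribution: ignore the row content and pick any slice."""
    return rng.randrange(num_slices)

rows = [{"ID_PROPERTY": p, "QY_ROOMS": 1} for p in (101, 102, 101, 103, 101)]

by_key = [slice_for_key(r["ID_PROPERTY"]) for r in rows]
rng = random.Random(0)  # seeded only to make the sketch repeatable
by_random = [slice_at_random(rng) for _ in rows]

# Under column distribution, all rows for property 101 land on one slice.
same_slice_for_101 = len({by_key[0], by_key[2], by_key[4]}) == 1
print(by_key, by_random, same_slice_for_101)
```

Column distribution keeps rows with equal key values together, which helps joins and aggregations on that column; random distribution spreads rows evenly regardless of content.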
Loading the PureData System for Analytics
Data In
Data Integration
• Ab Initio
• Cloudera
• Composite Software
• IBM Big Insights
• IBM Information Server
• IBM InfoSphere Streams
• Informatica
• Oracle Data Integrator
• Oracle GoldenGate
• SAP Business Objects
Interfaces: SQL, ODBC, JDBC, OLE-DB
Querying the PureData System for Analytics
Data Out
Reporting and Analysis
• IBM Cognos
• IBM SPSS
• IBM Unica
• Information Builders
• Kalido
• KXEN
• Microsoft Excel
• MicroStrategy
• Oracle OBIEE
• SAP Business Objects
• SAS
• Actuate
Interfaces: SQL, ODBC, JDBC, OLE-DB
The Netezza AMPP™ Architecture
[Figure: Netezza appliance diagram. Labels: Applications (Advanced Analytics, Loader, ETL, BI); Hosts; Network Fabric; S-Blades™ (each with FPGA, Memory, CPU); Disk Enclosures.]
Our Secret Sauce
[Figure: each S-Blade works on a compressed slice of table MTHLY_RX_TERR_DATA. FPGA core: Uncompress, Project, Restrict, Visibility. CPU core: Complex ∑, Joins, Aggs, etc.]
Example query from the figure:
select DISTRICT,
       PRODUCTGRP,
       sum(NRX)
from MTHLY_RX_TERR_DATA
where MONTH = '20091201'
and MARKET = 509123
and SPECIALTY = 'GASTRO'
What is Netezza?
Essentially: a big, fast SQL database
Custom backend blades
• Commodity CPU, NIC, disk
• FPGA
• Can do basic filtering in hardware, i.e., stream processing before data hits main memory
Major Components
The four key components that make up TwinFin are SMP hosts, snippet blades (called S-Blades), disk enclosures, and a network fabric.
The disk enclosures contain high-density, high-performance disks.
Each disk contains a slice of the data in the database table, along with a mirror of the data on another disk.
The storage arrays are connected to the S-Blades via high-speed interconnects that allow all the disks to simultaneously stream data to the S-Blades at the fastest rate possible.
The SMP hosts are high-performance Linux servers that are set up in an active-passive configuration for high availability.
The active host presents a standardized interface to external tools and applications, such as BI and ETL tools and load utilities.
It compiles SQL queries into executable code segments called snippets, creates optimized query plans and distributes the snippets to the S-Blades for execution.
S-Blades are intelligent processing nodes that make up the turbocharged MPP engine of the appliance.
Each S-Blade is an independent server that contains powerful multi-core CPUs, Netezza's unique multi-engine FPGAs, and gigabytes of RAM, all balanced and working concurrently to deliver peak performance.
FPGAs are commodity chips that are designed to process data streams at extremely fast rates.
The Netezza S-Blade™
S-Blade™ Components
[Figure labels: IBM BladeCenter Server; Netezza DB Accelerator; Intel Quad-Core; Dual-Core FPGA; DRAM; SAS Expander Module (two).]
FPGA
Netezza uses FPGAs for front-line processing: they filter data coming from disk and apply additional logic before passing it to memory on the SPU.
Main advantages for data processing:
• Parallelism, with processing power shifted away from the CPU.
• An FPGA has similar dimensions to a CPU, but consumes about 5 times less power and runs at about one-fifth of the clock speed.
• Filtering out unnecessary data.
• Low latency, high throughput.
• More caching capability.
Cont…
Netezza is the first company to leverage the power of FPGA to process streaming data in a data warehouse appliance.
In traditional systems, all the data for a query is moved and then the “where” clause is processed.
With Netezza, instead of moving a huge set of data, the FPGA processes the “where” clause as data streams off of the disk, so only the data needed for processing is moved to the next step.
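The idea of applying the “where” clause while data streams off the disk can be mimicked with Python generators. This is a software analogy only, not FPGA code: `disk_stream`, `fpga_filter`, and the sample rows are hypothetical, and the column names are borrowed from the example query shown earlier in the deck.

```python
def disk_stream(rows):
    """Stand-in for rows streaming off a disk, one at a time."""
    for row in rows:
        yield row

def fpga_filter(stream, predicate, columns):
    """Apply the WHERE clause (restrict) and keep only the needed columns
    (project) while the data streams, so non-matching rows never reach
    the next step."""
    for row in stream:
        if predicate(row):
            yield {c: row[c] for c in columns}

# Hypothetical slice of MTHLY_RX_TERR_DATA.
table_slice = [
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "NRX": 4, "MONTH": "20091201", "SPECIALTY": "GASTRO"},
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "NRX": 9, "MONTH": "20091101", "SPECIALTY": "GASTRO"},
    {"DISTRICT": "D2", "PRODUCTGRP": "P2", "NRX": 2, "MONTH": "20091201", "SPECIALTY": "CARDIO"},
    {"DISTRICT": "D2", "PRODUCTGRP": "P1", "NRX": 6, "MONTH": "20091201", "SPECIALTY": "GASTRO"},
]

def where(row):
    return row["MONTH"] == "20091201" and row["SPECIALTY"] == "GASTRO"

passed_on = list(
    fpga_filter(disk_stream(table_slice), where, ["DISTRICT", "PRODUCTGRP", "NRX"])
)
print(len(table_slice), "rows read,", len(passed_on), "rows passed on")
```

Only the rows and columns that survive the filter are handed to the later stages (joins and aggregation on the CPU), which is the data-movement saving the slide describes.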
Netezza Storage
As discussed earlier, each disk in the appliance is divided into primary, mirror, and temp (swap) partitions.
The primary partition stores user data such as database tables. The mirror partition stores a copy of another disk's primary partition so that it can be used if that disk fails. The temp/swap partition stores temporary data, for example when the appliance redistributes data while processing queries.
The logical representation of the data saved in the primary partition of each disk is called the data slice.
When users create database tables and load data into them, the data is distributed across the available data slices.
The logical representation of a data slice, as managed by an SPU, is called the data partition.
In TwinFin systems, most S-Blades (SPUs) are connected to 8 data partitions, and some to only 6 (since some disks are reserved for failover).
After an SPU failure, a surviving SPU can have more than 8 data partitions attached, because it is assigned some of the failed SPU's data partitions.
For example, SPU 1001 is connected to 8 data partitions, numbered 0 to 7.
Each data partition is connected to one data slice, and each data slice is stored on a different disk.
Data partition 0, for instance, points to data slice 17, stored on the disk with ID 1063.
Disk 1063 also stores the mirror of data slice 18, whose primary copy is on disk 1064.
The following diagram illustrates what happens when the disk 1070 fails.
Immediately after disk 1070 stops responding, the system uses disk 1069 to satisfy queries that need data from data slices 23 and 24.
Disk 1069 serves these requests using the data in both its primary and mirror partitions.
In the meantime, the contents of disk 1070 are regenerated from the data on disk 1069 onto one of the spare disks in the disk array, in this case disk 1100.
Once the regeneration is complete, SPU data partition 7 is updated to point to data slice 24 on disk 1100.
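The disk failure sequence for disks 1069, 1070, and 1100 can be traced with a small Python model. It is a sketch under stated assumptions: the `Disk` class and `fail_and_regenerate` are hypothetical, and it assumes data slice 23 is primary on disk 1069 and data slice 24 is primary on disk 1070, which the slides imply but do not state outright.

```python
class Disk:
    def __init__(self, disk_id, primary=None, mirror=None):
        self.disk_id = disk_id
        self.primary = primary  # data slice held in the primary partition
        self.mirror = mirror    # data slice copied from the partner disk

def fail_and_regenerate(failed, partner, spare, slice_location):
    """Serve the failed disk's slice from the partner, then rebuild on a spare."""
    # Step 1: the partner's mirror partition answers for the failed slice.
    slice_location[failed.primary] = (partner.disk_id, "mirror")
    # Step 2: regenerate the failed disk's contents on the spare from the partner.
    spare.primary = partner.mirror  # rebuilt from the partner's mirror partition
    spare.mirror = partner.primary  # the spare now mirrors the partner's primary
    # Step 3: repoint the slice to the spare's primary partition.
    slice_location[spare.primary] = (spare.disk_id, "primary")
    return slice_location

disk_1069 = Disk(1069, primary=23, mirror=24)
disk_1070 = Disk(1070, primary=24, mirror=23)
spare_1100 = Disk(1100)

location = {23: (1069, "primary"), 24: (1070, "primary")}
location = fail_and_regenerate(disk_1070, disk_1069, spare_1100, location)
print(location)  # {23: (1069, 'primary'), 24: (1100, 'primary')}
```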
When an SPU fails, the appliance assigns all of its data partitions to other SPUs in the system.
Disks that hold the mirror copies of each other's data slices are reassigned as a pair, so each target SPU receives two additional data partitions to manage.
For example, if an SPU currently manages data partitions 0 to 7 and the appliance reassigns two data partitions from a failed SPU, the SPU will have 10 data partitions to manage, numbered 0 to 9.
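The reassignment after an SPU failure can be sketched the same way. The round-robin policy, the SPU IDs other than 1001, and the data slice numbers beyond 17 to 24 are assumptions made for illustration; the slides only say that mirrored pairs move together and that a target SPU ends up with two extra data partitions.

```python
def reassign_partitions(spus, failed_spu):
    """Move a failed SPU's data slices to the surviving SPUs two at a time,
    so each mirrored disk pair stays together on one SPU."""
    orphaned = spus.pop(failed_spu)
    survivors = sorted(spus)
    pairs = [orphaned[i:i + 2] for i in range(0, len(orphaned), 2)]
    for index, pair in enumerate(pairs):
        target = survivors[index % len(survivors)]  # assumed round-robin policy
        spus[target].extend(pair)
    return spus

# Hypothetical appliance: five SPUs, each managing eight data slices.
spus = {
    1001: [17, 18, 19, 20, 21, 22, 23, 24],
    1002: [25, 26, 27, 28, 29, 30, 31, 32],
    1003: [33, 34, 35, 36, 37, 38, 39, 40],
    1004: [41, 42, 43, 44, 45, 46, 47, 48],
    1005: [49, 50, 51, 52, 53, 54, 55, 56],
}

spus = reassign_partitions(spus, failed_spu=1005)
partition_numbers = list(range(len(spus[1001])))
print(len(spus[1001]), partition_numbers)  # 10 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```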
IBM Netezza Has High Success Rates vs. Oracle & Teradata
Speed
• Hardware-based data streaming
Scalability
• True MPP offers enterprise scale-out
Simple
• Black-box appliance with no tuning or storage administration
Smart
• Built-in advanced analytics pushed deep into database
NO NO NO NO
NO YES NO LIMITED
IBM Netezza is better value than Teradata
Costs
Teradata: High initial cost; lots of professional services; lots of administration
Results in: High cost of ownership
IBM Netezza: Low initial cost; little administration
Client advantage: Low total cost of ownership
Smart
Teradata: Limited analytics pushdown; analytics causes resource contention
Results in: Poor analytic performance
IBM Netezza: Minimal contention due to analytics
Client advantage: More customers benefit from faster analytics
Simplicity
Teradata: Constant tuning for performance; needs much administration
Results in: Difficult and slow to provide business value
IBM Netezza: True appliance; no tuning
Client advantage: Faster time to value
Speed
Teradata: Old inefficient legacy code; complex workload partitions
Results in: Data warehouse performance doesn’t scale consistently
IBM Netezza: Designed for balance
Client advantage: Highest / most consistent data warehouse and advanced analytics performance
Architecture
Teradata: Proprietary interconnect; virtualized MPP nodes (vAMPs); separating compute and storage
Results in: Unpredictable performance
IBM Netezza: True MPP; FPGA acceleration
Client advantage: Best architecture for data warehouse and advanced analytics
IBM Netezza is Better Value than Oracle Exadata
Costs
Oracle Exadata: High initial cost; lots of administration
Results in: High total cost of ownership
IBM Netezza: Low initial cost; little administration
Client advantage: Low total cost of ownership
Smart
Oracle Exadata: Limited analytics pushdown; inefficiency of Oracle Real Application Clusters (RAC)
Results in: Poor analytic performance
IBM Netezza: Extensive analytics pushdown capabilities
Client advantage: Fast time to insight; more users benefit from faster analytics
Simplicity
Oracle Exadata: Complexity of Oracle RAC; constant tuning for performance; complex patch process
Results in: Complex administration
IBM Netezza: True appliance; no tuning
Client advantage: Faster time to value
Scalability
Oracle Exadata: No proof points on scaling; RAC scalability bottleneck
Results in: Business growth risk
IBM Netezza: Proven scalability
Client advantage: Business growth with confidence
Speed
Oracle Exadata: Designed for OLTP; RAC is inefficient for data warehouse workloads
Results in: Poor data warehouse performance
IBM Netezza: Designed for data warehousing
Client advantage: Highest data warehouse performance
Architecture
Oracle Exadata: Clustered SMP database layer + shared disk MPP storage layer
Results in: Compromised performance
IBM Netezza: True MPP; FPGA acceleration
Client advantage: Best architecture for data warehousing and advanced analytics
End of The Session
Thanks for your attention