Information Management Software
Overview
- Netezza has different behavior than other databases
- Overall CDC Philosophy (low impact vs. low latency)
- Topology
- Requirements
- Netezza Optimizations (filtering, bulk apply, updates)
- Simplified Tuning
- Limitations
Log-Based Change Data Capture
[Diagram: Database Logs, Source Engine, TCP/IP, Target Engine, Monitoring and Configuration. Endpoint labels: Database; Message Queue; Web Services; DB2, Oracle, SQL Server, etc.; Flat files]
Key Benefits:
• Low impact
• Flexible implementation
• Heterogeneous platform support
• Easy to use
[Diagram labels: InfoSphere Information Server, Netezza]
Netezza is Different
• Optimized for bulk operations and queries
• Not optimized for small OLTP transactions
• Traditional row by row apply is sub-optimal
• Striving for very low apply latency can generate terribly inefficient workloads for the database
IIDR/CDC for Netezza
• Optimized to make the best use of your Netezza appliance
• Leverages Netezza’s strengths in bulk loading to deliver high-throughput apply
• Designed to minimize the impact on the appliance: apply latency is minutes, not seconds
Topology – Installation on Red Hat Linux (64-bit x86)
• Need connectivity to NPS machine
[Diagram labels: CDC Source, CDC Netezza, Netezza Appliance]
Topology – Instances and Database
• Each CDC instance connects to one NZ database
• Consider the number of databases when sizing CDC
[Diagram: CDC Netezza instances, one per Netezza database; databases shown: SALES, MKT, ARC, INVT, CUST]
CDC Netezza Requirements
• OS: Red Hat Linux 5.3 or later
• Architecture: x86 (64-bit) or 64-bit zLinux
• Per-subscription memory: 4 GB minimum, 8 GB default
• Netezza JDBC driver 6.0.3 or later (not supplied)
• Connectivity to the Netezza appliance
Engine Details
• Data is loaded into Netezza external tables
• CDC uses a Linux named pipe to populate the external tables
• Data never hits the disk until it is in Netezza
[Diagram: CDC writes rows (1,’a’; 2,’b’; 3,’c’) into a Pipe that backs external table EXT_TBL; the Netezza DB loads them into table T1]
INSERT/UPDATE T1 SELECT … FROM EXTERNAL TABLE EXT_TBL …;
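The named-pipe hand-off described on this slide can be illustrated with a minimal sketch. It is not CDC's implementation: the pipe name `ext_tbl.pipe`, the CSV row format, and the reader thread that stands in for the database reading the external table are all assumptions made for illustration. It only shows how rows can pass from a writer to a consumer through a Linux FIFO without being stored in a regular file.

```python
import csv
import os
import tempfile
import threading


def stream_rows_through_fifo(rows):
    """Write rows as CSV into a named pipe while a reader thread consumes them."""
    received = []
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical pipe name; a FIFO holds no data on disk itself.
        fifo_path = os.path.join(workdir, "ext_tbl.pipe")
        os.mkfifo(fifo_path)

        def reader():
            # Stand-in for the database reading the external table's data source.
            with open(fifo_path, "r", newline="") as pipe:
                for record in csv.reader(pipe):
                    received.append(record)

        consumer = threading.Thread(target=reader)
        consumer.start()

        # Opening the FIFO for writing blocks until the reader has opened it.
        with open(fifo_path, "w", newline="") as pipe:
            csv.writer(pipe).writerows(rows)

        # Closing the write end signals end-of-file to the reader.
        consumer.join()
    return received


if __name__ == "__main__":
    print(stream_rows_through_fifo([(1, "a"), (2, "b"), (3, "c")]))
```

The writer and the reader must run concurrently, because a FIFO only buffers a small amount of data in the kernel; this is why the sketch starts the consumer before opening the pipe for writing.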
Netezza Optimizations – Updates
• Updates are replicated as DELETE and INSERT
• Netezza tables should have primary keys
• Update and delete workloads from the source require the table mappings to select the unique key column set (primary keys are selected automatically)
• Updates are shown as N/A in the performance monitor
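As a rough illustration of replicating updates as DELETE plus INSERT, the sketch below splits a batch of source changes into a list of keys to delete and a list of rows to insert. The function name `to_delete_insert`, the `(operation, row)` tuple format, and the assumption that an update does not change the key columns are all hypothetical; the slides do not describe the internal data structures.

```python
def to_delete_insert(changes, key_columns):
    """Split source changes into keys to delete and rows to insert.

    changes: iterable of (operation, row) pairs, where operation is
    "INSERT", "UPDATE", or "DELETE" and row is a dict of column values.
    Assumes (for illustration) that updates leave the key columns unchanged.
    """
    deletes = []
    inserts = []
    for operation, row in changes:
        key = tuple(row[column] for column in key_columns)
        if operation in ("UPDATE", "DELETE"):
            # The old version of the row is removed by its unique key.
            deletes.append(key)
        if operation in ("UPDATE", "INSERT"):
            # The new version of the row is loaded as a plain insert.
            inserts.append(row)
    return deletes, inserts


if __name__ == "__main__":
    batch = [
        ("INSERT", {"id": 1, "v": "a"}),
        ("UPDATE", {"id": 2, "v": "b2"}),
        ("DELETE", {"id": 3, "v": "c"}),
    ]
    print(to_delete_insert(batch, ["id"]))
```

This is why a unique key column set matters: without it there is no reliable way to identify which target rows the delete step should remove.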
CDC Netezza Optimizations – Net changes
• CDC NZ can filter out some operations that are made redundant by later deletes, for example inserting 10,000 rows and then deleting them.
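One way to picture net-change filtering is the sketch below, which cancels an insert when a delete for the same key arrives later in the same buffered batch. The function name `net_changes` and the change format are hypothetical, and the actual filtering rules used by CDC are not given in the slides; this only demonstrates the idea.

```python
def net_changes(changes, key_columns):
    """Drop inserts that are deleted again within the same batch.

    changes: iterable of (operation, row) pairs with operation
    "INSERT" or "DELETE". Returns (keys_to_delete, rows_to_insert);
    deletes are meant to be applied before inserts.
    """
    pending_inserts = {}  # key -> row inserted in this batch
    deletes = []          # keys of rows that already exist on the target
    for operation, row in changes:
        key = tuple(row[column] for column in key_columns)
        if operation == "INSERT":
            pending_inserts[key] = row
        elif operation == "DELETE":
            if key in pending_inserts:
                # The row never needs to reach the target at all.
                del pending_inserts[key]
            else:
                deletes.append(key)
    return deletes, list(pending_inserts.values())


if __name__ == "__main__":
    batch = [("INSERT", {"id": n}) for n in range(10_000)]
    batch += [("DELETE", {"id": n}) for n in range(10_000)]
    print(net_changes(batch, ["id"]))  # nothing left to apply
```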
Simplified Tuning
• Fewer system parameters
• One tuning parameter: acceptable_latency_in_minutes
The Bulk Apply with High Volumes
• Data is buffered in memory to create a large unit of work (UOW)
• If enough work has accumulated, CDC applies it
[Diagram: transactions TX1 to TX5 buffered, then applied in bulk to the Netezza DB; timeline 12:00, 12:01, 12:02; acceptable_latency_in_minutes = 5]
The Bulk Apply with Low Volumes
• Data is buffered in memory to create a large unit of work (UOW)
• If not enough work is found in a given time, the buffer is flushed
[Diagram: transactions TX1 and TX2 buffered, then applied to the Netezza DB; timeline 12:00, 12:03, 12:04, 12:05; acceptable_latency_in_minutes = 5]
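The two bulk-apply cases (high and low volume) can be sketched as a buffer that applies either when enough rows have accumulated or when the oldest buffered row has waited for the acceptable latency. This is a simplified illustration, not CDC's algorithm: the class name `BulkApplyBuffer`, the `min_rows` threshold, and the `apply` callback are hypothetical, and only `acceptable_latency_in_minutes` corresponds to a parameter named in the slides.

```python
import time


class BulkApplyBuffer:
    """Buffer rows in memory and apply them as one large unit of work."""

    def __init__(self, apply, min_rows, acceptable_latency_in_minutes,
                 clock=time.monotonic):
        self._apply = apply                # callback that performs the bulk apply
        self._min_rows = min_rows          # hypothetical "enough work" threshold
        self._max_wait = acceptable_latency_in_minutes * 60.0
        self._clock = clock
        self._rows = []
        self._oldest = None                # arrival time of the oldest buffered row

    def add(self, transaction_rows):
        """Buffer the rows of one committed transaction."""
        if not self._rows:
            self._oldest = self._clock()
        self._rows.extend(transaction_rows)
        return self.maybe_apply()

    def maybe_apply(self):
        """Apply if enough work is buffered or the latency limit is reached."""
        if not self._rows:
            return False
        waited = self._clock() - self._oldest
        if len(self._rows) >= self._min_rows or waited >= self._max_wait:
            self._apply(list(self._rows))
            self._rows.clear()
            return True
        return False


if __name__ == "__main__":
    buffer = BulkApplyBuffer(print, min_rows=3, acceptable_latency_in_minutes=5)
    buffer.add([("TX1", 1), ("TX1", 2)])   # stays buffered
    buffer.add([("TX2", 3)])               # threshold reached, applied in bulk
```

In a real system `maybe_apply` would also be called periodically by a timer, so that a quiet source still gets its buffered rows flushed within the latency limit.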
Limitations
• General rule: with the exception of apply, no access to the target DB
• User exits and stored procedures cannot use CDC’s connection
• No differential refresh
• No conflict detection and resolution
• INTERVAL data type not supported
• LOBs not supported: LOB columns must be deselected, or mapped to varchar/char/nvarchar/nchar and limited in length