Upload
charlotte-keon
View
66
Download
1
Embed Size (px)
DESCRIPTION
Teradata Platform Introduction. Hardware and Software Components in Enterprise Data Warehouse Derek Jones March 2005. Teradata in the Enterprise. Teradata is relational database management system Acts as central enterprise-wide database - PowerPoint PPT Presentation
Citation preview
Teradata Platform Introduction
Hardware and Software Components in Enterprise Data Warehouse
Derek JonesMarch 2005
Teradata Confidential2 > 04/19/23
Teradata in the Enterprise
Teradata is relational database management system
• Acts as central enterprise-wide database> Contains information extracted from operational systems> Central placement minimizes data duplication and provides
single view of business
Teradata Confidential3 > 04/19/23
Key Teradata Differentiators
• Parallelism throughout platform• Shared Nothing Architecture• Proprietary intelligent system inter-connect
Teradata Confidential4 > 04/19/23
Teradata Scales Linearly
• Scaling achieved via ‘shared nothing’ architecture and unconditional parallelism
• Power is in linear scalability, where slope = 1• Scales with data• Scales with users• Scales with work
More nodes
More work
More users
More data
Node
Work
Users
Data
Teradata Confidential5 > 04/19/23
Data Volume(Raw, User Data)
SchemaSophistication
QueryFreedom
QueryComplexity
QueryConcurrency
MixedWorkload
Query Data Volume
DataFreshness
The Teradata Difference“Multi-dimensional Scalability”
Teradata Confidential6 > 04/19/23
Data Volume(Raw, User Data)
Business Needs ChangeCompetition can be
Tuned to Meet a Static Environment
SchemaSophistication
QueryFreedom
QueryComplexity
DataFreshness
Query Data Volume
The Teradata Difference“Multi-dimensional Scalability”
QueryConcurrency
MixedWorkload
Teradata Confidential7 > 04/19/23
Data Volume(Raw, User Data)
Competition can be Tuned to Meet a
Static Environment
But At the Expense of Another Dimension
Competition Scales One Dimension at the
Expense of Others
Business Needs Change Desire to Increase User/
Query Concurrency
SchemaSophistication
QueryFreedom
QueryComplexity
DataFreshness
Query Data Volume
The Teradata Difference“Multi-dimensional Scalability”
QueryConcurrency
MixedWorkload
Teradata Confidential8 > 04/19/23
Data Volume(Raw, User Data)
Competition Scales One Dimension at the
Expense of Others
Limited by Technology!
SchemaSophistication
QueryFreedom
QueryComplexity
DataFreshness
Query Data Volume
The Teradata Difference“Multi-dimensional Scalability”
QueryConcurrency
MixedWorkload
Teradata can Scale Simultaneously Across Multiple Dimensions
Driven by Business!
Teradata Confidential9 > 04/19/23
Key Teradata Differentiators
• Parallelism throughout platform• Shared Nothing Architecture• Proprietary intelligent system inter-connect
Teradata Confidential10 > 04/19/23
Node Architecture (‘Shared Nothing’)
Each Teradata Node is made up of hardware and software
• Each node has CPUs, system disk, memory and adapters• Each node runs copy of OS and database SW
Teradata Confidential11 > 04/19/23
Node Architecture (‘Shared Nothing’)
Each Teradata Node is made up of hardware and software
• Each node runs copy of OS, database SW, & virtual processes (above line)• Each node has CPUs, system disk, memory & adapters (below line)
PE vproc
AMPvproc
Vdisk
AMPvproc
Vdisk
AMPvproc
Vdisk
AMPvproc
Vdisk
AMPvproc
Vdisk
AMPvproc
Vdisk
AMPvproc
Vdisk
AMPvproc
Vdisk
PE vproc
UNIX
PDE
V2 Virtual Processors (Vprocs)
Teradata Confidential12 > 04/19/23
NCR 5400 Server Value Prop
1. Better Price/Performance> 20% Performance Improvement> 12% Price/Performance
Improvement
2. Advanced Cabinet Design> Up to 10 Nodes Per Cabinet> Up to a 40% Reduction in Floor
Space
3. Investment Protection> Multi Generation (5) Coexistence> 32-bit/64-bit Transition Platform
Teradata Confidential13 > 04/19/23
NCR 5400 Server Key Messages#2 – Advanced Cabinet Design
• Revolutionary cabinet increases reliability and provides greater configuration flexibility.
> up to 10 nodes per cabinet enable a 20% - 40% smaller footprint than the 5380
> 30% increase in system storage reliability with new advanced cooling mechanisms
> Extend supported distance for large systems (65+ nodes) between cabinets to 300 – 600 meters with new BYNET V3.
> Doubles the number of configurable nodes to 1,024
1
3
1
3
1
3
1
3
1
3
1
3
1
3
1
3
1
3
1
3
1
3
BYNET V3 Switches
Five UPS Modules
Ethernet Switches
Up to 10 nodes within each cabinet
Server Management
Module (3GSM)
FC Switches
Teradata Confidential14 > 04/19/23
Key Teradata Differentiators
• Parallelism throughout platform• Shared Nothing Architecture• Proprietary intelligent system inter-connect
Teradata Confidential15 > 04/19/23
Parallelism via BYNET Interconnect
BYNET high-speed interconnect facilitates system communication
• All nodes connected via BYNET– Hardware network– Software runs on each node
Vproc to Vproc
Vproc to Vproc
Multicast (1 to Many)
Broadcast (1 to All)
Different communication paths facilitate system parallelism
– 1 to 1– 1 to Many– 1 to All
Teradata Confidential16 > 04/19/23
MPP System Configuration
Nodes grouped to increase data availability and system uptime > Not shared storage but
access within group> Improves data availability> Improves system up time> Allows for VPROC
migration
Teradata Confidential17 > 04/19/23
Teradata Clique
NodeNode Node Node
Disk Array Disk Array Disk Array Disk Array
Clique is group of nodes that access same arrays
• VPROC smallest unit of parallelism
• VPROC has assigned storage within clique
• VPROCs can migrate within clique
• Improves system up time, data availability, and ease of recovery
= VPROC
Teradata Confidential18 > 04/19/23
Teradata Clique and VPROC
NodeNode Node Node
Disk Array Disk Array Disk Array Disk Array
VPROC smallest unit of parallelism
• VPROC smallest unit of parallelism or work
• Data distributed by hash to all VPROCs
• VPROC has assigned storage within clique
• VPROCs can migrate within clique
• Improves system up time, data availability, and ease of recovery
• Data fully available at degraded performance until node returns.
= VPROC
X
Teradata Confidential19 > 04/19/23
Node Node Node Node Node NodeNode Hot Standby
X
Disk Array Disk Array Disk Array Disk Array Disk Array Disk Array
Fibre Channel Switches
Teradata Clique with Hot Standby
Teradata Confidential20 > 04/19/23
Teradata Optimizer
• The Teradata Optimizer is the most robust in the industry• Optimizer is parallel-aware, understands available system
components• Handles mixed work loads
> Multiple complex queries > Joins per query > Unlimited ad-hoc processing
• Output is least expensive plan (resources) to answer request
Teradata Confidential21 > 04/19/23
Teradata Request Cycle
DBaseAccessRightsTVMTVFieldsIndexes
DATA parcel
AMPSTEPS
STATISTICS
GENERATOR
OPTIMIZER
RESOLVER
SYNTAXER
DD
REQUEST Parcel
CACHED?
No
GNCAPPLY
SECURITY
Yes
Request flow diagram• Each request parcel contains at least
one SQL statement• Six main component steps
> Syntaxer> Resolver> Security> Optimizer> Generator> gncApply
• AMP steps are instructions sent to AMP VPROCs to complete the request
• Following completion each request generates a success/fail parcel with any necessary records.
Teradata Confidential22 > 04/19/23
Data Protection (Object Locks)
Locks protect data from simultaneous access• Vary by type
– Exclusive, Write, Read, & Access
• Vary by object locked – Database, Table, & Row Hash
• Locks enforced by hierarchy
Teradata Confidential23 > 04/19/23
Data Protection (RAID-1)
RAID data protection• RAID-1 (disk mirroring)• Disk pair increases read
performance and data availability• In failure scenario, mirrored drive
re-built by array controller
Teradata Confidential24 > 04/19/23
Data Protection (Fallback)
• Fallback table data• Copy of table rows maintained by database on
second AMP VPROC
• Fallback copies grouped logically in CLUSTERS so data fully available when physical CLIQUE is
off-line.• Fallback + RAID increase data availability
Teradata Confidential25 > 04/19/23
Data Storage and Access
Bucket #
Row Hash
Parsing Engine
Primary Index value = 25
Message Passing Layer
AMP AMP AMP AMP
Hash Map
Hashing Algorithm
25
Data stored by hash• Primary Index is chosen for
data distribution, not same as primary key
• Primary Index value hashed• Hash value creates bucket
assignment• Hash Map assigns buckets to
AMP VPROCs• AMP VPROCs reside on
specific node• AMP VPROC writes row to
disk• Data and algorithm
exceptions require Uniqueness value for guaranteed unique Row ID
Teradata Confidential26 > 04/19/23
Data Access by Primary Index
Data accessed by row hash valueNeed 3 pieces of information
to find a row• Table ID• Row Hash of PI value
> Output of hash algorithm on PI Value
• PI Value• Operation involves only
one AMP VPROC
Table-idRow-hash
Cyl 1Index
Cyl 2Index
Cyl 3Index
Cyl 4Index
Cyl 5Index
Cyl 6Index
Cyl 7Index
Master
Index
Data RowData Row
DATA BLOCK
AMP #3
Cylinder #PI Value
Teradata Confidential27 > 04/19/23
Base Table Base Table Base Table Base Table
RowID Cust Name Phone
USI NUPI471, 1 45 Adams 444-6666555, 6 98 Brown 333-9999717, 2 72 Adams 666-7777884, 1 74 Smith 555-6666
RowID Cust Name Phone
USI NUPI147, 1 49 Smith 111-6666147, 2 12 Young 777-4444388, 1 27 Jones 222-8888822, 1 62 Black 444-5555
RowID Cust Name Phone
USI NUPI107, 1 37 White 555-4444536, 5 84 Rice 666-5555638, 1 31 Adams 111-2222640, 1 40 Smith 222-3333
AMP 1 AMP 2 AMP 3 AMP 4
Message Passing Layer
Data Access by Unique Secondary Index (USI)
USI Data access• Index is created on table• SQL uses USI by value• PE VPROC managing
session uses same information as primary index access (Table ID, Row Hash, Index Value)
• This process involves two AMP VPROC operations
• After USI subtable lookup, process similar to primary index access
RowID Cust Name Phone
USI NUPI639, 1 77 Jones 777-6666778, 3 95 Peters 555-7777778, 7 56 Smith 555-7777915, 9 51 Marsh 888-2222
Table ID
100
Row Hash
778
Unique Val
7
USI Subtable USI Subtable USI Subtable USI Subtable
RowID Cust RowID244, 1 74 884, 1505, 1 77 639, 1744, 4 51 915, 9757, 1 27 388, 1
RowID Cust RowID135, 1 98 555, 6296, 1 84 536, 5602, 1 56 778, 7969, 1 49 147, 1
RowID Cust RowID288, 1 31 638, 1339, 1 40 640, 1372, 2 45 471, 1588, 1 95 778, 3
RowID Cust RowID175, 1 37 107, 1489, 1 72 717, 2838, 4 12 147, 2919, 1 62 822, 1
Message Passing Layer
AMP 1 AMP 2 AMP 3 AMP 4
CREATE UNIQUE INDEX (cust) on customer;
SELECT *FROM customerWHERE cust = 56;
Create USI
Access via USI
Table ID
100
Row Hash
602
Index Value
56
Customer Table ID = 100
USI Value = 56
Hashing Algorithm
PE
Teradata Confidential28 > 04/19/23
Data Access via Non-uniqueSecondary Index (NUSI)
CREATE INDEX (name) on customer;
SELECT *FROM customerWHERE name = ‘Adams’;
Create NUSI
Access via NUSI
Table ID
100
Row Hash
567
Index Value
‘Adams’
Hashing Algorithm
PE
NUSI Subtable NUSI Subtable NUSI Subtable NUSI Subtable
RowID Name RowID432, 8 Smith 640, 1448, 1 White 107, 1567, 3 Adams 638, 1656, 1 Rice 536, 5
Message Passing Layer
AMP 1 AMP 2 AMP 3 AMP 4
RowID Name RowID432, 1 Smith 147, 1448, 4 Black 822, 1567, 6 Jones 338, 1770, 1 Young 147, 2
RowID Name RowID155, 1 Marsh 915, 9396, 1 Peters 778, 3432, 5 Smith 778, 7567, 1 Jones 639, 1
RowID Name RowID432, 3 Smith 884,1567, 2 Adams 471,1 717,2852, 1 Brown 555,6
Base Table Base Table Base Table Base Table
RowID Cust Name Phone
NUSI NUPI147, 1 49 Smith 111-6666147, 2 12 Young 777-4444388, 1 27 Jones 222-8888822, 1 62 Black 444-5555
RowID Cust Name Phone
NUSI NUPI107, 1 37 White 555-4444536, 5 84 Rice 666-5555638, 1 31 Adams 111-2222640, 1 40 Smith 222-3333
AMP 1 AMP 2 AMP 3 AMP 4
RowID Cust Name Phone
NUSI NUPI639, 1 77 Jones 777-6666778, 3 95 Peters 555-7777778, 7 56 Smith 555-7777915, 9 51 Marsh 888-2222
RowID Cust Name Phone
NUSI NUPI471, 1 45 Adams 444-6666555, 6 98 Brown 333-9999717, 2 72 Adams 666-7777884, 1 74 Smith 555-6666
NUSI Value = ‘Adams’Customer Table ID = 100
•Index is created on table
•SQL uses NUSI by value
•PE VPROC managing session uses same information as primary index access (Table ID, Row Hash, Index Value)
•This process involves all-AMP VPROC operations
Teradata Confidential29 > 04/19/23
Teradata Structures
Database structures
• Users• Databases
> Tables> Views> Macros> Triggers> Stored
Procedures> User Defined
Functions
Teradata Confidential30 > 04/19/23
Teradata is an Open System
Virtually any application or middleware framework can be integrated with Teradata.
Teradata
CORBA
ODBC
.NET
OLE-DB
TeradataUtilities
Adapter(s)
Mess
age B
us
Publis
h &
Subsc
ribe
JAVA
JDBC
EJB
JDBC
Adapter(s)TeradataUtilities
Web
Messages
IIOP ASPJSPJSM
Queues
Teradata Confidential31 > 04/19/23
64-bit Teradata SolutionTeradata on SuSE Linux 2H 2005
64-BIT ApplicationServer-Tier
Client-Tier Client-Tier
• 64-Bit
32-BIT ApplicationServer-Tier
• IntelAlso• IBM/Power PC• SUN/SPARC• HP/PA-RISC
• 32-Bit
• DELL• HP• IBM
Teradata Tools & Utilities
Teradata System Mgmt
Intel Platform
Teradata Application
3rd Party Partners
• Linux
Intel 64-Bit Database Server-Tier
Teradata Database on Intel 32-bit and 64-bit will support both 32-bit and 64-bit applications & clients concurrently
Teradata Database
Operating System
2H 2005
Teradata Confidential32 > 04/19/23
Enterprise Message Bus
TX1
APPL
NW
DA-MW
TX2
APPL
MSG-MW
DA-MW
TX3
APPL
MSG-MW
DA-MW
TX4
APPL
MSG-MW
DA-MW
BIAPPL
MSG-MW
DA-MW
TacticalAPPL
MSG-MW
DA-MW
StrategicAPPL
MSG-MW
DA-MW
Business Process Automation
Analytic & Decision Making Repositories
Analytic & Decision Making Services
Transactional Repositories
Batch
Streaming
Data Acquisition & Integration
Transactional Services
Enterprise Users — (Browsers and/or Portal) Legacy EnvironmentLegacy Environment
Service Brokers
Business Rules
MSG-MW
Event Notification
MSG-MW MSG-MW
Event Detection
WAN / VAN
RS
EDW — BEDW — A
Internet / Intranet WAN / VAN
OLTP1 OLTP2 OLTP3 OLTP4
DA-MWDA-MWDA-MW
RDBMS BasedEvent
Processing
BIAPPL
NW
DA-MW
QDQD
C/S EDI Consumers Suppliers Internal Partners EDI C/S
ASP / JSP
Teradata’s Real-Time Enterprise Reference Architecture
Teradata Confidential33 > 04/19/23
Transactional Services
TX1
APPL
NW
DA-MW
TX2
APPL
MSG-MW
DA-MW
TX3
APPL
MSG-MW
DA-MW
TX4
APPL
MSG-MW
DA-MW
Transactional Repositories
Transactional Services
OLTP1 OLTP2 OLTP3 OLTP4
Data Access Middleware Occurs via standards, such as; ODBC, OLE-DB, JDBC, as well as proprietary techniques
Transactional Application ServicesApplications that perform book-keeping or transactional services for the enterprise
OLTP Data Repositories• Data that reflects the current state
of various business process• Limited history • Tuned for transaction workload
Application Scope Applications have narrow scope. Tuned for specific book-keeping or transactional services.
Teradata Confidential34 > 04/19/23
Transactional User Base
Enterprise Message Bus
TX1
APPL
NW
DA-MW
TX2
APPL
MSG-MW
DA-MW
TX3
APPL
MSG-MW
DA-MW
TX4
APPL
MSG-MW
DA-MW
Transactional Repositories
Transactional Services
Enterprise Users — (Browsers and/or Portal)Legacy Environment
Service Brokers
WAN / VAN Internet / Intranet
OLTP1 OLTP2 OLTP3 OLTP4
C/S EDI Consumers Suppliers Internal Partners
ASP / JSP
Client/Server Styles 2-Tier and 3-Tier RPC style interfaces
User-level IntegrationOccurs via standard EAI services, such as JAVA, Web Sphere, .NET, Tibco, and SeeBeyond
Transactional User Base Consumers, Suppliers, Internal, and Trading Partners
Service Brokers J2EE, CORBA, DCOM, Web Services
Teradata Confidential35 > 04/19/23
Data Warehouse Services
Data Access Middleware Occurs via standards, such as ODBC, OLE-DB, JDBC, as well as proprietary techniques
Application Services Applications that provide predictive analysis and assisted decision making
Enterprise Data Warehouse• Consolidated enterprise data• Crosses multiple business
domains• Integrated data model
BIAPPL
MSG-MW
DA-MW
TacticalAPPL
MSG-MW
DA-MW
StrategicAPPL
MSG-MW
DA-MW
Analytic & Decision Making Repositories
Analytic & Decision Making Services
RS
EDW — BEDW — A
RDBMS BasedEvent
Processing
BIAPPL
NW
DA-MW
QDQD
Application ScopeStrategic and Tactical decision making applications.Though BI tools or custom applications.
Teradata Confidential36 > 04/19/23
Decision Support User Base
BIAPPL
MSG-MW
DA-MW
TacticalAPPL
MSG-MW
DA-MW
StrategicAPPL
MSG-MW
DA-MW
Analytic & Decision Making Repositories
Analytic & Decision Making Services
RS
EDW — BEDW — A
RDBMS BasedEvent
Processing
BIAPPL
NW
DA-MW
QDQD
Enterprise Message Bus
Enterprise Users — (Browsers and/or Portal)
Service Brokers
Internet / Intranet
Consumers Suppliers Internal Partners
ASP / JSP
Service Broker Styles J2EE, CORBA, DCOM, Web Services
Client/Server Styles 2-Tier and 3-Tier RPC style interfaces
User-level Integrationoccurs via standard EAI services, such as Web Services, JAVA, .NET, Tibco, and SeeBeyond
Legacy Environment
WAN / VAN
EDI C/S
DW User Base Consumers, Suppliers, Internal, and Trading Partners
Teradata Confidential37 > 04/19/23
Data Acquisition Services
Analytic & Decision Making RepositoriesTransactional Repositories
Batch
Streaming
Data Acquisition & Integration
RS
EDW — BEDW — A
OLTP1 OLTP2 OLTP3 OLTP4
RDBMS BasedEvent
Processing
QDQD
Data Transformation Services• Data cleansing• Data transformation (normalization)• Streaming data for frequent updates• Batch data moves for bulk operations• Partner ETL tools are typically used to
perform these services
Data Extraction• Data is extracted
from OLTP systems• Partner ETL tools are
frequently used here
Data Load• Data is loaded into
EDW system using Teradata Load tools
• FastLoad• MultiLoad• TPump
Data Acquisition Options• Traditional load utilities
(bulk or continuous loads)• Loads through – “in-flight”
Message Passing• Replication – Table level
replication from source to target
Teradata Confidential38 > 04/19/23
RDBMS Based Event Processing• Real-time events are detected
through a combination of Triggers, Stored Procedures, and UDFs
• Event engine performs query• Messages are passed via P2P, Web
Services or Enterprise Message Bus
Business Process Automation• Event Detection• Applied Business Rules• Event Notification• Messages are passed via P2P,
Web Services or Enterprise Message Bus
Event-Driven Business Processes
Analytic & Decision Making RepositoriesTransactional Repositories
Batch
Streaming
Data Acquisition & Integration
RS
EDW — BEDW — A
OLTP1 OLTP2 OLTP3 OLTP4
RDBMS BasedEvent
Processing
QDQD
Business Process Automation
Business Rules
MSG-MW
Event Notification
MSG-MW MSG-MW
Event Detection
DA-MWDA-MWDA-MW
Enterprise Message Bus
Teradata Confidential39 > 04/19/23
Application Integration
Enterprise Message Bus
TX1
APPL
NW
DA-MW
TX2
APPL
MSG-MW
DA-MW
TX3
APPL
MSG-MW
DA-MW
TX4
APPL
MSG-MW
DA-MW
BIAPPL
MSG-MW
DA-MW
TacticalAPPL
MSG-MW
DA-MW
StrategicAPPL
MSG-MW
DA-MW
Business Process Automation
Analytic & Decision Making Repositories
Analytic & Decision Making Services
Transactional Repositories
Batch
Streaming
Data Acquisition & Integration
Transactional Services
Service Brokers
Business Rules
MSG-MW
Event Notification
MSG-MW MSG-MW
Event Detection
RS
EDW — BEDW — A
OLTP1 OLTP2 OLTP3 OLTP4
DA-MWDA-MWDA-MW
RDBMS BasedEvent
Processing
BIAPPL
NW
DA-MW
QDQD
ASP / JSP
Decision Making Applications interact with bookkeeping applications via standard Enterprise services, such as Web Services, JAVA, .NET, -or- through the use of traditional client/server technology.
Teradata Confidential40 > 04/19/23
Dual Active Solution
BIAPPL
MSG-MW
DA-MW
TacticalAPPL
MSG-MW
DA-MW
StrategicAPPL
MSG-MW
DA-MW
Analytic & Decision Making Repositories
Analytic & Decision Making Services
Batch
Streaming
Data Acquisition & Integration
RS
EDW — BEDW — A
RDBMS BasedEvent
Processing
BIAPPL
NW
DA-MW
QDQD
Secondary “Active” system does not need to be as large as primary system
Replication Services• Changed data capture in V2R6• Update propagation via “GoldenGate”
Dual Data Load• Input data stream is split into
two independent load streams• Input data is filtered so that only
critical data is loaded on the Secondary “Active” system
Teradata Query Director• Query routing control based on
business rules• Business Continuity, workload
sharing