
Subj: Re: Week word as calendar_date=current_date
From: McCall, Glenn David

You might try:

Select *
From sys_calendar.calendar
Where calendar_date = current_date

The week_of_year column should give you what you want.
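If you only want the number itself, project just that column (same view, same predicate):

Select week_of_year
From sys_calendar.calendar
Where calendar_date = current_date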

Subj: Re: Compression Algorithm
From: David Clough

Thanks for the comments and input. For your trouble in replying, here's where I am ...

Taking those comments into account, I've now finished the job. I've got one Stored Procedure which, for a specified table, provides all the values worth compressing on, column by column, writing the data to a table. To complement this, I've written (finished today) a front end (written in Python) that gets the values from Teradata and constructs a full, intact, Create Table statement.

The gathering of the data could take hours - we'll run that part overnight - but the construction of the output takes around 1 second!
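The Stored Procedure itself isn't posted, but the per-column analysis it describes boils down to a value-frequency query of this shape (a sketch, not the poster's code; the table, column, and 1000-row cutoff are illustrative):

SELECT COM_ID, COUNT(*) AS occurrences
FROM example_tab
GROUP BY 1
HAVING COUNT(*) > 1000
ORDER BY occurrences DESC;

Values that occur often enough to repay a dictionary entry become candidates for that column's COMPRESS list.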


Here's an example output:

CREATE MULTISET TABLE example_tab_NEW (
COM_ID CHAR ( 02 ) CHARACTER SET LATIN NOT NULL
COMPRESS ('WW'),
CON_ID CHAR ( 15 ) CHARACTER SET LATIN NOT NULL,
BUL_ID_ORIG CHAR ( 05 ) CHARACTER SET LATIN NOT NULL
COMPRESS ('EIN','SCL','ARH','ZRB','BRU','IST','KOR','RTM','ZWS','MAD','SIN','HKG','BCN','BT8',
'BH8','DX7','KUL','SP8','BIE','FRT','BSL','HAM','ZMU','MHN','ANR','LIS','BKK','NBE',
'CN2','TPY','SEL','MOW','QDU','TYO','GNO','HGH','DTM','NGB','BER','CP9','BUH','WAW',
'BLL','DUS','HA3','CN1','COL','LGE','VIE','SAO','BRE','STO','OK1','FBG','LYS','HMK',
'AGB','GVA','HNJ','DUB','ULM','SZX','SCN','VGN','JKT','JNB','BK2','BUD','CP1','WB4',
'CPH','SYD','FR1','LP1','CCU','WZB','CN3','BL1','SGE','ATH','SGN','MNL','MMA','BUE',
'QFC','MRS','YYZ','ZRH','EF3','LUG','NTE','MIL','HYD','GT2','LPZ','GLM','GOT','DRS',
'SBG','LIL','LUX','BRQ','LCY','BOG','CA3','BRT','KLE','ORB','ORK','CDG','LY9','TAO',
'TO1','CVT','HEL','TKU','KRK','KBZ','MEL'),
CON_CREATE_DT DATE NOT NULL
COMPRESS (date '2007-06-26',date '2007-06-27',date '2007-07-10',date '2007-06-28',
date '2007-07-11',date '2007-07-18',date '2007-07-12',date '2007-07-17',
date '2007-07-25',date '2007-06-29',date '2007-07-24',date '2007-07-19',
date '2007-07-31',date '2007-07-13',date '2007-07-20',date '2007-07-27',
date '2007-05-03',date '2007-08-01',date '2007-08-02',date '2007-08-07',
date '2007-08-08',date '2007-07-09',date '2007-07-23',date '2007-08-03',
date '2007-08-14',date '2007-08-09',date '2007-07-16',date '2007-07-30',
date '2007-08-16'),
CON_CREATE_TM SMALLINT NOT NULL,
CON_SRKY_ID INTEGER NOT NULL,
CON_WHOUSE_CD CHAR ( 05 ) CHARACTER SET LATIN NOT NULL
COMPRESS (''),
CON_DG_TRANMODE_CD CHAR ( 01 ) CHARACTER SET LATIN NOT NULL
COMPRESS (''),
CON_DUTIABLE_IN CHAR ( 01 ) CHARACTER SET LATIN NOT NULL
COMPRESS ('D'),
CON_COLL_DT DATE
COMPRESS (date '2007-06-27',date '2007-06-26',date '2007-06-28',date '2007-07-10',
date '2007-07-11',date '2007-07-18',date '2007-07-12',date '2007-07-17',
date '2007-07-25',date '2007-07-31',date '2007-06-29',date '2007-07-13',
date '2007-07-24',date '2007-07-19',date '2007-07-26',date '2007-07-20',
date '2007-07-23',date '2007-08-02',date '2007-08-01',date '2007-05-03',
date '2007-08-07',date '2007-07-09',date '2007-08-08',date '2007-07-16',
date '2007-08-03',date '2007-08-09',date '2007-08-14',date '2007-07-30',
date '2007-08-16'),
CON_COLL_TM TIME(0),
CON_COMM_DELIV_DT DATE
COMPRESS (date '2007-07-30',date '2007-07-23',date '2007-07-16',date '2007-07-02',
date '2007-08-06',date '2007-08-13',date '2007-08-20',date '2007-07-19',
date '2007-07-26',date '2007-07-13',date '2007-08-02',date '2007-07-20',
date '2007-07-17',date '2007-07-12',date '2007-06-29',date '2007-07-27',
date '2007-07-25',date '2007-07-24',date '2007-07-31',date '2007-08-03',
date '2007-08-01',date '2007-08-10',date '2007-08-09',date '2007-08-16',
date '2007-06-28',date '2007-08-07',date '2007-08-08',date '2007-08-17',
date '2007-07-11'),
CON_MAX_DEL_DT DATE
COMPRESS (date '2007-07-23',date '2007-07-16',date '2007-07-30',date '2007-07-02',
date '2007-08-06',date '2007-08-13',date '2007-08-20',date '2007-07-19',
date '2007-07-26',date '2007-08-02',date '2007-07-13',date '2007-07-20',
date '2007-07-12',date '2007-07-18',date '2007-07-17',date '2007-07-27',
date '2007-07-25',date '2007-07-24',date '2007-08-03',date '2007-07-31',
date '2007-08-16',date '2007-08-10',date '2007-08-01',date '2007-08-09',
date '2007-06-28',date '2007-08-07',date '2007-08-08',date '2007-08-17',
date '2007-07-11',date '2007-05-04',date '2007-05-03',date '2007-05-07',
date '2007-06-27',date '2007-07-10',date '2007-08-15',date '2007-07-03',
date '2007-05-02',date '2007-08-21',date '2007-04-30',date '2007-05-08',
date '2007-07-04',date '2007-05-01',date '2007-05-09',date '2007-05-10',
date '2007-04-27',date '2007-07-05',date '2007-08-23',date '2007-07-28',
date '2007-07-21',date '2007-08-04',date '2007-07-14',date '2007-06-30',
date '2007-08-11',date '2007-08-18',date '2007-08-05',date '2007-07-06',
date '2007-08-12',date '2007-07-29',date '2007-05-05'),
CON_MAX_DEL_TM TIME(0),
CND_NOTE_SRC_CD CHAR ( 02 ) CHARACTER SET LATIN NOT NULL
COMPRESS ('QS','IS','QL','EB','PS','EX','ED','BL','Z','MT','GH','DI','CR','SC','AC'),
CON_OPSA_TGRS_WT DECIMAL(9,3) NOT NULL
COMPRESS (.100,.500,1.000,.200,.300,.050,2.000,.120,.250,.150,.400,.080,3.000,.140,5.000),
CON_OPSC_TGRS_WT DECIMAL(9,3) NOT NULL
COMPRESS (.100,.500,1.000),
CON_OA_TOT_VL DECIMAL(7,3) NOT NULL
COMPRESS (.000),
CON_OC_TOT_VL DECIMAL(7,3) NOT NULL
COMPRESS (.000,.001,.002,.003,.005,.004,.006,.010,.008,.012,.007,.009,.011,.016,
.018,.024,.014,.013,.030,.015,.027,.017,.036,.019,.021,.040,.022,.033,
.025,.028,.048,.032,.026,.034,.060,.038,.072,.029,.031,.035,.045,.096,
.042,.037,.120,.054,.080,.043,.044,.064,.039,.041,.051,.075,.047,.056,
.055,.240,.144,.058,.046,.100,.192,.049,.059,.090,.288,.070,.053,.067,
.084,.384,.108,.063,.074,.125,.480,.068,.061,.066,.057,.085,.081,.069,
.091,.216,.073,.065,.576,.076,.077,.960,.071,.086,.768,.088,.082,.078,
.168,.336,.150,.079,.180,.087,.175,.360,.083,.160,.112,.128,.110,.200,
.101,.089,.162,.130,.140,.105,.095),
CON_OA_TITEM_QT DECIMAL(5) NOT NULL
COMPRESS (1.),
CON_OC_TITEM_QT DECIMAL(5) NOT NULL
COMPRESS (1.),
CON_VAL_OF_GDS_AM DECIMAL(13,2) NOT NULL
COMPRESS (.00),
CON_CHK_WT_IN CHAR ( 01 ) CHARACTER SET LATIN NOT NULL
COMPRESS ('Y'),
CON_CHK_VL_IN CHAR ( 01 ) CHARACTER SET LATIN NOT NULL
COMPRESS ('N'),
CON_DESP_DT DATE
COMPRESS (date '2007-06-27',date '2007-07-10',date '2007-06-26',date '2007-06-28',
date '2007-07-11',date '2007-07-18',date '2007-07-12',date '2007-07-31',
date '2007-07-25',date '2007-07-17',date '2007-06-29',date '2007-07-24',
date '2007-07-19',date '2007-07-13',date '2007-07-26',date '2007-07-20',
date '2007-08-01',date '2007-05-03',date '2007-08-02',date '2007-08-07',
date '2007-08-08',date '2007-07-23',date '2007-08-14',date '2007-08-03',
date '2007-07-09',date '2007-07-16',date '2007-08-09',date '2007-08-10',
date '2007-08-16'),
...... plus dozens of other columns
......
PRIMARY INDEX example_tab_UPI
( CON_SRKY_ID )
;

Subj: Problem with handling null values for Partition
From: Stieger, Etienne E

Good Day,

Hope someone can help.

We want to implement a solution to prevent inserts of duplicate records into a SET table that has a NUPI based on a single column. The table has a PK consisting of multiple columns (some of which are nullable for very specific business reasons).

The solution we had in mind is to ensure that the "where" clause of the update portion of the mload can handle nulls for nullable columns, as follows:

where Account_Num = :Account_Num (Dec(18,0), Format '999999999999999999.')
and coalesce(Account_Modifier_Num, -32768) = coalesce(:Account_Modifier_Num, -32768)
and coalesce(SB_Account_Open_Dt, date '9999-12-31') = coalesce(:SB_Account_Open_Dt, '9999-12-31') (Date, Format 'YYYY-MM-DD')
and coalesce(Balance_Category_Type_Cd, -32768) = coalesce(:Balance_Category_Type_Cd, -32768)
and coalesce(Time_Period_Cd, -32768) = coalesce(:Time_Period_Cd, -32768)
and coalesce(Account_Summary_Dt, date '9999-12-31') = coalesce(:Account_Summary_Dt, '9999-12-31') (Date, Format 'YYYY-MM-DD')

(Account_Num is not nullable, and does not need a coalesce.)

Using coalesce works fine for all the columns in the above example, except the partition column (in this case Account_Summary_Dt). I suspect any nullable columns that are part of the PI might also be affected, but that does not apply to this specific example, because Account_Num is the only PI column.

If the coalesce is used for the conditional test on the partition column, we get the following message:

3538: A MultiLoad UPDATE Statement is Invalid.

When we change:

and coalesce(Account_Summary_Dt, date '9999-12-31') = coalesce(:Account_Summary_Dt, '9999-12-31') (Date, Format 'YYYY-MM-DD')

back to:

Account_Summary_Dt = :Account_Summary_Dt (Date, Format 'YYYY-MM-DD')

then the error disappears, but we are not handling possible null values (on either side of the "=").

PS: Partition column as defined on the table:

PARTITION BY RANGE_N(Account_Summary_Dt BETWEEN
    DATE '2003-01-01' AND DATE '2003-12-31' EACH INTERVAL '1' MONTH,
    DATE '2004-01-01' AND DATE '2004-12-31' EACH INTERVAL '1' MONTH,
    DATE '2005-01-01' AND DATE '2005-12-31' EACH INTERVAL '1' MONTH,
    DATE '2006-01-01' AND DATE '2006-12-31' EACH INTERVAL '1' MONTH,
    DATE '2007-01-01' AND DATE '2007-12-31' EACH INTERVAL '1' MONTH,
    NO RANGE OR UNKNOWN);

Subj: Multiload error 3857 from BTEQ export file
From: Millar, Timothy

I'm trying to multiload a file that was created from a BTEQ export as follows:

.EXPORT DATA FILE=FILE1.TXT;

SELECT
 CAST (TheDate AS DATE FORMAT 'YYYY/MM/DD') AS RUNDATE
,CAST (TheTime AS VARCHAR(15)) AS RUNTIME
,'TMNGR' (VARCHAR(30)) AS SOURCE
....

MULTILOAD:

INSERT INTO EDW_CAP_MGMT.HEARTBEAT_HISTORY
(TheDate
,TheTime
,Source
,other columns....
)
VALUES
(
 CAST(:RUNDATE AS DATE FORMAT 'YYYY/MM/DD')
,CAST(:RUNTIME AS CHAR(15))
,CAST(:Source AS CHAR(30))
...
);

My problem comes in with the fact that the first column is still exported in TEXT format "YYYY/MM/DD" but multiload is looking for a DATE format. When I try to CAST the incoming data to DATE format, I get the 3857 error: "Cannot use value (or macro parameter :RUNDATE)".

Any suggestions?

Subj: Re: Multiload error 3857 from BTEQ export file
From: Geoffrey Rommel

You didn't include the .field commands from your Mload script. Since the date was exported as DATA, I would expect it to be in an integer form (4 bytes); in that case you would define it to Mload as ".field rundate date", and inserting it directly (without a CAST) should work. But maybe it is a 10-byte character string; in that case, you would define it as ".field rundate char(10)" and try inserting ":rundate (date, format 'YYYY/MM/DD')".

Subj: Re: Random Date Updates
From: Joseph D silva

(Original question: "What is the proper way to update a date column with random?")

Something I could quickly think of is to use the Teradata RANDOM(n,m) function to retrieve a date from the Sys_Calendar.Calendar view:

SELECT CALENDAR_DATE
FROM Sys_Calendar.Calendar
WHERE day_of_calendar = RANDOM(n, m)

Subj: Re: Random Date Updates
From: frank.c.martinez

I would think you'd rather pick a random offset from today. The system calendar has a maximum and minimum date:

SELECT MIN(calendar_date),
       MAX(calendar_date)
FROM Sys_Calendar.CALENDAR;

So something like this would not require "going out" to the calendar, and you can adjust the range of dates by adjusting n:

SELECT DATE + RANDOM(1, n) - n/2;

Calculate the date range and use startdate + random, e.g. for a date between 2000-01-01 and 2010-12-31:

date '2000-01-01' + random(0,4017)
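Applied back to the original question, that expression drops straight into an UPDATE (a sketch; the table and column names are illustrative). RANDOM is evaluated per row, so every row gets its own date, and the 0-4017 offset range covers the 4,018 days from 2000-01-01 through 2010-12-31 inclusive:

UPDATE my_tab
SET some_dt = DATE '2000-01-01' + RANDOM(0, 4017);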

Subj: Check for existing attributes in a table

Hope anyone can help me with my query. My requirement is such that from source tables I am getting fields acct_num, var1, var2, var3. I have to create a volatile table with a unique seq_num and finally insert into a target table a sequence number greater than the existing sequence number, after checking that the combination of var1, var2, var3 is not present in the target table.

The target table does not have the acct_id field, while the volatile table should have the acct_id field after picking it from the source table. Below is the concept we are using for it and attached is the actual code. In the code below, dim1 is my target table and stg_tab is my volatile table. The logic below works fine for a small number of rows, but when the number of rows increases we start getting spool space errors. Any better approaches?

select y.*,
       case when b_dim1_id is not null then b_dim1_id
            else max_dim1_id +
                 sum(case when prev_attr = curr_attr then 0 else 1 end)
                 over (order by curr_attr, prev_attr rows unbounded preceding)
       end new_key,
       case when b_dim1_id is not null then b_dim1_id
            else max_dim1_id +
                 sum(case when prev_attr = curr_attr then 0 else 1 end)
                 over (partition by b_dim1_id
                       order by curr_attr, prev_attr rows unbounded preceding)
       end new_key1
from
(
 select x.*,
        min(var1||':;'||var2||':;'||var3)
          over (order by var1, var2, var3
                rows between 1 preceding and 1 preceding) prev_attr,
        var1||':;'||var2||':;'||var3 curr_attr
 from
 (
  select a.*, b.dim_id b_dim1_id, b.var1 b_var1, b.var2 b_var2, b.var3 b_var3
  from
  (select a.*, c.*
   from prod_work.stg_tab a,
        (select max (dim_id) max_dim1_id from prod_work.dim1) c
  ) a
  left outer join
  prod_work.dim1 b
  on a.var1 = b.var1
  and a.var2 = b.var2
  and a.var3 = b.var3
 ) x
) y
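One direction worth testing against the spool problem (a sketch, not from the thread; it reuses the post's table and column names): number only the distinct new combinations with ROW_NUMBER, then join back to the staging rows, so the set the window function has to order is far smaller than the full staging table. As elsewhere in this thread, nullable vars would need COALESCE around the equality comparisons:

select s.*,
       coalesce(d.dim_id, m.max_dim1_id + n.rn) as new_key
from prod_work.stg_tab s
cross join (select max(dim_id) as max_dim1_id from prod_work.dim1) m
left outer join prod_work.dim1 d
  on s.var1 = d.var1 and s.var2 = d.var2 and s.var3 = d.var3
left outer join
 (select var1, var2, var3,
         row_number() over (order by var1, var2, var3) as rn
  from (select distinct var1, var2, var3
        from prod_work.stg_tab stg
        where not exists (select 1 from prod_work.dim1 t
                          where t.var1 = stg.var1
                            and t.var2 = stg.var2
                            and t.var3 = stg.var3)) dt) n
  on s.var1 = n.var1 and s.var2 = n.var2 and s.var3 = n.var3;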

Period data type support in Teradata MultiLoad

Article by lydiaxie on 28 Dec 2011

Tags: MultiLoad, period data type

The Period data type is supported by Teradata MultiLoad as of TTU 13.0. A period represents an interval of time: it indicates when some particular event starts and when it ends. It has a beginning bound and an (optional) ending bound (the period is open if there is no ending bound). The beginning bound is defined by the value of a beginning element and the ending bound by the value of an ending element. Those two elements of a period must be of the same type, which is one of the three DateTime data types: DATE, TIME, or TIMESTAMP.

A period is a value of a Period data type. Period data types are implemented internally as UDTs. However, the syntax and functions of Period data types closely follow the ANSI proposal, and Period data types appear to the user as system-defined data types.

The five new Period data types that were introduced to TTU 13.0 MultiLoad are:

PERIOD(DATE)
PERIOD(TIME[(n)])
PERIOD(TIME[(n)] WITH TIME ZONE)
PERIOD(TIMESTAMP[(n)])
PERIOD(TIMESTAMP[(n)] WITH TIME ZONE)
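For orientation, here is what a PERIOD column and its values look like in plain SQL (a sketch; the table and dates are illustrative):

create table project, fallback (
  id       integer,
  duration period(date));

insert into project(1, period(date '2011-01-01', date '2011-06-30'));
insert into project(2, period(date '2011-07-01', until_changed));

The second row is an open period: UNTIL_CHANGED stands in for the missing ending bound.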

How support for the PERIOD data type is implemented in MultiLoad

The FIELD command specifies a field of the input record to be sent to the Teradata Database. The FILLER command describes a named or unnamed field as filler, which is not sent to the Teradata Database. Prior to TTU 13.0, if the user defined a Period data type on the .FIELD command or .FILLER command, MultiLoad rejected the data type with an error message such as:

UTY0005 Bad data in the FIELD command at position 12, the name beginning with "PERIOD(TIME(2))" is not a valid data descriptor.

Starting from TTU 13.0, the new PERIOD data types with nested parentheses are supported on the .FIELD and .FILLER commands.

The TABLE command identifies a table whose column names and data descriptions are used as the names and data descriptions of fields of the input records. When the TABLE command is used and the table contains Period data types, the Teradata Database returns the type names as below:

PERIOD(DATE)                          PD
PERIOD(TIME)                          PT
PERIOD(TIME WITH TIME ZONE)           PZ
PERIOD(TIMESTAMP)                     PS
PERIOD(TIMESTAMP WITH TIME ZONE)      PM

MultiLoad must recognize those new PERIOD data types when obtaining the table schema from the Teradata Database in response to a .TABLE command. Then it can generate the correct FIELD commands to define the input records. MultiLoad recognizes the new data descriptor type codes and generates the correct data type phrases when building the USING modifier.

Internal representation

The maximum size of the PERIOD data is fixed:

PERIOD Data Type                        Field Size in Bytes
PERIOD(DATE)                            8
PERIOD(TIME(n))                         12
PERIOD(TIME(n) WITH TIME ZONE)          16
PERIOD(TIMESTAMP(n))                    20
PERIOD(TIMESTAMP(n) WITH TIME ZONE)     24

The DBS storage sizes in bytes differ between 32-bit and 64-bit DBS servers, but the actual field size returned to the client is the same on both platforms. PERIOD(DATE), PERIOD(TIME(n)) and PERIOD(TIME(n) WITH TIME ZONE) are fixed-length data types. However, PERIOD(TIMESTAMP[(n)]) and PERIOD(TIMESTAMP[(n)] WITH TIME ZONE) are variable-length data types, which means a 2-byte length indicator must precede the data. To support the variable-length types PERIOD(TIMESTAMP[(n)]) and PERIOD(TIMESTAMP[(n)] WITH TIME ZONE), MultiLoad processes FastLoad, Binary and Unformat data formats by looking at the 2-byte length indicator preceding the data.

The internal representation of PERIOD data does not consist of two strings. The detailed representation for each PERIOD data type is as follows:

PERIOD(DATE) contains two DATE elements.
DATE type: 4-byte signed integer.
Total: 4 * 2 = 8 bytes.

PERIOD(TIME(n)) contains two TIME elements.
TIME type: Second: 4-byte signed integer; Hour: 1 unsigned byte; Minute: 1 unsigned byte.
Total: (4+1+1) * 2 = 12 bytes.

PERIOD(TIME(n) WITH TIME ZONE) contains two TIME WITH TIME ZONE elements.
TIME WITH TIME ZONE type: Second: 4-byte signed integer; Hour: 1 unsigned byte; Minute: 1 unsigned byte; Time Zone Hour: 1 unsigned byte; Time Zone Minute: 1 unsigned byte.
Total: (4+1+1+1+1) * 2 = 16 bytes.

PERIOD(TIMESTAMP(n)) contains two TIMESTAMP elements.
TIMESTAMP type: Second: 4-byte signed integer; Year: 2-byte signed short integer; Month, Day, Hour, Minute: 1 unsigned byte each.
Total: 2 + (4+2+1+1+1+1) * 2 = 22 bytes (including the 2-byte length indicator).

PERIOD(TIMESTAMP(n) WITH TIME ZONE) contains two TIMESTAMP WITH TIME ZONE elements.
TIMESTAMP WITH TIME ZONE type: Second: 4-byte signed integer; Year: 2-byte signed short integer; Month, Day, Hour, Minute, Time Zone Hour, Time Zone Minute: 1 unsigned byte each.
Total: 2 + (4+2+1+1+1+1+1+1) * 2 = 26 bytes (including the 2-byte length indicator).

Restrictions

PERIOD data is always exported as two consecutive integer values in all response modes other than field mode. So MultiLoad processes PERIOD data as a binary structure if the user provides the data as PERIOD data types by defining them as such in the LAYOUT; it cannot be processed as TEXT format. It is recommended to specify TEXT format for character data only; it doesn't make sense to process binary data as TEXT.

Alternatively, the user can supply the data and define the record as CHAR type, and the Teradata Database will cast from CHAR to the appropriate PERIOD data types. In that case MultiLoad processes the data as the pre-existing CHAR data type, and no special handling is needed on the client side. Refer to the sample job script below for loading character data into PERIOD data columns.

Sample job scripts

Specifying UNFORMAT input records on the .FIELD command

1. Run a FastExport job to generate a period data file using UNFORMAT:

.dateform ansidate;
.logtable PD_fe_log;
.logon tdpid/xxxx,xxxx;

drop table datatbl;
create table datatbl, fallback (
  c1 integer,
  c2 period(timestamp with time zone));

insert into datatbl(1, period(timestamp '2005-02-03 11:11:11-08:00', timestamp '2005-02-03 13:12:12-08:00'));
insert into datatbl(2, period(timestamp '2006-02-03 13:12:12-08:00', timestamp '2006-02-03 14:12:12-08:00'));
insert into datatbl(3, period(timestamp '2007-02-03 14:12:12-08:00', timestamp '2007-02-03 15:12:12-08:00'));

.begin export;
sel * from datatbl;
.export outfile PM_data format unformat mode record;
.end export;
.logoff;

2. Run a MultiLoad job to load PERIOD data to the Teradata Database:

.dateform ansidate;
.logtable testlog;
.logon tdpid/xxxx,xxxx;

DROP TABLE test_TABLE;
DROP TABLE wt_test_TABLE;
DROP TABLE et_test_TABLE;
DROP TABLE uv_test_TABLE;

CREATE TABLE test_TABLE, fallback(
  FIELD1 INTEGER,
  FIELD2 period(timestamp with time zone))
  UNIQUE PRIMARY INDEX (Field1);

.BEGIN import mload tables test_TABLE
  worktables wt_test_TABLE
  errortables et_test_TABLE uv_test_TABLE;

.layout lay1;
.FIELD f1 * integer;
.FIELD f2 * period(timestamp with time zone);

.dml label label1;
INSERT into test_TABLE VALUES (:f1,:f2);

.import infile PM_data layout lay1 apply label1;
.end mload;
.logoff;

Using the .TABLE command to generate the data layout

1. Run a BTEQ job to generate PERIOD data:

.logon tdpid/xxxx,xxxx;

drop table datatbl;
create table datatbl, fallback (
  c1 integer,
  c2 period(date) );

insert into datatbl(1, period(date '2005-02-03', date '2006-02-04'));
insert into datatbl(2, period(date '2006-02-03', date '2007-02-04'));
insert into datatbl(3, period(date '2007-02-03', date '2008-02-04'));

.export data file=PD_data;
sel * from datatbl;
.export reset;
.logoff;

2. Run a MultiLoad job to load PERIOD data:

.logtable testlog;
.logon tdpid/xxxx,xxxx;

DROP TABLE test_TABLE;
DROP TABLE wt_test_TABLE;
DROP TABLE et_test_TABLE;
DROP TABLE uv_test_TABLE;

CREATE TABLE test_TABLE, fallback(
  FIELD1 INTEGER,
  FIELD2 period(date))
  UNIQUE PRIMARY INDEX (Field1);

.BEGIN import mload tables test_TABLE
  worktables wt_test_TABLE
  errortables et_test_TABLE uv_test_TABLE;

.layout lay1;
.table test_TABLE;

.dml label label1;
INSERT into test_TABLE.*;

.import infile PD_data layout lay1 apply label1;
.end mload;
.logoff;


Load CHARACTER data records to columns defined as Period data types

1. Run a BTEQ job to generate a character data file:

.logon tdpid/xxxx,xxxx;

drop table datatbl;
drop table datatbl1;

create table datatbl, fallback (
  c1 integer,
  c2 char(36),
  c3 char(28),
  c4 char(32),
  c5 char(62));

insert into datatbl
(1,
 '(''12:12:12.123'', ''13:12:12.123'')',
 '(''12:12:12'', ''13:12:12'')',
 '(''2005-02-03'', ''2006-02-04'')',
 '(''2005-02-03 11:11:11-08:00'', ''2005-02-03 13:12:12-08:00'')');

insert into datatbl
(2,
 '(''13:12:12.123'', ''14:12:12.123'')',
 '(''13:12:12'', ''14:12:12'')',
 '(''2006-02-03'', ''2007-02-04'')',
 '(''2006-02-03 11:11:11-08:00'', ''2006-02-03 13:12:12-08:00'')');

insert into datatbl
(3,
 '(''14:12:12.123'', ''15:12:12.123'')',
 '(''14:12:12'', ''15:12:12'')',
 '(''2007-02-03'', ''2008-02-04'')',
 '(''2007-02-03 11:11:11-08:00'', ''2007-02-03 13:12:12-08:00'')');

.export data file=Period_char_data;
sel * from datatbl;
.export reset;
.logoff;

2. Run a MultiLoad job to load the character data to PERIOD type columns; the Teradata Database will cast from CHAR to the appropriate PERIOD data types:

.logtable testlog;
.logon tdpid/xxxx,xxxx;

DROP TABLE test_TABLE;
DROP TABLE wt_test_TABLE;
DROP TABLE et_test_TABLE;
DROP TABLE uv_test_TABLE;

CREATE TABLE test_TABLE, fallback(
  FIELD1 INTEGER,
  FIELD2 period(time(3)),
  FIELD3 period(time(0)),
  FIELD4 period(date),
  FIELD5 period(timestamp with time zone))
  UNIQUE PRIMARY INDEX (Field1);

.BEGIN import mload tables test_TABLE
  worktables wt_test_TABLE
  errortables et_test_TABLE uv_test_TABLE;

.layout lay1;
.FIELD f1 * integer;
.FIELD f2 * char(36);
.FIELD f3 * char(28);
.FIELD f4 * char(32);
.FIELD f5 * char(62) nullif f4='(''2006-02-03'', ''2007-02-04'')';

.dml label label1;
INSERT into test_TABLE VALUES (:f1,:f2,:f3,:f4, :f5);

.import infile Period_char_data layout lay1 apply label1;
.end mload;
.logoff;

Summary

MultiLoad's Period support comes down to four pieces:

- Defining PERIOD data types on the .FIELD and .FILLER commands
- Period data type support for the .TABLE command
- Generating the correct USING clause for PERIOD data types
- Handling PERIOD data types

The iBatis (MyBatis) Stored Procedure Wizard allows you to right-click on a Stored Procedure in the Teradata plug-in for Eclipse and quickly create a Web service.

The iBatis Stored Procedure Wizard wraps a Stored Procedure into an iBatis or MyBatis SQL Map. The generated SQL Map can then be used to create a Web service, or it can be used to create a Java application that uses the iBatis or MyBatis frameworks.

Prerequisite for this Article

If you have not worked through the article Create a Teradata Project using the Teradata Plug-in for Eclipse, do so now before you continue. Once you know how to produce a Teradata Project, make a Teradata project called ProductPrj.

You will also need some database objects before you can start this tutorial. First create the following products table using either the SQL Editor or the Stored Procedure creation dialog from the Teradata Plug-in for Eclipse.

CREATE MULTISET TABLE guest.products ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      id INTEGER NOT NULL,
      description VARCHAR(255) CHARACTER SET LATIN CASESPECIFIC,
      price DECIMAL(15,2),
      PRIMARY KEY ( id ));

Now create a Stored Procedure using the products table with the following DDL:

CREATE PROCEDURE "guest"."getProduct" (

        IN "id" INTEGER,

        OUT "description" VARCHAR(256),

        OUT "price" DECIMAL(10 , 2))

BEGIN

Select

    price, description into :price, :description

from guest.products where id=:id;

END;
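Before wrapping the procedure, you can sanity-check it from any SQL client (a sketch; it assumes a product row with id 1 exists and names the OUT parameters as placeholders):

CALL "guest"."getProduct"(1, description, price);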

Launch Wizard

The Wizard is launched from the DTP Data Source Explorer by right-clicking on a Stored Procedure tree node in the explorer and selecting the "Create iBatis(MyBatis) SQL Map..." menu option.

Stored Procedure Selection

Once the Wizard is launched, the Stored Procedure Selection Wizard page will come up. This page shows the selected schema and procedure for the Wizard.

Create iBatis SQL Mapping XML File

The next page of the Wizard is the Create iBatis SQL Mapping XML File Wizard page. This page lets you define the location and name of the iBatis SQL mapping file, and the mapping name for the selected Stored Procedure. The option of appending the mapping to an existing file is the default. You will need to select the option Launch iBatis DAO with Web Services Wizard if you want to create a Web service directly after you have created a SQL Map for your stored procedure.

Domain Objects Source Location

The next page of the Wizard is the Domain Objects Source Location page. This page lets you define the location and package name of the domain object to be used as the result map for an SQL mapping.

Edit Classes

The next page is the Edit Classes Wizard page. This page lets you rename and edit the properties of the classes which will be created by the Wizard. It shows the parameter class and any result set classes that have been derived from the Stored Procedure. The default class names can be changed to names that make sense for your application; in this case change the Parameters class to Product. You should notice that the members of the Parameter class correspond to the parameters of the Stored Procedure.

Generated Code

Once all of the required information is entered into the Wizard, the Finish button can be selected and the SQL Map is generated. The SQL Map contains a resultMap for the parameter class and a SQL statement to call the Stored Procedure. The Stored Procedure being executed has id as an "in" parameter and has description and price as "out" parameters.

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE mapper PUBLIC "-//mybatis.org//DTD Mapper 3.0//EN"
"http://mybatis.org/dtd/mybatis-3-mapper.dtd">
<mapper namespace="repository.ProductMap">

    <!-- Define object mapping -->
    <resultMap type="domain.Product" id="Product">
        <result column="id" jdbcType="INTEGER"
                javaType="java.lang.Integer" property="id" />
        <result column="description" jdbcType="VARCHAR"
                javaType="java.lang.String" property="description" />
        <result column="price" jdbcType="DECIMAL"
                javaType="java.math.BigDecimal" property="price" />
    </resultMap>

    <!-- Define procedure SQL statement -->
    <select id="getProduct" parameterType="domain.Product"
            statementType="CALLABLE">
    call "guest"."getProduct"(#{id,mode=IN, jdbcType=INTEGER},
        #{description,mode=OUT, jdbcType=VARCHAR},
        #{price,mode=OUT, jdbcType=OTHER,
          typeHandler=com.teradata.commons.mybatis.extensions.NumberHandler})
    </select>

</mapper> <!-- Do not edit or add anything below this comment -->

iBatis (MyBatis) DAO with Web Services Wizard

The iBatis (MyBatis) DAO with Web Services Wizard will be launched if the option was selected from the iBatis Stored Procedure Wizard. This Wizard will create a DAO and a Web service derived from the generated SQL Map.

iBatis DAO Definition

The first page of the Wizard defines the new DAO and options to create a Web Service. Select the following options:

Create WSDL
Create Web Service
Save Password

Now hit the Next button:

iBatis DAO Methods

The iBatis DAO Methods Wizard page allows you to select which SQL actions from your iBatis Map file are to be used in your Web service. You can change your return type from returning a single result set object to returning a list instead. Once you hit the Next button, your DAO and Web service definition files will be created.

Web service Creation

The next page is the standard WTP Web services Wizard. Set your client to test. Once you hit the Finish button, your stubs and skeletons will be created for your Web service. The implementation stub will be modified to use the new DAO you just created.

Web service Client

The Web service client will come up ready to use and connected to your Teradata database, through the Web service implementation. The generated client will show all of the members of the Parameter class, but you are only required to enter the id because it is the "in" parameter of the Stored Procedure. The results will show all of the parameters of the stored procedure: id and the two "out" parameters, description and price.

iBatis (MyBatis) Macro Wizard

A similar Wizard is the iBatis (MyBatis) Macro Wizard. This Wizard wraps a Macro into an iBatis or MyBatis SQL Map. The newly generated SQL Map can then be used to create a Web service as described above, or it can be used to create a Java application that uses the iBatis or MyBatis frameworks. The Wizard is launched from a Macro tree node in the DTP Data Source Explorer.

Conclusion

Both the iBatis (MyBatis) Stored Procedure and Macro Wizards are easy to use because parameter and result classes are derived from the selected Stored Procedure or Macro. The Wizards generate DAOs and functional applications you can start using right away. You can now get a head start on your application development by leveraging the Stored Procedures and Macros you have already developed.

Extract and Analyse Database Object Dependencies

Blog entry by ulrich on 01 Dec 2011

Tags: database object dependency

Sometimes we want or need to know which database objects are sourced from a specific other database object like a table. In a perfect environment we would simply ask our perfect metadata management tool, which knows everything about data lineage. Unfortunately we are sometimes exposed to a non-optimal environment where the metadata management is incomplete or does not exist.

I just want to show some possibilities to get at least some info out of the DB system - if you have the appropriate rights to run the queries.

Going top down is easy in Teradata. Just place a SHOW in front of any SELECT or EXEC and you get all database object DDLs which are related to this statement. In some cases you can be exposed to the situation that the DDL is incomplete, but a direct SHOW of the DB object in question should overcome this.

Unfortunately this does not answer the bottom-up question: "Which database objects like views or macros are accessing the table X?". The problem is that views can be built on views on views. The Teradata Administrator has a function to show the references of a database object; the problem here is that it shows only the direct references, not all nested ones. Still, it is a good starting point. DBQL reveals the following SQL statement being used for the reference lookup:

SELECT DatabaseName,TVMName,TableKind AS "Type"
FROM dbc.TVM T,dbc.dbase D
WHERE D.DatabaseId=T.DatabaseId
 AND CreateText LIKE '%"DBC"."TVM"%' (NOT CS)
UNION
SELECT DatabaseName,TVMName,TableKind AS "Type"
FROM dbc.TextTbl X,dbc.dbase D,dbc.TVM T
WHERE X.TextType='C'
 AND X.TextString LIKE '%"DBC"."TVM"%' (NOT CS)
 AND X.DatabaseId=D.DatabaseId
 AND X.TextId=T.TVMId
MINUS
SELECT DatabaseName,TVMName,TableKind
FROM dbc.TVM T,dbc.dbase D
WHERE D.DatabaseId=T.DatabaseId
 AND DatabaseName='DBC'
 AND TVMName='TVM'
ORDER BY 1,2;

I have to admit I don't understand all of this - like TEXTTYPE = 'C' - but we can use it as a kind of black-box logic. The SQL below does the same for all DB objects in dbc.tables:

CREATE VOLATILE TABLE OBJ_DEPENDENCY
AS
(
SELECT CAST(TA.DATABASENAME AS VARCHAR(30)) AS SOURCE_DB,
       CAST(TA.TABLENAME AS VARCHAR(30)) AS SOURCE_OBJ,
       TA.TABLEKIND AS SOURCE_OBJ_KIND,
       D.DATABASENAME AS TARGET_DB,
       T.TVMNAME AS TARGET_OBJ,
       T.TABLEKIND AS TARGET_OBJ_KIND
FROM DBC.TVM T,
     DBC.DBASE D,
     DBC.TABLES TA
WHERE D.DATABASEID = T.DATABASEID
  AND T.CREATETEXT LIKE '%"' !! TRIM (TA.DATABASENAME) !! '"."' !! TRIM (TA.TABLENAME) !! '"%' (NOT CS)
UNION
SELECT TA.DATABASENAME AS SOURCE_DB,
       TA.TABLENAME AS SOURCE_OBJ,
       TA.TABLEKIND AS SOURCE_OBJ_KIND,
       D.DATABASENAME AS TARGET_DB,
       T.TVMNAME AS TARGET_OBJ,
       T.TABLEKIND AS TARGET_OBJ_KIND
FROM DBC.TEXTTBL X,
     DBC.DBASE D,
     DBC.TVM T,
     DBC.TABLES TA
WHERE X.TEXTTYPE='C'
  AND X.TEXTSTRING LIKE '%"' !! TRIM (TA.DATABASENAME) !! '"."' !! TRIM (TA.TABLENAME) !! '"%' (NOT CS)
  AND X.DATABASEID=D.DATABASEID
  AND X.TEXTID=T.TVMID
MINUS
SELECT TA.DATABASENAME AS SOURCE_DB,
       TA.TABLENAME AS SOURCE_OBJ,
       TA.TABLEKIND AS SOURCE_OBJ_KIND,
       D.DATABASENAME AS TARGET_DB,
       T.TVMNAME AS TARGET_OBJ,
       T.TABLEKIND AS TARGET_OBJ_KIND
FROM DBC.TVM T,
     DBC.DBASE D,
     DBC.TABLES TA
WHERE D.DATABASEID=T.DATABASEID
  AND D.DATABASENAME= TA.DATABASENAME
  AND T.TVMNAME= TA.TABLENAME
) WITH DATA
PRIMARY INDEX (SOURCE_DB,SOURCE_OBJ)
ON COMMIT PRESERVE ROWS;

As this SQL uses product joins to dbc.tables, you might see some longer response times in bigger (number of DB objects) production environments. In case you hit serious performance problems, use a permanent table and create a macro which has a databasename as input parameter, and run it for one DB at a time.
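Such a macro might look like this (a sketch under stated assumptions: OBJ_DEPENDENCY_P is a hypothetical permanent copy of the table above, and only the first of the three query branches is shown - the other two would be restricted the same way):

CREATE MACRO OBJ_DEP_FOR_DB (dbname VARCHAR(30)) AS (
  INSERT INTO OBJ_DEPENDENCY_P
  SELECT CAST(TA.DATABASENAME AS VARCHAR(30)),
         CAST(TA.TABLENAME AS VARCHAR(30)),
         TA.TABLEKIND,
         D.DATABASENAME,
         T.TVMNAME,
         T.TABLEKIND
  FROM DBC.TVM T, DBC.DBASE D, DBC.TABLES TA
  WHERE D.DATABASEID = T.DATABASEID
    AND TA.DATABASENAME = :dbname
    AND T.CREATETEXT LIKE '%"' !! TRIM(TA.DATABASENAME) !! '"."' !! TRIM(TA.TABLENAME) !! '"%' (NOT CS);
);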

I found some self-references in ETL-tool-generated DB objects which created some problems in a later analysis. Therefore we should ensure that the table does not contain self-referencing records by deleting them:

DELETE
FROM OBJ_DEPENDENCY
WHERE TRIM (SOURCE_DB) !! TRIM(SOURCE_OBJ) = TRIM (TARGET_DB) !! TRIM(TARGET_OBJ);

The table OBJ_DEPENDENCY contains all object dependencies which can be derived via this approach from DBC. It is important to run this for the whole system, as otherwise the analysis will be incomplete.

Feeding this info into some external program which is able to visualise graphs, we can start to get a better understanding of our database objects. The picture below shows the object dependencies of the R13.10 DBC database (all non-DBC objects are excluded here). The different colours indicate the different object types. Arrows show the relation direction. The placement / layout of the DB objects is chosen by the tool and has no meaning. The centre is quite messy and would require a much bigger area to plot to see all the details.

[Figure: graph plot of the R13.10 DBC database object dependencies]

The picture itself has some appeal but also shows some valuable insights, like basic structures of the DBC components such as the ResUsageSpma complex, which is shown below in a higher resolution.

Having this kind of visualisation tool available is nice, but we can do similar things with SQL in the DB. The following recursive SQL addresses the starting question of showing all dependent objects for a given DB or DB object:

WITH RECURSIVE DEPENDENT
( SOURCE_DB,
  SOURCE_OBJ,
  SOURCE_OBJ_KIND,
  DEPENDENT_DB,
  DEPENDENT_OBJ,
  DEPENDENT_OBJ_KIND,
  DEPENDENCY_LEVEL
)
AS
(
SELECT SOURCE_DB,
       SOURCE_OBJ,
       SOURCE_OBJ_KIND,
       TARGET_DB AS DEPENDENT_DB,
       TARGET_OBJ AS DEPENDENT_OBJ,
       TARGET_OBJ_KIND AS DEPENDENT_OBJ_KIND,
       CAST(1 AS SMALLINT) AS DEPENDENCY_LEVEL
FROM OBJ_DEPENDENCY
UNION ALL
SELECT D.SOURCE_DB,
       D.SOURCE_OBJ,
       D.SOURCE_OBJ_KIND,
       O.TARGET_DB AS DEPENDENT_DB,
       O.TARGET_OBJ AS DEPENDENT_OBJ,
       O.TARGET_OBJ_KIND AS DEPENDENT_OBJ_KIND,
       D.DEPENDENCY_LEVEL + 1 AS DEPENDENCY_LEVEL
FROM OBJ_DEPENDENCY O
     JOIN
     DEPENDENT D
        ON O.SOURCE_DB = D.DEPENDENT_DB
           AND O.SOURCE_OBJ = D.DEPENDENT_OBJ
           AND D.DEPENDENCY_LEVEL <= 100
)
SELECT *
FROM DEPENDENT
ORDER BY SOURCE_DB,
         SOURCE_OBJ,
         SOURCE_OBJ_KIND,
         DEPENDENCY_LEVEL;

This query returns, for all DB objects, the dependent objects with the level of dependency. A dependency_level of 1 indicates that the dependent object accesses the source object directly; 2 indicates an indirect access via a different object. For example, DBC.ResOneNode accesses DBC.ResUsageSpma on level 2 via the view DBC.ResGeneralInfoView. Be aware that a source object can occur more than once in the result set with different dependency levels.
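To get back to the opening question for a single object, the final SELECT of the recursive statement can be swapped for a filtered one (a sketch; ResUsageSpma is simply the example used above):

SELECT *
FROM DEPENDENT
WHERE SOURCE_DB = 'DBC'
  AND SOURCE_OBJ = 'ResUsageSpma'
ORDER BY DEPENDENCY_LEVEL;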

Excel can be a good place to use this data for further analysis.

Analysing this data can give you some interesting insights about your applications and designs. For example, estimate the maximum value of your dependency_level before you query your system; if the highest number is higher than expected, it might be worth checking the design. It is not possible to judge the quality of a database design based on only one number, but check for yourself whether you can explain the business logic if you access a database object which goes down to other objects on levels greater than 5, for example.

Be aware that we do not get any information about the type of dependency. If a view is accessing a table, we don't know if this is "only" for referential integrity checking, or filtering, or if columns are selected in the view.

So have fun analysing your database object dependencies!

DISCLAIMER:

As we use here a black-box SQL derived from Teradata Administrator as a starting point, we can't be sure that we get correct and complete results. Neither can we be sure that the analysis will work with the next DB release. It would be nice if Teradata would give us, and support, a system view with this info in a future release. Maybe some bigger customer wants to request this.

Appendix

Mathematica code

As some have asked, attached is the code which produced the graph plot of the object dependency. GraphPlot is the Mathematica-internal function we use - therefore the code is quite limited. I assume you have managed to configure the Mathematica JDBC settings and have established a JDBC connection to your TD system.

The start would be to download the OBJ_DEPENDENCY content via

dat = SQLExecute[conn,
  "select *
   from obj_dependency
   where source_db = 'DBC'
         and target_DB = 'DBC'
   order by 1,2,4,5"];

To plot graphs, the relations need to be in the form A → B. Therefore you create this kind of relations via

dep = Flatten[
   Map[{#[[3]] <> " " <> #[[1]] <> "." <> #[[2]] ->
        #[[6]] <> " " <> #[[4]] <> "." <> #[[5]]} &, dat], 1];

Graph[dep] would already produce the following picture.

[Figure: the raw Graph[dep] rendering]

This is already nice - the rest is just making it a bit nicer. We need the list of object types in the result set:

objTypes = StringTrim[Union[dat[[All, 3]], dat[[All, 6]]]]

For the color definition we define the following function

getColor[
  word_String,
  words_List,
  colorschema_String
  ] := Module[{number},
  number = N[(Position[words, word][[1]] - 1)/(Length[words] - 1)][[1]];
  ColorData[colorschema][number]
  ]
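For instance, getColor["V", objTypes, "BrightBands"] returns the colour assigned to views (a sketch; it assumes "V", the TableKind for views, actually occurs in objTypes, since Position fails on an unknown word).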

For the vertex coloring and shape we define the following function:

plotVertex[position_List, text_String] := Module[{},
   {
    White,
    Text[
     Framed[
      Style[text, 12],
      FrameStyle -> Black,
      Background -> getColor[StringTake[text, 1], objTypes, "BrightBands"],
      RoundingRadius -> 5],
     position
     ]
    }
   ];

The final graph plot is then:

GraphPlot[
  dep,
  DirectedEdges -> True,
  VertexLabeling -> True,
  VertexRenderingFunction -> plotVertex,
  PackingMethod -> "ClosestPackingCenter",
  ImageSize -> {6000, 4000}
  ]

Teradata Columnar

Blog entry by PaulSinclair on 27 Sep 2011

Tags: autocompression, column partitioning, column-storage, columnar, row partitioning, row-storage

Teradata 14.0 introduces Teradata Columnar - a new option to organize the data of a user-defined table or join index on disk.

Teradata Columnar offers the ability to partition a table or join index by column. It introduces column-storage as an alternative choice to row-storage for a column partition, and autocompression. Column partitioning can be used alone in a single-level partitioning definition or with row partitioning in a multilevel partitioning definition.

Teradata Columnar is a new paradigm for partitioning, storing data, and compression that changes the cost-benefit tradeoffs of the available physical database design choices and their combinations. Teradata Columnar provides a benefit to the user by reducing I/O for certain classes of queries while at the same time decreasing space usage.

A column-partitioned (CP) table or join index has several key characteristics:

1. It does not have a primary index (a future blog entry will discuss why).
2. Each of its column partitions can be composed of a single column or multiple columns.
3. Each column partition usually contains multiple physical rows. Physical rows are disk-based structures that the file system manages based on rowids.
4. A new physical row format COLUMN may be utilized for a column partition; such a physical row is called a container. This is used to implement column-storage, row header compression, and autocompression for a column partition. This provides a compact way to store a series of column partition values.
5. Alternatively, a column partition may have physical rows with ROW format that are used to implement row-storage; such a physical row is called a subrow. Each column partition value is in its own physical row. Usually a subrow is wide (multicolumn, large character strings, etc.) where the row header overhead for each column partition value is insignificant and having each column partition value in its own physical row provides more direct access to the value.
6. A CP table is just another type of table that can be accessed by a query. A single query can access multiple kinds of tables.

Learn more about Teradata Columnar at the 2011 Teradata PARTNERS User Group Conference & Expo: Rows vs. Columns? Why not Both? - Tuesday, October 4, 3:00 PM, Exhibition Hall F

PPI stands for partitioned primary index, which means the table has a primary index and the rows are partitioned on the AMPs (and within a partition, the rows are ordered by a hash of the primary index columns).

A CP table is not a PPI table, since a CP table doesn't have a primary index. But a CP table can have RANGE_N and CASE_N row partitioning (the kind of partitioning of rows that is used in PPI); since there is no primary index, the rows within a row partition are not ordered by a hash of some columns of each row -- they are just in insert order.

A CP table could have a join index on it where the join index does have a primary index (but not column partitioning). A PI or PPI table could have a join index on it where the join index has column partitioning (but not a primary index) plus optionally one or more levels of row partitioning. A NoPI table can't have row partitioning unless it also has column partitioning.

For example:

CREATE TABLE SALES (
    TxnNo     INTEGER,
    TxnDate   DATE,
    ItemNo    INTEGER,
    Quantity  INTEGER )
  PARTITION BY COLUMN,
  UNIQUE INDEX (TxnNo);

This creates a column-partitioned (CP) table that partitions the data of the table vertically. Each column is in its own column partition that is stored using column-storage with row header compression and autocompression. All the data for TxnNo comes first, followed by the data for TxnDate, followed by the data for ItemNo, and then the data for Quantity. Note that a primary index is not specified, so this is a NoPI table. Moreover, a primary index must not be specified if the table is column partitioned.

The following adds a level of row partitioning (so the table has multilevel partitioning). All the data for TxnNo for the first day comes first, followed by the next day of data for TxnNo, etc.; then all the data for TxnDate for the first day, the second day, etc., ending with the last day of data for Quantity.

CREATE TABLE SALES (
    TxnNo     INTEGER,
    TxnDate   DATE,
    ItemNo    INTEGER,
    Quantity  INTEGER )
  PARTITION BY (
      COLUMN,
      RANGE_N(TxnDate BETWEEN
          DATE '2011-01-01' AND DATE '2011-12-31' EACH INTERVAL '1' DAY) ),
  UNIQUE INDEX (TxnNo);
200