Content
Recap and Q&A
Data Flow Transformations
Synchronous vs Asynchronous Transformations
Row Transformations
◦ Demo: Character Map
◦ Demo: Copy Column
◦ Demo: Data Conversion
◦ Demo: Derived Column
◦ Demo: Export Column
◦ Demo: OLE DB Command
Rowset Transformations
◦ Demo: Aggregate
◦ Demo: Sort
◦ Demo: Pivot
◦ Demo: Unpivot
◦ Demo: Percentage Sampling
◦ Demo: Row Sampling
Summary
@copyright 2014 ([email protected])
Recap and Q&A
Data Flow Task
Pipeline Architecture
Data Sources
◦ Demo: ADO.NET Source
◦ Demo: Excel Source
◦ Demo: Flat File Source
◦ Demo: OLE DB Source
◦ Demo: XML Source
◦ Demo: Raw File Destination
◦ Demo: Raw File Source
Data Destinations
◦ Demo: OLE DB Destination
◦ Demo: DataReader Destination
◦ Demo: Excel Destination
◦ Demo: Flat File Destination
◦ Demo: SQL Server Destination
Analysis Services Destinations
◦ Demo: Dimension Processing
◦ Demo: Partition Processing
Data Flow Transformations
These are the components that aggregate, merge, distribute, and modify data.
By how they process rows, Data Flow Transformations are classified into two types:
◦ Type 1: Synchronous Transformations
◦ Type 2: Asynchronous Transformations
By function, Data Flow Transformations are broadly categorized as:
◦ Row Transformations
◦ Rowset Transformations
◦ Split and Join Transformations
◦ Business Intelligence Transformations
◦ Auditing Transformations
◦ Custom Transformations
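The difference between the two types can be illustrated outside SSIS. The sketch below is only an analogy in Python, not SSIS code: a synchronous-style step emits one output row for each input row as it arrives, while an asynchronous (blocking) step such as a sort must read all of its input before it can emit anything. The function names and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def uppercase_name(rows: Iterable[Row]) -> Iterator[Row]:
    """Synchronous style: each input row yields one output row immediately."""
    for row in rows:
        yield {**row, "name": str(row["name"]).upper()}


def sort_by_name(rows: Iterable[Row]) -> Iterator[Row]:
    """Asynchronous (blocking) style: all input is buffered before any output."""
    buffered: List[Row] = list(rows)  # consumes the whole input first
    buffered.sort(key=lambda r: str(r["name"]))
    yield from buffered


source = [{"name": "bob"}, {"name": "alice"}]
result = list(sort_by_name(uppercase_name(source)))
```

Here `uppercase_name` plays the role of a row transformation and `sort_by_name` the role of a rowset transformation.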
Row Transformations
These transformations update column values or create new columns. Each one is applied to every row in the pipeline input.
Character Map (Demo)
The transformation that applies string functions to character data.
The following character mappings are available:
◦ Lowercase: changes all characters to lowercase
◦ Uppercase: changes all characters to uppercase
◦ Byte reversal: reverses the byte order of each character
◦ Hiragana: maps Katakana characters to Hiragana characters
◦ Katakana: maps Hiragana characters to Katakana characters
◦ Half width: changes double-byte characters to single-byte characters
◦ Full width: changes single-byte characters to double-byte characters
◦ Linguistic casing: applies linguistic casing rules instead of system casing rules
◦ Simplified Chinese: maps traditional Chinese to simplified Chinese
◦ Traditional Chinese: maps simplified Chinese to traditional Chinese
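A few of these mappings can be approximated in plain Python to show what they do to a string. This is a rough sketch, not how SSIS implements them: the helper names are hypothetical, and the Katakana and full-width conversions rely on fixed Unicode code point offsets, which covers only the common character ranges.

```python
def katakana_to_hiragana(text: str) -> str:
    """Map Katakana U+30A1..U+30F6 to Hiragana by subtracting 0x60."""
    return "".join(
        chr(ord(ch) - 0x60) if 0x30A1 <= ord(ch) <= 0x30F6 else ch
        for ch in text
    )


def to_half_width(text: str) -> str:
    """Map full-width ASCII variants U+FF01..U+FF5E to their ASCII forms."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xFF01 <= cp <= 0xFF5E:
            out.append(chr(cp - 0xFEE0))
        elif cp == 0x3000:  # ideographic space becomes a normal space
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)


# Mapping names follow the slide; the functions are illustrative stand-ins.
CHARACTER_MAPS = {
    "Lowercase": str.lower,
    "Uppercase": str.upper,
    "Hiragana": katakana_to_hiragana,
    "Half width": to_half_width,
}

upper = CHARACTER_MAPS["Uppercase"]("ssis")
hiragana = CHARACTER_MAPS["Hiragana"]("\u30ab\u30bf\u30ab\u30ca")  # Katakana input
half = CHARACTER_MAPS["Half width"]("\uff21\uff22\uff23\uff11")  # full-width ABC1
```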
Data Conversion (Demo)
The transformation that converts the data type of a column to a different data type.
It can perform the following types of data conversions:
◦ Change the data type
◦ Set the column length of string data and the precision and scale on numeric data
◦ Specify a code page
If the length of an output column of string data is shorter than the length of its corresponding input column, the output data is truncated.
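The truncation rule can be shown with a small Python analogy. The function below is a hypothetical stand-in for the transformation: it converts one column to a target type, writes the result to a new output column, and cuts string results to the configured length. Code pages, precision, and scale are not modeled.

```python
def data_conversion(rows, input_column, output_column, to_type, length=None):
    """Convert input_column with to_type and store it in output_column."""
    converted = []
    for row in rows:
        value = to_type(row[input_column])
        # A string output shorter than the input is truncated to 'length'.
        if length is not None and isinstance(value, str):
            value = value[:length]
        converted.append({**row, output_column: value})
    return converted


rows = [{"code": 1234567, "qty": "42"}]
as_text = data_conversion(rows, "code", "code_str", str, length=4)
as_int = data_conversion(rows, "qty", "qty_int", int)
```

In this sketch the original column is left untouched and the converted value is carried in an additional column.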
Derived Column (Demo)
The transformation that populates columns with the results of expressions.
If an expression references an input column that is overwritten by the Derived Column transformation, the expression uses the original value of the column, not the derived value.
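The rule about overwritten columns is easiest to see with an example. The following Python sketch is an analogy only, with hypothetical column names and lambdas standing in for SSIS expressions: every expression is evaluated against the original row, and the results are applied afterwards.

```python
def derived_column(row, expressions):
    """Evaluate all expressions on the original row, then apply the results."""
    computed = {target: expr(row) for target, expr in expressions.items()}
    return {**row, **computed}


row = {"price": 100, "tax": 0}
expressions = {
    "price": lambda r: r["price"] * 2,   # overwrites the input column
    "tax": lambda r: r["price"] // 10,   # still sees the original price (100)
}
result = derived_column(row, expressions)
```

Because `tax` is computed from the original `price`, it comes out as 10 rather than 20.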
Export Column (Demo)
The transformation that inserts data from a data flow into a file.
It uses pairs of input columns: one column contains a file name, and the other column contains the data.
The data to be written must have a DT_TEXT, DT_NTEXT, or DT_IMAGE data type.
How the Append and Truncate options affect the result:
Append  Truncate  File exists  Results
False   False     No           The transformation creates a new file and writes the data to the file.
True    False     No           The transformation creates a new file and writes the data to the file.
False   True      No           The transformation creates a new file and writes the data to the file.
True    True      No           The transformation fails design-time validation. It is not valid to set both properties to true.
False   False     Yes          A run-time error occurs. The file exists, but the transformation cannot write to it.
False   True      Yes          The transformation deletes and re-creates the file and writes the data to the file.
True    False     Yes          The transformation opens the file and writes the data at the end of the file.
True    True      Yes          The transformation fails design-time validation. It is not valid to set both properties to true.
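The table above reads as a small decision procedure. The Python sketch below encodes it as an analogy, not as SSIS code: the function and parameter names are hypothetical, and the design-time validation failure is represented by a `ValueError` raised at call time.

```python
import os
import tempfile


def export_column(path, data, allow_append=False, force_truncate=False):
    """Write data to path following the Append/Truncate table."""
    if allow_append and force_truncate:
        # Stands in for the design-time validation failure.
        raise ValueError("Append and Truncate cannot both be true")
    if not os.path.exists(path):
        mode = "wb"  # no file yet: always create it
    elif force_truncate:
        mode = "wb"  # delete and re-create the file
    elif allow_append:
        mode = "ab"  # write at the end of the existing file
    else:
        raise FileExistsError(f"{path} exists and cannot be written")
    with open(path, mode) as handle:
        handle.write(data)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "note.txt")
    export_column(target, b"first")
    export_column(target, b"+second", allow_append=True)
    with open(target, "rb") as handle:
        appended = handle.read()
    export_column(target, b"fresh", force_truncate=True)
    with open(target, "rb") as handle:
        truncated = handle.read()
    try:
        export_column(target, b"again")
        refused = False
    except FileExistsError:
        refused = True
```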
OLE DB Command (Demo)
The transformation that runs SQL commands for each row in a data flow.
Configure the OLE DB Command Transformation in the following ways:
◦ Provide the SQL statement that the transformation runs for each row.
◦ Specify the number of seconds before the SQL statement times out.
◦ Specify the default code page.
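The row-by-row behaviour can be sketched in Python, with the standard `sqlite3` module standing in for an OLE DB connection. This is an analogy under that assumption: the table, the column names, and the `ole_db_command` helper are hypothetical, and the timeout and code page settings are not modeled.

```python
import sqlite3

# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, 10.0), (2, 20.0)])


def ole_db_command(connection, sql, rows, param_columns):
    """Run the parameterized statement once for every pipeline row."""
    for row in rows:
        params = tuple(row[column] for column in param_columns)
        connection.execute(sql, params)
    connection.commit()


pipeline_rows = [
    {"id": 1, "new_price": 12.5},
    {"id": 2, "new_price": 18.0},
]
ole_db_command(
    conn,
    "UPDATE product SET price = ? WHERE id = ?",
    pipeline_rows,
    ["new_price", "id"],  # maps input columns to the ? markers, in order
)
prices = conn.execute("SELECT id, price FROM product ORDER BY id").fetchall()
```

Because the statement runs once per row, this pattern tends to be slow on large inputs compared with a set-based statement.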
Rowset Transformations (Demo)
These transformations create new rowsets.
The rowset can include aggregate and sorted values, sample rowsets, or pivoted and unpivoted rowsets.
Transformation                       Description
Aggregate Transformation             The transformation that performs aggregations such as AVERAGE, SUM, and COUNT.
Sort Transformation                  The transformation that sorts data.
Percentage Sampling Transformation   The transformation that creates a sample data set using a percentage to specify the sample size.
Row Sampling Transformation          The transformation that creates a sample data set by specifying the number of rows in the sample.
Pivot Transformation                 The transformation that creates a less normalized version of a normalized table.
Unpivot Transformation               The transformation that creates a more normalized version of a non-normalized table.
Pivot
The transformation that creates a less normalized version of a normalized table.
Equivalent to the PIVOT command in T-SQL.
Steps to use the Pivot transformation:
1. Configure an OLE DB Source in the Data Flow Task and use the demo query as its source.
2. Drag in and open the Pivot transformation and go to Input Columns. Select all inputs, because all of them are used in the pivot.
3. Go to Input and Output Properties and expand Pivot Default Input. Here we configure how each input column is used in the pivot operation by setting its PivotUsage property.
4. Expand Pivot Default Output, click Output Columns, and click Add Column. Note that the destination in this demo has five columns, and every output column must be created manually in this section.
5. For each output column, configure:
◦ Name: the name of the output column.
◦ PivotKeyValue: the value in the pivoted column that goes into this output.
◦ SourceColumn: the lineage ID of the input column that holds the value for the output column.
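What the pivot produces can be sketched in plain Python. This is an analogy, not the SSIS component: the `pivot` helper and the sample columns are hypothetical, with `pivot_values` playing the role of the PivotKeyValue settings on the output columns.

```python
def pivot(rows, set_key, pivot_key, value_column, pivot_values):
    """Turn one row per (set_key, pivot_key) into one row per set_key."""
    output = {}
    for row in rows:
        key = row[set_key]
        if key not in output:
            # One output column per configured pivot key value.
            output[key] = {set_key: key, **{v: None for v in pivot_values}}
        if row[pivot_key] in pivot_values:
            output[key][row[pivot_key]] = row[value_column]
    return list(output.values())


sales = [
    {"customer": "A", "quarter": "Q1", "amount": 10},
    {"customer": "A", "quarter": "Q2", "amount": 15},
    {"customer": "B", "quarter": "Q1", "amount": 7},
]
pivoted = pivot(sales, "customer", "quarter", "amount", ["Q1", "Q2"])
```

Customer B has no Q2 row in the input, so its Q2 column stays empty (`None`) in this sketch.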
Unpivot (Demo)
The transformation that creates a more normalized version of a non-normalized table.
Equivalent to the UNPIVOT command in T-SQL.
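The reverse operation can be sketched the same way. The Python below is an analogy with hypothetical names: each wide row is expanded into one row per value column. Skipping empty values is an assumption of this sketch, chosen to mirror T-SQL UNPIVOT.

```python
def unpivot(rows, key_column, value_columns, pivot_key_name, value_name):
    """Expand each wide row into one narrow row per non-empty value column."""
    output = []
    for row in rows:
        for column in value_columns:
            if row.get(column) is None:
                continue  # assumption: empty values produce no output row
            output.append({
                key_column: row[key_column],
                pivot_key_name: column,
                value_name: row[column],
            })
    return output


wide = [
    {"customer": "A", "Q1": 10, "Q2": 15},
    {"customer": "B", "Q1": 7, "Q2": None},
]
narrow = unpivot(wide, "customer", ["Q1", "Q2"], "quarter", "amount")
```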
Aggregate (Demo)
The transformation that performs aggregations such as AVERAGE, SUM, and COUNT.
The Aggregate transformation supports the following operations.
Operation        Description
Group by         Divides datasets into groups. Columns of any data type can be used for grouping. For more information, see GROUP BY (Transact-SQL).
Sum              Sums the values in a column. Only columns with numeric data types can be summed. For more information, see SUM (Transact-SQL).
Average          Returns the average of the values in a column. Only columns with numeric data types can be averaged. For more information, see AVG (Transact-SQL).
Count            Returns the number of items in a group. For more information, see COUNT (Transact-SQL).
Count distinct   Returns the number of unique non-null values in a group.
Minimum          Returns the minimum value in a group. For more information, see MIN (Transact-SQL). In contrast to the Transact-SQL MIN function, this operation can be used only with numeric, date, and time data types.
Maximum          Returns the maximum value in a group. For more information, see MAX (Transact-SQL). In contrast to the Transact-SQL MAX function, this operation can be used only with numeric, date, and time data types.
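The operations in the table can be illustrated together in a short Python analogy. The `aggregate` helper and the sample data are hypothetical, and null handling is deliberately left out, so the sample contains no empty values.

```python
from collections import defaultdict


def aggregate(rows, group_by, value_column):
    """Group rows by one column and compute the listed operations per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[value_column])
    return {
        key: {
            "sum": sum(values),
            "average": sum(values) / len(values),
            "count": len(values),
            "count_distinct": len(set(values)),
            "minimum": min(values),
            "maximum": max(values),
        }
        for key, values in groups.items()
    }


orders = [
    {"region": "East", "amount": 10},
    {"region": "East", "amount": 20},
    {"region": "East", "amount": 10},
    {"region": "East", "amount": 20},
    {"region": "West", "amount": 5},
]
summary = aggregate(orders, "region", "amount")
```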
Sort (Demo)
This transformation sorts data.
Multiple sorts can be applied to an input; each sort column is identified by a numeral that sets its position in the sort order.
The Sort transformation can also remove duplicate rows as part of its sort.
This transformation has one input and one output. It does not support error outputs.
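A multi-column sort with duplicate removal can be sketched in Python as follows. This is an analogy with hypothetical names: each sort column carries a position number, and duplicates are judged by the sort column values only, which is an assumption of this sketch.

```python
def sort_rows(rows, sort_columns, remove_duplicates=False):
    """sort_columns is a list of (column, position, descending) tuples."""
    ordered = sorted(sort_columns, key=lambda spec: spec[1])
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant
    # column first yields the combined multi-column order.
    for column, _position, descending in reversed(ordered):
        result.sort(key=lambda r, c=column: r[c], reverse=descending)
    if remove_duplicates:
        seen = set()
        unique = []
        for row in result:
            key = tuple(row[column] for column, _, _ in ordered)
            if key not in seen:
                seen.add(key)
                unique.append(row)
        result = unique
    return result


staff = [
    {"dept": "B", "name": "Zed"},
    {"dept": "A", "name": "Bob"},
    {"dept": "A", "name": "Amy"},
    {"dept": "A", "name": "Bob"},
]
sorted_staff = sort_rows(
    staff,
    [("dept", 1, False), ("name", 2, False)],
    remove_duplicates=True,
)
```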
Percentage Sampling
It creates a sample data set using a percentage to specify the sample size.
It is useful for creating sample data sets.
The number of rows in the sample output may not exactly reflect the specified percentage.
Two named outputs are created:
◦ Sampled Output
◦ Unselected Output
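One way to see why the row count need not match the percentage exactly is a sampler that decides for each row independently. The Python sketch below takes that approach as an assumption; it is not a description of the SSIS internals, and the function name and `seed` parameter are hypothetical.

```python
import random


def percentage_sampling(rows, percentage, seed=None):
    """Route each row to the sampled or unselected output at random."""
    rng = random.Random(seed)
    sampled, unselected = [], []
    for row in rows:
        # Each row is chosen independently, so the final count only
        # approximates the requested percentage.
        if rng.random() * 100 < percentage:
            sampled.append(row)
        else:
            unselected.append(row)
    return sampled, unselected


population = list(range(1000))
sampled, unselected = percentage_sampling(population, 10, seed=42)
```

With 1,000 input rows and 10 percent requested, `len(sampled)` is typically close to 100 but rarely exactly 100.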
Row Sampling (Demo)
The transformation that creates a sample data set by specifying the number of rows in the sample.
The exact size of the output sample can be specified.
This transformation is useful:
◦ for random sampling.
◦ during package development, for creating a small but representative dataset.
It is similar to the Percentage Sampling transformation.
It has one input and two outputs. It has no error output.
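In contrast to percentage sampling, a fixed sample size can be reproduced exactly. The Python sketch below is an analogy with hypothetical names, using `random.sample` to pick the row positions and returning both the sampled and the unselected rows.

```python
import random


def row_sampling(rows, sample_size, seed=None):
    """Return exactly sample_size random rows plus the remaining rows."""
    rows = list(rows)
    rng = random.Random(seed)
    size = min(sample_size, len(rows))  # cannot sample more rows than exist
    chosen = set(rng.sample(range(len(rows)), size))
    sampled = [row for index, row in enumerate(rows) if index in chosen]
    unselected = [row for index, row in enumerate(rows) if index not in chosen]
    return sampled, unselected


population = list(range(1000))
sample, rest = row_sampling(population, 10, seed=7)
```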
Summary
Data Flow Transformations
Synchronous vs Asynchronous Transformations
Row Transformations
◦ Demo: Character Map
◦ Demo: Copy Column
◦ Demo: Data Conversion
◦ Demo: Derived Column
◦ Demo: OLE DB Command
Rowset Transformations
◦ Demo: Aggregate
◦ Demo: Sort
◦ Demo: Unpivot
◦ Demo: Pivot
◦ Demo: Percentage Sampling
◦ Demo: Row Sampling
Resources & Questions
Contact me:
◦ [email protected]
◦ http://pramodsingla.wordpress.com/
Microsoft Resources:
◦ http://www.phpring.com/data-flow-transformation-categories-in-ssis/
◦ http://sqlblog.com/blogs/jorg_klein/archive/2008/02/12/ssis-lookup-transformation-is-case-sensitive.aspx
◦ http://pivottransform.blogspot.in/
◦ http://www.jasonstrate.com/2011/01/31-days-of-ssis-unpivot-transformation-1131/