SSIS for the Disinterested


Brian Garraty, @NULLgarity

Who I am

- Brian D. Garraty
- SQL Server DBA, Va Beach Public Schools
- HRSSUG Leadership Team
- Background in C++, VB, ASP, C#
- @NULLgarity
- NULLgarity.wordpress.com
- >10 years experience with SSIS & DTS

Why I’ve Come to You Today

SSIS is often

- Misunderstood
- Disliked
- Considered difficult to learn
- Subject to ridicule

Why I’ve Come to You Today

SSIS is actually

- Fairly simple (mostly)
- Pretty good at a lot of things
- Fun
- The right set of tools for the job (sometimes)

Success Requires

- A desire to learn new things
- The time to learn new things
- Willingness to work through frustrations
- Willingness to start over
- The right project

Who I Come to You As Today

Brian D. Garraty, SSIS Defender

Brian D. Garraty, SSIS Apologist

Who You Are

- Type of Work You Do
- Experience with SSIS
- Rank your love of SSIS on a scale of 1 to 10
  - 10 = God, Family, Country, SSIS
  - 5 = Like it like a friend
  - 1 = I want to fight it, really

Where we are headed…

- Starting from scratch – 100 Level
  - Control Flow
  - Data Flow
  - Configurations
  - Dynamic SSIS
- What you can do with SSIS
- What I have done with SSIS

Where we are headed…

- Defense Will Rest
- If I knew then...
- Resources
- Questions

Out of Scope

- SSIS in SQL Server Denali
- How to do this Stuff
- Demos

SSIS 101: SQL Server Integration Services

History

Bundled Extract, Transform, Load (ETL) Platform

Rewrite of Data Transformation Services (DTS)

Released with SQL Server 2005

Dev Environment

Developed in Visual Studio (err… “Business Intelligence Development Studio”)

Solutions, Projects, Packages

Run!!!!!!

Solution Explorer

Package Level Connections

Variables

Runtime Environment

- Packages are deployed as files or to msdb
- Executed via the dtexec utility (or a wrapper to it)
- SSIS Service
  - Runs on the server
  - Doesn’t do much
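Since execution comes down to a dtexec command line, a wrapper can be very small. The sketch below only assembles the arguments for a file-deployed package and does not run anything. The function name, the package and configuration paths, and the `RootFolder` variable are all made up for illustration; `/File`, `/ConfigFile`, and `/Set` are dtexec options.

```python
from typing import Dict, List, Optional


def build_dtexec_args(
    package_file: str,
    config_file: Optional[str] = None,
    variables: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return the argument list for running a file-deployed package."""
    args = ["dtexec", "/File", package_file]
    if config_file:
        args += ["/ConfigFile", config_file]
    for name, value in (variables or {}).items():
        # /Set takes "property path;value"; here the path targets a user variable.
        args += ["/Set", rf"\Package.Variables[User::{name}].Properties[Value];{value}"]
    return args


# Hypothetical package, configuration file, and variable.
example_args = build_dtexec_args(
    r"C:\ssis\LoadStudents.dtsx",
    config_file=r"C:\ssis\LoadStudents.dtsConfig",
    variables={"RootFolder": r"C:\ssis"},
)
print(" ".join(example_args))
```

On a machine with SSIS installed, the list could be handed to `subprocess.run`; a SQL Agent job step or a batch file does essentially the same thing.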

A New Paradigm

- Control Flow
  - Ordered workflow
  - Isolated tasks
  - Precedence Constraints
  - Data flow tasks
- Data Flow
  - Move data from Point A to Point B
  - Manipulate data along the way
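For readers who think in code rather than in boxes and arrows, the control flow idea can be approximated in a few lines of Python. This is only an analogy, not how SSIS is implemented: the task names are hypothetical, tasks are assumed to be listed in a valid order, and every incoming constraint must be satisfied for a task to run.

```python
from typing import Callable, Dict, List, Tuple

# (upstream task, downstream task, outcome required of the upstream task)
Constraint = Tuple[str, str, str]


def run_control_flow(
    tasks: Dict[str, Callable[[], None]],
    constraints: List[Constraint],
) -> Dict[str, str]:
    """Run tasks in dict order; a task runs only if all incoming constraints are met."""
    outcomes: Dict[str, str] = {}
    for name, task in tasks.items():
        incoming = [(up, need) for up, down, need in constraints if down == name]
        satisfied = all(
            up in outcomes
            and outcomes[up] != "skipped"
            and (need == "completion" or outcomes[up] == need)
            for up, need in incoming
        )
        if not satisfied:
            outcomes[name] = "skipped"
            continue
        try:
            task()
            outcomes[name] = "success"
        except Exception:
            outcomes[name] = "failure"
    return outcomes


def extract() -> None:
    pass


def load() -> None:
    raise RuntimeError("destination unavailable")  # simulate a failing task


def archive() -> None:
    pass


def notify() -> None:
    pass


outcomes = run_control_flow(
    {"extract": extract, "load": load, "archive": archive, "notify": notify},
    [
        ("extract", "load", "success"),
        ("load", "archive", "success"),
        ("load", "notify", "failure"),
    ],
)
print(outcomes)
```

Here `load` fails, so `archive` is skipped and `notify` runs, which is the kind of branching the success and failure arrows express on the design surface.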

Control Flow: Where You Build Your Workflow

Control Flow Picture

Control Flow Toolbox

Sequence Containers

- Group related tasks
- Can be enabled/disabled
- Can serve as source for Precedence Constraints, even when empty

Loop Containers

Precedence Constraints

Execute SQL Tasks

Script Task

Other Control Flow Tasks

- Execute Process Task
- Send Mail Task
- Execute Package Task

Data Flow Task…

Control Flow Picture

Data Flow: Where You Manipulate Data

Data Flow Picture

Data Flow Sources

Data Flow Transformations

Data Destinations

Data Flow Picture - Big

Dynamic SSIS: What You See Isn’t Always What You Get

What can be dynamic?

- Object property values
- Variable values

Expressions

- Determined Dynamic Values
- Expressions evaluate at runtime
- Example: Use current date to build unique log file name
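The log file example translates roughly to the following Python. It is an analogy for "evaluate at runtime", not SSIS expression syntax, and the prefix and timestamp format are arbitrary choices.

```python
from datetime import datetime
from typing import Optional


def log_file_name(prefix: str, now: Optional[datetime] = None) -> str:
    """Build a log file name that is unique to the second it is evaluated."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return f"{prefix}_{stamp}.log"


# A fixed timestamp keeps the example repeatable.
print(log_file_name("NightlyExtract", datetime(2011, 6, 4, 9, 30, 0)))
# prints NightlyExtract_20110604_093000.log
```

In a package, the equivalent expression would typically sit on the ConnectionString property of a file connection, so the name is recomputed on every run.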

Configurations

- Instructed Dynamic Values
- XML file
- Example: Connection Strings
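To make the XML file idea concrete, here is a rough sketch that reads property paths and values out of a configuration document shaped like an SSIS XML configuration file. The sample content (the connection named `Staging`, the server `DEVSQL01`) is invented for illustration; real files are normally generated by the package configuration wizard and carry extra header elements.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Invented sample in the general shape of a .dtsConfig file.
SAMPLE_CONFIG = """<DTSConfiguration>
  <Configuration ConfiguredType="Property"
      Path="\\Package.Connections[Staging].Properties[ConnectionString]"
      ValueType="String">
    <ConfiguredValue>Data Source=DEVSQL01;Initial Catalog=Staging;Integrated Security=SSPI;</ConfiguredValue>
  </Configuration>
</DTSConfiguration>
"""


def read_configured_values(xml_text: str) -> Dict[str, str]:
    """Map each configured property path to the value the file supplies."""
    root = ET.fromstring(xml_text)
    return {
        node.get("Path"): node.findtext("ConfiguredValue", default="")
        for node in root.iter("Configuration")
    }


for path, value in read_configured_values(SAMPLE_CONFIG).items():
    print(path, "->", value)
```

The point is that the package is told its connection string from outside, so moving from development to production means swapping a file rather than editing the package.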

Dynamic SSIS

What you see may not be what you get…

But it is what you asked for!

What You Can Do with SQL Server Integration Services

Disparate Data Sources

Join data from Oracle with data from SQL Server? No problem.

- No linked server
- No OPENROWSET
- No xp_cmdshell
- No bcp

Discover Workloads

Have two SQL Servers with 1500 databases, give or take, and need to pull data from all? No problem.

- No linked server
- No OPENROWSET
- No xp_cmdshell
- No bcp
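The pattern behind this is a loop that swaps the connection string on each pass. A loose Python sketch of that loop follows; the server and database names are invented, and in practice the list would be discovered at runtime (for example by querying `sys.databases` on each server) rather than hard-coded.

```python
from typing import Dict, Iterator, List, Tuple


def connection_strings(catalog: Dict[str, List[str]]) -> Iterator[Tuple[str, str, str]]:
    """Yield (server, database, connection string) for every database to visit."""
    for server, databases in catalog.items():
        for database in databases:
            yield server, database, (
                f"Data Source={server};Initial Catalog={database};"
                "Integrated Security=SSPI;"
            )


# Invented catalog standing in for the discovered list of databases.
catalog = {"SQLA": ["School001", "School002"], "SQLB": ["School003"]}
targets = list(connection_strings(catalog))
for server, database, conn in targets:
    print(f"{server}.{database}: {conn}")
```

In a package this usually takes the form of a Foreach Loop container that writes each row into variables, with an expression on the connection manager building the string from them.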

Parallelize Workloads

- Multiple lines == Multiple threads
- Often equals, at least
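As a rough analogy, independent branches on the design surface behave like independent jobs handed to a thread pool. The sketch below uses hypothetical table names and a sleep in place of real work; as the slide says, nothing guarantees one thread per branch.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def load_table(name: str) -> Tuple[str, str]:
    """Pretend to load one table and report which worker thread handled it."""
    time.sleep(0.05)  # stand-in for real extract/load work
    return name, threading.current_thread().name


tables = ["students", "staff", "schools"]  # hypothetical, independent loads
with ThreadPoolExecutor(max_workers=len(tables)) as pool:
    results = list(pool.map(load_table, tables))

for table, worker in results:
    print(f"{table} loaded by {worker}")
```

The loads only run side by side because none depends on another, which is the same condition SSIS needs before it can parallelize unconnected tasks or paths.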

Offload Workloads

- Pull data off OLTP systems
- Crunch it to your heart’s content
- Location agnostic

OLTP = Online Transaction Processing

Exploit Your Talents

Seamless integration with your comfort zone

- .NET
- Stored Procedures
- XML

The Random

- Can you have it email me?
- Can you only email when X and Y but not Z?
- This program needs to run when the extract finishes but only if…

What I’ve Done with SQL Server Integration Services

Hosted Data Warehouse Pull

- Extract data from numerous sources
- Stage to database
- Apply transforms and business logic
- Write final data to files
- Zip files and FTP to host
- Track each run

(Oh no!) System Integration

- How will the data get from these 100 databases on these two servers to this one?
- Can you not send any dups?
- Can you email us the dups?
- Can you track the dups that have been fixed?
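The "dups" requests all start from the same step: splitting incoming rows into those that pass through and those that get reported. A minimal sketch, assuming rows are dictionaries and a single hypothetical key column (`student_id`) defines a duplicate:

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def split_duplicates(rows: Iterable[Row], key: str) -> Tuple[List[Row], List[Row]]:
    """Keep the first row seen for each key value; route later ones to a dup list."""
    seen = set()
    unique: List[Row] = []
    dups: List[Row] = []
    for row in rows:
        if row[key] in seen:
            dups.append(row)
        else:
            seen.add(row[key])
            unique.append(row)
    return unique, dups


# Hypothetical rows gathered from several source databases.
rows = [
    {"student_id": 1, "source": "db01"},
    {"student_id": 2, "source": "db01"},
    {"student_id": 1, "source": "db57"},
]
unique, dups = split_duplicates(rows, "student_id")
print(len(unique), "rows to send;", len(dups), "dups to email and track")
```

The duplicate list is what would feed the email and the tracking table; the unique list continues on to the destination.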

Active Directory Extract

- Extract users, groups, & membership
- Load data into SQL Server
- Ensure it always works

The Defense Rests: The Dark Side of SSIS

GUI-Intensive

If you aren’t comfortable in a GUI-intensive IDE, SSIS will be a challenge

!WYSIWYG

But you do get what you asked for!

WYSIWYAF

Expression Syntax

- Inconsistent
- Difficult to Master
- Flying solo (No IntelliSense)

Wound Tight

SSIS binds itself to your metadata and holds on tight

SSIS continually checks that things have not changed

Single column change in data source must trickle all the way down your data flow

What’s the Difference?

Source Control? In my experience you can check things in, label them, and do “gets”.

Trying to do a Diff might induce panic

If needed, seek a third party

Tradeoff

The more dynamic you get, the less parallel you tend to get

Read only? If only!

Just viewing details of SSIS components often leads SSIS to want to make changes to the file

Code review often leads to checking out the entire project

But It Works on My Machine!

- The server is not your machine
- Early on, budget some time to work through issues upon deployment
- Drivers, rights, O/S, platform, etc. may cause problems
- Be patient!

Misc Rants

- No Undo (yet)
- Variable Scope is read-only
- Variable Scope is context-default

Audience?

If I Knew Then…: Best Practices and Words of Caution

Multiple Package Projects

Benefits:

- Encapsulation
- Readability
- Collaboration

Deploy as Files

- Packages in development must be files
- You will have multiple file projects
- To minimize changes on deployment, deploy as files
- More info: bit.ly/lSniJS

Root Folder Variable

String variable to store location of package

Use to build relative file paths

- Input files (xsd, raw, etc.)
- Output files (xml, raw, etc.)
- Connection strings (packages, log files, etc.)

More info: bit.ly/iuLctD
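The root folder idea is easy to picture outside SSIS: one value holds the base location and every other path is derived from it. A small sketch, where the `input`, `output`, and `logs` folder names and the sample root are assumptions for illustration only:

```python
from pathlib import PureWindowsPath
from typing import Dict


def package_paths(root_folder: str, package: str) -> Dict[str, str]:
    """Derive a package's working paths from a single root folder value."""
    root = PureWindowsPath(root_folder)
    return {
        "schema": str(root / "input" / f"{package}.xsd"),
        "output": str(root / "output" / f"{package}.xml"),
        "log": str(root / "logs" / f"{package}.log"),
    }


# Hypothetical root; on the server only this one value would differ.
for name, path in package_paths(r"D:\ETL\Warehouse", "Students").items():
    print(f"{name}: {path}")
```

Inside a package, the same effect comes from expressions that concatenate the root folder variable with a relative path, so relocating the project means changing one value.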

Indirect Configurations

Store location of configuration file in environment variable

More info: bit.ly/ksAQOG
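Indirection here means the package knows only the name of an environment variable, and each machine decides where its configuration file actually lives. A sketch of that lookup, with a hypothetical variable name (`SSIS_CONFIG_WAREHOUSE`) and path:

```python
import os
from typing import Mapping, Optional


def resolve_config_path(env_var: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the configuration file location stored in an environment variable."""
    env = os.environ if environ is None else environ
    path = env.get(env_var)
    if not path:
        raise LookupError(f"Environment variable {env_var!r} is not set")
    return path


# A dictionary stands in for the machine's environment so the example is repeatable.
fake_env = {"SSIS_CONFIG_WAREHOUSE": r"D:\config\warehouse.dtsConfig"}
print(resolve_config_path("SSIS_CONFIG_WAREHOUSE", fake_env))
```

Development and production machines can then point the same variable at different files without any change to the package itself.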

Recreate over Edit

In response to certain changes, delete and recreate

Take advantage of ease of creating a new item rather than suffer through unpleasant editing experience

Particularly true for file connections

Resources

The New ETL Paradigm, Jamie Thomson bit.ly/e17DUR

Jamie Thomson’s Blogs

- sqlblog.com/blogs/jamie_thomson
- consultingblogs.emc.com/jamiethomson

Resources (cont’d)

SSIS Community Tasks and Components ssisctc.codeplex.com

Locally grown SSIS Training andyleonard.net

Questions?
