SQL Server Days 2013 - Create ETL solutions faster with metadata driven development

Preview:

DESCRIPTION

Working on your 20th staging package? Do you need to import 30 different flat files? Don’t have the time to create all those packages? By using Biml – Business Intelligence Markup Language – you can describe your BI solution in a declarative fashion. You can turn tedious, repetitive work into reusable scripts which generate SSIS packages for you, cutting back drastically on development time. This session will introduce you to Biml and get you started on building reusable patterns for common BI tasks.

Citation preview

SQL SERVER DAYS 2013

CREATE ETL SOLUTIONS FASTERWITH METADATA DRIVEN DEVELOPMENT

KOEN VERBEECK

WHO AM I?

OUTLINE

Hello WorldIntroduction Read Flat File

Read Flat Files While Looping

Metadata driven development

Conclusion

OUTLINE

Hello WorldIntroduction Read Flat File

Read Flat Files While Looping

Metadata driven development

Conclusion

INTRODUCTION

INTRODUCTION

• large percentage of BI projects fail• Gartner - http://www.gartner.com/newsroom/id/492112

• one of the reasons is underestimating development effort ETL• Kimball: 70% time of building a DWH goes into ETL

http://www.informationweek.com/the-38-subsystems-of-etl/55300422

INTRODUCTION

“I choose a lazy person to do a hard job. Because a lazy person will find an easy way to do it”Bill Gates

INTRODUCTION

• a lot of SSIS packages are very similar• packages importing flat files• packages writing change data to staging tables• packages exporting data to excel (for some reason)• packages updating dimensions• …

• … but they take a lot of time to create

INTRODUCTION

• solution?• code reuse

o SSIS basically only supports copy-pasteo copy-paste has improved in SSIS 2012

• design patternso for example: incremental load packageo SQL Server 2012 Integration Services Design Patterns

• enable through templateso build a template packageo save it to C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\

PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransformationItems

• but still requires you to edit each package!• (and what if you forget to edit a crucial piece?)

INTRODUCTION

• metadata driven development to the rescue!• (aka code generating code)• automate generation of common logic in SSIS packages

• first option is the “dynamic SSIS package”1. reads metadata from tables2. generates code

o usually outputs T-SQL or bcp commandso uses T-SQL or C#o for example: SELECT … INTO statements

3. loops over the generated code4. executes each statement

• disadvantages• complex project• no parallelism• difficult row based error handling• difficult to incorporate “business logic”

INTRODUCTION

• second option: BIML• started as a project at MS: http://vulcan.codeplex.com/• developer left to found company Varigence

http://www.varigence.com/o took the idea (not the code) and developed BIML

• BIML is a markup language and compilero translates metadata into business intelligence solutions for SQL Servero supports SSIS and SSASo Varigence made part of BIML available as open source

INTRODUCTION

• BIDS Helper has open source implementation of BIML• it’s free!• it’s already in the add-on you love!• it is available for SSIS 2005, 2008, 2008R2, 2012 (and 2014?)

• BIML offers• powerful code generation

o only some parts of the project deployment model are not supported

• reuse BI patterns and componentso create your pattern in BIML and generate all your packages with the same structureo BIML files can reference each other

• .NET based script languageo C# code can be incorporated into BIML to generate objects based on metadatao Intellisense (sometimes) available

• don’t like BIML?o generated packages are just SSIS packages, you can edit them using BIDS/SSDT/SSDTBIo no vendor lock-in

INTRODUCTION

• scenario for our demos• import different flat files

o exports from ERP systems, other database vendors, 3rd party providers, …

• each type of flat file has a different structureo no single SSIS package for all flat files

• the name of the flat files can changeo for example the name includes a timestamp

• this would normally require 1 SSIS package per flat file type• couple of hours/days work?

• let’s solve it with BIML!

OUTLINE

Hello WorldIntroduction Read Flat File

Read Flat Files While Looping

Metadata driven development

Conclusion

HELLO WORLD

• basic BML script structure

BIML

Connections

FileFormats

Packages

Tasks

Containers

Tasks

Dataflow

Precedence constraintsTransformations

You can also specify• events• log handlers• variables• parameters• custom tasks• script tasks/components• …

HELLO WORLD

• let’s take a look at a simple BIML script

HELLO WORLD

• BIML root node

• add connections

• add packages

HELLO WORLD

• specify Tasks

• specify specific properties

HELLO WORLD

• check for errors & generate package

• result

DEMO

show Hello World BIML

OUTLINE

Hello WorldIntroduction Read Flat File

Read Flat Files While Looping

Metadata driven development

Conclusion

READ FLAT FILE

• specify FlatFileFormat• columns: name, data type, size, delimiter (, code page)• what you’d normally specify in the flat file connection manager

• specify connection

READ FLAT FILE

• specify data flow with transformations• if no input/output connectors are specified, transformations are

connected in the order specified in the BIML file

• result

DEMO

import flat file with BIML

OUTLINE

Hello WorldIntroduction Read Flat File

Read Flat Files While Looping

Metadata driven development

Conclusion

READ FLAT FILE IN LOOP

• now let’s loop over a bunch of flat files• specify variables to hold path to current file and source

folder

• add an expression on the flat file connection manager

READ FLAT FILE IN LOOP

• add a for each loop• which has its own tasks child element

READ FLAT FILE IN LOOP

• result

DEMO

import flat file using for each loop with BIML

OUTLINE

Hello WorldIntroduction Read Flat File

Read Flat Files While Looping

Metadata driven development

Conclusion

METADATA DRIVEN DEVELOPMENT

• BIML is nice

• … but isn’t the GUI much faster to developer packages?

• time to enhance BIML with some C# goodness!• called BIMLScript• use C# to read metadata• loop over metadata and create multiple objects• entire website dedicated with tutorials and code snippets

http://bimlscript.com/• also has an online editor

METADATA DRIVEN DEVELOPMENT

• Add namespaces

• Declare variables

METADATA DRIVEN DEVELOPMENT

• Retrieve metadata (stored in a SQL Server table)

• Loop over metadata and create corresponding objects

METADATA DRIVEN DEVELOPMENT

• result

METADATA DRIVEN DEVELOPMENT

• remarks• make sure the code or the metadata doesn’t contain invalid XML

characterso < > “ &

• using C# can mess with the Intellisenseo Visual Studio thinks it’s not valid XML anymoreo color coding can disappear > right click file and choose Open With…o Intellisense can stop working in Visual Studio > use online editor

• beware of the protection levels

• some elements can only appear onceo do not put those in a loopo e.g. Connections, Packages

DEMO

generate multiple packages using BIMLScript

OUTLINE

Hello WorldIntroduction Read Flat File

Read Flat Files While Looping

Metadata driven development

Conclusion

CONCLUSION

• BIML can radically reduce SSIS development time• for frequently used package patterns• when combined with BIMLScript

• BIML supports all versions of SSIS• but some project deployment functionality is missing

• bit of a learning curve• good understanding of SSIS is necessary• basic C# skills needed• return of investment is in next projects

RESOURCES

• Official BIML• Varigence BIML product page

http://www.varigence.com/Products/Biml/Capabilities

• BIMLScript resource hubhttp://bimlscript.com/

• BIDS Helper on Codeplexhttp://bidshelper.codeplex.com/

• Blogs• Stairway to BIML by Andy Leonard

http://www.sqlservercentral.com/stairway/100550/

• BIML articles by Joost van Rossumhttp://microsoft-ssis.blogspot.be/search/label/BIML

• BIML articles by Marco Schreuderhttp://blog.in2bi.eu/tags/biml/

• BIML articles by John Welchhttp://agilebi.com/jwelch/tag/biml/

• Introduction to BIML part I by Koen Verbeeckhttp://www.mssqltips.com/sqlservertip/3094/introduction-to-business-intelligence-markup-language-biml-for-ssis/

SQL SERVER DAYS 2013

Q&A

SQL SERVER DAYS 2013

THANKS FOR LISTENING!

koen.verbeeck@element61.be@Ko_Verhttp://www.linkedin.com/in/kverbeeck

© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after

the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Recommended