Microsoft SQL Server Business Intelligence Powered by Pragmatic Works

Robert Peters, Vice President

Table of Contents

Introduction
  Pragmatic Works Mission
  Purpose
  The Pragmatic Way of Implementing Business Intelligence
What is Enterprise Information Management (EIM)?
  The Benefit of EIM
  The SQL Server 2012 EIM Platform
SQL Server Integration Services
  Integrating Data across the Enterprise and Beyond
  SSIS Connection Managers and Data Sources
  SSIS Testing and Best Practice
    User Defined Best Practices
    Best Practices Analyzer
    BI Data Testing
    SSIS Unit Testing
    BI xPress BI Compare
SQL Server Data Quality Services (DQS)
SQL Server Master Data Services (MDS)
Self-Service Business Intelligence
  Empowering Business Users
  Empowering Users to Manage Data Quality
  Empowering Business Users to Manage Master Data
  Make Trusted Decisions on Credible, Consistent Data
Pragmatic Works Software
  BI xPress/DBA xPress
  Task Factory
  Doc xPress (Formerly BI Documenter)
Pragmatic Works Consulting Expertise (Remote & Onsite)
  Data Integration & Cleansing
  Analytics & Reporting
  Parallel Data Warehousing (PDW)
  Appliance & Cloud
  Migration and Consolidation
  Performance & Infrastructure
  Availability & Scalability
Corporate Training (Virtual & Onsite)
  Data Integration
  Analysis
  Business Analytics
  Data Visualization
  Database Administration
Conclusion
Customers Projects
Pragmatic Works: Contacts

Introduction

Pragmatic Works Mission

Pragmatic Works strives to help Microsoft SQL Server developers and DBAs operate more efficiently with innovative products that support the entire SQL Server data platform. The company accomplishes this by offering software that simplifies the development and management of SQL Server, along with top-tier SQL Server training offerings and consulting services to assist with your most complex data management, big data, cloud, and business intelligence projects. Pragmatic Works has served more than 4,500 companies globally across multiple industries, including banking, insurance, financial, automotive, and education.

Purpose

Traditionally, organizations have focused on market leaders, who were often seen as visionaries, to understand what IT professionals needed in an enterprise ETL solution. These organizations assumed that they had automatically made the right choice if they purchased a tool from one of these market leaders. Since the late nineties, however, the market has changed substantially. Practically all of the Business Intelligence (BI) vendors have purchased or developed their own ETL tools. Since a centralized data warehouse is one of the cornerstones of a successful BI solution, this has turned out to be a wise choice. Market estimates show that 70-80% of the costs of a successful BI system relate to the creation of reliable ETL processes and data integration.

Each year from 2008 to 2013, Pragmatic Works has investigated market-leading ETL tools, evaluating them on best practices, ease of development, community involvement, cost, monitoring, integration, and performance. The purpose of these “gap analysis” reviews is to identify the shortcomings within Microsoft’s enterprise ETL solution, SQL Server Integration Services, and to help build solutions that match functionality found in other, much more costly ETL tools. Pragmatic Works has worked closely with Microsoft’s development team to develop tools that simplify the development of a Microsoft BI solution and add the ability to monitor and report on the whole Business Intelligence stack (SSIS, SSAS, and SSRS). Our development efforts have focused on products that can easily be integrated into Visual Studio to increase the performance, productivity, data quality, security, and connectivity of SQL Server Integration Services (SSIS). Because Pragmatic Works’ software solutions plug into the familiar Visual Studio interface, the learning curve for these tools is minimal, allowing DBAs and developers to start using them effectively right away.

The Pragmatic Way of Implementing Business Intelligence

Business executive leadership, who typically drive the need for BI solutions, are primarily focused on the end-user aspect of BI: OLAP reporting and dashboards. However, it is vital for businesses to understand that ETL, integration, data modeling, and data warehousing form the cornerstones of a successful BI solution. The time and energy spent on selecting an enterprise ETL solution, along with designing finely tuned and highly performing ETL processes, will ultimately produce “clean” data ready to be consumed by the business. Traditionally, ETL tools have been extremely expensive, and some of them still are. While these tools have superb functionality and support, the question remains: does every organization need all the functionality they provide, or are there cheaper alternatives that would do the job just as well?

Many organizations still have not committed the effort necessary to build complete BI solutions, which start with maximizing the extraction, transformation, and loading of the company data. The fact remains that data warehouses are still being developed by hand using either SQL or PL/SQL. Selecting and using the proper ETL solution not only enhances how data is transformed and loaded, but also streamlines development. Additionally, the reliability and stability of data warehouses built using an ETL tool have increased because more criteria can be checked and monitored in relation to each other, metadata being a case in point. It should, however, be noted that merely using an ETL tool does not automatically guarantee success.

The market for Business Intelligence tools is growing again with expectations of large revenue increases this year for many of the suppliers. There does, however, appear to be no increase in the success rate of the warehouses and BI systems that are being created. Despite the availability of improved software at lower prices, we are not producing better systems for our users. Our mission is to not only find the factors that make an ETL project a success, but to also deliver a simpler means of getting those writing SQL or PL/SQL to embrace ETL.

What is Enterprise Information Management (EIM)?

Enterprise Information Management (EIM) is a growing priority for organizations that want to gain a competitive advantage by basing key business decisions on credible, consistent data. Some of the challenges involved in implementing an effective EIM solution include:

Integrating data from an increasing number of diverse sources and in a growing number of formats into a common platform for decision making

Empowering information workers who understand the business to manage data governance, while ensuring IT maintain control

SQL Server 2012 provides a comprehensive platform for EIM, which makes it possible to:

Integrate any data from applications and systems across the enterprise.

Make trusted decisions based on cleansed and standardized data (one version of the truth).

Empower business users to manage data governance and easily gain insights from the data.

The Benefit of EIM

Increasingly competitive business environments require organizations to achieve a competitive advantage based on highly intelligent business decisions. Most organizations recognize the value of basing decisions on credible, consistent data, at a time when businesses, their customers, and third-party services on the Web are generating increasing volumes of data. The problem is that data is usually created and stored in isolated application silos with varied levels of consistency and accuracy, presenting challenges of integration and standardization. These challenges prevent companies from getting a comprehensive “single view of the truth” needed to drive effective decision making.

Many organizations are looking to Enterprise Information Management (EIM) as a way to integrate, consolidate, and cleanse data for decision making. An optimized EIM solution can integrate day to day business operations and support data warehousing and business intelligence (BI) to help organizations learn from their data and become more effective. The reason for this trend is clear. Executives in many organizations believe that by bringing together as much information as possible as a single, trusted source of data for decision making, they can ultimately make the business decisions necessary to gain a competitive advantage and increase company profits.

Microsoft SQL Server 2012 builds on the data integration and management features of previous releases to provide a comprehensive platform for EIM. Moreover, Microsoft’s data platform is designed to enable organizations to capitalize on the wealth of business knowledge held by information workers – enabling business users to take on the role of data stewards and manage data quality and consistency with minimal support from IT specialists.

The SQL Server 2012 EIM Platform

SQL Server 2012 in conjunction with the Pragmatic Workbench provides all the components needed for an effective EIM solution in a single product. Key components of SQL Server 2012 that help you build an EIM solution are:

SQL Server Integration Services

SQL Server Data Quality Services

SQL Server Master Data Services

Pragmatic Workbench Implementation Services

These technologies work together to create an EIM solution that supports other SQL Server technologies for data warehousing and BI, and ensures that the entire business decision making ecosystem begins and ends with the business user. Figure 1 shows how SQL Server and other Microsoft technologies work together to provide a user-centric approach to business decision making.

Figure 1: A user-centric approach to business decision making

SQL Server Integration Services

SQL Server Integration Services (SSIS), a component of SQL Server and the Pragmatic Workbench, is an extensible platform for building high-performance data integration (ETL: extraction, transformation, and loading) and workflow solutions. In an EIM context, SSIS provides a workflow and data flow engine that you can use to integrate data from virtually any data source into an ecosystem for business decision making. You can use Integration Services to automate tasks such as copying or downloading files, sending e-mail messages in response to events, updating data warehouses, cleaning and mining data, and managing SQL Server objects and data. You can unlock and integrate data from industry-standard and third-party sources such as SQL Server, Oracle, Teradata, DB2, and SAP, and, through components such as SharePoint Source and Destination, Dynamics Source and Destination, Salesforce.com Source and Destination, Email Source, and XML Source, from real-time and cloud-based applications and more.

Microsoft has made it possible to extend the native SSIS capabilities, allowing companies like Pragmatic Works to create tools which significantly enhance SSIS performance, data quality, security, compliance, SLA management, and developer productivity. An example of this is through Pragmatic Works’ own collection of high-performance components and tasks for SSIS, called Task Factory. SQL Server provides you with the flexibility and power to manage your simple or complex ETL Projects using native SSIS features, but certain things still cannot be accomplished easily or are impossible to perform without extensive knowledge of programming. Task Factory instantly fills some of these gaps within native SSIS.

Task Factory provides value by reducing overall development time while enhancing the quality of your SSIS packages. These are the different types of components that can be extended:

Control Flow Components – The run-time engine of SSIS implements the control/work flow and package management infrastructure that lets developers control the flow of execution and set options for logging, event handlers, and variables. SSIS includes several in-built control flow components that are sufficient to implement most common data integration scenarios. Task Factory further enhances control flow capabilities with added functionality:

o SharePoint Documents Task
o Download File Task
o Advanced Execute Process Task
o Advanced Execute Package Task
o Advanced Email and SMS Task
o File Properties Task
o Compression Task
o Expression Task
o PGP Task
o SFTP Task

Log Provider Component – BI xPress integrates directly into the log provider to create a visual presentation that gives an environment owner an in-depth understanding of what is happening within their SSIS ETL environment. The in-depth reports include:

o Extract and Load Trends
o Package Run Time
o Execution Summary
o Real-Time Execution
o Execution Dashboard
o Event Handling Reports
o Advanced Drill down for troubleshooting
o Package Alerts
o Package and Task Performance

Connection Manager Components – Connection managers encapsulate the information needed to connect to an external data source. SSIS includes a variety of built-in connection managers that support connections to the most commonly used data sources, from enterprise databases and SSAS cubes to text files, FTP sites, and Excel worksheets. Task Factory adds a series of custom connections for XML Destination, Salesforce.com Source and Destination, Dynamics Source and Destination, Email Source, and SharePoint Source and Destination.

Data Flow Components – The data flow engine of SSIS manages the data flow task, which is a specialized, high-performance task dedicated to moving and transforming data from disparate sources to the destination. Unlike other control flow tasks, the data flow task contains additional objects called data flow components, which can be sources, transformations, or destinations, as discussed below:

o Source components or source adapters pull data from a specified source and feed it into the data flow pipeline of the engine.
o Transformation components transform the data to the required format.
o Destination components or destination adapters load or write data to the specified destination.
o Task Factory delivers custom transformations to help scale and increase the performance across your entire ETL environment.

SSIS features a workflow engine that you can use to automate control flow tasks and data flows. Data flows consist of a sequence of data sources, transformations, and destinations arranged as a pipeline through which data is passed between buffers. The buffer-based nature of the data flow pipeline enables ETL developers to maximize data throughput and optimize the overall performance of the data flow. ETL developers can use SQL Server Data Tools, a graphical development interface built on the Visual Studio environment, to create SSIS packages. With the enhancements from the Pragmatic Workbench, organizations can develop BI solutions faster, code once and reuse, and use detailed reporting for impact analysis, lineage, and documentation of their SQL Server environment, including SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), and SQL Server Reporting Services (SSRS).

Each package encapsulates a control flow, which may in turn contain multiple data flows. SQL Server Data Tools provides a simple to use, highly productive development environment that makes it possible for developers to quickly create and deploy complex ETL solutions.
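
The source, transformation, and destination arrangement described in this section can be illustrated with a minimal Python sketch. It is not SSIS code and does not reflect how the SSIS engine manages memory; the function names (`source`, `transform`, `destination`, `run_data_flow`) and the tiny buffer size are assumptions chosen only to show rows moving through a pipeline one buffer at a time.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]
Buffer = List[Row]


def source(rows: Iterable[Row], buffer_size: int) -> Iterator[Buffer]:
    """Source component: read rows and emit them in buffers of at most buffer_size rows."""
    buffer: Buffer = []
    for row in rows:
        buffer.append(row)
        if len(buffer) == buffer_size:
            yield buffer
            buffer = []
    if buffer:  # emit the final, partially filled buffer
        yield buffer


def transform(buffers: Iterator[Buffer], fn: Callable[[Row], Row]) -> Iterator[Buffer]:
    """Transformation component: apply fn to every row, one buffer at a time."""
    for buffer in buffers:
        yield [fn(row) for row in buffer]


def destination(buffers: Iterator[Buffer], target: List[Row]) -> int:
    """Destination component: write each buffer to the target and return the row count."""
    loaded = 0
    for buffer in buffers:
        target.extend(buffer)
        loaded += len(buffer)
    return loaded


def run_data_flow(rows: Iterable[Row], fn: Callable[[Row], Row],
                  buffer_size: int = 2) -> Tuple[List[Row], int]:
    """Chain source -> transformation -> destination as a single pipeline."""
    target: List[Row] = []
    loaded = destination(transform(source(rows, buffer_size), fn), target)
    return target, loaded
```

Because each stage is a generator, only one buffer is in flight at a time, which is the property that lets a buffer-based pipeline trade memory for throughput by adjusting the buffer size.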

Figure 2: Creating an SSIS Package in SQL Server Data Tools

BI xPress from Pragmatic Works offers an easy way to add rich auditing features to SSIS packages using a custom auditing framework. This Auditing Framework uses only native SSIS features and allows you to track packages in real time. The Auditing Framework also features many predefined reports related to the performance of your SSIS packages. You can audit the following information using the reports provided with the Auditing Framework (note: use the Report Viewer application to view auditing data):

Which packages are currently running and which task is running inside the package

Historical package execution detail for a selected date range (e.g. run time, errors, warnings)

Error and Warning by Task and Package

Run time by Task and Package

Variable values before and after execution

Variable change history (every change to a variable value can be tracked)

Connection Manager connection string

Extracted and Loaded Records along with their source and target information (e.g. Table/View, SQL Query, File Name, Component Name, Data Flow Name, Connection String)

Run time Trend for several days/weeks/years by Package and Task

Error/Warning Trend by Package and Task

Extract/Load Trend by Package and Data Flow

Extract/Load Trend by Data Object (e.g. File, Table/View or SQL Query)

SQL Server 2012 introduces a new project-level deployment model for SSIS packages, enabling organizations to deploy and manage multiple related SSIS packages as a single unit. You can define multiple execution environments whose configuration settings take the form of variables that can be mapped to project-level parameters defined in the SSIS project. Projects are deployed to an SSIS catalog on a SQL Server instance and can be managed with SQL Server Management Studio. You can also schedule execution of individual SSIS packages by creating SQL Server Agent jobs, enabling you to create fully automated ETL solutions that power your EIM data integration processes.

Figure 3: SSIS Project Deployment and Management

By incorporating BI xPress when you deploy a project to an SSIS catalog, you can easily monitor the details of package execution through built-in reporting and status tracking (as shown in Figure 4). This functionality lets you watch the status of up to 16 SSIS packages at one time and enables you to verify or troubleshoot package execution and monitor performance over time. The SSIS catalog import tool enables users to import native performance and execution data from a 2012 SSIS catalog into the BI xPress auditing database. This makes it possible for users to view execution data, across multiple servers, for packages that do not have the auditing framework applied.

Figure 4: BIX Monitoring SSIS Package Execution

Integrating Data across the Enterprise and Beyond:

One of the key aims of an EIM solution is to consolidate the information from multiple, disparate sources and provide users with a “single version of the truth” on which to base their decisions. One of the main challenges in achieving this consolidation is that the required data is locked in discrete application silos, or needs to be obtained from external sources.

SSIS Connection Managers and Data Sources:

Earlier in this paper, you learned how SSIS provides a platform for creating ETL solutions that integrate data from multiple sources. One of the key benefits of SSIS is the broad range of data connectivity it supports, from relational database systems to XML and flat files or Excel workbooks. The primary way in which SSIS connects to data sources is through an extensible architecture of connection managers, a significant number of which are provided “out of the box” in SSIS.

Figure 15: SSIS Connection Managers

Figure 15 shows a range of connection managers, including ODBC and OLE DB connection managers that can be used to connect to a wide range of common data sources, including SQL Server, Oracle, DB2, MySQL, and other database systems. You can even connect to and consume data from cloud-based databases in SQL Azure. Additionally, connection managers are available for enterprise applications such as SAP and Teradata.

With the growing market share of SharePoint and the storage of critical business data in SharePoint, it has become common practice to integrate data from SharePoint with a data warehouse (DW) for decision making and business analytics. Pragmatic Works has built a solution that allows easy integration of data from and to SharePoint. The SharePoint List Source Adapter allows users to quickly connect their SSIS packages to SharePoint servers to retrieve list and view data. Organizations can also use the SharePoint List Destination Adapter, which allows users to quickly connect their SSIS packages to SharePoint servers to send data to lists. An easy-to-use UI supports quick mapping of fields from local source data to SharePoint list columns.

SharePoint Source SharePoint Destination

SSIS also includes a large number of connection managers for commonly used data file formats, such as Excel, XML, or comma-delimited text files. You can combine these with control flow tasks to manage file system resources, FTP connections, and Web services to create complex workflows that process and consume data files.

SSIS data flows can include distributed transactions for data sources that support them. You can use these to create reliable ETL processes that produce consistent data. You can also use the checkpoint capability of SSIS to restart failed data flows without repeating workflow tasks that have already completed successfully.
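
The restart behavior described above can be sketched in Python as follows. This is only an illustration of the general checkpoint idea under simple assumptions (a JSON file listing completed task names, tasks modeled as plain callables); it is not the SSIS checkpoint file format or API, and `run_with_checkpoint` is a hypothetical helper.

```python
import json
import os
from typing import Callable, List, Tuple


def run_with_checkpoint(tasks: List[Tuple[str, Callable[[], None]]],
                        checkpoint_path: str) -> List[str]:
    """Run tasks in order, skipping any recorded as completed by a previous failed run.

    Returns the names of the tasks actually executed in this run.
    """
    completed: List[str] = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            completed = json.load(f)

    executed: List[str] = []
    for name, task in tasks:
        if name in completed:
            continue  # finished successfully in an earlier run, so do not repeat it
        task()  # an exception here aborts the run; the checkpoint file is kept
        completed.append(name)
        executed.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump(completed, f)  # record progress after every successful task

    # The whole workflow succeeded, so the next run should start from the beginning.
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)
    return executed
```

Running the same task list again after a failure re-executes only the failed task and those after it, which is the property the paragraph above relies on for long-running ETL workflows.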

If your data resides in SQL Server or Oracle databases, new features in SQL Server 2012 make it easier than ever to identify and extract modified data through enhanced support for Change Data Capture (CDC). These features make it easy to detect data that has changed since the previous data extraction cycle and to restrict data retrieval to only the modified rows. This significantly improves the performance of your ETL workflows while ensuring that the information your organization uses to make business decisions reflects the latest version of the data.
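
The idea of retrieving only the rows modified since the previous extraction cycle can be sketched in Python. This is a simplified model, not the SQL Server CDC API: the `Change` record, its increasing `seq` number (standing in for a position in a change log), and the helper functions are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Change:
    seq: int          # hypothetical, monotonically increasing change sequence number
    operation: str    # "insert", "update", or "delete"
    key: int          # primary key of the affected row
    values: Dict[str, Any] = field(default_factory=dict)


def extract_changes(change_log: List[Change], last_seq: int) -> Tuple[List[Change], int]:
    """Return only the changes recorded after last_seq, plus the new high-water mark."""
    new_changes = [c for c in change_log if c.seq > last_seq]
    new_mark = max((c.seq for c in new_changes), default=last_seq)
    return new_changes, new_mark


def apply_changes(target: Dict[int, Dict[str, Any]], changes: List[Change]) -> None:
    """Apply the extracted changes to the target table (modeled as a dict keyed by row key)."""
    for change in sorted(changes, key=lambda c: c.seq):
        if change.operation == "delete":
            target.pop(change.key, None)
        else:  # insert or update
            target[change.key] = dict(change.values)
```

Each ETL cycle stores the returned high-water mark and passes it to the next call, so unchanged rows are never read again.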

SSIS Testing and Best Practice

User Defined Best Practices

Users can write their own scripts and create their own best-practice rules.

Best Practices Analyzer

BI xPress’ Best Practices Analyzer allows users to investigate SSIS packages for adherence to best practices as defined by Pragmatic Works. The Best Practices Analyzer can investigate packages for violations of various severities such as “Warning”, “Error”, “Performance”, and “Informational”. The Best Practices Analyzer can be started via a command line, within Pragmatic Workbench, or SSDT 2012 for SSIS 2012 packages.

Investigate packages for adherence to best practices ad hoc as well as in batch mode.

Store the results of the best practices analysis within the BI xPress database for later analysis.

Store the results of the best practices analysis within an XML file using the command line.

Determine package performance issues and potential bottlenecks when using certain predefined best practices.

BI Data Testing

Data Testing allows developers to verify that the contents of their data match known values at any step in the unit testing flow. Data Testing is integrated into Pragmatic Workbench's Unit Testing Framework and can easily be set up with a data connection and a data query. Any ADO.NET-compatible connection that has been installed on the user's computer can be used as a data source, allowing data from heterogeneous sources to be compared. Users can also export the data testing results to CSV or HTML.

Compare data from different sources

Cache datasets or use live queries

SSIS Unit Testing

Unit testing allows developers, users, and package lifecycle administrators to ensure that a package does exactly what the developer intended and handles unexpected circumstances with predefined behavior. Unit testing allows specified inputs to be used as "source" data and evaluates the task output against the expected task output, thereby ensuring the package behaves correctly.

Ensure package and/or task(s) execute as expected

Allow the user to specify the scope of each unit test and its expected output, if any

Show the results of the unit test within SSDT
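
The pattern of feeding specified "source" data to a task and comparing its output with the expected output can be sketched in Python. This is not the Pragmatic Workbench Unit Testing Framework; `standardize_state` is a hypothetical task under test and `run_unit_test` a hypothetical harness, both invented for illustration.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def standardize_state(rows: List[Row]) -> List[Row]:
    """Hypothetical task under test: trim and upper-case the 'state' column.

    Predefined behavior for unexpected input: a missing or empty state becomes 'UNKNOWN'.
    """
    result = []
    for row in rows:
        state = (row.get("state") or "").strip().upper()
        result.append({**row, "state": state or "UNKNOWN"})
    return result


def run_unit_test(task: Callable[[List[Row]], List[Row]],
                  source_rows: List[Row],
                  expected_rows: List[Row]) -> Dict[str, Any]:
    """Run the task on the specified source rows and compare actual with expected output."""
    actual = task(source_rows)
    return {"passed": actual == expected_rows, "actual": actual, "expected": expected_rows}
```

A test that includes deliberately malformed rows, such as a missing state, checks the "predefined behavior" side of the contract as well as the normal path.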

BI xPress BI Compare

BI Compare will show users the differences between two SSIS packages and show details on each object.

See differences between any two SSIS packages

The providers for SQL Server and SSAS allow users to read the entire instance specified, including all databases, settings, logins, etc.

SQL Server Data Quality Services (DQS):

The ability to integrate data from multiple data sources into a data warehouse to support business decision making is clearly of great benefit to organizations seeking a competitive advantage. However, decisions must be based on data that is trusted to be accurate, consistent, and complete.

Microsoft® SQL Server 2012 Data Quality Services (DQS) is a new offering as part of SQL Server 2012, allowing customers to cleanse, match, standardize, and enrich their data to deliver trusted information for business intelligence, data warehouse, and transaction processing workloads. End users can even cleanse their personal files in unmanaged documents. SQL Server Data Quality Services (DQS) provides an approachable data quality solution for organizations of all sizes to help improve the quality of their data.

SQL Server Data Quality Services (DQS) provides a knowledge-based approach to managing data quality. Organizations can leverage the business knowledge of their users to create knowledge bases that define known values and validation rules for the data domains used in data records for business entities. For example, you might create a knowledge base for customer data that defines the data domains, or fields, that are commonly used in customer records (such as Customer ID, First Name, Last Name, Gender, Email, Street Address, City, State, Country, etc.). You can then perform knowledge discovery against existing data to identify known values for these fields (such as “California” and “Washington” for the State field), and define rules to validate any new domain values as they are discovered (such as a rule to ensure that all Email values contain a “@” character, or that all Gender values begin with “M” or “F”).
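To make the notion of domain rules concrete, here is a minimal Python sketch of the three examples mentioned above: known State values, an Email rule, and a Gender rule. DQS knowledge bases are built in the DQS client rather than in code, so the `RULES` dictionary and `validate_record` function are purely illustrative assumptions.

```python
# Known values for the State domain, as in the example above.
KNOWN_STATES = {"California", "Washington"}

# One validation rule per domain (hypothetical representation).
RULES = {
    "Email": lambda value: "@" in value,
    "Gender": lambda value: value.startswith(("M", "F")),
    "State": lambda value: value in KNOWN_STATES,
}


def validate_record(record):
    """Return the names of the domains whose value fails its rule."""
    return [
        domain
        for domain, rule in RULES.items()
        if domain in record and not rule(record[domain])
    ]


invalid_domains = validate_record(
    {"Email": "jenny.example.com", "Gender": "F", "State": "California"}
)
```

In this sketch a value outside the known State list is simply flagged; in DQS a new value would instead be surfaced for a data steward to review.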

DQS provides a client application for managing knowledge bases, as shown in figure 5.

Figure 5: Data Quality Services Client Application

As well as defining validation rules for domains in a knowledge base, you can identify synonyms and common data entry errors for domain values, and specify a leading value to which all instances of these values should be corrected. For example, your knowledge discovery might reveal that records for customers who live in California most commonly have a State value of “California”; but often an application user will enter alternative values with the same meaning, such as “CA” or “Calif.”, or will mistype the value and accidentally enter a misspelled variant of “California”. Customer records with variants of the same state value might have minimal impact in the line of business application in which they are entered, but if the data in that application is to be used for analysis or reporting that aggregates values by state, the presence of multiple values for the same state can result in some misleading information on which to base business decisions.

To avoid this problem, you can identify these as known values in the DQS knowledge base, and specify that they are synonyms that should always be corrected to a leading value of “California”. Then, when you use DQS to perform data cleansing, the resulting cleansed data will include consistent values for the state domain. Figure 6 shows a DQS knowledge base in which a Country/Region domain includes the leading value “United Kingdom”, and several synonyms for this value that should be corrected.

Figure 6: Correcting domain values
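The effect of correcting synonyms to a leading value can be sketched as follows, using the State example from the text. The `LEADING_VALUES` mapping and `cleanse_state` function are assumptions for illustration and do not reflect how DQS stores or applies its knowledge base.

```python
from collections import Counter

# Synonyms mapped to their leading value (hypothetical representation).
LEADING_VALUES = {"CA": "California", "Calif.": "California"}


def cleanse_state(value):
    """Correct a known synonym to its leading value; leave other values as-is."""
    value = value.strip()
    return LEADING_VALUES.get(value, value)


raw_states = ["California", "CA", "Calif.", "Washington"]

# Aggregating before cleansing splits California across three values.
before = Counter(raw_states)
# After cleansing, every California variant is counted together.
after = Counter(cleanse_state(state) for state in raw_states)
```

Comparing `before` and `after` shows why uncorrected synonyms distort any report that aggregates by state.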

While a DQS knowledge base is often primarily based on your own organization’s institutional knowledge about business-specific data, there are some cases where it can be useful to incorporate external knowledge for common types of data, such as postal address or telephone number validation. The Microsoft Windows Azure Marketplace includes several commercial datasets that are specifically designed for data cleansing and validation and for which you can purchase a subscription. When you have subscribed to one of these datasets, you can use it as reference data for a domain in a DQS knowledge base and supplement your own business-specific data validation and value correction rules. For example, figure 7 shows how external data, purchased in the Windows Azure Marketplace, can be used to validate and correct company names in a Company domain by referencing a comprehensive dataset of US registered companies.

Figure 7: Using external reference data in a DQS knowledge base

You can perform data cleansing interactively with the DQS client application by specifying a data source such as an Excel spreadsheet or a table in a SQL Server database, and mapping the fields in the data source to domains in the knowledge base. Additionally, you can incorporate data cleansing into ETL processes by using the DQS Cleansing transformation in an SSIS data flow, as shown in figure 8.

Figure 8: Incorporating DQS data cleansing into an SSIS data flow

As well as using DQS for data cleansing, you can create matching policies and perform data matching to identify and consolidate duplicate records for the same business entity. For example, it’s possible that a customer has registered on your organization’s e-commerce Web site as “Jenny Russell”, but also made a purchase in a physical store where the name has been recorded as “Jennifer Russell”. The organization now has multiple customer records for the same customer, which will affect the accuracy of any reporting or analysis that aggregates data by customer.

With DQS, you can create a matching policy that compares multiple domains across records, assigning a weighted value for fields that are exact or approximate matches. So your matching policy might compare customer records on FirstName, LastName, Address, Email, and DateOfBirth domains. When multiple records have enough matching domains to satisfy the matching policy, DQS identifies the records as possible duplicates. For example, if a dataset includes a record for Jenny Russell and a record for Jennifer Russell, but the address, email, and date of birth values for the two records are the same, you can reasonably assume that these records might relate to the same customer.

Figure 9: A Matching Policy
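A weighted matching policy of the kind described above can be approximated in Python. The weights, the 0.80 threshold, and the use of `difflib.SequenceMatcher` for approximate matches are all assumptions chosen for the sketch; DQS uses its own matching algorithms, and the sample customer values are made up.

```python
from difflib import SequenceMatcher

# Hypothetical weight per domain; the weights sum to 1.0.
WEIGHTS = {
    "FirstName": 0.10,
    "LastName": 0.20,
    "Address": 0.25,
    "Email": 0.25,
    "DateOfBirth": 0.20,
}
# Domains compared approximately; the rest must match exactly.
APPROXIMATE = {"FirstName", "LastName", "Address"}
THRESHOLD = 0.80  # assumed minimum score for a possible duplicate


def similarity(a, b, approximate):
    """Return a similarity between 0.0 and 1.0 for two domain values."""
    if approximate:
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return 1.0 if a == b else 0.0


def match_score(r1, r2):
    """Weighted sum of per-domain similarities."""
    return sum(
        weight * similarity(r1[domain], r2[domain], domain in APPROXIMATE)
        for domain, weight in WEIGHTS.items()
    )


def is_possible_duplicate(r1, r2):
    return match_score(r1, r2) >= THRESHOLD


jenny = {
    "FirstName": "Jenny",
    "LastName": "Russell",
    "Address": "12 Example Street",
    "Email": "jenny@example.com",
    "DateOfBirth": "1980-05-17",
}
jennifer = dict(jenny, FirstName="Jennifer")
duplicate = is_possible_duplicate(jenny, jennifer)
```

Because only the low-weight FirstName domain differs, and only approximately, the two records clear the threshold and are flagged as possible duplicates for review.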

The data cleansing and data matching functionality in DQS can help organizations manage the quality and integrity of their data, and help ensure that decisions are based on trusted information.

SQL Server Master Data Services (MDS)

Master Data Services (MDS) is the SQL Server solution for master data management, focused on creation, maintenance and storage of master data structures used for object mapping, reference data, metadata management, and dimensions and hierarchies for data integration operations. This includes business intelligence and data warehousing, and integration between operational systems. With the Master Data Services Add-in for Excel in SQL Server 2012, business users can directly manage existing database or data warehouse dimensions and hierarchies from within Excel without IT intervention, while the IT team is still given oversight to track and reverse changes made by the business.

With DQS, an organization can apply knowledge about individual data field values to cleanse datasets and identify duplicate records. However, large enterprises often need to maintain data representations of core business entities in multiple applications and systems across the business. For example, a company might store employee data in an HR management system and also in a payroll application; or it might store product data in a stock management system and in an e-commerce product catalog.

When the same business entities are represented in multiple systems, it can be useful to maintain a definitive, master record for each entity to ensure that any data relating to a specific entity is consistent across the enterprise. You may approach this challenge by designating one of your application data stores as the master system of record for a given type of business entity (for example, you could use the HR Management system as the definitive source of information for employees), or you could create a separate master data hub that ensures consistency across all systems. The discipline of maintaining a central data definition for business entities is commonly called master data management (MDM), and SQL Server Master Data Services (MDS) provides a SQL Server-based solution that you can use to implement MDM for any kind of business entity.

Figure 10: Managing data models with Master Data Services

As figure 10 shows, MDS enables you to create master data models for your core business entities. These models contain entity definitions, which in turn define the data attributes for each entity. You can also organize your entities into hierarchical relationships, so for example a product might belong to a subcategory, which in turn belongs to a category.

After you have created a master data model, you can manage the data entities in the model to define their attributes (which you can categorize into multiple attribute groups for specific applications or user scenarios). Figure 11 shows the attributes defined for a Product entity.

Figure 11: An entity and its attributes

When you have defined the entities and attributes in your master data model, MDS provides staging tables that you can use to load data into the model. Additionally, you can create subscription views for the entities and hierarchies you have defined so that applications can retrieve master data from the model by simply submitting regular Transact-SQL queries. This database-oriented architecture for transferring data into and out of the master data model makes it easy to build a master data hub. New data is loaded into the MDS model to be brought under the governance of master data management, and applications can consume master data to ensure enterprise-wide consistency. In many cases, SSIS is used as the “engine” to manage the flow of data into and out of the master data hub as shown in figure 12.

Figure 12: Using SSIS to insert and extract master data
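The staging-in, view-out pattern described above can be mimicked with an in-memory SQLite database. This is an analogy only: the table and view names (`stg_product`, `product`, `product_subscription`) and the `process_staging` step are hypothetical and do not reproduce the actual MDS staging schema or subscription view definitions.

```python
import sqlite3

hub = sqlite3.connect(":memory:")
hub.executescript(
    """
    CREATE TABLE stg_product (code TEXT, name TEXT, price REAL);      -- staging table
    CREATE TABLE product (code TEXT PRIMARY KEY, name TEXT, price REAL);  -- model entity
    CREATE VIEW product_subscription AS
        SELECT code, name, price FROM product;                        -- read-only view
    """
)


def stage(rows):
    """Load incoming rows into the staging table."""
    hub.executemany("INSERT INTO stg_product VALUES (?, ?, ?)", rows)


def process_staging():
    """Move staged rows into the model (updating existing codes), then clear staging."""
    hub.execute(
        "INSERT OR REPLACE INTO product SELECT code, name, price FROM stg_product"
    )
    hub.execute("DELETE FROM stg_product")


stage([("BK-100", "Road Bike", 1200.0), ("HL-200", "Helmet", 35.0)])
process_staging()

# A consuming application reads master data with an ordinary query against the view.
rows = hub.execute(
    "SELECT code, name FROM product_subscription ORDER BY code"
).fetchall()
```

Consumers only ever touch the view, so the model tables can change behind it without breaking downstream queries, which is the main appeal of the database-oriented approach.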

When your master data model has been populated with data, you can view and manage the data instances of the entities it defines, and create custom hierarchies and collections of entities for specific business scenarios. For example, you could create an explicit hierarchy of products that are sold through a specific retail partner channel, as shown in figure 13.

Figure 13: An explicit hierarchy

You can also use MDS to validate the data in your master data model by applying custom business rules. For example, you could define a rule that verifies that all product prices are greater than zero as shown in figure 14.

Figure 14: Defining a business rule
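The price rule mentioned above can be expressed as a simple predicate. In MDS, business rules are defined through the product's own tools, so the `BUSINESS_RULES` dictionary, the `validate_members` function, and the sample product members below are illustrative assumptions only.

```python
# Rule name mapped to a predicate over an entity member (hypothetical representation).
BUSINESS_RULES = {
    "Price must be greater than zero": (
        lambda member: member.get("Price") is not None and member["Price"] > 0
    ),
}


def validate_members(members):
    """Map each failing member's code to the names of the rules it breaks."""
    failures = {}
    for member in members:
        broken = [name for name, rule in BUSINESS_RULES.items() if not rule(member)]
        if broken:
            failures[member["Code"]] = broken
    return failures


products = [
    {"Code": "BK-100", "Price": 1200.0},
    {"Code": "HL-200", "Price": 0.0},
]
failures = validate_members(products)
```

Members that pass every rule are omitted from the result, so an empty dictionary means the whole entity is valid.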

MDS includes many more features that enable you to implement complex MDM solutions and ensure that consistent data representations of key business entities are used across the enterprise. The combination of this ability to manage master data with MDS, the data cleansing and matching functionality of DQS, and the data integration capabilities of SSIS, creates a comprehensive platform for EIM.

Self-Service Business Intelligence

Today’s business users have an increasing demand for self-service ad-hoc reporting. Self-service BI has become one of the most highly sought after pieces within a complete business intelligence solution. Creating self-service BI tools for business users to generate their own reporting drastically reduces dependency on IT for report generation. Microsoft’s self-service BI tools include robust Excel plugins, including Power Query, Power Pivot, Power View and Power Map. These tools provide users a simple way to discover, combine, and refine data, all within the familiar Excel interface. As business users gain the tools needed to pull their own reports on demand, it becomes critical to incorporate a Master Data Management strategy to ensure that users are reporting from the most accurate and credible data sets.

Empowering Business Users

Microsoft’s complete EIM solution makes it clear that Microsoft understands business data belongs to the business, not to IT. The IT department is adept in managing application and data infrastructure, but knowledge of what that data actually means and how it should be cleansed and made consistent is best understood by the information workers who use it in their day-to-day roles. SQL Server 2012 gives IT specialists the tools they need to build a comprehensive data integration solution and manage data governance and compliance across data infrastructure, but also gives business users intuitive tools that they can use to manage the quality and integrity of their own data.

Empowering Users to Manage Data Quality

The DQS client application provides an intuitive wizard-based tool with which business users can create and manage knowledge bases, and perform data quality tasks such as data cleansing or matching, as shown in figure 17. This ability to manage data quality with minimal technology or database expertise makes it possible for business users to take on the role of “data steward”, and manage the integrity of the data used by the business.

Figure 17: A wizard-based approach to data quality management

After performing a data cleansing or matching operation with the DQS client application, the user can export the results as a Microsoft Excel workbook as shown in figure 18. This enables them to use a familiar tool to examine and verify the suggestions that DQS has generated before applying them to production data.

Figure 18: Data cleansing results in Microsoft Excel

Empowering Business Users to Manage Master Data

Excel is also the primary tool from which business users can manage master data. With SQL Server 2012 Master Data Services, information workers can use the MDS Add-In for Microsoft Excel to create and manage existing database or data warehouse dimensions and hierarchies from within Excel as shown in figure 19. Excel provides a familiar and intuitive environment for managing master data where business users can build and publish master data models quickly and efficiently, without specialist support from IT or external consultants.

Figure 19: Managing a master data model with Excel

When the master data model is built, Excel continues to provide a user-friendly environment for adding and editing entity records in the model. This is done by using standard Excel functionality to type individual attribute values or copy and paste entire ranges of cells that represent multiple entity instances.

Users can also save and share queries against the master data model. They can even validate data against the business rules defined in MDS as shown in figure 20.

Figure 20: Validating master data against business rules in Excel

Make Trusted Decisions on Credible, Consistent Data

The overall goal of any EIM solution is to enable business users to properly manage the data they use to make critical business decisions. With SQL Server 2012, business users can use DQS to define and manage the knowledge bases on which data cleansing and matching rely. Additionally, they can manage the consistency of business entity data through Master Data Services. The result of this user-centric approach is a solution that maximizes the value of business data, quickly and cost-effectively.

To be of any practical use, a user-centric EIM solution must support user-centric BI. Users must be able to easily consume the standardized data they have created.

Microsoft understands that a huge number of organizations rely on their Microsoft Office suite as their primary productivity application for business users. Microsoft Excel has become the de facto standard for business users looking to easily manage data in a simple, tabular format. Within Microsoft Excel, PowerPivot (shown in figure 21) provides a massively scalable but easy-to-use, Excel-based data analytics tool with which business users can slice and dice data, and easily share their analysis through SharePoint.

Figure 21: Analyzing data with PowerPivot

SQL Server 2012 also introduces Power View, a user-centric tool for interactively visualizing data in an intuitive and easy-to-use interface, as shown in figure 22.

The ability for business users to take on the role of data steward with DQS and MDS, and to directly analyze and visualize data with self-service BI tools like PowerPivot and Power View, enables them to take an active role in the complete EIM lifecycle. This user-centric approach empowers organizations to use their IT resources to manage data infrastructure and integration processes effectively, while reducing the burden on IT to manage business data and analytics. This helps to reduce the overall cost of implementing EIM and facilitates a dynamic approach to business decision making that promotes business responsiveness and flexibility.

Figure 22: Interactive data visualization with Power View

Pragmatic Works Software

Since 2008, Pragmatic Works has relied on our customers’ feedback, real-life service engagements, and the collective experience of the 19 Microsoft MVPs on our staff to help build software solutions that fill the gaps of Visual Studio / SQL Server Data Tools.

There are six core areas in which Pragmatic Works’ software enhances SSIS development and management: Performance, Productivity, Code Quality, Security & Compliance, Connectivity, and SLA Management.

BI xPress/DBA xPress

BI xPress is a plug-in to Visual Studio / SQL Server Data Tools that significantly speeds the development and eases the administration of SSIS, SSAS, and SSRS. Highlighted below are the key areas where Pragmatic Works’ BI xPress gives you an edge when developing your Microsoft BI solution.

Quality:
o Reusability and Standardization: Data Flow Nugget MDX Calculation Builder Package Builder Snippet Wizard Auditing Framework Compare Snapshot
o Development and QA Testing: BI Data Testing SSIS Unit Testing Best Practice Analyzer

Performance:
o Find Potential Performance Design Issues: Best Practice Analyzer
o Identify Longest Running Packages: Auditing Framework Report Performance Dashboard

Productivity:
o Reusability in SSIS: Auditing Framework MDX Calculation Builder Notification Framework Package Builder Snippet Wizard Data Flow Nugget
o Find Problems Faster: Best Practice Analyzer Data & Schema Inspector BI Compare
o Speed Time to Production: Package Deployment Wizard Report Deployment
o Analyze Data Faster: Schema and Data Surf

SLA Management:

o Bring System Online Faster: Report Dashboard Auditing Framework Notification Framework

Security and Compliance:
o Separation of Duties: Package Deployment Wizard Report Deployment
o Monitor Who Did What: Report Dashboard Auditing Framework Notification Framework
o Company Policy: Best Practice Analyzer

Task Factory

Writing code to perform SSIS tasks can be time consuming for even the most experienced SSIS developers. With Task Factory you can simply drag and drop prewritten components directly into your SSIS packages without ever leaving the familiar BIDS environment!

Quality:
o Scrubbing and Fixing of Data: Data Validation Data Cleansing Replace Unwanted Characters NULL Handlers RegEx Replace
o Address Problems and Standards: Address Parsing
o Time Zone Conversion: Time zone Conversion Transform

Performance:
o Data Warehousing: Dimension Merge SCD
o Updating Issues: Upsert Destination Oracle Upsert Destination

Productivity:
o Reusability in SSIS: Data Flow Nuggets
o Avoid Manual Scripting: Advanced Derived Column Any Task Factory Component

Connectivity:
o Instant CRM Connectivity: SalesForce.com Dynamics
o Other Source and Destinations: SharePoint XML Email Source

Security and Compliance:

o Security Concerns: PGP Encryption Secure FTP Advanced Derived Column

Doc xPress (Formerly BI Documenter)

Doc xPress provides complete documentation for Microsoft SQL Server instances, Analysis Services (SSAS) cubes, Integration Services (SSIS), and Reporting Services (SSRS), with the ability to take comparison snapshots of what changed in your environment for quick and painless troubleshooting.

Quality:
o Code Review and Collaboration: Document SSIS, RS, AS, SQL Compare Snapshot
o Identify Impact of Changes and Potential Issues: Compare Snapshot Impact analysis Lineage analysis Snapshot Comparison
o Where Did the Data Come From? Object Lineage

Productivity:
o Documentation Generator: Solution Wizard
o Document Automatically: Document SSIS, RS, AS, SQL
o Finding issues proactively: Compare Snapshot Impact analysis Lineage analysis
o Managing the Metabase: Snapshot Management
o Documentation Image Support: ER Diagrams
o Generate Documentation Output: Multiple Format Standard

SLA Management:
o Identify What Broke the System: Compare Snapshot

Security and Compliance:
o Sarbanes-Oxley/HIPAA: Document SSIS, RS, AS, SQL Compare Snapshot
o Tracing Object Lineage: Lineage analysis
o Audit Changes: Impact analysis Compare Snapshot

Pragmatic Works Consulting Expertise (Remote & Onsite)

Pragmatic Works’ team of expert consultants, SQL Server MVPs, authors and mentors will work directly with you to identify challenges, scope a solution and implement it with the highest quality in the industry. Our unique approach and talented team make quick work of assessing challenging environments that may stifle others. Our relationships with Microsoft, hardware and software vendors and the community give us the advantages required to deliver world class thought leadership and solutions that will make your business more intelligent.

Data Integration & Cleansing

Whether it’s through ETL migrations, performance tuning, or complicated data cleansing and governance, Pragmatic Works can help organizations drive better decisions through finely tuned BI systems. Our offerings focus on getting you up and running on your solution in a short amount of time.

o ETL Migrations in record time using our best of breed teams and tools
o ETL performance tuning and optimization to achieve record load times
o Data cleansing and governance experience second to none
o Master data implementation and management leaders

Analytics & Reporting

Data growth is exploding, and data sources are increasing exponentially. Leveraging all of this data to drive real-time analytics is how business leaders are gaining a competitive edge. Harness the power of the analytics and presentation layer tools packaged with SQL Server to provide the insights needed to power your business.

o Create stunning reports and visualizations using the latest technologies in SharePoint, Reporting Services, PerformancePoint, PowerPivot and Power View
o Get quick development and data refresh capabilities with Analysis Services and Tabular Modeling
o Thought leadership on how best to structure, analyze and report on your data based on our industry leading experience and expertise

Parallel Data Warehousing (PDW)

Pragmatic Works has recently been designated as Microsoft’s “PDW Implementation Partner of the Year.” Pragmatic Works has been implementing PDW solutions for our customers since V1. Our team has been expertly trained by Microsoft’s PDW architecture staff, which has enabled us to implement more PDW solutions than anyone else in the world. With Pragmatic Works you can rest assured you will receive the expertise, talent and experience needed for a successful PDW implementation.

Assessment: Pragmatic Works’ proprietary assessment tool will quickly help your organization understand if PDW is a fit for you and the extent of the effort necessary for the undertaking. We will also scope a series of sprints for the implementation phase and deliver a project plan with your team.

Training: Pragmatic Works will work alongside the Microsoft team to deliver training on an actual PDW appliance. This in-depth training is done prior to the implementation phase to ensure a smooth transition for your team.

Implementation: Pragmatic Works will work with the Microsoft PDW team to validate and benchmark the new appliance. We will then work with your team on the implementation focusing on best practices and recommendations.

Appliance & Cloud

Microsoft’s Parallel Data Warehouse and Azure Cloud offerings are helping transform today’s leading businesses through breakthrough performance and amazing elasticity. Let our experience help you explore the limits of your business’ potential. Our team of Sr. BI/Data Architects can help you implement these solutions quickly so that the business can start enjoying the benefits almost immediately.

o Pragmatic Works has the most experience migrating and deploying to Parallel Data Warehouse in the industry.
o Appliance specific expertise including parallel ETL design, BI integration, and handling availability and recoverability challenges
o More Fast Track and other reference architecture experience than anyone else
o Cloud specialists on staff to help you explore your options within the Azure ecosystem

Migration and Consolidation

We all know that if you’re not changing, you’re not growing. We work constantly with our partners and clients to review the cost advantages of an efficient data platform. This means migrating and consolidating IT environments. Whether that means consolidating physical machines into virtual environments or migrating on-premises applications to cloud-based environments, the opportunity to cut costs is ever-present. At Pragmatic Works, we’re constantly looking for opportunities for our customers to migrate, consolidate and cut costs within their data management platforms.

Performance & Infrastructure

Performance demands and maintenance windows are not getting any easier to manage. Take your system performance and manageability to new heights through optimization or consolidation.

o Let us help you identify and resolve bottlenecks and design challenges in your environment
o Take advantage of savings to be found through industry leading consolidation efforts on Hyper-V or VMware
o Scale your environment to handle new applications, increased user volume or make better use of your licensing
o Review and understand your risk before, during and after migrations with our thought leadership and focus on performance and optimization

Availability & Scalability

Data availability is more critical than ever. The need for highly available data in multiple locations is a requirement for most organizations. Gartner reports that more than 70% of businesses wish they had more confidence in their recovery and availability strategies. Let Pragmatic Works give your business that confidence.

o End to end recovery testing and validation
o Prototyping of recovery technologies and new SQL Server availability features
o Experts with clustering, mirroring and other recovery technologies
o Top Tier replication experience on staff
o Real-world experience with complicated business availability scenarios

Corporate Training (Virtual & Onsite)

Broaden your staff’s skillset and maximize SQL Server & Microsoft BI development within your organization.

The Pragmatic Works Corporate Training Program has been created by Microsoft MVPs and authors Brian Knight and Devin Knight to deliver an unparalleled training experience. The classes that make up the program allow users to train in very small, intimate settings for a very hands-on experience with the class instructor. Give your staff the confidence they need to develop exceptional solutions that wow the user base.

Key Features and Benefits

o Pragmatic Works trainers are tenured instructors and real world consultants who bring their experience and teaching expertise to your location.

o Class curriculums are developed by SQL Server MVPs and Pragmatic Works Principals Brian Knight and Adam Jorgensen and include topics most commonly encountered in the real world.

o Train with Pragmatic Works’ Sr. BI Architects & Sr. DBAs who have a Master level certification in delivering the training curriculum

Pragmatic Works Corporate Virtual Training

Our Virtual Training classes can be taken directly from the office, and with schedules that span just a few hours a day, attendees can still keep up with work activities. The maximum class size is 24 attendees, allowing you to work closely with the class instructor on class labs that help reinforce the material presented. Following the training, attendees are provided “Virtual Mentoring” time to work one-on-one with Pragmatic Works’ Sr. Consultants to tackle their own toughest challenges.

Pragmatic Works Corporate Onsite Training

Our courses can be scheduled to take place at your offices, reducing the need for your group to travel and incur additional expenses. The classes are lab based, with an instructor who will walk through examples and then lead students in completing work and facilitate discussions. With onsite training, students also get the benefit of discussing their own unique cases and work environment related to the course material.

Data Integration
o Pragmatic SSIS
o Pragmatic Master SSIS

Analysis
o Pragmatic SSAS
o Pragmatic Master SSAS
o Introduction to MDX
o Tabular and PowerPivot for Developers

Business Analytics
o PowerPivot for the Business Analyst
o Self-Service Business Intelligence

Data Visualization
o Pragmatic SSRS
o Pragmatic Master SSRS
o SharePoint for Business Intelligence

Database Administration
o Pragmatic SQL Server Performance Tuning

Conclusion:

The ideal Enterprise Information Management (EIM) solution should start and end with the business users who drive the success of the company. Pragmatic Works empowers business users to manage the quality, integrity, and standardization of the data they use every day, allowing them to trust that they are making decisions on credible, consistent data. In this model, IT still retains oversight of the organization’s data infrastructure. With SQL Server 2012 Integration Services, Master Data Services, Data Quality Services and the Pragmatic Workbench, you can easily bring together data from all across your enterprise. You can use the data quality and governance rules defined by the business to create a reliable, trusted source of data for business decision making.

A Few of Our Clients (References Available upon Request)

Pragmatic Works: Contacts

Brian Knight, CEOBrian Knight, SQL Server MVP, MCITP, is the co-founder of SQLServerCentral.com, JumpstartTV.com, and was on the Principal Board of Directors of the Professional Association for SQL Server (PASS). Brian is a contributing columnist for many industry magazines and sites. He has co-authored and authored more than 15 technology books. Brian has spoken at dozens of conferences like PASS, SQL Connections and TechEd and many Code Camps.

Tim Moolic, COO

Tim Moolic has more than 15 years of experience executing sales, marketing, and alliance strategies for software and consulting companies. Tim founded his first software and consulting business in 1996 and has worked with a wide range of enterprise products, from systems management and automated testing to database and online collaboration tools. As Chief Operating Officer for Pragmatic Works, Tim continues to expand the company's business, including software acquisitions, recruiting services, platform alliances, and international franchising.

Adam Jorgensen, President, Pragmatic Works Consulting

Adam Jorgensen, MBA, MCDBA, MCITP: BI, has over a decade of experience leading organizations around the world in developing and implementing enterprise solutions. His passion is finding new and innovative avenues for clients and the community to embrace business intelligence and lower barriers to implementation. Adam's focus is on mentoring senior management and technical teams, helping them realize the value in the data they already own while accelerating the technology learning curve. Adam is also very involved in the community as a featured author on SQLServerCentral and SQLShare, and as a regular contributor to the SQLPASS Virtual User Groups for Business Intelligence and other organizations. He regularly speaks at industry group events, major conferences, Code Camps, and SQLSaturday events on strategic and technical topics.

Rob Peters, Vice President of Sales

Robert’s work with startup companies includes developing and building sales organizations, designing and implementing processes to improve sales conversion, and increasing revenue volume. As a Regional Sales Manager at Quest Software, he was responsible for driving over 25 million in sales for their SQL Server Division. Robert was instrumental in making Quest the leader in the SQL Server space. Robert brings passion, drive, and experience, and is a master at providing clients the highest level of service and professionalism. He has been a valuable asset to every organization he has been affiliated with and is excited to continue his success at Pragmatic Works.
