INFORMATICA

47468272 introduction-to-informatica


Overview

A data warehouse is a collection of subject-oriented databases, supported by a series of processes, procedures and tools (hardware and software). From the data warehouse, data flows to various customized databases. If this data is periodically extracted from the data warehouse and loaded into a local database, that local database is called a data mart.

Complete Warehouse Solution Architecture

[Figure: Data sources (operational data, legacy data, and external data sources such as The Post and VISA) pass through Extract, Transform, Load into the organizationally structured enterprise data warehouse, which feeds departmentally structured data marts (Sales, Inventory, Purchase). Stage labels: Data Sources, Data Management, Access; Data, Information, Knowledge; Asset Assembly (and Management), Asset Exploitation. Metadata spans the architecture.]

Use of Informatica in Data Warehousing

Overview

The data in a data warehouse comes from various sources running on different platforms. An ETL tool integrates data from these sources and loads it into the data warehouse. INFORMATICA is an ETL tool used to extract data, transform it, and load it into the data warehouse. INFORMATICA has two products that carry out this ETL process:

•PowerCenter
•PowerMart
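The extract, transform and load stages described above can be illustrated outside any particular tool. The following is a minimal Python sketch, not Informatica code: the CSV text, the `orders` table and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from an operational system.
SOURCE_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,carol,7.25
"""


def extract(text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: skip rows with no amount, tidy names, convert types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # incomplete record, not loaded
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned


def load(rows, connection):
    """Load: write the transformed rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, customer TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    connection.commit()


# In-memory database used here as a stand-in for the warehouse.
warehouse = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), warehouse)
loaded = warehouse.execute(
    "SELECT customer, amount FROM orders ORDER BY order_id"
).fetchall()
```

An ETL tool replaces hand-written functions like these with definitions stored as metadata, so the same pattern can be configured rather than coded.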

[Figure: The server reads source data from the source and writes transformed data to the target, following instructions held in the repository.]

Components

INFORMATICA PowerCenter has the following components:

•ODBC
•PowerCenter Server: An application that reads data, transforms it, and writes it to the target.

•PowerCenter Client: The client has five different tools:

The Source Analyzer: Used to add source definitions to the repository.
The Warehouse Designer: Used to create targets and add their definitions to the repository.
The Transformation Developer: Used to create reusable transformations.

The Mapplet Designer: Used to create mapplets.
The Mapping Designer: Used to create mappings from sources to targets.


Connectivity And Set Up


Configuring Server Manager

• Informatica Server name

• Type of network protocol to access the server – TCP/IP or IPX/SPX

• Port number on which the client communicates (for TCP/IP) - 4001

• Address of machine on which the server runs (for IPX/SPX)

• Timeout – number of seconds the Server Manager waits for a response from the Informatica Server

• Default directories for session files and caches, e.g. $PMRootDir, $PMSessionLogDir, $PMBadFileDir

• Defining Database Connections

• Defining FTP connections
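The registration settings listed above depend on one another: the port applies to TCP/IP and the machine address to IPX/SPX. The sketch below models that dependency in Python. It is only an illustration; the class name, field names and the default timeout are hypothetical and do not come from the Server Manager itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerRegistration:
    """Hypothetical container for the server settings listed above."""

    server_name: str
    protocol: str                    # "TCP/IP" or "IPX/SPX"
    port: Optional[int] = None       # used with TCP/IP
    address: Optional[str] = None    # used with IPX/SPX
    timeout_seconds: int = 60        # illustrative default, not from the source

    def validate(self):
        """Return a list of problems; an empty list means the settings agree."""
        problems = []
        if self.protocol == "TCP/IP":
            if self.port is None:
                problems.append("TCP/IP requires a port number")
        elif self.protocol == "IPX/SPX":
            if self.address is None:
                problems.append("IPX/SPX requires a machine address")
        else:
            problems.append(f"unknown protocol: {self.protocol}")
        if self.timeout_seconds <= 0:
            problems.append("timeout must be positive")
        return problems


# Port 4001 is the value mentioned in the list above.
ok = ServerRegistration("dev_server", "TCP/IP", port=4001)
bad = ServerRegistration("dev_server", "IPX/SPX")
```

Calling `ok.validate()` returns an empty list, while `bad.validate()` reports the missing machine address.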

Features

•INFORMATICA Server: Reads data from sources, transforms it as instructed by the repository metadata, and writes it to the target.

•Repository Manager: Used to create and manage repositories.

A repository is a database containing a set of instructions that specify where to get data (source), how to process or transform it, and where to write it (target). This set of instructions is called metadata.

In the Repository Manager you can create repository users and groups, assign privileges and permissions, manage folders and locks, and import and export from ODBC data sources.

•Designer: Used to create mappings and target tables.
•Server Manager: Used to create sessions and configure the schedule on which the sessions run.

Repository User Management

Multiple developers can use the same repository to create and manage multiple projects or the same project. Informatica allows a separate user profile to be created for each developer, with its own username and password.

Privileges such as Administer Server, Create Sessions and Use Designer can be assigned to each user of the repository.
Groups of users can be created, and privileges can be granted to the groups.
A user can be a member of one or more groups.

Access can be restricted to individual folders within a repository.
Permissions of the following types can be granted on folders to the Owner, the Owner's group and Repository users:
Read: Allows viewing the folder and the objects within it.
Write: Allows creating and editing objects within the folder.
Execute: Allows executing or scheduling a session in the folder.
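The folder permission scheme above (Read, Write and Execute, granted separately to the owner, the owner's group and all repository users) can be sketched as a small Python model. The class and method names are hypothetical, and the order in which the three levels are checked (owner first, then group, then everyone else) is an assumption borrowed from Unix-style permissions, not something the source states.

```python
from enum import Flag


class Permission(Flag):
    READ = 1
    WRITE = 2
    EXECUTE = 4


class Folder:
    """Hypothetical folder with permissions for owner, owner's group and others."""

    def __init__(self, owner, owner_group, owner_perms, group_perms, repo_perms):
        self.owner = owner
        self.owner_group = owner_group
        self.owner_perms = owner_perms
        self.group_perms = group_perms
        self.repo_perms = repo_perms

    def allowed(self, user, groups, needed):
        """Return True if the user holds every permission in `needed`."""
        if user == self.owner:
            granted = self.owner_perms      # assumption: owner level checked first
        elif self.owner_group in groups:
            granted = self.group_perms
        else:
            granted = self.repo_perms
        return (granted & needed) == needed


folder = Folder(
    owner="anna",
    owner_group="etl_devs",
    owner_perms=Permission.READ | Permission.WRITE | Permission.EXECUTE,
    group_perms=Permission.READ | Permission.EXECUTE,
    repo_perms=Permission.READ,
)
```

With these settings, a member of `etl_devs` can run sessions in the folder but cannot edit its objects, and any other repository user can only view it.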

Designer

• Creation of mappings

MAPPING

A type of metadata that you create to specify how to move and transform data between sources and targets.

- Stored in the repository

A mapping describes how to move and transform data from sources to targets. A mapping includes:

•Source
•Target
•Transformations

[Figure: Sample Mapping]
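As an illustration of the idea that a mapping is metadata naming a source, a target and the transformations between them, here is a minimal Python sketch. The `Mapping` class, the object names and the two transformation functions are hypothetical; this is not how the repository actually stores mappings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Transformation = Callable[[List[Row]], List[Row]]


@dataclass
class Mapping:
    """Hypothetical description of a mapping: source, target, transformations."""

    name: str
    source: str
    target: str
    transformations: List[Transformation] = field(default_factory=list)

    def run(self, rows):
        """Apply each transformation in order to the rows read from the source."""
        for transformation in self.transformations:
            rows = transformation(rows)
        return rows


def drop_inactive(rows):
    """Keep only rows flagged as active."""
    return [row for row in rows if row["active"]]


def uppercase_names(rows):
    """Return copies of the rows with the name column in upper case."""
    return [{**row, "name": str(row["name"]).upper()} for row in rows]


mapping = Mapping(
    name="m_customers",
    source="CUSTOMERS_SRC",
    target="CUSTOMERS_DW",
    transformations=[drop_inactive, uppercase_names],
)
result = mapping.run(
    [{"name": "ann", "active": True}, {"name": "bo", "active": False}]
)
```

The mapping object only describes the work. Something else, which in Informatica is the server, has to read it and move the data.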

Transformations

A transformation is a component of a mapping that describes how the Informatica Server should transform data.

There are two categories of transformations, depending on their scope:

Standard Transformation: Created in a mapping and exists only within that mapping. It cannot be used in other mappings.
Reusable Transformation: Created and stored independently in the repository. It can be used by all mappings.

The following are the types of transformations:

Expression – Calculates a value or modifies text. Operates on individual rows.
Aggregator – Performs aggregate calculations. Operates on sets of rows.

Source Qualifier – Filters records read from relational sources only. Orders records queried by the Informatica Server.
Filter – Filters records sent to the targets. Applicable to any source.
Stored Procedure – Calls a stored procedure.
External Procedure/Advanced External Procedure – Calls a procedure in a shared library (e.g. a DLL) or in the COM layer of Windows NT.

Sequence Generator – Generates primary keys.
Rank – Limits records to a top or bottom range.
Normalizer – Normalizes records, including those read from COBOL sources.
Lookup – Gets related values.

Update Strategy – Determines whether to insert, update, delete or reject data.
Joiner – Joins records from different databases or flat file systems.
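Several of the transformation types listed above differ mainly in whether they work on one row at a time or on a set of rows. The Python sketch below imitates four of them (Expression, Filter, Aggregator and Rank) on a small in-memory data set. The column names and function names are hypothetical, and the functions are only analogues of the behaviour described, not Informatica's implementation.

```python
from collections import defaultdict

# Hypothetical sales rows.
rows = [
    {"region": "east", "product": "pen", "qty": 3, "price": 2.0},
    {"region": "east", "product": "ink", "qty": 1, "price": 8.0},
    {"region": "west", "product": "pen", "qty": 5, "price": 2.0},
    {"region": "west", "product": "pad", "qty": 0, "price": 4.0},
]


def filter_rows(rows):
    """Filter analogue: pass on only the rows that meet a condition."""
    return [r for r in rows if r["qty"] > 0]


def expression(rows):
    """Expression analogue: compute a value for each row independently."""
    return [{**r, "total": r["qty"] * r["price"]} for r in rows]


def aggregator(rows):
    """Aggregator analogue: compute one value per group of rows."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["region"]] += r["total"]
    return dict(totals)


def rank(rows, n):
    """Rank analogue: keep only the top n rows by total."""
    return sorted(rows, key=lambda r: r["total"], reverse=True)[:n]


with_totals = expression(filter_rows(rows))
by_region = aggregator(with_totals)
top_two = rank(with_totals, 2)
```

Here `expression` and `filter_rows` never look at more than one row, while `aggregator` and `rank` need the whole set before they can produce a result.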

Every mapping needs at least one Source Qualifier transformation, or a Normalizer transformation for COBOL sources.
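The Lookup, Update Strategy and Sequence Generator transformations listed earlier are often easiest to picture working together. The following Python sketch is a loose analogue under stated assumptions: the data, the decision rules in `update_strategy` and all names are hypothetical, and a real mapping would configure these transformations rather than code them.

```python
from itertools import count

# Hypothetical rows already in the target, keyed by business id.
existing_target = {"C001": {"key": 1, "city": "Pune"}}

# Hypothetical reference data used by the lookup.
city_lookup = {"PNQ": "Pune", "BOM": "Mumbai"}

# Sequence Generator analogue: hands out the next primary key.
key_sequence = count(start=2)


def lookup_city(code):
    """Lookup analogue: get the related value, or None if the code is unknown."""
    return city_lookup.get(code)


def update_strategy(row, target):
    """Update Strategy analogue: decide what to do with each row."""
    if row["city"] is None:
        return "reject"                      # lookup failed
    if row.get("closed"):
        return "delete"
    return "update" if row["id"] in target else "insert"


incoming = [
    {"id": "C001", "city_code": "BOM"},
    {"id": "C002", "city_code": "PNQ"},
    {"id": "C003", "city_code": "XXX"},
]

decisions = []
for row in incoming:
    row["city"] = lookup_city(row["city_code"])
    action = update_strategy(row, existing_target)
    if action == "insert":
        row["key"] = next(key_sequence)      # new rows get a generated key
    decisions.append((row["id"], action))
```

The Joiner transformation is not shown here. It would correspond to combining rows from two different sources on a shared column before these steps run.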

Ports

A port represents a single column of data.
Every source definition, target definition and transformation contains a collection of ports.

There are four types of ports:

Input port – Receives data.
Output port – Provides data.
Input/Output port – Passes data.
Variable port – Used to store components of an expression.

Source definitions contain only output ports, since they provide data.

Target definitions contain only input ports, since they receive data.

Transformations contain a combination of input, output and input/output ports, since they can pass the data through as it is or modify it, depending on its type.
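The port rules above amount to a simple constraint on each kind of object: sources expose only output ports, targets only input ports, and transformations may mix types. A minimal Python sketch of that constraint follows. The `Port` and `PortType` names and the `valid_ports` function are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class PortType(Enum):
    INPUT = "input"
    OUTPUT = "output"
    INPUT_OUTPUT = "input/output"
    VARIABLE = "variable"


@dataclass(frozen=True)
class Port:
    """A single column of data with a direction."""

    name: str
    type: PortType


def valid_ports(kind, ports):
    """Check the port rules for a source, target or transformation."""
    types = {port.type for port in ports}
    if kind == "source":
        return types <= {PortType.OUTPUT}     # sources only provide data
    if kind == "target":
        return types <= {PortType.INPUT}      # targets only receive data
    if kind == "transformation":
        return True                           # any combination is allowed
    raise ValueError(f"unknown kind: {kind}")


source_ports = [Port("customer_id", PortType.OUTPUT), Port("name", PortType.OUTPUT)]
expression_ports = [
    Port("name", PortType.INPUT_OUTPUT),      # passed through unchanged
    Port("price", PortType.INPUT),
    Port("v_tax", PortType.VARIABLE),         # intermediate part of an expression
    Port("price_with_tax", PortType.OUTPUT),
]
```

For example, `valid_ports("source", source_ports)` is true, while the same ports offered as a target definition would be rejected.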

Transformation Language

The transformation language is used to write expressions for transformations. It consists of functions (similar to SQL functions) used to modify or validate data.

Expressions can be written in the following types of transformations:

•Aggregator
•Expression
•Filter
•Rank
•Update Strategy

The transformation language consists of the following components:

Functions: e.g. AVG, COUNT, ISNULL, SUBSTR, IIF.
Operators: e.g. addition, subtraction, multiplication, division.
Constants: e.g. built-in constants such as TRUE.
Variables: e.g. SYSDATE, which represents the current date.
Return values.
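To give a feel for how functions such as IIF, ISNULL and SUBSTR combine in an expression, the sketch below defines rough Python analogues. These are hypothetical helpers, not the transformation language itself. The 1-based indexing in `substr` and the form of the expression quoted in the comment are assumptions to be checked against the Informatica documentation.

```python
def isnull(value):
    """Analogue of ISNULL: true when the value is missing."""
    return value is None


def iif(condition, when_true, when_false=None):
    """Analogue of IIF: choose one of two values based on a condition."""
    return when_true if condition else when_false


def substr(text, start, length=None):
    """Analogue of SUBSTR, assuming 1-based positions."""
    if text is None:
        return None
    begin = max(start - 1, 0)
    end = None if length is None else begin + length
    return text[begin:end]


def clean_code(raw):
    # Roughly the shape of an expression such as
    # IIF(ISNULL(RAW), 'UNK', SUBSTR(RAW, 1, 3)); exact syntax is an assumption.
    return iif(isnull(raw), "UNK", substr(raw, 1, 3))
```

Calling `clean_code("PUNE-01")` yields the first three characters, and `clean_code(None)` falls back to the placeholder value. Unlike a conditional in an expression language, Python evaluates both arguments of `iif` before choosing, which is why `substr` guards against a missing value itself.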


Mapplets

A Mapplet is a reusable object created in a repository that represents a set of transformations.

Summary

Basic steps to create a project:

•Create the database that will contain the repository.
•Create the data model for the target.
•Create repositories.
•Create folders within the repositories.
•Import the definitions of the sources.
•Create the targets that will receive data.
•Create mappings between sources and targets, including transformations that modify the data.
•Create source and target connections in the Server Manager.
•Create sessions for transferring data between source and target.
•Schedule and run the sessions.