
Informatica Big Data Trial Sandbox for Cloudera Installation and Configuration

© 2014 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. All other company and product names may be trade names or trademarks of their respective owners and/or copyrighted materials of such owners.

Abstract

This document describes how to install and configure Big Data Trial Sandbox for Cloudera.

Supported Versions

• Informatica 9.6.1

Table of Contents

Installation and Configuration Overview

Step 1. Download the Big Data Trial Sandbox for Cloudera Software

Download and Install VMware Player

Register at Informatica Marketplace

Download the Big Data Trial Sandbox for Cloudera Files

Step 2. Start the Big Data Trial Sandbox for Cloudera Virtual Machine

Step 3. Configure and Install the Big Data Trial Sandbox for Cloudera Client

Configure the Domain Properties on the Windows Machine

Configure an Optional Static IP Address for the Virtual Machine

Install the Big Data Trial Sandbox for Cloudera Client

Step 4. Access Big Data Trial Sandbox for Cloudera

Log in to the Administrator Tool

Log in to the Developer Tool

Troubleshooting

Installation and Configuration Overview

Big Data Trial Sandbox for Cloudera consists of a virtual machine component and a client component. Use Big Data Trial Sandbox for Cloudera to run Informatica mappings on a Cloudera virtual machine configured for the Hadoop environment.

The Big Data Trial Sandbox for Cloudera virtual machine has the following components:

• Informatica 9.6.1 services

• Cloudera CDH 5.0

• Sample data

• Sample mappings for common big data use cases

The Big Data Trial Sandbox for Cloudera client installs the libraries and binaries required for the Informatica Developer (Developer tool) client.

Step 1. Download the Big Data Trial Sandbox for Cloudera Software

Before you download the Big Data Trial Sandbox for Cloudera software, you must download and install VMware Player. Then, register at Informatica Marketplace and download the Big Data Trial Sandbox for Cloudera virtual machine and client.


Download and Install VMware Player

To play the Big Data Trial Sandbox for Cloudera virtual machine, download and install VMware Player.

Download VMware Player from the following VMware website: https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/6_0

Note: You can use any client that can run VMware images.

You must have at least 10 GB of RAM and 30 GB of disk space available on the machine on which you download and install VMware Player.
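The disk space requirement above can be checked before you download anything. The following is a minimal Python sketch, not part of the Informatica tooling. The function name is hypothetical, and it treats 1 GB as 1024^3 bytes, which is an assumption. It does not check RAM, because the Python standard library has no portable call for that.

```python
import shutil

REQUIRED_DISK_GB = 30  # disk space stated in this guide


def has_free_disk_space(path, required_gb=REQUIRED_DISK_GB):
    """Return True if the drive that holds path has at least required_gb free.

    Assumption: 1 GB is counted as 1024**3 bytes.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


# Check the drive that holds the current working directory.
print(has_free_disk_space("."))
```

Pass the directory where you plan to store the virtual machine files, for example a folder on the C:\ drive.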

Register at Informatica Marketplace

Register at Informatica Marketplace. Then, create an account to log in to Informatica Marketplace to download the Big Data Trial Sandbox for Cloudera client and server software.

When you register with Informatica Marketplace, you get a free 60-day trial to use Big Data Trial Sandbox for Cloudera.

You can access Informatica Marketplace here: https://community.informatica.com/community/marketplace/

Download the Big Data Trial Sandbox for Cloudera Files

After you log in to Informatica Marketplace, download the Big Data Trial Sandbox for Cloudera virtual machine and client.

Download the following files:

BigDataTrialSandboxForCloudera.ova

Includes the Big Data Trial Sandbox for Cloudera virtual machine. Download the file to the machine on which VMware Player is installed.

961_BigDataTrial_Client_Installer_win32_x86.zip

Includes the compressed Big Data Trial Sandbox for Cloudera client. Download the file to an Informatica client installation directory on a Microsoft Windows-32 machine.

Extract the files in the client zip file to a root directory on your local machine. For example, extract the files to the C:\ drive on your machine.

Step 2. Start the Big Data Trial Sandbox for Cloudera Virtual Machine

Open the Big Data Trial Sandbox for Cloudera virtual machine in VMware Player.

1. Go to the directory where you downloaded BigDataTrialSandboxForCloudera.ova and double-click the file.

VMware Player opens and starts the BigDataTrialSandboxForCloudera virtual machine.

2. Optionally, in VMware Player, click Browse > Import to extract the contents of the virtual machine to the selected location and start the virtual machine. Then, click Play virtual machine.

You are logged in to the virtual machine. The Informatica services and Hadoop services start automatically.


Step 3. Configure and Install the Big Data Trial Sandbox for Cloudera Client

Before you run the client, you must configure the domain properties for the Big Data Trial Sandbox for Cloudera client installation so that the client can communicate with the virtual machine.

Optionally, to avoid updating the IP address of the virtual machine each time it changes, you can configure a static IP address for the virtual machine.

Then, you can run the silent installer to install the Big Data Trial Sandbox for Cloudera client.

Configure the Domain Properties on the Windows Machine

Configure the IP address and host name of the virtual machine for the Developer tool.

1. On the virtual machine, click Applications > System Tools > Terminal to open the terminal to run commands.

2. Run the ifconfig command to find the IP address of the virtual machine.

The ifconfig command returns all interfaces on the virtual machine. Use the eth interface to get the value for the IP address.

The following image shows the ifconfig command with the return value for inet addr highlighted with a red arrow:

3. Add the IP address and the default hostname cdh-bde-demo to the hosts file on the Windows machine on which you install the Developer tool.

The hosts file is in the following location: C:\Windows\System32\drivers\etc\hosts. Add the following line to the hosts file: <IP address> <hostname>. For example, add the following line:

192.168.159.141 cdh-bde-demo
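The lookup in steps 2 and 3 can be sketched in Python as follows. This is an illustration only, not part of the product. The sample ifconfig output is an assumption based on the older net-tools layout, which uses the inet addr and HWaddr labels that this guide refers to. The function names are hypothetical.

```python
import re

# Assumed sample output in the older net-tools ifconfig layout.
SAMPLE_IFCONFIG = """\
eth0      Link encap:Ethernet  HWaddr 00:0C:29:FF:22:95
          inet addr:192.168.159.141  Bcast:192.168.159.255  Mask:255.255.255.0

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
"""


def parse_eth_interface(ifconfig_output):
    """Return (inet_addr, hwaddr) for the first eth interface, or None."""
    # An interface block starts at column 0; its other lines are indented.
    blocks = re.split(r"\n(?=\S)", ifconfig_output.strip())
    for block in blocks:
        if not block.startswith("eth"):
            continue
        ip = re.search(r"inet addr:(\d+\.\d+\.\d+\.\d+)", block)
        mac = re.search(r"HWaddr ([0-9A-Fa-f:]{17})", block)
        if ip and mac:
            return ip.group(1), mac.group(1)
    return None


def hosts_line(ip_address, hostname="cdh-bde-demo"):
    """Format one hosts-file line as '<IP address> <hostname>'."""
    return f"{ip_address} {hostname}"


ip, mac = parse_eth_interface(SAMPLE_IFCONFIG)
print(hosts_line(ip))
```

Newer Linux distributions print ifconfig output in a different layout, so the regular expressions would need to change there.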

Configure an Optional Static IP Address for the Virtual Machine

Optionally, to avoid updating the IP address in the hosts file each time the IP address of the virtual machine changes, configure a static IP address for the virtual machine.

1. On the virtual machine, click Applications > System Tools > Terminal to open the terminal to run commands.

2. Run the ifconfig command to find the IP address and hardware ethernet address of the virtual machine.

The ifconfig command returns all interfaces on the virtual machine. Select the eth interface to get values for the hardware ethernet address.


The following image shows the ifconfig command with the return values for inet addr and HWaddr outlined with red boxes:

3. Edit vmnetdhcp.conf to add the values for host name, IP address, and hardware ethernet address.

vmnetdhcp.conf is located in the following directory: C:\ProgramData\VMware

Add the following entry before the #END tag at the end of the file:

host <hostname> {
    hardware ethernet <your HWaddr>;
    fixed-address <your inet addr>;
}

The following sample code shows how to set a static IP address:

host cdh-bde-demo {
    hardware ethernet 00:0C:29:FF:22:95;
    fixed-address 192.168.159.141;
}

4. Add the IP address and the default hostname cdh-bde-demo to the hosts file on the Windows machine on which you install the Developer tool.

The hosts file is in the following location: C:\Windows\System32\drivers\etc\hosts. Add the following line to the hosts file: <IP address> <hostname>. For example, add the following line:

192.168.159.141 cdh-bde-demo

5. Shut down the virtual machine.

6. Restart the host machine and virtual machine.
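The edit to vmnetdhcp.conf in step 3 can be sketched in Python as follows. This is a hedged illustration that works on text in memory and does not touch the real file. The function names are hypothetical, and the literal marker text #END is taken from this guide, so confirm how the marker appears in your own file before you rely on it.

```python
def dhcp_host_entry(hostname, hwaddr, inet_addr):
    """Build a host entry that pins inet_addr to the given hardware address."""
    return (
        f"host {hostname} {{\n"
        f"    hardware ethernet {hwaddr};\n"
        f"    fixed-address {inet_addr};\n"
        f"}}\n"
    )


def insert_before_end_tag(conf_text, entry, end_tag="#END"):
    """Return conf_text with entry inserted before the last end_tag line."""
    lines = conf_text.splitlines(keepends=True)
    # Search from the bottom, because the tag is at the end of the file.
    for i in range(len(lines) - 1, -1, -1):
        if lines[i].lstrip().startswith(end_tag):
            return "".join(lines[:i]) + entry + "".join(lines[i:])
    raise ValueError(f"no line starting with {end_tag!r} found")


# Hypothetical file content, used only to demonstrate the insertion.
sample_conf = "# existing settings\n#END\n"
entry = dhcp_host_entry("cdh-bde-demo", "00:0C:29:FF:22:95", "192.168.159.141")
print(insert_before_end_tag(sample_conf, entry))
```

Back up vmnetdhcp.conf before you change it. Editing files under C:\ProgramData typically requires administrator rights.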

Install the Big Data Trial Sandbox for Cloudera Client

To start the silent installation for the client, open a command window.

To install the client libraries and binaries perform the following steps:

1. Open a command window.

2. Go to the directory that contains the client installation files.

3. Run silentInstall.bat to start the silent installer.

The silent installer runs in the background. The process can take several minutes.

The silent installation is complete when the Informatica_Version_Client_InstallLog.log file is created in the installation directory.

The log file is located in the following directory: C:\Informatica\9.6.1_BDE_Trial\
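Because the silent installer gives no other signal, one way to detect completion is to poll for the log file. The following Python sketch assumes that the word Version in the log file name stands for the actual version string, so it matches the name with a wildcard. That pattern, the function name, and the timeout values are assumptions, not documented behavior.

```python
import glob
import os
import time


def wait_for_install_log(install_dir,
                         pattern="Informatica_*_Client_InstallLog.log",
                         timeout_s=1800, poll_s=10):
    """Poll install_dir until a file that matches pattern exists.

    Returns the path of the first match, or None if timeout_s elapses.
    The wildcard pattern is an assumption about the real file name.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        matches = glob.glob(os.path.join(install_dir, pattern))
        if matches:
            return matches[0]
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)


# Example call on the Windows client machine:
# wait_for_install_log(r"C:\Informatica\9.6.1_BDE_Trial")
```

The function only reports that the log file exists. Open the log to confirm that the installation finished without errors.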

Step 4. Access Big Data Trial Sandbox for Cloudera

You can log in to Informatica Administrator (the Administrator tool) to monitor Informatica services and the status of mapping jobs.


You can log in to the Developer tool to run the sample mappings based on common big data use cases. You can create your own mappings and run the mappings from the Developer tool.

For more information on how to run mappings in the Developer tool, see the Informatica Big Data Trial Sandbox for Cloudera User Guide.

Log in to the Administrator Tool

You can access the Administrator tool from the following URL: http://cdh-bde-demo:6008

Enter the following credentials to log in to the Administrator tool:

User name: Administrator

Password: Administrator
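If the Administrator tool URL does not open, a quick check from the Windows machine is whether the host name resolves and the port accepts connections. The following Python sketch does only that. The function name is hypothetical, and a successful result does not prove that the Informatica services are healthy or that the login will succeed.

```python
import socket
from urllib.parse import urlsplit


def is_reachable(url, timeout_s=5.0):
    """Return True if a TCP connection to the host and port in url succeeds."""
    parts = urlsplit(url)
    host, port = parts.hostname, parts.port
    if host is None or port is None:
        raise ValueError("URL must include a host and an explicit port")
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers name resolution failures, refused connections, and timeouts.
        return False


# Example call with the URL from this guide:
# is_reachable("http://cdh-bde-demo:6008")
```

A False result usually points to a missing or outdated hosts file entry, or to a virtual machine that is not running.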

Log in to the Developer Tool

You can start the Developer tool client from the Windows Start menu.

Enter the following credentials to connect to the Model repository Infa_mrs:

User name: Administrator

Password: Administrator

Troubleshooting

This section describes common problems and how to resolve them.

Informatica Services shut down

The Informatica services might shut down when the machine on which you run the virtual machine goes into hibernation or when you resume the virtual machine.

Run the following command to restart the services on the operating system of the virtual machine:

sh /home/infauser/BDETRIAL/.cmdInfaServiceUtil.sh start

Debug mapping failures

To debug mapping failures, check the error messages in the mapping log file.

The mapping log file appears in the following location: /home/infauser/bdetrial_repo/informatica/informatica/tomcat/bin/disTemp

Virtual machine does not start because of a 64-bit error

VMware Player displays a message that states it cannot power on a 64-bit virtual machine. Or, VMware Player might display the following error when you play the virtual machine: The host supports Intel VT-x, but Intel VT-x is disabled. Intel VT-x might be disabled if it has been disabled in the BIOS/firmware settings or the host has not been power-cycled since changing this setting.

You must enable Intel Virtualization Technology (Intel VT-x) in the BIOS of the machine on which VMware Player runs. For more information, refer to the VMware Knowledge Base.

Virtual machine is in a suspended state

If the virtual machine is in a suspended state, you must resume the virtual machine and log in to it. After you log in, the Informatica services and Hadoop services start automatically.

In VMware Player, select the virtual machine and click Play virtual machine.


Enter a user name and password for the virtual machine. The default user name and password are: infa / infa

The Developer tool takes a long time to connect to the Model repository

The Developer tool might take a long time to connect to the Model repository because the virtual machine cannot find the IP address and host name of the client machine.

You must add the IP address and host name of the client machine to the hosts file of the virtual machine.

Use the ipconfig and hostname commands from the command line of the Windows machine to find the IP address and hostname of the Windows machine.

Add the IP address and the host name to the hosts file on the virtual machine.

The hosts file is in the following location on the virtual machine: /etc/hosts

Add the following line to the hosts file:

<IP address> <hostname>
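Adding the line more than once leaves duplicate entries in the hosts file. The following Python sketch shows one way to add the entry only when it is missing. It works on the file content as text, the function name is hypothetical, and the IP address and host name in the example are placeholders for the values that ipconfig and hostname return on your Windows machine.

```python
def ensure_hosts_entry(hosts_text, ip_address, hostname):
    """Return hosts_text with an '<IP address> <hostname>' line.

    The text is returned unchanged if a line already maps ip_address
    to hostname. Comments that start with '#' are ignored.
    """
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == ip_address and hostname in fields[1:]:
            return hosts_text
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{ip_address} {hostname}\n"


# Placeholder values for illustration only.
current = "127.0.0.1 localhost\n"
print(ensure_hosts_entry(current, "192.168.159.1", "my-windows-pc"))
```

The sketch does not remove an older line that maps the same host name to a different IP address. Delete such a line by hand. Writing to /etc/hosts typically requires root privileges.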

Author

Big Data Edition Team
