Hitachi Data Center Analytics advancedreporter
Getting Started Guide
MK-96HDCA008-00
October 2016
© 2016 Hitachi, Ltd. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including copying and recording, or stored in a database or retrieval system for commercial purposes without the express written permission of Hitachi, Ltd., or Hitachi Data Systems Corporation (collectively “Hitachi”). Licensee may make copies of the Materials provided that any such copy is: (i) created as an essential step in utilization of the Software as licensed and is used in no other manner; or (ii) used for archival purposes. Licensee may not make any other copies of the Materials. “Materials” mean text, data, photographs, graphics, audio, video and documents.
Hitachi reserves the right to make changes to this Material at any time without notice and assumes no responsibility for its use. The Materials contain the most current information available at the time of publication.
Some of the features described in the Materials might not be currently available. Refer to the most recent product announcement for information about feature and product availability, or contact Hitachi Data Systems Corporation at https://support.hds.com/en_us/contact-us.html.
Notice: Hitachi products and services can be ordered only under the terms and conditions of the applicable Hitachi agreements. The use of Hitachi products is governed by the terms of your agreements with Hitachi Data Systems Corporation.
By using this software, you agree that you are responsible for:
1. Acquiring the relevant consents as may be required under local privacy laws or otherwise from authorized employees and other individuals to access relevant data; and
2. Verifying that data continues to be held, retrieved, deleted, or otherwise processed in accordance with relevant laws.
Notice on Export Controls. The technical data and technology inherent in this Document may be subject to U.S. export control laws, including the U.S. Export Administration Act and its associated regulations, and may be subject to export or import regulations in other countries. Reader agrees to comply strictly with all such regulations and acknowledges that Reader has the responsibility to obtain licenses to export, re-export, or import the Document and any Compliant Products.
Hitachi is a registered trademark of Hitachi, Ltd., in the United States and other countries.
AIX, AS/400e, DB2, Domino, DS6000, DS8000, Enterprise Storage Server, eServer, FICON, FlashCopy, IBM, Lotus, MVS, OS/390, PowerPC, RS/6000, S/390, System z9, System z10, Tivoli, z/OS, z9, z10, z13, z/VM, and z/VSE are registered trademarks or trademarks of International Business Machines Corporation.
Active Directory, ActiveX, Bing, Excel, Hyper-V, Internet Explorer, the Internet Explorer logo, Microsoft, the Microsoft Corporate Logo, MS-DOS, Outlook, PowerPoint, SharePoint, Silverlight, SmartScreen, SQL Server, Visual Basic, Visual C++, Visual Studio, Windows, the Windows logo, Windows Azure, Windows PowerShell, Windows Server, the Windows start button, and Windows Vista are registered trademarks or trademarks of Microsoft Corporation. Microsoft product screen shots are reprinted with permission from Microsoft Corporation.
All other trademarks, service marks, and company names in this document or website are properties of their respective owners.
2 Hitachi Data Center Analytics advanced reporter Getting Started Guide
Contents
Preface .......................................................... 5
    Product version .............................................. 6
    Accessing product documentation .............................. 6
    Comments ..................................................... 6
    Getting help ................................................. 6
1 Installing advanced reporter ................................... 7
    Overview ..................................................... 8
    Distribution formats ......................................... 8
        OVA file ................................................. 9
        RPM file ................................................. 9
    Installing advanced reporter on VMware vSphere Client ........ 9
    Initial setup for advanced reporter on VMware vSphere Client . 12
    Installing advanced reporter on a Linux machine .............. 13
    Initial setup for advanced reporter on Linux machine ......... 14
    Registering Data Center Analytics servers in advanced reporter 15
2 Creating Hadoop cluster on Linux machine ....................... 17
    Hadoop installation prerequisites ............................ 18
    Installing the prerequisites ................................. 18
    Installing the Hadoop software ............................... 21
    Configuring Hadoop ........................................... 22
        Configuring the master node .............................. 22
        Configuring worker nodes ................................. 27
    Launching Hadoop ............................................. 29
    Uploading the JAR on HDFS .................................... 31
    Adjusting Hadoop settings on advanced reporter ............... 31
3 Working with batches ........................................... 33
    Viewing batch reports ........................................ 34
    Managing batches ............................................. 35
        Viewing batch summary .................................... 35
        Viewing batch status ..................................... 35
        Pausing and resuming a batch ............................. 35
        Downloading batch ZIP .................................... 35
    Managing advanced reporter ................................... 36
        Changing FTP settings .................................... 36
        Adding or removing Hadoop worker nodes ................... 36
        Changing password ........................................ 37
    Uninstalling advanced reporter ............................... 37
4 Advanced operations ............................................ 39
    Custom resources ............................................. 40
        Custom resource group .................................... 40
        Model .................................................... 40
        Creating custom resources ................................ 40
        Exporting resource details for custom resources .......... 41
        Creating custom resource group, custom model, and mapping resources ... 41
        Importing custom resources ............................... 42
        Updating custom resources ................................ 42
        Deleting custom resources ................................ 43
    Custom properties ............................................ 43
        Purging and zipping data ................................. 44
        Configuring report count in a container and resource requirement for batches ... 45
            Container report count configuration ................. 45
            Resource requirements for containers ................. 46
        Updating yarn-site.xml file .............................. 46
        Setting the time zone for advanced reporter .............. 46
    Interval rollup .............................................. 47
Preface

The advanced reporter for Hitachi Data Center Analytics provides a centralized reporting facility using one or more Data Center Analytics servers. It complements the Data Center Analytics server by providing an offline reporting facility where reports are scheduled for different intervals and delivered once ready. The reports help you to monitor the health of the target infrastructure and to identify potential issues.
Reports are scheduled hourly, daily, weekly, and monthly in batches. If a batch runs hourly, all of the reports in that batch cover hourly intervals. Currently, only pre-configured report batches are supported. To process these batches, you must set up Hadoop alongside the advanced reporter.
The advanced reporter for Data Center Analytics can process a huge data set and create a batch of reports. To do this, it requires significant memory and processing power. Hadoop fulfills this requirement: you create a Hadoop cluster from a group of machines (a Hadoop master node and worker nodes) that are linked to each other and divide the batch generation tasks among themselves.
□ Product version
□ Accessing product documentation
□ Comments
□ Getting help
Product version

This document revision applies to Hitachi Data Center Analytics version 7.0 or later and advanced reporter 1.0 or later.
Accessing product documentation

Product user documentation is available on Hitachi Data Systems Support Connect: https://knowledge.hds.com/Documents. Check this site for the most current documentation, including important updates that may have been made after the release of the product.
Comments

Please send us your comments on this document to [email protected]. Include the document title and number, including the revision level (for example, -07), and refer to specific sections and paragraphs whenever possible. All comments become the property of Hitachi Data Systems Corporation.
Thank you!
Getting help

Hitachi Data Systems Support Connect is the destination for technical support of products and solutions sold by Hitachi Data Systems. To contact technical support, log on to Hitachi Data Systems Support Connect for contact information: https://support.hds.com/en_us/contact-us.html.
Hitachi Data Systems Community is a global online community for HDS customers, partners, independent software vendors, employees, and prospects. It is the destination to get answers, discover insights, and make connections. Join the conversation today! Go to community.hds.com, register, and complete your profile.
1 Installing advanced reporter
□ Overview
□ Distribution formats
□ Installing advanced reporter on VMware vSphere Client
□ Initial setup for advanced reporter on VMware vSphere Client
□ Installing advanced reporter on a Linux machine
□ Initial setup for advanced reporter on Linux machine
□ Registering Data Center Analytics servers in advanced reporter
Overview

The following workflow shows the high-level tasks for deploying advanced reporter:
1. Verify that the host on which you plan to install the Data Center Analytics advanced reporter meets the minimum system requirements.
2. Install and configure the advanced reporter.
3. Register the advanced reporter license in the Data Center Analytics server.
4. Create the Hadoop cluster.
5. Register the Data Center Analytics server in advanced reporter.
Distribution formats

The advanced reporter for Data Center Analytics is available in the following formats.
OVA file

This is the Open Virtualization Format (OVF) package, which allows the advanced reporter to be deployed on the VMware infrastructure. The advanced reporter becomes a VM in the environment with all the required components installed.
The advanced reporter consists of two OVA files: one for the advanced reporter and Hadoop master node (packaged together), and the other for the Hadoop worker node.
RPM file

You can use the RPM file to install the advanced reporter on a Red Hat Enterprise Linux server (RHEL). This file can be deployed on a standalone physical machine that is running the Linux operating system.
Installing advanced reporter on VMware vSphere Client

You can install the advanced reporter using the Open Virtualization Format (OVF) package on a VMware infrastructure.
The OVA consists of two files:
• The advanced reporter and Hadoop master node
• The Hadoop worker node. If you want to configure multiple worker nodes in a cluster, then you must deploy the Hadoop worker node OVA.
Note: The specific installation steps depend on the VMware vSphere Client version.
Before you begin
• vSphere Client on ESX Server 3.5 or later.
• RAM: 8 GB
• CPU: 4 vCPU
• Storage:
  ○ For advanced reporter and Hadoop master node OVA: 1 TB (Thin Provisioned)
  ○ For Hadoop worker node OVA: 100 GB (Thin Provisioned)
• Static IP address: Ensure that you can assign a static IP address for the advanced reporter and Hadoop worker nodes.
Procedure
1. Open the vSphere Client application, and go to File > Deploy OVF Template.
2. In the Deploy OVF Template window, click Browse, navigate to the directory in the installation media that contains the OVA file, select it, and click Next.
3. In the OVF Template Details window, verify the details and click Next.
4. In the Name and Location window, type the following details and click Next.
   • Name: A name for advanced reporter. The server is identified by this name.
   • Inventory location: The inventory in the VMware application to add the advanced reporter.
5. In the Storage window, select the storage to save the virtual storage files and click Next.
6. In the Disk Format window, select the Thin Provision button, and then click Next.
   The options that you selected appear.
7. Select the Power on after deployment check box and click Finish.
8. Assign the static IP address and make sure that the static IP address and host name are mapped in DNS.
   a. Use the following credentials to log on to the master and worker nodes:
      • localhost login: config
      • Password: megha!234
      The system shows a notification to assign a static IP address.
   b. Type Yes and then press Enter.
   c. Enter the following details and apply the changes:
      • IP address
      • NetMask
      • Gateway
      • Domain Name
      • Server Name
   d. Enter the Hadoop worker node and Hadoop master node entries in all nodes.
9. Log on to the Hadoop master node and Hadoop worker nodes as the root user through an SSH client (such as PuTTY) using the following credentials:
   • Hadoop master node:
     User name: root
     Password: megha.jeos
   • Hadoop worker node:
     User name: root
     Password: app.jeos
10. Edit the /etc/hosts file to define the Hadoop worker node and Hadoop master node entries so that they can communicate with each other. There must be a mapping between IP address and hostname.
For example,
192.168.XXX.XXX Hadoop.master.node
192.168.XXX.XXX Hadoop.worker.node1
192.168.XXX.XXX Hadoop.worker.node2
When all the necessary entries have been added, all the nodes will be accessible from each other using host name and IP address.
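The /etc/hosts entries above can also be generated from a node list with a short script. This is only a sketch: the IP addresses and hostnames below are placeholders, not real cluster values — substitute your own.

```shell
# Sketch: emit /etc/hosts lines from an "ip=hostname" node list.
# The IPs and hostnames here are placeholders, not real cluster values.
nodes="192.168.1.10=Hadoop.master.node
192.168.1.11=Hadoop.worker.node1
192.168.1.12=Hadoop.worker.node2"

echo "$nodes" | while IFS='=' read -r ip name; do
  printf '%s %s\n' "$ip" "$name"
done
# Append the printed lines to /etc/hosts on every node.
```

Generating the lines once and appending the same output on every node avoids the typo-per-node mistakes that break cluster name resolution.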
Next steps
1. NTP client settings: The time on all the nodes (Hadoop master node and Hadoop worker node OVAs) must be synchronized. The OVA includes an NTP client.
   • If the Hadoop master node and Hadoop worker nodes have internet access, ignore this step.
   • If the Hadoop master node and Hadoop worker nodes do not have internet access, configure the NTP client to contact your internal NTP server. To do so, log in to the Hadoop master node and Hadoop worker nodes as the root user. Refer to the NTP client documentation to configure the NTP client.
2. If you are using a firewall, log on to the Hadoop master node and Hadoop worker nodes and ensure that TCP traffic from the Hadoop worker nodes can reach the Hadoop master node. The ports required on the Hadoop master node and Hadoop worker nodes are as follows:
   • Hadoop master node:
     ○ User name: root
     ○ Password: megha.jeos
     ○ 8443: To launch the user interface
     ○ 22: For SFTP
     ○ 8088: To access the Hadoop user interface
     ○ 50070: To access the HDFS user interface
     ○ 9000: To send and delete files on HDFS
   • Hadoop worker node:
     ○ Ensure that the Hadoop worker nodes are listening on eth0/eth1
     ○ User name: root
     ○ Password: app.jeos
     ○ 22: For SFTP
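One way to open the master-node ports listed above is with iptables rules. The sketch below only prints the candidate rules rather than executing them; the rule form (plain ACCEPT on INPUT, no interface restriction) is an assumption — review the output against your site's firewall policy, then run the rules as root and persist them (for example with `service iptables save` on RHEL 6).

```shell
# Sketch: print (not execute) iptables rules opening the master-node ports
# listed above. The simple ACCEPT-on-INPUT rule form is an assumption;
# adapt it to your firewall policy before running anything.
master_ports="8443 22 8088 50070 9000"
for p in $master_ports; do
  echo "iptables -A INPUT -p tcp --dport $p -j ACCEPT"
done
```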
3. Configure the advanced reporter.
Related tasks
• Initial setup for advanced reporter on VMware vSphere Client on page 12
Initial setup for advanced reporter on VMware vSphere Client

This section describes the initial settings for the advanced reporter on VMware vSphere Client.
Before you begin
• Check the IP address of the advanced reporter.
• Obtain the advanced reporter license from your Hitachi Data Systems representative.
Procedure
1. Enter the advanced reporter URL in your browser:
   https://ip_address:Port_Number
   The default port for https access is 8443.
2. Read and accept the license agreement, and then click Next.
3. In the Set Details For Existing admin User window, enter the password for the default administrator user.
   The user name of the default administrator user is admin.
4. Log on to the advanced reporter as the administrator user.
5. Type the Hadoop worker node static IP addresses or host names.
   You can have multiple worker node entries, but do not use a mix of IP addresses and host names.
6. (Optional) Click Add More to add more than three Hadoop worker node IP addresses or host names, and then click Next.
   The server starts configuring Hadoop and shows the status of each Hadoop worker node.
7. Click Next.
   The FTP Information window appears.
8. In the FTP Information window, enter the following information:
   • FTP Method: SFTP
   • Server: advanced reporter IP address
   • Port: 22
   • User: meghabatchdata
   • Password: meghabatch!23
9. (Optional) Select the Use Proxy check box, provide the following details, and then click Submit:
   • Proxy Type: HTTP or SOCKS5
   • Proxy Server: Hostname or IP address
   • Port: Proxy FTP port
   • Proxy User and Password
Installing advanced reporter on a Linux machine

You can use the RPM file to install the advanced reporter on a Red Hat Enterprise Linux server (RHEL).
Before you begin
• Linux machine with either of the supported OS versions:
  ○ RHEL 6.5 (64-bit)
  ○ CentOS 6.6 (64-bit)
• For the Linux machine, assign a static IP address and make sure that the address and host name are mapped in DNS.
• Java package: Java 7
• An FTP server on the machine where you plan to install the advanced reporter.
• RAM: 8 GB
• CPU: 4 vCPU
• Storage: 1 TB (Thin Provisioned)
• You must configure the firewall rule to allow incoming connections on TCP port 8443.
Procedure
1. Copy the advanced reporter RPM file from the installation media to a folder on the Linux machine.
2. Run the following command:

   rpm -ivh <advanced_reporter_rpm>

3. Run the advanced reporter configuration script:

   sh /usr/local/meghaBatchApp_installer/config.sh

4. Provide the directory path to store the application data. By default, data is stored in the /data folder.
   Ensure that the disk has been mounted on the /data directory. If you want to use any other directory, ensure that you have mounted a disk for that directory.
5. Press Enter.
   The default port, 8443, is assigned to the advanced reporter. If the default port is busy, the system notifies you that you must type an alternative port number. Type the appropriate port number.
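If the configuration script reports that the default port is busy, you can confirm which ports are already claimed before picking an alternative. The sketch below hard-codes a sample listener table so it is self-contained; on a real host, feed it the output of `netstat -tln` (or `ss -tln`) instead.

```shell
# Sketch: decide whether a port is already in use by scanning listener output.
# The table below is a fabricated sample; on a real host use: netstat -tln
listeners="tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:8443 0.0.0.0:* LISTEN"

port_in_use() {
  echo "$listeners" | grep -q ":$1 "
}

port_in_use 8443 && echo "8443 busy" || echo "8443 free"
port_in_use 9090 && echo "9090 busy" || echo "9090 free"
```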
6. Check whether the advanced reporter is running, and run the appropriate command:
   • To verify the status:

     sh /usr/local/megha/bin/megha-jetty.sh status
   • If the advanced reporter is not running, start it:

     sh /usr/local/megha/bin/megha-jetty.sh start

7. Add the URL of the advanced reporter, provided at the end of the installation, to your firewall to bypass the port numbers and open the UI for the advanced reporter.
Next steps
Configure the advanced reporter.
Related tasks
• Initial setup for advanced reporter on Linux machine on page 14
Initial setup for advanced reporter on Linux machine

This section describes the initial settings for the advanced reporter on a Linux machine.
Before you begin
• Check the IP address of the advanced reporter.
• Obtain the advanced reporter license from your HDS representative.
Procedure
1. Enter the advanced reporter URL in your browser:
   https://ip_address:Port_Number
   You must use the static IP address.
2. Read and accept the license agreement, and then click Next.
3. In the Set Details For Existing admin User window, enter the password for the default administrator user.
   The user name of the default administrator user is admin.
4. Log on to the advanced reporter as the administrator user.
5. In the FTP Information window, enter the following information:
   • FTP Method: SFTP
   • Server: advanced reporter IP address
   • Port: 22
   • User: meghabatchdata
   • Password: meghabatch!23
6. (Optional) Select the Use Proxy check box, provide the following details, and then click Submit:
   • Proxy Type: HTTP or SOCKS5
   • Proxy Server: Hostname or IP address
   • Port: Proxy FTP port
   • Proxy User and Password
Registering Data Center Analytics servers in advanced reporter

You must register the Data Center Analytics server in advanced reporter to establish the connection between the advanced reporter and the Data Center Analytics server. You can register multiple Data Center Analytics servers in advanced reporter, but you cannot register one Data Center Analytics server in multiple advanced reporters.
Before you begin
Make sure that you have registered the license of advanced reporter in the Data Center Analytics server.

To register the advanced reporter license, in the Data Center Analytics server navigate to Manage > License Information.
Procedure
1. Log on to the advanced reporter.
https://advanced_reporter_IP_address:Port/csBatchApp/appLogin.htm
2. In the Register Data Center Analytics server window, click Register Server.
3. Type the IP address and port of the Data Center Analytics server and then click Register. The default ports for the Data Center Analytics server are 8080 for http and 8443 for https.
https://Data_Center_Analytics_server_IP_address:Port
Next steps
• After registering the Data Center Analytics server(s), you can view the reports in a batch using the user appuser.
• To view the advanced reporter, in the Data Center Analytics server navigate to Manage > App Status after registering the Data Center Analytics server.
• If the IP address of a registered Data Center Analytics server changes in the future, you must edit the registered Data Center Analytics server details.
2 Creating Hadoop cluster on Linux machine

To create a Hadoop setup (cluster), you must have a Hadoop master node and one or more Hadoop worker nodes. You can add more Hadoop worker nodes depending on the size of the target systems or environment.
Note: The Hadoop master node and the advanced reporter RPM can be installed on the same machine.
□ Hadoop installation prerequisites
□ Installing the prerequisites
□ Installing the Hadoop software
□ Configuring Hadoop
□ Launching Hadoop
□ Uploading the JAR on HDFS
□ Adjusting Hadoop settings on advanced reporter
Hadoop installation prerequisites
• Linux machines (one machine for each Hadoop master node / worker node) with the following configuration:
  ○ OS: RHEL 6.5 (64-bit) or CentOS 6.6 (64-bit)
  ○ CPU: 4 vCPUs
  ○ RAM: 8 GB
  ○ Storage: 100 GB (Thin Provisioned)
• Assign a static IP address to the Hadoop master node and all worker nodes. Ensure that the static IP address and host name of the Hadoop master node and all worker nodes are mapped in DNS.
• Java 7 must be installed on the Hadoop master node and all worker nodes.
• Date and time must be synchronized across the Hadoop master node and all worker nodes.
• SSH must be installed and sshd must be running on the Hadoop master node and all worker nodes.
• A user named meghabatch, with the same user ID and group ID on every node, must be available on the Hadoop master node and all worker nodes.
• A password-less SSH connection must be established for the meghabatch user from the Hadoop master node to each worker node.
• (Optional) If you need a firewall, you must set custom firewall rules on the Hadoop master node and all Hadoop worker nodes.
• The Hadoop master node and worker nodes must have the wget package installed.
• Make sure that the Hadoop worker nodes can send data to the FTP server (which is installed on the machine where the advanced reporter is installed).
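The "same user ID and group ID" prerequisite means the numeric UID and GID of meghabatch must be identical on every node, not just the user name. The sketch below extracts the IDs from a passwd-style entry so they can be compared across nodes; the entry (and UID/GID 1500) is a fabricated example, not a value the product requires — on a real node use `getent passwd meghabatch` or `id meghabatch`.

```shell
# Sketch: extract the numeric UID/GID from a passwd entry so the values can
# be compared across nodes. The entry below is a fabricated sample (UID/GID
# 1500 is an assumption); on a real node use: getent passwd meghabatch
entry="meghabatch:x:1500:1500::/home/meghabatch:/bin/bash"
uid=$(echo "$entry" | cut -d: -f3)
gid=$(echo "$entry" | cut -d: -f4)
echo "uid=$uid gid=$gid"   # must report the same values on every node
```

If you create the user yourself (see "Installing the prerequisites"), you can pin the IDs explicitly with `groupadd -g <gid> meghabatch` and `useradd -u <uid> -g meghabatch meghabatch`, choosing an ID that is free on every node.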
Installing the prerequisites

The following procedure must be followed on all the nodes, unless specified.
Procedure
1. Log on to a node as the root user.
2. Set the JAVA_HOME variable to the Java installation path (for example: /usr/lib/jvm/java-1.7.0), and then run the following commands:

   export JAVA_HOME=<java_installation_path>
   export PATH=${JAVA_HOME}/bin:$PATH
3. Set the date and time.
   The time zones of all nodes must match.
4. Install OpenSSH:
   a. To view the available packages on the node, run the following command:

      yum search openssh

   b. To verify whether the operating system is 64-bit or 32-bit, run the following command:

      uname -m

   c. Select the openssh packages that match the architecture and install them using the following command:

      yum install <pkgname>

   You must install the following three packages:
   • openssh
   • openssh-clients
   • openssh-server
5. Log on as the root user on each cluster node and add entries for each node.
   a. Edit the /etc/hosts file, and make sure that the mapping is between the IP address and the hostname.
      For example,

      192.168.XXX.XXX Hadoop.master.node
      192.168.XXX.XXX Hadoop.worker.node1
      192.168.XXX.XXX Hadoop.worker.node2

   b. When the process is complete, all the cluster nodes must be able to ping each other with IP and hostname.
6. Create a dedicated user for Hadoop:
   a. Create a group meghabatch:

      groupadd meghabatch

   b. Create a user meghabatch:

      useradd -G meghabatch meghabatch

   c. Set the password for the meghabatch user:

      echo meghabatch:meghabatch | chpasswd

7. To set up the password-less SSH for the Hadoop master and all worker nodes, perform the following steps on the Hadoop master node.
Note: If the $USER_HOME variable is not set, then you must enter the home path (for example: /home/meghabatch).
   a. Log on to the Hadoop master node with the meghabatch user and create a private/public key pair:

      ssh-keygen -t rsa

   b. Create the SSH directory on the worker node:

      ssh meghabatch@<Worker node IP address> mkdir -p .ssh

   c. Copy the master node's public key to the worker nodes:

      cat id_rsa.pub | ssh meghabatch@<Worker node IP address> 'cat >> .ssh/authorized_keys'

   d. Set the appropriate permissions:

      ssh meghabatch@<Worker node IP address> "chmod 700 .ssh; chmod 640 .ssh/authorized_keys"
8. Repeat steps 7(b) through (d) on the master node for all the Hadoop worker nodes to establish the password-less connection from the Hadoop master node to all worker nodes.
Next steps
1. If you are using a firewall, log on to the Hadoop master and worker nodes and ensure that TCP traffic from the worker nodes can reach the master node. The required ports are as follows:
   • Hadoop master node:
     ○ 8443 - To launch the user interface
     ○ 22 - For SFTP
     ○ 8088 - To access the Hadoop user interface
     ○ 50070 - To access the HDFS user interface
     ○ 9000 - To send and delete files on HDFS
   • Hadoop worker node: Make sure that the Hadoop worker nodes are listening on eth0/eth1.
     ○ 22 - For SFTP
Apart from these ports, Hadoop internally uses various ports. For details,see the Apache Hadoop documentation.
2. The time on all nodes must be synchronized. To do so, perform the following:
   a. Install the NTP client:

      yum install ntp ntpdate ntp-doc
      chkconfig ntpd on
   b. If the Hadoop master and worker nodes have internet access, then start the NTP client:

      /etc/init.d/ntpd start

   c. If the Hadoop master node and worker nodes do not have internet access, then configure the NTP client to connect to the internal NTP server. See the NTP client documentation for more information.
Installing the Hadoop software

Before you begin

Download Hadoop 2.6.4 using wget from any of the mirrors listed at: http://www.apache.org/dyn/closer.cgi/hadoop/common.
For example,
wget http://redrockdigimark.com/apachemirror/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz
Perform the following steps on the Hadoop master node and all worker nodes.
Procedure
1. Create the directory /usr/local/hadoop using the following command:

   mkdir /usr/local/hadoop

2. Navigate to the directory where you downloaded the Hadoop archive.
3. To extract the contents into the Hadoop version directory, run the following command:

   tar -xzf hadoop-2.6.4.tar.gz

4. Copy the contents of the Hadoop version directory to /usr/local/hadoop, using the following command:

   cp -r hadoop-2.6.4/* /usr/local/hadoop/

5. Create a script hadoop.sh in /etc/profile.d and add the following environment variables:

   export JAVA_HOME=Java-installation-directory
   export PATH=$JAVA_HOME/bin:${PATH}
   export HADOOP_HOME=/usr/local/hadoop
   export HADOOP_MAPRED_HOME=$HADOOP_HOME
   export HADOOP_COMMON_HOME=$HADOOP_HOME
   export HADOOP_HDFS_HOME=$HADOOP_HOME
   export YARN_HOME=$HADOOP_HOME
   export HADOOP_YARN_HOME=$HADOOP_HOME
   export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
   export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
6. Create a tmp folder in $HADOOP_HOME, using the following command:
mkdir $HADOOP_HOME/tmp
Note: If the $HADOOP_HOME variable is not set, replace this variable with the Hadoop home directory (for example, /usr/local/hadoop).
7. Assign the ownership of /usr/local/hadoop to the meghabatch user:
chown -R meghabatch:meghabatch /usr/local/hadoop
Configuring Hadoop

Configuration steps must be performed as the meghabatch user. To switch from the root user to the meghabatch user, run the following command: su - meghabatch
• The configuration files are located in /usr/local/hadoop/etc/hadoop/ on all the nodes.
• The following files are the same on all the nodes in the cluster:
○ core-site.xml
○ capacity-scheduler.xml
• You must edit the following files on each node in the cluster:
○ yarn-site.xml
○ hdfs-site.xml
• Make sure that all the properties are inside one configuration XML element.
Configuring the master node
Procedure
1. Navigate to the following directory: /usr/local/hadoop/etc/hadoop
2. Open the core-site.xml file, and copy and paste the following:

Note: You must replace the <Enter_Hadoop_master_node_IP_address_or_host_name> placeholder with the Hadoop master node IP address or host name. Make sure that the host name is accessible from all the nodes.
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://<Enter_Hadoop_master_node_IP_address_or_host_name>:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>meghabatch</value>
  </property>
</configuration>
3. Open the capacity-scheduler.xml file and verify that the file has the following configuration:
<configuration>
  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
    <description>Maximum number of applications that can be pending and running.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.3</value>
    <description>Maximum percent of resources in the cluster which can be used to run application masters, i.e. controls the number of concurrently running applications.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    <description>The ResourceCalculator implementation to be used to compare Resources in the scheduler. The default, DefaultResourceCalculator, only uses Memory, while DominantResourceCalculator uses dominant-resource to compare multi-dimensional resources such as Memory, CPU, etc.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default</value>
    <description>The queues at this level (root is the root queue).</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>100</value>
    <description>Default queue target capacity.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
    <value>1</value>
    <description>Default queue user limit, a percentage from 0.0 to 1.0.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>100</value>
    <description>The maximum capacity of the default queue.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.state</name>
    <value>RUNNING</value>
    <description>The state of the default queue. State can be one of RUNNING or STOPPED.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
    <value>*</value>
    <description>The ACL of who can submit jobs to the default queue.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
    <value>*</value>
    <description>The ACL of who can administer jobs on the default queue.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>40</value>
    <description>Number of missed scheduling opportunities after which the CapacityScheduler attempts to schedule rack-local containers. Typically this should be set to the number of nodes in the cluster; the default of 40 approximates the number of nodes in one rack.</description>
  </property>
</configuration>
4. Open the yarn-site.xml file and copy and paste the following:
Note: You must enter the Hadoop master node IP address or host name in the <Enter_Hadoop_master_node_IP_address_or_host_name> placeholder. Make sure that the host name is accessible from all the nodes.
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value><Enter_Hadoop_master_node_IP_address_or_host_name>:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value><Enter_Hadoop_master_node_IP_address_or_host_name>:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value><Enter_Hadoop_master_node_IP_address_or_host_name>:8040</value>
  </property>
  <property>
    <name>yarn.nodemanager.address</name>
    <value><Enter_Hadoop_master_node_IP_address_or_host_name>:8050</value>
  </property>
  <property>
    <name>yarn.nodemanager.localizer.address</name>
    <value><Enter_Hadoop_master_node_IP_address_or_host_name>:8060</value>
  </property>
  <property>
    <name>yarn.resourcemanager.am.max-retries</name>
    <value>2</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>7168</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>4</value>
  </property>
  <property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>3600</value>
  </property>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
Table 1 Description of properties

You can change the following properties as per the configuration of the Hadoop master node and all worker nodes.
Property Description
yarn.scheduler.maximum-allocation-mb: This property defines the maximum memory in MB that can be allocated to a container on this cluster.

The value of this property must be 2 GB less than the highest memory available on any of the worker nodes. For example, suppose you have two Hadoop worker nodes, Worker 1 and Worker 2. Worker 1 is assigned 16 GB RAM and Worker 2 is assigned 8 GB RAM. In this case, the value of this property must be 14 GB: 16 GB (RAM on Worker 1) minus 2 GB = 14 GB.
yarn.scheduler.maximum-allocation-vcores: This property defines the maximum number of CPU cores that can be allocated to a container on this cluster.

The value of this property must be the highest number of cores available on any of the Hadoop worker nodes. For example, suppose you have two Hadoop worker nodes, Worker 1 and Worker 2. Worker 1 is assigned 4 vCores and Worker 2 is assigned 2 vCores. In this case, the value of this property must be 4.
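The two rules above can be sketched as shell arithmetic. The worker sizes below are the hypothetical ones from the example (16 GB/4 vCores and 8 GB/2 vCores); substitute your own values.

```shell
# Sketch: derive the two allocation values from the largest worker.
WORKER1_MB=16384; WORKER1_CORES=4   # Worker 1: 16 GB RAM, 4 vCores
WORKER2_MB=8192;  WORKER2_CORES=2   # Worker 2: 8 GB RAM, 2 vCores
MAX_MB=$(( WORKER1_MB > WORKER2_MB ? WORKER1_MB : WORKER2_MB ))
MAX_CORES=$(( WORKER1_CORES > WORKER2_CORES ? WORKER1_CORES : WORKER2_CORES ))
ALLOC_MB=$(( MAX_MB - 2048 ))       # leave 2 GB for the OS and daemons
echo "yarn.scheduler.maximum-allocation-mb=$ALLOC_MB"       # 16384 - 2048 = 14336 (14 GB)
echo "yarn.scheduler.maximum-allocation-vcores=$MAX_CORES"
```

Copy the printed values into the corresponding properties in yarn-site.xml.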
5. Edit the Hadoop worker nodes file /usr/local/hadoop/etc/hadoop/worker and add the IP addresses or host names of all the Hadoop worker nodes. Make sure that each host name is accessible from all the nodes. This file must be edited only on the Hadoop master node.
6. Open the hdfs-site.xml file and copy and paste the following:
Note: You must enter the Hadoop master node IP address or host name in the <Enter_Hadoop_master_node_IP_address_or_host_name> placeholder. Make sure that the host name is accessible from all the nodes.
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://<Enter_Hadoop_master_node_IP_address_or_host_name>:9000</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/tmp/dfs/name/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/tmp/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value><IP_address_or_host_name_of_any_Hadoop_worker_node>:50090</value>
  </property>
</configuration>
Table 2 Description of property
Property Description
dfs.replication: This property defines the number of nodes used for replication. You can change this value based on how many nodes you want to assign for replication.
Configuring worker nodes

You must perform these steps on all the Hadoop worker nodes.

Copy the files core-site.xml and capacity-scheduler.xml from the Hadoop master node to each worker node.
Procedure
1. Log on to the Hadoop master node.
2. Navigate to the following directory: /usr/local/hadoop/etc/hadoop
3. To copy the files, run the following command:
scp core-site.xml capacity-scheduler.xml meghabatch@<Hadoop worker node>:/usr/local/hadoop/etc/hadoop
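With several workers, the copy can be done in one loop. This is a sketch: WORKERS holds hypothetical host names (replace with your worker IPs or host names), and the scp command is printed as a dry run; remove the echo to actually run the copies.

```shell
# Sketch: push the two shared files to every worker in one loop.
WORKERS="worker1 worker2"                 # placeholder worker host names
CONF=/usr/local/hadoop/etc/hadoop
for w in $WORKERS; do
  echo scp "$CONF/core-site.xml" "$CONF/capacity-scheduler.xml" "meghabatch@$w:$CONF/"
done
```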
4. Open the yarn-site.xml file, and copy and paste the following:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value><Enter_Hadoop_master_node_IP_address_or_host_name>:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value><Enter_Hadoop_master_node_IP_address_or_host_name>:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value><Enter_Hadoop_master_node_IP_address_or_host_name>:8040</value>
  </property>
  <property>
    <name>yarn.nodemanager.address</name>
    <value><Hadoop_worker_node_IP_address_or_host_name_on_which_you_are_configuring_this_file>:8050</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>7168</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
  </property>
</configuration>
Table 3 Description of properties

You can change the following properties as per the configuration of the Hadoop worker node.
Property Description
yarn.nodemanager.resource.memory-mb: The value of this property must be 2 GB less than the total physical memory available on the Hadoop worker node.

yarn.nodemanager.resource.cpu-vcores: The value of this property must be the same as the number of cores available on the Hadoop worker node.
5. Open the hdfs-site.xml file and copy and paste the following:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://<Enter_Hadoop_master_node_IP_address_or_host_name>:9000</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/tmp/dfs/name/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/tmp/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
Table 4 Description of property
Property Description
dfs.replication: This property defines the number of nodes used for replication. You can change this value based on the number of nodes you want to assign for replication.
Launching Hadoop

Procedure

1. Log on to the Hadoop master node as the meghabatch user.
2. Before starting the cluster services for the first time, format the namenode using the following command on the Hadoop master node:

/usr/local/hadoop/bin/hdfs namenode -format

3. Start the Hadoop master node and Hadoop worker node services:
$HADOOP_HOME/sbin/start-yarn.sh
$HADOOP_HOME/sbin/start-dfs.sh
4. Create the following mandatory directories on HDFS for storing configuration data, using the following commands:
$HADOOP_HOME/bin/hdfs dfs -mkdir /dtp/
$HADOOP_HOME/bin/hdfs dfs -mkdir /dtp/conf
$HADOOP_HOME/bin/hdfs dfs -mkdir /dtp/lib
$HADOOP_HOME/bin/hdfs dfs -mkdir /dtp/rptdefs
$HADOOP_HOME/bin/hdfs dfs -mkdir /dtp/reports
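The five directories above can also be created in a loop. This sketch prints the commands as a dry run; remove the echo to execute on the Hadoop master node.

```shell
# Sketch: the same five HDFS directories created in a loop (dry run).
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}
for d in /dtp /dtp/conf /dtp/lib /dtp/rptdefs /dtp/reports; do
  echo "$HADOOP_HOME/bin/hdfs dfs -mkdir $d"
done
```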
5. You can verify whether all the services are up and running by executing the following command:
jps
The outputs on the master and worker nodes are similar to the following:
• On the Hadoop master node:

23431 NameNode
24618 Jps
22757 ResourceManager

• On a Hadoop worker node:

10117 SecondaryNameNode   (one Hadoop worker node runs the secondary NameNode)
16464 Jps
9741 NodeManager
10017 DataNode
Note: The ResourceManager must be running on the Hadoop master node and the NodeManager must be running on all the Hadoop worker nodes. Apache Hadoop also has its own user interface. You can browse the Hadoop distributed file system using the following link: http://<Hadoop_master_node_IP_address>:50070/dfshealth.html#tab-overview

In addition, cluster details and applications submitted to the cluster can be seen through the following link: http://<Hadoop_master_node_IP_address>:8088/cluster
Uploading the JAR to HDFS

The dtpAll.jar file is provided with the RPM installation. You must upload this file to the Hadoop distributed file system.
Procedure
1. Run the following command on the machine that has dtpAll.jar:
curl -i -X PUT "http://<Hadoop_master_node>:50070/webhdfs/v1/dtp/lib/dtpAll.jar?user.name=meghabatch&op=CREATE"
The output must be similar to the following:
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Wed, 22 Jun 2016 06:22:28 GMT
Date: Wed, 22 Jun 2016 06:22:28 GMT
Pragma: no-cache
Expires: Wed, 22 Jun 2016 06:22:28 GMT
Date: Wed, 22 Jun 2016 06:22:28 GMT
Pragma: no-cache
Set-Cookie: hadoop.auth="u=meghabatch&p=meghabatch&t=simple&e=1466612548284&s=9ycC70xTVI/LFa3HKl3J+NO15iA="; Path=/; Expires=Wed, 22-Jun-2016 16:22:28 GMT; HttpOnly
Location: http://ABCD4:534/webhdfs/v1/dtp/lib/dtpAll.jar?op=CREATE&user.name=meghabatch&namenoderpcaddress=ABCD46:9000&overwrite=false
Content-Type: application/octet-stream
Content-Length: 0
Server: Jetty(6.1.26)
(In the Location URL above, change overwrite=false to overwrite=true.)
2. Use the Location URL in the following command:

curl -i -T dtpAll.jar "<Location>"
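The two-step upload can be scripted. This is a sketch under assumptions: MASTER is a hypothetical host name, and the curl lines are shown as comments because they require a live cluster; only the URL construction runs here.

```shell
# Sketch: build the step-1 URL (with overwrite=true) and feed the returned
# Location header into step 2.
MASTER="hadoop-master.example.com"   # placeholder master host name
URL="http://${MASTER}:50070/webhdfs/v1/dtp/lib/dtpAll.jar?user.name=meghabatch&op=CREATE&overwrite=true"
echo "$URL"
# LOCATION=$(curl -s -i -X PUT "$URL" | awk '/^Location:/ {print $2}' | tr -d '\r')
# curl -i -T dtpAll.jar "$LOCATION"
```

Note that the URL must be quoted on the command line; an unquoted & would otherwise be interpreted by the shell.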
Adjusting Hadoop settings on advanced reporter
Procedure
1. Stop the jetty service using the following command:

sh /usr/local/meghaBatchApp/bin/megha-jetty.sh stop

2. Confirm that the jetty service is stopped:

sh /usr/local/meghaBatchApp/bin/megha-jetty.sh status

3. Hadoop-related configurations, such as the Hadoop master node IP address and the HDFS NameNode URL, must be updated in advanced reporter at the following location: /usr/local/meghaBatchApp/conf/sys/dtpConfig.json

Sample file:
{
  "amContainerVCores" : 1,
  "amContainerMemory" : 3072,
  "containerVCores" : 1,
  "containerMemory" : 2048,
  "reportCount" : 10,
  "jar" : {
    "resource" : "hdfs://<Hadoop master IP address>:9000/dtp/lib/dtpAll.jar",
    "type" : "FILE",
    "visibility" : "APPLICATION",
    "size" : "<Size of Jar file>",
    "timestamp" : "<Timestamp of Jar file>"
  },
  "yarnBaseUri" : "http://<Hadoop master IP address>:8088/ws/v1/cluster/",
  "hdfsBaseUri" : "hdfs://<Hadoop master IP address>:9000/"
}
Note: The size and timestamp must match the values in HDFS; the Hadoop worker nodes use them to verify the authenticity of the jar file.
a. Use the following commands to obtain the size and timestamp. (The commands must be run on the Hadoop master node as the meghabatch user.)

/usr/local/hadoop/bin/hadoop dfs -ls /dtp/lib
/usr/local/hadoop/bin/hadoop dfs -stat '%Y' /dtp/lib/dtpAll.jar
b. Replace these two values in the dtpConfig.json file at the appropriate location.
4. Start the jetty service:
sh /usr/local/meghaBatchApp/bin/megha-jetty.sh start
3
Working with batches
Batches include a pre-defined hierarchy of reports. Every batch has a pre-defined navigation tree and set of reports. Batches are run hourly, daily, weekly, and monthly.

You can view the batch reports using the following built-in user account:
• User name: appuser
• Password: appuser123
□ Viewing batch reports
□ Managing batches
□ Managing advanced reporter
□ Uninstalling advanced reporter
Viewing batch reports

To view the batch reports, log on to the advanced reporter.

• To view batch reports, click . For example, the batch is expanded as shown in the following screen capture and shows the latest successful report batch.
• To view the reports, click any tree node. The report appears in the reports pane.
Note: This release supports only Adaptable Modular Storage (AMS) and Hitachi Enterprise storage systems.
Viewing previous batch
To view the previous batch run, click Previous Batch in the tree navigation menu bar. The tree of the previous batch run appears. You can click any node to view reports.
Viewing next batch
To view the next batch run, click Next Batch in the tree navigation menu bar. The tree of the next batch run appears. You can click any node to view reports.
Viewing a batch run from a specific time in the past

To view a past batch run, click Choose Date for Run in the tree navigation menu bar and select the date. The tree for the selected batch run appears. You can click any node to view reports.
Downloading a report in a batch run
To download a report in a batch run, click Download this Report in the reports pane. The report is downloaded as a zip file.
Managing batches

You can download the batch, and view the summary and status of the batch.
Viewing batch summary

Click the batch name, and select the Summary tab to view the batch details, including: name, batch schedule, time zone, export setting, and report names in the batch.
Viewing batch status

You can view the status of the batch.
Procedure
1. Select the batch.
2. Click the Status tab.
The Status window appears.
3. In the Status window, do the following:
a. The Data Status shows the status of data availability in the registered Data Center Analytics server. Hover over the status icon for details.
b. The Batch Status shows the status of the batch; you can pause the scheduled batch and resume it later when required.
c. The table shows the summary of the batch run; you can download the logs of the batch run.
Pausing and resuming a batch

You can pause and resume a batch that is in progress.
Procedure
1. Select the batch that you want to pause or resume.
2. Select the Status tab.
3. Click Pause/Resume.
Downloading batch ZIP

You can download the batch reports in a zip file.
Procedure
1. Click Download Zip of All Reports in the tree navigation menu bar.
The Select Resource Scope(s) for Download window appears.
2. Select the resource name for which you want to download reports.
The zip file contains all the reports of the selected resource. The structure of the folders in the zip is the same as shown in the navigation tree.
Managing advanced reporter

You can change the FTP settings, add or remove Hadoop worker nodes, and change passwords.
Changing FTP settings

If the IP address of the FTP server changes, you must update the FTP server details.
Procedure
1. Log on to advanced reporter as the administrator user. The default administrator user is admin.
2. Hover over the icon and then select FTP Settings.
The FTP Information window appears.
3. Type the FTP details.
Related tasks
• Initial setup for advanced reporter on VMware vSphere Client on page 12
• Initial setup for advanced reporter on Linux machine on page 14
Adding or removing Hadoop worker nodes

You can add or remove Hadoop worker nodes as required.
Procedure
1. Log on to advanced reporter as administrator user.
2. Hover the mouse over the icon and then select Add/Remove Node.
The Add or Remove Worker Node window appears.
3. Select either of the following buttons:
• Add New Worker Node
• Remove Existing Worker Node
4. Do the following:
• To add a worker node, enter the IP address or host name that you want to add.
• To remove a worker node, select the corresponding node name button.
5. Click Submit.
Changing password

You can change the passwords of the default users admin and appuser. To change a password, you must log on to advanced reporter as the respective user.
Procedure
1. Hover the mouse over the icon and then select Change Password.
The Change Password window appears.
2. Enter the old and new password as prompted.
Uninstalling advanced reporter

Ensure that you have unregistered Data Center Analytics servers from advanced reporter before you uninstall it.
Procedure
1. Log on to advanced reporter as the root user.
2. Run the following command:
rpm -e <Name_of_the_RPM>
4
Advanced operations
□ Custom resources
□ Custom properties
□ Interval rollup
Custom resources

Custom resources consist of custom models and custom resource groups. Custom resources help you group resources and monitor them together. The performance reports are shown in advanced reporter.
Custom resource group

A custom resource group is a group of resources. You can create a custom resource group of the desired resources to monitor their performance. Reports are generated to monitor the performance of these resources.
Model

A model is a group of custom resource groups. No reports are configured for a custom model itself.
The process of creating custom resources includes:
• Creating custom resource groups
• Mapping resource groups to a model
Custom resources are configured in the Data Center Analytics server. When advanced reporter downloads the data from the Data Center Analytics server, the custom resource structure and reports appear in the advanced reporter navigation pane.
To create custom resources, customDataProcessor.jar and export.properties are used. These two files are packaged into the Data Center Analytics server. From the command line, you must log on to the Data Center Analytics server as the root user and run commands for custom resource operations.
In this release, you can create custom resources only for the following resource types of Adaptable Modular Storage (AMS) and Hitachi Enterprise storage systems:
• LDEV
• Port
• Pool
• Owner
Creating custom resources

To create custom resources, you must know the signatures of the resources, such as LDEVs. For example, if you want to do performance analysis of specific LDEVs and pools, you must create custom resources.
Procedure
1. Discover the signatures of the LDEVs and pools that you want to monitor.
2. Map the LDEV signatures in a custom LDEV group.
3. Map the pool signatures in a custom pool group.
4. Map the custom LDEV group and custom pool group to a custom model.
5. Import the custom resources.
Related tasks
• Exporting resource details for custom resources on page 41
• Creating custom resource group, custom model, and mapping resources on page 41
• Importing custom resources on page 42
Exporting resource details for custom resources

You can export the resource details from the Data Center Analytics server.
Procedure
1. Log on to the Data Center Analytics server and navigate to the following location: /usr/local/megha/bin/customResourceGenerator
2. Open the export.properties file and note the resource for which you want to export the details.
3. Run the following command:
java -jar customDataProcessor.jar -ip IP_address -port Port_number -userId User_id -password Password -op export -exportType Export_type -propFile /usr/local/megha/bin/customResourceGenerator/export.properties
For example,
java -jar customDataProcessor.jar -ip 192.168.1.1 -port 8443 -userId test -password test -op export -exportType raid_ldev_export -propFile /usr/local/megha/bin/customResourceGenerator/export.properties
The options are as follows:
• -ip: Data Center Analytics server IP address
• -port: Data Center Analytics server port
• -userId: server user ID
• -password: server user password
• -op: operation type
• -exportType: export type, such as ldev_export or pool_export
• -propFile: property file containing the query for the specified export type
After the export task completes, a CSV file is created at the same location.
Creating custom resource group, custom model, and mapping resources
After exporting the resource details to a CSV file, you can use this file to import custom resource details into the Data Center Analytics server.
In this CSV file, you can create custom resource groups and map resources to them. You can also map a custom resource group to a model.
You can define various custom resource groups in a single CSV file. One custom resource group can have multiple resources mapped to it, but all resources must belong to one storage system.
Importing custom resources

After mapping the custom resource groups to resources and models, you must import the CSV into the Data Center Analytics server.
Procedure
1. Create a CSV file to import into the Data Center Analytics server.
2. Log on to the Data Center Analytics server and navigate to the following location: /usr/local/megha/bin/customResourceGenerator
3. Copy the CSV file.
4. To import the CSV file, run the following command:
java -jar customDataProcessor.jar -ip IP_address -port Port_number -userId User_id -password Password -op import -csvFile /usr/local/megha/bin/customResourceGenerator/<CSV_file_name>
For example,
java -jar customDataProcessor.jar -ip 192.168.1.1 -port 8443 -userId test -password test -op import -csvFile /usr/local/megha/bin/customResourceGenerator/import.csv
After importing the custom resources, the model structure appears in advanced reporter. It might take some time for this change to appear in advanced reporter.
Updating custom resources

To update the custom resources, you must update the imported file.
Procedure
1. Log on to the Data Center Analytics server and navigate to the following location: /usr/local/megha/bin/customResourceGenerator
2. Update the imported file and run the following command:
java -jar customDataProcessor.jar -ip IP_address -port Port_number -userId User_Id -password Password -op update -csvFile /usr/local/megha/bin/customResourceGenerator/CSV_file_name
For example,
java -jar customDataProcessor.jar -ip 192.168.1.1 -port 8443 -userId test -password test -op update -csvFile "/usr/local/megha/bin/customResourceGenerator/update.csv"
Deleting custom resources

You can delete a custom model and its corresponding resource groups. You can delete only the models and the custom resource groups; other resources that are mapped to the custom resource groups are not deleted.
Procedure
1. Log on to the Data Center Analytics server and navigate to the following location: /usr/local/megha/bin/customResourceGenerator
2. To delete the custom resources, you must have the custom resource signature. You can obtain it using the following command:
java -jar customDataProcessor.jar -ip IP_address -port Port_number -userId User_id -password Password -op export -exportType model_export -propFile /usr/local/megha/bin/customResourceGenerator/export.properties
For example,
java -jar customDataProcessor.jar -ip 192.168.1.1 -port 8443 -userId test -password test -op export -exportType model_export -propFile "/usr/local/megha/bin/customResourceGenerator/export.properties"
The CSV file is created where you have saved the JAR file.
3. Note the custom resource that you want to delete.
4. Run the following command to delete the custom resource:
java -jar customDataProcessor.jar -ip IP_address -port Port_number -userId User_id -password Password -op delete -customModelSig Custom_resource_signature

For example,
java -jar customDataProcessor.jar -ip 192.168.1.1 -port 8443 -userId test -password test -op delete -customModelSig customModel#85010102-test
Custom properties

This section describes the custom properties used to purge and zip data. Every batch run task is divided into multiple Hadoop containers, where each container is given a set of reports to generate. These Hadoop containers require resources (memory and CPU cores) to generate the reports.
You can configure the container resources required to run a set of reports. You can also set the count of reports in a container. This helps you optimize the use of Hadoop resources and increase the performance of advanced reporter.
Purging and zipping data

In advanced reporter, batch run logs and batch runs are purged at different frequencies based on the batch schedule. You can define these frequencies as per the available disk space on the advanced reporter.
To change the purging frequency, perform the following:
Procedure
1. Log on to advanced reporter through an SSH client (such as PuTTY).
2. Navigate to the following location: /usr/local/meghaBatchApp/conf
3. In the custom.properties file, type the property name and specify the value. For example: hourly.batch.purge.limit=15D
4. Restart the jetty service using the following commands:
sh /usr/local/meghaBatchApp/bin/megha-jetty.sh stop
sh /usr/local/meghaBatchApp/bin/megha-jetty.sh start
Table 5 Purge report properties

The following table lists the properties and default values. The purging activity deletes the batch run and its related information.
Property Default Value Description
hourly.batch.purge.limit 15D Batch runs with an hourly schedule are purged after 15 days.
You can view at most 15 (days) x 24 (batches in a day) = 360 batch runs.
daily.batch.purge.limit 90D Batch runs with a daily schedule are purged after 90 days.
You can view at most 90 (days) x 1 (daily batch) = 90 batch runs.
weekly.batch.purge.limit 105D Batch runs with a weekly schedule are purged after 105 days.
You can view at most 105 (days) / 7 (days in a week) = 15 batch runs.
monthly.batch.purge.limit 12M Batch runs with a monthly schedule are purged after 12 months.
You can view at most 12 batch runs.
Table 6 Batch zip log properties

The following table lists the properties and default values.
Property Default Value Description
hourly.log.zip.limit 2D Batch run logs with an hourly schedule are zipped after 2 days.
You can view at most 2 x 24 = 48 batch run logs in unzipped format.
daily.log.zip.limit 4D Batch run logs with a daily schedule are zipped after 4 days.
You can view at most 4 x 1 = 4 batch run logs in unzipped format.
weekly.log.zip.limit 21D Batch run logs with weekly schedule are zippedafter 21 days.
You can view max 21 / 7 = 3 batch runs logs inunzipped format.
monthly.log.zip.limit 2M Batch run logs with monthly schedule are zippedafter 2 months.
You can view max 2 batch runs logs in unzippedformat.
Table 7 Batch purge run log properties
hourly.log.purge.limit (default: 10D)
    Batch run logs with an hourly schedule are purged 10 days after the logs were generated.
    You can view a maximum of 10 x 24 = 240 batch run logs.

daily.log.purge.limit (default: 30D)
    Batch run logs with a daily schedule are purged 30 days after the logs were generated.
    You can view a maximum of 30 x 1 = 30 batch run logs.

weekly.log.purge.limit (default: 70D)
    Batch run logs with a weekly schedule are purged 70 days after the logs were generated.
    You can view a maximum of 70 / 7 = 10 batch run logs.

monthly.log.purge.limit (default: 6M)
    Batch run logs with a monthly schedule are purged 6 months after the logs were generated.
    You can view a maximum of 6 batch run logs.
Configuring report count in a container and resource requirements for batches

Container report count configuration
A report batch run task is divided into multiple containers. By default, a container runs 10 reports. You can change the number of reports in a container with the batch.report.count property.
Procedure
1. Navigate to the following location: /usr/local/meghaBatchApp/conf.
2. In the custom.properties file, enter the batch.report.count property and change its value. For example: batch.report.count=5.
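Because the total number of reports in a batch run is fixed, lowering this value splits the run across more containers. A minimal custom.properties fragment, using the example value from step 2:

```properties
# /usr/local/meghaBatchApp/conf/custom.properties
# Run 5 reports per container instead of the default 10.
batch.report.count=5
```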
Resource requirements for containers

You can configure the resource requirements for report containers in the custom.properties file. For example: hourly.resources=2048,2. In this example, the resources required in Hadoop for the hourly batch are 2048 MB of RAM and 2 CPU cores.
Table 8 Default values

Batch Type      Property            Default RAM (MB)   CPU Cores
Hourly batch    hourly.resources    1024               1
Daily batch     daily.resources     3072               1
Weekly batch    weekly.resources    3072               1
Monthly batch   monthly.resources   3072               1
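To override the defaults in Table 8, add the corresponding properties to custom.properties in the <RAM in MB>,<CPU cores> format shown earlier. The values below are illustrative overrides, not defaults:

```properties
# /usr/local/meghaBatchApp/conf/custom.properties
# Format: <RAM in MB>,<CPU cores> -- example values, not defaults
hourly.resources=2048,2
daily.resources=4096,2
```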
Updating the yarn-site.xml file

After completing the setup, you can increase the RAM or vCores of the worker nodes.
Procedure
1. If you increase the RAM or vCores of worker nodes, you must update the following properties in the yarn-site.xml file on the worker nodes:
   • yarn.nodemanager.resource.memory-mb
   • yarn.nodemanager.resource.cpu-vcores
2. If you have more than one worker node in the Hadoop setup and the newly configured RAM or vCores exceed those of the existing worker nodes, you must update the following properties in the yarn-site.xml file on the master node:
   • yarn.scheduler.maximum-allocation-mb
   • yarn.scheduler.maximum-allocation-vcores
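The worker-node entries from step 1 take the standard Hadoop property form. The sizes below are illustrative, not recommendations; the master-node scheduler properties from step 2 are written the same way:

```xml
<!-- yarn-site.xml on each worker node; 8192 MB / 4 vCores are example values -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>4</value>
</property>
```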
Setting the time zone for advanced reporter

Make sure that the time zone of the Data Center Analytics application and the advanced reporter batch are the same. If the time zones differ, the data interval rollup might not be performed accurately for weekly and monthly batches.
On the Data Center Analytics server, you can view the application time zone in /usr/local/megha/conf/custom.properties. The property name is server.timezone. Make sure the value of the time zone property is the same on the Data Center Analytics server and advanced reporter.
In advanced reporter, the application time zone property name is batch.timezone and its default value is America/Los_Angeles. To change this property value, perform the following:
Procedure
1. Navigate to the following location: /usr/local/meghaBatchApp/conf/custom.properties
2. Enter the following property: batch.timezone=<Timezone>
3. Restart the jetty service using the following commands:

   sh /usr/local/meghaBatchApp/bin/megha-jetty.sh stop
   sh /usr/local/meghaBatchApp/bin/megha-jetty.sh start
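The consistency check described above can be scripted. The sketch below uses temporary files in place of the documented paths (/usr/local/megha/conf/custom.properties on the server and /usr/local/meghaBatchApp/conf/custom.properties on advanced reporter) and compares the two time zone values:

```shell
# Stand-ins for the two custom.properties files named in the guide.
SERVER_CONF=$(mktemp)   # server: /usr/local/megha/conf/custom.properties
BATCH_CONF=$(mktemp)    # reporter: /usr/local/meghaBatchApp/conf/custom.properties
echo 'server.timezone=America/Los_Angeles' > "$SERVER_CONF"
echo 'batch.timezone=America/Los_Angeles' > "$BATCH_CONF"

# Extract the value after '=' for each property.
server_tz=$(grep '^server.timezone=' "$SERVER_CONF" | cut -d= -f2)
batch_tz=$(grep '^batch.timezone=' "$BATCH_CONF" | cut -d= -f2)

if [ "$server_tz" = "$batch_tz" ]; then
  echo "Time zones match: $server_tz"
else
  echo "Time zone mismatch: server=$server_tz batch=$batch_tz" >&2
fi
# prints: Time zones match: America/Los_Angeles
```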
Interval rollup

Interval rollup applies to weekly and monthly batch reports (not to daily and hourly). If the data collection interval for any resource on the target storage system is 30 minutes or less, then the following integer values (in minutes) are valid: 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, and 30.
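The listed values are exactly the integers from 1 to 30 that divide 60 evenly, that is, intervals that roll up cleanly into an hour. The guide does not state this rationale, so treat it as an observation; a quick shell check reproduces the list:

```shell
# Collect every integer from 1 to 30 that divides 60 with no remainder.
valid=""
for i in $(seq 1 30); do
  [ $((60 % i)) -eq 0 ] && valid="$valid $i"
done
valid="${valid# }"   # trim the leading space
echo "Valid intervals: $valid"
# prints: Valid intervals: 1 2 3 4 5 6 10 12 15 20 30
```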
Hitachi Data Systems
Corporate Headquarters
2845 Lafayette Street
Santa Clara, California 95050-2639
U.S.A.
www.hds.com
Regional Contact Information
Americas
+1 408 970 1000
[email protected]

Europe, Middle East, and Africa
+44 (0) 1753 618000
[email protected]

Asia Pacific
+852 3189 7900
[email protected]
Contact Us
www.hds.com/en-us/contact.html
MK-96HDCA008-00
October 2016