MSc. Miriel Martín Mesa, DIC, UCLV High Performance Cluster in the UCLV First steps


MSc. Miriel Martín Mesa, DIC, UCLV

High Performance Cluster in the UCLV

First steps


The idea

Installing a High Performance Cluster in the UCLV, using professional servers with an open source operating system


Why?

Current research requires a large amount of computational resources that cannot be obtained with a single computer.

The need to make several runs of the experiments without waiting for the current run to finish before starting the next.

The possibility of having an electrical backup that allows running jobs that take several days to finish.


Current Hardware

7 Dell R410 nodes, each with: 2 Intel processors (6 cores per processor), 12 GB RAM, 250 GB hard drive, 2 Gigabit NICs

10 Dell 1955 blade nodes, each with: 2 Intel processors (2 cores per processor), 12 GB RAM, 36 GB HDD


Current Hardware

17 nodes with: 28 processors, 132 cores, 204 GB RAM, 1.3 TFLOPS (theoretical)


Cluster design

Beowulf design


Basic Software

• OS: Debian 7
• Resource manager: Torque PBS
• Scheduler: Maui
• Central user authentication: NIS server


Cluster installation (Master and nodes)

• PXE (Preboot eXecution Environment)
• DHCP
• TFTP
• HTTP server
• DNS server (BIND)
• Preseed script (answers to the installation questions)
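How these pieces fit together can be sketched with a small shell helper that prints a PXELINUX menu entry: the node boots over PXE, fetches the installer kernel and initrd over TFTP, and is told where to download the preseed file over HTTP. This is only a sketch under assumptions: the function name `pxe_entry` and the preseed path `/preseed/node.cfg` are hypothetical, and the kernel and initrd paths follow the layout of the standard Debian netboot archive, which may differ from the layout used on this cluster.

```shell
#!/bin/sh
# Print a PXELINUX menu entry that boots the Debian netboot installer and
# points it at a preseed file served over HTTP.
# $1: host name of the HTTP server
# $2: path of the preseed file on that server (hypothetical in the example)
pxe_entry() {
    host=$1
    preseed_path=$2
    cat <<EOF
default install
label install
    kernel debian-installer/amd64/linux
    append auto=true priority=critical url=http://${host}${preseed_path} initrd=debian-installer/amd64/initrd.gz
EOF
}

# Example: entry for the master host named in the preseed file.
pxe_entry master.cluster.uclv.edu.cu /preseed/node.cfg
```

The `auto=true priority=critical` parameters ask the installer to postpone its questions until the preseed file has been fetched, so the answers in that file apply to the whole installation.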


Preseed code

d-i mirror/protocol string http
d-i mirror/country string manual
d-i mirror/http/hostname string master.cluster.uclv.edu.cu
d-i mirror/http/directory string /debian
d-i mirror/http/proxy string
d-i mirror/suite string wheezy

d-i partman-auto/disk string /dev/sda
d-i partman-auto/method string regular
d-i partman-auto/choose_recipe select atomic
d-i partman-auto/purge_regular_from_device boolean true
d-i partman-regular/confirm boolean true
d-i partman/confirm_write_new_label boolean true
d-i partman/choose_partition select Finish partitioning and write changes to disk
d-i partman/confirm boolean true
# If the system has free space you can choose to only partition that space.

tasksel tasksel/first multiselect minimal
d-i pkgsel/include string openssh-server puppet
d-i preseed/late_command string sed -i 's/no/yes/g' /target/etc/default/puppet

d-i mirror/protocol string http
d-i mirror/country string manual
d-i mirror/http/hostname string master.uclv.cu
d-i mirror/http/directory string /debian
d-i mirror/suite string wheezy

d-i partman-auto/disk string /dev/sda
d-i partman-auto/method string regular
d-i partman-auto/choose_recipe select atomic
d-i partman-regular/confirm boolean true
d-i partman/confirm_write_new_label boolean true
d-i partman/choose_partition select Finish partitioning and write changes to disk
d-i partman/confirm boolean true

tasksel tasksel/first multiselect minimal
d-i pkgsel/include string openssh-server puppet
d-i preseed/late_command string sed -i 's/no/yes/g' /target/etc/default/puppet


Cluster Management

Puppet: package management and configuration of the server and the nodes.


Cluster Management

Module: commons

class packages-commons {
  # Packages installed on every machine in the cluster
  $packages_commons = ["csh", "flex", "byacc", "vim", "tcsh", "lsb", "lsb-core"]
  package { $packages_commons:
    ensure => installed,
  }
}


Cluster Management

Module: MPICH

class mpich ($mpich_version) {
  file { mpich:
    path   => "${mpich_path}",
    owner  => root,
    mode   => 775,
    ensure => directory,
  }
  exec { "mpich_configure":
    cwd     => "${mpich_source}-${mpich_version}/",
    command => "nice -19 sh configure ${mpich_prefix} ${mpich_with_torque}",
    onlyif  => "test ! -e ${mpich_source}-${mpich_version}/config.log",
  }
  ……
}


Cluster Management

cron { update_ntpdate:
  command => "/usr/sbin/ntpdate ",
  user    => root,
  minute  => 0,
  hour    => '*/1',
}

service { cron:
  ensure => running,
  enable => true,
}


Monitoring tools

Ganglia

Provides real-time monitoring and execution environment


Monitoring tools

Icinga

Monitors any network resource, sends notifications when errors occur, generates performance data for reporting, and reports the status of resources.



System Access

Secure shell (SSH):

# ssh <user>@<cluster host>



System Access

Web page



Cluster applications



Example

#!/bin/bash
#PBS -N example1
#PBS -l nodes=2:ppn=4
#PBS -l walltime=01:20:00
#PBS -q default
#PBS -m ae
#PBS -M <your e-mail address>
cd "$PBS_O_WORKDIR"
############################
module load mpich/3.0.4
mpirun ./application
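To make the resource request in this example easier to read, here is a small sketch that builds the same `#PBS` header from its parts and reports how many cores the request adds up to (nodes times processes per node). The helper names `total_procs` and `pbs_header` are hypothetical and are not part of Torque; they only illustrate how the `nodes=...:ppn=...` line is composed.

```shell
#!/bin/sh
# Total number of cores implied by "#PBS -l nodes=N:ppn=P".
# $1: number of nodes, $2: processes per node
total_procs() {
    echo $(( $1 * $2 ))
}

# Print a job script header like the one in the example.
# $1: job name, $2: nodes, $3: processes per node, $4: walltime (HH:MM:SS)
pbs_header() {
    printf '#!/bin/bash\n'
    printf '#PBS -N %s\n' "$1"
    printf '#PBS -l nodes=%s:ppn=%s\n' "$2" "$3"
    printf '#PBS -l walltime=%s\n' "$4"
    printf '#PBS -q default\n'
}

pbs_header example1 2 4 01:20:00
echo "# this request asks for $(total_procs 2 4) cores in total"
```

With the values from the example (2 nodes, 4 processes per node) the request covers 8 cores. A script written this way would then be handed to the resource manager with Torque's `qsub` command.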


Cluster queues

Queue    Nodes  Access       Cores  Memory (GB)  Jobs/user  Max time (hours)  Priority
Default
small    1      Blade nodes  4      8            4          12                10
medium   1-3    Any          24     12           3          36                20
long     1-4    Any          24     15           2          168               30
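As a reading aid for the table, the following sketch picks the first queue whose limits fit a request. It is an illustration rather than the scheduler's actual policy: the function name `pick_queue` is hypothetical, the "Cores" column is assumed to be the maximum number of cores per job, and only the "Cores" and "Max time" columns are checked (node count, memory and jobs per user are ignored).

```shell
#!/bin/sh
# Pick the first queue from the table whose limits fit a request.
# $1: total cores requested, $2: walltime in whole hours
# Assumption: only the "Cores" and "Max time (hours)" limits are checked.
pick_queue() {
    cores=$1
    hours=$2
    if [ "$cores" -le 4 ] && [ "$hours" -le 12 ]; then
        echo small
    elif [ "$cores" -le 24 ] && [ "$hours" -le 36 ]; then
        echo medium
    elif [ "$cores" -le 24 ] && [ "$hours" -le 168 ]; then
        echo long
    else
        # No queue in the table accepts this request.
        echo none
        return 1
    fi
}

pick_queue 4 10     # prints: small
pick_queue 8 24     # prints: medium
pick_queue 8 100    # prints: long
```

The queue name it prints is what would go after `#PBS -q` in a job script.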


To do

• Implement a system of user quotas

• Add external storage

• Continue installing applications demanded by users

We always need to do more


Thank you

Muchas Gracias