BIOSTAT LINUX CLUSTER By Helen Wang October 11, 2012



Basic Beowulf Cluster Structure


A brief look at our cluster


Biostat Beowulf Cluster

• Server name: merlot.bis.vcu.edu
• IP: 128.172.4.89
• Second server as failover: blanc.bis.vcu.edu
• IP: 128.172.4.90 (invisible on mission)
• Software recommended to access servers:
  PC users:
  1. MobaXterm http://mobaxterm.mobatek.net/
  2. ssh / OpenSSH / PuTTY / WinSCP
  3. X Windows http://www.ts.vcu.edu/faq/software/ with license code
  Mac users: Mac Terminal


Access Merlot from your computer
How to use MobaXterm or ssh to access the server
• Outside of VCU, use webvpn.vcu.edu to connect to the server
• Open a new session, then choose SSH to add the server name
• Open "SSH settings" to fill in the information:
  remote hostname: 128.172.4.89
  username: YOUR_ACCOUNT_NAME
  port number: 22
• Open session settings to put merlot in Session Name
• ssh -X USERACCT@SERVER_IP for graphical access
• Select the server to test the connection and exchange keys by giving your password
• Create a profile or bookmark for easy access every time
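For Mac or Linux users who connect with OpenSSH, the same settings can be kept in an SSH client configuration entry. The following is a minimal sketch and not part of the original slides: the alias merlot and the scratch file are assumptions, and YOUR_ACCOUNT_NAME is a placeholder. It writes to a temporary file so that an existing ~/.ssh/config is not touched.

```shell
# Sketch: write an OpenSSH client config entry for merlot to a scratch file.
# Assumption: the alias "merlot" and the scratch path are illustrative only.
config_file="$(mktemp)"

cat > "$config_file" <<'EOF'
Host merlot
    HostName 128.172.4.89
    User YOUR_ACCOUNT_NAME
    Port 22
    ForwardX11 yes
EOF

# Show the entry; copy it into ~/.ssh/config by hand once it looks right.
cat "$config_file"
```

With such an entry in ~/.ssh/config, typing ssh merlot should behave like ssh -X YOUR_ACCOUNT_NAME@128.172.4.89, since ForwardX11 yes is the configuration-file counterpart of -X.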


Access Cluster
What do you need to do to access the server?
• Get a username and password
• Change your password to a qualifying password:
  $ passwd
• Go to webvpn.vcu.edu to install VCU webvpn on your PC so you can access the server from anywhere.
• Send your home IP to the administrator in order to access the server from home (not preferred)
  www.whatismyipaddress.com
• Set up the necessary variables to customize your personal console
  templates:
  /home/huan/.cshrc
  /home/huan/.login
  change the identity to be your name
• Make tmp and bin directories under your home directory
  $ mkdir tmp
  $ mkdir bin


Access Cluster
Server and nodes
• Master node (master1 / master2): merlot.bis.vcu.edu
• Running CentOS (Red Hat kernel) version 5.5, x86-64
• Open source or software download: choose 64-bit CentOS or RHEL 5 if possible

Purposes:
front-end user interface; slow; not for running jobs (test jobs that run for 1 hour will be terminated by the system); accessible from outside by permission

Slave nodes (nodes):
node1.biocl.vcu.edu to node22.biocl.vcu.edu (dual quad-core Xeon processors with 64 GB RAM)
Node15-22 have large memory capacity (96 GB) for running memory-hungry jobs

Purposes:
computation; not preferred for user-interface access; accessible via the master and managed by the Portable Batch System (PBS); fast; internal network 10.0.0.X, not accessible directly from outside


Accessing Cluster
Software available on master and nodes
• R 2.15 with CRAN libraries and Bioconductor libraries
• C++/G++ compiler, Fortran compiler (f77/f90)
• Perl 5.8.8
• Python with Biopython
• Common open-source software needed by users (PLINK, MERLIN, IMPUTE, etc.), upon user request
• SAS 9.3 is on all nodes: /usr/local/bin/sas


Commands to be used on cluster

• Submitting R jobs on the normal queue
  $ qR MYSCRIPT (if the script name is MYSCRIPT.R, submit it without the .R extension)
  Each user is allowed to run 30-50 jobs simultaneously
• Submitting jobs on the large memory queue
  The large memory queue is on node1 for memory-intensive jobs (limited to 8 in total)
  $ qRL MYSCRIPT


Template used on cluster

• Modify the template to create your own PBS script for running programs

#!/bin/bash
#PBS -q serial
#PBS -N MYSCRIPT
#PBS -V
#
echo "******STARTING****************************"
#
# cd to the working directory. Otherwise the job will execute in my home directory.
# (To use the directory the job was submitted from, cd to $PBS_O_WORKDIR instead.)
#
WORKDIR=~/YOURWORKDIR
cd "$WORKDIR"
#echo "PBS batch job id is $PBS_JOBID"
echo "Working directory of this job is: $WORKDIR"
#
echo "Beginning to run job"
# Put here the command line that executes the job, e.g. /home/huan/bin/calculate -PARAMETERS

• Save it in a file named MYSCRIPT
• $ qsub MYSCRIPT
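When many similar jobs are needed, the template can be filled in by a small shell function instead of by hand. The following is a minimal sketch under stated assumptions: make_pbs_script is a hypothetical helper (not a cluster command), the serial queue name is taken from the template, the command path is the placeholder used in the slides, and PBS_O_WORKDIR is the standard PBS variable holding the directory the job was submitted from.

```shell
# Sketch: generate a PBS script from the template for a given job name and command.
# Assumptions: make_pbs_script is a hypothetical helper; the "serial" queue name
# comes from the template; the example command is a placeholder.
make_pbs_script() {
    job_name="$1"
    job_command="$2"
    # Unquoted heredoc: ${job_name} and ${job_command} are filled in now,
    # while \$PBS_O_WORKDIR stays literal and is expanded when the job runs.
    cat <<EOF
#!/bin/bash
#PBS -q serial
#PBS -N ${job_name}
#PBS -V
cd "\$PBS_O_WORKDIR"
echo "Beginning to run job"
${job_command}
EOF
}

# Write the script to a scratch file; on the cluster it would then be
# submitted with qsub.
script_file="$(mktemp)"
make_pbs_script MYSCRIPT "/home/huan/bin/calculate" > "$script_file"
cat "$script_file"
```

All #PBS lines are kept directly under the #!/bin/bash line because PBS stops reading directives at the first executable command.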


Commands used on cluster

• Submitting an interactive job
  Use this when there is no script command for submitting jobs with a new application
  $ qsub -I (to get on a node)
  NODE7$ plink --script PLKSCRIPT
• Checking job status
  "R" Running; "E" Exiting; "H" Holding; "Q" Queued
  $ qstat
  $ qstat -n (shows which node your job is on)
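The one-letter states are easy to forget when scanning qstat output. The following is a small sketch of a lookup function covering the four states listed above; describe_job_state is a hypothetical helper name, not a PBS command.

```shell
# Sketch: translate the one-letter job state shown by qstat into the names above.
# Assumption: describe_job_state is a hypothetical helper, not a PBS command.
describe_job_state() {
    case "$1" in
        R) echo "Running" ;;
        E) echo "Exiting" ;;
        H) echo "Holding" ;;
        Q) echo "Queued" ;;
        *) echo "Unknown state: $1" ;;
    esac
}

# Print the four states covered in the slides.
for state in R E H Q; do
    printf '%s = %s\n' "$state" "$(describe_job_state "$state")"
done
```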


Commands to be used on cluster
• Log in to a node to check the status
  $ ssh NODE#
  NODE#$ top
  NODE#$ exit
• Quit or cancel a job submission
  $ qstat (to get the job ID)
  $ qdel YOURJOBID
• Limitations on the name of the SCRIPT
  No more than 10 characters
  No spaces in between
  No special characters
  Use a temporary name if necessary and change it back when the job is done.
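The three naming limits can be checked before submitting. The following is a minimal sketch; is_valid_script_name is a hypothetical helper, and "special characters" is read here as anything other than letters, digits, and underscore, which is an assumption rather than a rule stated in the slides.

```shell
# Sketch: check a script name against the three limits listed above.
# Assumption: "special characters" means anything other than letters,
# digits, and underscore; adjust the pattern if the cluster allows more.
is_valid_script_name() {
    name="$1"
    # Reject empty names and names longer than 10 characters.
    [ -n "$name" ] || return 1
    [ "${#name}" -le 10 ] || return 1
    # Reject names containing any character outside A-Z, a-z, 0-9, _
    # (this also rejects spaces).
    case "$name" in
        *[!A-Za-z0-9_]*) return 1 ;;
    esac
    return 0
}

for candidate in MYSCRIPT "my script" averyveryverylongname sim_01; do
    if is_valid_script_name "$candidate"; then
        echo "ok:       $candidate"
    else
        echo "rejected: $candidate"
    fi
done
```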


General commands used in Linux

• List files
  $ ls lists the files in the current directory.
  $ ls -F shows the difference between directories and ordinary files.
  $ ls -a lists all files, even those that are normally invisible in UNIX (files whose names start with a period, e.g. .xstartup).
  $ ls -lt | more lists files sorted by time; piping to more gives a page-by-page display
• Make directory and change directory
  $ mkdir DIR1
  $ cd DIR1
  $ cd .. goes back to the upper-level directory
  $ cd goes back to your home directory
• Remotely copy between servers
  $ scp acct1@server1:/yourpath/yourfiles acct2@server2:/yourpath/


General commands used in Linux

• Copy file or directory
  $ cp PATH1/FILE1 PATH2/FILE2 copies the contents of FILE1 into the file FILE2
  $ cp -r PATH1/DIR1 PATH2/DIR2 copies the contents of DIR1 into the directory DIR2
  Use . instead of FILE2/DIR2 to keep the same file/dir name.
  $ cp PATH1/FILE1 ./ copies FILE1 from PATH1 to the current directory with the same file name
• Move file or directory
  $ mv file_name dir_name moves the file file_name from the current directory into the directory dir_name, where dir_name is a subdirectory of the current directory
  $ mv old_file new_file renames old_file and calls it new_file.
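The copy and move commands can be tried safely in a scratch directory before using them on real data. The following is a minimal sketch; all file and directory names are illustrative, and the scratch directory comes from mktemp.

```shell
# Sketch: exercise cp, cp -r, and mv inside a throwaway directory.
# Assumption: all file and directory names here are illustrative.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

mkdir DIR1 archive
echo "sample data" > DIR1/FILE1

cp DIR1/FILE1 FILE2          # copy FILE1 to a new file named FILE2
cp -r DIR1 DIR2              # DIR2 does not exist yet, so it becomes a copy of DIR1
cp DIR1/FILE1 ./             # copy into the current directory, keeping the name FILE1
mv FILE2 archive             # archive is a directory, so FILE2 moves into it
mv FILE1 new_file            # new_file does not exist, so FILE1 is renamed

# List what is left; -F marks directories with a trailing slash.
ls -F . archive DIR2
```

Note that mv decides between "move into" and "rename" by whether the second argument is an existing directory. The scratch directory can be removed afterwards with rm -r.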


General commands used in Linux

• Delete files and directory
  $ rm my_file deletes my_file
  $ rm -r my_dir deletes my_dir
  Use the wildcard * to delete multiple files/directories
  $ rm PLK* deletes all files whose names start with PLK
• System monitoring
  $ ps lists your processes attached to the current terminal
  $ ps gux lists only your processes.
  $ ps aux lists all processes running on your machine
  $ kill my_process sends a terminate signal to the process specified by the process ID (PID) my_process
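The usual sequence is to find the PID with ps and then pass it to kill. The following is a minimal sketch that uses a harmless sleep process in place of a real job; the variable names are illustrative.

```shell
# Sketch: start a harmless background process, look it up with ps, then stop it.
# Assumption: "sleep 300" stands in for a real long-running job.
sleep 300 &
job_pid=$!

# Show only this process; -p selects by PID, -o picks the columns.
# (ps may be missing on very minimal systems, hence the fallback message.)
ps -p "$job_pid" -o pid,comm || echo "ps could not show process $job_pid"

# Send the default terminate signal (SIGTERM), then collect the exit status.
kill "$job_pid"
exit_status=0
wait "$job_pid" 2>/dev/null || exit_status=$?
echo "process $job_pid ended with status $exit_status"
```

A nonzero status here indicates that the process was ended by the signal and did not finish on its own.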


General commands used in Linux

• Display a file
  $ more my_file displays the text of my_file one page at a time. To see the next page, hit the space bar; to see the previous page, type b; to quit paging the file, type q
  $ less my_file is similar to more but faster and has more functions
• Retrieve string from file
  $ grep string filename searches filename for string. It outputs every line which contains string.
  $ grep -v string filename outputs every line which does not contain string
• Display manual of a command
  $ man COMMAND
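The difference between grep and grep -v is easiest to see on a small file. The following is a minimal sketch; the file contents are made up for illustration.

```shell
# Sketch: grep and grep -v on a small throwaway file.
# Assumption: the file contents are made up for illustration.
sample_file="$(mktemp)"
printf '%s\n' "PLK run started" "reading input" "PLK run finished" > "$sample_file"

matching="$(grep PLK "$sample_file")"          # lines that contain PLK
non_matching="$(grep -v PLK "$sample_file")"   # lines that do not contain PLK

echo "Lines with PLK:"
echo "$matching"
echo "Lines without PLK:"
echo "$non_matching"
```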


General commands used in Linux
• Printing
  $ lpr print_file sends print_file to the default printer
  $ lpr -Pother_printer print_file sends print_file to other_printer.
  Aliases for printing jobs: a2psp7, a2psl7, a2psp7d, a2psl7d
  $ ALIAS print_file (ALIAS being one of the above) sends print_file to the default printer in font size 7, in portrait or landscape, single or double sided
• Changing permissions for file and directory
  The default permission is rwx --- ---, which forbids other users to view or copy
  $ chmod 750 my_file
  $ chmod 750 my_dir
  changes the permission so your group members can read, copy, or execute my_file / my_dir
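The effect of chmod 750 can be checked by reading the permission string before and after. The following is a minimal sketch on a throwaway file; the names are illustrative. In 750, the 7 gives the owner rwx, the 5 gives the group r-x, and the 0 gives others nothing.

```shell
# Sketch: apply chmod 750 to a throwaway file and read back the permission string.
# Assumption: the file name is illustrative.
perm_dir="$(mktemp -d)"
touch "$perm_dir/my_file"

chmod 700 "$perm_dir/my_file"     # owner-only, like the default described above
before="$(ls -l "$perm_dir/my_file" | cut -c1-10)"

chmod 750 "$perm_dir/my_file"     # additionally let the group read and execute
after="$(ls -l "$perm_dir/my_file" | cut -c1-10)"

echo "before: $before"
echo "after:  $after"
```

For group members to actually reach a file, the directories above it also need to grant them at least execute permission.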


At Last

• Edit files using nano or vi
  http://www.ts.vcu.edu/faq/unix/picoeditor.html
  http://www.ts.vcu.edu/faq/unix/vieditor.html
• Use a Samba connection to map a network drive on a PC; "EditPad Lite" is recommended
• Transferring files between Windows and the server
  use an sftp window to transfer
  use a Samba connection to transfer
• Useful links
  http://www.ts.vcu.edu/faq/unix/docs.html