
"Computer Best Practice for Experimental High Energy Physicist "

April 23, 2018

Dr. Woochun Park

CTO, Minos Technology, LLC

Better knowledge of computers and programming can save you analysis time. In this lecture, I suggest a list of computer best practices to make your research environment more efficient, based on my experience: 15 years in experimental high energy physics and 7 years in the software/hardware industry.

wget http://www.minos-tech.com/practice/20180423_SNU.pdf or pptx

Coding Principle (IMHO)

● Maintainability
  ○ Self-explaining variable names
  ○ Comments
  ○ Easy debugging
  ○ Exception handling
● Scalability
  ○ Modular programming
  ○ Automation: more work by computer, less work by human
● Usability
  ○ Usage or manual
  ○ Sensible definition of parameters
● Efficiency
  ○ Logical algorithm
● Developer Friendly Environment
  ○ ...

https://en.wikipedia.org/wiki/Best_coding_practices

Contents

● Workstation (Desktop/Laptop) environment
● Practical commands
● Bash script coding
● Python script coding
● Python Programming
● C/C++ Programming
● Database
● Web application
● Computer hardware resource optimization
● Parallel computing

Census

● YooLab or YangLab or any other?
● How many have a Desktop or Laptop for terminal?
● How many have Windows or Linux (Ubuntu, CentOS, Fedora, SUSE)?
● How many use VirtualBox or VMware Workstation?
● How many use vi(m), emacs, pico or nano?
● How many do NOT use ROOT?
● In ROOT, how many use PyROOT, compilation by ROOT, or g++ compilation?
● How many use a password to log in to a server?
● How many use more than 1 monitor?
● How many use ssh, sftp, scp, rsync, wget?
● How many use bash, sh, tcsh, csh, ksh, perl, php, ruby, python?
● Who plans to work in the lab after 7pm tonight?

Linux OR Windows?

No worries. What you need is a "command line interface".

● Linux: Desktop is recommended
  ○ CentOS 6 and 7 have a huge difference
  ○ Stick to CentOS 6 if you can
  ○ Any version would be fine though
● Windows: Cygwin is recommended (instead of putty)
  ○ Cygwin works well on any Windows version, including the latest Windows 10.

Login without password

Aren't you tired of typing your password over and over? If so, this could help:

● Create your ssh key pair:
  ○ # ssh-keygen
  ○ This command creates a key pair: ~/.ssh/id_rsa (private) and ~/.ssh/id_rsa.pub (public)
● Copy your public key to the remote server:
  ○ # ssh-copy-id -i ~/.ssh/id_rsa.pub minos@147.47.50.161
  ○ It will ask for your password one more time.
  ○ Your public key is then saved in ~/.ssh/authorized_keys on the remote server
  ○ No more password entry
● This can also be used to improve security (advanced users):
  ○ In my /etc/ssh/sshd_config, uncomment the line:
  ○ PasswordAuthentication no
  ○ Keep another terminal open on the same server while you do this, otherwise you could be locked out and need to reimage your server!
  ○ Then restart the sshd service:
  ○ # service sshd restart

Login without password: continued

● Advantage: shortcut to automation
  ○ Quick remote command line:
    ■ ssh minos@147.47.50.219 "ls -ltr"
  ○ Convenient for setting up crontab with rsync / scp

File transfer

● sftp
  ○ Once logged in, you can browse local directories as well as remote directories
  ○ Commands
    ■ Transfer: get, put
    ■ Remote directories: ls, cd, pwd, mkdir
    ■ Local directories: lls, lcd, lpwd, lmkdir
● scp
  ○ One-line command with explicit source and target paths
  ○ With passwordless login, easy to set up in crontab
● rsync
  ○ Transfers only the differences between source and target; very useful after an interrupted transfer
● wget
  ○ Asymmetric transfer (download only)
  ○ wget -O saves the file under a different name
  ○ wget -N downloads only if the file on the server is newer than the local copy
  ○ wget ftp://ftp.snu.edu/mydir/ -q -O destination_file_list.txt # without index.html you will see the full list of files; some http servers don't allow it
  ○ https://www.rosehosting.com/blog/wget-command-examples/

Entry to Command Line Interface: .bashrc

● The command line interface is the opposite of a graphical user interface (controlled mostly by mouse)
● You may not know that a bash script is always run when you log in or open an xterm in your desktop console:
  ○ When you log in with ssh, ~/.bash_profile is sourced, and it usually sources ~/.bashrc as well
  ○ When you open an xterm in the desktop console, ~/.bashrc is sourced
  ○ To source any bash file by hand:
  ○ # . filename
  ○ ~/.bashrc is the best file to put your environment variables

.bashrc: Configure your prompt

It's not uncommon to have many terminals open on many different servers. How do you distinguish them?

● "PS1" is the variable that configures your prompt.
● \u – Username
● \h – Hostname
● \w – Full path of the current working directory
● \t – Current time in 24-hour hh:mm:ss format
● \@ – Current time in 12-hour am/pm format
● # PS1="\u@\h [\t]> "
● # PS1="[\@] \u@\h> "
● # PS1="\u@\h \w> "
● Use color:
  ○ PS1="\[\e[0;34m\]\u@\h \w> \[\e[m\]"
  ○ Wrap color codes in \[ and \] so bash does not count them toward the prompt length

https://www.thegeekstuff.com/2008/09/bash-shell-ps1-10-examples-to-make-your-linux-prompt-like-angelina-jolie/

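.bashrc: history

● # for setting history length see HISTSIZE and HISTFILESIZE in bash(1)
  ○ HISTSIZE=1000
  ○ HISTFILESIZE=2000
  ○ Showing timestamps
    ■ HISTTIMEFORMAT="%d/%m/%y %T "
● The history command shows a number next to each command
● !N (run command number N)
● !N:p (print command number N without running it)
● !specificword (run the most recent command that starts with specificword)
● CTRL + P, !!, !-1, up arrow (previous command)
● CTRL + R (search backward through the history)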

https://www.thegeekstuff.com/2008/08/15-examples-to-master-linux-command-line-history/

.bashrc: alias

● alias so='. ~/.bashrc'
● alias ls='ls --color=auto'
● alias grep='grep --color=auto'
● alias egrep='egrep --color=auto' # grep -E
● ## a quick way to get out of current directory ##
  alias ..='cd ..'
  alias ...='cd ../../../'
  alias ....='cd ../../../../'
  alias .....='cd ../../../../'
  alias .4='cd ../../../../'
  alias .5='cd ../../../../..'
● Lengthy command(s) can be aliased. Note the semicolons (;):
  ○ # alias sync_killdisk='cp /mnt/data/dev/killdisk_driver.py /opt/projects/minos/bin/killdisk/ ; cp /mnt/data/dev/myKillDisk /opt/projects/minos/bin/killdisk/; cp /mnt/data/dev/myPoweroff /opt/projects/minos/bin/killdisk/'
● A subshell can be aliased:
  ○ See next page
● https://www.cyberciti.biz/tips/bash-aliases-mac-centos-linux-unix.html

.bashrc: function and subshell

wget http://www.minos-tech.com/practice/dotbashrc
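
As a minimal sketch of the two ideas in this slide title (not the contents of the dotbashrc file above): a shell function groups commands under one name and takes arguments, while a subshell, written with parentheses, runs commands in a child process so that changes such as cd do not leak into your terminal. The names mkcd and count_files are hypothetical.

```shell
# Function: create a directory and move into it in one step (hypothetical name).
mkcd() {
    mkdir -p "$1" && cd "$1"
}

# Function that uses a subshell: the cd inside ( ... ) does not change
# the working directory of the calling shell.
count_files() {
    ( cd "$1" && ls -1 | wc -l | tr -d ' ' )
}

workdir=$(mktemp -d)            # scratch area so the example is safe to run
start=$PWD
mkcd "$workdir/here/there"      # we are now inside here/there
touch a.txt b.txt
cd "$start"

count_files "$workdir/here/there"       # prints the number of files
after=$PWD                              # same as $start: the subshell's cd did not leak
n=$(count_files "$workdir/here/there")  # capture the count in a variable
echo "files: $n, still in: $after"
```

Functions like these can live in ~/.bashrc next to your aliases; unlike an alias, a function can place its arguments anywhere in the command.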

Bash Script: Less to revise when copying and pasting

● Multiple places to edit
● Error messages can escape the log file

● wget http://www.minos-tech.com/practice/single_param_example.sh
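
One way to address the points on this slide, as a sketch and not the content of single_param_example.sh: define each value once in a variable at the top, so that a copied script needs a single edit, and redirect stderr together with stdout so that error messages also land in the log. The variable names and the run_step function are hypothetical.

```shell
#!/bin/bash
# Everything that changes between copies of the script lives here.
SAMPLE="${1:-Wpos_munu}"                 # hypothetical dataset tag; first argument overrides it
OUTDIR="${TMPDIR:-/tmp}/out_${SAMPLE}"
LOG="${OUTDIR}/run_${SAMPLE}.log"

mkdir -p "$OUTDIR"

run_step() {
    # Stand-in for the real job: one line on stdout, one on stderr.
    echo "processing ${SAMPLE}"
    echo "warning: no input for ${SAMPLE}" >&2
}

# '> file 2>&1' sends stdout AND stderr to the log; with '> file' alone
# the warning would only appear on the terminal.
run_step > "$LOG" 2>&1

echo "log written to $LOG"
```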

Bash Script Parameter: String Vs. Number and color

Self-explanatory

Exception Handling

● Make use of color to draw attention
● Handle exceptions from the beginning

● wget http://www.minos-tech.com/practice/string_and_color.sh
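
A sketch of the ideas on this slide (it is not the content of string_and_color.sh): strings and numbers use different comparison operators in bash, bad input is rejected before any work starts, and color makes the error stand out. The function name check_args and its two parameters are hypothetical.

```shell
#!/bin/bash
RED='\033[0;31m'; GREEN='\033[0;32m'; NC='\033[0m'   # ANSI color codes

check_args() {
    mode="$1"; nevents="$2"
    # Strings are compared with = and != inside [ ].
    if [ "$mode" != "data" ] && [ "$mode" != "mc" ]; then
        printf "${RED}ERROR: mode must be 'data' or 'mc', got '%s'${NC}\n" "$mode" >&2
        return 1
    fi
    # Reject anything that is not a non-negative integer before comparing.
    case "$nevents" in
        ''|*[!0-9]*)
            printf "${RED}ERROR: nevents must be a number, got '%s'${NC}\n" "$nevents" >&2
            return 2 ;;
    esac
    # Numbers are compared with -eq, -lt, -gt, ...
    if [ "$nevents" -gt 1000 ]; then
        printf "${GREEN}OK: large run (%s events)${NC}\n" "$nevents"
    else
        printf "${GREEN}OK: small run (%s events)${NC}\n" "$nevents"
    fi
}

check_args mc 5000
check_args data ten || echo "exit code: $?"
```

Returning a distinct non-zero code for each kind of failure lets a calling script or a crontab job tell the cases apart.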

4/24 Start

● In case the remote server port is different from the default one (22)
  ○ # ssh-copy-id -i ~/.ssh/id_rsa.pub -p 4280 joonblee@ui10.sdfarm.kr
● STL vector vs list
  ○ https://stackoverflow.com/questions/2209224/vector-vs-list-in-stl
● Performance difference between tamsa2 and prime

cd

● cd
● cd ..
● cd -
● pushd /scratch/
● …..
● popd
● pwd
● mkdir here
● mkdir there
● mkdir -p here/there

ls

# ls -ltrhd */

# -l : show detail

# -t : sort by time

# -S : sort by size desc

# -r : reverse order

# -h : human readable

# -d : list directories themselves, not their contents

Sar: Performance monitoring

● CPU: sar
● Disk I/O: sar -d -p
● Network: sar -n DEV
● Memory: sar -r
● Certain day: sar -f /var/log/sa/saNN

https://www.linuxtechi.com/generate-cpu-memory-io-report-sar-command/

Continued from p16~19: Number vs Word

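root: + vs ++ vs nothing (appended to a macro name, + compiles the macro with ACLiC only when needed, ++ forces recompilation, and no suffix runs it through the interpreter)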

wget http://www.minos-tech.com/practice/run_BtoF.py

● for i in `ps aux | grep BtoF | awk '{print $2}'`; do echo $i; done
● for i in `ps aux | grep BtoF | awk '{print $2}'`; do kill -9 $i; done

Fewer files to manage

● Similar definitions of TFile/TH1F objects
● Indentation
● Two pieces of code in different programming languages: C and bash
● Y range could be out of the window
● Lengthy line (the first one)

Fewer files to manage (II)

Dynamic Y-axis range

● wget http://www.minos-tech.com/practice/make_plots.C

Another Array Example

MuonSegment

MDTSegment

TGCSegment

CSCSegment

RPCSegment

Revised:

TFile *f[7];
for (int i=0; i<7; i++) {
  f[i] = new TFile(Form("ROOTFile_Wpos_munu_v2_metcut_%d.root", i+1));
}

TFile *ft   = new TFile("ROOTFile_Wpos_munu_v2_metcut_12.root");
TFile *ftB  = new TFile("ROOTFile_Wpos_munu_v2_metcut_13.root");
TFile *fW   = new TFile("ROOTFile_Wpos_munu_v2_Wext_16.root");
TFile *fDY  = new TFile("ROOTFile_Wpos_munu_v2_metcut_11.root");
TFile *fVV  = new TFile("ROOTFile_Wpos_munu_v2_metcut_15.root");
TFile *fQCD = new TFile("ROOTFile_Wpos_munu_v2_metcut_17.root");
TFile *ftau = new TFile("ROOTFile_Wpos_munu_v2_metcut_18.root");

TH1D* h[7];
for (int i=0; i<7; i++) {
  h[i] = (TH1D*) f[i]->Get("h_"+type+"Data");
  if (i>0) { h[0]->Add(h[i]); }   // accumulate all data histograms into h[0]
}
h[0]->Rebin(bin);   // h is an array, so rebin the summed histogram h[0]

diff / md5sum

● diff compares local files
● https://en.wikipedia.org/wiki/Md5sum
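
diff needs both files on the same machine. To check that a transferred file is identical to its source, one common approach is to compare MD5 checksums computed on each side. The sketch below does this locally with temporary files; the file names and contents are made up for the example.

```shell
tmp=$(mktemp -d)
printf 'run 1234\nevents 5000\n' > "$tmp/original.txt"
cp "$tmp/original.txt" "$tmp/copy.txt"
printf 'run 1234\nevents 5001\n' > "$tmp/modified.txt"

# md5sum prints "<checksum>  <filename>"; keep only the checksum.
sum_orig=$(md5sum "$tmp/original.txt" | awk '{print $1}')
sum_copy=$(md5sum "$tmp/copy.txt"     | awk '{print $1}')
sum_mod=$(md5sum  "$tmp/modified.txt" | awk '{print $1}')

if [ "$sum_orig" = "$sum_copy" ]; then echo "copy.txt is identical"; fi
if [ "$sum_orig" != "$sum_mod" ]; then echo "modified.txt differs"; fi

# diff shows WHERE two local files differ; its exit status is 1 when they differ.
diff "$tmp/original.txt" "$tmp/modified.txt" || true
```

For a file on another machine, the same checksum can be obtained by running md5sum there through ssh and comparing the two strings by eye or in a script.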

grep / sed

● $ wget http://www.minos-tech.com/practice/snu.txt
● $ grep Seoul snu.txt
● $ grep -n Seoul snu.txt # show line numbers
● $ grep -n At snu.txt
● $ grep -ni At snu.txt # case insensitive
● $ grep -n At snu.txt -A 1 -B 2 # show 1 line after and 2 lines before each match
● $ grep -E 'Seoul|At' snu.txt # OR
● $ grep -n At snu.txt -A 1 -B 2 | grep -v At # NOT

● $ sed s/Seoul/Chonbuk/ snu.txt # display only; replaces the first match on each line
● $ sed -i s/Seoul/Chonbuk/ snu.txt # in place
● $ sed s/Seoul/Chonbuk/ snu.txt > chonbuk.txt

● https://www.linuxtechi.com/20-sed-command-examples-linux-users/

awk

● $ wget http://www.minos-tech.com/practice/numbers.doc http://www.minos-tech.com/practice/numbers.csv
● $ awk '{print $1,$2,$3}' numbers.csv
● $ awk -F, '{print $1,$2,$3}' numbers.doc
● $ awk -F, '{print $1, $NF}' numbers.csv
● $ awk -F, '{sum+=$1} END {print sum}' numbers.csv
  -4.44089e-16
● $ awk -F, '{sum+=$1} END {print int(sum+0.5)}' numbers.csv
  0
● $ awk -F, '{sum+=$2; n++} END {if (n>0) print sum/n}' numbers.csv
  2.03333
● $ awk -F, '{sum+=$2} END {if (NR>0) print sum/NR}' numbers.csv
  2.03333
● Built-in variables:
  ○ NF is the number of fields in the current record (3 in this case).
  ○ NR is the number of records read so far; in an END block it is the total number of records in the input.
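
The same sum and mean patterns can be tried without downloading anything. The sketch below feeds awk a small made-up comma-separated table (not the contents of numbers.csv) so that the results are easy to check by hand.

```shell
# Three records, three comma-separated fields each (made-up values).
data='1,2.0,a
2,3.0,b
3,4.0,c'

# Sum of column 1.
total=$(printf '%s\n' "$data" | awk -F, '{sum+=$1} END {print sum}')

# Mean of column 2; NR in the END block is the total number of records read.
mean=$(printf '%s\n' "$data" | awk -F, '{sum+=$2} END {if (NR>0) print sum/NR}')

# First and last field of each record; NF is the field count of the current record.
printf '%s\n' "$data" | awk -F, '{print $1, $NF}'

echo "total=$total mean=$mean"
```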

https://likegeeks.com/awk-command/

find / ping

● $ find . -name index.html
● $ find . -name 'i*controller.erb'

● ping -c 1 -W 1 192.168.0.25

[root@prime minos]# grep 147 /etc/hosts.allow | grep -v \# | uniq | wc
     29      58     601
[root@prime minos]# grep 147 /etc/hosts.allow | grep -v \# | sort | uniq | wc
     26      52     540
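
The two counts differ because uniq only collapses duplicate lines that are adjacent; sorting first brings all duplicates together. A small made-up input shows the effect:

```shell
# The first and third lines are duplicates, but they are not adjacent.
hosts='147.47.50.161
147.47.50.219
147.47.50.161'

unsorted=$(printf '%s\n' "$hosts" | uniq | wc -l | tr -d ' ')
sorted=$(printf '%s\n' "$hosts" | sort | uniq | wc -l | tr -d ' ')

echo "uniq alone: $unsorted lines, sort | uniq: $sorted lines"
```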

cat tac which whoami pwd

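crontab

● crontab -e (edit your crontab)
● crontab -l (list your crontab)
● 0 2 * * * /bin/sh backup.sh (every day at 02:00)
● 0 5,17 * * * /scripts/script.sh (every day at 05:00 and 17:00)
● * * * * * /scripts/script.sh (every minute)
● 0 17 * * sun /scripts/script.sh (every Sunday at 17:00)
● */10 * * * * /scripts/monitor.sh (every 10 minutes)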

● https://tecadmin.net/crontab-in-linux-with-20-examples-of-cron-schedule/

pushd / popd

My example

Gateway machine: 10.10.10.1

Bridge machine: 10.10.10.2 / 10.10.11.2

Internal systems: 10.10.11.XXX

Want to check they are alive:

alias pings='ssh root@10.10.10.2 '\''/root/pings'\'''

for i in `cat ips.list`; do
  SPACE=' ===> '
  if [ "$i" != 'FILLER' ]; then
    RET=`ping -c 1 -W 1 $i | grep -E '1 received,'`
  else
    RET=''; SPACE=''
  fi
  echo $i$SPACE$RET
done

https://stackoverflow.com/questions/14370133/is-there-a-way-to-create-key-value-pairs-in-bash-script

ips.list:

10.10.11.25
10.10.11.20
10.10.11.12
10.10.11.16
10.10.11.46
10.10.11.48
FILLER
10.10.11.22
10.10.11.35
10.10.11.40
10.10.11.27
10.10.11.9
10.10.11.42
10.10.11.44
FILLER
10.10.11.54
10.10.11.52
10.10.11.50
10.10.11.36
10.10.11.37
10.10.11.38

Output:

10.10.11.25 ===>
10.10.11.20 ===>
10.10.11.12 ===>
10.10.11.16 ===> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
10.10.11.46 ===> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
10.10.11.48 ===> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
FILLER
10.10.11.22 ===>
10.10.11.35 ===>
10.10.11.40 ===>
10.10.11.27 ===>
10.10.11.9 ===>
10.10.11.42 ===>
10.10.11.44 ===>
FILLER
10.10.11.54 ===>
10.10.11.52 ===>
10.10.11.50 ===>
10.10.11.36 ===>
10.10.11.37 ===>
10.10.11.38 ===>

Create your own use case

Example:

(1) When I log in, I like to see all the CMS versions installed on this machine
(2) I like to know the last working directory I was in
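
A sketch for use case (2), assuming you are happy to keep the last directory in a small file in your home directory. The file name ~/.last_dir and the function names save_dir and last_dir are hypothetical; the functions would go in ~/.bashrc.

```shell
LAST_DIR_FILE="${LAST_DIR_FILE:-$HOME/.last_dir}"   # hypothetical location

# Record the current directory; call it by hand or from ~/.bash_logout.
save_dir() {
    pwd > "$LAST_DIR_FILE"
}

# Jump back to the recorded directory if it still exists.
last_dir() {
    if [ -f "$LAST_DIR_FILE" ] && [ -d "$(cat "$LAST_DIR_FILE")" ]; then
        cd "$(cat "$LAST_DIR_FILE")" && pwd
    else
        echo "no saved directory" >&2
        return 1
    fi
}
```

With save_dir called at logout, typing last_dir after the next login returns you to where you stopped working.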

Ntuple generation many times….(?) hierarchy

Game of log file

Samba CIFS

● $ yum -y install samba cifs-utils # to be samba server and client

sshd: 192.168.1. #LAN
sshd: 24.168.196.71 #WP HOME
sshd: 104.55.125.21 #legrand
sshd: 98.25.231.108 # MRB
sshd: 1.235.72.35 # Suwon

sshd: 147.47.50.161 #SNU muon server
sshd: 147.47.50.219 #SNU prime server
