Introduction
Installing a node
Installing software
Managing modules
Using modules
Compiling software
Running a basic MPI job
Other major changes from the previous architecture
OS & Software Upgrade
on the EECS Research Clusters
Daniel Andrzejewski
Electrical Engineering & Computer Science Department
University of Tennessee
December 3, 2008
Itstaff EECS Clusters
Outline
1 Introduction
- What this presentation is about
- Reasons for the upgrade
- Basic clusters infrastructure
2 Installing a node
- Installing a node
- Some of the important files
- Example
3 Installing software
4 Managing modules
5 Using modules
6 Compiling software
7 Running a basic MPI job
8 Other major changes from the previous architecture
What this presentation is about
Installing a node
Installing software
Managing modules
Using modules to compile software
Running an MPI job
Introduction
Why the OS upgrade
- too much time spent on managing software
Why CentOS
- it matches our server infrastructure
Basic clusters infrastructure
affected systems:
greedo - cfengine server
slyder - package, tftp, proxy server
unaffected systems:
mira, hanharr, dengar - LDAP
alex, kril, shire - NFS
dns - DNS, DHCP
zam - NTP, Nagios, Syslog
Installing a node
kickstart
PXE booting
How it is set up
an entry in the DHCP server points to slyder
tftp server on slyder
don't forget to delete cfengine keys on greedo
example
greedo:~# ./delete.keys.sh frodotest2
removed /var/cfengine/ppkeys/root-172.16.0.2.pub
Some of the important files
/tftpboot
/tftpboot/pxelinux.cfg/frodo-head
/tftpboot/pxelinux.cfg/frodo-compute
/tftpboot/pxelinux.cfg/frodo-compile
/export/kickstart/centos - CentOS 5.2 repository
/export/kickstart/centos/ks.frodo-head.cfg
/export/kickstart/centos/ks.frodo-compute.cfg
/export/kickstart/centos/ks.frodo-compile.cfg
Example
slyder /tftpboot/pxelinux.cfg> ./gethostip frodo2
frodo2.sinrg.local 172.16.0.2 AC100002
slyder /tftpboot/pxelinux.cfg> ls -l AC100002
lrwxrwxrwx 1 root root 13 Oct 17 11:40 AC100002 -> frodo-compute
slyder /tftpboot/pxelinux.cfg> cat frodo-compute
default 0
timeout 50
prompt 1
display msgs/boot.frodo-compute.msg
F1 msgs/boot.frodo-compute.msg
F2 msgs/general.msg
F3 msgs/expert.msg
F4 msgs/param.msg
F5 msgs/rescue.msg
F7 msgs/snake.msg
label 0
kernel centos5.2/vmlinuz
append initrd=centos5.2/initrd.img ramdisk_size=6801 ks=nfs:172.16.0.72://export/kickstart/centos/ks.frodo-compute.cfg ksdevice=eth0
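The link name AC100002 in the example is the node's IPv4 address (172.16.0.2) written as eight uppercase hexadecimal digits, which is the per-host file name pxelinux looks for under pxelinux.cfg. As a rough sketch of that conversion, assuming only a POSIX shell, the same value can be computed without gethostip. The function name ip_to_pxe_hex is invented for this illustration and is not part of syslinux.

```shell
#!/bin/sh
# ip_to_pxe_hex ADDR: hypothetical helper, not part of syslinux.
# Prints a dotted-quad IPv4 address as eight uppercase hex digits.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_pxe_hex() (
    IFS=.
    # Split ADDR on dots into the four positional parameters.
    set -- $1
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
)

ip_to_pxe_hex 172.16.0.2   # prints AC100002
```

A node could then be assigned a boot profile with a link such as AC100002 -> frodo-compute, as in the ls -l output above.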
Installing software from an admin standpoint
Two ways:
using package management (yum, rpm)
[root@frodo-head]# dsh -g frodo yum -y install lapack-3.1.1-1.el5.rf
[root@frodo-head]# dsh -g frodo rpm -qa | grep lapack
compiling from source and installing under /pkgs
- what if several versions must coexist (e.g. mpicc from both mpich and openmpi)?
- set the prefix to /pkgs/your-software-x.y.z, where x.y.z is the version
- after installation, create a directory in /usr/local and duplicate the above directory tree in it using symbolic links
example:
mkdir /usr/local/mpich
graft -i -t /usr/local/mpich /pkgs/mpich-1.2.7..2
- add a module (only if you created a new directory under /usr/local)
Install the software on the package server, then update the nodes from the head node using cfagent
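To make the symbolic-link step concrete, here is a minimal sketch of the idea behind it: every directory of the versioned install is recreated under the target, and every file becomes a link back into /pkgs. It is a simplified stand-in written for illustration, not graft itself, and the function name link_tree is hypothetical.

```shell
#!/bin/sh
# link_tree SRC DST: hypothetical, simplified stand-in for "graft -i".
# Mirrors SRC's directory tree under DST and symlinks every file.
# The body runs in a subshell, so "cd" and "exit" stay local to it.
link_tree() (
    src=$(cd "$1" && pwd) || exit 1
    mkdir -p "$2" || exit 1
    dst=$(cd "$2" && pwd) || exit 1
    cd "$src" || exit 1
    # Recreate the directories first so the links have a place to live.
    find . -type d | while IFS= read -r d; do
        mkdir -p "$dst/$d"
    done
    # Link each non-directory entry back to the versioned install.
    find . ! -type d | while IFS= read -r f; do
        ln -sf "$src/${f#./}" "$dst/${f#./}"
    done
)
```

Called as link_tree /pkgs/mpich-1.2.7..2 /usr/local/mpich, this would leave /usr/local/mpich/bin/mpicc pointing into the versioned directory, so switching versions only means re-linking.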
Managing modules
Modules provide an easy mechanism for updating a user's environment
How to add a module
create a file in /pkgs/Modules/modulefiles
example:
slyder ~> cat /pkgs/Modules/modulefiles/mpich-1.2.7p1
proc ModulesHelp { } {
puts stderr "Sets up environment to use mpich."
}
module-whatis "adds mpich paths to your environment variables"
set apppath /usr/local/mpich
prepend-path PATH $apppath/bin
prepend-path LD_LIBRARY_PATH $apppath/lib
prepend-path LIBRARY_PATH $apppath/lib
prepend-path MANPATH $apppath/man
append-path MANPATH /usr/local/man:/usr/share/man
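For readers unfamiliar with modulefiles, the effect of the prepend-path lines can be approximated in plain shell. The sketch below is a simplified illustration, not how the Modules package is implemented: it skips the duplicate handling and the unload bookkeeping that Modules provides, and the function name prepend_path is made up for this example.

```shell
#!/bin/sh
# prepend_path VAR DIR: hypothetical, simplified equivalent of the
# modulefile command "prepend-path VAR DIR" (no duplicate removal,
# no support for unloading).
prepend_path() {
    # Read the current value of the variable whose name is in $1.
    eval "_pp_cur=\${$1-}"
    if [ -n "$_pp_cur" ]; then
        eval "$1=\$2:\$_pp_cur"
    else
        eval "$1=\$2"
    fi
    export "$1"
    unset _pp_cur
}

# Roughly what loading the mpich module above does to the environment.
apppath=/usr/local/mpich
prepend_path PATH "$apppath/bin"
prepend_path LD_LIBRARY_PATH "$apppath/lib"
```

The advantage of a modulefile over a snippet like this is that module unload can reverse the changes.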
Using modules
Example
frodo2> source /usr/local/modules/3.2.6/init/tcsh
frodo2> module avail
/pkgs/Modules/modulefiles
j2sdk1.4.2_18 modules mpich-mx-1.2.7..7 mpich2-mx-1.0.7..2 openmpi-1.2.8
module-info mpich-1.2.7p1 mpich2-1.0.8 mx-1.2.7 openmpi-1.2.8-mx
frodo2> module load openmpi-1.2.8-mx
frodo2> module list
Currently Loaded Modulefiles:
1) mx-1.2.7 2) openmpi-1.2.8-mx
frodo2> which mpicc
/usr/local/openmpi-mx/bin/mpicc
Compiling software from a user standpoint
frodo-head> ssh frodo-compile
frodo-compile> source /usr/local/modules/3.2.6/init/tcsh
frodo-compile> module load openmpi-1.2.8-mx
frodo-compile> mpicc -o hello hello.c
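The slides do not show hello.c. As an assumption about what such a test program might contain, the following shell sketch writes a minimal MPI "hello" source file using only standard MPI calls. The helper name write_hello_c and the message text are invented for this illustration.

```shell
#!/bin/sh
# write_hello_c [FILE]: hypothetical helper that writes a minimal MPI
# test program to FILE (default: hello.c). The program is an assumed
# stand-in for the hello.c used in the slides.
write_hello_c() {
    cat > "${1:-hello.c}" <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total process count */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
}
```

After calling write_hello_c on frodo-compile, the mpicc command above would build it against whichever MPI module is currently loaded.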
Running a basic MPI job
interactive
qsub -I -l nodes=64:ppn=2
source /usr/local/modules/3.2.6/init/tcsh
module load openmpi-1.2.8-mx
mpirun -np 64 hello
batch
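example of a batch script hello.sh
#!/bin/bash
#PBS -l nodes=64:ppn=1,pmem=1gb
# count the lines in the node file to get the number of MPI processes
NODES=$(wc -l < "$PBS_NODEFILE")
# a bash script must source the bash init file, not the tcsh one
. /usr/local/modules/3.2.6/init/bash
module load openmpi-1.2.8-mx
mpirun -np "$NODES" "$PBS_O_WORKDIR/hello"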
command to submit a job
qsub hello.sh
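The process count in the batch script comes from $PBS_NODEFILE, which under Torque/PBS normally holds one host name per allocated processor slot, so a request with ppn=2 lists each host twice. The sketch below shows how slots and distinct hosts could be counted separately. Both function names are hypothetical, and the one-line-per-slot layout is an assumption about the scheduler's default behaviour.

```shell
#!/bin/sh
# count_slots FILE: number of lines (processor slots) in a node file.
# The arithmetic expansion strips any padding that wc adds.
count_slots() {
    echo $(( $(wc -l < "$1") ))
}

# count_nodes FILE: number of distinct host names in a node file.
count_nodes() {
    echo $(( $(sort -u "$1" | wc -l) ))
}
```

Inside a job, mpirun -np "$(count_slots "$PBS_NODEFILE")" would start one process per slot, while count_nodes gives the number of physical machines.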
Other major changes from the previous architecture
- proxy server: yum updates can be done on the compute nodes (private LAN)
- mpich2 with mx drivers