clusters


Introduction

Installing a node

Installing software

Managing modules

Using modules

Compiling software

Running a basic MPI job

Other major changes from the previous architecture

OS & Software Upgrade on the EECS Research Clusters

Daniel Andrzejewski

Electrical Engineering & Computer Science Department

University of Tennessee

December 3, 2008

Itstaff

EECS Clusters


Outline

1. Introduction
   - What this presentation is about
   - Reasons for the upgrade
   - Basic clusters infrastructure
2. Installing a node
   - Installing a node
   - Some of the important files
   - Example
3. Installing software
4. Managing modules
5. Using modules
6. Compiling software
7. Running a basic MPI job
8. Other major changes from the previous architecture


What this presentation is about

Installing a node

Installing software

Managing modules

Using modules to compile software

Running an MPI job


Introduction

Why the OS upgrade
- too much time spent on managing software

Why CentOS
- it matches our server infrastructure


Basic clusters infrastructure

affected systems:

greedo - cfengine server

slyder - package, tftp, proxy server

unaffected systems:

mira, hanharr, dengar - LDAP

alex, kril, shire - NFS

dns - DNS, DHCP

zam - NTP, Nagios, Syslog


Installing a node


kickstart

PXE booting

How it is set up

an entry in the DHCP server points PXE clients to slyder
the tftp server runs on slyder
remember to delete the node's old cfengine keys on greedo

example

greedo:~# ./delete.keys.sh frodotest2
removed /var/cfengine/ppkeys/root-172.16.0.2.pub
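
The slides do not show what delete.keys.sh contains. Judging only from its output above, it removes the stored public key file named root-<ip>.pub. A minimal sketch of that step, assuming this naming scheme and using a made-up function name (delete_cfengine_key), might look like this:

```shell
#!/bin/bash
# delete_cfengine_key KEYDIR IP
# Remove the stored cfengine public key for root@IP from KEYDIR.
# Hypothetical stand-in for delete.keys.sh; the real script is not shown
# in the slides, only the file name pattern root-<ip>.pub is.
delete_cfengine_key() {
    local keyfile
    keyfile="$1/root-$2.pub"
    if [ -f "$keyfile" ]; then
        rm -f "$keyfile" && echo "removed $keyfile"
    else
        echo "no key found at $keyfile" >&2
        return 1
    fi
}

# Usage on the cfengine server (not executed here):
#   delete_cfengine_key /var/cfengine/ppkeys 172.16.0.2
```

Taking the key directory as an argument keeps the function easy to try against a scratch directory before pointing it at /var/cfengine/ppkeys.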


Some of the important files

/tftpboot

/tftpboot/pxelinux.cfg/frodo-head

/tftpboot/pxelinux.cfg/frodo-compute

/tftpboot/pxelinux.cfg/frodo-compile

/export/kickstart/centos - CentOS 5.2 repository

/export/kickstart/centos/ks.frodo-head.cfg

/export/kickstart/centos/ks.frodo-compute.cfg

/export/kickstart/centos/ks.frodo-compile.cfg


Example

slyder /tftpboot/pxelinux.cfg> ./gethostip frodo2

frodo2.sinrg.local 172.16.0.2 AC100002

slyder /tftpboot/pxelinux.cfg> ls -l AC100002

lrwxrwxrwx 1 root root 13 Oct 17 11:40 AC100002 -> frodo-compute

slyder /tftpboot/pxelinux.cfg> cat frodo-compute

default 0
timeout 50
prompt 1
display msgs/boot.frodo-compute.msg
F1 msgs/boot.frodo-compute.msg
F2 msgs/general.msg
F3 msgs/expert.msg
F4 msgs/param.msg
F5 msgs/rescue.msg
F7 msgs/snake.msg

label 0
  kernel centos5.2/vmlinuz
  append initrd=centos5.2/initrd.img ramdisk_size=6801 ks=nfs:172.16.0.72://export/kickstart/centos/ks.frodo-compute.cfg ksdevice=eth0
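
The link name AC100002 in the example above is the node's IPv4 address written as eight uppercase hexadecimal digits, which is one of the file names pxelinux looks for under pxelinux.cfg. A minimal shell sketch of that conversion follows; the function name ip_to_hex is made up for illustration and is not part of gethostip.

```shell
#!/bin/bash
# ip_to_hex A.B.C.D
# Print a dotted-quad IPv4 address as 8 uppercase hex digits, the form
# used for per-host file names under pxelinux.cfg/.
# Illustrative helper only; gethostip itself does more (name lookup).
ip_to_hex() {
    local IFS=.
    # Split the address on dots into the four positional parameters.
    set -- $1
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_hex 172.16.0.2
```

Run against 172.16.0.2, this prints the same AC100002 that gethostip reports for frodo2.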


Installing software from an admin standpoint

Two ways:

using package management (yum, rpm)

[root@frodo-head]# dsh -g frodo 'yum -y install lapack-3.1.1-1.el5.rf'
[root@frodo-head]# dsh -g frodo 'rpm -qa | grep lapack'

compiling from scratch and putting in /pkgs

what if other versions exist (e.g. mpicc from both mpich and openmpi)?

set the prefix to /pkgs/your-software-x.y.z, where x.y.z is the version
after installation, create a directory in /usr/local and duplicate the above directory tree using symbolic links

example:

mkdir /usr/local/mpich
graft -i -t /usr/local/mpich /pkgs/mpich-1.2.7..2

add a module (only if you created a new directory under /usr/local)

Install software on the package server, then update the nodes from the head node using cfagent
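
To make the "duplicate the directory tree using symbolic links" step concrete, here is a rough shell sketch of the idea. It is not how graft is implemented and skips graft's conflict handling; the function name link_tree is made up for illustration.

```shell
#!/bin/bash
# link_tree SRC DST
# Recreate SRC's directory layout under DST and symlink every regular
# file back to its original in SRC.
# Hypothetical helper for illustration; graft is the tool the slides use.
link_tree() {
    local src dst rel
    src=$(cd "$1" && pwd) || return 1
    mkdir -p "$2" || return 1
    dst=$(cd "$2" && pwd) || return 1
    # Create the directory skeleton first.
    (cd "$src" && find . -type d) | while IFS= read -r rel; do
        mkdir -p "$dst/$rel"
    done
    # Then link each file to its absolute path in the package tree.
    (cd "$src" && find . -type f) | while IFS= read -r rel; do
        ln -sf "$src/${rel#./}" "$dst/${rel#./}"
    done
}

# Usage mirroring the graft example (not executed here):
#   link_tree /pkgs/mpich-1.2.7..2 /usr/local/mpich
```

Because only links live under /usr/local, several versions can sit side by side in /pkgs and switching between them means relinking rather than reinstalling.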


Managing modules


Modules provide an easy mechanism for updating a user’s environment

How to add a module
create a file in /pkgs/Modules/modulefiles

example:

slyder ~> cat /pkgs/Modules/modulefiles/mpich-1.2.7p1
proc ModulesHelp { } {
    puts stderr "Sets up environment to use mpich."
}
module-whatis "adds mpich paths to your environment variables"
set apppath /usr/local/mpich

prepend-path PATH $apppath/bin
prepend-path LD_LIBRARY_PATH $apppath/lib
prepend-path LIBRARY_PATH $apppath/lib
prepend-path MANPATH $apppath/man
append-path MANPATH /usr/local/man:/usr/share/man


Using modules


Example

frodo2> source /usr/local/modules/3.2.6/init/tcsh

frodo2> module avail

------------------------ /pkgs/Modules/modulefiles ------------------------
j2sdk1.4.2_18  modules        mpich-mx-1.2.7..7  mpich2-mx-1.0.7..2  openmpi-1.2.8
module-info    mpich-1.2.7p1  mpich2-1.0.8       mx-1.2.7            openmpi-1.2.8-mx

frodo2> module load openmpi-1.2.8-mx

frodo2> module list

Currently Loaded Modulefiles:
  1) mx-1.2.7   2) openmpi-1.2.8-mx

frodo2> which mpicc

/usr/local/openmpi-mx/bin/mpicc


Compiling software from a user standpoint

frodo-head> ssh frodo-compile

frodo-compile> source /usr/local/modules/3.2.6/init/tcsh

frodo-compile> module load openmpi-1.2.8-mx

frodo-compile> mpicc -o hello hello.c


Running a basic MPI job


interactive

qsub -I -l nodes=64:ppn=2
source /usr/local/modules/3.2.6/init/tcsh
module load openmpi-1.2.8-mx
mpirun -np 64 hello

batch

example of a batch script hello.sh

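#!/bin/bash
#PBS -l nodes=64:ppn=1,pmem=1gb

# number of allocated processor slots (one line each in the node file)
NODES=$(wc -l < "$PBS_NODEFILE")

# this is a bash script, so load the bash flavour of the modules init file
. /usr/local/modules/3.2.6/init/bash
module load openmpi-1.2.8-mx

mpirun -np "$NODES" "$PBS_O_WORKDIR/hello"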

command to submit a job

qsub hello.sh
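
The batch script sizes the MPI run from $PBS_NODEFILE, which lists one host name per allocated processor slot, so a host appears once for every ppn granted on it. The sketch below shows the difference between counting slots and counting distinct hosts; the function names are made up for illustration.

```shell
#!/bin/bash
# count_slots FILE: number of processor slots (lines) in a PBS node file.
count_slots() {
    wc -l < "$1" | tr -d ' '
}

# count_hosts FILE: number of distinct hosts in a PBS node file.
count_hosts() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Usage inside a job (not executed here):
#   mpirun -np "$(count_slots "$PBS_NODEFILE")" "$PBS_O_WORKDIR/hello"
```

With nodes=64:ppn=1 the two counts are equal; with ppn=2 the slot count is twice the host count, which matters when choosing the value passed to mpirun -np.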


Other major changes from the previous architecture

proxy server - yum updates can be done on the compute nodes (private LAN)

mpich2 with mx drivers
