Revision as of 23:05, 3 March 2013

Useful Commands

A script created by Nick Luciano that starts Scale jobs through TORQUE/Maui with a user-defined delay between jobs. Staggering the start times is useful because it spreads out the load on the file server.

#!/bin/bash

if [ $# -ne 1 ]
then
	echo "File name argument expected"
        exit 1
fi
nameString=$1

zero=0
oneMinute=1
minsPerHour=60
lastMin=59
anHour=100
tomorrow=2400
midnight=0000
# number of minutes from now to start first job 0..59
offSet=0
# number of minutes between jobs 0..59
deltaMins=15

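# 10# forces base 10; otherwise values such as 08 or 0900 are rejected as invalid octal
thisHour=$((10#$(date +%H) * anHour))
thisMin=$((10#$(date +%M)))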
startDir=`pwd`

if [ "$offSet" -le "$zero" ]; then
# offSet at least oneMinute so the first job doesn't queue until tomorrow.
   offSet=$oneMinute
fi
thisMin=$((thisMin + offSet))
if [ "$thisMin" -gt "$lastMin" ]; then
   # roll the extra minutes over into the next hour
   thisMin=$((thisMin - (lastMin + oneMinute)))
   thisHour=$((thisHour + anHour))
   if [ "$thisHour" -ge "$tomorrow" ]; then
      thisHour=$midnight
   fi
fi

startTime=$((thisHour + thisMin))

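# note: the unquoted find output is split on whitespace, so paths must not contain spaces
for pathname in $(find . -name "$nameString"); do
    dir=$(dirname "${pathname}")
    file=$(basename "${pathname}")
    # submit from the job's own directory
    cd "$dir" || continue
    command=$(printf "qsub -a %04d %s" "$startTime" "$file")
    echo "Now executing: $command"
    $command
    cd "$startDir" || exit 1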
    startTime=$((startTime + deltaMins))
    # minutes past the top of thisHour, minus 60; a non-negative value means the hour rolled over
    remainder=$((startTime - thisHour - minsPerHour))
    if [ "$remainder" -ge "$zero"  ]; then
	thisHour=$((thisHour + anHour))
	if [ "$thisHour" -ge "$tomorrow" ]; then
		thisHour=$midnight
	fi
	startTime=$((thisHour + remainder))
    fi
done
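
The hour rollover in the loop above can also be handled by converting to minutes since midnight and back to the hhmm form passed to qsub -a. The following is a minimal sketch of that alternative and is not part of the original script; the function name add_minutes is hypothetical.

```bash
#!/bin/bash

# add_minutes HHMM DELTA
# Hypothetical helper: prints the hhmm time DELTA minutes after HHMM,
# wrapping past midnight. 10# forces base 10 so that inputs such as
# 0845 or 0059 are not rejected as invalid octal numbers.
add_minutes() {
    local hhmm=$((10#$1))
    local delta=$((10#$2))
    # minutes since midnight, plus the requested delay
    local total=$(( (hhmm / 100) * 60 + hhmm % 100 + delta ))
    total=$(( total % 1440 ))    # 1440 minutes per day
    printf "%04d\n" $(( (total / 60) * 100 + total % 60 ))
}
```

Inside the loop, the update would then reduce to something like startTime=$(add_minutes "$startTime" "$deltaMins"), assuming startTime always holds an hhmm value.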

The following is obsolete - use the batch system TORQUE/Maui

To list processes you run on the cluster nodes, run this command on the head node:

ListMyProcesses

The following command uses gstat to get a list of nodes by load. It then sorts the list by load/free CPUs and connects you to the node with the most free CPUs. The format below is an alias that you can put in your .bashrc file if you want it to be automatically applied to your environment.

 alias fss='ssh `gstat -1a -i necluster|grep node|sort -gr -k2|sort -k13|sort -k11|head -n1|cut -f1 -d" "`'

Get cluster load information from Ganglia in a terminal:

gstat -p8649 -1a -i necluster

As above, but add the user and system loads together and sort on the sum. The least loaded nodes are shown first:

gstat -p8649 -1a -i necluster | grep node | awk '{print $11+$13"\t"$1;}' | sort -g
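
To make the field arithmetic in the pipeline above easier to reuse, here is a minimal sketch that applies the same grep, awk, and sort steps to text on standard input. It assumes, as the command above does, that field 1 is the node name and that fields 11 and 13 hold the user and system load. The function name rank_nodes_by_load is hypothetical.

```bash
#!/bin/bash

# rank_nodes_by_load
# Hypothetical wrapper: reads gstat-style lines on stdin and prints
# "<user+system load><TAB><node name>", least loaded node first.
# Assumes field 1 is the node name and fields 11 and 13 are the loads.
rank_nodes_by_load() {
    grep node | awk '{ print $11 + $13 "\t" $1 }' | sort -g
}
```

Under those assumptions, the name of the least loaded node could be picked with gstat -p8649 -1a -i necluster | rank_nodes_by_load | head -n1 | cut -f2.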

To list the unloaded nodes, run this command on the head node:

FindFreeNodes

Other tips

Temperature monitoring: http://nec549362.engr.utk.edu/cgi-bin/temp.cgi


Difference between revisions of "Useful Commands"
