Difference between revisions of "Useful Commands"
Revision as of 21:06, 26 February 2013
Useful Commands
- To list the processes you are running on the cluster nodes, run this command on the head node:
ListMyProcesses
- The following command uses gstat to list the nodes by load, sorts that list by load and free CPUs, and connects you over ssh to the node with the most free CPUs. It is written as an alias that you can put in your .bashrc file so that it is defined in every new shell.
alias fss='ssh `gstat -1a -i necluster|grep node|sort -gr -k2|sort -k13|sort -k11|head -n1|cut -f1 -d" "`'
- Get cluster load information from Ganglia in a terminal:
gstat -p8649 -1a -i necluster
- Same as above, but add the user and system load together and sort on the sum, so the least loaded nodes are shown first:
gstat -p8649 -1a -i necluster | grep node | awk '{print $11+$13"\t"$1;}' | sort -g
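The sum-and-sort stage can be tried without access to the cluster. The sketch below wraps the same grep, awk and sort stages in a function, here called rank_by_load (a hypothetical name), and feeds it two made-up lines. Those lines are not real gstat output; they only mimic the field positions the pipeline relies on: the node name in field 1 and the two load values in fields 11 and 13.

```shell
#!/bin/bash
# rank_by_load: hypothetical helper wrapping the grep, awk and sort stages above.
# Reads gstat-style lines on stdin and prints "<load sum><TAB><node>",
# least loaded node first.
rank_by_load() {
    grep node | awk '{print $11+$13"\t"$1;}' | sort -g
}

# Made-up sample lines (assumption): only fields 1, 11 and 13 matter here.
sample='node1 a b c d e f g h i 1.5 x 0.5
node2 a b c d e f g h i 0.2 x 0.1'

printf '%s\n' "$sample" | rank_by_load
```

On the cluster, the same function would be fed by piping the gstat command above into rank_by_load.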
- To list the unloaded nodes, run this command on the head node:
FindFreeNodes
- A script created by Nick Luciano that starts Scale jobs through TORQUE/Maui with a user-defined delay between jobs. This is useful because it spreads out the file server load.
#!/bin/bash
# Submit every file matching the given name below the current directory with
# qsub, staggering the start times so the jobs do not all start at once.
if [ $# -ne 1 ]
then
    echo "File name argument expected"
    exit 1
fi
nameString=$1
zero=0
oneMinute=1
minsPerHour=60
lastMin=59
anHour=100
tomorrow=2400
midnight=0
# number of minutes from now to start first job 0..59
offSet=0
# number of minutes between jobs 0..59
deltaMins=15
# Times are integers of the form HHMM. The 10# prefix forces base 10 so that
# values such as 08 or 09 are not rejected as invalid octal numbers.
thisHour=$((10#$(date +%H) * anHour))
thisMin=$((10#$(date +%M)))
startDir=$(pwd)
if [ "$offSet" -le "$zero" ]; then
    # offSet at least oneMinute so the first job doesn't queue until tomorrow.
    offSet=$oneMinute
fi
thisMin=$((thisMin + offSet))
if [ "$thisMin" -gt "$lastMin" ]; then
    # Carry the overflow into the next hour, wrapping to 0000 at midnight.
    thisMin=$((thisMin - minsPerHour))
    thisHour=$((thisHour + anHour))
    if [ "$thisHour" -ge "$tomorrow" ]; then
        thisHour=$midnight
    fi
fi
startTime=$((thisHour + thisMin))
# Note: the word splitting below breaks on path names that contain whitespace.
for pathname in $(find . -name "$nameString"); do
    dir=$(dirname "$pathname")
    file=$(basename "$pathname")
    cd "$dir" || continue
    when=$(printf "%04d" "$startTime")
    echo "Now executing: qsub -a $when $file"
    qsub -a "$when" "$file"
    cd "$startDir" || exit 1
    startTime=$((startTime + deltaMins))
    # Minutes past the end of the current hour; non-negative means roll over.
    remainder=$((startTime - thisHour - minsPerHour))
    if [ "$remainder" -ge "$zero" ]; then
        thisHour=$((thisHour + anHour))
        if [ "$thisHour" -ge "$tomorrow" ]; then
            thisHour=$midnight
        fi
        startTime=$((thisHour + remainder))
    fi
done
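The HHMM bookkeeping in the script can also be done by converting to minutes since midnight, which handles the hour and midnight rollovers in one step. The following is a minimal sketch of that alternative, not the author's method; add_minutes is a hypothetical helper name, and it assumes a non-negative number of minutes.

```shell
#!/bin/bash
# add_minutes: hypothetical helper, not part of the script above.
# Usage: add_minutes HHMM MINUTES   prints the resulting time as HHMM.
add_minutes() {
    local hhmm=$((10#$1))     # force base 10 so 0850 is not read as octal
    local delta=$((10#$2))
    # Convert to minutes since midnight, add the delay, wrap at 24 hours.
    local total=$(( ((hhmm / 100) * 60 + hhmm % 100 + delta) % (24 * 60) ))
    # Convert back to HHMM, zero-padded to four digits.
    printf '%04d\n' $(( (total / 60) * 100 + total % 60 ))
}

add_minutes 0850 15   # prints 0905
add_minutes 2355 15   # prints 0010
```

The printed value has the four-digit form that the script passes to qsub -a.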
Other tips
- Temperature monitoring: http://nec549362.engr.utk.edu/cgi-bin/temp.cgi