
Advanced Bash Scripting
I have written before about the usefulness of command-line scripting in computational science.
Today, while looking for some information on various file test operators in bash (e.g. to check whether a file or directory exists), I found this amazing guide. As the author puts it,
This tutorial assumes no previous knowledge of scripting or programming, but progresses rapidly toward an intermediate/advanced level of instruction . . . all the while sneaking in little snippets of UNIX® wisdom and lore. It serves as a textbook, a manual for self-study, and a reference and source of knowledge on shell scripting techniques.
For instructional purposes, the examples are sprinkled with little comments like, “explain why this is the case…”, to test your knowledge as you go through the manual. This would make it an excellent textbook on basic programming ideas. It is even available in PDF format, and was updated on March 18, 2008.
I can assure you that every new member of the lab will be getting a link to this guide from me. Proper knowledge of shell scripting is an amplifier of one’s productivity. An investment of a few hours learning the basics will probably return a hundred-fold savings of time over a few months. More advanced concepts are naturally learned as more difficult scenarios are encountered. I’ll be writing soon about some of the more sophisticated issues I’ve encountered using shell scripting.
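For a taste of the file test operators that sent me searching in the first place, here is a minimal sketch. The function name describe_path and the scratch file names are my own, made up for illustration; the operators themselves (-d, -f, -e) are standard bash.

```shell
#!/bin/bash
# Classify a path using bash file test operators.
# describe_path is a hypothetical helper name, used only for this example.
describe_path() {
    local path="$1"
    if [ -d "$path" ]; then        # -d: exists and is a directory
        echo "directory"
    elif [ -f "$path" ]; then      # -f: exists and is a regular file
        echo "regular file"
    elif [ -e "$path" ]; then      # -e: exists, but is something else
        echo "other"
    else
        echo "missing"
    fi
}

# Usage: build a scratch directory and probe a few paths inside it.
scratch=$(mktemp -d)
touch "$scratch/results.txt"
describe_path "$scratch"               # prints "directory"
describe_path "$scratch/results.txt"   # prints "regular file"
describe_path "$scratch/nope"          # prints "missing"
rm -r "$scratch"
```

Quoting "$path" inside the brackets matters: without the quotes, a path containing spaces would be split into several words and the test would fail with a syntax error.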
Pretty Big Dig
Killing Zombies
Occasionally, on our cluster, a node will crash. If a job was running on it that spanned multiple nodes, sometimes the other nodes won’t get the message that their fellow has crashed, and they will just keep running whatever processes are on them.
I call these “Zombie” processes, because they just lumber along eating CPU time and rotting, keeping other jobs from using the node. Today, after noticing a particularly bad zombie infestation, I finally created a script that checks for zombified machines and then restarts them. This script is compatible with Torque and relies on Scyld Beowulf “b-commands” and IPMI, but you could easily replace them with similar utilities like rsh or ssh.
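#!/bin/bash
# For all of the nodes in the main cluster...
for NODE in $(seq 0 119); do
    # Read the node's 5-minute load average and round it to an integer.
    # Splitting on "load average: " avoids relying on a fixed field number,
    # which shifts as the uptime string changes length.
    LOAD=$(bpsh "$NODE" uptime | awk -F'load average: ' '{ split($2, a, ", "); print a[2] }')
    LOAD=$(printf %1.0f "${LOAD:-0}")
    # Figure out whether the node should be running anything.
    # Note: this matches the node number anywhere in the qstat output, so
    # tighten the pattern to your cluster's host naming to avoid false matches.
    ASSIGNED=$(qstat -f | grep -c "$NODE")
    if [ "$ASSIGNED" -gt 0 ]; then
        ASSIGNED=1
    fi
    # If the node is running something but shouldn't be, reboot it.
    if [ "$LOAD" -gt 1 ] && [ "$ASSIGNED" -eq 0 ]; then
        echo "Node $NODE is a zombie! Kicking." >> /root/logs/zombies.log
        # This relies on IPMI. (some user) and (some password) are
        # placeholders for real IPMI credentials.
        ipmitool -H 10.54.2.$(( 100 + NODE )) -U (some user) -P (some password) power reset
    fi
done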
You can download the file directly: Zombie Checker
(Here’s another post on Zombies).
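If you want to adapt the check to a cluster without the Scyld b-commands, it helps to separate the decision from the commands that gather the data. The following is a rough sketch under my own assumptions: the function names load_from_uptime and is_zombie are made up for illustration, the sample uptime line is invented, and the parsing assumes the Linux "load average: a, b, c" format. On a real cluster the line could come from something like ssh to the node followed by uptime instead of bpsh.

```shell
#!/bin/bash
# Extract the 5-minute load average from one line of `uptime` output
# and round it to an integer. (Hypothetical helper name.)
load_from_uptime() {
    local line="$1" load
    load=$(printf '%s\n' "$line" |
        awk -F'load average: ' '{ split($2, a, ", "); print a[2] }')
    printf '%1.0f\n' "${load:-0}"
}

# A node counts as a zombie when it is busy (rounded load above 1)
# but the queue has nothing assigned to it. (Hypothetical helper name.)
is_zombie() {
    local load="$1" assigned="$2"
    [ "$load" -gt 1 ] && [ "$assigned" -eq 0 ]
}

# Usage with an invented sample line; no cluster is contacted here.
sample=' 14:02:11 up 5 days,  3:02,  2 users,  load average: 3.90, 4.10, 4.05'
load=$(load_from_uptime "$sample")
if is_zombie "$load" 0; then
    echo "zombie (load $load)"    # prints "zombie (load 4)"
fi
```

Keeping the decision in small functions like these makes it possible to try the logic against saved uptime output before letting the script reset anything.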
To Miss New Orleans
I have just returned from my second-to-last visit to New Orleans to see Amanda. Soon, we’ll be living in the same place and those trips will no longer be necessary. Despite all of the hassles of flying, despite missing Amanda, the one benefit of living apart has been an excuse to go see New Orleans regularly even after I’d moved away.
In two months, I’ll visit for one last week, and then the trips will largely stop.
We hope to move back there when Amanda is done with residency, if we can swing it. There will probably be some visits for holidays, as my family’s home is still just outside the city. Nonetheless, it’s slowly dawning on me that my monthly, re-charging dose of New Orleans is coming to an end.
At this point, the legacy of Katrina in my life is diminishing. Assuming that all continues to go well, I’ll be graduating from Johns Hopkins rather than Tulane. All sorts of other things both in my head and in my life have changed as a result of the flood. However, I will once again be living with my wife, and we’ll go on with our post-school and post-New-Orleans lives. I’ll finally start to get some closure on something that can never really be fixed or undone.
