As you can imagine, running a for loop over 1000+ systems is fraught with potential issues and is very slow. Running commands in parallel greatly speeds things up and keeps ordinary network timeouts from bringing your work to a grinding halt.
I typically work in environments where I manage thousands of systems (at present, they outnumber me 1564 to 1). I always use a configuration management system such as CFengine, Puppet, or Chef to manage them. However, there is still an occasional need to run one-off commands.
Generally, my first approach is to use GNU xargs (often named gxargs on standard Unix systems). There is a handy -P switch that allows for parallelization.
Here is an example of how to run a simple command locally in parallel.
seq 1200 1299 | xargs -P 32 -I NUMBER mkdir NUMBER
With this invocation, we create 100 directories, keeping up to 32 mkdir processes running at a time until the work is done. Awesome, now we can apply this to ssh and run lots of commands remotely, right?
While I often use this invocation for simple ssh commands, I find that it falls short for more complex operations, particularly when you need to maintain coherent logs for later review.
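A minimal sketch of the xargs-plus-ssh approach, assuming a file with one hostname per line and key-based logins that need no password prompt. The run_on_hosts name, the SSH_CMD override, and the 10-second connect timeout are illustrative choices for this sketch, not something from my own toolkit.

```shell
# Hypothetical helper: run one command on every host listed in a file
# (one host per line), keeping up to 32 ssh sessions running at a time.
# SSH_CMD is an assumption of this sketch so the transport can be replaced.
run_on_hosts() {
    hostfile=$1
    shift
    # -P 32     at most 32 concurrent processes
    # -I HOST   replace HOST with each input line, so one ssh per host
    # ssh -n    do not read stdin; BatchMode fails instead of prompting
    xargs -P 32 -I HOST "${SSH_CMD:-ssh}" -n \
        -o BatchMode=yes -o ConnectTimeout=10 HOST "$@" < "$hostfile"
}

# Example (not executed here): run_on_hosts /tmp/hostlist uptime
```

All output lands on one shared terminal, which is fine for one-line answers such as uptime.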
When you do this, the order of the output is not guaranteed. In fact, it is just a jumble of lines, and if the commands you are running produce multiple lines of output, it is downright unintelligible.
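The jumble is easy to reproduce locally without any remote hosts. This is only a sketch: the demo_interleave name, the job count, and the short sleep are made up to imitate multi-line remote output.

```shell
# Hypothetical demo: eight local jobs, four at a time, three lines each.
# The sleep gives the other jobs a chance to print in between.
demo_interleave() {
    seq 1 8 | xargs -P 4 -I {} sh -c '
        for i in 1 2 3; do
            echo "job $1 line $i"
            sleep 0.1
        done' _ {}
}

demo_interleave
```

Every line is still there, but lines from different jobs typically arrive mixed together, and the exact order can change from run to run.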
So what can you do about this if you need to maintain coherent log files from each machine?
In a case like this, I sometimes use pssh. pssh takes a number of arguments, including a path to a file containing target hostnames or IP addresses (-h), a path to a directory that holds log files of the data each host sent to standard out (-o), a timeout (-t), and the number of concurrent threads (-p), among others.
[jmatthew@vs-lm960 pssh-1.4.3]# pssh -h /tmp/hostlist -o /tmp/outfiles hostname
  00:04:46 [SUCCESS] vs-lm496 22
  00:04:46 [SUCCESS] vs-lm1204 22
  00:04:46 [SUCCESS] vs-lm1203 22
  00:04:46 [SUCCESS] vs-lm1201 22
  00:04:46 [SUCCESS] vs-lm1202 22
  00:04:46 [SUCCESS] vs-lm1200 22
[root@vs-lm960 pssh-1.4.3]# ls -l /tmp/outfiles
total 24
drwxr-xr-x 2 jmatthew aolusers 160 May 27 00:04 ./
drwxrwxrwt 10 jmatthew aolusers 1440 May 27 00:04 ../
-rw-r--r-- 1 jmatthew aolusers 25 May 27 00:04 vs-lm1200
-rw-r--r-- 1 jmatthew aolusers 25 May 27 00:04 vs-lm1201
-rw-r--r-- 1 jmatthew aolusers 25 May 27 00:04 vs-lm1202
-rw-r--r-- 1 jmatthew aolusers 25 May 27 00:04 vs-lm1203
-rw-r--r-- 1 jmatthew aolusers 25 May 27 00:04 vs-lm1204
-rw-r--r-- 1 jmatthew aolusers 24 May 27 00:04 vs-lm496
[jmatthew@vs-lm960 pssh-1.4.3]# cat /tmp/outfiles/vs-lm1200
vs-lm1200.websys.aol.com
Right away we see that pssh lets us run commands in parallel and keep per-host logs that can be reviewed later. This is a definite advantage when working across many systems.
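For longer runs it can help to set the timeout and parallelism explicitly and then page through the per-host files in one go. The following is a rough sketch only: the run_and_review name, the 60-second timeout, and the parallelism of 32 are illustrative assumptions, and on some distributions the binary is installed as parallel-ssh rather than pssh.

```shell
# Hypothetical wrapper around pssh; the name and the numeric values
# are illustrative, not taken from the session shown above.
run_and_review() {
    hostfile=$1
    outdir=$2
    shift 2
    # -h host list, -o per-host output directory,
    # -t timeout in seconds, -p maximum number of parallel connections
    pssh -h "$hostfile" -o "$outdir" -t 60 -p 32 "$@"
    # Print each host's captured standard out under its own header.
    for f in "$outdir"/*; do
        [ -f "$f" ] || continue
        printf '== %s ==\n' "$(basename "$f")"
        cat "$f"
    done
}

# Example (not executed here): run_and_review /tmp/hostlist /tmp/outfiles hostname
```

Because each host writes to its own file, the review loop prints one host at a time no matter what order the commands finished in.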
At the time of writing, the latest source code can be downloaded from Google Tools.
Hopefully this saves you as much time as it saves me.