Bash script: way to ignore a hung server

I wrote a script that runs commands on 1000+ servers in the background. Sometimes the script hangs on one of the servers: if a server is unresponsive (for example, due to a high load average), the command running on it may hang as well. Is there a way to skip that host so the script can move on to the next host and keep running?

I am highlighting the two main functions of my script; I have had no luck with the "ConnectTimeout" option or the wait keyword.

exec_ssh()
{
 for i in `cat $file`
 do
  ssh -q -o "StrictHostKeyChecking no" -o "NumberOfPasswordPrompts 0" -o ConnectTimeout=2 $i $command 2>>/dev/null &
  if wait $!; then
   echo "" >> /dev/null
  else
   echo "$i is not reachable over SSH or passwordless authentication is not setup on the server" >> /tmp/not_reachable
  fi
 done >/tmp/output.csv &
}

run_command()
{
 export -f exec_ssh
 export command
 nohup bash -c exec_ssh &>>$log_file &
}

edited Jan 23 at 15:19

DopeGhoti

45.4k55988

asked Jan 23 at 15:09

Sin15

212

This might be a good time to learn about ssh -o BatchMode=yes (:

– DopeGhoti
Jan 23 at 15:22

Also on SO: How to skip/ignore Hung host

– glenn jackman
Jan 23 at 15:23

2 Answers

As written, your script would run all of your remote commands concurrently were it not for your use of wait, which explicitly blocks until the backgrounded task completes. In the case you describe, a high-load server, your ssh command is not timing out; it is simply taking a long time to complete, so the script is doing exactly what you asked it to. ConnectTimeout only limits how long ssh spends establishing the connection, so it is moot once the connection has succeeded.
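To make the effect of wait concrete, here is a minimal sketch that uses sleep as a stand-in for the ssh call. The function names slow_job, serial_wait and parallel_wait are hypothetical and not part of the original script. Waiting inside the loop serialises the jobs, while launching them all and waiting once afterwards lets them overlap.

```shell
#!/bin/bash
# Stand-in for a slow remote command; a real script would call ssh here.
slow_job() { sleep "$1"; }

# Pattern from the question: background a job, then immediately wait for it.
# Each iteration blocks until its job ends, so the jobs run one at a time.
serial_wait() {
  local d
  for d in "$@"; do
    slow_job "$d" &
    wait "$!"
  done
}

# Alternative: start every job first, then wait once for all of them.
parallel_wait() {
  local d
  for d in "$@"; do
    slow_job "$d" &
  done
  wait
}
```

With three one-second jobs, serial_wait 1 1 1 takes about three seconds, whereas parallel_wait 1 1 1 takes about one.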

If you do want to use this sort of script rather than a tool designed for distributed remote execution such as Ansible, I might modify your script as follows:

exec_ssh() {
 while read -r i; do
  {
   # -n stops ssh from swallowing the rest of the host list on stdin
   ssh -n -q -o BatchMode=yes -o ConnectTimeout=2 "$i" "$command" 2>>/dev/null ||
    echo "$i is not reachable via non-interactive SSH or remote command threw error - exit code $?" >> /tmp/not_reachable
  } &   # background the whole check so a slow host does not block the loop
 done < "$file" > /tmp/output.csv &
}

run_command() {
 export -f exec_ssh
 export command
 nohup bash -c exec_ssh &>> "$log_file" &
}

It also might be worth considering separating your "can I SSH to the host" test and your "can I complete the job" test:

if ssh -q -o BatchMode=yes -o ConnectTimeout=2 "$host" true; then
 # connection succeeded; run the real command in the foreground so $? is its exit status
 ssh -q -o BatchMode=yes -o ConnectTimeout=2 "$host" "$command" ||
  echo "Remote command threw $?"
else
 echo "SSH threw $?"
fi
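Because ConnectTimeout only bounds connection setup, a command that hangs after login needs a deadline of its own. One option, sketched here on the assumption that GNU coreutils timeout is available on the local machine, is to wrap each call. The helper names run_with_deadline and classify_status and the 30-second figure in the usage comment are illustrative, not part of the original script. timeout exits with status 124 when it had to kill the command, and ssh itself uses 255 for connection or authentication errors.

```shell
#!/bin/bash
# Run a command under a deadline (hypothetical helper).
# $1 = limit in seconds, remaining arguments = the command to run.
# Returns the command's exit status, or 124 if timeout had to kill it.
run_with_deadline() {
  local limit=$1
  shift
  timeout "$limit" "$@"
}

# Translate an exit status into a short label for logging.
classify_status() {
  case "$1" in
    0)   echo "ok" ;;
    124) echo "timed-out" ;;        # killed by timeout
    255) echo "ssh-error" ;;        # ssh could not connect or authenticate
    *)   echo "remote-exit-$1" ;;   # the remote command itself failed
  esac
}

# Intended use inside the host loop (not executed here):
#   run_with_deadline 30 ssh -n -q -o BatchMode=yes -o ConnectTimeout=2 "$host" "$command"
#   echo "$host $(classify_status $?)" >> /tmp/not_reachable
```

A host that accepts the connection but never finishes the command is then reported as timed-out instead of holding up the run.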

edited Jan 23 at 15:36

answered Jan 23 at 15:28

DopeGhoti

45.4k55988

thanks, I will try this out and let you know.

– Sin15
Jan 23 at 16:38

As your local and remote commands get more complex you're quickly going to become overwhelmed with trying to cram this all into one coherent script, and with hundreds or thousands of backgrounded processes you're likely to run into resource contention issues even with a beefy local machine.

You can get this under control with xargs -P. I typically break up tasks like this into two scripts.

local.sh

Generally this script takes a single argument, the hostname, and performs any necessary validation, pre-flight tasks, logging, etc. E.g.:

#!/bin/bash
hostname=$1
# the variants below are alternatives; keep whichever one fits
# simple
cat remote.sh | ssh "user@$hostname"
# sudo the whole thing (sudo needs a command, so have it run a shell that reads the script)
cat remote.sh | ssh "user@$hostname" sudo bash
# log to files
cat remote.sh | ssh "user@$hostname" &> "logs/$hostname.log"
# or log to stdout with the hostname prefixed
cat remote.sh | ssh "user@$hostname" 2>&1 | sed "s/^/$hostname:/"

remote.sh

The script you want to run remotely, but now you don't have to cram it into a quoted one-liner and deal with quote-escaping hell.

The actual command

cat host_list.txt | xargs -P 16 -n 1 -I {} bash local.sh {}

Where:

-P 16 will fork up to 16 sub-processes

-n 1 will feed exactly one argument per command

-I {} will substitute the argument in place of {} [not necessary here, but may be useful for constructing more complex xargs calls; GNU xargs treats -n and -I as mutually exclusive and warns when both are given, so in practice keep only one of them]

This way even if one of your local or remote scripts gets hung up you'll still have the other 15 chugging along unimpeded.
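To see the fan-out in isolation, here is a small sketch in which an inline sh -c command takes the place of local.sh and just prints one line per host. The function name fan_out and the host names are made up for illustration. It relies on xargs -P, which GNU and BSD xargs provide but POSIX does not require.

```shell
#!/bin/bash
# fan_out: read one host per line on stdin and run a worker for each,
# keeping at most $1 workers alive at a time (hypothetical helper).
# $1 = parallelism, remaining arguments = worker command;
# xargs appends the host as the worker's last argument.
fan_out() {
  local max_procs=$1
  shift
  xargs -P "$max_procs" -n 1 "$@"
}

# Example run with three made-up hosts, two at a time.
# The stand-in worker only echoes; a real one would be "bash local.sh".
# Output order varies between runs, hence the sort.
printf '%s\n' web1 web2 web3 | fan_out 2 sh -c 'echo "done:$1"' _ | sort
```

In a real run the last line would become something like fan_out 16 bash local.sh < host_list.txt.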

answered Jan 25 at 0:45

Sammitch

274110
