Bash script: way to ignore a hung server
I wrote a script that runs commands on 1000+ servers in the background. Sometimes the script hangs on one of the servers: if a server is hung (due to a high load average), the command run on it may hang as well. Is there a way to skip that host so the script can move on to the next host and keep running?
I am highlighting the two main functions of my script below. I have had no luck with the ConnectTimeout option and the wait keyword.
exec_ssh()
{
    for i in `cat $file`
    do
        ssh -q -o "StrictHostKeyChecking no" -o "NumberOfPasswordPrompts 0" -o ConnectTimeout=2 $i $command 2>>/dev/null &
        if wait $!; then
            echo "" >> /dev/null
        else
            echo "$i is not reachable over SSH or passwordless authentication is not setup on the server" >> /tmp/not_reachable
        fi
    done >/tmp/output.csv &
}
run_command()
{
    export -f exec_ssh
    export command
    nohup bash -c exec_ssh &>>$log_file &
}
This might be a good time to learn about ssh -o BatchMode=yes (: – DopeGhoti, Jan 23 at 15:22
Also on SO: How to skip/ignore Hung host – glenn jackman, Jan 23 at 15:23
asked Jan 23 at 15:09 by Sin15
edited Jan 23 at 15:19 by DopeGhoti
2 Answers
Your script as written would run all of your remote commands concurrently, were it not for your use of wait, which explicitly waits for a backgrounded task to complete. In the case you describe of a high-load server, your ssh command is not timing out; it is simply taking a long time to complete, so the script is doing exactly what you asked it to. ConnectTimeout is moot once the ssh connection has been made successfully.
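Since ConnectTimeout only bounds the connection phase, one way to cap the total time spent on a host is to wrap the command in timeout from GNU coreutils, which exits with status 124 when the limit is hit. The following is only a sketch under that assumption; run_limited is a hypothetical helper name and is not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command with a hard time limit and report the outcome.
# Usage: run_limited <seconds> <label> <command> [args...]
run_limited() {
    local limit=$1 label=$2
    shift 2
    timeout "$limit" "$@"
    local rc=$?
    case $rc in
        0)   echo "$label: ok" ;;
        124) echo "$label: timed out after ${limit}s" ;;   # GNU timeout's "time limit reached" status
        *)   echo "$label: failed with exit code $rc" ;;
    esac
    return "$rc"
}

# Intended use inside the host loop (not executed here):
#   run_limited 30 "$host" ssh -n -q -o BatchMode=yes -o ConnectTimeout=2 "$host" "$command"
```

With a wrapper like this, a host that accepts the connection but never finishes the command is abandoned after the limit instead of stalling the loop.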
If you do want to use this sort of script rather than a tool designed for distributed remote execution such as Ansible, I might modify your script as follows:
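exec_ssh()
{
    while read -r host; do
        # Background ssh together with its error check, so a slow host cannot block the loop
        # and $? still refers to the ssh exit status. -n keeps ssh from reading the host list on stdin.
        {
            ssh -n -q -o BatchMode=yes -o ConnectTimeout=2 "$host" "$command" 2>>/dev/null ||
                echo "$host is not reachable via non-interactive SSH or remote command threw error - exit code $?" >> /tmp/not_reachable
        } &
    done < "$file" > /tmp/output.csv &
}
run_command()
{
    export -f exec_ssh
    export command file   # exec_ssh reads both variables in the child shell
    nohup bash -c exec_ssh &>> "$log_file" &
}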
It also might be worth considering separating your "can I SSH to the host" test and your "can I complete the job" test:
if ssh -n -q -o BatchMode=yes -o ConnectTimeout=2 "$host" true; then
    # connection succeeded; run the real command in the background and report its exit status
    { ssh -n -q -o BatchMode=yes -o ConnectTimeout=2 "$host" "$command" || echo "Remote command threw $?"; } &
else
    echo "SSH threw $?"
fi
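When separating the two tests, it may help to know that ssh exits with 255 when ssh itself fails (for example an unreachable host or failed authentication) and otherwise passes through the exit status of the remote command. A small sketch of how a script could tell the cases apart; classify_ssh_status is a hypothetical name used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn an ssh exit status into a human-readable category.
classify_ssh_status() {
    case $1 in
        0)   echo "success" ;;
        255) echo "ssh error" ;;                           # ssh itself failed (connection, auth, ...)
        *)   echo "remote command failed with exit code $1" ;;
    esac
}

# Intended use (not executed here):
#   ssh -n -q -o BatchMode=yes -o ConnectTimeout=2 "$host" "$command"
#   classify_ssh_status "$?"
```

A remote command that itself exits with 255 cannot be distinguished from an ssh failure this way, which is one reason to keep the separate connectivity check shown above.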
thanks, I will try this out and let you know. – Sin15, Jan 23 at 16:38
As your local and remote commands get more complex you're quickly going to become overwhelmed with trying to cram this all into one coherent script, and with hundreds or thousands of backgrounded processes you're likely to run into resource contention issues even with a beefy local machine.
You can get this under control with xargs -P. I typically break up tasks like this into two scripts.
local.sh
Generally this script has a single argument which is the hostname, and performs any necessary validations, pre-flight tasks, logging, etc. Eg:
#!/bin/bash
hostname=$1
# Pick ONE of the variants below; each feeds remote.sh to a remote bash on stdin.
# simple
ssh "user@$hostname" 'bash -s' < remote.sh
# sudo the whole thing
ssh "user@$hostname" 'sudo bash -s' < remote.sh
# log to files
ssh "user@$hostname" 'bash -s' < remote.sh &> "logs/$hostname.log"
# or log to stdout with the hostname prefixed
ssh "user@$hostname" 'bash -s' < remote.sh 2>&1 | sed "s/^/$hostname:/"
remote.sh
The script you want to run remotely, but now you don't have to cram it into a quoted one-liner and deal with quote-escaping hell.
The actual command
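cat host_list.txt | xargs -P 16 -n 1 -I {} bash local.sh {}
Where:
-P 16 will fork up to 16 sub-processes
-n 1 will feed exactly one argument per command
-I {} will substitute the argument in place of {} (not necessary here, but may be useful for constructing more complex xargs calls; note that GNU xargs treats -n and -I as mutually exclusive and warns when both are given, so in practice keep only one of them)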
This way even if one of your local or remote scripts gets hung up you'll still have the other 15 chugging along unimpeded.
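A hung worker still occupies one of the xargs slots until it exits, so it can be worth combining xargs -P with a per-host time limit. The sketch below is only an illustration under stated assumptions: it uses GNU timeout, the host names are made up, and the inline sh -c snippet is a stand-in for local.sh in which "hung-host" simulates a server that never answers.

```shell
#!/usr/bin/env bash
# Hypothetical demo: run a stand-in worker for every host read from stdin,
# at most 4 at a time, killing any worker that exceeds the time limit.
# Usage: run_all <seconds> < host_list
run_all() {
    limit=$1
    xargs -P 4 -n 1 timeout "$limit" sh -c '
        # Stand-in for local.sh: one made-up host hangs, the others finish at once.
        if [ "$1" = "hung-host" ]; then exec sleep 30; fi
        echo "$1 done"
    ' _
}

# Intended use with the real script (not executed here):
#   xargs -P 16 -n 1 timeout 60 bash local.sh < host_list.txt
```

In the demo, printf '%s\n' web1 hung-host web2 | run_all 1 reports the two healthy hosts and returns after roughly one second; xargs exits non-zero because one worker was killed, which a calling script can use as a signal to check which hosts were skipped.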
First answer: answered Jan 23 at 15:28 by DopeGhoti, edited Jan 23 at 15:36
Second answer: answered Jan 25 at 0:45 by Sammitch