Issue with the exit code of a nohup command in unix
I have been trying to figure out how to capture the exit status of the nohup command and then send a mail based on that status.
Below is my code:
if [[ "" != "$PID" ]]; then
    echo "killing $PID"
    kill -9 $PID
    nohup java -jar Xyz-port-0.0.1.jar &
    << Exit Code and then send mail if Exit
    code is 0>>
else
    echo "Process doesn't exist"
fi
linux shell-script nohup
Tangentially related: When should I not kill -9 a process?
– Kusalananda♦
Mar 14 at 10:02
@Kusalananda Not what I am looking for.
– user3901666
Mar 14 at 10:06
edited Mar 15 at 19:49 by Rui F Ribeiro
asked Mar 14 at 9:43 by user3901666
3 Answers
Something like this will capture the exit code of your process more reliably:
parent.sh
...
java -jar Xyz-port-0.0.1.jar > /dev/null &   # optionally discard stdout of java
child=$!
wait "$child"
exit_status=$?
if [[ "$exit_status" -ne 0 ]]; then
    # handle error
    :   # no-op placeholder; a branch containing only comments is a syntax error
else
    # handle success
    :
fi
Wait
Note that the call to wait will block indefinitely. If you need to limit how long the java command is given to complete, you might replace the call to wait with a loop that periodically checks ps to see if the process is still running. This would allow you to set up a watchdog that aborts and cleans up if the process hangs.
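The loop-with-a-limit idea can be sketched roughly as follows. This is only an illustration under some assumptions: `sleep 1` stands in for the java command, the `timeout_secs` value is made up, and `kill -0` (which sends no signal and only tests whether the PID exists) is used in place of parsing ps output.

```shell
# Sketch: poll a background job instead of blocking in `wait` forever.
timeout_secs=5        # hypothetical limit; tune it for the real job
sleep 1 &             # stand-in for: java -jar Xyz-port-0.0.1.jar > /dev/null &
child=$!

elapsed=0
timed_out=no
# kill -0 delivers nothing; it only reports whether the PID still exists.
while kill -0 "$child" 2>/dev/null; do
    if [ "$elapsed" -ge "$timeout_secs" ]; then
        timed_out=yes
        kill "$child" 2>/dev/null   # ask the hung job to stop (SIGTERM)
        break
    fi
    sleep 1
    elapsed=$((elapsed + 1))
done

# The child has exited (or was just signalled), so this returns promptly
# and still yields the child's exit status.
wait "$child"
exit_status=$?
echo "timed_out=$timed_out exit_status=$exit_status"
```

If the job is killed by the watchdog, the status reported by wait is normally 128 plus the signal number, which lets the script tell a timeout apart from an ordinary failure.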
Nohup
If you need to run the java process and handle its exit code without being in an active terminal session, call your parent script (above) with nohup, not the java command. The nohup'ed parent script will survive in the background with no controlling terminal and can do the emailing or cleanup reliably.
Perfect, that's what I was looking for. However, I couldn't follow "you might replace the call to wait with a loop and some manual checks to ps to see if the process is still running". Can you explain a bit more?
– user3901666
Mar 15 at 5:32
A background job is always started successfully:
$ sntaoehu &
[1] 33566
$ bash: sntaoehu: command not found
[1]+ Exit 127 sntaoehu
$ echo $?
0
You can't launch your Java program in the background and detect whether that went well or not until the nohup command has terminated. The nohup command will terminate if it can't find or start java, or whenever the Java program terminates. It is not nohup that runs your program in the background. It just makes your program ignore any HUP signal and will hang around until your program terminates, and then it returns the program's exit status to the invoking shell.
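A quick way to see this pass-through behaviour is to run nohup in the foreground and look at `$?`. The snippet below is a small sketch: `sh -c 'exit 3'` stands in for a Java program that fails, and `no-such-command-xyz` is a deliberately made-up name that is assumed not to exist on the system.

```shell
# Run nohup in the foreground and inspect the status it hands back.
# Output is discarded so that no nohup.out file is created.

nohup sh -c 'exit 3' >/dev/null 2>&1
status_from_program=$?      # the program's own exit status, passed through

# "no-such-command-xyz" is a hypothetical, intentionally missing command.
nohup no-such-command-xyz >/dev/null 2>&1
status_not_found=$?         # nohup's own status for "command not found"

echo "program exit status: $status_from_program"
echo "missing command:     $status_not_found"
```

On a POSIX-conforming nohup the first status should be 3 and the second 127; a command that is found but cannot be executed would give 126 instead.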
To send an email reporting whether the job failed to launch, returned an error, or finished cleanly, you may do
(
    nohup java -jar Xyz-port-0.0.1.jar
    status=$?
    if [ "$status" -eq 126 ] || [ "$status" -eq 127 ]; then
        # something wrong in launching java
        echo 'nohup failed to run java'
        printf 'nohup exit status is %s\n' "$status"
    elif [ "$status" -ne 0 ]; then
        # the java code returned an error
        echo 'java returned an error'
        printf 'java exit status is %s\n' "$status"
    else
        # everything went well and java exited ok
        echo 'java exited safely'
    fi | mail -s 'java job status report' me@example.com
) &
echo 'started background job'
I.e., start a backgrounded subshell in which you run your program, then test whether nohup returned 126 or 127. It will do this if the java command could not be found or could not be started.
Can you please let me know how to put this in my code?
– user3901666
Mar 14 at 10:14
@user3901666 See updated answer.
– Kusalananda♦
Mar 14 at 10:19
As per the post below, wait might wait forever. The only way the actual nohup command would return an exit code is if the java command could not be found or executed, or if the nohup command itself had an internal error.
– user3901666
Mar 14 at 10:27
What's your take on that?
– user3901666
Mar 14 at 10:27
@user3901666 nohup will return a specific exit code if the Java program cannot be found, this is true. This will be a status of 127. If the Java program was started okay, then nohup would return the exit status of the Java program when that program terminates.
– Kusalananda♦
Mar 14 at 10:32
The problem is not necessarily the nohup command, but the & at the end of the command line.
The man page for bash says:
If a command is terminated by the control operator &, the shell executes the command in the background in a subshell. The shell does not wait for the command to finish, and the return status is 0.
Technically, the shell forks a subshell to execute the nohup ... command, and the main shell executing the script immediately proceeds to the next command in the script. So, when the main shell executes the line after the nohup ... & line, the nohup command may not have exited yet: the subshell might be in the process of loading it for execution.
And the nohup command does not necessarily exit as such: it sets the signal handler to ignore the HUP signal, and then attempts to directly exec() the java ... command. If the exec() system call is successful, then the process that was nohup transforms into java without fork()ing a new process and exiting, and so there will be no exit code to return yet.
The only way the actual nohup command would return an exit code is if the java command could not be found or executed, or if the nohup command itself had an internal error. If the exec() system call is successful, the only time an exit code will be available is when the java command itself ends.
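The exec() behaviour described here can be observed from a script. The following is a minimal sketch, assuming a nohup implementation that exec()s its command (as the common ones do): `sh -c 'echo "$$"'` stands in for java and prints its own PID, which is then compared with the PID the shell recorded in `$!` for the background job. The temporary file is only there to carry the inner PID back to the script.

```shell
# Compare the PID of the backgrounded "nohup" with the PID seen by the
# program that nohup starts. If nohup exec()s, they are the same process.
tmp=$(mktemp)

nohup sh -c 'echo "$$"' >"$tmp" 2>/dev/null &
bg_pid=$!                   # PID the shell recorded for the background job
wait "$bg_pid"              # let the short-lived stand-in finish

inner_pid=$(cat "$tmp")     # PID reported from inside the started program
rm -f "$tmp"

if [ "$bg_pid" = "$inner_pid" ]; then
    same=yes
else
    same=no
fi
echo "background PID: $bg_pid, program PID: $inner_pid, same process: $same"
```

Because the two PIDs match, waiting on `$!` is really waiting on the started program itself, which is why the exit code only becomes available once that program ends.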
This is also why trying something like this does not do what you want:
nohup java -jar Xyz-port-0.0.1.jar &
wait $!
if [ $? -ne 0 ]; then
    echo "Error in running Java"
fi
If nohup fails in the background, wait will pick up its exit code and allow it to be checked; but if nohup is successful in starting java, then the wait command will keep waiting until java exits, since both nohup and java will be run in the same process one after another.
You might want to do something like this, as @datUser suggested:
STARTWAIT=20   # number of seconds to wait after starting the service
nohup java -jar Xyz-port-0.0.1.jar &
NEWPID=$!
for i in $(seq "$STARTWAIT"); do
    # test if the process still exists
    if ! kill -0 "$NEWPID" 2>/dev/null; then
        # Process has died: get its exit code, report error, stop waiting
        wait "$NEWPID"
        EXITCODE=$?
        echo "Process died after restart and reported result $EXITCODE" >&2
        break
    fi
    sleep 1
done
if kill -0 "$NEWPID" 2>/dev/null; then
    echo "The process was started and is still alive after $STARTWAIT seconds"
    echo "so I guess it's now running correctly"
fi
Please note that my script solution has an inherent weakness: if the nohupped process dies and another process receives the same PID number within one second, this script will fail to notice that it's no longer the same process. (But if your PID numbers are getting recycled that fast, you probably have bigger problems.)
To optimally solve the problem, the script would need to collect the exit status with the wait() system call (which is what the wait shell command does), because that is something only the actual parent of the monitored process can do, and combine it with a time-out mechanism of some sort that stops the wait command at some pre-determined time. This is quite tricky to implement in a shell script.
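One common way to approximate such a time-out around wait is a background watchdog. The sketch below is an illustration rather than a complete solution: `sleep 1` stands in for the nohup java command, `timeout_secs` is a made-up limit, and the PID-reuse caveat mentioned above applies to the watchdog's kill as well.

```shell
timeout_secs=5          # hypothetical limit
sleep 1 &               # stand-in for: nohup java -jar Xyz-port-0.0.1.jar &
child=$!

# Watchdog: once the limit has passed, signal the child if it still exists.
# Its output is discarded so it does not hold the script's stdout open.
(
    sleep "$timeout_secs"
    kill "$child" 2>/dev/null
) >/dev/null 2>&1 &
watchdog=$!

# Only the parent shell can collect the child's status, so wait here.
wait "$child"
exit_status=$?

# The child is done (or was killed); the watchdog is no longer needed.
kill "$watchdog" 2>/dev/null
wait "$watchdog" 2>/dev/null

if [ "$exit_status" -gt 128 ]; then
    echo "child was terminated by signal $((exit_status - 128))"
else
    echo "child exited with status $exit_status"
fi
```

The design choice here is to keep the blocking wait in the parent, where the exit status is available, and move the timing into a separate subshell. Killing the watchdog subshell leaves its inner sleep running until it expires, which is harmless but worth knowing about.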
So what's the solution then?
– user3901666
Mar 14 at 10:24
Edited after a bit of testing, thinking and re-testing. :-)
– telcoM
Mar 15 at 7:34
answered Mar 14 at 12:39 by datUser
edited Mar 14 at 11:16, answered Mar 14 at 10:10 by Kusalananda♦
Can you please let me know how to put this in my code?
– user3901666
Mar 14 at 10:14
@user3901666 See updated answer.
– Kusalananda♦
Mar 14 at 10:19
As per the below post.. wait might wait for forever..he only way the actual nohup command would return an exit code is if the java command could not be found or executed, or if the nohup command itself had an internal error
– user3901666
Mar 14 at 10:27
Whats your take on that?
– user3901666
Mar 14 at 10:27
@user3901666nohup
will return a specific exit code if the Java program can not be found, this is true. This will be a status of 127. If the Java program was started okay, thennohup
would return the exit status of the Java program when that program terminates.
– Kusalananda♦
Mar 14 at 10:32
|
show 1 more comment
Can you please let me know how to put this in my code?
– user3901666
Mar 14 at 10:14
@user3901666 See updated answer.
– Kusalananda♦
Mar 14 at 10:19
As per the below post.. wait might wait for forever..he only way the actual nohup command would return an exit code is if the java command could not be found or executed, or if the nohup command itself had an internal error
– user3901666
Mar 14 at 10:27
Whats your take on that?
– user3901666
Mar 14 at 10:27
@user3901666nohup
will return a specific exit code if the Java program can not be found, this is true. This will be a status of 127. If the Java program was started okay, thennohup
would return the exit status of the Java program when that program terminates.
– Kusalananda♦
Mar 14 at 10:32
Can you please let me know how to put this in my code?
– user3901666
Mar 14 at 10:14
Can you please let me know how to put this in my code?
– user3901666
Mar 14 at 10:14
@user3901666 See updated answer.
– Kusalananda♦
Mar 14 at 10:19
@user3901666 See updated answer.
– Kusalananda♦
Mar 14 at 10:19
As per the below post.. wait might wait for forever..he only way the actual nohup command would return an exit code is if the java command could not be found or executed, or if the nohup command itself had an internal error
– user3901666
Mar 14 at 10:27
As per the below post.. wait might wait for forever..he only way the actual nohup command would return an exit code is if the java command could not be found or executed, or if the nohup command itself had an internal error
– user3901666
Mar 14 at 10:27
Whats your take on that?
– user3901666
Mar 14 at 10:27
Whats your take on that?
– user3901666
Mar 14 at 10:27
@user3901666
nohup
will return a specific exit code if the Java program can not be found, this is true. This will be a status of 127. If the Java program was started okay, then nohup
would return the exit status of the Java program when that program terminates.– Kusalananda♦
Mar 14 at 10:32
@user3901666
nohup
will return a specific exit code if the Java program can not be found, this is true. This will be a status of 127. If the Java program was started okay, then nohup
would return the exit status of the Java program when that program terminates.– Kusalananda♦
Mar 14 at 10:32
|
show 1 more comment
The problem is not necessarily the nohup
command, but the &
at the end of the command line.
The man page for bash
says:
If a command is terminated by the control operator &, the shell executes the command in the background in a subshell. The shell does not wait for the command to finish, and the return status is 0.
Technically, the shell forks a subshell to execute the nohup ...
command, and the main shell executing the script immediately proceeds to the next command in the script. So, when the main shell executes the line after the nohup ... &
line, the nohup command may not have exited yet: the subshell might be in the process of loading it for execution.
And the nohup
command does not necessarily exit as such: it sets the signal handler to ignore the HUP signal, and then attempts to directly exec()
the java ...
command. If the exec()
system call is successful, then the process that was nohup
transforms into java
without fork()
ing a new process and exiting, and so there will be no exit code to return yet.
The only way the actual nohup
command would return an exit code is if the java
command could not be found or executed, or if the nohup
command itself had an internal error. If the exec()
system call is successful, the only time an exit code will be available is when the java
command itself ends.
This is also why trying something like this does not do what you want:
nohup java -jar Xyz-port-0.0.1.jar &
wait $!
if [ $? -ne 0 ]; then
echo "Error in running Java"
fi
If nohup
fails in the background, wait
will pick up its exit code and allow it to be checked; but if nohup
is successful in starting java
, then the wait
command will keep waiting until the java
exits, since both nohup
and java
will be run in the same process one after another.
You might want to do something like this, as @datUser suggested:
STARTWAIT=20 #number of seconds to wait after starting the service
nohup java -jar Xyz-port-0.0.1.jar &
NEWPID=$!
for i in $(seq $STARTWAIT); do
# test if the process still exists
if ! kill -0 $NEWPID 2>/dev/null; then
# Process has died: get its exit code, report error, stop waiting
wait $!
EXITCODE=$?
echo "Process died after restart and reported result $EXITCODE" >&2
break
fi
sleep 1
done
if kill -0 $NEWPID; then
echo "The process was started and is still alive after $STARTWAIT seconds"
echo "so I guess it's now running correctly"
fi
Please note that my script solution has an inherent weakness: if the nohup
ped process dies and another process receives the same PID number within one second, this script will fail to notice that it's no longer the same process. (But if your PID numbers are getting recycled that fast, you probably have bigger problems.)
To optimally solve the problem, you would need the script to set up a time-out mechanism of some sort that would stop the wait
command at some pre-determined time, and then run the wait()
system call on the monitored process (which is achieved by the wait
shell command) because that is something only the actual parent of the monitored process can do. This is quite tricky to implement in a shell script.
So what's the solution then?
– user3901666
Mar 14 at 10:24
Edited after a bit of testing, thinking and re-testing. :-)
– telcoM
Mar 15 at 7:34
add a comment |
The problem is not necessarily the nohup
command, but the &
at the end of the command line.
The man page for bash
says:
If a command is terminated by the control operator &, the shell executes the command in the background in a subshell. The shell does not wait for the command to finish, and the return status is 0.
Technically, the shell forks a subshell to execute the nohup ...
command, and the main shell executing the script immediately proceeds to the next command in the script. So, when the main shell executes the line after the nohup ... &
line, the nohup command may not have exited yet: the subshell might be in the process of loading it for execution.
And the nohup
command does not necessarily exit as such: it sets the signal handler to ignore the HUP signal, and then attempts to directly exec()
the java ...
command. If the exec()
system call is successful, then the process that was nohup
transforms into java
without fork()
ing a new process and exiting, and so there will be no exit code to return yet.
The only way the actual nohup
command would return an exit code is if the java
command could not be found or executed, or if the nohup
command itself had an internal error. If the exec()
system call is successful, the only time an exit code will be available is when the java
command itself ends.
This is also why trying something like this does not do what you want:
nohup java -jar Xyz-port-0.0.1.jar &
wait $!
if [ $? -ne 0 ]; then
echo "Error in running Java"
fi
If nohup
fails in the background, wait
will pick up its exit code and allow it to be checked; but if nohup
is successful in starting java
, then the wait
command will keep waiting until the java
exits, since both nohup
and java
will be run in the same process one after another.
You might want to do something like this, as @datUser suggested:
    STARTWAIT=20  # number of seconds to wait after starting the service
    nohup java -jar Xyz-port-0.0.1.jar &
    NEWPID=$!
    for i in $(seq "$STARTWAIT"); do
        # test if the process still exists
        if ! kill -0 "$NEWPID" 2>/dev/null; then
            # Process has died: get its exit code, report error, stop waiting
            wait "$NEWPID"
            EXITCODE=$?
            echo "Process died after restart and reported result $EXITCODE" >&2
            break
        fi
        sleep 1
    done
    # silence kill's error message in case the process is already gone
    if kill -0 "$NEWPID" 2>/dev/null; then
        echo "The process was started and is still alive after $STARTWAIT seconds"
        echo "so I guess it's now running correctly"
    fi
Please note that my script solution has an inherent weakness: if the nohupped process dies and another process receives the same PID number within one second, this script will fail to notice that it's no longer the same process. (But if your PID numbers are getting recycled that fast, you probably have bigger problems.)
To solve the problem properly, the script would need to run the wait shell command on the monitored process (which performs the wait() system call, something only the actual parent of the monitored process can do), combined with a time-out mechanism of some sort that interrupts the wait at a pre-determined time. This is quite tricky to implement in a shell script.
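One possible shape for such a time-out, offered as a rough sketch rather than a tested production solution: a background timer sends SIGUSR1 to the script, and because a trap is set for that signal, the shell's wait returns early. The function name `start_and_watch` and the `WATCH_*` variables are hypothetical, and the redirection to /dev/null is only there to keep the demo quiet (a real service would more likely log to a file).

```shell
# Hypothetical helper: start "$@" under nohup and wait at most $1 seconds.
# Afterwards WATCH_STATUS holds the exit code if the command ended in time,
# or the word "running" if it was still alive when the timer fired.
start_and_watch() {
    limit=$1
    shift
    WATCH_TIMED_OUT=0
    status=0

    nohup "$@" >/dev/null 2>&1 &
    WATCH_PID=$!

    # A trapped signal makes the wait below return early.
    trap 'WATCH_TIMED_OUT=1' USR1

    # Timer: after $limit seconds, signal the main shell ($$ is its PID).
    ( sleep "$limit"; kill -USR1 "$$" ) >/dev/null 2>&1 &
    timer=$!

    # Either the command ends, or SIGUSR1 interrupts the wait.
    wait "$WATCH_PID" || status=$?

    if [ "$WATCH_TIMED_OUT" -eq 1 ]; then
        WATCH_STATUS=running
    else
        # The command ended first: stop the timer and keep the exit code.
        kill "$timer" 2>/dev/null || true
        wait "$timer" 2>/dev/null || true
        WATCH_STATUS=$status
    fi
    trap - USR1
}
```

It might be used as `start_and_watch 20 java -jar Xyz-port-0.0.1.jar`, followed by a check of `$WATCH_STATUS`. A small race remains if the timer fires at the very moment the command ends, so treat this as an illustration of the idea only.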
So what's the solution then?
– user3901666
Mar 14 at 10:24
Edited after a bit of testing, thinking and re-testing. :-)
– telcoM
Mar 15 at 7:34
edited Mar 15 at 7:33
answered Mar 14 at 10:20
telcoM