Issue with the exit code of a nohup command in unix

I have been trying to figure out how to capture the exit status of a command started with nohup, and then send a mail based on that status.

Below is my code:

    if [[ -n "$PID" ]]; then
        echo "killing $PID"
        kill -9 "$PID"
        nohup java -jar Xyz-port-0.0.1.jar &
        # << capture the exit code here, then send mail if the exit code is 0 >>
    else
        echo "Process doesn't exist"
    fi






linux shell-script nohup






edited Mar 15 at 19:49 by Rui F Ribeiro
asked Mar 14 at 9:43 by user3901666







  • Tangentially related: When should I not kill -9 a process?
    – Kusalananda
    Mar 14 at 10:02

  • @Kusalananda Not what I am looking for.
    – user3901666
    Mar 14 at 10:06












3 Answers
Something like this will capture the exit code of your process more reliably:

parent.sh

    # ...
    java -jar Xyz-port-0.0.1.jar > /dev/null &   # optionally discard stdout of java
    child=$!
    wait "$child"
    exit_status=$?
    if [[ "$exit_status" -ne 0 ]]; then
        :   # handle error
    else
        :   # handle success
    fi


Wait



Note that the call to wait will block indefinitely, so if you need to limit the amount of time you give the java command to complete, you might replace the call to wait with a loop and some manual checks to ps to see if the process is still running. This would allow you to set a watchdog and call everybody out of the pool if the process hangs.
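A minimal sketch of such a loop follows. It is not from the original answer: `sleep 2` stands in for the java command so the sketch runs anywhere, `limit` is a hypothetical name, and `kill -0` is used as a common alternative to parsing ps output. The loop polls once per second; if the child is still running when the limit is reached it is terminated, and wait then collects the exit status either way.

```shell
# Stand-in for: java -jar Xyz-port-0.0.1.jar > /dev/null &
sleep 2 &
child=$!
limit=5          # hypothetical time limit in seconds
elapsed=0

# Poll once per second until the child exits or the limit is reached.
while kill -0 "$child" 2>/dev/null && [ "$elapsed" -lt "$limit" ]; do
    sleep 1
    elapsed=$((elapsed + 1))
done

if kill -0 "$child" 2>/dev/null; then
    # Watchdog: the child is still there after $limit seconds.
    echo "still running after ${limit}s, terminating" >&2
    kill "$child"
fi

# wait does not block for long here; it yields the child's exit status,
# or 128 + signal number if the watchdog terminated it.
wait "$child"
exit_status=$?
echo "exit status: $exit_status"
```

Whether a hung process should be terminated or merely reported is a design choice; for a long-running service you would probably only report it.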



Nohup



If you need to run the java process and handle its exit code without being in an active terminal session, run your parent script (above) under nohup rather than the java command itself. The nohupped parent script will survive in the background with no controlling terminal and do the emailing or cleanup reliably.
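As a rough sketch of that arrangement: the file names parent.sh, parent.log and status.txt are illustrative, and `sh -c 'exit 3'` stands in for the java command. The wrapper records the exit status and is itself started under nohup.

```shell
# Work in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical wrapper script; sh -c 'exit 3' stands in for the java command.
cat > parent.sh <<'EOF'
sh -c 'exit 3' &
child=$!
wait "$child"
echo "exit status: $?" > status.txt   # mail or cleanup would go here
EOF

# Start the wrapper, not the java command, under nohup.
nohup sh ./parent.sh > parent.log 2>&1 &
wrapper=$!

# Only for this demonstration: wait for the wrapper so the result can be shown.
wait "$wrapper"
result=$(cat status.txt)
echo "$result"
```

In real use the calling shell would not wait for the wrapper; it would simply return to the prompt or exit.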







answered Mar 14 at 12:39 by datUser












  • Perfect, that's what I was looking for. However, I couldn't follow "you might replace the call to wait with a loop and some manual checks to ps to see if the process is still running". Can you explain a bit more?
    – user3901666
    Mar 15 at 5:32

















A background job is always started successfully:



    $ sntaoehu &
    [1] 33566
    $ bash: sntaoehu: command not found
    [1]+  Exit 127                sntaoehu
    $ echo $?
    0


You can't launch your Java program in the background and detect whether that went well or not until the nohup command has terminated. The nohup command will terminate if it can't find or start java, or whenever the Java program terminates. It is not nohup that runs your program in the background. It just makes your program ignore any HUP signal and will hang around until your program terminates, and then it returns the program's exit status to the invoking shell.
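The exit statuses described above can be checked with a small experiment. This is only a sketch: `sh -c 'exit 3'` stands in for a Java program that fails with status 3, and `no-such-command-xyz` is a made-up name assumed not to exist on the system.

```shell
# A command that runs and exits with status 3, started through nohup
# in the foreground: nohup passes the status through.
nohup sh -c 'exit 3' > /dev/null 2>&1
ran_status=$?
echo "command ran and exited with: $ran_status"

# A command that cannot be found: nohup itself reports the failure.
nohup no-such-command-xyz > /dev/null 2>&1
missing_status=$?
echo "command not found: $missing_status"
```

POSIX specifies 127 for a utility that cannot be found and 126 for one that is found but cannot be invoked, which is why those two values are worth treating separately from the program's own exit codes.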



To send an email reporting whether the job failed to launch or how it ended, you could do:

    (
        nohup java -jar Xyz-port-0.0.1.jar
        status=$?
        if [ "$status" -eq 126 ] || [ "$status" -eq 127 ]; then
            # something wrong in launching java
            echo 'nohup failed to run java'
            printf 'nohup exit status is %s\n' "$status"
        elif [ "$status" -ne 0 ]; then
            # the java code returned an error
            echo 'java returned an error'
            printf 'java exit status is %s\n' "$status"
        else
            # everything went well and java exited ok
            echo 'java exited safely'
        fi | mail -s 'java job status report' me@example.com
    ) &

echo 'started background job'


I.e., start a backgrounded subshell in which you run your program, then test whether nohup returned 126 or 127. It will do this if the java command could not be found or could not be started.







edited Mar 14 at 11:16
answered Mar 14 at 10:10 by Kusalananda












  • Can you please let me know how to put this in my code?
    – user3901666
    Mar 14 at 10:14

  • @user3901666 See updated answer.
    – Kusalananda
    Mar 14 at 10:19

  • As per the post below, wait might wait forever. The only way the actual nohup command would return an exit code is if the java command could not be found or executed, or if the nohup command itself had an internal error.
    – user3901666
    Mar 14 at 10:27

  • What's your take on that?
    – user3901666
    Mar 14 at 10:27

  • @user3901666 nohup will return a specific exit code if the Java program cannot be found, this is true. This will be a status of 127. If the Java program was started okay, then nohup would return the exit status of the Java program when that program terminates.
    – Kusalananda
    Mar 14 at 10:32


















The problem is not necessarily the nohup command, but the & at the end of the command line.



The man page for bash says:




If a command is terminated by the control operator &, the shell executes the command in the background in a subshell. The shell does not wait for the command to finish, and the return status is 0.




Technically, the shell forks a subshell to execute the nohup ... command, and the main shell executing the script immediately proceeds to the next command in the script. So, when the main shell executes the line after the nohup ... & line, the nohup command may not have exited yet: the subshell might be in the process of loading it for execution.
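The behaviour quoted from the man page can be observed directly. In this small sketch, `false` (which always exits with status 1) stands in for a command that fails; the variable names are illustrative.

```shell
# Starting a failing command in the background still reports status 0.
false &
launch_status=$?
echo "status right after '&': $launch_status"

# Only wait, given the job's PID, returns the command's real exit status.
wait $!
real_status=$?
echo "status from wait: $real_status"
```

So checking `$?` on the line after `nohup ... &` says nothing about the command itself; the status has to be collected later with wait.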



And the nohup command does not necessarily exit as such: it sets the signal handler to ignore the HUP signal, and then attempts to directly exec() the java ... command. If the exec() system call is successful, then the process that was nohup transforms into java without fork()ing a new process and exiting, and so there will be no exit code to return yet.



The only way the actual nohup command would return an exit code is if the java command could not be found or executed, or if the nohup command itself had an internal error. If the exec() system call is successful, the only time an exit code will be available is when the java command itself ends.
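One way to see the exec() behaviour described above is to compare the PID the shell reports for the background job with the PID the started command sees for itself. This is a sketch under the assumption that nohup execs the command directly, as the answer describes; `sh -c 'echo $$'` stands in for the java command.

```shell
out=$(mktemp)

# The inner sh prints its own PID; it is started through nohup in the background.
nohup sh -c 'echo $$' > "$out" 2>/dev/null &
job_pid=$!
wait "$job_pid"

inner_pid=$(cat "$out")
rm -f "$out"

echo "PID reported by \$!:      $job_pid"
echo "PID seen by the command: $inner_pid"
```

If the two numbers match, the command is running in the very process that started out as nohup, which is why its exit status is what wait eventually returns.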



This is also why trying something like this does not do what you want:



    nohup java -jar Xyz-port-0.0.1.jar &
    wait $!
    if [ $? -ne 0 ]; then
        echo "Error in running Java"
    fi


If nohup fails in the background, wait will pick up its exit code and allow it to be checked; but if nohup is successful in starting java, then the wait command will keep waiting until java exits, since nohup and java run in the same process, one after the other.



You might want to do something like this, as @datUser suggested:



    STARTWAIT=20   # number of seconds to wait after starting the service

    nohup java -jar Xyz-port-0.0.1.jar &
    NEWPID=$!
    for i in $(seq "$STARTWAIT"); do
        # test if the process still exists
        if ! kill -0 "$NEWPID" 2>/dev/null; then
            # Process has died: get its exit code, report error, stop waiting
            wait "$NEWPID"
            EXITCODE=$?
            echo "Process died after restart and reported result $EXITCODE" >&2
            break
        fi
        sleep 1
    done
    if kill -0 "$NEWPID" 2>/dev/null; then
        echo "The process was started and is still alive after $STARTWAIT seconds"
        echo "so I guess it's now running correctly"
    fi


Please note that my script solution has an inherent weakness: if the nohupped process dies and another process receives the same PID number within one second, this script will fail to notice that it's no longer the same process. (But if your PID numbers are getting recycled that fast, you probably have bigger problems.)



To solve the problem optimally, the script would need some kind of time-out mechanism that interrupts the wait command after a pre-determined time. The exit status can only be collected with the wait() system call (which is what the wait shell command does), and only the actual parent of the monitored process can make that call. This is quite tricky to implement in a shell script.
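One possible shape for such a time-out is sketched below. It is not a hardened implementation. It relies on the rule that a trapped signal makes the shell's wait return early, and uses a background timer that sends USR1 to the script. The function name watch_startup and the WATCH_* variables are invented for this example, and the function is assumed to be called from the main script, not from a subshell, because the timer signals $$.

```shell
# watch_startup SECONDS COMMAND [ARGS...]   (hypothetical helper)
# Starts COMMAND in the background and waits at most SECONDS for it to end.
# Afterwards WATCH_RESULT is "alive" or "died", WATCH_PID holds the PID, and
# WATCH_STATUS holds the exit status if the command died.
watch_startup() {
    watch_secs=$1
    shift
    "$@" &
    WATCH_PID=$!

    # A trapped signal makes `wait` return early with a status above 128.
    trap ':' USR1
    ( sleep "$watch_secs"; kill -USR1 "$$" ) >/dev/null 2>&1 &
    watch_timer=$!

    WATCH_STATUS=0
    wait "$WATCH_PID" || WATCH_STATUS=$?

    if kill -0 "$WATCH_PID" 2>/dev/null; then
        WATCH_RESULT=alive    # the timer interrupted the wait
    else
        WATCH_RESULT=died     # wait returned the command's own status
    fi

    # Stop the timer before restoring the default action for USR1.
    kill "$watch_timer" 2>/dev/null || true
    wait "$watch_timer" 2>/dev/null || true
    trap - USR1
}
```

It could be used as `watch_startup 20 nohup java -jar Xyz-port-0.0.1.jar`, followed by a test of `$WATCH_RESULT` to decide which mail to send. The sketch still has rough edges: an unrelated USR1 sent to the script would also end the wait early, and killing the timer can leave its sleep running until it expires on its own.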































  • So what's the solution then?

    – user3901666
    Mar 14 at 10:24











  • Edited after a bit of testing, thinking and re-testing. :-)

    – telcoM
    Mar 15 at 7:34























answered Mar 14 at 10:20 (edited Mar 15 at 7:33)
– telcoM




















