Proper way to use OnFailure in systemd

I have a service running software that generates some configuration files if they don't exist, and reads them if they do. The problem I have been facing is that these files sometimes get corrupted, making the software unable to start and thus making the service fail. In this case I would like to remove these files and restart the service.

I tried creating a service that should get executed in case of failure, by doing this:

[Service]
ExecStart=/bin/run_program
OnFailure=software-fail.service

where this service is:

[Service]
ExecStart=/bin/rm /file/to/delete
ExecStop=systemctl --user start software.service

The problem, however, is that software-fail.service doesn't start, even when software.service has failed.

I tried doing

systemctl --user enable software-fail.service

but then it starts every time the system starts, just like any other service.

My temporary solution is to use

ExecStopPost=/bin/rm /file/to/delete

but this is not a satisfying solution, as it always deletes the file when the service stops, whether or not it stopped because of a failure.

Output when failing:

● software.service - Software
 Loaded: loaded (/home/trippelganger/.config/systemd/user/software.service; enabled; vendor preset: enabled)
 Active: failed (Result: exit-code) since Fri 2018-05-04 09:05:26 CEST; 5s ago
 Process: 1839 ExecStart=/bin/run_program (code=exited, status=1/FAILURE)
 Main PID: 1839 (code=exited, status=1/FAILURE)



May 04 09:05:26 trippelganger systemd[595]: software.service: Main process exited, code=exited, status=1/FAILURE
May 04 09:05:26 trippelganger systemd[595]: software.service: Unit entered failed state.
May 04 09:05:26 trippelganger systemd[595]: software.service: Failed with result 'exit-code'.

Output of systemctl --user status software-fail.service is:

● software-fail.service - Delete corrupt files
 Loaded: loaded (/home/trippelganger/.config/systemd/user/software-fail.service; disabled; vendor preset: enabled)
 Active: inactive (dead)

edited May 5 at 3:01

asked May 3 at 15:07

trippelganger

133

Is there any journal output from your first service file? How about the second service file? That is, can you tell if the first service reaches the failure condition at all? You should update the question with the (relevant) journal output.
– Chiraag
May 3 at 16:26

@Chiraag Updated.
– trippelganger
May 4 at 7:10

@trippelganger Please edit your question to make some corrections. start_if_fail.service should probably be software-fail.service (from your later systemctl status output) and ExecStop=systemctl ... fails because the path is not absolute, so I imagine what you really have is ExecStop=/bin/systemctl ... (though it's possible this might also be part of the problem you're having.)
– Filipe Brandenburger
May 4 at 16:05

2 Answers

accepted

NOTE: You probably want to use ExecStopPost= instead of OnFailure= here (see my other answer), but this answer addresses why your OnFailure= setup is not working.

The reason OnFailure= is not starting the unit is probably that it's in the wrong section: it needs to be in the [Unit] section, not in [Service].

You can try this instead:

# software.service
[Unit]
Description=Software
OnFailure=software-fail.service

[Service]
ExecStart=/bin/run_program

And:

# software-fail.service
[Unit]
Description=Delete corrupt files

[Service]
ExecStart=/bin/rm /file/to/delete
ExecStop=/bin/systemctl --user start software.service

I can make it work with this setup.

But note that using OnFailure= is not ideal here, since you can't really tell why the program failed, and chaining another start of it from ExecStop= by calling /bin/systemctl start directly is pretty hacky. The solution using ExecStopPost= and checking the exit status is definitely superior.

If you define OnFailure= inside [Service], systemd (at least version 234 from Fedora 27) complains with:

software.service:6: Unknown lvalue 'OnFailure' in section 'Service'

Not sure whether you're seeing that in your logs (maybe the warning was only added in recent systemd versions?), but it would be a hint of what is going on. I hope this helps.
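If the warning doesn't show up in your logs, one way to see which section a directive ended up in is to scan the unit file yourself. Below is a minimal sketch; directive_section is a hypothetical helper name of my own, not part of systemd, and it assumes the directive starts at the beginning of a line with no spaces before the = sign.

```shell
#!/bin/sh
# Hypothetical helper (not a systemd tool): print the [Section] header under
# which a given directive appears in a unit file.
directive_section() {
    # $1: path to the unit file, $2: directive name, e.g. OnFailure
    awk -v key="$2" '
        /^\[.*\]$/ { section = $0; next }          # remember the current section header
        index($0, key "=") == 1 { print section }  # directive found at start of line
    ' "$1"
}
```

Run against the original unit from the question, directive_section ~/.config/systemd/user/software.service OnFailure would print [Service], which is the misplaced case. After moving the line, it should print [Unit].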

answered May 4 at 16:01

Filipe Brandenburger

3,451521

Thank you, what a stupid mistake! I did not see the "Unknown lvalue" text in my logs, using the latest systemd in Debian Stretch.
– trippelganger
May 5 at 3:05

In order to perform some cleanup if the service fails, you can use ExecStopPost=, which is executed whether the service succeeds or not.

In the code you run at ExecStopPost=, you can use one of $SERVICE_RESULT, $EXIT_CODE or $EXIT_STATUS to determine the failure condition and act accordingly. See the documentation on those environment variables to check which one is appropriate for you.
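To decide which variable fits, it can help to log all three once and look at what your program actually produces. A minimal sketch, assuming a small helper script that you call from ExecStopPost= (the function name describe_stop and wherever you install the script are placeholders of mine):

```shell
#!/bin/sh
# Hypothetical diagnostic helper meant to be called from ExecStopPost=.
# systemd sets SERVICE_RESULT, EXIT_CODE and EXIT_STATUS for these commands;
# "unset" is printed when the script is run outside of systemd.
describe_stop() {
    printf 'result=%s code=%s status=%s\n' \
        "${SERVICE_RESULT:-unset}" "${EXIT_CODE:-unset}" "${EXIT_STATUS:-unset}"
}

# Output on stdout ends up in the journal of the unit.
describe_stop
```

For the failure shown in the question (Result: exit-code, status=1/FAILURE), I would expect this to log result=exit-code code=exited status=1.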

Then you can use Restart=on-failure so that systemd tries to restart your unit when it fails.

Putting it all together, this is what it would look like. Assuming that run_program will exit with status 2 whenever the files are corrupted (hopefully you can adapt this to other failure scenarios from the documentation above), this should work:

[Service]
ExecStart=/bin/run_program
ExecStopPost=/bin/sh -c 'if [ "$$EXIT_STATUS" = 2 ]; then rm /file/to/delete; fi'
Restart=on-failure

(NOTE: The double dollar sign $$ escapes the dollar sign for systemd, so the shell sees $EXIT_STATUS and expands that variable itself. Using a single dollar sign would also work, but then systemd would do the replacement and the shell would see [ "2" = 2 ], which arguably also works. In any case, you can bypass most of that by putting all this logic into a shell script and calling it by its full path in ExecStopPost=. That is probably the better option, since you can then easily add more commands to the script, such as logging the action taken to recover from the error condition.)
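Such a script could look roughly like this. It is only a sketch: the function name cleanup_if_corrupt and the install path in the comment are made up, and exit status 2 is the same assumption as above about how run_program signals corrupt files.

```shell
#!/bin/sh
# Hypothetical script, e.g. installed as /usr/local/bin/software-cleanup and
# referenced as: ExecStopPost=/usr/local/bin/software-cleanup /file/to/delete

cleanup_if_corrupt() {
    # $1: file to remove when the main process exited with status 2
    # (status 2 meaning "corrupt files" is an assumption, adapt as needed)
    if [ "${EXIT_CODE:-}" = "exited" ] && [ "${EXIT_STATUS:-}" = "2" ]; then
        echo "run_program exited with status 2, removing $1"
        rm -f -- "$1"
    fi
}

# Only act when a file was passed on the command line.
if [ "$#" -ge 1 ]; then
    cleanup_if_corrupt "$1"
fi
```

Because the script reads the variables from its own environment, the $ versus $$ escaping question does not come up in the unit file at all.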

Hopefully this will give you enough pointers to figure out how to configure this correctly given your particular situation!

answered May 3 at 21:10

Filipe Brandenburger

3,451521

Thank you, this is a better way of doing my current solution. However, it does not explain how to properly use the OnFailure command in systemd, which was my original question. I find the systemd documentation can be a bit hard to interpret at times. Also, shouldn't I check for exit status 1, rather than 2?
– trippelganger
May 4 at 7:43

Checking exit status 2 was an example, in case your program does something specific in case of file corruption (some programs use different exit status to signal specific conditions.) A more general solution would be checking for "$$EXIT_STATUS" != 0, but the best is for you to look at your program's documentation (or source code) and look at the three variables systemd uses and try to figure out what's the best way to detect the error condition in your specific case.
– Filipe Brandenburger
May 4 at 15:41

Indeed I did not explain why OnFailure= was not working for you, I focused on the problem you were trying to solve and I think OnFailure= is not the best solution for that... I added another answer to address OnFailure=, it seems the problem is you have it in the wrong section ([Service], while it should be in [Unit]), I hope that's helpful to you as well. In any case, for this specific problem you described, ExecStopPost= and checking for exit status (or service result, etc.) is probably a better approach.
– Filipe Brandenburger
May 4 at 16:03

Isn't this kind of situation exactly what OnFailure is used for?
– trippelganger
May 5 at 3:04
