How to monitor the Pacemaker cluster using a script?

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP












0















I have created a two node cluster (both nodes RHEL 7) using pacemaker. It is used to run a custom application. I have created below resources and assigned it to the cluster:



  1. A shared storage for application data

  2. A virtual IP

It works perfectly fine.



Now, we have a requirement. Currently the failover happens only if something goes wrong with the entire server. Pacemaker is unaware of the status of the application running on the active node and completely ignores it. We have a shell script that is able to run a health check on the application and returns true/false values based on the health of the application.

Can anyone please suggest me how to configure pacemaker to use this shell script to regularly check status of the application on the active node of the cluster and initiate failover if script returns a false value.



I have seen examples, in webserver clusters people create a sample html page and use this (http://127.0.0.1/samplepage.html) as a resource with pacemaker to check the health of apache webserver in active node.



Please guide me how to achieve similar result using a shell script.



Update:



Here is my configuration:



[root@node1 ~]# pcs status
Cluster name: webspheremq
Stack: corosync
Current DC: node1 (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Jun 14 20:38:48 2017 Last change: Tue Jun 13 20:04:58 2017 by root via crm_attribute on svdg-stg29

2 nodes and 3 resources configured: 2 resources DISABLED and 0 BLOCKED from being started due to failures

Online: [ node1 node2 ]

Full list of resources:

Resource Group: websphere
websphere_fs (ocf::heartbeat:Filesystem): Started node1
websphere_vip (ocf::heartbeat:IPaddr2): Started node1
FailOverScript (ocf::heartbeat:Dummy): Started node1


Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled


To start and stop the application, I have two shell scripts. During failover, I would need stop.sh to run in the node from which resources will be moved and start.sh to run in the node to which cluster is failing over.



I did little experiment and found that people are using dummy resource to achieve this kind of requirements (to execute scripts during failover).



So here is what I have done so far:



I created a dummy resource (FailOverScript) for testing application start/stop scripts like below:



[root@node1 tmp]# pcs status resources
Resource Group: websphere
websphere_fs (ocf::heartbeat:Filesystem): Started node1
websphere_vip (ocf::heartbeat:IPaddr2): Started node1
**FailOverScript (ocf::heartbeat:Dummy): Started node1**


As of now, I included test scripts under start and stop actions of the resource FailOverScript. It should execute scripts failoverstartscript.sh and failoverstopscript.sh respectively when this dummy resource starts and stops.



[root@node1 heartbeat]# pwd
/usr/lib/ocf/resource.d/heartbeat
[root@node1 heartbeat]#
[root@node1 heartbeat]# grep -A5 "start()" FailOverScript
FailOverScript_start() {
FailOverScript_monitor
/usr/local/bin/failoverstartscript.sh
if [ $? = $OCF_SUCCESS ]; then
return $OCF_SUCCESS
fi
[root@node1 heartbeat]#
[root@node1 heartbeat]#
[root@node1 heartbeat]# grep -A5 "stop()" FailOverScript
FailOverScript_stop() {
FailOverScript_monitor
/usr/local/bin/failoverstopscript.sh
if [ $? = $OCF_SUCCESS ]; then
rm $OCF_RESKEY_state
fi


But when this dummy resource is started/stopped (through manual failover), the script does not execute. Tried different things but I am still unable to figure out the reason for this.
Need some help to find the reason for the scripts not to execute automatically during failover.










share|improve this question
























  • Can you share your configuration?

    – Matt Kereczman
    Jun 12 '17 at 16:23











  • ... and how you're starting your application? Ideally, you would configure your application in Pacemaker, which would allow Pacemaker to monitor the application AND the node.

    – Matt Kereczman
    Jun 12 '17 at 16:36











  • @MattKereczman I have added configuration details as update.

    – Vinod
    Jun 14 '17 at 16:46











  • That's just the view of the running resources, I want to see how they're configured: # pcs cluster cib > /tmp/cib.xml

    – Matt Kereczman
    Jun 14 '17 at 17:49











  • Please find the requested file here: ge.tt/28wrZIl2

    – Vinod
    Jun 15 '17 at 11:44















0















I have created a two node cluster (both nodes RHEL 7) using pacemaker. It is used to run a custom application. I have created below resources and assigned it to the cluster:



  1. A shared storage for application data

  2. A virtual IP

It works perfectly fine.



Now, we have a requirement. Currently the failover happens only if something goes wrong with the entire server. Pacemaker is unaware of the status of the application running on the active node and completely ignores it. We have a shell script that is able to run a health check on the application and returns true/false values based on the health of the application.

Can anyone please suggest me how to configure pacemaker to use this shell script to regularly check status of the application on the active node of the cluster and initiate failover if script returns a false value.



I have seen examples, in webserver clusters people create a sample html page and use this (http://127.0.0.1/samplepage.html) as a resource with pacemaker to check the health of apache webserver in active node.



Please guide me how to achieve similar result using a shell script.



Update:



Here is my configuration:



[root@node1 ~]# pcs status
Cluster name: webspheremq
Stack: corosync
Current DC: node1 (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Jun 14 20:38:48 2017 Last change: Tue Jun 13 20:04:58 2017 by root via crm_attribute on svdg-stg29

2 nodes and 3 resources configured: 2 resources DISABLED and 0 BLOCKED from being started due to failures

Online: [ node1 node2 ]

Full list of resources:

Resource Group: websphere
websphere_fs (ocf::heartbeat:Filesystem): Started node1
websphere_vip (ocf::heartbeat:IPaddr2): Started node1
FailOverScript (ocf::heartbeat:Dummy): Started node1


Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled


To start and stop the application, I have two shell scripts. During failover, I would need stop.sh to run in the node from which resources will be moved and start.sh to run in the node to which cluster is failing over.



I did little experiment and found that people are using dummy resource to achieve this kind of requirements (to execute scripts during failover).



So here is what I have done so far:



I created a dummy resource (FailOverScript) for testing application start/stop scripts like below:



[root@node1 tmp]# pcs status resources
Resource Group: websphere
websphere_fs (ocf::heartbeat:Filesystem): Started node1
websphere_vip (ocf::heartbeat:IPaddr2): Started node1
**FailOverScript (ocf::heartbeat:Dummy): Started node1**


As of now, I included test scripts under start and stop actions of the resource FailOverScript. It should execute scripts failoverstartscript.sh and failoverstopscript.sh respectively when this dummy resource starts and stops.



[root@node1 heartbeat]# pwd
/usr/lib/ocf/resource.d/heartbeat
[root@node1 heartbeat]#
[root@node1 heartbeat]# grep -A5 "start()" FailOverScript
FailOverScript_start() {
FailOverScript_monitor
/usr/local/bin/failoverstartscript.sh
if [ $? = $OCF_SUCCESS ]; then
return $OCF_SUCCESS
fi
[root@node1 heartbeat]#
[root@node1 heartbeat]#
[root@node1 heartbeat]# grep -A5 "stop()" FailOverScript
FailOverScript_stop() {
FailOverScript_monitor
/usr/local/bin/failoverstopscript.sh
if [ $? = $OCF_SUCCESS ]; then
rm $OCF_RESKEY_state
fi


But when this dummy resource is started/stopped (through manual failover), the script does not execute. Tried different things but I am still unable to figure out the reason for this.
Need some help to find the reason for the scripts not to execute automatically during failover.










share|improve this question
























  • Can you share your configuration?

    – Matt Kereczman
    Jun 12 '17 at 16:23











  • ... and how you're starting your application? Ideally, you would configure your application in Pacemaker, which would allow Pacemaker to monitor the application AND the node.

    – Matt Kereczman
    Jun 12 '17 at 16:36











  • @MattKereczman I have added configuration details as update.

    – Vinod
    Jun 14 '17 at 16:46











  • That's just the view of the running resources, I want to see how they're configured: # pcs cluster cib > /tmp/cib.xml

    – Matt Kereczman
    Jun 14 '17 at 17:49











  • Please find the requested file here: ge.tt/28wrZIl2

    – Vinod
    Jun 15 '17 at 11:44













0












0








0








I have created a two node cluster (both nodes RHEL 7) using pacemaker. It is used to run a custom application. I have created below resources and assigned it to the cluster:



  1. A shared storage for application data

  2. A virtual IP

It works perfectly fine.



Now, we have a requirement. Currently the failover happens only if something goes wrong with the entire server. Pacemaker is unaware of the status of the application running on the active node and completely ignores it. We have a shell script that is able to run a health check on the application and returns true/false values based on the health of the application.

Can anyone please suggest me how to configure pacemaker to use this shell script to regularly check status of the application on the active node of the cluster and initiate failover if script returns a false value.



I have seen examples, in webserver clusters people create a sample html page and use this (http://127.0.0.1/samplepage.html) as a resource with pacemaker to check the health of apache webserver in active node.



Please guide me how to achieve similar result using a shell script.



Update:



Here is my configuration:



[root@node1 ~]# pcs status
Cluster name: webspheremq
Stack: corosync
Current DC: node1 (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Jun 14 20:38:48 2017 Last change: Tue Jun 13 20:04:58 2017 by root via crm_attribute on svdg-stg29

2 nodes and 3 resources configured: 2 resources DISABLED and 0 BLOCKED from being started due to failures

Online: [ node1 node2 ]

Full list of resources:

Resource Group: websphere
websphere_fs (ocf::heartbeat:Filesystem): Started node1
websphere_vip (ocf::heartbeat:IPaddr2): Started node1
FailOverScript (ocf::heartbeat:Dummy): Started node1


Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled


To start and stop the application, I have two shell scripts. During failover, I would need stop.sh to run in the node from which resources will be moved and start.sh to run in the node to which cluster is failing over.



I did little experiment and found that people are using dummy resource to achieve this kind of requirements (to execute scripts during failover).



So here is what I have done so far:



I created a dummy resource (FailOverScript) for testing application start/stop scripts like below:



[root@node1 tmp]# pcs status resources
Resource Group: websphere
websphere_fs (ocf::heartbeat:Filesystem): Started node1
websphere_vip (ocf::heartbeat:IPaddr2): Started node1
**FailOverScript (ocf::heartbeat:Dummy): Started node1**


As of now, I included test scripts under start and stop actions of the resource FailOverScript. It should execute scripts failoverstartscript.sh and failoverstopscript.sh respectively when this dummy resource starts and stops.



[root@node1 heartbeat]# pwd
/usr/lib/ocf/resource.d/heartbeat
[root@node1 heartbeat]#
[root@node1 heartbeat]# grep -A5 "start()" FailOverScript
FailOverScript_start() {
FailOverScript_monitor
/usr/local/bin/failoverstartscript.sh
if [ $? = $OCF_SUCCESS ]; then
return $OCF_SUCCESS
fi
[root@node1 heartbeat]#
[root@node1 heartbeat]#
[root@node1 heartbeat]# grep -A5 "stop()" FailOverScript
FailOverScript_stop() {
FailOverScript_monitor
/usr/local/bin/failoverstopscript.sh
if [ $? = $OCF_SUCCESS ]; then
rm $OCF_RESKEY_state
fi


But when this dummy resource is started/stopped (through manual failover), the script does not execute. Tried different things but I am still unable to figure out the reason for this.
Need some help to find the reason for the scripts not to execute automatically during failover.










share|improve this question
















I have created a two node cluster (both nodes RHEL 7) using pacemaker. It is used to run a custom application. I have created below resources and assigned it to the cluster:



  1. A shared storage for application data

  2. A virtual IP

It works perfectly fine.



Now, we have a requirement. Currently the failover happens only if something goes wrong with the entire server. Pacemaker is unaware of the status of the application running on the active node and completely ignores it. We have a shell script that is able to run a health check on the application and returns true/false values based on the health of the application.

Can anyone please suggest me how to configure pacemaker to use this shell script to regularly check status of the application on the active node of the cluster and initiate failover if script returns a false value.



I have seen examples, in webserver clusters people create a sample html page and use this (http://127.0.0.1/samplepage.html) as a resource with pacemaker to check the health of apache webserver in active node.



Please guide me how to achieve similar result using a shell script.



Update:



Here is my configuration:



[root@node1 ~]# pcs status
Cluster name: webspheremq
Stack: corosync
Current DC: node1 (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Jun 14 20:38:48 2017 Last change: Tue Jun 13 20:04:58 2017 by root via crm_attribute on svdg-stg29

2 nodes and 3 resources configured: 2 resources DISABLED and 0 BLOCKED from being started due to failures

Online: [ node1 node2 ]

Full list of resources:

Resource Group: websphere
websphere_fs (ocf::heartbeat:Filesystem): Started node1
websphere_vip (ocf::heartbeat:IPaddr2): Started node1
FailOverScript (ocf::heartbeat:Dummy): Started node1


Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled


To start and stop the application, I have two shell scripts. During failover, I would need stop.sh to run in the node from which resources will be moved and start.sh to run in the node to which cluster is failing over.



I did little experiment and found that people are using dummy resource to achieve this kind of requirements (to execute scripts during failover).



So here is what I have done so far:



I created a dummy resource (FailOverScript) for testing application start/stop scripts like below:



[root@node1 tmp]# pcs status resources
Resource Group: websphere
websphere_fs (ocf::heartbeat:Filesystem): Started node1
websphere_vip (ocf::heartbeat:IPaddr2): Started node1
**FailOverScript (ocf::heartbeat:Dummy): Started node1**


As of now, I included test scripts under start and stop actions of the resource FailOverScript. It should execute scripts failoverstartscript.sh and failoverstopscript.sh respectively when this dummy resource starts and stops.



[root@node1 heartbeat]# pwd
/usr/lib/ocf/resource.d/heartbeat
[root@node1 heartbeat]#
[root@node1 heartbeat]# grep -A5 "start()" FailOverScript
FailOverScript_start() {
FailOverScript_monitor
/usr/local/bin/failoverstartscript.sh
if [ $? = $OCF_SUCCESS ]; then
return $OCF_SUCCESS
fi
[root@node1 heartbeat]#
[root@node1 heartbeat]#
[root@node1 heartbeat]# grep -A5 "stop()" FailOverScript
FailOverScript_stop() {
FailOverScript_monitor
/usr/local/bin/failoverstopscript.sh
if [ $? = $OCF_SUCCESS ]; then
rm $OCF_RESKEY_state
fi


But when this dummy resource is started/stopped (through manual failover), the script does not execute. Tried different things but I am still unable to figure out the reason for this.
Need some help to find the reason for the scripts not to execute automatically during failover.







rhel cluster pacemaker






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Jun 14 '17 at 16:42







Vinod

















asked Jun 9 '17 at 13:28









VinodVinod

6929




6929












  • Can you share your configuration?

    – Matt Kereczman
    Jun 12 '17 at 16:23











  • ... and how you're starting your application? Ideally, you would configure your application in Pacemaker, which would allow Pacemaker to monitor the application AND the node.

    – Matt Kereczman
    Jun 12 '17 at 16:36











  • @MattKereczman I have added configuration details as update.

    – Vinod
    Jun 14 '17 at 16:46











  • That's just the view of the running resources, I want to see how they're configured: # pcs cluster cib > /tmp/cib.xml

    – Matt Kereczman
    Jun 14 '17 at 17:49











  • Please find the requested file here: ge.tt/28wrZIl2

    – Vinod
    Jun 15 '17 at 11:44

















  • Can you share your configuration?

    – Matt Kereczman
    Jun 12 '17 at 16:23











  • ... and how you're starting your application? Ideally, you would configure your application in Pacemaker, which would allow Pacemaker to monitor the application AND the node.

    – Matt Kereczman
    Jun 12 '17 at 16:36











  • @MattKereczman I have added configuration details as update.

    – Vinod
    Jun 14 '17 at 16:46











  • That's just the view of the running resources, I want to see how they're configured: # pcs cluster cib > /tmp/cib.xml

    – Matt Kereczman
    Jun 14 '17 at 17:49











  • Please find the requested file here: ge.tt/28wrZIl2

    – Vinod
    Jun 15 '17 at 11:44
















Can you share your configuration?

– Matt Kereczman
Jun 12 '17 at 16:23





Can you share your configuration?

– Matt Kereczman
Jun 12 '17 at 16:23













... and how you're starting your application? Ideally, you would configure your application in Pacemaker, which would allow Pacemaker to monitor the application AND the node.

– Matt Kereczman
Jun 12 '17 at 16:36





... and how you're starting your application? Ideally, you would configure your application in Pacemaker, which would allow Pacemaker to monitor the application AND the node.

– Matt Kereczman
Jun 12 '17 at 16:36













@MattKereczman I have added configuration details as update.

– Vinod
Jun 14 '17 at 16:46





@MattKereczman I have added configuration details as update.

– Vinod
Jun 14 '17 at 16:46













That's just the view of the running resources, I want to see how they're configured: # pcs cluster cib > /tmp/cib.xml

– Matt Kereczman
Jun 14 '17 at 17:49





That's just the view of the running resources, I want to see how they're configured: # pcs cluster cib > /tmp/cib.xml

– Matt Kereczman
Jun 14 '17 at 17:49













Please find the requested file here: ge.tt/28wrZIl2

– Vinod
Jun 15 '17 at 11:44





Please find the requested file here: ge.tt/28wrZIl2

– Vinod
Jun 15 '17 at 11:44










1 Answer
1






active

oldest

votes


















2














Instead of trying to modify the Dummy RA to execute arbitrary scripts, you could instead look at using the anything resource-agent.



# pcs resource describe ocf:heartbeat:anything
ocf:heartbeat:anything - Manages an arbitrary service

This is a generic OCF RA to manage almost anything.

Resource options:
binfile (required): The full name of the binary to be executed.
This is expected to keep running with the
same pid and not just do something and
exit.
cmdline_options: Command line options to pass to the binary
workdir: The path from where the binfile will be executed.
pidfile: File to read/write the PID from/to.
logfile: File to write STDOUT to
errlogfile: File to write STDERR to
user: User to run the command as
monitor_hook: Command to run in monitor operation
stop_timeout: In the stop operation: Seconds to wait for kill
-TERM to succeed before sending kill -SIGKILL.
Defaults to 2/3 of the stop operation timeout.


You would point the anything agent at your script as the binfile= parameter, then, if you have some way of monitoring your custom application other than checking for a running pid (that's what the anything agent does by default), you can define that in the monitor_hook parameter.






share|improve this answer























  • Thanks @Matt Kereczman . Sorry, I could not work on this till now. Will check this and update the outcome. Before that one query please. It looks like anything RA is missing in my setup. This is what I found in Redhat site link .

    – Vinod
    Jun 20 '17 at 11:48











  • I wasn't aware they did that. I agree it's better to write a proper RA, but I'm not sure if I agree that using the 'systemd' unit file is any better or worse than the anything agent. The implied assumption that anyone can sit down and write an RA is a little far fetched; to be fair RHEL is geared towards enterprise, so for their audience, that likely isn't an issue. However, if you're still interested in the agent, you can get it from upstream: github.com/ClusterLabs/resource-agents/blob/master/heartbeat/…

    – Matt Kereczman
    Jun 20 '17 at 16:36










Your Answer








StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "106"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f370191%2fhow-to-monitor-the-pacemaker-cluster-using-a-script%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









2














Instead of trying to modify the Dummy RA to execute arbitrary scripts, you could instead look at using the anything resource-agent.



# pcs resource describe ocf:heartbeat:anything
ocf:heartbeat:anything - Manages an arbitrary service

This is a generic OCF RA to manage almost anything.

Resource options:
binfile (required): The full name of the binary to be executed.
This is expected to keep running with the
same pid and not just do something and
exit.
cmdline_options: Command line options to pass to the binary
workdir: The path from where the binfile will be executed.
pidfile: File to read/write the PID from/to.
logfile: File to write STDOUT to
errlogfile: File to write STDERR to
user: User to run the command as
monitor_hook: Command to run in monitor operation
stop_timeout: In the stop operation: Seconds to wait for kill
-TERM to succeed before sending kill -SIGKILL.
Defaults to 2/3 of the stop operation timeout.


You would point the anything agent at your script as the binfile= parameter, then, if you have some way of monitoring your custom application other than checking for a running pid (that's what the anything agent does by default), you can define that in the monitor_hook parameter.






share|improve this answer























  • Thanks @Matt Kereczman . Sorry, I could not work on this till now. Will check this and update the outcome. Before that one query please. It looks like anything RA is missing in my setup. This is what I found in Redhat site link .

    – Vinod
    Jun 20 '17 at 11:48











  • I wasn't aware they did that. I agree it's better to write a proper RA, but I'm not sure if I agree that using the 'systemd' unit file is any better or worse than the anything agent. The implied assumption that anyone can sit down and write an RA is a little far fetched; to be fair RHEL is geared towards enterprise, so for their audience, that likely isn't an issue. However, if you're still interested in the agent, you can get it from upstream: github.com/ClusterLabs/resource-agents/blob/master/heartbeat/…

    – Matt Kereczman
    Jun 20 '17 at 16:36















2














Instead of trying to modify the Dummy RA to execute arbitrary scripts, you could instead look at using the anything resource-agent.



# pcs resource describe ocf:heartbeat:anything
ocf:heartbeat:anything - Manages an arbitrary service

This is a generic OCF RA to manage almost anything.

Resource options:
binfile (required): The full name of the binary to be executed.
This is expected to keep running with the
same pid and not just do something and
exit.
cmdline_options: Command line options to pass to the binary
workdir: The path from where the binfile will be executed.
pidfile: File to read/write the PID from/to.
logfile: File to write STDOUT to
errlogfile: File to write STDERR to
user: User to run the command as
monitor_hook: Command to run in monitor operation
stop_timeout: In the stop operation: Seconds to wait for kill
-TERM to succeed before sending kill -SIGKILL.
Defaults to 2/3 of the stop operation timeout.


You would point the anything agent at your script as the binfile= parameter, then, if you have some way of monitoring your custom application other than checking for a running pid (that's what the anything agent does by default), you can define that in the monitor_hook parameter.






share|improve this answer























  • Thanks @Matt Kereczman . Sorry, I could not work on this till now. Will check this and update the outcome. Before that one query please. It looks like anything RA is missing in my setup. This is what I found in Redhat site link .

    – Vinod
    Jun 20 '17 at 11:48











  • I wasn't aware they did that. I agree it's better to write a proper RA, but I'm not sure if I agree that using the 'systemd' unit file is any better or worse than the anything agent. The implied assumption that anyone can sit down and write an RA is a little far fetched; to be fair RHEL is geared towards enterprise, so for their audience, that likely isn't an issue. However, if you're still interested in the agent, you can get it from upstream: github.com/ClusterLabs/resource-agents/blob/master/heartbeat/…

    – Matt Kereczman
    Jun 20 '17 at 16:36













2












2








2







Instead of trying to modify the Dummy RA to execute arbitrary scripts, you could instead look at using the anything resource-agent.



# pcs resource describe ocf:heartbeat:anything
ocf:heartbeat:anything - Manages an arbitrary service

This is a generic OCF RA to manage almost anything.

Resource options:
binfile (required): The full name of the binary to be executed.
This is expected to keep running with the
same pid and not just do something and
exit.
cmdline_options: Command line options to pass to the binary
workdir: The path from where the binfile will be executed.
pidfile: File to read/write the PID from/to.
logfile: File to write STDOUT to
errlogfile: File to write STDERR to
user: User to run the command as
monitor_hook: Command to run in monitor operation
stop_timeout: In the stop operation: Seconds to wait for kill
-TERM to succeed before sending kill -SIGKILL.
Defaults to 2/3 of the stop operation timeout.


You would point the anything agent at your script as the binfile= parameter, then, if you have some way of monitoring your custom application other than checking for a running pid (that's what the anything agent does by default), you can define that in the monitor_hook parameter.






share|improve this answer













Instead of trying to modify the Dummy RA to execute arbitrary scripts, you could instead look at using the anything resource-agent.



# pcs resource describe ocf:heartbeat:anything
ocf:heartbeat:anything - Manages an arbitrary service

This is a generic OCF RA to manage almost anything.

Resource options:
binfile (required): The full name of the binary to be executed.
This is expected to keep running with the
same pid and not just do something and
exit.
cmdline_options: Command line options to pass to the binary
workdir: The path from where the binfile will be executed.
pidfile: File to read/write the PID from/to.
logfile: File to write STDOUT to
errlogfile: File to write STDERR to
user: User to run the command as
monitor_hook: Command to run in monitor operation
stop_timeout: In the stop operation: Seconds to wait for kill
-TERM to succeed before sending kill -SIGKILL.
Defaults to 2/3 of the stop operation timeout.


You would point the anything agent at your script as the binfile= parameter, then, if you have some way of monitoring your custom application other than checking for a running pid (that's what the anything agent does by default), you can define that in the monitor_hook parameter.







share|improve this answer












share|improve this answer



share|improve this answer










answered Jun 15 '17 at 18:39









Matt KereczmanMatt Kereczman

56227




56227












  • Thanks @Matt Kereczman . Sorry, I could not work on this till now. Will check this and update the outcome. Before that one query please. It looks like anything RA is missing in my setup. This is what I found in Redhat site link .

    – Vinod
    Jun 20 '17 at 11:48











  • I wasn't aware they did that. I agree it's better to write a proper RA, but I'm not sure if I agree that using the 'systemd' unit file is any better or worse than the anything agent. The implied assumption that anyone can sit down and write an RA is a little far fetched; to be fair RHEL is geared towards enterprise, so for their audience, that likely isn't an issue. However, if you're still interested in the agent, you can get it from upstream: github.com/ClusterLabs/resource-agents/blob/master/heartbeat/…

    – Matt Kereczman
    Jun 20 '17 at 16:36

















  • Thanks @Matt Kereczman . Sorry, I could not work on this till now. Will check this and update the outcome. Before that one query please. It looks like anything RA is missing in my setup. This is what I found in Redhat site link .

    – Vinod
    Jun 20 '17 at 11:48











  • I wasn't aware they did that. I agree it's better to write a proper RA, but I'm not sure if I agree that using the 'systemd' unit file is any better or worse than the anything agent. The implied assumption that anyone can sit down and write an RA is a little far fetched; to be fair RHEL is geared towards enterprise, so for their audience, that likely isn't an issue. However, if you're still interested in the agent, you can get it from upstream: github.com/ClusterLabs/resource-agents/blob/master/heartbeat/…

    – Matt Kereczman
    Jun 20 '17 at 16:36
















Thanks @Matt Kereczman . Sorry, I could not work on this till now. Will check this and update the outcome. Before that one query please. It looks like anything RA is missing in my setup. This is what I found in Redhat site link .

– Vinod
Jun 20 '17 at 11:48





Thanks @Matt Kereczman . Sorry, I could not work on this till now. Will check this and update the outcome. Before that one query please. It looks like anything RA is missing in my setup. This is what I found in Redhat site link .

– Vinod
Jun 20 '17 at 11:48













I wasn't aware they did that. I agree it's better to write a proper RA, but I'm not sure if I agree that using the 'systemd' unit file is any better or worse than the anything agent. The implied assumption that anyone can sit down and write an RA is a little far fetched; to be fair RHEL is geared towards enterprise, so for their audience, that likely isn't an issue. However, if you're still interested in the agent, you can get it from upstream: github.com/ClusterLabs/resource-agents/blob/master/heartbeat/…

– Matt Kereczman
Jun 20 '17 at 16:36





I wasn't aware they did that. I agree it's better to write a proper RA, but I'm not sure if I agree that using the 'systemd' unit file is any better or worse than the anything agent. The implied assumption that anyone can sit down and write an RA is a little far fetched; to be fair RHEL is geared towards enterprise, so for their audience, that likely isn't an issue. However, if you're still interested in the agent, you can get it from upstream: github.com/ClusterLabs/resource-agents/blob/master/heartbeat/…

– Matt Kereczman
Jun 20 '17 at 16:36

















draft saved

draft discarded
















































Thanks for contributing an answer to Unix & Linux Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid


  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f370191%2fhow-to-monitor-the-pacemaker-cluster-using-a-script%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown






Popular posts from this blog

Peggy Mitchell

Palaiologos

The Forum (Inglewood, California)