Correct locking in shell scripts?

Sometimes you have to make sure that only one instance of a shell script is running at the same time.



For example a cron job which is executed via crond that does not provide
locking on its own (e.g. the default Solaris crond).



A common pattern to implement locking is code like this:



#!/bin/sh
LOCK=/var/tmp/mylock
if [ -f $LOCK ]; then   # 'test' -> race begin
    echo "Job is already running!"
    exit 6
fi
touch $LOCK             # 'set' -> race end
# do some work
rm $LOCK


Of course, such code has a race condition. There is a time window in which two
instances can both pass the test on line 3 before either one is able to
touch the $LOCK file.
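The window is easy to demonstrate by widening it artificially (a sketch; the sleep stands in for an unlucky scheduling delay, and the file names are illustrative):

```shell
#!/bin/sh
# racy.sh - the same test-and-set pattern, with the race window widened
LOCK=/var/tmp/mylock.demo
if [ -f "$LOCK" ]; then   # 'test'
    echo "Job is already running!"
    exit 6
fi
sleep 2                   # both instances can sit here at the same time ...
touch "$LOCK"             # ... and then both 'acquire' the lock
echo "got lock ($$)"
rm -f "$LOCK"
```

Starting two copies within the two-second window prints "got lock" twice: both pass the test before either touches the file.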



For a cron job this is usually not a problem because you have an interval of
minutes between two invocations.



But things can go wrong - for example when the lockfile is on an NFS server
that hangs. In that case several cron jobs can block on line 3 and queue up.
When the NFS server is reachable again, you have a thundering herd of jobs
running in parallel.



Searching on the web I found the tool lockrun which seems like a good
solution to that problem. With it you run a script that needs locking like
this:



$ lockrun --lockfile=/var/tmp/mylock myscript.sh


You can put this in a wrapper or use it from your crontab.



It uses lockf() (POSIX) if available and falls back to flock() (BSD). And lockf() support over NFS should be relatively widespread.



Are there alternatives to lockrun?



What about other cron daemons? Are there common cron daemons that support locking in a
sane way? A quick look at the man page of Vixie cron (the default on
Debian/Ubuntu systems) shows nothing about locking.



Would it be a good idea to include a tool like lockrun into coreutils?



In my opinion it implements a theme very similar to timeout, nice and friends.










  • Tangentially, and for the benefit of others who may consider your initial pattern Good Enough(tm), that shell code should possibly trap TERM in order to remove its lockfile when killed; and it seems to be good practice to store one's own pid in the lockfile, rather than just touching it.
    – Ulrich Schwarz, Oct 4 '11 at 19:13

  • possible duplicate of What Unix commands can be used as a semaphore/lock?
    – Shawn J. Goff, Oct 4 '11 at 20:57

  • @Shawn, not really, it does not mention crond and NFS.
    – maxschlepzig, Oct 4 '11 at 21:02

  • related question on SO: stackoverflow.com/questions/185451/…
    – maxschlepzig, Oct 4 '11 at 21:07

  • @Ulrich very belatedly, storing a PID in an NFS lockfile adds very little value. Even adding the hostname still doesn't really help with checking for a live process.
    – roaima, Sep 14 '15 at 21:15














shell-script cron nfs coreutils lock






edited May 7 '16 at 21:17

























asked Oct 4 '11 at 18:59









maxschlepzig

13 Answers

















43 votes (accepted)










Here's another way to do locking in shell script that can prevent the race condition you describe above, where two jobs may both pass line 3. The noclobber option will work in ksh and bash. Don't use set noclobber because you shouldn't be scripting in csh/tcsh. ;)



lockfile=/var/tmp/mylock

if ( set -o noclobber; echo "$$" > "$lockfile" ) 2> /dev/null; then

    trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT

    # do stuff here

    # clean up after yourself, and release your trap
    rm -f "$lockfile"
    trap - INT TERM EXIT
else
    echo "Lock Exists: $lockfile owned by $(cat $lockfile)"
fi


YMMV with locking on NFS (you know, when NFS servers are not reachable), but in general it's much more robust than it was 10 years ago.



If you have cron jobs that do the same thing at the same time, from multiple servers, but you only need 1 instance to actually run, then something like this might work for you.



I have no experience with lockrun, but having a pre-set lock environment prior to the script actually running might help. Or it might not. You're just setting the test for the lockfile outside your script in a wrapper, and theoretically, couldn't you just hit the same race condition if two jobs were called by lockrun at exactly the same time, just as with the 'inside-the-script' solution?



File locking is pretty much honor-system behavior anyway, and any script that doesn't check for the lockfile's existence prior to running will do whatever it's going to do. Just by putting in the lockfile test, and proper behavior, you'll be solving 99% of potential problems, if not 100%.



If you run into lockfile race conditions a lot, it may be an indicator of a larger problem, like not having your jobs timed right, or perhaps, if the interval is not as important as the job completing, your job is better suited to being daemonized.




EDIT BELOW - 2016-05-06 (if you're using KSH88)




Based on @Clint Pachl's comment below, if you use ksh88, use mkdir instead of noclobber. This mostly mitigates a potential race condition, but doesn't entirely eliminate it (though the risk is minuscule). For more information read the link that Clint posted below.



lockdir=/var/tmp/mylock
pidfile=/var/tmp/mylock/pid

if ( mkdir $lockdir ) 2> /dev/null; then
    echo $$ > $pidfile
    trap 'rm -rf "$lockdir"; exit $?' INT TERM EXIT
    # do stuff here

    # clean up after yourself, and release your trap
    rm -rf "$lockdir"
    trap - INT TERM EXIT
else
    echo "Lock Exists: $lockdir owned by $(cat $pidfile)"
fi


And, as an added advantage, if you need to create tmpfiles in your script, you can use the lockdir directory for them, knowing they will be cleaned up when the script exits.
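For example (a sketch along the lines of the code above; the work-file name is made up):

```shell
#!/bin/sh
lockdir=/var/tmp/mylock.demo.d

if mkdir "$lockdir" 2> /dev/null; then
    trap 'rm -rf "$lockdir"' EXIT
    # temporary files live inside the lock directory ...
    workfile=$lockdir/work.$$
    date > "$workfile"
    # ... so the EXIT trap removes them together with the lock
else
    echo "Lock Exists: $lockdir" >&2
    exit 6
fi
```

A single `rm -rf` in the trap releases the lock and cleans up every temp file in one step.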



For more modern bash, the noclobber method at the top should be suitable.






  • No, with lockrun you don't have a problem - when a NFS server hangs, all lockrun calls will hang (at least) in the lockf() system call - when it is back up all processes are resumed but only one process will win the lock. No race condition. I don't run into such problems with cron jobs a lot - the opposite is the case - but when this problem hits you it has the potential to create a lot of pain.
    – maxschlepzig, Oct 4 '11 at 20:15

  • I have accepted this answer because the method is safe and so far the most elegant one. I suggest a small variant: set -o noclobber && echo "$$" > "$lockfile" to get a safe fallback when the shell does not support the noclobber option.
    – maxschlepzig, Oct 7 '11 at 9:45

  • Good answer, but you should also 'kill -0' the value in the lockfile to ensure that the process that created the lock still exists.
    – Nigel Horne, Dec 17 '14 at 13:29

  • The noclobber option may be prone to race conditions. See mywiki.wooledge.org/BashFAQ/045 for some food for thought.
    – Clint Pachl, Jun 16 '15 at 5:09

  • Note: using noclobber (or -C) in ksh88 does not work because ksh88 does not use O_EXCL for noclobber. If you're running with a newer shell you may be OK...
    – jrw32982, Nov 17 '15 at 19:24

















14 votes
I prefer to use hard links.



lockfile=/var/lock/mylock
tmpfile=$lockfile.$$
echo $$ > $tmpfile
if ln $tmpfile $lockfile 2>&-; then
    echo locked
else
    echo locked by $(<$lockfile)
    rm $tmpfile
    exit
fi
trap "rm $tmpfile $lockfile" 0 1 2 3 15
# do what you need to
# do what you need to


Hard links are atomic over NFS and, for the most part, mkdir is as well. Using mkdir(2) or link(2) is about the same, at a practical level; I just prefer using hard links because more implementations of NFS allowed atomic hard links than atomic mkdir. With modern releases of NFS, you shouldn't have to worry about using either.






12 votes













I understand that mkdir is atomic, so perhaps:

lockdir=/var/tmp/myapp
if mkdir $lockdir; then
    # this is a new instance, store the pid
    echo $$ > $lockdir/PID
else
    echo Job is already running, pid $(<$lockdir/PID) >&2
    exit 6
fi

# then set traps to cleanup upon script termination
# ref http://www.shelldorado.com/goodcoding/tempfiles.html
trap 'rm -r "$lockdir" >/dev/null 2>&1' 0
trap "exit 2" 1 2 3 13 15






  • Ok, but I could not find information on whether mkdir() over NFS (>=3) is standardized to be atomic.
    – maxschlepzig, Oct 4 '11 at 21:05

  • @maxschlepzig RFC 1813 does not explicitly call out for mkdir to be atomic (it does for rename). In practice, it is known that some implementations are not. Related: an interesting thread, including a contribution by the author of GNU arch.
    – Gilles, Oct 4 '11 at 22:12

















8 votes













An easy way is to use lockfile, which usually comes with the procmail package.

LOCKFILE="/tmp/mylockfile.lock"
# try once to get the lock, else exit
lockfile -r 0 "$LOCKFILE" || exit 0

# here the actual job

rm -f "$LOCKFILE"





3 votes













sem, which comes as part of the GNU parallel tools, may be what you're looking for:

sem [--fg] [--id <id>] [--semaphoretimeout <secs>] [-j <num>] [--wait] command

As in:

sem --id my_semaphore --fg "echo 1 ; date ; sleep 3" &
sem --id my_semaphore --fg "echo 2 ; date ; sleep 3" &
sem --id my_semaphore --fg "echo 3 ; date ; sleep 3" &

outputting:

1
Thu 10 Nov 00:26:21 UTC 2016
2
Thu 10 Nov 00:26:24 UTC 2016
3
Thu 10 Nov 00:26:28 UTC 2016

Note that order isn't guaranteed. Also, the output isn't displayed until it finishes (irritating!). But even so, it's the most concise way I know to guard against concurrent execution, without worrying about lockfiles and retries and cleanup.






  • Does the locking offered by sem handle being shot down mid-execution?
    – roaima, Nov 10 '16 at 0:56

















2 votes













I use dtach.

$ dtach -n /tmp/socket long_running_task ; echo $?
0
$ dtach -n /tmp/socket long_running_task ; echo $?
dtach: /tmp/socket: Address already in use
1





1 vote













I use the command-line tool flock to manage locks in my bash scripts, as described here and here. I have used this simple method from the flock manpage to run some commands in a subshell...

(
    flock -n 9
    # ... commands executed under lock ...
) 9>/var/lock/mylockfile

In that example, it fails with an exit code of 1 if it can't acquire the lockfile. But flock can also be used in ways that don't require commands to be run in a sub-shell :-)






  • The flock() system call does not work over NFS.
    – maxschlepzig, Oct 5 '11 at 8:03

  • BSD has a similar tool, "lockf".
    – dubiousjim, Nov 14 '12 at 14:06

  • @dubiousjim, BSD lockf also calls flock() and is thus problematic over NFS. Btw, in the meantime, flock() on Linux now falls back to fcntl() when the file is located on a NFS mount; thus, in a Linux-only NFS environment flock() now does work over NFS.
    – maxschlepzig, May 7 '16 at 21:24

















1 vote













Don't use a file.

If your script is executed like this e.g.:

bash my_script

You can detect if it's running using:

running_proc=$(ps -C bash -o pid=,cmd= | grep my_script);
if [[ "$running_proc" != "$$ bash my_script" ]]; do
echo Already locked
exit 6
fi





  • Hm, the ps checking code runs from within my_script? In the case another instance is running - does not running_proc contain two matching lines? I like the idea, but of course - you will get false results when another user is running a script with the same name ...
    – maxschlepzig, Oct 5 '11 at 8:20

  • It also includes a race condition: if 2 instances execute the first line in parallel then none gets the 'lock' and both exit with status 6. This would be a kind of one-round mutual starvation. Btw, I am not sure why you use $! instead of $$ in your example.
    – maxschlepzig, Oct 5 '11 at 12:02

  • @maxschlepzig indeed, sorry about the incorrect $! vs. $$
    – frogstarr78, Oct 5 '11 at 17:29

  • @maxschlepzig to handle multiple users running the script, add euser= to the -o argument.
    – frogstarr78, Oct 5 '11 at 17:33

  • @maxschlepzig to prevent multiple lines you can also change the arguments to grep, or add additional "filters" (e.g. grep -v $$). Basically I was attempting to provide a different approach to the problem.
    – frogstarr78, Oct 5 '11 at 17:39

















0 votes













Using the FLOM (Free LOck Manager) tool, serializing commands becomes as easy as running

flom -- command_to_serialize

FLOM allows you to implement more sophisticated use cases (distributed locking, readers/writers, numeric resources, etc.) as explained here: http://sourceforge.net/p/flom/wiki/FLOM%20by%20examples/






0 votes













Here is something I sometimes add on a server to easily handle race conditions for any jobs on the machine. It is similar to Tim Kennedy's post, but this way you get race handling by adding only one row to each bash script that needs it.

Put the content below in e.g. /opt/racechecker/racechecker:

ZPROGRAMNAME=$(readlink -f $0)
EZPROGRAMNAME=`echo $ZPROGRAMNAME | sed 's/\//_/g'`
EZMAIL="/usr/bin/mail"
EZCAT="/bin/cat"

if [ -n "$EZPROGRAMNAME" ] ;then
    EZPIDFILE=/tmp/$EZPROGRAMNAME.pid
    if [ -e "$EZPIDFILE" ] ;then
        EZPID=$($EZCAT $EZPIDFILE)
        echo "" | $EZMAIL -s "$ZPROGRAMNAME already running with pid $EZPID" alarms@someemail.com >>/dev/null
        exit -1
    fi
    echo $$ >> $EZPIDFILE
    function finish {
        rm $EZPIDFILE
    }
    trap finish EXIT
fi

Here is how to use it. Note the row after the shebang:

#!/bin/bash
. /opt/racechecker/racechecker
echo "script is running"
sleep 120

The way it works is that it figures out the main bash script's file name and creates a pidfile under "/tmp". It also adds a listener for the finish signal. The listener will remove the pidfile when the main script finishes properly.

If instead a pidfile exists when an instance is launched, then the code inside the second if-statement will be executed. In this case I have decided to launch an alarm mail when that happens.

What if the script crashes

A further exercise would be to handle crashes. Ideally the pidfile should be removed even if the main script crashes for any reason; this is not done in my version above. That means if the script crashes, the pidfile has to be removed manually to restore functionality.

In case of system crash

It is a good idea to store the pidfile/lockfile under, for example, /tmp. That way your scripts will definitely continue to execute after a system crash, since the pidfiles will always be deleted on bootup.






  • Unlike Tim Kennedy's ansatz, your script DOES contain a race condition. This is because checking for the presence of the PIDFILE and its conditional creation is not done in an atomic operation.
    – maxschlepzig, Jun 13 '15 at 12:40

  • +1 on that! I will take this under consideration and modify my script.
    – ziggestardust, Jun 13 '15 at 13:28

















0 votes













          For actual use, you should use the top voted answer.



          However, I want to discuss some various broken and semi-workable approaches using ps and the many caveats they have, since I keep seeing people use them.



          This answer is really the answer to "Why not use ps and grep to handle locking in the shell?"



          Broken approach #1



          First, an approach given in another answer that has a few upvotes despite the fact that it does not (and could never) work and was clearly never tested:



          running_proc=$(ps -C bash -o pid=,cmd= | grep my_script);
          if [[ "$running_proc" != "$$ bash my_script" ]]; do
          echo Already locked
          exit 6
          fi


          Let's fix the syntax errors and the broken ps arguments and get:



          running_proc=$(ps -C bash -o pid,cmd | grep "$0");
          echo "$running_proc"
          if [[ "$running_proc" != "$$ bash $0" ]]; then
          echo Already locked
          exit 6
          fi


          This script will always exit 6, every time, no matter how you run it.



          If you run it with ./myscript, then the ps output will just be 12345 -bash, which doesn't match the required string 12345 bash ./myscript, so that will fail.



          If you run it with bash myscript, things get more interesting. The bash process forks to run the pipeline, and the child shell runs the ps and grep. Both the original shell and the child shell will show up in the ps output, something like this:



          25793 bash myscript
          25795 bash myscript


          That's not the expected output $$ bash $0, so your script will exit.



          Broken approach #2



          Now in all fairness to the user who wrote broken approach #1, I did something similar myself when I first tried this:



          if otherpids="$(pgrep -f "$0" | grep -vFx "$$")" ; then
          echo >&2 "There are other copies of the script running; exiting."
          ps >&2 -fq "$otherpids//$'n'/ " # -q takes about a tenth the time as -p
          exit 1
          fi


          This almost works. But the fact of forking to run the pipe throws this off. So this one will always exit, also.



          Unreliable approach #3



          pids_this_script="$(pgrep -f "$0")"
          if not_this_process="$(echo "$pids_this_script" | grep -vFx "$$")"; then
          echo >&2 "There are other copies of this script running; exiting."
          ps -fq "$not_this_process//$'n'/ "
          exit 1
          fi


          This version avoids the pipeline forking problem in approach #2 by first getting all PIDs that have the current script in their command line arguments, and then filtering that pidlist, separately, to omit the PID of the current script.



          This might work...providing no other process has a command line matching the $0, and providing the script is always called the same way (e.g. if it's called with a relative path and then an absolute path, the latter instance will not notice the former).



          Unreliable approach #4



          So what if we skip checking the full command line, since that might not indicate a script actually running, and check lsof instead to find all the processes that have this script open?



          Well, yes, this approach is actually not too bad:



          if otherpids="$(lsof -t "$0" | grep -vFx "$$")"; then
          echo >&2 "Error: There are other processes that have this script open - most likely other copies of the script running. Exiting to avoid conflicts."
          ps >&2 -fq "$otherpids//$'n'/ "
          exit 1
          fi


          Of course, if a copy of the script is running, then the new instance will start up just fine and you'll have two copies running.



          Or if the running script is modified (e.g. with Vim or with a git checkout), then the "new" version of the script will start up with no problem, since both Vim and git checkout result in a new file (a new inode) in place of the old one.



          However, if the script is never modified and never copied, then this version is pretty good. There's no race condition because the script file already has to be open before the check can be reached.



          There can still be false positives if another process has the script file open, but note that even if it's open for editing in Vim, vim doesn't actually hold the script file open so won't result in false positives.



          But remember, don't use this approach if the script might be edited or copied since you'll get false negatives i.e. multiple instances running at once - so the fact that editing with Vim doesn't give false positives kind of shouldn't matter to you. I mention it, though, because approach #3 does give false positives (i.e. refuses to start) if you have the script open with Vim.



          So what to do, then?



          The top-voted answer to this question gives a good solid approach.



          Perhaps you can write a better one...but if you don't understand all the problems and caveats with all the above approaches, you're not likely to write a locking method that avoids them all.
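
          As a ready-made alternative to lockrun on Linux, the flock(1) utility from util-linux wraps flock(2) for you. A minimal sketch (the lockfile path is illustrative, and like flock(2) itself this does not give you reliable locking over NFS):

          ```shell
          #!/bin/sh
          # Serialize a critical section with flock(1): open the lockfile on a
          # high-numbered fd, then try a non-blocking exclusive lock on that fd.
          lockfile=/var/tmp/mylock
          (
              flock -n 9 || { echo "Job is already running!" >&2; exit 6; }
              # do some work while holding the lock
              echo "lock held, doing work"
          ) 9>"$lockfile"
          ```

          The lock is released automatically when the subshell exits and fd 9 is closed, so no cleanup trap is needed.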

































            up vote
            -1
            down vote













            Check my script ...



            You may LOVE it....



            [rambabu@Server01 ~]$ sh Prevent_cron-OR-Script_against_parallel_run.sh
            Parallel RUN Enabled
            Now running
            Task completed in Parallel RUN...
            [rambabu@Server01 ~]$ cat Prevent_cron-OR-Script_against_parallel_run.sh
            #!/bin/bash
            #Created by RambabuKella
            #Date : 12-12-2013

            #LOCK file name
            Parallel_RUN="yes"
            #Parallel_RUN="no"
            PS_GREP=0
            LOCK=/var/tmp/mylock_`whoami`_"$0"
            #Checking for the process
            PS_GREP=`ps -ef | grep "sh $0" | grep -v grep | wc -l`
            if [ "$Parallel_RUN" == "no" ]; then
                echo "Parallel RUN Disabled"

                if [ -f "$LOCK" ] || [ "$PS_GREP" -gt 2 ]; then
                    echo -e "\nJob is already running OR LOCK file exists."
                    echo -e "\nDetails are:"
                    ps -ef | grep "$0" | grep -v grep
                    cat "$LOCK"
                    exit 6
                fi
                echo -e "LOCK file $LOCK created on: `date +%F-%H-%M`." &> "$LOCK"
                # do some work
                echo "Now running"
                echo "Task completed with single RUN ..."
                #done

                rm -v "$LOCK" 2>/dev/null
                exit 0
            else
                echo "Parallel RUN Enabled"

                # do some work
                echo "Now running"
                echo "Task completed in Parallel RUN..."
                #done

                exit 0
            fi
            echo "something wrong"
            exit 2
            [rambabu@Server01 ~]$
































              up vote
              -2
              down vote













              I offer the following solution, in a script named 'flocktest'



              #!/bin/bash
              export LOGFILE=`basename $0`.logfile
              logit () {
                  echo "$1" >> "$LOGFILE"
              }
              PROGPATH=$0
              (
                  flock -x -n 257
                  (($?)) && logit "'$PROGPATH' is already running!" && exit 0
                  logit "'$PROGPATH', proc($$): sleeping 30 seconds"
                  sleep 30
              ) 257<"$PROGPATH"




























                13 Answers
                13






                active

                oldest

                votes
















                up vote
                43
                down vote



                accepted










                Here's another way to do locking in a shell script that prevents the race condition you describe above, where two jobs may both pass line 3. The noclobber option works in ksh and bash. Don't use set noclobber, because you shouldn't be scripting in csh/tcsh. ;)



                lockfile=/var/tmp/mylock

                if ( set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null; then

                trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT

                # do stuff here

                # clean up after yourself, and release your trap
                rm -f "$lockfile"
                trap - INT TERM EXIT
                else
                echo "Lock Exists: $lockfile owned by $(cat $lockfile)"
                fi


                YMMV with locking on NFS (you know, when NFS servers are not reachable), but in general it's much more robust than it used to be. (10 years ago)



                If you have cron jobs that do the same thing at the same time, from multiple servers, but you only need 1 instance to actually run, then something like this might work for you.



                I have no experience with lockrun, but having a pre-set lock environment prior to the script actually running might help. Or it might not. You're just setting the test for the lockfile outside your script in a wrapper, and theoretically, couldn't you just hit the same race condition if two jobs were called by lockrun at exactly the same time, just as with the 'inside-the-script' solution?



                File locking is pretty much honor system behavior anyways, and any scripts that don't check for the lockfile's existence prior to running will do whatever they're going to do. Just by putting in the lockfile test, and proper behavior, you'll be solving 99% of potential problems, if not 100%.



                If you run into lockfile race conditions a lot, it may be an indicator of a larger problem, like not having your jobs timed right, or perhaps if interval is not as important as the job completing, maybe your job is better suited to be daemonized.
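
                To see why the noclobber method is race-free, note that the redirect is a single atomic open with O_CREAT|O_EXCL, so only one attempt per lockfile can ever succeed. A quick self-contained check (the temp path is purely illustrative):

                ```shell
                #!/bin/sh
                # Repeat the locking attempt against the same lockfile:
                # only the very first redirect can succeed under noclobber.
                lockfile=/tmp/noclobber-demo.$$
                winners=0
                for i in 1 2 3 4 5; do
                    if ( set -o noclobber; echo "$$" > "$lockfile" ) 2>/dev/null; then
                        winners=$((winners + 1))
                    fi
                done
                echo "winners: $winners"   # prints: winners: 1
                rm -f "$lockfile"
                ```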




                EDIT BELOW - 2016-05-06 (if you're using KSH88)




                Based on @Clint Pachl's comment below, if you use ksh88, use mkdir instead of noclobber. This mostly mitigates a potential race condition, but doesn't entirely eliminate it (though the risk is minuscule). For more information, read the link that Clint posted below.



                lockdir=/var/tmp/mylock
                pidfile=/var/tmp/mylock/pid

                if ( mkdir $lockdir ) 2> /dev/null; then
                echo $$ > $pidfile
                trap 'rm -rf "$lockdir"; exit $?' INT TERM EXIT
                # do stuff here

                # clean up after yourself, and release your trap
                rm -rf "$lockdir"
                trap - INT TERM EXIT
                else
                echo "Lock Exists: $lockdir owned by $(cat $pidfile)"
                fi


                And, as an added advantage, if you need to create tmpfiles in your script, you can use the lockdir directory for them, knowing they will be cleaned up when the script exits.



                For more modern bash, the noclobber method at the top should be suitable.
























                • 1




                  No, with lockrun you don't have a problem - when a NFS server hangs, all lockrun calls will hang (at least) in the lockf() system call - when it is back up all processes are resumed but only one process will win the lock. No race condition. I don't run into such problems with cronjobs a lot - the opposite is the case - but this is a problem when it hits you it has the potential to create a lot of pain.
                  – maxschlepzig
                  Oct 4 '11 at 20:15






                • 1




                  I have accepted this answer because the method is safe and so far the most elegant one. I suggest a small variant: set -o noclobber && echo "$$" > "$lockfile" to get a safe fallback when the shell does not support the noclobber option.
                  – maxschlepzig
                  Oct 7 '11 at 9:45






                • 3




                  Good answer, but you should also 'kill -0' the value in lockfile to ensure that the process that created the lock still exists.
                  – Nigel Horne
                  Dec 17 '14 at 13:29






                • 1




                  The noclobber option may be prone to race conditions. See mywiki.wooledge.org/BashFAQ/045 for some food for thought.
                  – Clint Pachl
                  Jun 16 '15 at 5:09






                • 2




                  Note: using noclobber (or -C) in ksh88 does not work because ksh88 does not use O_EXCL for noclobber. If you're running with a newer shell you may be OK...
                  – jrw32982
                  Nov 17 '15 at 19:24














                edited May 6 '16 at 14:52

























                answered Oct 4 '11 at 19:50









                Tim Kennedy

                14.1k22950



















                up vote
                14
                down vote













                I prefer to use hard links.



                lockfile=/var/lock/mylock
                tmpfile=$lockfile.$$
                echo $$ > $tmpfile
                if ln $tmpfile $lockfile 2>&-; then
                echo locked
                else
                echo locked by $(<$lockfile)
                rm $tmpfile
                exit
                fi
                trap "rm $tmpfile $lockfile" 0 1 2 3 15
                # do what you need to


                Hard links are atomic over NFS and, for the most part, mkdir is as well. Using mkdir(2) or link(2) is about the same at a practical level; I just prefer using hard links because more implementations of NFS allowed atomic hard links than atomic mkdir. With modern releases of NFS, you shouldn't have to worry about using either.
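
                The atomicity claim is easy to check locally: link(2) fails with EEXIST when the lock name already exists, so at most one ln can succeed. A minimal sketch (the paths are generated temp files, purely illustrative):

                ```shell
                #!/bin/sh
                # Demonstrate that ln (link(2)) fails once the lock name exists.
                tmpfile=$(mktemp)
                lockfile=$tmpfile.lock
                ln "$tmpfile" "$lockfile" && echo "first ln wins"
                ln "$tmpfile" "$lockfile" 2>/dev/null || echo "second ln loses: lock held"
                rm -f "$tmpfile" "$lockfile"
                ```

                Running it prints "first ln wins" followed by "second ln loses: lock held".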






























answered Oct 5 '11 at 1:57
Arcege




















                        up vote
                        12
                        down vote













I understand that mkdir is atomic, so perhaps:



lockdir=/var/tmp/myapp
if mkdir "$lockdir"; then
    # this is a new instance, store the pid
    echo $$ > "$lockdir/PID"
else
    echo "Job is already running, pid $(cat "$lockdir/PID")" >&2
    exit 6
fi

# then set traps to clean up upon script termination
# ref http://www.shelldorado.com/goodcoding/tempfiles.html
trap 'rm -r "$lockdir" >/dev/null 2>&1' 0
trap "exit 2" 1 2 3 13 15
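The same pattern can be wrapped in small acquire/release helpers so the script body stays readable; a sketch (the function names and the /tmp lock path are illustrative, not part of the original answer):

```shell
#!/bin/sh
# Sketch: mkdir-based locking behind reusable helper functions.
LOCKDIR=/tmp/myapp.demo.lock    # illustrative path
rm -rf "$LOCKDIR"               # demo only: start from a clean slate

acquire_lock() {
    # mkdir fails if the directory already exists, which makes
    # test-and-set a single atomic operation.
    if mkdir "$LOCKDIR" 2>/dev/null; then
        echo $$ > "$LOCKDIR/PID"
        return 0
    fi
    return 1
}

release_lock() {
    rm -rf "$LOCKDIR"
}

if acquire_lock; then
    trap 'release_lock' EXIT
    echo "got lock, pid $$"
else
    echo "Job is already running, pid $(cat "$LOCKDIR/PID")" >&2
    exit 6
fi
# do some work
```

Because the existence test and the acquisition are one mkdir call, there is no window for the race described in the question.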

























• Ok, but I could not find information on whether mkdir() over NFS (>=3) is standardized to be atomic.
  – maxschlepzig
  Oct 4 '11 at 21:05

• @maxschlepzig RFC 1813 does not explicitly call out for mkdir to be atomic (it does for rename). In practice, it is known that some implementations are not. Related: an interesting thread, including a contribution by the author of GNU arch.
  – Gilles
  Oct 4 '11 at 22:12














answered Oct 4 '11 at 19:37
glenn jackman





















                        up vote
                        8
                        down vote













An easy way is to use lockfile, which usually ships with the procmail package.



LOCKFILE="/tmp/mylockfile.lock"
# try once to get the lock, else exit
lockfile -r 0 "$LOCKFILE" || exit 0

# here the actual job

rm -f "$LOCKFILE"













edited May 7 '16 at 21:18
maxschlepzig

answered Apr 7 '14 at 22:04
jofel




















                                up vote
                                3
                                down vote













sem, which comes as part of the GNU parallel tools, may be what you're looking for:



sem [--fg] [--id <id>] [--semaphoretimeout <secs>] [-j <num>] [--wait] command


As in:



sem --id my_semaphore --fg "echo 1 ; date ; sleep 3" &
sem --id my_semaphore --fg "echo 2 ; date ; sleep 3" &
sem --id my_semaphore --fg "echo 3 ; date ; sleep 3" &


outputting:



1
Thu 10 Nov 00:26:21 UTC 2016
2
Thu 10 Nov 00:26:24 UTC 2016
3
Thu 10 Nov 00:26:28 UTC 2016


Note that order isn't guaranteed. Also, the output isn't displayed until a command finishes (irritating!). But even so, it's the most concise way I know to guard against concurrent execution, without worrying about lockfiles, retries, and cleanup.


























                                • Does the locking offered by sem handle being shot down mid-execution?
                                  – roaima
                                  Nov 10 '16 at 0:56
























answered Nov 10 '16 at 0:31
Partly Cloudy





















                                up vote
                                2
                                down vote













                                I use dtach.



                                $ dtach -n /tmp/socket long_running_task ; echo $?
                                0
                                $ dtach -n /tmp/socket long_running_task ; echo $?
                                dtach: /tmp/socket: Address already in use
                                1







































edited May 31 '12 at 22:06
Michael Mrozek

answered May 31 '12 at 19:16
AndresVia




















                                        up vote
                                        1
                                        down vote













I use the command-line tool "flock" to manage locks in my bash scripts, as described here and here. I have used this simple method from the flock manpage to run some commands in a subshell...



(
    flock -n 9 || exit 1
    # ... commands executed under lock ...
) 9>/var/lock/mylockfile


In that example, the subshell exits with code 1 if it can't acquire the lock on the lockfile. But flock can also be used in ways that don't require the commands to run in a sub-shell :-)
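A sketch of the non-subshell form: open the lock file on a fixed file descriptor with exec, then flock that descriptor for the rest of the script. The descriptor number and the /tmp path are illustrative:

```shell
#!/bin/sh
# Sketch: flock without a subshell. The lock is held on fd 9
# until the script exits and the descriptor is closed.
LOCKFILE=/tmp/mylockfile.demo   # illustrative path

exec 9>"$LOCKFILE"
if ! flock -n 9; then
    echo "Job is already running!" >&2
    exit 6
fi

# do some work while holding the lock
echo locked
```

This avoids wrapping the whole job in ( ... ), which matters when the job needs to set variables or traps in the main shell.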






















• The flock() system call does not work over NFS.
  – maxschlepzig
  Oct 5 '11 at 8:03

• BSD has a similar tool, "lockf".
  – dubiousjim
  Nov 14 '12 at 14:06

• @dubiousjim, BSD lockf also calls flock() and is thus problematic over NFS. Btw, in the meantime, flock() on Linux now falls back to fcntl() when the file is located on an NFS mount; thus, in a Linux-only NFS environment flock() now does work over NFS.
  – maxschlepzig
  May 7 '16 at 21:24
























answered Oct 5 '11 at 1:16
dru8274

















                                        up vote
                                        1
                                        down vote













Don't use a file.

If your script is executed like this, e.g.:

bash my_script

you can detect if it's running using:

running_proc=$(ps -C bash -o pid=,cmd= | grep my_script)
if [[ "$running_proc" != "$$ bash my_script" ]]; then
    echo Already locked
    exit 6
fi





                                        share|improve this answer






















                                        • Hm, the ps checking code runs from within my_script? In case another instance is running, doesn't running_proc contain two matching lines? I like the idea, but of course you will get false results when another user is running a script with the same name ...
                                          – maxschlepzig
                                          Oct 5 '11 at 8:20

                                        • 3  It also includes a race condition: if two instances execute the first line in parallel, then neither gets the 'lock' and both exit with status 6. This would be a kind of one-round mutual starvation. Btw, I am not sure why you use $! instead of $$ in your example.
                                          – maxschlepzig
                                          Oct 5 '11 at 12:02

                                        • @maxschlepzig indeed, sorry about the incorrect $! vs. $$
                                          – frogstarr78
                                          Oct 5 '11 at 17:29

                                        • @maxschlepzig to handle multiple users running the script, add euser= to the -o argument.
                                          – frogstarr78
                                          Oct 5 '11 at 17:33

                                        • @maxschlepzig to prevent multiple lines you can also change the arguments to grep, or add additional "filters" (e.g. grep -v $$). Basically I was attempting to provide a different approach to the problem.
                                          – frogstarr78
                                          Oct 5 '11 at 17:39














                                        edited Oct 5 '11 at 17:29
                                        answered Oct 5 '11 at 5:52
                                        frogstarr78






                                        up vote
                                        0
                                        down vote

                                        Using the FLOM (Free LOck Manager) tool, serializing commands becomes as easy as running

                                        flom -- command_to_serialize

                                        FLOM allows you to implement more sophisticated use cases (distributed locking, readers/writers, numeric resources, etc.) as explained here: http://sourceforge.net/p/flom/wiki/FLOM%20by%20examples/
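Where FLOM is not installed, a similar one-shot serialization can be sketched on Linux with flock(1) from util-linux; this is an alternative, not FLOM itself, and the lock file path and command are illustrative:

```shell
# Run a command under an exclusive lock on a lock file.
# -n: fail immediately if the lock is held, instead of queuing up.
LOCK=/tmp/mylock.lck    # illustrative path
flock -n "$LOCK" -c 'echo "doing serialized work"' \
    || { echo "Job is already running!"; exit 6; }
```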
























                                            answered Apr 24 '14 at 20:37
                                            tiian
                                                up vote
                                                0
                                                down vote

                                                Here is something I sometimes add on a server to easily handle race conditions for any job on the machine.
                                                It is similar to Tim Kennedy's post, but this way you get race handling by adding only one row to each bash script that needs it.

                                                Put the content below in e.g. /opt/racechecker/racechecker:

                                                ZPROGRAMNAME=$(readlink -f $0)
                                                EZPROGRAMNAME=`echo $ZPROGRAMNAME | sed 's/\//_/g'`
                                                EZMAIL="/usr/bin/mail"
                                                EZCAT="/bin/cat"

                                                if [ -n "$EZPROGRAMNAME" ] ;then
                                                    EZPIDFILE=/tmp/$EZPROGRAMNAME.pid
                                                    if [ -e "$EZPIDFILE" ] ;then
                                                        EZPID=$($EZCAT $EZPIDFILE)
                                                        echo "" | $EZMAIL -s "$ZPROGRAMNAME already running with pid $EZPID" alarms@someemail.com >>/dev/null
                                                        exit -1
                                                    fi
                                                    echo $$ >> $EZPIDFILE
                                                    function finish {
                                                        rm $EZPIDFILE
                                                    }
                                                    trap finish EXIT
                                                fi

                                                Here is how to use it. Note the row after the shebang:

                                                #!/bin/bash
                                                . /opt/racechecker/racechecker
                                                echo "script is running"
                                                sleep 120

                                                The way it works is that it figures out the main bash script's file name and creates a pidfile under "/tmp". It also installs a trap on EXIT.
                                                The trap removes the pidfile when the main script finishes properly.

                                                If instead a pidfile already exists when an instance is launched, the body of the inner if-statement is executed. In this case I have decided to send an alarm mail when that happens.

                                                What if the script crashes

                                                A further exercise would be to handle crashes. Ideally the pidfile should be removed even if the main script crashes for any reason; this is not done in my version above. That means that if the script crashes, the pidfile has to be removed manually to restore functionality.

                                                In case of system crash

                                                It is a good idea to store the pidfile/lockfile under, for example, /tmp. This way your scripts will definitely continue to execute after a system crash, since the pidfiles will always be deleted on bootup.






















                                                • Unlike Tim Kennedy's ansatz, your script DOES contain a race condition. This is because your checking of the presence of the PIDFILE and its conditional creation is not done in an atomic operation.
                                                  – maxschlepzig
                                                  Jun 13 '15 at 12:40











                                                • +1 on that! I will take this under consideration and modify my script.
                                                  – ziggestardust
                                                  Jun 13 '15 at 13:28
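As the comment above points out, an exists-then-create sequence is racy. A common fix is to make the creation itself the test, e.g. with the shell's noclobber option, so that checking for and creating the pidfile happen in one atomic open(). This is a sketch; the lock path is illustrative:

```shell
#!/bin/sh
LOCK=/tmp/mylock.pid    # illustrative path

# set -C (noclobber): the redirection fails if $LOCK already exists,
# so test and creation collapse into a single atomic O_CREAT|O_EXCL open.
if ( set -C; echo $$ > "$LOCK" ) 2>/dev/null; then
    trap 'rm -f "$LOCK"' EXIT
    echo "got the lock, doing work"
else
    echo "Already running with pid $(cat "$LOCK")"
    exit 6
fi
```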














                                                up vote
                                                0
                                                down vote













                                                Here is something I sometimes add on a server to easily handle race conditions for any job's on the machine.
                                                It is simmilar to Tim Kennedy's post, but this way you get race handling by only adding one row to each bash script that needs it.



                                                Put the content below in e.g /opt/racechecker/racechecker :



                                                ZPROGRAMNAME=$(readlink -f $0)
                                                EZPROGRAMNAME=`echo $ZPROGRAMNAME | sed 's///_/g'`
                                                EZMAIL="/usr/bin/mail"
                                                EZCAT="/bin/cat"

                                                if [ -n "$EZPROGRAMNAME" ] ;then
                                                EZPIDFILE=/tmp/$EZPROGRAMNAME.pid
                                                if [ -e "$EZPIDFILE" ] ;then
                                                EZPID=$($EZCAT $EZPIDFILE)
                                                echo "" | $EZMAIL -s "$ZPROGRAMNAME already running with pid $EZPID" alarms@someemail.com >>/dev/null
                                                exit -1
                                                fi
                                                echo $$ >> $EZPIDFILE
                                                function finish
                                                rm $EZPIDFILE

                                                trap finish EXIT
                                                fi


                                                Here is how to use it. Note the row after the shebang:



                                                 #/bin/bash
                                                . /opt/racechecker/racechecker
                                                echo "script are running"
                                                sleep 120


                                                The way it works is that it figures out the main bashscript file name and creates a pidfile under "/tmp". It also adds a listener to the finish signal.
                                                The listener will remove the pidfile when the main script are properly finishing.



                                                Instead if a pidfile exist when an instance are launched, then the if statement containing the code inside the second if-statement will be executed. In this case I have decided to launch an alarm mail when this happens.



                                                What if the script crashes



                                                A further exercise would be to handle crashes. Ideally the pidfile should be removed even if the main-script crashes for any reason, this are not done in my version above. That means if the script crashes the pidfile would have to be manually removed to restore functionality.



                                                In case of system crash



                                                It is a good idea to store the pidfile/lockfile under for example /tmp. This way your scripts will definently continue to execute after a system crash since the pidfiles will always be deleted on bootup.






                                                share|improve this answer






















                                                • Unlike Tim Kennedy's ansatz, your script DOES contain a race condition. This is because your checking of the presence of the PIDFILE and its conditional creation is not done in an atomic operation.
                                                  – maxschlepzig
                                                  Jun 13 '15 at 12:40











                                                • +1 on that! I will take this under consideration and modify my script.
                                                  – ziggestardust
                                                  Jun 13 '15 at 13:28












                                                up vote
                                                0
                                                down vote










                                                up vote
                                                0
                                                down vote









                                                Here is something I sometimes add on a server to easily handle race conditions for any job's on the machine.
                                                It is simmilar to Tim Kennedy's post, but this way you get race handling by only adding one row to each bash script that needs it.



                                                Put the content below in e.g /opt/racechecker/racechecker :



                                                ZPROGRAMNAME=$(readlink -f $0)
                                                EZPROGRAMNAME=`echo $ZPROGRAMNAME | sed 's///_/g'`
                                                EZMAIL="/usr/bin/mail"
                                                EZCAT="/bin/cat"

                                                if [ -n "$EZPROGRAMNAME" ] ;then
                                                EZPIDFILE=/tmp/$EZPROGRAMNAME.pid
                                                if [ -e "$EZPIDFILE" ] ;then
                                                EZPID=$($EZCAT $EZPIDFILE)
                                                echo "" | $EZMAIL -s "$ZPROGRAMNAME already running with pid $EZPID" alarms@someemail.com >>/dev/null
                                                exit -1
                                                fi
                                                echo $$ >> $EZPIDFILE
                                                function finish
                                                rm $EZPIDFILE

                                                trap finish EXIT
                                                fi


                                                Here is how to use it. Note the row after the shebang:



                                                 #/bin/bash
                                                . /opt/racechecker/racechecker
                                                echo "script are running"
                                                sleep 120


                                                The way it works is that it figures out the main bashscript file name and creates a pidfile under "/tmp". It also adds a listener to the finish signal.
                                                The listener will remove the pidfile when the main script are properly finishing.



                                                Instead if a pidfile exist when an instance are launched, then the if statement containing the code inside the second if-statement will be executed. In this case I have decided to launch an alarm mail when this happens.



                                                What if the script crashes



                                                A further exercise would be to handle crashes. Ideally the pidfile should be removed even if the main-script crashes for any reason, this are not done in my version above. That means if the script crashes the pidfile would have to be manually removed to restore functionality.



                                                In case of system crash



                                                It is a good idea to store the pidfile/lockfile under for example /tmp. This way your scripts will definently continue to execute after a system crash since the pidfiles will always be deleted on bootup.






                                                share|improve this answer














                                                Here is something I sometimes add on a server to easily handle race conditions for any job's on the machine.
                                                It is simmilar to Tim Kennedy's post, but this way you get race handling by only adding one row to each bash script that needs it.



                                                Put the content below in e.g /opt/racechecker/racechecker :



                                                ZPROGRAMNAME=$(readlink -f $0)
                                                EZPROGRAMNAME=`echo $ZPROGRAMNAME | sed 's///_/g'`
                                                EZMAIL="/usr/bin/mail"
                                                EZCAT="/bin/cat"

                                                if [ -n "$EZPROGRAMNAME" ] ;then
                                                EZPIDFILE=/tmp/$EZPROGRAMNAME.pid
                                                if [ -e "$EZPIDFILE" ] ;then
                                                EZPID=$($EZCAT $EZPIDFILE)
                                                echo "" | $EZMAIL -s "$ZPROGRAMNAME already running with pid $EZPID" alarms@someemail.com >>/dev/null
                                                exit -1
                                                fi
                                                echo $$ >> $EZPIDFILE
                                                function finish
                                                rm $EZPIDFILE

                                                trap finish EXIT
                                                fi


                                                Here is how to use it. Note the row after the shebang:



                                                 #/bin/bash
                                                . /opt/racechecker/racechecker
                                                echo "script are running"
                                                sleep 120


                                                The way it works is that it figures out the main bashscript file name and creates a pidfile under "/tmp". It also adds a listener to the finish signal.
                                                The listener will remove the pidfile when the main script are properly finishing.



                                                Instead if a pidfile exist when an instance are launched, then the if statement containing the code inside the second if-statement will be executed. In this case I have decided to launch an alarm mail when this happens.



                                                What if the script crashes



A further exercise is handling crashes. Ideally the pidfile should be removed even if the main script dies for any reason. The trap on EXIT above covers most cases, since the handler also runs when the script aborts with an error. It does not run if the process is killed with SIGKILL (kill -9), though; in that case the pidfile has to be removed manually to restore functionality.



                                                In case of system crash



It is a good idea to store the pidfile/lockfile under, for example, /tmp. That way your scripts will definitely run again after a system crash, since /tmp is cleared on boot and any stale pidfiles disappear with it.
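As the comments below point out, the check-then-create above is not atomic. A race-free variant of the same pidfile idea can be sketched with the shell's noclobber option, which turns the existence test and the creation into a single exclusive open. This is a sketch, not part of the script above; LOCKFILE is an illustrative path:

```shell
#!/bin/sh
# Sketch: race-free pidfile creation using noclobber (set -C).
# LOCKFILE is an example path, not taken from the script above.
LOCKFILE=/tmp/myjob.pid

# With noclobber on, the redirection fails if LOCKFILE already exists,
# so "test for the file" and "create the file" happen atomically.
if ( set -C; echo $$ > "$LOCKFILE" ) 2>/dev/null; then
    trap 'rm -f "$LOCKFILE"' EXIT   # remove the pidfile on exit
    echo "lock acquired, doing work"
else
    echo "already running with pid $(cat "$LOCKFILE")" >&2
    exit 1
fi
```

Because the kernel performs the exclusive create, two instances can never both pass the check, even if they start at the same instant.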







                                                edited Jun 13 '15 at 10:09

























                                                answered Jun 13 '15 at 10:03









                                                ziggestardust

                                                173















                                                • Unlike Tim Kennedy's ansatz, your script DOES contain a race condition. This is because your checking of the presence of the PIDFILE and its conditional creation is not done in an atomic operation.
                                                  – maxschlepzig
                                                  Jun 13 '15 at 12:40











                                                • +1 on that! I will take this under consideration and modify my script.
                                                  – ziggestardust
                                                  Jun 13 '15 at 13:28









































                                                up vote
                                                0
                                                down vote













                                                For actual use, you should use the top voted answer.



However, I want to discuss the various broken and semi-workable approaches using ps, and the many caveats they have, since I keep seeing people use them.



                                                This answer is really the answer to "Why not use ps and grep to handle locking in the shell?"



                                                Broken approach #1



                                                First, an approach given in another answer that has a few upvotes despite the fact that it does not (and could never) work and was clearly never tested:



                                                running_proc=$(ps -C bash -o pid=,cmd= | grep my_script);
                                                if [[ "$running_proc" != "$$ bash my_script" ]]; do
                                                echo Already locked
                                                exit 6
                                                fi


                                                Let's fix the syntax errors and the broken ps arguments and get:



                                                running_proc=$(ps -C bash -o pid,cmd | grep "$0");
                                                echo "$running_proc"
                                                if [[ "$running_proc" != "$$ bash $0" ]]; then
                                                echo Already locked
                                                exit 6
                                                fi


                                                This script will always exit 6, every time, no matter how you run it.



                                                If you run it with ./myscript, then the ps output will just be 12345 -bash, which doesn't match the required string 12345 bash ./myscript, so that will fail.



                                                If you run it with bash myscript, things get more interesting. The bash process forks to run the pipeline, and the child shell runs the ps and grep. Both the original shell and the child shell will show up in the ps output, something like this:



                                                25793 bash myscript
                                                25795 bash myscript


                                                That's not the expected output $$ bash $0, so your script will exit.



                                                Broken approach #2



                                                Now in all fairness to the user who wrote broken approach #1, I did something similar myself when I first tried this:



if otherpids="$(pgrep -f "$0" | grep -vFx "$$")" ; then
    echo >&2 "There are other copies of the script running; exiting."
    ps >&2 -fq "${otherpids//$'\n'/ }" # -q takes about a tenth the time as -p
    exit 1
fi


This almost works. But forking to run the pipeline throws it off: the subshell that runs the pgrep also matches $0, and its PID is not $$, so this one always exits, too.



                                                Unreliable approach #3



pids_this_script="$(pgrep -f "$0")"
if not_this_process="$(echo "$pids_this_script" | grep -vFx "$$")"; then
    echo >&2 "There are other copies of this script running; exiting."
    ps -fq "${not_this_process//$'\n'/ }"
    exit 1
fi


                                                This version avoids the pipeline forking problem in approach #2 by first getting all PIDs that have the current script in their command line arguments, and then filtering that pidlist, separately, to omit the PID of the current script.



This might work...provided no other process has a command line matching $0, and provided the script is always invoked the same way (e.g. if it's started once with a relative path and once with an absolute path, the second instance will not notice the first).



                                                Unreliable approach #4



                                                So what if we skip checking the full command line, since that might not indicate a script actually running, and check lsof instead to find all the processes that have this script open?



                                                Well, yes, this approach is actually not too bad:



if otherpids="$(lsof -t "$0" | grep -vFx "$$")"; then
    echo >&2 "Error: There are other processes that have this script open - most likely other copies of the script running. Exiting to avoid conflicts."
    ps >&2 -fq "${otherpids//$'\n'/ }"
    exit 1
fi


                                                Of course, if a copy of the script is running, then the new instance will start up just fine and you'll have two copies running.



                                                Or if the running script is modified (e.g. with Vim or with a git checkout), then the "new" version of the script will start up with no problem, since both Vim and git checkout result in a new file (a new inode) in place of the old one.



                                                However, if the script is never modified and never copied, then this version is pretty good. There's no race condition because the script file already has to be open before the check can be reached.



                                                There can still be false positives if another process has the script file open, but note that even if it's open for editing in Vim, vim doesn't actually hold the script file open so won't result in false positives.



But remember: don't use this approach if the script might be edited or copied, since you'll get false negatives, i.e. multiple instances running at once - so the fact that editing with Vim doesn't give false positives shouldn't really matter to you. I mention it because approach #3 does give false positives (i.e. refuses to start) if you merely have the script open in Vim.
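The lsof -t check behind approach #4 can be exercised on its own: -t prints the bare PIDs of the processes holding the file open. A small self-contained demo (the temp file stands in for the script; the guard at the top is so the demo degrades gracefully where lsof is not installed):

```shell
#!/bin/bash
# Demo of the mechanism behind approach #4: lsof -t FILE prints the
# PIDs of processes that currently have FILE open.
command -v lsof >/dev/null 2>&1 || exit 0   # skip gracefully without lsof

tmp=$(mktemp)
exec 8> "$tmp"          # hold the file open on fd 8, like a running script

# Our own PID must show up in lsof's output ...
lsof -t "$tmp" | grep -qx "$$" && echo "this shell holds the file open"

# ... and filtering out $$ (exactly what approach #4 does) leaves nothing.
lsof -t "$tmp" | grep -vFx "$$" | grep -q . || echo "no other holders"

rm -f "$tmp"
```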



                                                So what to do, then?



                                                The top-voted answer to this question gives a good solid approach.



                                                Perhaps you can write a better one...but if you don't understand all the problems and caveats with all the above approaches, you're not likely to write a locking method that avoids them all.
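For illustration, one commonly recommended pattern of the solid kind referred to above is flock(1) from util-linux: take the lock on an open file descriptor, so the kernel releases it automatically when the process exits, crashes, or is killed. (That this resembles the referenced answer is an assumption; the lockfile path is illustrative.)

```shell
#!/bin/bash
# Sketch: locking with flock(1) from util-linux.
# /var/tmp/myjob.lock is an example path.
exec 9> /var/tmp/myjob.lock || exit 1

# -n: fail immediately instead of blocking if the lock is held.
if ! flock -n 9; then
    echo "another instance is already running" >&2
    exit 1
fi

# The kernel holds the lock as long as fd 9 is open, so it is
# released automatically on exit, crash, or kill -9 - no stale
# lockfile cleanup needed.
echo "doing work"
```

Note that the lockfile itself is never removed; only the lock on it matters, which sidesteps every race and stale-file problem discussed above.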






                                                share|improve this answer
























                                                  up vote
                                                  0
                                                  down vote













                                                  For actual use, you should use the top voted answer.



                                                  However, I want to discuss some various broken and semi-workable approaches using ps and the many caveats they have, since I keep seeing people use them.



                                                  This answer is really the answer to "Why not use ps and grep to handle locking in the shell?"



                                                  Broken approach #1



                                                  First, an approach given in another answer that has a few upvotes despite the fact that it does not (and could never) work and was clearly never tested:



                                                  running_proc=$(ps -C bash -o pid=,cmd= | grep my_script);
                                                  if [[ "$running_proc" != "$$ bash my_script" ]]; do
                                                  echo Already locked
                                                  exit 6
                                                  fi


                                                  Let's fix the syntax errors and the broken ps arguments and get:



                                                  running_proc=$(ps -C bash -o pid,cmd | grep "$0");
                                                  echo "$running_proc"
                                                  if [[ "$running_proc" != "$$ bash $0" ]]; then
                                                  echo Already locked
                                                  exit 6
                                                  fi


                                                  This script will always exit 6, every time, no matter how you run it.



                                                  If you run it with ./myscript, then the ps output will just be 12345 -bash, which doesn't match the required string 12345 bash ./myscript, so that will fail.



                                                  If you run it with bash myscript, things get more interesting. The bash process forks to run the pipeline, and the child shell runs the ps and grep. Both the original shell and the child shell will show up in the ps output, something like this:



                                                  25793 bash myscript
                                                  25795 bash myscript


                                                  That's not the expected output $$ bash $0, so your script will exit.



                                                  Broken approach #2



                                                  Now in all fairness to the user who wrote broken approach #1, I did something similar myself when I first tried this:



                                                  if otherpids="$(pgrep -f "$0" | grep -vFx "$$")" ; then
                                                  echo >&2 "There are other copies of the script running; exiting."
                                                  ps >&2 -fq "$otherpids//$'n'/ " # -q takes about a tenth the time as -p
                                                  exit 1
                                                  fi


                                                  This almost works. But the fact of forking to run the pipe throws this off. So this one will always exit, also.



                                                  Unreliable approach #3



                                                  pids_this_script="$(pgrep -f "$0")"
                                                  if not_this_process="$(echo "$pids_this_script" | grep -vFx "$$")"; then
                                                  echo >&2 "There are other copies of this script running; exiting."
                                                  ps -fq "$not_this_process//$'n'/ "
                                                  exit 1
                                                  fi


                                                  This version avoids the pipeline forking problem in approach #2 by first getting all PIDs that have the current script in their command line arguments, and then filtering that pidlist, separately, to omit the PID of the current script.



                                                  This might work...providing no other process has a command line matching the $0, and providing the script is always called the same way (e.g. if it's called with a relative path and then an absolute path, the latter instance will not notice the former).



                                                  Unreliable approach #4



                                                  So what if we skip checking the full command line, since that might not indicate a script actually running, and check lsof instead to find all the processes that have this script open?



                                                  Well, yes, this approach is actually not too bad:



                                                  if otherpids="$(lsof -t "$0" | grep -vFx "$$")"; then
                                                  echo >&2 "Error: There are other processes that have this script open - most likely other copies of the script running. Exiting to avoid conflicts."
                                                  ps >&2 -fq "$otherpids//$'n'/ "
                                                  exit 1
                                                  fi


                                                  Of course, if a copy of the script is running, then the new instance will start up just fine and you'll have two copies running.



                                                  Or if the running script is modified (e.g. with Vim or with a git checkout), then the "new" version of the script will start up with no problem, since both Vim and git checkout result in a new file (a new inode) in place of the old one.



                                                  However, if the script is never modified and never copied, then this version is pretty good. There's no race condition because the script file already has to be open before the check can be reached.



                                                  There can still be false positives if another process has the script file open, but note that even if it's open for editing in Vim, vim doesn't actually hold the script file open so won't result in false positives.



                                                  But remember, don't use this approach if the script might be edited or copied since you'll get false negatives i.e. multiple instances running at once - so the fact that editing with Vim doesn't give false positives kind of shouldn't matter to you. I mention it, though, because approach #3 does give false positives (i.e. refuses to start) if you have the script open with Vim.



                                                  So what to do, then?



                                                  The top-voted answer to this question gives a good solid approach.



                                                  Perhaps you can write a better one...but if you don't understand all the problems and caveats with all the above approaches, you're not likely to write a locking method that avoids them all.






                                                  share|improve this answer






















                                                    up vote
                                                    0
                                                    down vote










                                                    up vote
                                                    0
                                                    down vote









                                                    For actual use, you should use the top voted answer.



                                                    However, I want to discuss some various broken and semi-workable approaches using ps and the many caveats they have, since I keep seeing people use them.



                                                    This answer is really the answer to "Why not use ps and grep to handle locking in the shell?"



                                                    Broken approach #1



                                                    First, an approach given in another answer that has a few upvotes despite the fact that it does not (and could never) work and was clearly never tested:



                                                    running_proc=$(ps -C bash -o pid=,cmd= | grep my_script);
                                                    if [[ "$running_proc" != "$$ bash my_script" ]]; do
                                                    echo Already locked
                                                    exit 6
                                                    fi


                                                    Let's fix the syntax errors and the broken ps arguments and get:



                                                    running_proc=$(ps -C bash -o pid,cmd | grep "$0");
                                                    echo "$running_proc"
                                                    if [[ "$running_proc" != "$$ bash $0" ]]; then
                                                    echo Already locked
                                                    exit 6
                                                    fi


                                                    This script will always exit 6, every time, no matter how you run it.



                                                    If you run it with ./myscript, then the ps output will just be 12345 -bash, which doesn't match the required string 12345 bash ./myscript, so that will fail.



                                                    If you run it with bash myscript, things get more interesting. The bash process forks to run the pipeline, and the child shell runs the ps and grep. Both the original shell and the child shell will show up in the ps output, something like this:



                                                    25793 bash myscript
                                                    25795 bash myscript


                                                    That's not the expected output $$ bash $0, so your script will exit.



                                                    Broken approach #2



                                                    Now in all fairness to the user who wrote broken approach #1, I did something similar myself when I first tried this:



                                                    if otherpids="$(pgrep -f "$0" | grep -vFx "$$")" ; then
                                                    echo >&2 "There are other copies of the script running; exiting."
                                                    ps >&2 -fq "$otherpids//$'n'/ " # -q takes about a tenth the time as -p
                                                    exit 1
                                                    fi


                                                    This almost works. But the fact of forking to run the pipe throws this off. So this one will always exit, also.



                                                    Unreliable approach #3



                                                    pids_this_script="$(pgrep -f "$0")"
                                                    if not_this_process="$(echo "$pids_this_script" | grep -vFx "$$")"; then
                                                    echo >&2 "There are other copies of this script running; exiting."
                                                    ps -fq "$not_this_process//$'n'/ "
                                                    exit 1
                                                    fi


                                                    This version avoids the pipeline forking problem in approach #2 by first getting all PIDs that have the current script in their command line arguments, and then filtering that pidlist, separately, to omit the PID of the current script.



                                                    This might work...providing no other process has a command line matching the $0, and providing the script is always called the same way (e.g. if it's called with a relative path and then an absolute path, the latter instance will not notice the former).



                                                    Unreliable approach #4



                                                    So what if we skip checking the full command line, since that might not indicate a script actually running, and check lsof instead to find all the processes that have this script open?



                                                    Well, yes, this approach is actually not too bad:



                                                    if otherpids="$(lsof -t "$0" | grep -vFx "$$")"; then
                                                    echo >&2 "Error: There are other processes that have this script open - most likely other copies of the script running. Exiting to avoid conflicts."
                                                    ps >&2 -fq "$otherpids//$'n'/ "
                                                    exit 1
                                                    fi


                                                    Of course, if a copy of the script is running, then the new instance will start up just fine and you'll have two copies running.



                                                    Or if the running script is modified (e.g. with Vim or with a git checkout), then the "new" version of the script will start up with no problem, since both Vim and git checkout result in a new file (a new inode) in place of the old one.



                                                    However, if the script is never modified and never copied, then this version is pretty good. There's no race condition because the script file already has to be open before the check can be reached.



                                                    There can still be false positives if another process has the script file open, but note that even if it's open for editing in Vim, vim doesn't actually hold the script file open so won't result in false positives.



                                                    But remember, don't use this approach if the script might be edited or copied since you'll get false negatives i.e. multiple instances running at once - so the fact that editing with Vim doesn't give false positives kind of shouldn't matter to you. I mention it, though, because approach #3 does give false positives (i.e. refuses to start) if you have the script open with Vim.



                                                    So what to do, then?



                                                    The top-voted answer to this question gives a good solid approach.



                                                    Perhaps you can write a better one...but if you don't understand all the problems and caveats with all the above approaches, you're not likely to write a locking method that avoids them all.






                                                    share|improve this answer












                                                    For actual use, you should use the top voted answer.



                                                    However, I want to discuss some various broken and semi-workable approaches using ps and the many caveats they have, since I keep seeing people use them.



                                                    This answer is really the answer to "Why not use ps and grep to handle locking in the shell?"



                                                    Broken approach #1



                                                    First, an approach given in another answer that has a few upvotes despite the fact that it does not (and could never) work and was clearly never tested:



                                                    running_proc=$(ps -C bash -o pid=,cmd= | grep my_script);
                                                    if [[ "$running_proc" != "$$ bash my_script" ]]; do
                                                    echo Already locked
                                                    exit 6
                                                    fi


                                                    Let's fix the syntax errors and the broken ps arguments and get:



                                                    running_proc=$(ps -C bash -o pid,cmd | grep "$0");
                                                    echo "$running_proc"
                                                    if [[ "$running_proc" != "$$ bash $0" ]]; then
                                                    echo Already locked
                                                    exit 6
                                                    fi


                                                    This script will always exit 6, every time, no matter how you run it.



                                                    If you run it with ./myscript, then the ps output will just be 12345 -bash, which doesn't match the required string 12345 bash ./myscript, so that will fail.



                                                    If you run it with bash myscript, things get more interesting. The bash process forks to run the pipeline, and the child shell runs the ps and grep. Both the original shell and the child shell will show up in the ps output, something like this:



                                                    25793 bash myscript
                                                    25795 bash myscript


                                                    That's not the expected output $$ bash $0, so your script will exit.



                                                    Broken approach #2



                                                    Now in all fairness to the user who wrote broken approach #1, I did something similar myself when I first tried this:



                                                    if otherpids="$(pgrep -f "$0" | grep -vFx "$$")" ; then
                                                    echo >&2 "There are other copies of the script running; exiting."
                                                    ps >&2 -fq "$otherpids//$'n'/ " # -q takes about a tenth the time as -p
                                                    exit 1
                                                    fi


                                                    This almost works. But the fact of forking to run the pipe throws this off. So this one will always exit, also.



                                                    Unreliable approach #3



                                                    pids_this_script="$(pgrep -f "$0")"
                                                    if not_this_process="$(echo "$pids_this_script" | grep -vFx "$$")"; then
                                                    echo >&2 "There are other copies of this script running; exiting."
                                                    ps -fq "$not_this_process//$'n'/ "
                                                    exit 1
                                                    fi


                                                    This version avoids the pipeline forking problem in approach #2 by first getting all PIDs that have the current script in their command line arguments, and then filtering that pidlist, separately, to omit the PID of the current script.



                                                    This might work...providing no other process has a command line matching the $0, and providing the script is always called the same way (e.g. if it's called with a relative path and then an absolute path, the latter instance will not notice the former).



                                                    Unreliable approach #4



                                                    So what if we skip checking the full command line, since that might not indicate a script actually running, and check lsof instead to find all the processes that have this script open?



                                                    Well, yes, this approach is actually not too bad:



                                                    if otherpids="$(lsof -t "$0" | grep -vFx "$$")"; then
                                                    echo >&2 "Error: There are other processes that have this script open - most likely other copies of the script running. Exiting to avoid conflicts."
ps >&2 -fq "${otherpids//$'\n'/ }"
                                                    exit 1
                                                    fi


Of course, if the script file is copied and the copy is run, then the new instance will start up just fine and you'll have two copies running.



                                                    Or if the running script is modified (e.g. with Vim or with a git checkout), then the "new" version of the script will start up with no problem, since both Vim and git checkout result in a new file (a new inode) in place of the old one.
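The new-inode behavior can be checked directly. This sketch simulates the write-to-a-temp-file-then-rename pattern that such editors and git checkout typically use (the file names here are arbitrary):

```shell
# Writing a new file and renaming it over the old one (as Vim and
# git checkout typically do) replaces the inode, so an lsof-based
# check against the old file no longer sees the "same" script.
f=$(mktemp)
ino_before=$(ls -i "$f" | awk '{print $1}')

printf 'new contents\n' > "$f.new"
mv "$f.new" "$f"          # same name, new inode

ino_after=$(ls -i "$f" | awk '{print $1}')
[ "$ino_before" != "$ino_after" ] && echo "inode changed"
rm -f "$f"
```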



                                                    However, if the script is never modified and never copied, then this version is pretty good. There's no race condition because the script file already has to be open before the check can be reached.



                                                    There can still be false positives if another process has the script file open, but note that even if it's open for editing in Vim, vim doesn't actually hold the script file open so won't result in false positives.



But remember: don't use this approach if the script might be edited or copied, since you'll get false negatives, i.e. multiple instances running at once. (So the fact that editing with Vim doesn't give false positives shouldn't really matter to you. I mention it because approach #3 does give false positives, i.e. refuses to start, if you have the script open in Vim.)



                                                    So what to do, then?



                                                    The top-voted answer to this question gives a good solid approach.



                                                    Perhaps you can write a better one...but if you don't understand all the problems and caveats with all the above approaches, you're not likely to write a locking method that avoids them all.
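For reference, the flock(1) pattern from that top-voted answer looks roughly like this. This is a minimal sketch assuming util-linux flock is available (i.e. Linux); the lock-file path is an arbitrary example:

```shell
#!/bin/sh
# Minimal flock(1)-based mutual exclusion. /var/tmp/mylock is an
# example path. The kernel drops the lock when FD 9 is closed,
# even if the script crashes, so no stale-lockfile cleanup is needed.
LOCKFILE=/var/tmp/mylock

(
    # Take an exclusive lock on FD 9; -n fails fast instead of waiting.
    flock -n 9 || { echo >&2 "Another instance is running; exiting."; exit 1; }

    # ... do the real work while holding the lock ...
    echo "working"
) 9>"$LOCKFILE"
```

Because the lock is taken on an already-open file descriptor, there is no test-then-set window to race on.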







share|improve this answer

answered Dec 4 at 5:30

Wildcard
22.6k 9 61 164


                                                        up vote
                                                        -1
                                                        down vote













                                                        Check my script ...



                                                        You may LOVE it....



                                                        [rambabu@Server01 ~]$ sh Prevent_cron-OR-Script_against_parallel_run.sh
                                                        Parallel RUN Enabled
                                                        Now running
                                                        Task completed in Parallel RUN...
                                                        [rambabu@Server01 ~]$ cat Prevent_cron-OR-Script_against_parallel_run.sh
                                                        #!/bin/bash
                                                        #Created by RambabuKella
                                                        #Date : 12-12-2013

                                                        #LOCK file name
                                                        Parallel_RUN="yes"
                                                        #Parallel_RUN="no"
                                                        PS_GREP=0
LOCK=/var/tmp/mylock_`whoami`_`basename "$0"`
                                                        #Checking for the process
                                                        PS_GREP=`ps -ef |grep "sh $0" |grep -v grep|wc -l`
                                                        if [ "$Parallel_RUN" == "no" ] ;then
                                                        echo "Parallel RUN Disabled"

                                                        if [ -f $LOCK ] || [ $PS_GREP -gt 2 ] ;then
echo -e "\nJob is already running OR LOCK file exists."
echo -e "\nDetails are:"
                                                        ps -ef |grep "$0" |grep -v grep
                                                        cat "$LOCK"
                                                        exit 6
                                                        fi
                                                        echo -e "LOCK file " $LOCK " created on : `date +%F-%H-%M` ." &> $LOCK
                                                        # do some work
                                                        echo "Now running"
                                                        echo "Task completed on with single RUN ..."
                                                        #done

                                                        rm -v $LOCK 2>/dev/null
                                                        exit 0
                                                        else

                                                        echo "Parallel RUN Enabled"

                                                        # do some work
                                                        echo "Now running"
                                                        echo "Task completed in Parallel RUN..."
                                                        #done

                                                        exit 0
                                                        fi
echo "something wrong"
                                                        exit 2
                                                        [rambabu@Server01 ~]$





share|improve this answer

answered Dec 9 '13 at 22:08

user54178
1




















                                                                up vote
                                                                -2
                                                                down vote













                                                                I offer the following solution, in a script named 'flocktest'



#!/bin/bash
export LOGFILE=`basename $0`.logfile
logit () {
echo "$1" >>$LOGFILE
}

PROGPATH=$0
(
flock -x -n 257
(($?)) && logit "'$PROGPATH' is already running!" && exit 0
logit "'$PROGPATH', proc($$): sleeping 30 seconds"
sleep 30
) 257<$PROGPATH





share|improve this answer

edited Sep 19 '13 at 16:48

Anthon
60k 17 102 163

answered Sep 19 '13 at 16:26

Newton T Hammet Jr
1



























Thanks for contributing an answer to Unix & Linux Stack Exchange!