Wednesday, January 17, 2007

Perl, Windows and File Locking

For some Perl scripts you want to make sure that only one instance is running at any given time. A common way to do this is with a lockfile, along these lines:

use Fcntl qw(:flock);  # exports LOCK_EX
open(LockFile, ">$lock_file") or die("Failed to open lock file $lock_file, error: $^E");
flock(LockFile, LOCK_EX) or die("Failed to lock file $lock_file");

The idea is that the OS guarantees that only one process can hold an exclusive lock on a file at a time. (Warning: this is true only on local file systems! Don't try this with networked file systems like Samba or NFS!)
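As a rough sketch of the same idea in a more self-contained form, the snippet below wraps the two calls in a subroutine. The name acquire_lock, the lockfile path and the use of LOCK_NB (so that a second instance fails immediately instead of waiting) are my assumptions for illustration, not something taken from the original script:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Spec;

# Hypothetical helper: returns the locked file handle, or nothing if
# another process already holds the lock.
sub acquire_lock {
    my ($lock_file) = @_;
    open(my $fh, '>', $lock_file)
        or die "Failed to open lock file $lock_file: $!";
    # LOCK_NB makes flock return at once instead of blocking.
    flock($fh, LOCK_EX | LOCK_NB) or return;
    return $fh;
}

# Hypothetical lockfile location in the system temp directory.
my $lock_file = File::Spec->catfile(File::Spec->tmpdir(), 'example.lock');

my $lock = acquire_lock($lock_file)
    or die "Another instance is already running\n";

# ... real work goes here. The lock is held for as long as $lock stays
# open, and is released when the handle is closed or the process exits.
```

The caller has to keep the returned handle alive; if it goes out of scope, the file is closed and the lock is gone.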

All is nice and dandy until you try to execute other programs from inside your script. To make the problem more hands-on: let's say you have a master script which launches multiple instances of a child script in a fire-and-forget manner. Let's say that at some point the master dies (or is killed) and you try to restart it. If you are on Linux, no problem. On Windows, however, you may be greeted by the message "failed to acquire lockfile". So you fire up Process Explorer and make sure that no other instance of the master script is running. Still you get the same error when trying to start it up. The next step is to search for open handles to the given lockfile. Much to my surprise, I found that the instances of the child script had handles open to it.

How did they get it? The API for creating processes under Windows is CreateProcess (no surprise here), but look at the 5th parameter: BOOL bInheritHandles. Quote from MSDN:

[in] If this parameter is TRUE, each inheritable handle in the calling process is inherited by the new process. If the parameter is FALSE, the handles are not inherited. Note that inherited handles have the same value and access rights as the original handles.

Now the source of the problem became clear: the master script had a file handle open to the lockfile, which was inherited by each process it launched. While the child processes were running, you couldn't restart the master process. But why doesn't the problem appear on Linux? Because Larry Wall has thought of this problem, and decided (very sanely) that only three handles should be inherited by the child process by default: STDIN, STDOUT and STDERR. This is controlled by the $^F / $SYSTEM_FD_MAX variable: Perl marks every file descriptor above that value (2 by default) as close-on-exec. Unfortunately setting this variable has no effect whatsoever on Windows, since the file handles don't start at 0 there.
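To see the Unix behaviour for yourself, here is a minimal sketch (intended for Unixy systems only) that reads and changes the close-on-exec flag of a handle through fcntl. The helper name is_close_on_exec is made up for this example, and opening the null device is just a convenient stand-in for a lockfile:

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);
use File::Spec;

# Hypothetical helper: returns 1 if the handle is closed across exec, else 0.
sub is_close_on_exec {
    my ($fh) = @_;
    # fcntl returns "0 but true" when the flags are 0, so "or die" is safe.
    my $flags = fcntl($fh, F_GETFD, 0)
        or die "fcntl F_GETFD failed: $!";
    return ($flags & FD_CLOEXEC) ? 1 : 0;
}

open(my $fh, '<', File::Spec->devnull())
    or die "Cannot open the null device: $!";

# With the default $^F of 2, a freshly opened handle normally has the flag set.
print "fd ", fileno($fh), " close-on-exec: ", is_close_on_exec($fh), "\n";

# Clearing the flag lets children started with exec or system inherit the handle.
fcntl($fh, F_SETFD, 0) or die "fcntl F_SETFD failed: $!";
my $after_clear = is_close_on_exec($fh);

# Setting it again restores the usual "do not leak into children" behaviour.
fcntl($fh, F_SETFD, FD_CLOEXEC) or die "fcntl F_SETFD failed: $!";
my $after_set = is_close_on_exec($fh);
```

This is the flag that keeps a lockfile handle from leaking into child processes on Linux.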

The first solution which came to mind was to replace each call of system with Win32API::CreateProcess, however there was the possibility that I may miss some system calls in current or future scripts.

My second idea was to somehow intercept the fork event and close the handle to the lockfile in the child process. This was a better idea, since I could do it once and it would take effect everywhere. However, there was a problem: I couldn't find a way to detect the execution of fork.

Here is the third and final solution. Only the locking subroutine must be modified, and no additional burden is placed on the programmers using the subroutine (i.e. they don't have to remember to use some additional magic when running external programs):

First, include the appropriate library, but only if we are running Windows (to make the code cross platform compatible):

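# A plain "use" inside an if block still runs at compile time on every platform,
# so load the module conditionally with the "if" pragma instead.
use if $^O =~ /Win32/, 'Win32API::File' => qw(:Func :HANDLE_FLAG_);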

Now do the following after you opened the file handle to the lockfile:

if ($^O =~ /Win32/) {
  my $os_handle = GetOsFHandle(*LockFile);
  # Clear the inherit flag so that child processes do not receive this handle
  SetHandleInformation($os_handle, HANDLE_FLAG_INHERIT(), 0x0) if ($os_handle);
}

This tells Windows that the handle shouldn't be inherited by child processes, even when the bInheritHandles parameter of CreateProcess is set to true (remember, the documentation said each inheritable handle, not each handle). For more details see the SetHandleInformation API page. Also remember that not only file handles but other types of handles (like events or mutexes) are inherited as well, and this method applies to those too.
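Putting the pieces together, a locking subroutine could look roughly like the sketch below. The name lock_single_instance, the lockfile path, the lexical file handle and the non-blocking LOCK_NB flag are my own choices for illustration. I am also assuming that GetOsFHandle accepts a lexical handle the same way it accepts a glob; the Windows branch is the part to verify on a real Windows box:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Spec;

# Hypothetical name; returns the locked handle or dies.
sub lock_single_instance {
    my ($lock_file) = @_;
    open(my $fh, '>', $lock_file)
        or die "Failed to open lock file $lock_file, error: $^E";

    if ($^O =~ /Win32/) {
        # Loaded at run time so the script still compiles on other systems.
        require Win32API::File;
        Win32API::File->import(qw(:Func :HANDLE_FLAG_));
        # Assumption: a lexical handle works here like *LockFile does.
        my $os_handle = GetOsFHandle($fh);
        # The parentheses keep these as run-time sub calls under "use strict".
        SetHandleInformation($os_handle, HANDLE_FLAG_INHERIT(), 0)
            if $os_handle;
    }

    flock($fh, LOCK_EX | LOCK_NB)
        or die "Failed to lock file $lock_file\n";
    return $fh;    # the lock lives as long as this handle stays open
}

# Hypothetical lockfile location in the system temp directory.
my $lock_file = File::Spec->catfile(File::Spec->tmpdir(), 'master.lock');
my $lock = lock_single_instance($lock_file);

# Children started from here on should not keep the lockfile open.
system($^X, '-e', '1');
```

Callers only need to hold on to the returned handle; nothing special is required when they launch external programs.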

1 comment:

  1. Anonymous, 2:58 AM

    If you're on a Unixy system, and you're using exec (or system) to start the new process, then you'd want to look at fcntl, F_GETFD/F_SETFD, and FD_CLOEXEC instead, for the "close-on-exec" flag.