linux - how to replace contemporary values in a file

Friday, 21 December 2018

I'm trying to write the values of two variables, generated by two independent scripts, into a single file, "file.cfg". Both variables are updated constantly and saved to "file.cfg". Below is an example of my setup.

example "file.cfg" content:

a=null
b=null

Example "script_a.sh", which updates the value of "a":

#!/bin/bash
while :; do
    .............
    val_a=1
    # Replace everything after "a=" with the new value, editing file.cfg in place.
    sed -i "s/^\(a=\).*/\1$val_a/" file.cfg
    .............
done

Example "script_b.sh", which updates the value of "b":

#!/bin/bash
while :; do
    .............
    val_b=2
    # Replace everything after "b=" with the new value, editing file.cfg in place.
    sed -i "s/^\(b=\).*/\1$val_b/" file.cfg
    .............
done

Each script works perfectly on its own and the values are updated. But if the two scripts run simultaneously, one of the two values is sometimes not updated.

I discovered that sed with the "-i" option writes to a temporary file and then renames it over the original, so two simultaneous operations overwrite each other's result. How can I solve this?

Answer

The other answer uses the idea of a lockfile. There is another utility for this: flock(1). From its manual:

flock [options] file|directory command [arguments]
flock [options] file|directory -c command
[…]

This utility manages flock(2) locks from within shell scripts or from the command line.

The first and second of the above forms wrap the lock around the execution of a command, in a manner similar to su(1) or newgrp(1). They lock a specified file or directory, which is created (assuming appropriate permissions) if it does not already exist. By default, if the lock cannot be immediately acquired, flock waits until the lock is available.

And because it uses the flock(2) system call, I believe the kernel guarantees that no two processes can hold an exclusive lock on the same file at the same time. From the flock(2) manual:

LOCK_EX Place an exclusive lock. Only one process may hold an exclusive lock for a given file at a given time.
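To see the exclusive lock in action, here is a minimal sketch. It assumes the file-descriptor form (flock 9) and the -n (non-blocking) and -u (unlock) options described in flock(1); none of them appear in the excerpts quoted here, and the lock file is just a scratch file from mktemp.

```shell
#!/bin/bash
# Sketch: hold an exclusive lock in this shell, then try to take it again
# from a separate flock process.
lock=$(mktemp)       # scratch lock file (assumption, for the demo only)

exec 9>"$lock"       # open the lock file on file descriptor 9
flock 9              # take an exclusive lock through that descriptor

# A second flock process opens the file on its own, so it conflicts.
# -n makes it fail immediately instead of waiting.
if flock -n "$lock" true; then while_held=acquired; else while_held=refused; fi

flock -u 9           # release the lock
if flock -n "$lock" true; then after_release=acquired; else after_release=refused; fi

echo "while held: $while_held, after release: $after_release"

exec 9>&-            # close the descriptor
rm -f "$lock"
```

Without -n, the second flock would simply wait until the lock is released, which is the default behavior described in the manual excerpt above.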

In your scripts, instead of sed … run flock some_lockfile sed …, e.g.

flock some_lockfile sed -i "s/^\(a=\).*/\1$val_a/" file.cfg

And that's it, the lock gets released when sed exits. The only disadvantages are:

some_lockfile may already be in use as a lockfile; the safe way is to use mktemp to create a temporary file and use it;

at the end you need to remove some_lockfile (I guess you don't want to leave it as garbage); but if anything else uses the file (probably not as a lockfile), you may not want to remove it; again, mktemp is the way to go: create a temporary file, use it, remove it – regardless of what other processes do.
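Putting this together, a rough sketch might look like the following. It assumes a single launcher that creates the lock file with mktemp and hands the same path to both writers; the update helper, the scratch directory and the two background loops (stand-ins for script_a.sh and script_b.sh) are illustrative only.

```shell
#!/bin/bash
# Sketch: one launcher owns the lock file; both writers use the same path.
workdir=$(mktemp -d)                 # scratch directory for the demo (assumption)
cfg="$workdir/file.cfg"
printf 'a=null\nb=null\n' > "$cfg"

lock=$(mktemp)                       # fresh file that nothing else uses as a lock

update() {                           # hypothetical helper: update KEY VALUE
    # Each sed -i runs only while this process holds the lock.
    flock "$lock" sed -i "s/^\($1=\).*/\1$2/" "$cfg"
}

# Stand-ins for script_a.sh and script_b.sh, running at the same time.
for i in $(seq 1 20); do update a "$i"; done &
for i in $(seq 1 20); do update b "$i"; done &
wait

result=$(cat "$cfg")
echo "$result"

rm -f "$lock"                        # remove the lock file once all writers are done
rm -rf "$workdir"
```

The important detail is that every writer must use the same lock file path. If each script called mktemp on its own, they would lock different files and the updates would race again.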

Why not flock file.cfg sed … then? It would lock the exact file that is operated on; this wouldn't leave garbage at all. Why not?

Well, because this is flawed. To understand it let's see what (GNU) sed -i exactly does:

-i[SUFFIX]
--in-place[=SUFFIX]

This option specifies that files are to be edited in-place. GNU sed does this by creating a temporary file and sending output to this file rather than to the standard output.

[…]

When the end of the file is reached, the temporary file is renamed to the output file’s original name. The extension, if supplied, is used to modify the name of the old file before renaming the temporary file, thereby making a backup copy.

I have tested that flock locks the inode rather than the name (path). This means that as soon as sed -i renames the temporary file to the original name (file.cfg in your case), that name points to a new inode and the lock no longer covers it.
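You can check the inode change yourself. This is a small sketch assuming GNU sed; the scratch directory is only there so the demo does not touch a real file.cfg.

```shell
#!/bin/bash
# Sketch: compare the inode number of a file before and after sed -i.
workdir=$(mktemp -d)                 # scratch directory (assumption, for the demo)
cfg="$workdir/file.cfg"
printf 'a=null\nb=null\n' > "$cfg"

inode_before=$(ls -i "$cfg" | awk '{print $1}')

val_a=1
sed -i "s/^\(a=\).*/\1$val_a/" "$cfg"

inode_after=$(ls -i "$cfg" | awk '{print $1}')
content_after=$(cat "$cfg")

echo "inode before: $inode_before, inode after: $inode_after"

rm -rf "$workdir"
```

The two numbers differ because the temporary file and the original exist side by side until the rename, so they cannot share an inode.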

Now consider the following scenario:

1. The first flock file.cfg sed -i … file.cfg locks the original file and works with it.

2. Before the first sed finishes, another flock file.cfg sed -i … file.cfg starts. This new flock targets the original file.cfg and waits for the first lock to be released.

3. The first sed moves its temporary file to the original name and exits. The first lock is released.

4. The second flock spawns the second sed, which now opens the new file.cfg. This is not the original file (it has a different inode). But the second flock targeted and locked the original file, not the one the second sed just opened!

5. Before the second sed finishes, a third flock file.cfg sed -i … file.cfg starts. This new flock checks the current file.cfg and finds it's not locked; it locks the file and spawns sed. The third sed begins to read the current file.cfg.

There are now two sed -i processes reading from the same file in parallel. Whichever ends first, loses – the other one will overwrite the results eventually by moving its independent copy to the original name.

That's why you need some_lockfile with a rock solid inode number.
