Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Faster locking

by bbs2web (Acolyte)
on Apr 03, 2017 at 15:51 UTC ( [id://1186844]=perlquestion: print w/replies, xml ) Need Help??

bbs2web has asked for the wisdom of the Perl Monks concerning the following question:

We have routers which don't support SNMP OIDs for BGP peer status information, so I wrote a fairly simple script. The API call is relatively 'expensive' so I cache the information and want to implement locking so that only a single process ever updates whilst the others reference the cached data.

Overview: if cache's modification time is less than 60 seconds old, or less th +an 120 seconds old when it's busy being updated { load cached hash else load information via API and save hash to speed up subsequent look + ups
my $cache_ttl = 60; my $cachefile = '/tmp/mtapi.'.$router.'.cache'; my $lockfile = '/tmp/mtapi.'.$router.'.lock'; my %bgp_peer; my $cached = 0; my $modified = (stat($cachefile))[9]; if (defined $modified) { if (($modified >= (time-$cache_ttl)) || ((-e $lockfile) && ($modifie +d >= (time-($cache_ttl*2))))) { if (%bgp_peer = %{lock_retrieve($cachefile)}) { $cached = 1 }; } } if ($cached == 0) { open (LOCK, ">$lockfile"); flock(LOCK, LOCK_EX); my $api = MikroTik::API->new ( { host => $router, username => $api_user, password => $api_passwd, use_ssl => 1, } ); %bgp_peer = $api->get_by_key('/routing/bgp/peer/print', 'name'); $api->logout(); lock_nstore \%bgp_peer, $cachefile; chmod 0660, $cachefile; close LOCK; unlink $lockfile; }

I however still see concurrent API logins happening, a lot more than the 'one in a million' that I had assumed...

PS: I'm really not a programmer, so I'd humbly accept any recommendations or suggestions.

Replies are listed 'Best First'.
Re: Faster locking
by shmem (Chancellor) on Apr 03, 2017 at 16:35 UTC

    You don't check whether a) the open b) the flock c) the close of filehandle FLOCK are successful, you don't check the unlink, so your script just silently ignores any misdeeds and whistles along, and you get no clue about what's going on.

    To avoid creating/removing the lockfile (takes time, too), you could just lock the script itself, using the DATA filehandle:

    if ($cached == 0) { exit unless flock(DATA, LOCK_EX | LOCK_NB); ... flock(DATA, LOCK_UN); } ... __END__ __DATA__

    The LOCK_NB is there so the script doesn't wait until a lock is granted.

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: Faster locking
by oiskuu (Hermit) on Apr 03, 2017 at 17:30 UTC

    I'll assume the lock_retrieve() and lock_nstore() both obtain a lock and then operate on the cachefile.

    First off, you have some problematic races. There is one between -e $lockfile and open (LOCK, ">$lockfile"); where multiple script instances can fall through to the non-cached path (and then sequentially update the cache file).

    Secondly, I think there's a race between close LOCK and unlink $lockfile. After the close, another process can obtain the lockfile, only for this to be promptly deleted, allowing a third process to grab the lock as well...

    Thirdly, the locking will needlessly hinder cached reads when it happens.

    Now about the fix. 1) Open the cache-file. 2) Stat the open file. 3) If age is less than 2*thresh, use the cached content. 4) If age is more than 1*thresh, trigger $update_needed. 5) When update is needed, launch the updater, or, if access delays are acceptable, perform the locked-update right there. 6) The updater can LOCK_EX|LOCK_NB the cache file, write a new tmpfile and rename that. Coded this way, the cached fast path needs no locking at all. HtH.

    Edit. added LOCK_NB above, probably the simplest way to ensure a single updater is run. flock+fork should be acceptable for running the update.

      Thank you! I dropped the .lock file and am now locking the script itself. It appears I can lock it again, with no apparent side effect, if it was locked earlier. Found a neat way of referencing the running script without using DATA but now can't find the site to reference it.

      The lock_retrieve and lock_nstore were intended to keep other processes waiting, whilst the updated data is being flushed to the cache file.

      Herewith the resulting script snippet:

      my $cache_ttl = 60; my $cachefile = '/tmp/mtapi.'.$router.'.cache'; my %bgp_peer; my $cached = 0; my $modified = (stat($cachefile))[9]; open our $lockfile, '<', $0 || die $!; if (defined $modified) { if (($modified >= (time-$cache_ttl)) || (!flock $lockfile, LOCK_EX | + LOCK_NB)) { if (%bgp_peer = %{lock_retrieve($cachefile)}) { $cached = 1 }; } } if ($cached == 0) { if (!flock $lockfile, LOCK_EX | LOCK_NB) { print STDERR "Another pro +cess is already using MikroTik API.\n"; exit 6 }; $api->connect(); $api->login(); %bgp_peer = $api->get_by_key('/routing/bgp/peer/print', 'name'); $api->logout(); lock_nstore \%bgp_peer, $cachefile; chmod 0660, $cachefile; flock $lockfile, LOCK_UN; }

        Could someone please confirm that Perl would skip trying to acquire a lock when the first 'if' condition is true?

        Should I rather place the test in to a nested condition, completely outside of the first?

        if ( (1 == 1) || (!flock $lockfile, LOCK_EX | LOCK_NB) )

        PS: Things appear to be working as expected, but I would still like to know, for my own peace of mind.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1186844]
Approved by marto
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having an uproarious good time at the Monastery: (6)
As of 2024-03-28 19:54 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found