PerlMonks  

Re: file name parsing to get the original file name

by Abigail-II (Bishop)
on Aug 20, 2003 at 08:00 UTC (#285124)


in reply to Re: Re: file name parsing to get the original file name
in thread file name parsing to get the original file name

Some quick benchmarking shows your solution to be about half as fast as mine.

#!/usr/bin/perl

use strict;
use warnings;
use Benchmark qw /cmpthese/;

our @files = qw {
    /etc/passwd
    one/two/three/four/five/six.a
    file
    a/very/deep/file/indeed/deeper/than/you/may/think/really
};

cmpthese -5 => {
    abigail => 'foreach my $f (@files) { my $fn = (split m{/} => $f) [-1] }',
    markm   => 'foreach my $f (@files) { my ($fn) = $f =~ /([^\/]+)\z/ }',
};

__END__
Benchmark: running abigail, markm for at least 5 CPU seconds...
   abigail:  5 wallclock secs ( 5.19 usr +  0.00 sys =  5.19 CPU) @ 68404.24/s (n=355018)
     markm:  6 wallclock secs ( 5.23 usr +  0.00 sys =  5.23 CPU) @ 34427.53/s (n=180056)
           Rate   markm abigail
markm   34428/s      --    -50%
abigail 68404/s     99%      --

Abigail
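
A benchmark only means something if both candidates compute the same thing. The following is a minimal sketch, not part of the original post, that checks the two benchmarked expressions against the same four sample paths. The helper names by_split and by_regex are hypothetical and exist only for this check.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Same sample paths as in the benchmark.
my @files = qw {
    /etc/passwd
    one/two/three/four/five/six.a
    file
    a/very/deep/file/indeed/deeper/than/you/may/think/really
};

# Hypothetical helpers, each wrapping one benchmarked expression.
sub by_split {
    my $f  = shift;
    my $fn = (split m{/} => $f) [-1];    # last field of the split
    return $fn;
}

sub by_regex {
    my $f    = shift;
    my ($fn) = $f =~ /([^\/]+)\z/;       # trailing run of non-slashes
    return $fn;
}

foreach my $f (@files) {
    my $s = by_split ($f);
    my $r = by_regex ($f);
    printf "%-58s %-8s %s\n", $f, $s, $s eq $r ? "same" : "DIFFERENT";
}
```

For these four paths both expressions give passwd, six.a, file and really, so the benchmark compares like with like on this input.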


Re: Re: file name parsing to get the original file name
by MarkM (Curate) on Aug 20, 2003 at 20:30 UTC

    Interesting. It looks like you've found yet another piece of Perl that isn't implemented in the most optimal manner. :-)

    Playing around, I found that on my system, the following tweak allows the 'single regexp match' to beat the 'split into a temporary list, and grab the last entry' approach by ~15%:

    $f =~ /(?:.*\/)?(.+)/s; my $fn = $1;

    I'm still surprised that Perl can match against '/' several times (split) faster than it can skip to the last '/' with a single rather simple match. It seems the Perl regular expression engine could still use a few optimizations. Until then, I suppose explicit optimization isn't that bad.

    Cheers,
    mark
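
The three expressions discussed so far agree on the benchmark paths, but they are not interchangeable on every input. The sketch below, which is an illustration and not something posted in the thread, runs them on a path with a trailing slash. The helper names are hypothetical, and by_tweak tests the match result because $1 would otherwise keep the value of an earlier successful match.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helpers, each wrapping one expression from the thread.
sub by_split {
    my $f  = shift;
    my $fn = (split m{/} => $f) [-1];
    return $fn;
}

sub by_regex {
    my $f    = shift;
    my ($fn) = $f =~ /([^\/]+)\z/;
    return $fn;
}

sub by_tweak {
    my $f = shift;
    # Only trust $1 when this match succeeded.
    return $f =~ /(?:.*\/)?(.+)/s ? $1 : undef;
}

# Render undef visibly instead of triggering a warning.
sub show {
    my $v = shift;
    return defined $v ? "'$v'" : 'undef';
}

foreach my $f (qw { /etc/passwd file one/two/ }) {
    printf "%-12s split=%-10s regex=%-10s tweak=%s\n",
        $f, show (by_split ($f)), show (by_regex ($f)), show (by_tweak ($f));
}
```

For one/two/ the split version yields 'two' (split drops trailing empty fields), the \z regex does not match at all, and the tweaked regex backtracks to capture 'two/'. Whether that matters depends on whether directory-style paths can appear in the input.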

      I'm still surprised that Perl can match against '/' several times (split) faster than it can skip to the last '/' with a single rather simple match.

      I'm not surprised. /([^\/]+)\z/ is a rather complicated regex because of the character class it uses. The optimizer can't figure out that it is the same as "looking for the last slash". m!/!, on the other hand, is so simple that the optimizer recognizes it as a search for a fixed string.

      Abigail
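
"Looking for the last slash" as a fixed-string search can also be written without any regex. The following is a sketch of that idea using the built-in rindex and substr. It is a swapped-in technique that nobody in the thread proposed or benchmarked, so no speed claim is made for it, and the helper name by_rindex is hypothetical.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: everything after the last '/' in the path.
sub by_rindex {
    my $f = shift;
    # rindex returns -1 when there is no slash, so adding 1 gives
    # offset 0 and the whole string is returned.
    return substr $f, rindex ($f, '/') + 1;
}

foreach my $f (qw { /etc/passwd one/two/three/four/five/six.a file }) {
    printf "%-32s %s\n", $f, by_rindex ($f);
}
```

Note that a path ending in a slash yields the empty string here, which differs from all three expressions in the thread.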
