Re: Re: file name parsing to get the original file name

If we are trying for 'best UNIX-only solution that requires no modules', I vote for:

my($name) = $path =~ /([^\/]+)\z/;

I second Abigail-II's suggestion that a module be used, though, as these sorts of problems are generic in nature, and it is very scary to see hundreds of different solutions to the same problem, each with its own independent set of failings.

At least if a single module is used by everybody, then the code is being exercised in a higher percentage of the possible contexts, and problems will be fixed sooner, rather than being discovered much later.
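
For reference, the module route could look something like the sketch below. It assumes the module meant is the core File::Basename (the post above does not name one), and the sample paths are made up for illustration.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical sample paths, chosen only for illustration.
my @paths = ('/etc/passwd', 'one/two/three/four/five/six.a', 'file');

# basename() returns the final path component, following the
# path syntax of the current operating system by default.
my @names = map { basename($_) } @paths;

print "$_\n" for @names;
```

The attraction is that edge cases are somebody else's already-tested problem rather than yet another hand-rolled regex.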

UPDATE: Optimizing the above expression improves the speed by a factor of 6:

$path =~ /(?:.*\/)?(.+)/s; my $name = $1;

It seems that the Perl regular expression engine does a poor job of dealing with matching a pattern at the end of a string. This is not surprising given that most regular expression engines start searching from the beginning of the string.
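
A small sketch for checking that the two expressions agree, rather than how fast they are. The helper names name_anchored and name_greedy and the sample paths are hypothetical. The second helper also tests the match before reading $1, since after a failed match $1 would still hold whatever an earlier successful match left there.

```perl
use strict;
use warnings;

# Hypothetical helper names wrapping the two expressions from the post.
sub name_anchored {
    my ($path) = @_;
    my ($name) = $path =~ /([^\/]+)\z/;
    return $name;
}

sub name_greedy {
    my ($path) = @_;
    # Check that the match succeeded before using $1, so a failed
    # match cannot hand back a stale capture.
    return $path =~ /(?:.*\/)?(.+)/s ? $1 : undef;
}

for my $path ('/etc/passwd', 'one/two/six.a', 'file', 'dir/') {
    my ($x, $y) = (name_anchored($path), name_greedy($path));
    printf "%-15s anchored=%s greedy=%s\n", $path,
        map { defined $_ ? "'$_'" : 'undef' } $x, $y;
}
```

The two agree on ordinary paths but not on a path with a trailing slash: for 'dir/' the first expression does not match at all, while the second captures 'dir/'.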

Re: file name parsing to get the original file name by Abigail-II (Bishop) on Aug 20, 2003 at 08:00 UTC
Some quick benchmarking shows your solution to be about half as fast as mine.

    #!/usr/bin/perl

    use strict;
    use warnings;

    use Benchmark qw /cmpthese/;

    our @files = qw {
        /etc/passwd
        one/two/three/four/five/six.a
        file
        a/very/deep/file/indeed/deeper/than/you/may/think/really
    };

    cmpthese -5 => {
        abigail => 'foreach my $f (@files) { my $fn = (split m{/} => $f) [-1] }',
        markm   => 'foreach my $f (@files) { my ($fn) = $f =~ /([^\/]+)\z/ }',
    };

    __END__
    Benchmark: running abigail, markm for at least 5 CPU seconds...
       abigail:  5 wallclock secs ( 5.19 usr +  0.00 sys =  5.19 CPU) @ 68404.24/s (n=355018)
         markm:  6 wallclock secs ( 5.23 usr +  0.00 sys =  5.23 CPU) @ 34427.53/s (n=180056)
               Rate   markm abigail
    markm   34428/s      --    -50%
    abigail 68404/s     99%      --

Abigail
Re: Re: file name parsing to get the original file name by MarkM (Curate) on Aug 20, 2003 at 20:30 UTC
Interesting. It looks like you've found yet another piece of Perl that isn't implemented in the most optimal manner. :-)

Playing around, I found that on my system, the following tweak allows the 'single regexp match' to beat the 'split into a temporary list, and grab the last entry' approach by ~15%:

`$f =~ /(?:.*\/)?(.+)/s; my $fn = $1;`

I'm still surprised that Perl can match against '/' several times (split) faster than it can skip to the last '/' with a single rather simple match. It seems the Perl regular expression engine could still use a few optimizations. Until then, I suppose explicit optimization isn't that bad.

Cheers,
mark
Re: file name parsing to get the original file name by Abigail-II (Bishop) on Aug 20, 2003 at 21:22 UTC
"I'm still surprised that Perl can match against '/' several times (split) faster than it can skip to the last '/' with a single rather simple match."

I'm not surprised. `/([^\/]+)\z/` is a rather complicated regex due to the character class being used. The optimizer can't figure out that it is the same as "looking for the last slash". `m!/!` on the other hand is so simple, the optimizer recognizes it as searching for a fixed string.

Abigail
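
For comparison, "looking for the last slash" can also be spelled out as a plain fixed-string search with no regex at all. This is only a sketch and was not part of the benchmarks in this thread; name_after_last_slash is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything after the last '/'.
sub name_after_last_slash {
    my ($path) = @_;
    # rindex returns -1 when there is no '/', so adding 1 makes
    # substr start at offset 0 and return the whole string.
    return substr($path, rindex($path, '/') + 1);
}

print name_after_last_slash($_), "\n"
    for '/etc/passwd', 'one/two/six.a', 'file';
```

Like the split approach, this treats '/' as a literal separator and nothing more, so it is UNIX-only.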
Re^3: file name parsing to get the original file name (regex performance) by Aristotle (Chancellor) on Aug 24, 2003 at 05:43 UTC
Enter sexeger. Add this to Abigail's benchmark.

`aristotle => 'foreach my $f (@files) { my ($fn) = reverse($f) =~ m!^(.?)/?!s; $fn = reverse $fn; }',`

                 Rate markm abigail markm2 aristotle
    markm     39625/s    --    -56%   -58%      -61%
    abigail   89688/s  126%      --    -4%      -11%
    markm2    93877/s  137%      5%     --       -7%
    aristotle 100885/s 155%     12%     7%        --

Reversing the string (twice!) may be costly, but the simplicity of the regex offsets this. Note that `[^/]+` would have been much slower. `.?` has been treated to special optimizations.

Makeshifts last the longest.
Re^3: file name parsing to get the original file name by rthawkcom (Novice) on Jun 22, 2011 at 15:54 UTC
Wouldn't it be easier just to do: `$path =~ /.*\/(.*)$/; $name = $1;` Basically nuke everything in the way and grab what's left?
Re^4: file name parsing to get the original file name by MarkM (Curate) on Dec 13, 2012 at 23:55 UTC
The check `$path =~ /.*\/(.*)$/; $name = $1;` doesn't take into account that the path might not have a directory component...
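
A quick sketch of that failure mode, assuming the pattern under discussion is `/.*\/(.*)$/`. The helper name name_requires_slash is made up, and it returns undef on a failed match instead of reading a possibly stale $1.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern discussed above.
sub name_requires_slash {
    my ($path) = @_;
    # The pattern demands a literal '/', so a bare file name
    # cannot match; report that as undef.
    return $path =~ /.*\/(.*)$/ ? $1 : undef;
}

for my $path ('/etc/passwd', 'passwd') {
    my $name = name_requires_slash($path);
    print "$path => ", (defined($name) ? $name : '(no match)'), "\n";
}
```

Without the success check, `$name = $1` after a failed match would silently pick up whatever the previous successful match captured.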
