Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

Breaking file path into segments

by cherio (Initiate)
on Apr 12, 2019 at 20:11 UTC ( #1232511=perlquestion: print w/replies, xml ) Need Help??
cherio has asked for the wisdom of the Perl Monks concerning the following question:

I have file path that has one or more directories and like this "/abc/def/ghi". I need a regex that can break it into an array as follows
( /abc/def/ghi, /abc/def, /abc )
alternatively a reversed array will do as well
( /abc, /abc/def, /abc/def/ghi )
The following regex breaks path into up to 3 segments
my $path = "/abc/def/ghi"; my @segments = ($path =~ m'^(/[^/]+ (/[^/]+ (/[^/]+)? )? )'x);
This regex is rigid. I can make it long enough to be able to break path into N segments but if the actual path has N+1 segments the regex won't work.

I want a regex that can break path into variable number of segments and not hardcode maximum path length into it.

Replies are listed 'Best First'.
Re: Breaking file path into segments
by choroba (Bishop) on Apr 12, 2019 at 20:23 UTC
    I prefer Path::Tiny to handle paths for me:
    #!/usr/bin/perl use warnings; use strict; use feature qw{ say }; use Path::Tiny; my $path = '/abc/def/ghi'; my $climb = path($path); my @paths = $climb; push @paths, $climb until ($climb = $climb->parent) eq '/'; say for @paths;

    File::Spec is a core module that can handle the path as well:

    #!/usr/bin/perl use warnings; use strict; use feature qw{ say }; use File::Spec; my $path = '/abc/def/ghi'; my @paths = 'File::Spec'->splitdir($path); for my $i (reverse 0 .. $#paths) { $paths[$i] = join '/', @paths[0 .. $i]; } shift @paths; # Remove the empty path. say for @paths;

    But a regex with split will work similarly:

    #!/usr/bin/perl use warnings; use strict; use feature qw{ say }; use Path::Tiny; my $path = '/abc/def/ghi'; my @paths = split m{(?<=.)/}, $path; for my $i (reverse 0 .. $#paths) { $paths[$i] = join '/', @paths[0 .. $i]; } say for @paths;

    I used a lookbehind assertion to skip the empty path before the first slash.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

      File::Spec is a standard module and comes with Perl. For a list of standard modules, see perldoc perlmodlib

      If your code is going to be run on a different machine than your development one, using standard modules means you don't have to distribute and maintain additional code (the modules).

        > you don't have to distribute and maintain additional code

        Unless you work on a RedHat based Linux distribution where you need to install the perl-PathTools package. Installing perl-Path-Tiny is comparably complex.

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Breaking file path into segments
by holli (Monsignor) on Apr 12, 2019 at 20:29 UTC
    use Path::Class; print join "#", dir("abc/def/ghi")->components;


    holli

    You can lead your users to water, but alas, you cannot drown them.
Re: Breaking file path into segments (updated x2)
by AnomalousMonk (Chancellor) on Apr 12, 2019 at 21:38 UTC

    I agree with others that a path-processing module is the way to go here, but if you just gotta have a regex, here's another:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le "print qq{perl version: $]}; ;; my $path = '/abc/def/ghi'; ;; local our @dirs; $path =~ m{ (\A (?: / [^/]+ \b)+) (?{ push @dirs, $^N }) (?!) }xms; ;; dd \@dirs; " perl version: 5.008009 ["/abc/def/ghi", "/abc/def", "/abc"]
    With Perl version 5.10+, the odd-looking  (?!) can become  (*FAIL) from Special Backtracking Control Verbs in perlre (but the compiler will optimize it to (*FAIL) anyway). With version 5.18+ (IIRC), the package-global array  @dirs can become a my array; that bug was fixed.

    Update 1: On second thought, the  \b anchor is a bit too alnum-specific: make it  (?! [^/]) instead:
        $path =~ m{ (\A (?: / [^/]+ (?! [^/]))+) (?{ push @dirs, $^N }) (?!) }xms;

    Update 2: And if you want to reverse the order of the pieces, use a lazy quantifier:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le "print qq{perl version: $]}; ;; my $path = '/=abc/def/ghi=/---'; ;; local our @dirs; $path =~ m{ (\A (?: / [^/]+ (?! [^/]))+?) (?{ push @dirs, $^N }) (?!) + }xms; ;; dd \@dirs; " perl version: 5.008009 ["/=abc", "/=abc/def", "/=abc/def/ghi=", "/=abc/def/ghi=/---"]


    Give a man a fish:  <%-{-{-{-<

Re: Breaking file path into segments
by johngg (Canon) on Apr 13, 2019 at 11:01 UTC

    An alternative to doing the whole thing with a regex would be to split into individual path elements then push joined elements onto the array, popping elements off the end until there's nothing left.

    use 5.026; use warnings; use Data::Dumper; my @paths = qw{ /abc/def/ghi /wxy/z /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 bin/fred somefile }; foreach my $path ( @paths ) { my @elems = split m{/}, $path; my @arr; while ( @elems ) { last if @elems == 1 && ! $elems[ 0 ]; # Ignore empty first ele +ment # if path starts with a +/ push @arr, join q{/}, @elems; pop @elems; } say $path; print Data::Dumper->Dumpxs( [ \ @arr ], [ qw{ *arr } ] ); say q{-} x 30; }

    The output.

    /abc/def/ghi @arr = ( '/abc/def/ghi', '/abc/def', '/abc' ); ------------------------------ /wxy/z @arr = ( '/wxy/z', '/wxy' ); ------------------------------ /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 @arr = ( '/usr/local/lib/x86_64-linux-gnu/perl/5.26.1', '/usr/local/lib/x86_64-linux-gnu/perl', '/usr/local/lib/x86_64-linux-gnu', '/usr/local/lib', '/usr/local', '/usr' ); ------------------------------ bin/fred @arr = ( 'bin/fred', 'bin' ); ------------------------------ somefile @arr = ( 'somefile' ); ------------------------------

    A little more long-winded but possibly simpler to understand. I hope this is helpful.

    Cheers,

    JohnGG

Re: Breaking file path into segments
by Marshall (Abbot) on Apr 15, 2019 at 04:29 UTC
    Another idea for you:
    I am not sure if these paths are absolute or relative paths.
    Update: Modified code to preserve initial "root path". Handles 3 cases now.
    Modify the code below accordingly.
    #!/usr/bin/perl use strict; use warnings; my @paths = qw{ /abc/def/ghi /wxy/z /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 bin/fred somefile ./another_file }; foreach my $path (@paths) { my $root_path = ''; $root_path = $1 if ($path =~ s/^([.\/]*)(.*)/$2/); my @components = split '/',$path; while (@components) { print $root_path,join('/',@components), "\n"; pop @components; } } __END__ /abc/def/ghi /abc/def /abc /wxy/z /wxy /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 /usr/local/lib/x86_64-linux-gnu/perl /usr/local/lib/x86_64-linux-gnu /usr/local/lib /usr/local /usr bin/fred bin somefile ./another_file

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1232511]
Approved by dorko
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (5)
As of 2019-04-20 10:28 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    I am most likely to install a new module from CPAN if:
















    Results (108 votes). Check out past polls.

    Notices?
    • (Sep 10, 2018 at 22:53 UTC) Welcome new users!