PerlMonks  

glob() and dot files

by perlancar (Hermit)
on Apr 13, 2020 at 03:47 UTC

perlancar has asked for the wisdom of the Perl Monks concerning the following question:

A couple of questions on handling dotfiles with perl's glob(). Seems like bsd_glob() doesn't offer any special handling of dotfiles.

1. Is there an equivalent for "shopt -s dotglob"? The File::Glob documentation suggests there isn't.

2. What would be the easiest way to accomplish "listing all files/subdirectories including dotfiles/dotsubdirs but without the . and .." and "listing all dotfiles/dotsubdirs only, without the . and .."? I've given up on glob() for this and simply do something along the lines of:

@all_files_including_dot = do { opendir my $dh, "."; grep { $_ ne '.' && $_ ne '..' } readdir $dh };
@all_dotfiles = do { opendir my $dh, "."; grep { $_ ne '.' && $_ ne '..' && /\A\./ } readdir $dh };
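For readers who want both lists from a single helper, here is a minimal sketch of the same readdir-and-grep idea with error checking added. The name list_entries() and its second argument are made up for illustration, and the temporary directory exists only to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# list_entries() is a hypothetical helper, not part of any module.
# It returns the sorted names in $dir without "." and "..";
# with a true $dot_only it returns only names that start with a dot.
sub list_entries {
    my ($dir, $dot_only) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    @names = grep { /\A\./ } @names if $dot_only;
    @names = sort @names;
    return @names;
}

# Build a small scratch directory to try it on.
my $tmp = tempdir(CLEANUP => 1);
for my $name ('.hidden', 'visible.txt') {
    open my $fh, '>', "$tmp/$name" or die "$tmp/$name: $!";
    close $fh;
}
mkdir "$tmp/.config" or die "$tmp/.config: $!";

my @all = list_entries($tmp);
my @dot = list_entries($tmp, 1);
print "all: @all\n";   # all: .config .hidden visible.txt
print "dot: @dot\n";   # dot: .config .hidden
```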

Replies are listed 'Best First'.
Re: glob() and dot files
by Fletch (Bishop) on Apr 13, 2020 at 04:29 UTC

    Path::Tiny has a children method which omits those, but personally I don't see anything wrong with what you've got offhand.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

Re: glob() and dot files (updated)
by haukex (Archbishop) on Apr 13, 2020 at 08:19 UTC

    You might be interested in my node To glob or not to glob for the caveats of glob.

    What would be the easiest way to accomplish "listing all files/subdirectories including dotfiles/dotsubdirs but without the . and .." and "listing all dotfiles/dotsubdirs only, without the . and .."?

    If you mean core-only, then:

    use File::Spec::Functions qw/ no_upwards catfile catdir /;
    opendir my $dh, $path or die "$path: $!";
    my @files = map { -d catdir($path,$_) ? catdir($path,$_) : catfile($path,$_) }
        sort +no_upwards readdir $dh;   # the unary + keeps sort from taking no_upwards as its comparison sub
    closedir $dh;

    Note: I think that on some OSes (VMS?), there's a difference between catfile and catdir, that would require you to use the -d test, but I believe the above should work fine on any other OS. (Or, you can omit the catfile entirely if bare filenames are ok.) <update> Confirmed the difference between catfile and catdir with File::Spec::VMS and File::Spec::Mac, so I updated the above example with the -d test accordingly. </update> <update2> I don't have a VMS or Classic Mac to test on, but I realized that my update had a bug in that I wasn't doing the -d test on the full filename. So I hope that this updated version would really be correct on those platforms. </update2>

    If you need absolute pathnames, you probably want to add a $path = rel2abs($path); (rel2abs is also available from File::Spec::Functions). Otherwise, if CPAN is fine, then I really like Path::Class, whose children method includes everything except the . and .. by default:

    use Path::Class;
    my @files = dir($path)->children();
    # - or -
    my @files = dir($path)->absolute->children();

      Hi haukex,

      Thanks for pointing out your glob() post. I think I read it in the past. If wildcards are problematic, this gives me the idea of creating a glob-like function that takes a regex instead: re_glob('.*') or re_glob(qr/\.foo/). It would not skip dotfiles by default.
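      One possible shape for such a function, as a rough sketch only: the name re_glob() comes from the post, but the signature (a pattern plus an optional directory), the sorting, and the use of no_upwards are assumptions made for illustration, not an existing module's API.

```perl
use strict;
use warnings;
use File::Spec::Functions qw(no_upwards);
use File::Temp qw(tempdir);

# Hypothetical re_glob(): return the sorted names in $dir (default ".")
# that match $re. Dotfiles are NOT skipped; only "." and ".." are dropped.
sub re_glob {
    my ($re, $dir) = @_;
    $dir = '.' unless defined $dir;
    $re  = qr/$re/ unless ref($re) eq 'Regexp';   # accept plain strings too
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = grep { $_ =~ $re } no_upwards(readdir $dh);
    closedir $dh;
    @names = sort @names;
    return @names;
}

# Scratch directory so the example is self-contained.
my $tmp = tempdir(CLEANUP => 1);
for my $name ('a.foo', '.b.foo', 'c.txt') {
    open my $fh, '>', "$tmp/$name" or die "$tmp/$name: $!";
    close $fh;
}

my @foo = re_glob(qr/\.foo/, $tmp);
my @any = re_glob('.*', $tmp);
print "foo: @foo\n";   # foo: .b.foo a.foo
print "any: @any\n";   # any: .b.foo a.foo c.txt
```

      Note that the pattern is an unanchored regex, so qr/\.foo/ also matches a name like x.foobar; anchor it with \z if only the suffix should match.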

      By the way, most of the time for practical reasons I don't bother with File::Spec at all, because why would I sacrifice myself using catfile() and no_upwards when I will not be using path separator other than "/", and parent directory other than ".." (probably for the rest of my life).

        By the way, most of the time for practical reasons I don't bother with File::Spec at all, because why would I sacrifice myself using catfile() and no_upwards when I will not be using path separator other than "/", and parent directory other than ".." (probably for the rest of my life).

        Well, if you know your scripts are only ever going to be run on *NIX, then sure. But what you're sacrificing is portability. For example, even nowadays, there are some Windows programs that can't handle / path separators and require \. Personally, although I've written code like "$path/$file" myself, I usually like my code to be as portable as possible, and if you're considering writing a re_glob(qr/\.foo/) function, you might want to release it as a module*, and then portability becomes important, IMHO.

        * use Path::Tiny; my @files = path($path)->children(qr/\.foo/);
        But sadly, Path::Tiny "does not try to work for anything except Unix-like and Win32 platforms."
        Alternative: use Path::Class; my @files = grep {$_->basename=~/\.foo/} dir(".")->children;

Re: glob() and dot files
by Marshall (Canon) on Apr 13, 2020 at 05:05 UTC
    This glob thing can be a problem. A long time ago I got tripped up with the 3 versions of glob that were in use at that time in the ActiveState version of Perl that I was using. I changed my code to use readdir() and that solved the problem.

    Nowadays, Perl glob is a lot more uniform and well behaved. This prints all simple files, but skips directories.

    my @files = glob ('*.*');
    print "",join("\n",@files),"\n";
    For what you want, I would consider File::Find.
    Consider this code also.
    use strict;
    use warnings;

    opendir (my $dir, ".") or die "unable to open current directory: $!";
    my @files;
    my @directories;
    foreach my $file (grep { ($_ ne '.') and ($_ ne '..') } readdir $dir)
    {
        if    (-f $file) { push @files, $file; }
        elsif (-d $file) { push @directories, $file; }
        else             { die "weird beast found! $file" }
    }
    print "@files\n";
    print "@directories\n";
    I think in Unix there can be special things that are not simple files or directories. I would use a file test to see what this name actually means.
    Note that if this is not the current directory, you need to specify the full path name for file tests.

    Update: File operations like "open file" or "open directory" are "expensive" in terms of performance. I would expect my code to run faster than the OP's code, but I did not benchmark this in any serious way. If the directories are small and this is not done that often, I don't think that will make any difference at all. Also be aware that there is a special filehandle for repeated file tests, "_", as in elsif (-d _) { do something }. That reuses the stat structure returned by the previous file test operation to check a different flag.
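    To illustrate the last two points together (full path names for the file tests, and the "_" filehandle), here is a small sketch. The helper name classify_entries() is made up for this example, and the use of catfile for both files and directories assumes a Unix-like or Win32 system.

```perl
use strict;
use warnings;
use File::Spec::Functions qw(catfile);
use File::Temp qw(tempdir);

# Hypothetical helper: sort the entries of $dir into plain files,
# directories, and everything else (sockets, FIFOs, dangling symlinks...).
sub classify_entries {
    my ($dir) = @_;
    my (@files, @dirs, @other);
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    for my $name (sort grep { $_ ne '.' && $_ ne '..' } readdir $dh) {
        # File tests need the path relative to the cwd, not the bare name.
        my $path = catfile($dir, $name);
        if    (-f $path) { push @files, $name }
        elsif (-d _)     { push @dirs,  $name }   # reuses the stat from -f
        else             { push @other, $name }
    }
    closedir $dh;
    return (\@files, \@dirs, \@other);
}

# Scratch directory so the example is self-contained.
my $tmp = tempdir(CLEANUP => 1);
for my $name ('a.txt', '.rc') {
    open my $fh, '>', catfile($tmp, $name) or die "$name: $!";
    close $fh;
}
mkdir catfile($tmp, 'sub') or die "sub: $!";

my ($files, $dirs, $other) = classify_entries($tmp);
print "files: @$files\n";   # files: .rc a.txt
print "dirs:  @$dirs\n";    # dirs:  sub
```

    Collecting unknown entry types in a third list instead of dying is a design choice here; whether that is appropriate depends on the script.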

    Overall, unless there is a performance or other problem (special kinds of files), I see no problem with the OP's code.

      This prints all simple files, but skips directories.

      my @files = glob ('*.*');

      No, it prints any file or directory names that have a dot in them. It's a very old DOS convention that files had extensions and directories didn't, but nowadays that's not true anymore. You'd need grep {!-d} glob('*') to exclude directories.
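      A quick way to see the difference, sketched with a throwaway directory. bsd_glob from File::Glob is used instead of plain glob only so that a temporary path containing spaces would not be split into several patterns; the file and directory names are invented for the demonstration.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Scratch directory: one directory with a dot in its name,
# one file with an extension, one file without.
my $tmp = tempdir(CLEANUP => 1);
mkdir "$tmp/dir.d" or die "dir.d: $!";
for my $name ('file.txt', 'README') {
    open my $fh, '>', "$tmp/$name" or die "$name: $!";
    close $fh;
}

# '*.*' matches any name containing a dot, directories included.
my @dotted = sort map { basename($_) } bsd_glob("$tmp/*.*");

# '*' plus a -d test keeps only the non-directories.
my @plain = sort map { basename($_) } grep { !-d $_ } bsd_glob("$tmp/*");

print "*.*      : @dotted\n";   # *.*      : dir.d file.txt
print "non-dirs : @plain\n";    # non-dirs : README file.txt
```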

        You are correct.
        my @files = glob('*');   # current directory
        print "",join("\n",@files), "\n";
        # Note that these file names do not say whether or not they are directories, a file test is needed. I demo'ed this at Re: Getting a list of directories matching a pattern.

        Update: I experimented with the Windows 10 command line and found that I could indeed create a directory with a "dot suffix". That surprised me. Having said that, I have never seen such a thing in "real life". By convention, that is just not "the way that this is done". A long time ago, I was forced to use readdir and grep to get file names because of incompatible globs. For production code, I still use readdir and grep because it will always work. For quick hacks, I am fine with glob().

Node Type: perlquestion [id://11115413]
Front-paged by Corion