PerlMonks  

Creating arrays with glob patterns without metacharacters

by Lotus1 (Priest)
on Jan 05, 2018 at 15:51 UTC
Lotus1 has asked for the wisdom of the Perl Monks concerning the following question:

A new teammate demonstrated something like the following at the beginning of a script.

    use strict;
    use warnings;

    our @arr = <ABC DEF GHI>;

I'm going to go over when to use our but I'm struggling with what to say about using glob like this. It works as long as there isn't a metacharacter other than curly braces. He has been told already about the use of qw() to create an array like this but I suspect he copied this from somewhere. In the perldocs for glob I found this:

If non-empty braces are the only wildcard characters used in the glob, no filenames are matched, but potentially many strings are returned. For example, this produces nine strings, one for each pairing of fruits and colors:
    my @many = glob "{apple,tomato,cherry}={green,yellow,red}";
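
A minimal sketch of the behaviour that quote describes, assuming the default core glob. No files need to exist for this to run, because braces are the only metacharacters in the pattern:

```perl
use strict;
use warnings;

# Braces are the only metacharacters here, so every fruit/color
# pairing comes back as a plain string, whether or not a file
# with that name exists.
my @many = glob "{apple,tomato,cherry}={green,yellow,red}";

print scalar(@many), " strings\n";
print "$_\n" for @many;
```

Adding a * or ? to the pattern changes this: the expansions that contain a wildcard are then matched against real file names.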

I created the following to try to show why it's a bad idea.

    use strict;
    use warnings;
    use Data::Dumper;

    my @arr1 = < abc def ghi f* >;
    my @arr2 = qw( abc def ghi f* );
    my @arr3 = glob('abc def ghi z*');

    print Dumper(\@arr1);
    print Dumper(\@arr2);
    print Dumper(\@arr3);

    __DATA__
    $VAR1 = [
              'abc',
              'def',
              'ghi',
              'file1.txt',
              'file2.txt'
            ];
    $VAR1 = [
              'abc',
              'def',
              'ghi',
              'f*'
            ];
    $VAR1 = [
              'abc',
              'def',
              'ghi'
            ];

perl -MO=Deparse glob_to_array.pl produces the following.

    use Data::Dumper;
    use File::Glob ();
    use warnings;
    use strict 'refs';
    my(@arr1) = glob(' abc def ghi f* ');
    my(@arr2) = ('abc', 'def', 'ghi', 'f*');
    my(@arr3) = glob('abc def ghi z*');
    print Dumper(\@arr1);
    print Dumper(\@arr2);
    print Dumper(\@arr3);
    glob_to_array.pl syntax OK

Deparse shows that qw() does the job without calling glob, so there is no risk of something unexpected happening if the text happens to contain a metacharacter (other than curly braces). The best argument I can come up with is that, for a program the team will need to support, this is a bad idea because it could cause unintended and confusing side effects. Suggestions for better demonstrations or documentation would be appreciated. Maybe I'm being too critical and should lower my critic setting.

Note: I changed the title

Replies are listed 'Best First'.
Re: Creating arrays with glob patterns without metacharacters
by haukex (Canon) on Jan 05, 2018 at 16:41 UTC
    it could cause unintended and confusing side effects

    It's of course a matter of opinion, but I do agree with this - it's a neat but somewhat obscure use of glob. It can be useful, as it allows one to generate combinations using just core Perl, without loading e.g. Algorithm::Combinatorics. But all it takes is someone who doesn't know the details ("If non-empty braces are the only wildcard characters used ...") or the interpolation of unchecked variables into the pattern for it to break. Personally, my feeling is that in a one-off script, or with a fixed string plus a comment warning future maintainers, it's ok to use. But if you want to play it safe with respect to future maintainers, I'd avoid it.
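
    For the play-it-safe route, here is a rough sketch of building the same kind of pairings with plain loops instead of glob. The lists are made up for illustration, and the '*' is there on purpose to show that no character is special in this version:

```perl
use strict;
use warnings;

# Hypothetical input lists; glob would treat the '*' as a wildcard.
my @fruits = ('apple', 'tomato', 'cherry*');
my @colors = ('green', 'yellow', 'red');

# Nested loops build every fruit=color pairing as a plain string,
# with no filesystem lookup and no metacharacter handling.
my @pairs;
for my $fruit (@fruits) {
    for my $color (@colors) {
        push @pairs, "$fruit=$color";
    }
}

print "$_\n" for @pairs;
```

    It is longer than the glob one-liner, but the values can come from anywhere without needing to be checked first.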

    As for using <a b c> as a replacement for qw/a b c/, sure it's clever, but IMO it's a silly place to save two characters, and I'd probably recommend against it. Also, I like to use File::Glob ':bsd_glob';, which breaks it, because bsd_glob does not split its pattern on whitespace:

        $ perl -le 'print for <a b c>'
        a
        b
        c
        $ perl -MFile::Glob=:bsd_glob -le 'print for <a b c>'
        a b c

      I also use ':bsd_glob'. I could show that if I call a function someone else wrote that uses glob to create an array, it quits working once bsd_glob is in effect. Thanks.

      Update: demo code.

          use strict;
          use warnings;
          use Data::Dumper;
          use File::Glob ':bsd_glob';

          my @arr1 = get_list1();
          print Dumper(\@arr1);

          my @arr2 = get_list2();
          print Dumper(\@arr2);

          sub get_list1 {
              return < abc def ghi >;
          }

          sub get_list2 {
              return qw( abc def ghi );
          }

          __DATA__
          $VAR1 = [
                    ' abc def ghi '
                  ];
          $VAR1 = [
                    'abc',
                    'def',
                    'ghi'
                  ];
Re: Why are glob patterns without metacharacters returned
by Eily (Prior) on Jan 05, 2018 at 16:14 UTC

    I find the phrase "If non-empty braces are the only wildcard characters used in the glob" misleading because it makes it sound like there are many wildcard chars to choose from, rather than just * and braces. So that would be "If there is no * in the glob". You also demonstrated that even if there are * in the strings, it can potentially return strings without matching files, because glob("PATTERN_A PATTERN_B") is the same as (glob("PATTERN_A"), glob("PATTERN_B")).
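
    A small sketch of that equivalence, assuming the default core glob (not ':bsd_glob'). The file names are invented for the example, and it runs in a scratch directory so the result does not depend on whatever is in the current one:

```perl
use strict;
use warnings;
use Cwd qw( getcwd );
use File::Temp qw( tempdir );

# Create two empty files in a temporary directory and work there.
my $start = getcwd();
my $dir   = tempdir( CLEANUP => 1 );
chdir $dir or die "chdir $dir: $!";
for my $name (qw( file1.txt file2.txt )) {
    open my $fh, '>', $name or die "open $name: $!";
    close $fh;
}

# One whitespace-separated pattern versus one glob call per word.
my @combined = glob("abc f* z*");
my @separate = ( glob("abc"), glob("f*"), glob("z*") );

print "combined: @combined\n";
print "separate: @separate\n";

# Leave the scratch directory so it can be cleaned up at exit.
chdir $start or die "chdir $start: $!";
```

    Both lists hold 'abc' (no wildcard, so returned as-is), the two files matched by f*, and nothing for z*.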

    That said, the other difference between glob and qw is interpolation, and quotes. You can't have an element with a space inside qw, but you can in glob. And glob interpolates while qw does not. IIRC ruby has each quoting construct in pairs, interpolating or non interpolating, so you can insert a few variables in a list of words. glob is tricky though, because interpolation is done first, and spaces and quotes in your variables might not do what you want:

        use Data::Dump qw( pp );
        my $interpolate = "A";
        my $trap = "'X Y' Z";
        pp qw( $interpolate B 'C D' "E F" $trap);
        pp < $interpolate B 'C D' "E F" $trap \Q$trap>;
        __DATA__
        ("\$interpolate", "B", "'C", "D'", "\"E", "F\"", "\$trap")
        ("A", "B", "C D", "E F", "X Y", "Z", "'X\\ Y'\\ Z")

    So as a conclusion: qw works exactly like a single-quoted string split on whitespace, but glob can be tricky if braces, *, $, @ or quotes are present in the string. Also qw accepts newlines where glob doesn't.
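
    To illustrate the qw half of that conclusion, a minimal sketch comparing qw with an explicit split of a single-quoted string (the words are arbitrary):

```perl
use strict;
use warnings;

# A single-quoted string: no interpolation, and '*' and quotes stay literal.
my $text = q( $interpolate B 'C D' f* );

# split ' ' splits on runs of whitespace and ignores leading
# whitespace, which mirrors how qw breaks up its contents.
my @from_split = split ' ', $text;
my @from_qw    = qw( $interpolate B 'C D' f* );

print join('|', @from_split), "\n";
print join('|', @from_qw), "\n";
```

    Both lines print the same five words, with $interpolate, the quotes and the * left untouched.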

      I find the phrase "If non-empty braces are the only wildcard characters used in the glob" misleading because it makes it sound like there are many wildcard chars to choose from, rather than just * and braces.

      There are a few other meta characters for glob().

      META CHARACTERS
        \       Quote the next metacharacter
        []      Character class
        {}      Multiple pattern
        *       Match any string of characters
        ?       Match any single character
        ~       User name home directory

        Meh, I stopped at the glob documentation and didn't go any further. Well then, that's a whole bunch of characters to avoid :).

        /me learned something new about perl today. Thanks :)

        Edit: the list is in File::Glob for anyone who is wondering.
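
        One way to make that list concrete is a quick check for whether a word contains any of those characters. This is only a sketch: has_glob_meta is a made-up helper, not part of File::Glob, and it is deliberately conservative (it flags ~ anywhere in the word, not just at the start):

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $str contains any character from
# the File::Glob metacharacter list (\ [] {} * ? ~), otherwise 0.
sub has_glob_meta {
    my ($str) = @_;
    return $str =~ /[\\\[\]{}*?~]/ ? 1 : 0;
}

for my $word ('abc', 'f*', '{a,b}', '~user', 'file?.txt') {
    printf "%-10s %s\n", $word, has_glob_meta($word) ? 'meta' : 'plain';
}
```

        Whitespace and quotes are not covered here, although the default glob treats those specially too.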
