PerlMonks  

File::Glob infinite loop with while loop unlike core glob function

by Gulliver (Monk)
on Jul 29, 2011 at 16:12 UTC
Gulliver has asked for the wisdom of the Perl Monks concerning the following question:

The core glob won't handle a folder name with spaces, so I used the File::Glob version, but it goes into an infinite loop printing the last file in the directory. How can I use BSD-style glob with a while loop like I'm used to?

    use warnings;
    use strict;

    my $folder = "/usr/fldr wth spaces";

    print "====== core glob ======\n";
    while (glob "$folder/*.txt") {
        print "$_\n";
    }
    print "Press <Enter>";
    <>;

    {
        use File::Glob ':glob';
        print "======with File::Glob ':glob'======\n";
        while (glob "$folder/*.txt") {
            print "$_\n";
        }
        print "Press <Enter>";
        <>;
    }

Re: File::Glob infinite loop with while loop unlike core glob function
by toolic (Chancellor) on Jul 29, 2011 at 16:40 UTC
    I get no infinite loop when I change the "while" to a "for" loop, if that is a viable trade-off for you:
    {
        use File::Glob ':glob';
        print "======with File::Glob ':glob'======\n";
        for (glob "$folder/*.txt") {
            print "$_\n";
        }
        print "Press <Enter>";
        <>;
    }

      Yes that does what I need, thanks.

      I only noticed this because I was working through the Camel book 'Filename Globbing Operator' section in Chapter 2 where it talks about while magic but the example didn't work for me because I was using File::Glob.

Re: File::Glob infinite loop with while loop unlike core glob function
by Anonymous Monk on Jul 29, 2011 at 16:48 UTC
    Hey, I think you should report that with perlbug; that's a bona fide bug there.
Re: File::Glob infinite loop with while loop unlike core glob function
by jpl (Monk) on Jul 29, 2011 at 17:44 UTC
    No, I don't think it is a bug. glob returns a list. In scalar context, as while establishes, you just get the last element of the list. In this case, it is a non-empty string, so it acts as true, and it never changes, so there is an infinite loop. If you do
    @files = glob "$folder/*.txt";
    the list, in array context, turns into a real array, over which you can iterate with your favorite iterator.
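A minimal sketch of the array approach described above, assuming a throwaway directory created with File::Temp (the directory and file names are made up for the demo). It calls bsd_glob directly in list context, so no while magic is involved:

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Hypothetical directory whose name contains spaces, created only for this demo.
my $folder = tempdir("fldr wth spaces XXXX", TMPDIR => 1, CLEANUP => 1);
for my $name (qw(a.txt b.txt c.log)) {
    open my $fh, '>', "$folder/$name" or die "Cannot create $name: $!";
    close $fh;
}

# List context: every match is returned at once and stored in a real array.
my @files = bsd_glob("$folder/*.txt");

# Iterate over the array; the loop ends when the array does.
for my $file (@files) {
    print "$file\n";
}
```

Because the list is built once and the loop walks the array, nothing is re-evaluated per iteration.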

      I didn't say it was a bug. Perhaps you intended to reply to the Anon Poster.

      From Programming Perl, 3rd edition:

      Whether you use the glob function or the old angle-bracket form, the fileglob operator also does while magic like the line input operator, assigning the result to $_. (That was the rationale for overloading the angle operator in the first place.)

      I expected to be able to just put the use File::Glob ':glob' at the top of my existing code to fix the whitespace issue.

        The comment was, indeed, targeted at the anonymous poster (or anyone else contemplating filing a bug report). I can see where the snippet you quote could be misleading. The glob core function does do "while magic". I suspect that is a decision they wish they could undo.

      No, I don't think it is a bug. glob returns a list

      This is the bug, that it returns a list in scalar context, and the docs don't warn about it.

      perldoc -f glob says

      In list context, returns a (possibly empty) list of filename expansions on the value of EXPR such as the standard Unix shell /bin/csh would do. In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted.

      ...

      Beginning with v5.6.0, this operator is implemented using the standard "File::Glob" extension. See File::Glob for details, including "bsd_glob" which does not treat whitespace as a pattern separator.
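The whitespace difference mentioned in the quoted documentation can be seen with a small sketch. It assumes a temporary directory with spaces in its name (hypothetical, created via File::Temp) and compares the core glob with an explicitly imported bsd_glob, both in list context:

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);   # imports bsd_glob only; core glob is left alone
use File::Temp qw(tempdir);

# Hypothetical demo directory with spaces in its name.
my $folder = tempdir("fldr wth spaces XXXX", TMPDIR => 1, CLEANUP => 1);
for my $name (qw(a.txt b.txt)) {
    open my $fh, '>', "$folder/$name" or die "Cannot create $name: $!";
    close $fh;
}

# Core glob splits the pattern on whitespace, so it never sees the real path.
my @core = glob "$folder/*.txt";

# bsd_glob treats the whole string, spaces included, as one pattern.
my @bsd = bsd_glob("$folder/*.txt");

my $core_txt = grep { /\.txt\z/ } @core;
my $bsd_txt  = grep { /\.txt\z/ } @bsd;
print "core glob found $core_txt .txt files\n";
print "bsd_glob found $bsd_txt .txt files\n";
```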

      use File::Glob ':glob'; overrides glob (csh_glob) with a function that behaves differently in scalar context (bsd_glob), and the documentation doesn't warn you explicitly.

      making bsd_glob DWIM in scalar context is trivial (just copy/paste from csh_glob)

      fixing the pod is equally trivial.

      No, I'm not contemplating filing a bug report, Gulliver should :)

        Beginning with v5.6.0, this operator is implemented using the standard "File::Glob" extension. See File::Glob for details, including "bsd_glob" which does not treat whitespace as a pattern separator.
        When I look at File::Glob, I don't see it promising to do "while magic". I see it "returning a list". If the core glob took advantage of this functionality to determine the list it is ready to do "while magic" over, that is a decision made by the core, not (necessarily) a requirement on File::Glob. Regardless of how File::Glob might have been "improved" in the v5.6.0 era, it has been behaving in the stated way since then, and it is not "trivial" to modify that behavior now. That could break existing code, and the porters are notoriously (and justifiably) reluctant to do that.

        I agree that emphasizing the absence of "while magic" in the File::Glob documentation could be helpful. I confess to having wondered what was causing the problem with gulliver's code, until I recalled that core glob can do what it does only by "magic". I'm not a big fan of "magic", in part because of problems like this.

Re: File::Glob infinite loop with while loop unlike core glob function
by jpl (Monk) on Aug 01, 2011 at 14:29 UTC
    One, perhaps final, comment. In the process of writing a short program to reproduce the behavior (in case someone, probably me, decides to file a bug report, my earlier whinging notwithstanding), I came up with this:
    #!/usr/bin/perl -w
    use strict;
    use File::Glob ':glob';

    my $dir = shift || "/etc";
    my $max = shift || 10000;
    my $entries = 0;
    while (glob("$dir/*")) {
        last if (++$entries > $max);
    }
    print($entries, " items under $dir\n");
    If I comment out the use File::Glob ':glob'; to get a run with the core glob, then put it back in, I see the following
    time globbug.pl /etc 20000
    279 items under /etc

    real 0m0.00s
    user 0m0.00s
    sys 0m0.00s

    time globbug.pl /etc 280
    281 items under /etc

    real 0m0.15s
    user 0m0.09s
    sys 0m0.06s
    The first (core) run demonstrates that there are 279 entries in my /etc, and it takes almost no time to determine that. The second run, confined to stop after it is demonstrably wrong, takes quite a while to be wrong. This is consistent with it rebuilding the entire list for each iteration, then throwing away all but the last item.
Re: File::Glob infinite loop with while loop unlike core glob function
by jpl (Monk) on Nov 02, 2011 at 12:54 UTC

    For the benefit of anyone stumbling across this item in the future, there are several fixes being installed in the 5.16 release. The ':glob' tag has been removed from the synopsis and is actively discouraged in the documentation. Tag ':bsd_glob' will serve many of the same useful purposes, without the risk of infinite looping.
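As a hedged illustration of the ':bsd_glob' tag, the sketch below assumes Perl 5.16 or later and a temporary directory invented for the demo. The tag is documented as overriding glob so that spaces are part of the file name while scalar-context iteration still works, so a while loop should terminate:

```perl
use strict;
use warnings;
use 5.016;                      # the ':bsd_glob' tag is only available from 5.16 on
use File::Glob ':bsd_glob';
use File::Temp qw(tempdir);

# Hypothetical demo directory with spaces in its name.
my $folder = tempdir("fldr wth spaces XXXX", TMPDIR => 1, CLEANUP => 1);
for my $name (qw(a.txt b.txt c.log)) {
    open my $fh, '>', "$folder/$name" or die "Cannot create $name: $!";
    close $fh;
}

# Scalar context: one file name per call, undef once the list is exhausted.
my @seen;
while (defined(my $file = glob "$folder/*.txt")) {
    push @seen, $file;
    print "$file\n";
}
```

The explicit defined() is a belt-and-braces choice; it keeps the loop correct even for a file name that would evaluate as false.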

    The discussion among the perl porters brought to light another alternative for globbing file names with embedded blanks. The <> operator usually embraces barewords. If it embraces a quoted string, splitting does not occur on blanks; they are treated as ordinary characters. So

    while (<"em bedded/name*">) { print "$_\n"; }

    would have done what the OP wanted, without the need for File::Glob.
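A runnable sketch of the quoted-pattern form, using a temporary directory and file names that are assumptions made for the demo. It relies only on the core glob, with the extra double quotes protecting the embedded blanks as described in perldoc -f glob:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical demo directory and files, all with embedded blanks.
my $folder = tempdir("em bedded XXXX", TMPDIR => 1, CLEANUP => 1);
for my $name ("name one.txt", "name two.txt", "other.txt") {
    open my $fh, '>', "$folder/$name" or die "Cannot create $name: $!";
    close $fh;
}

# The double quotes inside the angle brackets keep the pattern in one piece.
my @found;
while (defined(my $file = <"$folder/name*">)) {
    push @found, $file;
    print "$file\n";
}

# The same pattern through the glob function, in list context.
my @same = glob qq("$folder/name*");
```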

      would have done what the OP wanted, without the need for File::Glob.

      It's funny, that piece of code is implemented using glob/File::Glob.
