tommyboy has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to parse a single file in each of a series of directories. The file is frequently updated and renamed with a postfix according to the date. I do this by iterating through the directories and matching the files I want to parse with forced shell expansion inside calls to open: open(FH,<*fpc>). It works the first time through and then fails. The <> returns undef, but the files I am trying to parse do in fact exist. Does <> shell expansion only get compiled once? Is there a way to force it to evaluate every time? There are other ways to solve this problem, but <> would be the simplest and most elegant, if only it worked.
#!/usr/bin/perl -w
use strict;

#directory containing fpc map dirs
my $fpcdir = '/home3/ftp/pub/gsc1/fpc_files';

#organism/fpc map dir names
my @projects = ("mouse","human","cb","arab");
my %stats;

#foreach of the fpc maps get total clones and contigs
foreach(@projects) {
    &get_clones_and_contigs($fpcdir,$_);
}

sub get_clones_and_contigs($$) {
    my ($fpcdir,$project) = @_;

    # this is what I can not figure out -
    #this works on the first iteration
    # but fails to expand correctly the second time
    #&get_lones_and_contigs is called.
    open(FH,<$fpcdir/$project/*fpc>) || die "$! \n";
    while(<FH>) {
        if ($_ =~ /\sContigs\s.(\d+)\s.Clones\s.(\d+)\s/) {
            print "$project: contigs[$1] clones[$2] \n";
            my @stats = ($1,$2);
            $stats{'project'} = \@stats;
            return;
        }
    }
    close FH;
}
Below is an excerpt from one of the fpc files that need parsing.
// fpc project master_Humanmap
// 4.6.9 Date: 14:04 Wed 05 Dec 2001 User: scanner
// Contigs 726 Clones 407155 Markers 69507 Bands 12849811
// Framework Chr_Z Genome 0 AvgBand 4000 AvgInsert 174000
// Configure 173 Tol 7 Cut 3e-09 Apx 0.100 Min 3 End 15 Kill -1 Bad 15 + Best 10 Log 0 Std 1
// CpM Off 50 1 0 TBL 1 1e-05 2 1e-04 3 1e-03
// Build 1/1/70 0:0 Cut 3e-12 Off 50 1 0 TBL 1 1e-08 21e-07 3 1e-06
// Clip(0 4600) MinMax(0 32767) AutoRemark

Edit Masem 2002-01-24 - Fixed title, code tags on file format

Re: Shell expansion with <> is funky
by Masem (Monsignor) on Jan 25, 2002 at 01:10 UTC
    The glob function is a better choice: pass it a string containing wildcards and, in list context, it returns the list of filenames that match (relative patterns are resolved against the current working directory). If you need DOS-style matching, you can use File::DosGlob to replace the built-in glob with a slightly different version. In your code, specifically:
    my @files = glob "$fpcdir/$project/*fpc";
    foreach my $file ( @files ) {
        open FH, "<$file" or die $!;
        while (<FH>) {
            # inner loop unchanged...
        }
        close FH;
    }

    Dr. Michael K. Neylon - || "You've left the lens cap of your mind on again, Pinky" - The Brain
    "I can see my house from here!"
    It's not what you know, but knowing how to find it if you don't know that's important

Re: Shell expansion with <> is funky
by japhy (Canon) on Jan 25, 2002 at 02:39 UTC
    Your question has been answered, but I'd like to make another comment. Please do not use prototypes. It's clear from the code that you don't know how Perl requires them to be presented for them to operate properly.

    A function with a prototype must be declared or defined prior to the function being called. In addition, a function that calls itself must be declared before it is defined, if it is to have a prototype. When calling the function, the prototype is ignored if you preface the function call with an ampersand. You've violated two of those rules; your prototype is wasted.
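The ampersand rule can be seen in a minimal sketch. The sub name pair_count and its arguments are hypothetical and chosen only for illustration; the sub is defined before the calls so that the prototype is visible to them.

```perl
use strict;
use warnings;

# Hypothetical helper: defined before use, so the ($$) prototype
# is known at the call sites below. It returns its argument count.
sub pair_count($$) { return scalar @_ }

my @args = (10, 20, 30);

# Plain call: ($$) imposes scalar context on each argument, so @args
# is passed as its element count, a single value.
my $with_proto = pair_count(@args, 'x');

# Ampersand call: the prototype is ignored and @args flattens as usual.
my $without_proto = &pair_count(@args, 'x');

print "with prototype: $with_proto argument(s)\n";    # 2
print "with ampersand: $without_proto argument(s)\n"; # 4
```

The same sub sees two arguments in one call and four in the other, which is why calling get_clones_and_contigs with a leading & makes its ($$) prototype a no-op.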

    There's also the grim fact that prototypes in Perl are far less useful than most programmers expect them to be, and they enforce often bizarre requirements on the arguments to functions -- some of these have been fixed only in the most recent versions of Perl.

    Prototypes are deep magic, and even the best magicians get by without invoking them.

    Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      I appreciate the workaround and the sage advice about prototypes. I would beg to differ with japhy, however, that my question has been answered. I could use glob, but I would rather use <>, and I am not sure why the angle brackets are not behaving as advertised. My question, in a nutshell: why do angle brackets not result in correct shell expansion on successive iterations of a loop? This is puzzling and still remains unanswered. Thanks, Tom
        I cannot replicate your problem.

        Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
        s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Shell expansion with <> is funky
by chipmunk (Parson) on Jan 25, 2002 at 07:44 UTC
    This syntax works pretty much as I would expect, opening the next file returned by the glob each time the subroutine is called. However, you are still going to have a problem: no matter what, you will eventually run out of files, glob will return undef, the open will fail, and your script will die.

    You should probably be doing something more like this:

    sub foo {
        # Return quietly once the glob has run out of files
        defined(my $file = <abc*>) or return;
        open(FH, $file) or die $!;
        ...
    }
    However, I suspect that you may be creating the files after you have executed the glob the first time. perlop explains why this won't work as you intend:
    A glob evaluates its (embedded) argument only when it is starting a new list. All values must be read before it will start over. In a list context this isn't important, because you automatically get them all anyway. In scalar context, however, the operator returns the next value each time it is called, or a undef value if you've just run out.
    If that's the issue, you should not be using glob; you should keep track of the files when you create them, pulling the names from an internal list rather than globbing the file system.
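The perlop behaviour quoted above can be reproduced with a small sketch. It assumes a hypothetical layout, built in a temporary directory, with exactly one .fpc file per project directory; the sub names next_scalar and first_list are made up for illustration. In scalar context the second call returns undef even though the pattern now names a different directory, because the iterator from the first call has not yet been exhausted. Assigning to a list forces list context, so the pattern is expanded afresh on every call.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical layout: one *.fpc file in each of two project directories.
my $base = tempdir(CLEANUP => 1);
for my $project (qw(mouse human)) {
    mkdir "$base/$project" or die "mkdir: $!";
    open my $fh, '>', "$base/$project/$project.fpc" or die "open: $!";
    close $fh;
}

# Scalar context: this call site keeps one iterator across calls and
# only re-reads the pattern after it has returned undef once.
sub next_scalar {
    my ($dir) = @_;
    my $file = <$dir/*.fpc>;
    return $file;
}

# List context: the pattern is expanded in full on every call.
sub first_list {
    my ($dir) = @_;
    my ($file) = <$dir/*.fpc>;
    return $file;
}

my @scalar_results = map { next_scalar("$base/$_") } qw(mouse human);
my @list_results   = map { first_list("$base/$_") } qw(mouse human);

for my $i (0 .. 1) {
    printf "call %d: scalar context %s, list context %s\n", $i + 1,
        defined $scalar_results[$i] ? 'matched' : 'returned undef',
        defined $list_results[$i]   ? 'matched' : 'returned undef';
}
```

If each project directory really holds a single fpc file, this would account for open(FH, <...>) succeeding on one call and failing on the next; under that assumption, a list assignment such as my ($file) = <$fpcdir/$project/*fpc>; keeps the angle brackets while avoiding the stale iterator.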